Woorpje is a string solver for bounded word equations, i.e., the length of the strings substituting each variable is upper bounded by a given integer. Using the existence of these bounds, and the fact that the alphabet of the strings is finite, allows us to reformulate the question is this equation solvable into a reachability check in a finite automaton. Preliminary experiments with a simplistic reachability search revealed, however, that the search space quickly becomes intractable. Thus, Woorpje translates the Bounded Word Equation problem further into a satisfiability problem of propositional logic, and uses the awardwinning SATsolver Glucose as its SATSolving backend.
Using Woorpje:
Woorpje is invoked from the command line with ./woorpje equation.eq
where equation.eq
is the configuration file for Woorpje.
Simple Example
Using Woorpje to solve the equation aX = Yb
is as simple as providing the below configuration file to Woorpje.
Variables {XY} Terminals {ab} Equation: aX = Yb SatGlucose(10)
The lines Variables
Terminals
and Equation
must be written in
that order, and they have the obvious semantics. The last line
SatGlucose(10)
invokes the solver and pass the number 10 as
parameter to the solver. The 10
informs the solver to try increase
the bound exponentially ten times. If the parameter is 0
, Woorpje will guess a suitable value.
Tip

It is possible to specify a system of equations, where all equations must be true, by specifying several Equation lines

As can be seen from the output below, Woorpje finds a solution to the equation where it sets X = b
and Y = a
.
Woorpje also provides a proof that the substitution is correct by outputting the left hand side and right hand side
with substituted variables.
woorpje 0.1 (Mar 4 2019) Revision: 7621932 woorpje is a word equation solver based on Glucose. Dependable Systems Group, Kiel Univeristy Mitja Kulczynski, Danny Boegsted Poulsen and Thorsten Ehlers  {mku,dbp,the}@informatik.unikiel.de SatSolver Ready c last restart ## conflicts : 0 2 Substitution: X: b Y: a Equations after substition ab == ab Timing info Encoding : 0 milliseconds Solving : 0 milliseconds Overall Solving Time : 0 milliseconds Found a solution
Linear Constraints
Woorpje supports constraining the set of potential solutions by specifying constraints between linear arithmetic expressions
over the length of the variables. As an example, consider that want solution to aX = Yb
where X
is longer than 5,
we can add the constraint LinConstraint: [ >= +1X, +5]
to the configuration file.
Linear constraints must be specified after the word equations. Giving us the configuration file
Variables {XY} Terminals {ab} Equation: aX = Yb LinConstraint: [ >= +1X, +5] SatGlucose(10)
Tip

Linear Constraints are specified in prefix form where the comparison operator (>=,⇐, >, < ) is first specified and
after that the left hand side and then the right hand side of comparison.

The output from Woorpje
woorpje 0.1 (Mar 4 2019) Revision: 7621932 woorpje is a word equation solver based on Glucose. Dependable Systems Group, Kiel Univeristy Mitja Kulczynski, Danny Boegsted Poulsen and Thorsten Ehlers  {mku,dbp,the}@informatik.unikiel.de SatSolver Ready c last restart ## conflicts : 0 12 Substitution: X: baaaaaaab Y: abaaaaaaa Equations after substition abaaaaaaab == abaaaaaaab Timing info Encoding : 0 milliseconds Encoding : 0 milliseconds Encoding : 0 milliseconds Solving : 0 milliseconds Overall Solving Time : 0 milliseconds Found a solution
reveals that there is indeed a solution. It also shows, that Woorpje encoded the word equation problem into SAT three times.
Benchmarks
For evaluating the performance of Woorpje, we generated five collections of word equations benchmarks that exhibit a special structure. The equations are solved with Woorpje and state of the art string solvers (CVC4, Z3str, Z3Seq and Norn, Sloth). The benchmarks are available in our Availability section.
For each word equation we ran the solvers and noted if they gave the correct result, if they timed out and we measured the total time for solving all equations. The solvers were allowed to run for 30 seconds.
First Track
The first track (I) is based on completely random generated string, where randomly picked string of arbitrary length were replaced by variables. This guarantees the existence of a solution.The generated word equations have an upper bound of 15 variables, 10 constants, and length 300.
Program 
Satis 
NSatis 
Error 
Unknown 
Timeout 
Total 
Time 
TimeWO 
woorpje 
200 
0 
0 
0 
0 
200 
8.10 
8.10 
cvc4 
182 
0 
0 
0 
18 
200 
543.32 
3.32 
z3str3 
197 
0 
0 
1 
2 
200 
105.07 
45.07 
z3seq 
183 
0 
0 
0 
17 
200 
545.24 
35.24 
norn 
176 
4 
4 
0 
20 
200 
1037.63 
437.63 
sloth 
101 
0 
0 
0 
99 
200 
3658.34 
688.34 
Second Track
The second track (II) is based on the idea in Proposition 1 of MFCS2017, where the equation
\(X_naX_nbX_{n1}bX_{n2}\cdots bX_1 = aX_nX_{n1}X_{n1}bX_{n2}X_{n2}b \cdots b X_{1}X_{1}baa\) is shown to have a minimal solution of exponential length w.r.t. the length of the equation.
Program 
Satis 
NSatis 
Error 
Unknown 
Timeout 
Total 
Time 
TimeWO 
woorpje 
5 
0 
0 
0 
4 
9 
123.85 
3.85 
cvc4 
1 
0 
0 
0 
8 
9 
240.03 
0.03 
z3str3 
0 
0 
0 
9 
0 
9 
0.33 
0.33 
z3seq 
9 
0 
0 
0 
0 
9 
1.81 
1.81 
norn 
0 
0 
0 
0 
9 
9 
270.00 
0.00 
sloth 
7 
0 
0 
0 
2 
9 
124.56 
64.56 
Third Track
The third track (III) is based on the second track, but each letter $b$ is replaced by the left hand side or the right hand side of some randomly generated word equation (e.g., with the methods from track (I))
Program 
Satis 
NSatis 
Error 
Unknown 
Timeout 
Total 
Time 
TimeWO 
woorpje 
147 
42 
0 
0 
11 
200 
341.74 
11.74 
cvc4 
121 
44 
0 
0 
35 
200 
1055.24 
5.24 
z3str3 
51 
48 
6 
43 
58 
200 
2089.92 
349.92 
z3seq 
82 
45 
1 
0 
73 
200 
2199.99 
9.99 
norn 
60 
12 
1 
0 
128 
200 
4038.83 
198.83 
sloth 
126 
7 
12 
0 
67 
200 
2808.48 
798.48 
Fourth Track
The fourth track (IV) consists of a system of 100 small random word equations with at most 6 constant letters, 10 variables and length 60. The hard aspect of this track is solving multiple equations at the same time.
Program 
Satis 
NSatis 
Error 
Unknown 
Timeout 
Total 
Time 
TimeWO 
woorpje 
104 
92 
0 
2 
2 
200 
136.20 
76.20 
cvc4 
76 
96 
0 
0 
28 
200 
925.13 
85.13 
z3str3 
79 
99 
3 
10 
12 
200 
490.23 
130.23 
z3seq 
97 
97 
1 
0 
6 
200 
200.09 
20.09 
norn 
16 
112 
68 
0 
72 
200 
3216.95 
1056.95 
sloth 
15 
1 
0 
0 
184 
200 
5615.81 
95.81 
Fifth Track
The fifth track (V) enriches a system of 30 word equations by suitable linear constraints, as presented in this paper. This track is influenced by the Kaluza benchmark set.
Program 
Satis 
NSatis 
Error 
Unknown 
Timeout 
Total 
Time 
TimeWO 
woorpje 
164 
14 
0 
9 
13 
200 
399.22 
9.22 
cvc4 
156 
23 
0 
0 
21 
200 
635.97 
5.97 
z3str3 
175 
23 
0 
1 
1 
200 
41.52 
11.52 
z3seq 
170 
23 
0 
0 
7 
200 
217.35 
7.35 
norn 
88 
103 
79 
0 
9 
200 
742.34 
472.34 
sloth 
11 
0 
2 
2 
187 
200 
5750.79 
140.79 