Woorpje is a string solver for bounded word equations, i.e., the length of the strings substituting each variable is upper bounded by a given integer. Using the existence of these bounds, and the fact that the alphabet of the strings is finite, allows us to reformulate the question is this equation solvable into a reachability check in a finite automaton. Preliminary experiments with a simplistic reachability search revealed, however, that the search space quickly becomes intractable. Thus, Woorpje translates the Bounded Word Equation problem further into a satisfiability problem of propositional logic, and uses the award-winning SAT-solver Glucose as its SAT-Solving backend.

Using Woorpje:

Woorpje is invoked from the command line with `./woorpje equation.eq` where `equation.eq` is the configuration file for Woorpje.

Simple Example

Using Woorpje to solve the equation `aX = Yb` is as simple as providing the below configuration file to Woorpje.

```Variables {XY}
Terminals {ab}
Equation: aX = Yb
SatGlucose(10)```

The lines `Variables` `Terminals` and `Equation` must be written in that order, and they have the obvious semantics. The last line `SatGlucose(10)` invokes the solver and pass the number 10 as parameter to the solver. The `10` informs the solver to try increase the bound exponentially ten times. If the parameter is `0`, Woorpje will guess a suitable value.

 Tip It is possible to specify a system of equations, where all equations must be true, by specifying several `Equation` lines

As can be seen from the output below, Woorpje finds a solution to the equation where it sets `X = b` and `Y = a`. Woorpje also provides a proof that the substitution is correct by outputting the left hand side and right hand side with substituted variables.

```woorpje 0.1 (Mar  4 2019)
Revision: 7621932
woorpje is a word equation solver based on Glucose.
Dependable Systems Group, Kiel Univeristy
Mitja Kulczynski, Danny Boegsted Poulsen and Thorsten Ehlers - {mku,dbp,the}@informatik.uni-kiel.de

SatSolver Ready
c last restart ## conflicts  :  0 2

Substitution:
X: b
Y: a

Equations after substition
ab == ab

Timing info
Encoding  : 0 milliseconds
Solving : 0 milliseconds
Overall Solving Time : 0 milliseconds

Found a solution```

Linear Constraints

Woorpje supports constraining the set of potential solutions by specifying constraints between linear arithmetic expressions over the length of the variables. As an example, consider that want solution to `aX = Yb` where `X` is longer than 5, we can add the constraint `LinConstraint: [ >= +1|X|, +5]` to the configuration file. Linear constraints must be specified after the word equations. Giving us the configuration file

```Variables {XY}
Terminals {ab}
Equation: aX = Yb
LinConstraint: [ >= +1|X|, +5]
SatGlucose(10)```
 Tip Linear Constraints are specified in prefix form where the comparison operator (`>=,⇐, >, <`) is first specified and after that the left hand side and then the right hand side of comparison.

The output from Woorpje

```woorpje 0.1 (Mar  4 2019)
Revision: 7621932
woorpje is a word equation solver based on Glucose.
Dependable Systems Group, Kiel Univeristy
Mitja Kulczynski, Danny Boegsted Poulsen and Thorsten Ehlers - {mku,dbp,the}@informatik.uni-kiel.de

SatSolver Ready
c last restart ## conflicts  :  0 12

Substitution:
X: baaaaaaab
Y: abaaaaaaa

Equations after substition
abaaaaaaab == abaaaaaaab

Timing info
Encoding  : 0 milliseconds
Encoding  : 0 milliseconds
Encoding  : 0 milliseconds
Solving : 0 milliseconds
Overall Solving Time : 0 milliseconds

Found a solution```

reveals that there is indeed a solution. It also shows, that Woorpje encoded the word equation problem into SAT three times.

Benchmarks

For evaluating the performance of Woorpje, we generated five collections of word equations benchmarks that exhibit a special structure. The equations are solved with Woorpje and state of the art string solvers (CVC4, Z3str, Z3Seq and Norn, Sloth). The benchmarks are available in our Availability section.

For each word equation we ran the solvers and noted if they gave the correct result, if they timed out and we measured the total time for solving all equations. The solvers were allowed to run for 30 seconds.

First Track

The first track (I) is based on completely random generated string, where randomly picked string of arbitrary length were replaced by variables. This guarantees the existence of a solution.The generated word equations have an upper bound of 15 variables, 10 constants, and length 300.

 Program Satis NSatis Error Unknown Timeout Total Time TimeWO woorpje 200 0 0 0 0 200 8.10 8.10 cvc4 182 0 0 0 18 200 543.32 3.32 z3str3 197 0 0 1 2 200 105.07 45.07 z3seq 183 0 0 0 17 200 545.24 35.24 norn 176 4 4 0 20 200 1037.63 437.63 sloth 101 0 0 0 99 200 3658.34 688.34

Second Track

The second track (II) is based on the idea in Proposition 1 of MFCS2017, where the equation

$X_naX_nbX_{n-1}bX_{n-2}\cdots bX_1 = aX_nX_{n-1}X_{n-1}bX_{n-2}X_{n-2}b \cdots b X_{1}X_{1}baa$ is shown to have a minimal solution of exponential length w.r.t. the length of the equation.

 Program Satis NSatis Error Unknown Timeout Total Time TimeWO woorpje 5 0 0 0 4 9 123.85 3.85 cvc4 1 0 0 0 8 9 240.03 0.03 z3str3 0 0 0 9 0 9 0.33 0.33 z3seq 9 0 0 0 0 9 1.81 1.81 norn 0 0 0 0 9 9 270.00 0.00 sloth 7 0 0 0 2 9 124.56 64.56

Third Track

The third track (III) is based on the second track, but each letter $b$ is replaced by the left hand side or the right hand side of some randomly generated word equation (e.g., with the methods from track (I))

 Program Satis NSatis Error Unknown Timeout Total Time TimeWO woorpje 147 42 0 0 11 200 341.74 11.74 cvc4 121 44 0 0 35 200 1055.24 5.24 z3str3 51 48 6 43 58 200 2089.92 349.92 z3seq 82 45 1 0 73 200 2199.99 9.99 norn 60 12 1 0 128 200 4038.83 198.83 sloth 126 7 12 0 67 200 2808.48 798.48

Fourth Track

The fourth track (IV) consists of a system of 100 small random word equations with at most 6 constant letters, 10 variables and length 60. The hard aspect of this track is solving multiple equations at the same time.

 Program Satis NSatis Error Unknown Timeout Total Time TimeWO woorpje 104 92 0 2 2 200 136.20 76.20 cvc4 76 96 0 0 28 200 925.13 85.13 z3str3 79 99 3 10 12 200 490.23 130.23 z3seq 97 97 1 0 6 200 200.09 20.09 norn 16 112 68 0 72 200 3216.95 1056.95 sloth 15 1 0 0 184 200 5615.81 95.81

Fifth Track

The fifth track (V) enriches a system of 30 word equations by suitable linear constraints, as presented in this paper. This track is influenced by the Kaluza benchmark set.

 Program Satis NSatis Error Unknown Timeout Total Time TimeWO woorpje 164 14 0 9 13 200 399.22 9.22 cvc4 156 23 0 0 21 200 635.97 5.97 z3str3 175 23 0 1 1 200 41.52 11.52 z3seq 170 23 0 0 7 200 217.35 7.35 norn 88 103 79 0 9 200 742.34 472.34 sloth 11 0 2 2 187 200 5750.79 140.79

Availability

Binary

We distribute the precompiled binary of Woorpje 0.2 for Linux here.

Benchmarks

The benchmarks in SMT2 and woorpje format are available here: