Woorpje is a string solver for bounded word equations, i.e., the length of the strings substituting each variable is upper bounded by a given integer. Using the existence of these bounds, and the fact that the alphabet of the strings is finite, allows us to reformulate the question is this equation solvable into a reachability check in a finite automaton. Preliminary experiments with a simplistic reachability search revealed, however, that the search space quickly becomes intractable. Thus, Woorpje translates the Bounded Word Equation problem further into a satisfiability problem of propositional logic, and uses the award-winning SAT-solver Glucose as its SAT-Solving backend.

Using Woorpje:

Woorpje is invoked from the command line with ./woorpje equation.eq where equation.eq is the configuration file for Woorpje.

Simple Example

Using Woorpje to solve the equation aX = Yb is as simple as providing the below configuration file to Woorpje.

Variables {XY}
Terminals {ab}
Equation: aX = Yb
SatGlucose(10)

The lines Variables Terminals and Equation must be written in that order, and they have the obvious semantics. The last line SatGlucose(10) invokes the solver and pass the number 10 as parameter to the solver. The 10 informs the solver to try increase the bound exponentially ten times. If the parameter is 0, Woorpje will guess a suitable value.

Tip
It is possible to specify a system of equations, where all equations must be true, by specifying several Equation lines

As can be seen from the output below, Woorpje finds a solution to the equation where it sets X = b and Y = a. Woorpje also provides a proof that the substitution is correct by outputting the left hand side and right hand side with substituted variables.

woorpje 0.1 (Mar  4 2019)
Revision: 7621932
woorpje is a word equation solver based on Glucose.
Dependable Systems Group, Kiel Univeristy
Mitja Kulczynski, Danny Boegsted Poulsen and Thorsten Ehlers - {mku,dbp,the}@informatik.uni-kiel.de

SatSolver Ready
c last restart ## conflicts  :  0 2


Substitution:
X: b
Y: a

Equations after substition
ab == ab

Timing info
Encoding  : 0 milliseconds
Solving : 0 milliseconds
Overall Solving Time : 0 milliseconds

Found a solution

Linear Constraints

Woorpje supports constraining the set of potential solutions by specifying constraints between linear arithmetic expressions over the length of the variables. As an example, consider that want solution to aX = Yb where X is longer than 5, we can add the constraint LinConstraint: [ >= +1|X|, +5] to the configuration file. Linear constraints must be specified after the word equations. Giving us the configuration file

Variables {XY}
Terminals {ab}
Equation: aX = Yb
LinConstraint: [ >= +1|X|, +5]
SatGlucose(10)
Tip
Linear Constraints are specified in prefix form where the comparison operator (>=,⇐, >, <) is first specified and after that the left hand side and then the right hand side of comparison.

The output from Woorpje

woorpje 0.1 (Mar  4 2019)
Revision: 7621932
woorpje is a word equation solver based on Glucose.
Dependable Systems Group, Kiel Univeristy
Mitja Kulczynski, Danny Boegsted Poulsen and Thorsten Ehlers - {mku,dbp,the}@informatik.uni-kiel.de

SatSolver Ready
c last restart ## conflicts  :  0 12


Substitution:
X: baaaaaaab
Y: abaaaaaaa

Equations after substition
abaaaaaaab == abaaaaaaab

Timing info
Encoding  : 0 milliseconds
Encoding  : 0 milliseconds
Encoding  : 0 milliseconds
Solving : 0 milliseconds
Overall Solving Time : 0 milliseconds

Found a solution

reveals that there is indeed a solution. It also shows, that Woorpje encoded the word equation problem into SAT three times.

Benchmarks

For evaluating the performance of Woorpje, we generated five collections of word equations benchmarks that exhibit a special structure. The equations are solved with Woorpje and state of the art string solvers (CVC4, Z3str, Z3Seq and Norn, Sloth). The benchmarks are available in our Availability section.

For each word equation we ran the solvers and noted if they gave the correct result, if they timed out and we measured the total time for solving all equations. The solvers were allowed to run for 30 seconds.

First Track

The first track (I) is based on completely random generated string, where randomly picked string of arbitrary length were replaced by variables. This guarantees the existence of a solution.The generated word equations have an upper bound of 15 variables, 10 constants, and length 300.

Track I

Program

Satis

NSatis

Error

Unknown

Timeout

Total

Time

TimeWO

woorpje

200

0

0

0

0

200

8.10

8.10

cvc4

182

0

0

0

18

200

543.32

3.32

z3str3

197

0

0

1

2

200

105.07

45.07

z3seq

183

0

0

0

17

200

545.24

35.24

norn

176

4

4

0

20

200

1037.63

437.63

sloth

101

0

0

0

99

200

3658.34

688.34

Second Track

The second track (II) is based on the idea in Proposition 1 of MFCS2017, where the equation

\(X_naX_nbX_{n-1}bX_{n-2}\cdots bX_1 = aX_nX_{n-1}X_{n-1}bX_{n-2}X_{n-2}b \cdots b X_{1}X_{1}baa\) is shown to have a minimal solution of exponential length w.r.t. the length of the equation.

Track 2

Program

Satis

NSatis

Error

Unknown

Timeout

Total

Time

TimeWO

woorpje

5

0

0

0

4

9

123.85

3.85

cvc4

1

0

0

0

8

9

240.03

0.03

z3str3

0

0

0

9

0

9

0.33

0.33

z3seq

9

0

0

0

0

9

1.81

1.81

norn

0

0

0

0

9

9

270.00

0.00

sloth

7

0

0

0

2

9

124.56

64.56

Third Track

The third track (III) is based on the second track, but each letter $b$ is replaced by the left hand side or the right hand side of some randomly generated word equation (e.g., with the methods from track (I))

Track 3

Program

Satis

NSatis

Error

Unknown

Timeout

Total

Time

TimeWO

woorpje

147

42

0

0

11

200

341.74

11.74

cvc4

121

44

0

0

35

200

1055.24

5.24

z3str3

51

48

6

43

58

200

2089.92

349.92

z3seq

82

45

1

0

73

200

2199.99

9.99

norn

60

12

1

0

128

200

4038.83

198.83

sloth

126

7

12

0

67

200

2808.48

798.48

Fourth Track

The fourth track (IV) consists of a system of 100 small random word equations with at most 6 constant letters, 10 variables and length 60. The hard aspect of this track is solving multiple equations at the same time.

Track 4

Program

Satis

NSatis

Error

Unknown

Timeout

Total

Time

TimeWO

woorpje

104

92

0

2

2

200

136.20

76.20

cvc4

76

96

0

0

28

200

925.13

85.13

z3str3

79

99

3

10

12

200

490.23

130.23

z3seq

97

97

1

0

6

200

200.09

20.09

norn

16

112

68

0

72

200

3216.95

1056.95

sloth

15

1

0

0

184

200

5615.81

95.81

Fifth Track

The fifth track (V) enriches a system of 30 word equations by suitable linear constraints, as presented in this paper. This track is influenced by the Kaluza benchmark set.

Track 5

Program

Satis

NSatis

Error

Unknown

Timeout

Total

Time

TimeWO

woorpje

164

14

0

9

13

200

399.22

9.22

cvc4

156

23

0

0

21

200

635.97

5.97

z3str3

175

23

0

1

1

200

41.52

11.52

z3seq

170

23

0

0

7

200

217.35

7.35

norn

88

103

79

0

9

200

742.34

472.34

sloth

11

0

2

2

187

200

5750.79

140.79

Availability

Binary

We distribute the precompiled binary of Woorpje 0.2 for Linux here.

Benchmarks

The benchmarks in SMT2 and woorpje format are available here:

People involved