Rosette is a solver-aided language. Its documentation has good overview of how it can be used. It has been used to find bug in eBPF, reasoning about SQL equivalence and in many other ways. I have covered Rosette previously in this blog by using it to solve a puzzle. Rosette is implemented in Racket which has extensive support for building programming languages. In this blog, we will try to peek into how Rosette works through following code.
#lang rosette/safe
(define-symbolic x y integer?)
(define (double x) (+ x x))
(define lst
(filter even? (list x y)))
(define sol
(solve (begin (assert (< 0 x))
(assert (< (double x) y))
(assert (= 1 (length lst))))))
(when (sat? sol)
(evaluate (list x y) sol))
Syntax of Rosette is similar to Racket with additional definitions
like define-symbolic
, solve
,
assert
, evaluate
etc. Creating language which
looks like it is breeze in Racket with help of syntax/module-reader,
which Rosette makes use of. Rosette overrides various definitions of
Racket to work with symbolic values. Internally these are defined with
name identical to their Racket counterpart but prefixed with
@
. Above code will look like following after removing that
masking.
#lang racket
(require rosette/base/adt/list)
(require rosette/base/core/bool)
(require rosette/base/core/numerics)
(require rosette/base/core/real)
(require rosette/base/form/define)
(require rosette/query/form)
(require rosette/query/eval)
(require rosette/solver/solution)
(define-symbolic x y @integer?)
(define (double x) (@+ x x))
(define lst
(@filter @even? (@list x y)))
(define sol
(solve (begin (@assert (@< 0 x))
(@assert (@< (double x) y))
(@assert (@= 1 (@length lst))))))
(when (sat? sol)
(evaluate (@list x y) sol))
Rosette lifts some types (integer?
,
boolean?
etc) so that symbolic constant can be defined for
those. Inside Rosette these types (@integer?
,
@boolean?
etc) are represented as struct
which
holds type specific methods like type-cast
,
type-name
and solver specific methods like
solveable-default
, solveable-range
. Since
@integer?
is exposed as integer?
at surface
syntax, it should work as predicate that is (integer? 1)
or
(integer? x)
(x
is symbolic integer) should
result into true. This is achieved through struct-index-field
which allows struct
to be used as function.
To introduce symbolic constant in Rosette
define-symbolic
macro is used.
(define-symbolic x integer?)
expands to
(define-values (x) ((make-const #'x integer?))
. All lifted
definitions do as much as concrete evaluation as possible. For example
(+ 1 2)
evaluates to 3
whereas
(+ 1 x 3)
evaluates to (+ 4 x)
. Further it
reduces symbolic expression in canonical form by sorting subexpressions
based on term order. Term order is natural number which accounts for
term introduction order in program. In above program term order of
x
is 0 whereas it is 1 for y
, making
(+ x y)
canonical form of (+ y x)
. This makes
SMT-Lib encoding small. While generating SMT-Lib Rosette will only
output (define-fun e1 () Int (+ 4 c0))
(c0
is
symbolic integer representing x
in SMT-Lib) even when
surface syntax contains both expressions (+ x 4)
and
(+ 4 x)
. Another way to think about it is reasoning about
their equivalence is pushed up at Rosette level instead of pushing down
to SMT solvers. This allows Rosette to do some optimizations. For
example (if (= (+ x 4) (+ 4 x)) x 4)
reduces to
x
contrast that to naive SMT-Lib encoding which will
generate something like following
(declare-fun c0 () Int)
(define-fun e1 () Int (+ c0 4))
(define-fun e2 () Int (+ 4 c0))
(define-fun e3 () Int (ite (= e1 e2) c0 4))
Practically it means SMT-Lib generated by Rosette is easy to read and understand as it strips away various artificial encodings.
In presence of if
expression program can branch into
different paths depending on predicate. If predicate is concrete value
or can be optimized away to it, it is not problem otherwise Rosette uses
union
to encode it. union
is list of predicate
and expression pair. In our code lst
is a union as
filter
uses if
internally, even?
being its predicate. This is how lst union
looks like.
(union-contents lst)
>> (list
(list (&& (= 0 (remainder x 2)) (= 0 (remainder y 2))) x y)
(list (&& (! (= 0 (remainder x 2))) (! (= 0 (remainder y 2)))))
(list
(|| (&& (! (= 0 (remainder x 2))) (= 0 (remainder y 2)))
(&& (= 0 (remainder x 2)) (! (= 0 (remainder y 2)))))
(ite* (⊢ (&& (! (= 0 (remainder x 2))) (= 0 (remainder y 2))) y)
(⊢ (&& (= 0 (remainder x 2)) (! (= 0 (remainder y 2)))) x))))
In plain english, it is saying that
x
and y
are both even then
lst
is (list x y)
x
and y
are both odd then
lst
is (list)
x
is even and y
is odd then
lst
is (list x)
x
is odd and y
is even then
lst
is (list y)
Rosette maintains that predicates of union are mutually exclusive and
union length is within polynomial bound of program size. From user
perspective lst
is list so they should able to apply list
functions on it.
> (union-contents (cons 1 lst))
(list
(list (&& (= 0 (remainder x 2)) (= 0 (remainder y 2))) 1 x y)
(list (&& (! (= 0 (remainder x 2))) (! (= 0 (remainder y 2)))) 1)
(list
(|| (&& (! (= 0 (remainder x 2))) (= 0 (remainder y 2)))
(&& (= 0 (remainder x 2)) (! (= 0 (remainder y 2)))))
(ite* (⊢ (&& (! (= 0 (remainder x 2))) (= 0 (remainder y 2))) 1 y)
(⊢ (&& (= 0 (remainder x 2)) (! (= 0 (remainder y 2)))) 1 x))))
Sweet, Rosette goes under each guards (predicates) of
union
and applies cons
to underlying
expression. Finally we move to solve
part of program.
Whenever Rosette sees assert
statement it adds
corresponding boolean expression in vc
parameter. In our
program after last assert
statement, vc
contains following asserts
that need to be checked. Since
we haven’t used any assume
in our program
assumes
in vc
is trivial expression
#t
.
(&& (&& (|| (&& (= 0 (remainder x 2)) (= 0 (remainder y 2)))
(&& (! (= 0 (remainder x 2))) (! (= 0 (remainder y 2))))
(|| (&& (! (= 0 (remainder x 2))) (= 0 (remainder y 2)))
(&& (= 0 (remainder x 2)) (! (= 0 (remainder y 2))))))
(&& (< 0 x) (< (+ x x) y)))
(= 1 (ite* (⊢ (&& (= 0 (remainder x 2)) (= 0 (remainder y 2))) 2)
(⊢ (&& (! (= 0 (remainder x 2))) (! (= 0 (remainder y 2)))) 0)
(⊢ (|| (&& (! (= 0 (remainder x 2))) (= 0 (remainder y 2)))
(&& (= 0 (remainder x 2)) (! (= 0 (remainder y 2))))) 1))))
Notice like cons
example above, Rosette strips away
length
function by applying it to expressions of
union
. Combination of union
and concrete
evaluation make it possible for Rosette to avoid encoding list and its
associated axioms in SMT-Lib. On syntax level, Rosette can handle
various ADT like struct
, box
,
vector
etc. But when Rosette emits SMT-Lib it only deals
with simple data types like Int
, Bool
,
BitVec
of SMT-Lib which is possible due to representation
of branch using union
and enough concrete evaluation of
definitions. Inside solve
, Rosette encodes vc
to SMT-Lib, send it to SMT solver and read result back.