-
Notifications
You must be signed in to change notification settings - Fork 3
Modeling partial maps
One of the main challenges in the design of the language is the semantics of maps and how to model them in an efficient and complete way in the SMT backend.
A map is a collection of (key,value) elements with a fixed domain syntactically determined by the type of the key, when where the permitted operations are:
- create v: creates a collection where every valid key is assigned the value v.
- get m k: k must be a constant, returns the value associated with key k in map m.
- set m k v: k must be a constant, sets the value associated with key k to v in map m.
- map f m: maps the function f over all values in m.
- mapIf p f m: for any (k,v) pair s.t. k satisfies p, the function f is applied to v.
- combine f m1 m2: creates a new map s.t. for any (k,v1) in m1, (k,v2) in m2 ==> (k, f v1 v2) in m.
The way these operations are currently implemented is through unrolling of maps to tuples. This has the advantage that we don't need to do any changes to the SMT encoding, but it leads to various inefficiencies. In particular, all operations require extensive unboxing/unrolling which creates huge expressions that slow down the various analyses NV performs (partial evaluation, compiling to SMT, etc.). For instance the operation m[k<-v] becomes:
match (m1,...,mk...,mn) with | (m1,...,mk,...,mn) -> (m1,...,v,...,mn)
Note the scrutinee in the match expression is unboxed (all tuples are in a normal-form, there are no variables x of type a*b). The expression is not reducible through partial evaluation because m1..mn are most likely variables, and although it would be possible to do this reduction it still costs. Other operations incur the same high cost of allocating long lists of NV expressions.
The current encoding requires that the get operation m[k] uses a constant as a key. It is translated as:
match (m1,...,mk,...,mn) with | (_,...,xk,_,...) -> xk
As the key is a constant, OCaml code computes the right index to project. However, this hinders expressivity. One way to work around is to emit NV code that performs this computation:
match m with
| (x0,...,xn) ->
match k with
| k0 -> x0
| k1 -> x1
...
where k0,k1... are the set of keys in the map. Again, this causes an explosion in the size of the map get expression.
- Use array theory: More suitable for total maps, map operations are natively supported and so expressions don't explode in size. But, complicates encoding of all expressions in the language, map operations cannot be performed over functions that have free variables (maybe this can be worked around?), mapIf will require a quantifier if modeling a total map.
- Use UF: Similar to the array theory in some sense, but we get more control over the representation and they may work better for partial maps.
The big drawback is that none of these can represent maps of complex types (i.e. tuples or options), unless they are broken apart piece-wise (which doesn't seem to scale). They can be used for maps where values are booleans though (i.e. for sets).
Sketch of potential implementation of partial maps using UFs, assuming a fixed domain:
- "Unroll" maps where necessary. For instance, it's not necessary when setting/getting a value but it is when mapping a function (because we have no way to apply a function in the SMT, so it must be inlined):
- create v ~> create v
- get m k ~> get m k
- set m k v ~> set m k v
- map f m ~> {(k1, f (m k1)), (k2, f (m k2)),...}
- mapIf p f m ~> {(k1, if p k1 then f (m k1) else m k1), ...}
- combine f m1 m2: {(k1, f (m k1) (m k2)),...}
- SMT compilation:
- [create v] = new uf g: key -> value; g k1 = v /\ g k2 = v /\ ...
- [get m x] = [m][x]
- [set m ki e] = new uf g: key -> value; g ki = e /\ g k1 = m k1 /\ ...
- [{(k1,e1),...}] = new uf g: key -> value; g k1 = e1 /\ ...
where {(k,v)} is a new map expression in the NV language. The values of the uninterpreted function outside the domain computed by NV are left unspecified to avoid using quantifiers.