Define models for set and multiset (#5963)

## Description This PR adds models for `set` and `multiset` and proves from those models the consistency of the `set`/`multiset` axioms in the Dafny prelude. This is accomplished using the extract-to-Boogie functionality that was added in #5621, which had also defined a model for `seq`. In addition to adding and using these models, the PR * Improves `Source/DafnyCore/Prelude/Makefile` - Put generated files into `Output` folder - Add `SourceModuleName` parameter to `dafny extract SourceModuleName TargetBoogieFile DafnyFiles` - Run extraction during CI and confirm that the checked-in `DafnyPrelude.bpl` has been generated. (This was missing from before, and when I started working on this PR, I found that Dafny had got out of synch with the models and was crashing on them.) * Fixes several occurrences in `DafnyPrelude.bpl` and the `BoogieGenerator` where a `Set#...` symbol was used instead of `ISet#...`. (This was previously undetected, because the `Set` and `ISet` types were both synonyms for `[Box]bool`. * Makes small changes to a few test files to make their proofs more stable. ## Details of the models In order to prove the extensionality property of `set` and `multiset`, it's important to have a canonical representation of sets and multisets in the model. To do this, I postulated (without a supporting model) a total order on the type `Box`. (Such postulates are justified by the Axiom of Choice. That seems okay, since Dafny's logic already allows you to prove the Axiom of Choice.) The aim was for the generated models to be exactly like the handwritten ones. This was not possible, for minor annoying reasons: - The exaction mechanism does not offer a way to output a Boogie type synonym (the handwritten prelude had contained `type Set = [Box]bool;`). - The handwritten declarations sometimes omitted parameter names and sometimes gave one `: Type` to two variables. Boogie allows these, but the extraction mechanism always generates `x: Type` for each parameter and bound variable. If token-by-token equality with the handwritten prelude was non-negotiable, then these minor annoyances could have been fixed by embellishing the extractor. However, there is a good reason to change this part of the axiomatizations. Previously, both `set` and `multiset` had been defined in terms of Boogie's built-in maps. Thus, to really prove the models consistent, one would have to prove that the additional axioms are a conservative extension of Boogie's maps. This is an unnecessary proof burden. So, instead, the new `set` and `multiset` models generate `Set` and `MultiSet` as uninterpreted types. There are three consequences of using uninterpreted types rather than Boogie's built-in map types: - This affects other places in the handwritten `DafnyPrelude.bpl` that assumed maps (for example, `Map#Domain(m)[x]` now is written `Set#IsMember(Map#Domain(m), x)`) and affects what `BoogieGenerator` emits. The PR includes these changes. They have been tested by running the test suite to make sure the correct symbols are emitted. - The `multiset` model used both Boogie's map-select (`_[_]`) operator and its map-store (`_[_:=_]`) operator. Thus, the "select of store" axioms had to be added to the model (where they are proved as lemmas). - The `set` model used only the map-select operator. However, the translation into Boogie also made use of Boogie's map comprehensions (`lambda` binders). This is convenient for `BoogieGenerator`, because Boogie performs lambda lifting on these `lambda` expressions. The right thing to do is to move this lambda lifting into Dafny instead of Boogie, and thus to not make use of Boogie's `lambda` expressions for Dafny set comprehensions. This PR does _not_ do that and instead leaves this as a `FIXME`. Instead, this PR adds a function `Set#FromBoogieMap` function, which converts a Boogie `[Box]bool` map to a `Set`. This is _not_ sound (indeed, this was not sound in the handwritten axioms, either, which I realized as I was going along). The reason is that one can construct a Boogie `[Box]bool` map that gives `true` for an infinite number of `Box`es. As a temporary solution (before we implement the necessary set-comprehension lifting directly in Dafny), this seems alright. First, this is no worse than before, since `lambda` expressions were used before, too. Second, to discover the unsoundness, one needs induction, and neither Boogie nor the SMT solver knows about induciton. Note that `iset` still uses the `[Box]bool` representation in Boogie, since `iset` is not part of this PR. ## Missed opportunities * The parameters and bound variables don't follow a consistent naming convention. When I started writing the models, I tried to follow exactly what was in the handwritten `DafnyPrelude.bpl`, rather than doing such clean-up. However, now that we're changing a lot of things anyway, it would have been nice to have cleaned up those names. * In a similar way, I'm thinking it would be nice to change the capitalization of `MultiSet` to `Multiset`. * There's a predicate `$IsGoodMultiSet` in the model. I _think_ it had been introduced in the handwritten `DafnyPrelude.bpl` to fix a soundness issue with the axioms. Surprisingly, I was able to define it as `true`. In other words, the predicate is effectively useless and the lemmas in the model all hold even if this predicate were removed. That would be nice, because it simplifies Dafny's axiomatization slightly. I did not think about this deeply (because I couldn't find the unsoundness Issue that I seem to recall that `$IsGoodMultiSet` was there to fix), but think the reason it's okay to leave it out now is that the new multiset axioms don't use Boogie's built-in map type and operators. ## Note about reviewing this PR The `set` and `multiset` models comprise more than 2000 lines of Dafny. However, they need not be reviewed line by line, since Dafny checks the proofs. It suffices to review that no cheating is going on (e.g., `assume` statements or body-less lemmas), and this is easy since the `Makefile` invokes `dafny extract` without `--allow-warnings` or the like. It would be nice if I could also say "The definition of the functions and the statement of the lemmas also don't have to be checked in details--it suffices to check that the extracted Boogie declarations are identical to the previous handwritten ones.". Alas, in addition to whitespace changes and the minor annoyances mentioned above, the new axioms use `Set#IsMember`, `MultiSet#Multiplicity`, and `MultiSet#UpdateMultiplicity` in lieu of `_[_]`, `_[_]`, and `_[_:=_]`, respectively. Regrettably, these changes make it much harder to compare the new definitions with the old. If it's of any consolation, I made the whitespace changes are very carefully as I could to avoid errors and did the renamings only after I had made the whitespace changes. Plus, the test suite still passes after these changes. A good thing to think about during the review is if it is really okay that `$IsGoodMultiSet` is defined to be `true`. It is not needed for proving the other axioms. So why had it been introduced in the first place? (I thought it was to fix a previously discovered unsoundness among the `multiset` axioms, but I didn't find the exact Issue and its PR fix.) <small>By submitting this pull request, I confirm that my contribution is made under the terms of the [MIT license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>
dafny-lang · Dec 9, 2024 · 0da35e1 · 0da35e1
1 parent db52871
commit 0da35e1
Show file tree

Hide file tree

Showing 43 changed files with 4,460 additions and 1,693 deletions.
diff --git a/.github/workflows/runtime-tests.yml b/.github/workflows/runtime-tests.yml
@@ -48,6 +48,7 @@ jobs:
           cd ./Source/DafnyCore
           make test
           make check-format
+          make check-extract
       - name: Test DafnyRuntime (C#, Rust)
         run: |
           cd ./Source/DafnyRuntime

diff --git a/Source/DafnyCore/CounterExampleGeneration/DafnyModel.cs b/Source/DafnyCore/CounterExampleGeneration/DafnyModel.cs
@@ -26,7 +26,7 @@ public class DafnyModel {
     public readonly List<PartialState> States = new();
     public static readonly UserDefinedType UnknownType =
       new(new Token(), "?", null);
-    private readonly ModelFuncWrapper fSetSelect, fSeqLength, fSeqIndex, fBox,
+    private readonly ModelFuncWrapper fSelect, fSetMember, fSeqLength, fSeqIndex, fBox,
       fDim, fIndexField, fMultiIndexField, fDtype, fCharToInt, fTag, fBv,
       fNull, fSetUnion, fSetIntersection, fSetDifference, fSetUnionOne,
       fSetEmpty, fSeqEmpty, fSeqBuild, fSeqAppend, fSeqDrop, fSeqTake,
@@ -50,7 +50,8 @@ public DafnyModel(Model model, DafnyOptions options) {
         CoreOptions.TypeEncoding.Arguments => 1,
         _ => 0
       };
-      fSetSelect = ModelFuncWrapper.MergeFunctions(this, new List<string> { "MapType0Select", "MapType1Select" }, 2);
+      fSelect = ModelFuncWrapper.MergeFunctions(this, new List<string> { "MapType0Select", "MapType1Select" }, 2);
+      fSetMember = new ModelFuncWrapper(this, "Set#IsMember", 2, 0);
       fSeqLength = new ModelFuncWrapper(this, "Seq#Length", 1, 0);
       fSeqBuild = new ModelFuncWrapper(this, "Seq#Build", 2, 0);
       fSeqAppend = new ModelFuncWrapper(this, "Seq#Append", 2, 0);
@@ -538,7 +539,7 @@ public void GetExpansion(PartialState state, PartialValue value) {
               return;
             }
 
-            foreach (var tpl in fSetSelect.AppsWithArg(0, value.Element)) {
+            foreach (var tpl in fSetMember.AppsWithArg(0, value.Element)) {
               var setElement = PartialValue.Get(UnboxNotNull(tpl.Args[1]), state);
               var containment = tpl.Result;
               if (containment is Model.Boolean boolean) {
@@ -584,8 +585,8 @@ public void GetExpansion(PartialState state, PartialValue value) {
             var mapDomain = fMapDomain.OptEval(current);
             var mapElements = fMapElements.OptEval(current);
             if (mapDomain != null && mapElements != null) {
-              foreach (var app in fSetSelect.AppsWithArg(0, mapDomain)) {
-                var valueElement = fSetSelect.OptEval(mapElements, app.Args[1]);
+              foreach (var app in fSetMember.AppsWithArg(0, mapDomain)) {
+                var valueElement = fSelect.OptEval(mapElements, app.Args[1]);
                 if (valueElement != null) {
                   valueElement = Unbox(valueElement);
                 }
@@ -615,7 +616,7 @@ public void GetExpansion(PartialState state, PartialValue value) {
             }
 
             var constantFields = GetDestructorFunctions(value.Element).OrderBy(f => f.Name).ToList();
-            var fields = fSetSelect.AppsWithArgs(0, heap, 1, value.Element).ToList();
+            var fields = fSelect.AppsWithArgs(0, heap, 1, value.Element).ToList();
 
             foreach (var fieldFunc in constantFields) {
               if (fieldFunc.OptEval(value.Element) is not { } modelElement) {
@@ -633,7 +634,7 @@ public void GetExpansion(PartialState state, PartialValue value) {
               return;
             }
 
-            foreach (var tpl in fSetSelect.AppsWithArg(0, fields.ToList()[0].Result)) {
+            foreach (var tpl in fSelect.AppsWithArg(0, fields.ToList()[0].Result)) {
               foreach (var fieldName in GetFieldNames(tpl.Args[1])) {
                 if (fieldName != "alloc") {
                   var field = PartialValue.Get(UnboxNotNull(tpl.Result), state);