Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make weak table more deterministic #563

Open
bclement-ocp opened this issue Apr 6, 2023 · 3 comments
Open

Make weak table more deterministic #563

bclement-ocp opened this issue Apr 6, 2023 · 3 comments
Assignees
Labels

Comments

@bclement-ocp
Copy link
Collaborator

Alt-ergo uses a weak hash table to store all the expressions that are created for hash consing.

Unfortunately, this means that the behavior of alt-ergo depends on the behavior of OCaml's GC: if we create term T, then forget about it, and later re-create it, the newly created term will have a different ID depending on whether the GC has run in between.

This manifests as strange bugs where simple code changes can cause issues on problems that are apparently unrelated, because they change the moment where a GC is made.

One suggestion to solve the issue is to keep all the terms in a separate array (with strong refs!), and regularly (e.g. when the number of live terms is "big enough" for some definition of "big enough" that must not be a constant value), clear the array, manually trigger a Gc.full_major () that collects all dead terms, and re-build the arrays with the remaining keys in the weak table.

This means we won't keep terms alive for longer than they need to (or at least, not too much longer), but at least we are in full control of when we allow the GC to collect the terms, and consequently we Alt-Ergo's behavior no longer depends on the GC behavior.

I tried to "deterministically reproduce the non-determinism" by forcing Gc.full_major () often (i.e. every term creation or every 10s of term creation), in the hope that it changes things up, but all I was able to do was make Alt-Ergo slow as mollasses. I plan on implementing the suggestion described above.

@bclement-ocp
Copy link
Collaborator Author

Discussing with @Gbury and @Halbaroth we decided to try the following:

Since one desirable effect of the weak table is for older terms to be sorted before younger terms within a decision branch we can add an equivalent "age" counter in the Union-Find data structure instead of using the tag from hash-consing. Hopefully, this should reduce the impact of the weak table.

Note that this will have different behavior because the weak table also causes terms discovered in a previously closed branch to be already considered old the first time we see them in the current branch, but there does not seem to be very good reasons to behave this way (although thinking about it more, it may be that it has e.g. the effect that often-instantiated terms are considered "old"… but nothing will beat actual perf measurements).

@bclement-ocp
Copy link
Collaborator Author

I thought that this would only impact timeouts but...

[nix-shell:~/Code/alt-ergo]$ ./src/bin/text/Main_text.exe --frontend dolmen /tmp/cos1.ae.zip -t 20 -o smtlib2

; File "/tmp/cos1.ae.zip", line 2789, characters 1-93: Valid (0.4798) (8195 steps) (goal WP_parameter_my_cos1_ensures_default)
; 
unsat

[nix-shell:~/Code/alt-ergo]$ ./src/bin/text/Main_text.exe --frontend dolmen /tmp/cos1.ae.zip -t 20 -o smtlib2 --disable-weak

; File "/tmp/cos1.ae.zip", line 2789, characters 1-93: I don't know (0.5178) (8876 steps) (goal WP_parameter_my_cos1_ensures_default)
; 
unknown

(cos1.ae.zip is approx_cosine_c-Jessie_program-WP_parameter_my_cos1_ensures_default1.ae and this was run on #750 )

@bclement-ocp
Copy link
Collaborator Author

We randomly made progress on this in #1025. Turns out that we use the "weak table" order to determine which variables to fix in the simplex, which can easily snowball by changing the values we select for case splits.

Using the structural ordering instead has mixed results:

  • On benchtop-tests, the difference is +31-66 (corrected to +3-25)
  • On ae-format, the difference is +14-15 (corrected to +3-6)

We need to investigate stability (compare full runs with 4.14 and 5.0); if it actually improves stability significantly, we might feel OK with making this change in spite of the small regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants