Improve UF implementation #798
Comments
I have spent some more time implementing a better version of this, but hit another blocker. The crux of it is that Alt-Ergo has many copies of the union-find data structure (up to 8 when using Tableaux-CDCL, although at most 5 are used at once). This is mostly not an issue (although it explains some weird behaviors I was seeing), except for one specific use. This use is

The issue with this approach is the part where we copy

The other issue with this approach has nothing to do with persistent union-finds: it means that most of the theory work that Alt-Ergo does is performed twice, once in

This poses the question of when to make boolean decisions versus theory decisions, along with a host of other subtle points to consider, so this becomes a more complex task than just changing the union-find implementation. There is one "quick fix", which would be to locally replay the choices when performing a case split; I will investigate whether its performance is acceptable on the current code, which would allow us to proceed. In any case, moving these decisions towards the SAT solver in some way seems like a good idea.
The Union-Find module is implemented with a mapping from semantic values to their representative. This means that accessing the representative is an O(log n) operation, where n is the number of live semantic values (roughly, the number of distinct terms).

Benchmarking a few problems reveals that Alt-Ergo can easily spend 5-10% of its time just looking for representatives, and on long-running problems where we discover many terms, this can reach upwards of 20%. A more efficient implementation of the Union-Find structure could bring tangible performance benefits: for instance, a properly implemented persistent version of Tarjan's Union-Find data structure would bring the time to access representatives down to (amortized) O(α(n)), i.e. almost constant time.
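To make the idea of a persistent Tarjan-style union-find concrete, here is a minimal sketch of one known technique (Conchon and Filliâtre's persistent union-find built on rerooting persistent arrays). It is not Alt-Ergo's code: the module names `Parray` and `Puf` are hypothetical, and elements are assumed to be plain integers in `0 .. n-1` rather than semantic values.

```ocaml
(* Persistent arrays with rerooting: the most recently accessed version
   owns a plain array; every other version is a chain of diffs to it. *)
module Parray = struct
  type 'a t = 'a data ref
  and 'a data =
    | Arr of 'a array
    | Diff of int * 'a * 'a t

  let init n f = ref (Arr (Array.init n f))

  (* Make [t] the version that owns the underlying array. *)
  let rec reroot t =
    match !t with
    | Arr _ -> ()
    | Diff (i, v, t') ->
      reroot t';
      (match !t' with
       | Arr a as n ->
         let old = a.(i) in
         a.(i) <- v;
         t := n;
         t' := Diff (i, old, t)
       | Diff _ -> assert false)

  let get t i =
    reroot t;
    match !t with
    | Arr a -> a.(i)
    | Diff _ -> assert false

  (* Returns a new version; [t] remains valid and keeps its old contents. *)
  let set t i v =
    reroot t;
    match !t with
    | Arr a as n ->
      let old = a.(i) in
      if old == v then t
      else begin
        a.(i) <- v;
        let res = ref n in
        t := Diff (i, old, res);
        res
      end
    | Diff _ -> assert false
end

(* Union by rank with path compression on top of persistent arrays. *)
module Puf = struct
  type t = {
    mutable parent : int Parray.t; (* mutated only by path compression *)
    rank : int Parray.t;
  }

  let create n =
    { parent = Parray.init n (fun i -> i);
      rank = Parray.init n (fun _ -> 0) }

  (* Returns the compressed parent array and the representative of [i]. *)
  let rec find_aux parent i =
    let p = Parray.get parent i in
    if p = i then (parent, i)
    else
      let parent, r = find_aux parent p in
      (Parray.set parent i r, r)

  let find h i =
    let parent, r = find_aux h.parent i in
    (* Semantically the same mapping, so replacing it in place is safe. *)
    h.parent <- parent;
    r

  (* Returns a new union-find; [h] is still usable afterwards. *)
  let union h x y =
    let rx = find h x in
    let ry = find h y in
    if rx = ry then h
    else
      let kx = Parray.get h.rank rx and ky = Parray.get h.rank ry in
      if kx > ky then { h with parent = Parray.set h.parent ry rx }
      else if kx < ky then { h with parent = Parray.set h.parent rx ry }
      else
        { parent = Parray.set h.parent ry rx;
          rank = Parray.set h.rank rx (kx + 1) }
end
```

In this sketch, `Puf.union` returns a new version while earlier versions stay usable, which is what backtracking needs. The cost is that reading an older version first reroots the underlying array, which takes time proportional to the number of updates separating it from the version currently owning the array, so the near-constant bound only holds when a single version is used at a time.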
(Note that these numbers include both the term → semantic value and semantic value → representative mappings, and are thus slightly inflated; in reality, they are probably closer to 3-8% and 15%. An improved union-find representation would only affect the semantic value → representative mapping, not the term → semantic value mapping. However, the term → semantic value mapping is independent from the environment (it always maps t to X.make t), and may no longer be needed at all now that we have the global cache on X.make.)

After trying such an implementation, there are a few caveats:
Ccx and Theory modules.