
Word eval refactor #1136

Merged
merged 46 commits on Apr 14, 2021
Conversation

robdockins
Contributor

Refactor various aspects of the evaluator, mostly dealing with the packed word representation, but also aiming to remove the eta-delay mechanism, which should no longer be required with the current evaluation semantics. The main trick, I think, is to find a way to implement sufficiently-lazy sequence primitives without also introducing unnecessary unpacking or thunking.

@robdockins
Contributor Author

I think this refactoring has reached a good point to merge. The main goals were to factor out the SeqMap and WordValue types into separable modules with abstract APIs, remove the legacy eta-delay mechanism, and fix such strictness bugs as I could without harming performance. This should help enable the more aggressive performance-related refactoring stages I have in mind, as well as eventually allowing better reuse inside the SAWCore evaluator. As a side benefit, I was able to merge more of the primitive operations onto shared code paths: all the shift operations are now unified, and among the core operations only sequence indexing and update still have separate definitions.

With the performance considerations in mind, this PR contains many small, self-contained commits, and I spent a lot of time performance testing after each change. It should be relatively straightforward to bisect for performance problems inside this PR, as I believe each commit leaves the project in a testable state. In the end, the benchmarks I tried were performance-neutral or slightly faster than they started, with the exception of the Karatsuba example, which slowed down somewhat due to a lazier `#` operator.

Along the way, I generalized the type of the take operation as discussed in issue #1064. This required making take and drop primitive instead of splitAt, which will have minor downstream consequences for cryptol-saw-core.
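To see why making `take` and `drop` primitive (rather than `splitAt`) permits the more general type, here is a minimal sketch over lazy lists. The names `takeSeq`, `dropSeq`, and `splitSeq` are hypothetical stand-ins, not Cryptol's actual evaluator code: the point is that a primitive `take` places no demand on the length of the tail, so it works even on infinite sequences, whereas a `splitAt`-based definition must account for both halves.

```haskell
-- Illustrative only: plain lists stand in for Cryptol sequence values.
takeSeq :: Int -> [a] -> [a]
takeSeq 0 _      = []
takeSeq n (x:xs) = x : takeSeq (n - 1) xs
takeSeq _ []     = []

dropSeq :: Int -> [a] -> [a]
dropSeq 0 xs     = xs
dropSeq n (_:xs) = dropSeq (n - 1) xs
dropSeq _ []     = []

-- splitAt becomes a derived operation rather than the primitive one.
splitSeq :: Int -> [a] -> ([a], [a])
splitSeq n xs = (takeSeq n xs, dropSeq n xs)

main :: IO ()
main = print (takeSeq 3 [0 :: Integer ..])  -- fine on an infinite sequence
```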

The next stages of refactoring are quite a bit more speculative, but should hopefully be largely contained within the SeqMap and WordValue modules.

@robdockins robdockins marked this pull request as ready for review April 1, 2021 18:18
@robdockins robdockins requested review from brianhuffman and yav April 1, 2021 18:18
@robdockins robdockins force-pushed the word-eval-refactor branch from 0c4c025 to d842145 Compare April 1, 2021 18:30
@robdockins
Contributor Author

I think this last series of patches lets us finally close out the remaining known strictness bugs in the evaluator, #640, #619, and #422.

@robdockins robdockins force-pushed the word-eval-refactor branch 2 times, most recently from b39bab1 to 7bcc47e Compare April 6, 2021 17:43
Instead of every `WordValue` being wrapped in the `SEval` monad,
create a separate `ThunkWordVal` constructor that captures exactly
when lazy behavior is needed.  This simplifies several things, and
makes it clearer where we can actually be stricter
in the spine of word values.
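A minimal sketch of this representation change, with simplified stand-in types (the real constructors carry backend-specific words and run in the `SEval` monad, not `IO`):

```haskell
-- Illustrative only: an explicit thunk constructor marks exactly where
-- laziness is needed, instead of wrapping every word in the monad.
data WordValue
  = WordVal Int Integer              -- ^ packed word: bit width and value
  | ThunkWordVal Int (IO WordValue)  -- ^ delayed word of known width

-- The width is available without forcing the thunk, so strictness in
-- the spine becomes a deliberate choice at each use site.
wordWidth :: WordValue -> Int
wordWidth (WordVal w _)      = w
wordWidth (ThunkWordVal w _) = w

forceWordVal :: WordValue -> IO WordValue
forceWordVal (ThunkWordVal _ m) = m >>= forceWordVal
forceWordVal w                  = pure w
```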
It is now lazier in its arguments when required.
This brings the evaluator more in line with the reference semantics,
although there may still be some differences.
into the `Value` module.  Remove most uses of the `WordValue` constructors
in other modules, but don't quite make the type abstract yet.
This allows us to generalize the type of `take` and simplifies
the implementations.
Make sure they are sufficiently lazy in their arguments.
Join is still too eager in some cases, but that will require
a larger refactoring.
Introduce the concept of "index segments", which allows us to
delay deciding if index values should be treated as packed
or unpacked until they are consumed in an indexing operation.
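One way to picture index segments, with hypothetical names (the real segments hold backend words and bits, not plain `Integer` and `Bool`): the index is carried as a list of packed and single-bit pieces, and nothing commits to one numeric view until an indexing operation consumes it.

```haskell
-- Illustrative only: a deferred, mixed packed/unpacked index.
data IndexSegment
  = WordSegment Int Integer  -- ^ a packed run of bits: width and value
  | BitSegment Bool          -- ^ a single (possibly symbolic) bit

-- The packed-vs-unpacked decision happens only here, when an indexing
-- operation finally consumes the segments.
segmentsToInteger :: [IndexSegment] -> Integer
segmentsToInteger = foldl step 0
  where
    step acc (WordSegment w v) = acc * 2 ^ w + v
    step acc (BitSegment b)    = acc * 2 + (if b then 1 else 0)
```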
Memoization now only occurs in the barrel shifter when a symbolic
bit is encountered, instead of at every shift.
This breaks the test for issue211, but we were cheating
on it anyway.
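For context, a barrel shifter decomposes a shift by an n-bit amount into n conditional power-of-two steps. The sketch below is concrete-only and uses rotation over plain lists with hypothetical names; the memoization point above concerns the symbolic case, where an unknown bit forces both branches of a step to be retained.

```haskell
-- Illustrative only: rotate left by k positions.
rotL :: Int -> [a] -> [a]
rotL k xs = drop k' xs ++ take k' xs
  where k' = k `mod` length xs

-- One conditional stage per bit of the shift amount (most significant
-- bit first): a set bit at position k means "also rotate by 2^k".
barrelRotL :: [Bool] -> [a] -> [a]
barrelRotL bits xs = foldl step xs (zip bits [n - 1, n - 2 .. 0])
  where
    n = length bits
    step ys (b, k) = if b then rotL (2 ^ k) ys else ys
```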
Update its type to prepare for upcoming changes.
We should either decide that the semantics of `#` requires it to be
strict in its arguments, or we should fix the definition.
Replace it with `takeWordVal` and `dropWordVal` instead.
This should eliminate the possibility of building up chains
of thunked word values.
This lets us share work, and lets us optimistically treat the word
as packed if we know it has already been forced.  It also lets
us delay unpacking decisions.
With the lazier join operator, this squashes a space leak in examples that
do a lot of hashing.
Also, test for looping computations in addition to `error` computations.
This doesn't let us cheat by eta-expanding the error, as we were
doing before.
I think we've finally cracked the nut on this strictness bug.
Fixes #640
The main benefit of this reorganization is that it
notices when a memoized `SeqMap` has been forced
at all of its locations.  This allows us to
discard the underlying computation, which will
never need to be consulted again.  This, in turn,
allows the garbage collector to reclaim the associated
memory and help prevent certain classes of space leaks.
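A sketch of this scheme using `IORef` and hypothetical names (the real code lives in the `SeqMap` module and runs in `SEval`): the memoized map caches forced entries and tracks which indices remain; once every location has been forced, the generator function is dropped so the garbage collector can reclaim anything it captured.

```haskell
import Data.IORef
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Illustrative only: cached entries plus the not-yet-forced indices
-- and their generator; the generator is discarded once exhausted.
data MemoState a = MemoState
  { cache     :: Map.Map Int a
  , remaining :: Maybe (Set.Set Int, Int -> a)
  }

memoSeqMap :: Int -> (Int -> a) -> IO (Int -> IO a)
memoSeqMap len f = do
  ref <- newIORef (MemoState Map.empty (Just (Set.fromList [0 .. len - 1], f)))
  pure $ \i -> do
    st <- readIORef ref
    case Map.lookup i (cache st) of
      Just v  -> pure v                      -- already forced: hit the cache
      Nothing -> case remaining st of
        Nothing -> error "memoSeqMap: index out of range"
        Just (todo, g) -> do
          let v     = g i
              todo' = Set.delete i todo
              rem'  = if Set.null todo' then Nothing else Just (todo', g)
          writeIORef ref (MemoState (Map.insert i v (cache st)) rem')
          pure v
```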