-
-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements #253
base: ale/3.0
Are you sure you want to change the base?
Conversation
…new vector for each lookup
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## ale/3.0 #253 +/- ##
===========================================
- Coverage 81.12% 78.68% -2.44%
===========================================
Files 18 18
Lines 1499 1539 +40
===========================================
- Hits 1216 1211 -5
- Misses 283 328 +45 ☔ View full report in Codecov by Sentry. |
Benchmark Results
Benchmark PlotsA plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR. |
src/EGraphs/egraph.jl
Outdated
@@ -119,13 +115,15 @@ mutable struct EGraph{ExpressionType,Analysis} | |||
uf::UnionFind | |||
"map from eclass id to eclasses" | |||
classes::Dict{IdKey,EClass{Analysis}} | |||
"vector of the original e-nodes" | |||
nodes::Vector{VecExpr} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if later on we can figure out something more efficient than vectors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this the vector is quite natural.
New enodes are only pushed at the end, and we can simply index enodes by their non-canonical index.
src/EGraphs/egraph.jl
Outdated
vec = get(g.classes_by_op, key, nothing) | ||
if isnothing(vec) | ||
vec = Id[eclass_id] | ||
g.classes_by_op[key] = vec | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not vec = get!(g.classes_by_op, key, Id[eclass_id])
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was that when Id[eclass_id] is supplied as an argument we always create a new vector that we immediately discard when it already exists.
Probably it would be best to use
get!(Vector{Id}, g.classes_by_op, key)
src/EGraphs/egraph.jl
Outdated
(node::VecExpr, eclass_id::Id) = pop!(g.pending) | ||
node = copy(node) | ||
enode_id = pop!(g.pending) | ||
node = copy(g.nodes[enode_id]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the copy necessary here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not. This will still undergo some changes.
src/EGraphs/egraph.jl
Outdated
@@ -42,7 +42,7 @@ they represent. The [`EGraph`](@ref) itself comes with pretty printing for human | |||
struct EClass{D} | |||
id::Id | |||
nodes::Vector{VecExpr} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not also have this be a vector of Id
, if we now store VecExpr
s in nodes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this could be better. I tested this already, and reverted it again because it introduces an awkward indirection of g.nodes[class.nodes[i]]
in several places, most notably the ematching code. Local test showed no performance improvements of only storing the ids here. I'm not yet decided.
My current approach was to store the same nodes objects (VecExpr) in the nodes
vectors of eclasses and the egraph. As far as I understand, we do not need to keep the original (uncanonicalized) enodes.
Id[], | ||
UniqueQueue{Id}(), | ||
Vector{Id}(), | ||
Vector{Id}(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why no more UniqueQueue
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore.
Reasoning:
- We need the nodes and eclass ids in the pending / analysis_pending vectors for rebuilding.
- Nodes may be canonicalized (while they are contained in the pending collections)
- UniqueQueue detects duplicates only when adding new entries, and it would not remove duplicates of nodes after canonicalization.
- the rewrite of the rebuilding code removes duplicates in the pending and analysis_pending vectors after each iteration instead.
@@ -297,14 +295,15 @@ function preprocess(e::Expr) | |||
end | |||
preprocess(x) = x | |||
|
|||
addexpr!(::EGraph, se::EClass) = se.id # TODO: why do we need this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for dynamic rules f(a,b) => a
. This will also work when a
is an EClass
@0x0f0f0f this PR is more exploratory, not yet ready for review. I'll prepare a new clean PR that contains only the necessary changes. However, it would be good if we can base this new PR on #249 which should be merged first. Could we please try to prepare #249 to merge it first, then we can continue with this branch. |
Follow up to #239.
The MT code deviates from most recent egg code slightly since the changes from egraphs-good/egg#291 which mentiones "There are also memory usage/performance advantages since EClass.parents, EGraph.pending, and EGraph.analysis_pending, no longer need to store nodes."
This PR is exploratory. We should split things up in smaller self-contained PRs later.