-
Notifications
You must be signed in to change notification settings - Fork 185
Reduced time & memory footprint for Tarjans algorithm, fixed a bug where it was O(E^2) on star graphs. #1559
base: master
Are you sure you want to change the base?
Conversation
I worked on improving the strongly_connected_components function in this graph. This PR eliminates several of the arrays that were used for auxiliary data using various tricks inspired by https://homepages.ecs.vuw.ac.nz/~djp/files/IPL15-preprint.pdf , which reduces memory usage & allocations and improves cache efficiency. This seems to lead to a speedup in every category of random graphs in benchmarking. I would also subjectively claim that this version of the algorithm is more easily readable than Tarjan's original version. Also put up a gist of a few alternatives I tried out: https://gist.github.com/saolof/7b5d26a41e6a34ff2b3e76d3fc5541da
spelled function name correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Forgot to change the name of the method back to strongly_connected_components after working on it on my own machine with lightgraphs loaded.
Codecov Report
@@ Coverage Diff @@
## master #1559 +/- ##
==========================================
- Coverage 99.44% 99.31% -0.13%
==========================================
Files 106 106
Lines 5551 5568 +17
==========================================
+ Hits 5520 5530 +10
- Misses 31 38 +7 |
Counting downwards instead of upwards has the advantage that rindex becomes a lookup table for components, if we ever decide to return both. Also makes the algorithm invariant crystal clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it so that the visitation count is downwards and component_count is upwards. This has the advantage that v is in components[rindex[v]] (so rindex is a lookup table for components) if we decide to add a flag to return rindex. Also makes the algorithm invariant clearer.
Hi, thanks for your contribution - It's been a few years since I last looked at Tarjans algorithm, so I will first have to figure out again how it works to do some complete code review. In the meantime, could you add the code you used for benchmarking and the results that you got as a comment to this PR? |
src/connectivity.jl
Outdated
is_unvisited(data::AbstractVector,v::Integer) = iszero(data[v]) | ||
is_unvisited(data::Dict,v) = !haskey(data,v) | ||
|
||
@traitfn function strongly_connected_components(g::AG::IsDirected) where {T, AG <: AbstractGraph{T}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't hurt that you removed the condition that T
is a subtype of Integer
but currently the AbstractGraph
interface does not allow any other eltype
than some Integer
(and they are consecutive from 1 to nv(g)
), so you probably do not need to use a Dict
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. The dict is only ever used as a fallback if T is not an Integer (for example, if someone abused the interface by inheriting from AbstractGraph with T = Symbol), otherwise the code still uses vectors by default and does assume that the nodes are consecutive. The extra lines with the fallbacks are optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I that case, I would add the {T <: Integer}
back - T don't think we should handle the case that the interface is incorrectly implemented as it is not clear, where something could go wrong.
src/connectivity.jl
Outdated
component_count = 1 # Index of the current component being discovered. | ||
# Invariant 1: count is always smaller than component_count. | ||
# Invariant 2: if rindex[v] < component_count, then v is in components[rindex[v]]. | ||
# This trivially lets us tell if a node belongs to a previously discovered scc without any extra bits, just inequalities. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# This trivially lets us tell if a node belongs to a previously discovered scc without any extra bits, just inequalities. | |
# This trivially lets us tell if a vertex belongs to a previously discovered scc without any extra bits, just inequalities. |
For consistency we should always use the term vertex
, although they are synonyms in the literatore
src/connectivity.jl
Outdated
is_unvisited(data::AbstractVector,v::Integer) = iszero(data[v]) | ||
is_unvisited(data::Dict,v) = !haskey(data,v) | ||
|
||
@traitfn function strongly_connected_components(g::AG::IsDirected) where {T, AG <: AbstractGraph{T}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mention some paper where you code your algorithm from. Lets add a reference to that paper to the docstring of that function.
src/connectivity.jl
Outdated
# Invariant 2: if rindex[v] < component_count, then v is in components[rindex[v]]. | ||
# This trivially lets us tell if a node belongs to a previously discovered scc without any extra bits, just inequalities. | ||
|
||
component_root = empty_graph_data(Bool,g) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
component_root = empty_graph_data(Bool,g) | |
is_component_root = empty_graph_data(Bool,g) |
a Dict
of type Dict{T, Bool}
can usually be replaced by a Set{T}
. In this case, you could also think about using a Vector{Bool}
or a BitVector}
although with these you will waste some space in case we only have a few strongly connected components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's using a Vector{Bool} for now whenever T is an Integer. I'm planning to do some benchmarking with bitvector once I have a decent solution to make the loop nonquadratic.
The main issue with using anything that doesn't give you a pointer to a bool is that the boolean gets manipulated in a tight loop which is fast because of branch prediction/speculative execution. But of course adding a variable outside the loop and then storing its result is an option
src/connectivity.jl
Outdated
@@ -252,66 +253,63 @@ function strongly_connected_components end | |||
v = dfs_stack[end] #end is the most recently added item | |||
u = zero_t | |||
@inbounds for v_neighbor in outneighbors(g, v) | |||
if index[v_neighbor] == zero_t | |||
# unvisited neighbor found | |||
if is_unvisited(rindex,v_neighbor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if is_unvisited(rindex,v_neighbor) | |
if is_unvisited(rindex, v_neighbor) |
in general, would suggest to add a space after a comma in arguments lists for better readability and consistency with the other code.
Okay, here's some benchmark results. The first one is the previous library function. The second is what I currently have in this PR. Third one saves the iteration state when vertices become large enough, at the cost of a few extra operations for small vertices. Should I commit the third version since it is still quite competitive but solves the Star graph problem?
|
Ah, I hadn't reenabled
|
Fixed O(|E|^2) performance bug that used to be an issue for star graphs. Minimal change in performance for large random graphs, but significant speedup for graphs that have both lots of SCCs and high node orders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Committed the version that fixes star graph performance. The algorithm should now be provably O(|V| + |E|) for all graphs. Also included the changes raised during the previous review, and added a general description of the algorithm at the top.
Co-authored-by: Simon Schoelly <[email protected]>
Previous commit removed the if in front of iszero somehow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous commit replacing the u== zero_t with iszero(u) removed the if somehow
Slightly simplified logic and removed the need for zero_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly simplified the logic and removed the need for zero_t.
Trying to figure out what broke. Can elements of outneighbours be equal to nothing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debugging latest commit.
testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testing
Set correct name on method
Okay, seems fixed. The only non-passing test in the last run is the diffusion simulation at https://github.com/JuliaGraphs/LightGraphs.jl/blob/master/test/traversals/diffusion.jl#L155 which seems to give false negatives occasionally according to comment. |
Added comments.
src/connectivity.jl
Outdated
|
||
# Required to prevent quadratic time in star graphs without causing type instability. Returns the type of the state object returned by iterate that may be saved to a stack. | ||
neighbor_iter_statetype(::Type{AG}) where {AG <: AbstractGraph} = Any # Analogous to eltype, but for the state of the iterator rather than the elements. | ||
neighbor_iter_statetype(::Type{AG}) where {AG <: LightGraphs.SimpleGraphs.AbstractSimpleGraph} = Int # Since outneighbours is an array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be done more cleanly by making this a @traitfn
and adding a trait like say RandomAccessNeighbours
to denote that the outnodes can be accessed randomly and that state when iterate()
-ing over them is an int.
We could also add a layer of dispatch to strongly connected components so we can feed the graph to the function that infers the state type, and let it attempt to iterate on the first node to get the correct type.
Added a dispatch to infer the types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the fallback more generic by using Base.Iterators.approx_iter_type().
https://gist.github.com/saolof/7b5d26a41e6a34ff2b3e76d3fc5541da
In general, the speedup seems to be particularly big whenever the weak connectivity of the digraph is large (i.e. depending on width vs height of the DFS forest). |
Some more benchmarks with more test cases (updated gist with them):
|
Changed everything to be T-valued. Partly to save space for small graphs represented using smaller types, and partly for correctness on machines where Int = Int32 and where the graph is large enough to require Int64s
I am very ok with this as long as it passes @simonschoelly 's muster. Thank you. |
I stopped reviewing this PR, as you where pushing quite a lot of commits, so I wanted to wait until you get that done. I will try to continue the review then this week. |
Hi all, Thanks for the PR and for the comprehensive review! I'm tracking - let me know when it's ready for a final review and merge. |
Hi all, Just wanted to ping on this. Is it ready for final review? |
Yes. Any further additions I may want to suggest (possibly including making it stable by doing the output in the exterior loop instead of the interior one, or adding some flag to return the rootindex array) would be in a separate PR. This is just a performance improvement that strictly avoids making any user-visible changes other than improving performance in the expensive cases, while enabling later PR's by ensuring that the data structures used internally are also useful to return. |
I worked on improving the performance of the strongly_connected_components function, and on fixing #1560 .
This PR eliminates several of the arrays that were used for auxiliary data using various tricks that were directly inspired by David J. Pearce's preprint over at https://homepages.ecs.vuw.ac.nz/~djp/files/IPL15-preprint.pdf , which reduces memory usage & allocations and improves cache efficiency. In particular, this lead to a significant speedup for dense random graphs in benchmarking. I would also subjectively claim that this version of the algorithm is more easily readable than Tarjan's original version.
In addition to this, it adds a new stack to keep track of the iteration state for nodes with a large number of outneighbours, specifically to fix #1560 . Interestingly, benchmarking revealed that the size threshold above which saving the iteration state is worth doing is surprisingly large (on the order of a thousand outneighbours), though of course that still applies to many real-world graphs.
I also put up a gist that anyone can use for some quick benchmarks:
https://gist.github.com/saolof/7b5d26a41e6a34ff2b3e76d3fc5541da
Furthermore, since it also computes a component table at the same time, adding a flag to return the component table would allow saving some work in other functions that compute one for the SCCs such as https://github.com/JuliaGraphs/LightGraphs.jl/blob/2a644c2b15b444e7f32f73021ec276aa9fc8ba30/src/connectivity.jl#L541 and places where that is called such as when computing transitive closures.