-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add custom iterator class for BFS successors return #185
Conversation
This commit changes the return type of the bfs_successors function to be a custom class BFSSuccessors. This new return class implements both the sequence protocol and iterator protocol. This means that aside from explicit type checking it should be backwards compatible with the list being previously returned. It can be used with either index based access or iterated over. This should be more efficient for large graphs because instead of doing the copy and type conversion and iterating over the entire nested Vec of results it instead does it per access (either via __getitem__ or __next__). It does add a small amount of overhead for smaller graphs but it is minimal since the function returns in microseconds in such cases so a 10-20% overhead is not a big deal. It's worth noting while this defers the type conversion, it does not defer execution like most python iterators normally do. When bfs_successors is called it will still always fully traverse the graph. However, in practice the bottleneck for the bfs_successor function wasn't actually the graph traversal, but instead the type conversion. Related to Qiskit#71
84da212
to
2e1a76a
Compare
Pull Request Test Coverage Report for Build 358619898
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Out of curiosity, would it be more natural to implement either the sequence protocol (keeping the lazy copy and type conversion) or the iterator protocol? Implementing both seems better for backwards compatibility, but it seems like the differences in the two protocols might lead non-obvious behavior (see one comment below.)
That was my first thought on this PR too, and I started with just that. But IIRC the issue wass that just the sequence protocol doesn't actually work with |
Ok, yeah I took a look at this locally and yeah we can remove the iter protocol implementation and it will still work. You'll need to call |
Using the sequence protocol we can still get an implicit iterator by just casting it on the python side. This will still get us the lazy type conversion but simplify the api and also make the behavior more consistent. At the same time to ensure we're handling negative indices correctly a test method is added to verify that a negative index access to the sequence works as expected.
Co-authored-by: Kevin Krsulich <[email protected]>
This commit adds a new return type NodeIndices that is used as the return type for functions that return Vec<usize> when its used as the return type to return a list of node indices. This custom type implements the Python sequence protocol and can be used as the list which was previously returned except for where explicit type checking was used. Just as in Qiskit#185 the advantage here is that for large lists of node indices the type conversion from a Vec<usize> to list(int) becomes a large bottleneck. This avoids that conversion and only does a usize to python int conversion on access. Related to Qiskit#71
* Add NodeIndices return type This commit adds a new return type NodeIndices that is used as the return type for functions that return Vec<usize> when its used as the return type to return a list of node indices. This custom type implements the Python sequence protocol and can be used as the list which was previously returned except for where explicit type checking was used. Just as in #185 the advantage here is that for large lists of node indices the type conversion from a Vec<usize> to list(int) becomes a large bottleneck. This avoids that conversion and only does a usize to python int conversion on access. Related to #71 * Add tests for comparison * Apply suggestions from code review Co-authored-by: Lauren Capelluto <[email protected]>
This commit adds 2 new return types EdgeList and WeightedEdgeList that are used as the return type for functions that return Vec<(usize, usize)> and Vec<(usize, usize, PyObject)> respectively. This custom type implements the Python sequence protocol and can be used as the list which was previously returned except for where explicit type checking was used. Just as in Qiskit#185 and Qiskit#198 the advantage here is that for large edge lists the conversion from Rust to a Python type becomes a large bottleneck. This avoids that conversion and only does it per element on access.
* Add EdgeList and WeightedEdgeList return types This commit adds 2 new return types EdgeList and WeightedEdgeList that are used as the return type for functions that return Vec<(usize, usize)> and Vec<(usize, usize, PyObject)> respectively. This custom type implements the Python sequence protocol and can be used as the list which was previously returned except for where explicit type checking was used. Just as in #185 and #198 the advantage here is that for large edge lists the conversion from Rust to a Python type becomes a large bottleneck. This avoids that conversion and only does it per element on access. * Add new return types to docs * Apply suggestions from code review Co-authored-by: Lauren Capelluto <[email protected]> * Fix docstring mistakes * Update in_edges and out_edges to use WeightedEdgeList
This commit changes the return type of the bfs_successors function to be
a custom class BFSSuccessors. This new return class implements both the
sequence protocol and iterator protocol. This means that aside from
explicit type checking it should be backwards compatible with the list
being previously returned. It can be used with either index based access
or iterated over.
This should be more efficient for large graphs because instead of doing
the copy and type conversion and iterating over the entire nested Vec of
results it instead does it per access (either via
__getitem__
or__next__
). It does add a small amount of overhead for smaller graphs butit is minimal since the function returns in microseconds in such cases
so a 10-20% overhead is not a big dea).
It's worth noting while this defers the type conversion, it does not
defer execution like most python iterators normally do. When
bfs_successors is called it will still always fully traverse the graph.
However, in practice the bottleneck for the bfs_successor function
wasn't actually the graph traversal, but instead the type conversion.
Related to #71