Allow some graph and digraph methods to take iterables/generators #1292
Pull Request Test Coverage Report for Build 11279812325 (Coveralls)
For consistency I made the same modification to I tweaked the docstrings maybe not enough--they still say
From an ergonomics point of view, this PR is very welcome! From an implementation point of view… I think we need to see if there are performance regressions. It might be faster, it might be slower. We’ll have to check
Makes sense to do more comprehensive testing for performance. In my one-off test*, it looks like this version has the same performance when given an existing list, and using a generator is considerably faster (~2/3 of the time) than instantiating a temporary list.
* comparing the time of
That is a good starting point, I'd say even One thing we have to make sure is separating the time Python takes to create

import timeit

import rustworkx as rx

# create this outside the loop
data = [(i, j) for i in range(1000) for j in range(1000)]
# globals=globals() lets the timed statement see `rx` and `data`
timeit.timeit('g = rx.PyGraph(); g.extend_from_edge_list(data)', number=100, globals=globals())

If you move the data creation to inside the benchmark, you're checking how much time Python takes to build a list, which for some of our users is inevitable if it is an API response or reading from a file.
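To make the point about separating data creation from the measured call concrete, here is a minimal sketch that times three variants: a list built ahead of time, a temporary list built inside the timed call, and a generator. `consume_edges` is a hypothetical pure-Python stand-in for a method such as `extend_from_edge_list`, and `N` is an arbitrary size, so the numbers it prints only illustrate the benchmark structure and say nothing about rustworkx itself.

```python
import timeit


def consume_edges(edges):
    """Hypothetical stand-in for a graph method that consumes (source, target) pairs."""
    count = 0
    for _source, _target in edges:
        count += 1
    return count


N = 200  # arbitrary size chosen for illustration

# Built once, outside the timed call, so list construction is not measured.
prebuilt = [(i, j) for i in range(N) for j in range(N)]


def with_prebuilt_list():
    return consume_edges(prebuilt)


def with_temporary_list():
    # The list is rebuilt on every call, so its construction is measured too.
    return consume_edges([(i, j) for i in range(N) for j in range(N)])


def with_generator():
    # Pairs are produced lazily; no intermediate list is allocated.
    return consume_edges((i, j) for i in range(N) for j in range(N))


timings = {
    fn.__name__: timeit.timeit(fn, number=5)
    for fn in (with_prebuilt_list, with_temporary_list, with_generator)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Swapping `consume_edges` for a real graph method would turn this into the comparison discussed in the thread; timings vary from run to run, so repeating the measurement is advisable.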
- let mut out_list: Vec<usize> = Vec::with_capacity(obj_list.len());
- for obj in obj_list {
+ let mut out_list: Vec<usize> = Vec::new();
+ for py_obj in obj_list.iter()? {
The main other alternatives I want to test are:
- Calling https://docs.rs/pyo3/latest/pyo3/struct.Python.html#method.is_instance first and downcasting to list as a special case
- Trying to extract to a Vec first and then handling the case where it fails with an iterator
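As a rough illustration of the two alternatives, here is a Python analogy of the dispatch logic. It is not the PyO3 code from this PR: the function names are hypothetical, and pre-sizing a Python list only loosely mirrors `Vec::with_capacity` on the Rust side.

```python
def collect_with_list_fast_path(edges):
    """Alternative 1: check for a list first, otherwise fall back to iteration."""
    if isinstance(edges, list):
        # The length is known up front, so the output can be pre-sized.
        out = [(0, 0)] * len(edges)
        for index, (source, target) in enumerate(edges):
            out[index] = (int(source), int(target))
        return out
    out = []
    for source, target in edges:
        out.append((int(source), int(target)))
    return out


def collect_try_sequence_first(edges):
    """Alternative 2: try the sized path first, fall back to an iterator on failure."""
    try:
        # len() raises TypeError for generators, zip and map before any
        # element is consumed, so the fallback still sees every item.
        size = len(edges)
    except TypeError:
        return [(int(source), int(target)) for source, target in edges]
    out = [(0, 0)] * size
    for index, (source, target) in enumerate(edges):
        out[index] = (int(source), int(target))
    return out
```

Both helpers return the same list for any iterable of pairs; they differ only in how they decide whether the size is known in advance.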
I'm not sure what you mean by the first alternative. The second one seems unnecessary given the current performance results. I suspect that PyO3 is basically doing all of this under the hood when you tell it to convert a list into a Vec<(usize, usize)>.
Hmmm, the is_instance method is just the equivalent of isinstance(x, list) in Python. For some reason the documentation link I copied was wrong.
Ah yes, I should have been clearer about the testing (I didn't realize your "we need to check for performance" comment included me 😅). I realized I didn't have matching python versions so I had to do it again anyway. I looked at a few versions with
For 1 and 2, I tried it with this PR and the current release (
I see some variance on my machine when I do this multiple times (I think from multitasking etc) but it's in this ballpark. So it doesn't look like existing code would be negatively affected, and there's an additional capability that can be more efficient.
Oh either of us could have done the testing, sorry for the miscommunication. Regardless, the numbers look great! We need to add tests but that should be straightforward, the hard part is covering each endpoint. One tip I will give is using methods like
Okay added some tests, I think that should cover everything. There are a lot of possible inputs at this point but hopefully they all act the same when it asks for an iterator.
I didn't see a ton of re-use of static objects in the tests, I basically just repeated the patterns I saw there. I'm more familiar with
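One way to check that every kind of input behaves the same is to loop over input factories inside a single test. The sketch below assumes a hypothetical stand-in function, `store_edges`, in place of the real graph methods; the factories matter because generators, `zip` and `map` objects are single-use and must be rebuilt for each check.

```python
import unittest


def store_edges(edges):
    """Hypothetical stand-in for a graph method accepting any iterable of pairs."""
    return [(int(source), int(target)) for source, target in edges]


EDGES = [(0, 1), (1, 2), (2, 0)]

# Each factory returns a fresh input, since iterators are exhausted after one pass.
INPUT_FACTORIES = {
    "list": lambda: list(EDGES),
    "tuple": lambda: tuple(EDGES),
    "generator": lambda: (edge for edge in EDGES),
    "iterator": lambda: iter(EDGES),
    "zip": lambda: zip([0, 1, 2], [1, 2, 0]),
    "map": lambda: map(tuple, [[0, 1], [1, 2], [2, 0]]),
}


class TestIterableInputs(unittest.TestCase):
    def test_all_input_kinds_agree(self):
        expected = store_edges(list(EDGES))
        for name, make_input in INPUT_FACTORIES.items():
            # subTest reports each failing input kind separately.
            with self.subTest(input_kind=name):
                self.assertEqual(store_edges(make_input()), expected)
```

Saved as a module, this can be run with `python -m unittest`; replacing `store_edges` with a call on a real graph object would adapt it to the methods touched by this PR.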
This is excellent, I will take it from here. I will add release notes & refactor the tests slightly in a future PR, but looking forward to releasing this in 0.16!
* Add release notes for #1292
* Add release notes for degree centrality
* Add centrality entries to the docs
* Update releasenotes/notes/accept-generators-31f080871015233c.yaml
  Co-authored-by: Matthew Treinish <[email protected]>
* Update releasenotes/notes/accept-generators-31f080871015233c.yaml
---------
Co-authored-by: Matthew Treinish <[email protected]>
I was using the library and found myself wanting to add nodes and edges from a generator, e.g. g.add_edges_from((i,j,w) for i in ...). It feels wasteful, and just unpythonic/unusual, to have to create a list for that type of thing. So I thought I would try modifying the code to make this possible, and it wasn't too bad.

I tweaked the following methods to take an Iterable rather than a Sequence, which allows the use of generators, including things like zip and map:
- PyGraph.add_nodes_from
- PyGraph.add_edges_from
- PyGraph.add_edges_from_no_data
- PyGraph.extend_from_edge_list
- PyGraph.extend_from_weighted_edge_list
- PyDiGraph

I didn't change the name of the argument, although obj_list isn't really accurate anymore. I figured it might be desirable to keep it the same for compatibility, although since it's a positional-only argument it doesn't matter too much?

If this change is desired, I'll add some tests and update the documentation. I'm also happy to accept advice on how to implement this better--there might be a better way to write the signature such that pyo3 does more of the work for me, but I didn't see it.
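The ergonomic difference can be sketched with a small pure-Python stand-in. `TinyGraph` below is hypothetical and is not the rustworkx implementation; it only shows what a signature typed as Iterable rather than Sequence lets callers pass in. The argument name `obj_list` is kept to match the description above.

```python
from typing import Any, Iterable, List, Tuple


class TinyGraph:
    """Hypothetical stand-in used only to illustrate Iterable-typed arguments."""

    def __init__(self) -> None:
        self.nodes: List[Any] = []
        self.edges: List[Tuple[int, int, Any]] = []

    def add_nodes_from(self, obj_list: Iterable[Any]) -> List[int]:
        start = len(self.nodes)
        self.nodes.extend(obj_list)  # consumes any iterable in a single pass
        return list(range(start, len(self.nodes)))

    def add_edges_from(self, obj_list: Iterable[Tuple[int, int, Any]]) -> List[int]:
        start = len(self.edges)
        for source, target, weight in obj_list:
            self.edges.append((source, target, weight))
        return list(range(start, len(self.edges)))


g = TinyGraph()
node_ids = g.add_nodes_from(name.upper() for name in ("a", "b", "c"))  # generator
g.add_edges_from((i, i + 1, 1.0) for i in range(2))                    # generator
g.add_edges_from(zip([0, 2], [2, 0], [0.5, 0.25]))                     # zip
g.add_edges_from(map(lambda e: (e[0], e[1], None), [(1, 1)]))          # map
```

None of these calls builds an intermediate list; a method that needed `len(obj_list)` up front would reject all four inputs.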