Allow some graph and digraph methods to take iterables/generators #1292

jamestwebber · 2024-10-08T04:04:16Z

I was using the library and found myself wanting to add nodes and edges from a generator, e.g. g.add_edges_from((i,j,w) for i in ...). It feels wasteful, and just unpythonic/unusual, to have to create a list for that type of thing. So I thought I would try modifying the code to make this possible, and it wasn't too bad.

I tweaked the following methods to take an Iterable rather than a Sequence, which allows the use of generators, including things like zip and map:

PyGraph.add_nodes_from
PyGraph.add_edges_from
PyGraph.add_edges_from_no_data
PyGraph.extend_from_edge_list
PyGraph.extend_from_weighted_edge_list
all of these again, for PyDiGraph

I didn't change the name of the argument, although obj_list isn't really accurate anymore. I figured it might desirable to keep it the same for compatibility, although it's a positional-only argument it doesn't matter too much?

I ran rustfmt locally
I have added the tests to cover my changes.
I have updated the documentation accordingly. (only the docstrings right now)
I have read the CONTRIBUTING document.

If this change is desired, I'll add some tests and update the documentation. I'm also happy to accept advice on how to implement this better--there might be a better way to write the signature such that pyo3 does more of the work for me, but I didn't see it.

CLAassistant · 2024-10-08T04:04:22Z

All committers have signed the CLA.

coveralls · 2024-10-08T14:23:39Z

Pull Request Test Coverage Report for Build 11279812325

Details

61 of 61 (100.0%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.002%) to 95.818%

Totals
Change from base Build 11172805539:	0.002%
Covered Lines:	18009
Relevant Lines:	18795

💛 - Coveralls

jamestwebber · 2024-10-08T14:55:44Z

For consistency I made the same modification to remove_nodes_from and remove_edges_from.

I tweaked the docstrings maybe not enough--they still say param list in them.

IvanIsCoding · 2024-10-08T16:40:13Z

From an ergonomics point of view, this PR is very welcome!

From an implementation point of view… I think we need to see if there are performance regressions. It might be faster, it might be slower. We’ll have to check

jamestwebber · 2024-10-08T20:07:13Z

Makes sense to do more comprehensive testing for performance. In my one-off test*, it looks like this version is the same performance when given an existing list, and using a generator is considerably faster (~2/3 of the time) than instantiating a temporary list.

* comparing the time of g.extend_from_edge_list([(i,j) for i in range(1000) for j in range(1000)]) to the generator equivalent.

IvanIsCoding · 2024-10-09T02:38:39Z

Makes sense to do more comprehensive testing for performance. In my one-off test*, it looks like this version is the same performance when given an existing list, and using a generator is considerably faster (~2/3 of the time) than instantiating a temporary list.

comparing the time of g.extend_from_edge_list([(i,j) for i in range(1000) for j in range(1000)]) to the generator equivalent.

That is a good starting point, I'd say even timeit.timeit with a Jupyter notebook get us going. I just want to avoid a situation like #1090 over again

One thing we have to make sure is separating the time Python take to create [(i,j) for i in range(1000) for j in range(1000)] from the benchmark, there might be a syntax error but it looks roughly like this:

import rustworkx as rx
import timeit

# create this outside the loop
data = [(i,j) for i in range(1000) for j in range(1000)]

timeit.timeit('g = rx.Graph(); g.extend_from_edge_list(data)', number=100)

If you move the data creation to inside the benchmark, you're checking how much time Python takes to build a list which for some of our users is inevitable if it is an API response or reading from a file.

IvanIsCoding · 2024-10-09T02:43:10Z

src/graph.rs

-        let mut out_list: Vec<usize> = Vec::with_capacity(obj_list.len());
-        for obj in obj_list {
+        let mut out_list: Vec<usize> = Vec::new();
+        for py_obj in obj_list.iter()? {


The main other alternatives I want to test are:

Calling https://docs.rs/pyo3/latest/pyo3/struct.Python.html#method.is_instance first and downcasting to list as an special case

Try extracting to a Vec first and then handling the case where it fails with an iterator

I'm not sure what you mean with the first alternative. The second one seems unnecessary given the current performance results. I suspect that PyO3 is basically doing all of this under the hood when you tell it to convert a list into a Vec<(usize, usize>).

Hmmm the is_instance method is just the equivalent of isintanceof(x, list) in Python. For some reason the documentation link I copied was wrong

jamestwebber · 2024-10-09T03:30:25Z

If you move the data creation to inside the benchmark, you're checking how much time Python takes to build a list which for some of our users is inevitable if it is an API response or reading from a file.

Ah yes, I should have been clearer about the testing (I didn't realize your "we need to check for performance" comment included me 😅). I realized I didn't have matching python versions so I had to do it again anyway.

I looked at a few versions with timeit, all preceded with g = rx.PyGraph() or g = rx.PyDiGraph():

g.extend_from_edge_list([(i,j) for i in range(1000) for j in range(1000)])
g.extend_from_edge_list(edge_list) when edge_list is already defined.
g.extend_from_edge_list((i,j) for i in range(1000) for j in range(1000))

For 1 and 2, I tried it with this PR and the current release (0.15.1) on 3.11.6. The results for PyGraph (same thing for the other one):

	0.15.1	this PR
1	157 ms ± 3.45 ms	154 ms ± 3.98 ms
2	58.7 ms ± 504 μs	52.1 ms ± 3.18 ms
3	n/a	105 ms ± 2.61 ms

I see some variance on my machine when I do this multiple times (I think from multitasking etc) but it's in this ballpark. So it doesn't look like existing code would be negatively affected, and there's an additional capability that can be more efficient.

IvanIsCoding · 2024-10-09T11:53:30Z

Oh either of us could have done the testing, sorry for the miscommunication.

Regardless, the numbers look great! We need to add tests but that should be straightforward, the hard part is covering each endpoint. One tip I will give is using methods like iter() and reversed() to get iterators from objects we already have in the tests

jamestwebber · 2024-10-10T18:27:28Z

Okay added some tests, I think that should cover everything. There are a lot of possible inputs at this point but hopefully they all act the same when it asks for an iterator.

One tip I will give is using methods like iter() and reversed() to get iterators from objects we already have in the tests

I didn't see a ton of re-use of static objects in the tests, I basically just repeated the patterns I saw there. I'm more familiar with pytest fixtures for that kind of thing--I imagine the tests could be much more compact with those but it's a big refactor.

IvanIsCoding · 2024-10-10T23:59:58Z

This is excellent, I will take it for here. I will add a release notes & refactor the tests slightly in a future PR, but looking forward to release this in 0.16!

* Add release notes for #1292 * Add release notes for degree centrality * Add centrality entries to the docs * Update releasenotes/notes/accept-generators-31f080871015233c.yaml Co-authored-by: Matthew Treinish <[email protected]> * Update releasenotes/notes/accept-generators-31f080871015233c.yaml --------- Co-authored-by: Matthew Treinish <[email protected]>

jamestwebber added 3 commits October 7, 2024 16:30

graph and digraph methods can use iterables

5542716

ran cargo fmt

299897e

linting

032fa6a

jamestwebber changed the title ~~Allow some graph and digraph methods can to iterables/generators~~ Allow some graph and digraph methods to take iterables/generators Oct 8, 2024

remove_edges_from and remove_nodes_from

0f1beec

cleaned up docs

ca0c8f2

jamestwebber marked this pull request as ready for review October 9, 2024 01:55

IvanIsCoding reviewed Oct 9, 2024

View reviewed changes

added tests

218d35a

IvanIsCoding mentioned this pull request Oct 10, 2024

Update PyO3 to 0.22 #1293

Merged

IvanIsCoding approved these changes Oct 11, 2024

View reviewed changes

IvanIsCoding added this pull request to the merge queue Oct 11, 2024

Merged via the queue into Qiskit:main with commit a38ed21 Oct 11, 2024
26 checks passed

jamestwebber deleted the allow-iterable-arguments branch October 11, 2024 02:46

IvanIsCoding mentioned this pull request Nov 18, 2024

Add release notes for 0.16 (and some better documentation) #1319

Closed

IvanIsCoding mentioned this pull request Nov 25, 2024

Expanded grid_graph functionality #1327

Open

IvanIsCoding added a commit to IvanIsCoding/rustworkx that referenced this pull request Dec 4, 2024

Add release notes for Qiskit#1292

2050a6d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow some graph and digraph methods to take iterables/generators #1292

Allow some graph and digraph methods to take iterables/generators #1292

jamestwebber commented Oct 8, 2024 •

edited

Loading

CLAassistant commented Oct 8, 2024 •

edited

Loading

coveralls commented Oct 8, 2024 •

edited

Loading

jamestwebber commented Oct 8, 2024

IvanIsCoding commented Oct 8, 2024

jamestwebber commented Oct 8, 2024

IvanIsCoding commented Oct 9, 2024

IvanIsCoding Oct 9, 2024 •

edited

Loading

jamestwebber Oct 10, 2024

IvanIsCoding Oct 10, 2024

jamestwebber commented Oct 9, 2024

IvanIsCoding commented Oct 9, 2024

jamestwebber commented Oct 10, 2024

IvanIsCoding commented Oct 10, 2024

Allow some graph and digraph methods to take iterables/generators #1292

Allow some graph and digraph methods to take iterables/generators #1292

Conversation

jamestwebber commented Oct 8, 2024 • edited Loading

CLAassistant commented Oct 8, 2024 • edited Loading

coveralls commented Oct 8, 2024 • edited Loading

Pull Request Test Coverage Report for Build 11279812325

Details

💛 - Coveralls

jamestwebber commented Oct 8, 2024

IvanIsCoding commented Oct 8, 2024

jamestwebber commented Oct 8, 2024

IvanIsCoding commented Oct 9, 2024

IvanIsCoding Oct 9, 2024 • edited Loading

Choose a reason for hiding this comment

jamestwebber Oct 10, 2024

Choose a reason for hiding this comment

IvanIsCoding Oct 10, 2024

Choose a reason for hiding this comment

jamestwebber commented Oct 9, 2024

IvanIsCoding commented Oct 9, 2024

jamestwebber commented Oct 10, 2024

IvanIsCoding commented Oct 10, 2024

jamestwebber commented Oct 8, 2024 •

edited

Loading

CLAassistant commented Oct 8, 2024 •

edited

Loading

coveralls commented Oct 8, 2024 •

edited

Loading

IvanIsCoding Oct 9, 2024 •

edited

Loading