-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up InverseCancellation when there's nothing to cancel #11211
Speed up InverseCancellation when there's nothing to cancel #11211
Conversation
In the cases when there is nothing to cancel in the DAGCircuit the InverseCancellation pass would iterate over the entire circuit twice, once to search for self inverse gates and then a second time to search for other inverse pairs. This is typically fairly fast because the actual search is done in rust, but there is a python callback function that is called for each node. Depending on the size of the circuit this could add up to a significant amount of time. This commit updates the logic in the pass to first check if there are any operations being asked to cancel for either stage, and if there are then only do the iteration if there are any gates in the circuit in the set of intructions that will be cancelled, which means we'll need to do an iteration. These checks are all O(1) for any sized dag, so they're much lower overhead and will mean the pass executes much faster in the case when there isn't anything to cancel.
One or more of the the following people are requested to review this:
|
Pull Request Test Coverage Report for Build 6803374734
💛 - Coveralls |
Building off of the previous commit this speeds up the inverse cancellation pass when only some of the inverse pairs are not present in the DAG, but others are present. In this case the previous commit would still iterate over the full dag multiple times even when we know some of the inverse pairs are not present in the DAG. This commit updates the logic to not call collect_runs() if we know it's going to be empty or there is no cancellation opportunity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a generally sensible improvement to me.
op_counts = dag.count_ops() | ||
if not self.self_inverse_gate_names.intersection(op_counts): | ||
return dag | ||
# Sets of gate runs by name, for instance: [{(H 0, H 0), (H 1, H 1)}, {(X 0, X 0}] | ||
gate_runs_sets = [dag.collect_runs([gate.name]) for gate in self_inverse_gates] | ||
for gate_runs in gate_runs_sets: | ||
for gate in self.self_inverse_gates: | ||
gate_name = gate.name | ||
gate_count = op_counts.get(gate_name, 0) | ||
if gate_count <= 1: | ||
continue | ||
gate_runs = dag.collect_runs([gate_name]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given we calculate an intersection with op_counts
at the top of the function, it feels like we could skip the continue
logic by iterating over the intersection rather than manually redoing it with retrieving the counts here - that'd also remove the early return, since the for loop would already serve that purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should in this case. I was thinking about it, but iterating over the intersection directly means the loop is iterating over a set and the execution order is potentially non-deterministic. I think this would be problematic from a reproducibility PoV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could always sort the set - it's rather unlikely to be large enough to matter.
op_counts = dag.count_ops() | ||
if not self.inverse_gate_pairs_names.intersection(op_counts): | ||
return dag | ||
|
||
for pair in self.inverse_gate_pairs: | ||
gate_0_name = pair[0].name | ||
gate_1_name = pair[1].name | ||
if gate_0_name not in op_counts or gate_1_name not in op_counts: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment.
This commit adds a fix for an issue that was caught while tuning the performance of the pass. If a parameterized self inverse was passed in the pass would incorrectly treat all instances of that parameterized' gate as being a self inverse without checking the parameter values. This commit corrects this oversight to handle this case to only cancel gates when the optimization is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine, thanks. It feels like the logic in the runners could be a shade simpler than it is, but we don't need to expand this PR to include ever more refactorings.
In Qiskit#11211, which was intended to improved the performance of the InverseCancellation pass, actually introduced a regression. This was because of the bugfix that was bundled in that PR to ensure we were checking parameter values for gates were correct. This check was done by doing a direct gate object equality between the self inverse gate selected and the gate object found in a run. This however was unecessary overhead as `Instruction.__eq__` has a large amount of overhead to compare gate equality more generally. In the context of a transpiler pass we can make assumptions that any gate with a given name is the same and directly compare the parameters if the names match the inverse pair. This will speed up the equality comparison for inverse pairs and fix the regression.
Summary
In the cases when there is nothing to cancel in the DAGCircuit the InverseCancellation pass would iterate over the entire circuit twice, once to search for self inverse gates and then a second time to search for other inverse pairs. This is typically fairly fast because the actual search is done in rust, but there is a python callback function that is called for each node. Depending on the size of the circuit this could add up to a significant amount of time. This commit updates the logic in the pass to first check if there are any operations being asked to cancel for either stage, and if there are then only do the iteration if there are any gates in the circuit in the set of instructions that will be cancelled, which means we'll need to do an iteration. These checks are all O(1) for any sized dag, so they're much lower overhead and will mean the pass executes much faster in the case when there isn't anything to cancel.
Details and comments