Avoid exploring extraneous minima in the cut-finder search space #585
Conversation
Pull Request Test Coverage Report for Build 9083375618 (💛 Coveralls)
@@ -149,6 +149,8 @@ class BestFirstSearch:

    ``stop_at_first_min`` (Boolean) is a flag that indicates whether or not to
    stop the search after the first minimum-cost goal state has been reached.
    In the absence of any QPD assignments, it always makes sense to stop once
What does "QPD assignments" mean?
Thanks for raising this. The way Ed's overall code was designed was that in the first stage, it tries to find locations for all the cuts (assuming everything is just LO, since that provides a cost (overhead) upper bound), and then, in the second stage, assigns the actual QPDs (where the user can choose whether it's instead LOCC, blackbox LO, etc.). The idea behind giving the user this choice was that being able to pick between LO, LOCC, blackbox LO, etc. can depend on the resources that are available (e.g., whether or not CC is possible and whether or not you have access to ancillas). One should also in principle be able to make a different choice for each cut. In this situation, not all sets of cuts found in the first step are created equal; some may be more suitable than others.
Now, when you're only allowing for "vanilla" LO, all you are ever doing is the first step, and so there is no need to explore multiple minima.
In summary, what I meant by "QPD assignments" was any non-LO QPD assignments.
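The behavior under discussion can be illustrated with a small, self-contained sketch of a best-first search with a `stop_at_first_min` flag (hypothetical names and a toy state space; this is not the toolbox's actual `BestFirstSearch` implementation):

```python
import heapq
from typing import Any, Callable, Iterable


def best_first_search(
    start: Any,
    expand: Callable[[Any], Iterable[Any]],
    cost: Callable[[Any], float],
    is_goal: Callable[[Any], bool],
    stop_at_first_min: bool = True,
) -> list[tuple[float, Any]]:
    """Pop states in order of increasing cost; collect goal states.

    With stop_at_first_min=True the search halts at the first
    minimum-cost goal it reaches; otherwise it keeps popping the queue
    and collects every remaining goal (other minima included).
    """
    counter = 0  # tie-breaker so heapq never compares raw states
    queue = [(cost(start), counter, start)]
    goals: list[tuple[float, Any]] = []
    while queue:
        c, _, state = heapq.heappop(queue)
        if is_goal(state):
            goals.append((c, state))
            if stop_at_first_min:
                break  # first minimum found; skip extraneous minima
            continue
        for nxt in expand(state):
            counter += 1
            heapq.heappush(queue, (cost(nxt), counter, nxt))
    return goals
```

Because the queue is popped in cost order, the first goal reached is guaranteed to be a minimum-cost goal, which is why stopping there is safe when all goal states are interchangeable (the pure-LO case described above).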
@@ -268,7 +269,7 @@ def optimization_pass(

        self.update_minimum_reached(cost)

-       if cost is None or self.cost_bounds_exceeded(cost):
+       if cost is None or self.cost_bounds_exceeded(cost):  # pragma: no cover
Do you understand why this change resulted in this branch losing coverage?
I think in exploring the multiple minima, it was encountering a None state for one of the test cases. I'd have to look into this more to figure out exactly what was going on.
I have added a test for this case (here: c4c2968). The test also allows you to describe the workflow of the optimizer at a more granular level. We force BestFirstSearch.optimize() to run until it encounters a None state, which then exercises this line. I am a little concerned, though, that I overdid this one a little bit and maybe excluding this line from coverage instead would have been okay.
I think it's a good test, thanks
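The testing pattern described above (driving an optimizer step-by-step until it yields None) can be sketched generically; this is not the actual test from c4c2968, and `drive_until_none` and `step` are illustrative names:

```python
from typing import Callable, Optional


def drive_until_none(
    step: Callable[[], Optional[float]], max_iters: int = 100
) -> list[float]:
    """Call an optimizer step function repeatedly, recording each cost,
    until it returns None (exhausted search space)."""
    results: list[float] = []
    for _ in range(max_iters):
        out = step()
        if out is None:
            # This is the branch a coverage-sensitive test wants to hit.
            return results
        results.append(out)
    raise RuntimeError("step never returned None within max_iters")
```

A test built this way asserts on the recorded costs *and* guarantees the None-handling branch is executed, rather than excluding it from coverage with a pragma.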
Co-authored-by: Jim Garrison <[email protected]>
Thanks, looks reasonable to me. 🚀
We could add a release note if we think the cut-finding performance improvement is noticeable enough in general usage.
@@ -190,7 +190,7 @@ def test_four_qubit_circuit_two_qubit_qpu(
        )  # circuit separated into 2 subcircuits.

        assert (
-           optimization_pass.get_stats()["CutOptimization"] == array([15, 46, 15, 6])
+           optimization_pass.get_stats()["CutOptimization"] == array([11, 36, 15, 4])
Is there some sense I can make of these indices? (I think they're indices). How do you know what they should be here?
They're basically just stats that keep track of things like how many states were added to the queue and how many backjumps were performed during the search. These numbers were obtained just by running the search algorithm on these circuits.
OK it's dubious to let the output of your function determine the target values in your unit tests. It's better to manually track down whether these values are correct when making the test case (even if it's just printing a bunch of things out and sanity checking for yourself). I understand the temptation in situations like these, but doing this can give you a false sense of security and defeat the purpose of unit testing.
Once you've satisfied yourself that these values are actually what they should be, you can normally just let the other devs know in code review that you verified this output. They can verify it themselves if they choose.
I think it was partly also because I was not able to come up with a more creative test for this function. I guess for a small enough circuit though, one may be able to predict some of these numbers.
I would probably remove the test if it were my code, since there could very well be a bug here that is treated as ground truth just because that was the output. No one has actually checked that these are the values that should've been returned for this given circuit.
It's kind of outside the scope of this PR, but I just wanted to give my $.02 on that :D.
No, that's actually a good point. I am going to change this test.
Awesome LGTM 🚀
I'll let you decide whether you think a performance release note is worthy here
    assert (
        optimization_pass.get_stats()["CutOptimization"] == array([15, 46, 15, 6])
    ).all()  # matches known stats.
+   assert optimization_pass.get_stats()["CutOptimization"][3] <= settings.max_backjumps
Why only consider the element at index [3]?
It's really the only number (the number of backjumps) that is possible (read: easiest) to predict and constrain.
I see now: get_stats returns an array, but every element of it means something specific. This would have been better as a data structure or namedtuple rather than a numpy array.
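The suggestion above might look something like the following sketch (the field names here are hypothetical; the actual `SearchStats` class added in the PR may name them differently):

```python
from typing import NamedTuple


class SearchStats(NamedTuple):
    """Search statistics with named fields instead of opaque array
    indices, so callers can write stats.backjumps instead of stats[3]."""

    states_visited: int
    next_states_generated: int
    states_queued: int
    backjumps: int


stats = SearchStats(
    states_visited=15,
    next_states_generated=46,
    states_queued=15,
    backjumps=6,
)
```

A NamedTuple keeps tuple semantics (indexing and unpacking still work, so old call sites don't break) while making each position self-documenting.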
'Tis done (837875c). Have also added a release note.
> 'Tis done (837875c). Have also added a release note.
Thank you.
We should probably actually revert this on this branch and then do this as a separate PR. That way, we can backport this current PR since the interface does not change. The improved interface can be done in 0.8.0 with a release note.
Hmm this function isn't actually exposed through the API. Does that matter?
> Hmm this function isn't actually exposed through the API. Does that matter?
OK, in that case back-porting should be fine as is.
Sorry, I thought I saw something previously in the notebook where these same four numbers turned up, but that is either gone now or a phantom memory.
@@ -299,10 +314,10 @@ def minimum_reached(self) -> bool:
        """Return True if the optimization reached a global minimum."""
        return self.min_reached

-   def get_stats(self, penultimate: bool = False) -> np.typing.NDArray[np.int_] | None:
+   def get_stats(self, penultimate: bool = False) -> NamedTuple | None:
Does it complain if you use SearchStats as the return type?
It doesn't, I have changed that here (d931ec6).
Co-authored-by: Jim Garrison <[email protected]>
…Qiskit-Extensions/circuit-knitting-toolbox into avoid-exploring-multiple-minima
@@ -155,10 +152,10 @@ def get_results(self) -> DisjointSubcircuitsState | None:
        """Return the optimization results."""
        return self.best_result

-   def get_stats(self, penultimate=False) -> dict[str, NDArray[np.int_]]:
+   def get_stats(self, penultimate=False) -> dict[str, NamedTuple | None]:
Is the idea that there might one day be more keys in this dict than just CutOptimization?
I wonder if this would be better:

-   def get_stats(self, penultimate=False) -> dict[str, NamedTuple | None]:
+   def get_stats(self, penultimate=False) -> dict[str, Any]:

but the docstring is still a little bit weird, because it talks about the "value" of the dict without referencing what the key(s) are.
> Is the idea that there might one day be more keys in this dict than just CutOptimization?
Yes, that's right.
In that case, I think my suggestion of Any makes the most sense, and maybe edit the docstring for clarity too.
I am happy with this
* Avoid exploring extraneous minima in the search space * fix failing test * fix coverage * black * update doc string * update doc string Co-authored-by: Jim Garrison <[email protected]> * add new tests and modify states check * update test description * style * change to namedtuple, add release note * update return Co-authored-by: Jim Garrison <[email protected]> * change type hints --------- Co-authored-by: Jim Garrison <[email protected]> (cherry picked from commit 2bcbe7f)
… (#588) * Avoid exploring extraneous minima in the search space * fix failing test * fix coverage * black * update doc string * update doc string Co-authored-by: Jim Garrison <[email protected]> * add new tests and modify states check * update test description * style * change to namedtuple, add release note * update return Co-authored-by: Jim Garrison <[email protected]> * change type hints --------- Co-authored-by: Jim Garrison <[email protected]> (cherry picked from commit 2bcbe7f) Co-authored-by: Ibrahim Shehzad <[email protected]>
A certain step in the cut-finder explores multiple minima of the cost function. This step, however, is only relevant when we have LOCC or LO blackbox/parallel gate cut QPD assignments. As such, it is unnecessary for the LO circuit cutting that we currently support. Disabling the search for multiple minima also helps speed up the cut finder. This was especially relevant for certain QAOA circuits that were reported to us, which involved multiple Rzz gates with $\theta = 0$ (in which case the cost of cutting each of these gates is 1). Telling the cut finder to stop at the first minimum stops it from exploring an ever-growing list of states (since no pruning of states can take place when the cost of each additional gate cut just contributes a multiplicative factor of 1 to the overall cost). This PR aims to fix all of this, simply by setting the default value of certain stop_at_first_min flags to True.
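The pruning failure described above can be made concrete with a small sketch. Assuming (as commonly stated for LO gate cutting) that the sampling-overhead factor for cutting an Rzz(θ) gate scales as γ = 1 + 2|sin θ|, and that the overall cost is the product of per-cut factors, θ = 0 gates contribute a factor of exactly 1 (the function name `rzz_cut_gamma` is illustrative, not from the toolbox):

```python
import math


def rzz_cut_gamma(theta: float) -> float:
    """Assumed LO overhead factor for one cut of an Rzz(theta) gate:
    gamma = 1 + 2*|sin(theta)|. At theta = 0 this is exactly 1."""
    return 1.0 + 2.0 * abs(math.sin(theta))


# Cutting many theta = 0 Rzz gates multiplies the total cost by 1 each
# time, so the total never grows and cost-based pruning never triggers:
total = 1.0
for theta in [0.0, 0.0, 0.0]:
    total *= rzz_cut_gamma(theta)
```

Because every candidate cut set has the same (unit) cost, the search cannot discard any of them on cost grounds, and without `stop_at_first_min=True` the queue of equivalent states keeps growing.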