Handle custom operations with overlapping names in QPY #11646
This commit fixes a longstanding bug and limitation in QPY around the use of custom operations. When QPY encounters a non-standard instruction (meaning one that is not defined in Qiskit itself, or that is a direct instance of `Gate` or `Instruction`), it is not able to represent the exact instance perfectly in the payload. Doing so would require serializing code or data for classes that live outside Qiskit, which would be an opportunity for arbitrary code execution. Preventing that is something QPY was designed for (as it's a limitation of pickle serialization).

So to serialize these objects in a circuit, QPY stores as much data as it can extract through the data model of the `Instruction` and `Gate` classes and builds a table of custom instructions that will be recreated as `Instruction` or `Gate` instances on deserialization. In all previous QPY format versions that table was keyed on the `.name` attribute of the custom instructions in the circuit, the thinking being that the name is a unique op code during circuit compilation, so if multiple circuit elements have the same name they should be the same object. This assumption made the QPY payloads generated for circuits with repeated custom instructions smaller and more efficient, because we don't need to store potentially repeated data.

While that assumption holds during most of compilation, it ignored that for a bare quantum circuit (or during the initial stages of compilation) it isn't necessarily true; there are compiler passes that canonicalize custom operations before it becomes true. To make it worse, this assumption caused conflicts in the QPY payload and led to an inaccurate reproduction of the original circuit on deserialization in many cases where custom gates were used.

This commit fixes this limitation by introducing a new QPY format version 11, which changes the key used in the custom instruction table from just the name to `{name}_{uuid}`, where each instance of a custom instruction has a different uuid value. This means that there are never any overlapping definitions in the table and we don't have to worry about the case where two custom instructions have the same name.

Fixes Qiskit#8941
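To make the collision concrete, here is a minimal pure-Python sketch of the two keying schemes. It is not QPY's actual code: the function names, the string "definitions", and the choice that the first definition wins on a name clash are all assumptions made for illustration.

```python
import uuid


def build_table_by_name(instructions):
    """Hypothetical stand-in for the old scheme: key the table on name only."""
    table = {}
    for name, definition in instructions:
        # Assumption for this sketch: the first definition seen for a name is
        # kept, so any later, different definition with that name is lost.
        table.setdefault(name, definition)
    return table


def build_table_by_name_and_uuid(instructions):
    """Hypothetical stand-in for the new scheme: key on `{name}_{uuid}`."""
    table = {}
    for name, definition in instructions:
        # Every instance gets its own uuid, so keys never overlap.
        table[f"{name}_{uuid.uuid4()}"] = definition
    return table


# Two different custom gates that happen to share the name "my_gate".
instructions = [
    ("my_gate", "definition A"),
    ("my_gate", "definition B"),
]

by_name = build_table_by_name(instructions)
by_uuid = build_table_by_name_and_uuid(instructions)

print(len(by_name))  # only one entry survives the name clash
print(len(by_uuid))  # both definitions are kept
```

The trade-off described above is visible here too: the uuid-keyed table never merges entries, so repeated identical instructions are no longer deduplicated by name.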
I wrote this with the intent to combine it with #11644 so there is a version kwarg added to some of the write functions which is not set anywhere externally. Depending on whether this or #11644 merges first we'll want to update the other to ensure that when
This looks good to me, it's a bit of a shame that I didn't think of generalizing the fix in #10809 or at least stripping the trailing `_{uuid}` (version 10 outputs beautiful gate names like `ucrz_dg_0d501712-5682-4a9c-99ea-07453a0a3fe1(-π/2)`), but it's a lot better now. I only have two super minor comments. I can approve once the changes from #11644 are merged.
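For reference, stripping a trailing `_{uuid}` from a table key could look roughly like the sketch below. This is only an illustration of the idea mentioned in the comment, not the code in this PR; the helper name `strip_uuid_suffix` and the regex are assumptions.

```python
import re

# Matches "_" followed by a canonical 8-4-4-4-12 lowercase hex UUID at the
# very end of the string.
_UUID_SUFFIX = re.compile(
    r"_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def strip_uuid_suffix(key: str) -> str:
    """Return `key` without a trailing `_{uuid}`, if one is present."""
    return _UUID_SUFFIX.sub("", key)


print(strip_uuid_suffix("ucrz_dg_0d501712-5682-4a9c-99ea-07453a0a3fe1"))  # ucrz_dg
print(strip_uuid_suffix("my_gate"))  # my_gate (no suffix, unchanged)
```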
@@ -138,6 +138,45 @@ def test_empty_layout(self):
        qc._layout = TranspileLayout(None, None, None)
        self.assert_roundtrip_equal(qc)

    def test_overlapping_definitions(self):
Wouldn't it make more sense to have this test be part of `TestLoadFromQPY` (in `test/python/circuit/test_circuit_load_from_qpy.py`) rather than `TestLayout`? Could this be an opportunity to merge both test files under `test/python/qpy`?
It does, but for this PR I just left it as is. I'll push a follow-up PR after this merges that unifies the tests into a single module under `test/python/qpy`.
LGTM!
I disembarked this so that #10998 could go first (because it'll have conflicts with other things), but that's also just joined the queue, so I can put this back in now.