Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Qiskit native QPY ParameterExpression serialization #13356

Merged
merged 18 commits into from
Nov 6, 2024

Conversation

mtreinish
Copy link
Member

@mtreinish mtreinish commented Oct 22, 2024

Summary

With the release of symengine 0.13.0 we discovered a version dependence on the payload format used for serializing symengine expressions. This was worked around in #13251 but this is not a sustainable solution and only works for symengine 0.11.0 and 0.13.0 (there was no 0.12.0). While there was always the option to use sympy to serialize the underlying symbolic expression (there is a use_symengine flag on qpy.dumps you can set to False to do this) the sympy serialzation has several tradeoffs most importantly is much higher runtime overhead. To solve the issue moving forward a qiskit native representation of the parameter expression object is necessary for serialization.

This commit bumps the QPY format version to 13 and adds a new serialization format for ParameterExpression objects. This new format is a serialization of the API calls made to ParameterExpression that resulted in the creation of the underlying object. To facilitate this the ParameterExpression class is expanded to store an internal "replay" record of the API calls used to construct the ParameterExpression object. This internal list is what gets serialized by QPY and then on deserialization the "replay" is replayed to reconstruct the expression object. This is a different approach to the previous QPY representations of the ParameterExpression objects which instead represented the internal state stored in the ParameterExpression object with the symbolic expression from symengine (or a sympy copy of the expression). Doing this directly in Qiskit isn't viable though because symengine's internal expression tree is not exposed to Python directly. There isn't any method (private or public) to walk the expression tree to construct a serialization format based off of it. Converting symengine to a sympy expression and then using sympy's API to walk the expression tree is a possibility but that would tie us to sympy which would be problematic for #13267 and #13131, have significant runtime overhead, and it would be just easier to rely on sympy's native serialization tools.

The tradeoff with this approach is that it does increase the memory overhead of the ParameterExpression class because for each element in the expression we have to store a record of it. Depending on the depth of the expression tree this also could be a lot larger than symengine's internal representation as we store the raw api calls made to create the ParameterExpression but symengine is likely simplifying it's internal representation as it builds it out. But I personally think this tradeoff is worthwhile as it ties the serialization format to the Qiskit objects and APIs instead of relying on a 3rd party library. This also gives us the flexibility of changing the internal symbolic expression library internally in the future if we decide to stop using symengine at any point.

Details and comments

Fixes #13252

TODO:

  • Add QPY v13 format documentation
  • Add release notes
  • Expand test coverage. We need to ensure we have thorough coverage of all supported expression operations which we weren't previously exercising through QPY.

With the release of symengine 0.13.0 we discovered a version dependence
on the payload format used for serializing symengine expressions. This
was worked around in Qiskit#13251 but this is not a sustainable solution and
only works for symengine 0.11.0 and 0.13.0 (there was no 0.12.0). While
there was always the option to use sympy to serialize the underlying
symbolic expression (there is a `use_symengine` flag on `qpy.dumps` you
can set to `False` to do this) the sympy serialzation has several
tradeoffs most importantly is much higher runtime overhead. To solve
the issue moving forward a qiskit native representation of the parameter
expression object is necessary for serialization.

This commit bumps the QPY format version to 13 and adds a new
serialization format for ParameterExpression objects. This new format
is a serialization of the API calls made to ParameterExpression that
resulted in the creation of the underlying object. To facilitate this
the ParameterExpression class is expanded to store an internal "replay"
record of the API calls used to construct the ParameterExpression
object. This internal list is what gets serialized by QPY and then on
deserialization the "replay" is replayed to reconstruct the expression
object. This is a different approach to the previous QPY representations
of the ParameterExpression objects which instead represented the internal
state stored in the ParameterExpression object with the symbolic
expression from symengine (or a sympy copy of the expression). Doing
this directly in Qiskit isn't viable though because symengine's internal
expression tree is not exposed to Python directly. There isn't any
method (private or public) to walk the expression tree to construct
a serialization format based off of it. Converting symengine to a sympy
expression and then using sympy's API to walk the expression tree is a
possibility but that would tie us to sympy which would be problematic
for Qiskit#13267 and Qiskit#13131, have significant runtime overhead, and it would
be just easier to rely on sympy's native serialization tools.

The tradeoff with this approach is that it does increase the memory
overhead of the `ParameterExpression` class because for each element
in the expression we have to store a record of it. Depending on the
depth of the expression tree this also could be a lot larger than
symengine's internal representation as we store the raw api calls made
to create the ParameterExpression but symengine is likely simplifying
it's internal representation as it builds it out. But I personally think
this tradeoff is worthwhile as it ties the serialization format to the
Qiskit objects instead of relying on a 3rd party library. This also
gives us the flexibility of changing the internal symbolic expression
library internally in the future if we decide to stop using symengine
at any point.

Fixes Qiskit#13252
@mtreinish mtreinish added on hold Can not fix yet priority: high Changelog: New Feature Include in the "Added" section of the changelog Changelog: API Change Include in the "Changed" section of the changelog mod: qpy Related to QPY serialization labels Oct 22, 2024
@mtreinish mtreinish added this to the 1.3.0 milestone Oct 22, 2024
@mtreinish mtreinish requested a review from a team as a code owner October 22, 2024 16:22
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core
  • @mtreinish
  • @nkanazawa1989

@coveralls
Copy link

coveralls commented Oct 22, 2024

Pull Request Test Coverage Report for Build 11708157805

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 272 of 293 (92.83%) changed or added relevant lines in 5 files are covered.
  • 15 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.003%) to 88.786%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/circuit/parameterexpression.py 102 103 99.03%
qiskit/qpy/binary_io/value.py 162 182 89.01%
Files with Coverage Reduction New Missed Lines %
crates/accelerate/src/two_qubit_decompose.rs 1 92.09%
qiskit/qpy/binary_io/value.py 2 87.27%
crates/qasm2/src/lex.rs 6 91.98%
crates/qasm2/src/parse.rs 6 97.62%
Totals Coverage Status
Change from base Build 11706516151: -0.003%
Covered Lines: 77347
Relevant Lines: 87116

💛 - Coveralls

qiskit/qpy/formats.py Outdated Show resolved Hide resolved
@mtreinish
Copy link
Member Author

mtreinish commented Oct 31, 2024

The state of this is it's stuck on the overly complex expression example test I added in: 2b352d7 I'm thinking it might be worth approaching the deserialization in a slightly different approach instead of doing nested expressions recursively like it currently is. If anyone has bandwidth and can come up with a fix before me for that test, feel free to just push to the branch. I think when that test is passing we probably have enough coverage between that new test and the expression tests that already existed for circuits and pulse schedules (we might want to convert those expression tests to circuit equivalents as part of the pulse removal in 2.0).

This commit changes how the deserialization works to use a postfix
stack based approach. Operands are push on the stack and then popped off
based on the operation being run. The result of the operation is then
pushed on the stack. This handles nested objects much more cleanly than
the recursion based approach because we just keep pushing on the stack
instead of recursing, making the accounting much simpler. After the
expression payload is finished being processed there will be a single
value on the stack and that is returned as the final expression.
@mtreinish mtreinish changed the title [WIP] Add Qiskit native QPY ParameterExpression serialization Add Qiskit native QPY ParameterExpression serialization Nov 5, 2024
@mtreinish mtreinish removed the on hold Can not fix yet label Nov 5, 2024
@mtreinish
Copy link
Member Author

Ok, I fixed the issue from last week with: a903387 which switches to a stack based approach instead of a recursive one. I think this should be ready now.

Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mtreinish. After taking a look I think the PR looks good. I think we agree in the need for a Qiskit-native alternative for the symengine serialization even if there are some tradeoffs in terms of memory footprint. I added mostly documentation-related suggestions, some typos and some rewordings, and a few minor comments. The question I left in read_parameter_expression_v4 might be important, but I might have missed something there. Other than that, I was wondering if we should also communicate somehow that use_symengine isn't useful anymore for QPY versions over 13?

qiskit/circuit/parameterexpression.py Outdated Show resolved Hide resolved
qiskit/qpy/__init__.py Outdated Show resolved Hide resolved
qiskit/qpy/__init__.py Outdated Show resolved Hide resolved
qiskit/qpy/__init__.py Outdated Show resolved Hide resolved
qiskit/qpy/__init__.py Outdated Show resolved Hide resolved
qiskit/qpy/binary_io/value.py Show resolved Hide resolved
qiskit/qpy/binary_io/value.py Show resolved Hide resolved
qiskit/qpy/binary_io/value.py Outdated Show resolved Hide resolved
qiskit/qpy/__init__.py Outdated Show resolved Hide resolved
qiskit/qpy/__init__.py Outdated Show resolved Hide resolved
This commit fixes a bug with handling the operand order of subtraction,
division, and exponentiation. These operations are not commutative but
the qpy deserialization code was treating them as such. So in cases
where the argument order was reversed qpy was trying to flip the
operands around for code simplicity and this would result in incorrect
behavior. This commit fixes this by adding explicit op codes for the
reversed sub, div, and pow and preserving the operand order correctly
in these cases.
@mtreinish
Copy link
Member Author

Other than that, I was wondering if we should also communicate somehow that use_symengine isn't useful anymore for QPY versions over 13?

I added a sentence to the qpy.dump function's docstring for the use_symengine argument to say it doesn't do anything in qpy 13. Were you thinking we needed an upgrade release note or something to documente it further?

We can drop the argument in 2.0 potentially since 2.0 will have a minimum compatibility version of qpy format version 13 so nothing >=2.0 emits can support the use_symengine flag. But my worry with that is it is a potentially breaking api change for users that might want to support dumping qpy across code run with multiple qiskit versions. So maybe it's better to just document it as not doing anything in 2.0 and remove it in 3.0. Either way that's something we can debate post 1.3, since we probably should not deprecate it here, and only save that for a future release.

@mtreinish mtreinish requested a review from ElePT November 6, 2024 16:55
Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a sentence to the qpy.dump function's docstring for the use_symengine argument to say it doesn't do anything in qpy 13. Were you thinking we needed an upgrade release note or something to documente it further?

I didn't have a fixed idea, but I think the docstring is a good place for the moment. I also wasn't pushing for a deprecation now, just wanted to raise the point to make sure we didn't forget to deal with it in the future.

Thanks a lot for applying the review comments so fast! I have taken a second look and cannot come up with any other potential issue, so on my side I think the PR is ready to be merged.

@mtreinish mtreinish enabled auto-merge November 6, 2024 17:24
@mtreinish mtreinish added this pull request to the merge queue Nov 6, 2024
Merged via the queue into Qiskit:main with commit 0a7690d Nov 6, 2024
17 checks passed
@mtreinish mtreinish deleted the qpy-paramexpr-native branch November 6, 2024 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: API Change Include in the "Changed" section of the changelog Changelog: New Feature Include in the "Added" section of the changelog mod: qpy Related to QPY serialization priority: high
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce Qiskit native ParameterExpression serialization
5 participants