Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix InverseCancellation runtime performance regression #11278

Closed
wants to merge 1 commit into from

Conversation

mtreinish
Copy link
Member

Summary

In #11211, which was intended to improved the performance of the InverseCancellation pass, actually introduced a regression. This was because of the bugfix that was bundled in that PR to ensure we were checking parameter values for gates were correct. This check was done by doing a direct gate object equality between the self inverse gate selected and the gate object found in a run. This however was unecessary overhead as Instruction.__eq__ has a large amount of overhead to compare gate equality more generally. In the context of a transpiler pass we can make assumptions that any gate with a given name is the same and directly compare the parameters if the names match the inverse pair. This will speed up the equality comparison for inverse pairs and fix the regression.

Details and comments

In Qiskit#11211, which was intended to improved the performance of the
InverseCancellation pass, actually introduced a regression. This was
because of the bugfix that was bundled in that PR to ensure we were
checking parameter values for gates were correct. This check was done by
doing a direct gate object equality between the self inverse gate
selected and the gate object found in a run. This however was unecessary
overhead as `Instruction.__eq__` has a large amount of overhead to
compare gate equality more generally. In the context of a transpiler
pass we can make assumptions that any gate with a given name is the same
and directly compare the parameters if the names match the inverse pair.
This will speed up the equality comparison for inverse pairs and fix the
regression.
@mtreinish mtreinish added performance Changelog: None Do not include in changelog mod: transpiler Issues and PRs related to Transpiler labels Nov 19, 2023
@mtreinish mtreinish added this to the 1.0.0 milestone Nov 19, 2023
@mtreinish mtreinish requested a review from a team as a code owner November 19, 2023 20:19
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

@mtreinish mtreinish changed the title Fix InverseGateCancellation runtime performance regression Fix InverseCancellation runtime performance regression Nov 19, 2023
@coveralls
Copy link

Pull Request Test Coverage Report for Build 6922729181

  • 9 of 11 (81.82%) changed or added relevant lines in 1 file are covered.
  • 5 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.009%) to 85.901%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/transpiler/passes/optimization/inverse_cancellation.py 9 11 81.82%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/expr.rs 1 93.76%
crates/qasm2/src/lex.rs 4 92.68%
Totals Coverage Status
Change from base Build 6911016807: 0.009%
Covered Lines: 65945
Relevant Lines: 76769

💛 - Coveralls

@mtreinish
Copy link
Member Author

I ran asv on the benchmarks that were flagged as causing a regression with this applied:

Benchmarks that have improved:

       before           after         ratio
     [1dd4f549]       [bf0d952b]
     <main>            <fix-perf>
-         517±2ms          464±5ms     0.90  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(14, 'translator')
-      1.08±0.01s          938±1ms     0.87  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(20, 'translator')
-      1.96±0.01s          1.65±0s     0.84  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(27, 'translator')
-         149±1ms          115±1ms     0.77  queko.QUEKOTranspilerBench.time_transpile_bss(1, None)
-         135±2ms          104±3ms     0.76  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'sabre')
-         311±3ms          233±7ms     0.75  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'sabre')
-         152±1ms        113±0.6ms     0.75  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'noise_adaptive')
-       143±0.9ms        104±0.9ms     0.73  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'dense')
-         338±9ms          245±5ms     0.72  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'noise_adaptive')
-      77.9±0.7ms       55.7±0.3ms     0.72  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'sabre')
-      94.1±0.4ms       66.6±0.5ms     0.71  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'noise_adaptive')
-       357±0.9ms          239±8ms     0.67  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'dense')
-      86.4±0.7ms       57.3±0.5ms     0.66  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'dense')
-       311±0.6ms          205±3ms     0.66  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'noise_adaptive')
-       130±0.7ms       82.8±0.7ms     0.64  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'noise_adaptive')
-       118±0.8ms       70.5±0.6ms     0.60  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'sabre')
-       149±0.7ms       87.2±0.5ms     0.58  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'noise_adaptive')
-       130±0.3ms       75.9±0.9ms     0.58  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'sabre')
-         301±2ms          175±2ms     0.58  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'dense')
-       135±0.4ms       76.6±0.6ms     0.57  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'dense')
-       127±0.7ms       72.0±0.4ms     0.57  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'dense')
-         311±2ms          175±2ms     0.56  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'sabre')
-         749±5ms        192±0.4ms     0.26  queko.QUEKOTranspilerBench.time_transpile_bss(1, 'sabre')

Benchmarks that have stayed the same:

       before           after         ratio
     [1dd4f549]       [bf0d952b]
     <main>             <fix-perf>
         22.0±3ms         22.7±3ms     1.03  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(3, 'synthesis')
       18.0±0.3ms         18.6±1ms     1.03  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(2, 'translator')
         229±10ms         236±10ms     1.03  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(8, 'synthesis')
          214±1ms          220±3ms     1.03  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'sabre')
          215±2ms          221±4ms     1.03  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'dense')
          119±1ms          122±1ms     1.03  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'dense')
         60.1±2ms         61.7±7ms     1.03  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(5, 'synthesis')
          413±1ms          424±9ms     1.03  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'sabre')
          296±4ms          303±3ms     1.02  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'noise_adaptive')
          154±1ms          157±2ms     1.02  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'dense')
        122±0.9ms          125±3ms     1.02  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'sabre')
          183±2ms          187±3ms     1.02  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'sabre')
          287±4ms          293±2ms     1.02  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'noise_adaptive')
       71.9±0.3ms         73.3±3ms     1.02  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'noise_adaptive')
          188±1ms          191±7ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'noise_adaptive')
          932±3ms         945±10ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'noise_adaptive')
       7.76±0.1ms       7.84±0.1ms     1.01  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(1, 'translator')
       68.8±0.5ms       69.6±0.4ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'noise_adaptive')
       22.5±0.3ms       22.7±0.3ms     1.01  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(3, 'translator')
       94.2±0.4ms         95.0±2ms     1.01  queko.QUEKOTranspilerBench.time_transpile_bss(0, 'sabre')
       55.6±0.3ms       56.0±0.4ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'sabre')
          494±3ms          498±5ms     1.01  queko.QUEKOTranspilerBench.time_transpile_bss(2, 'sabre')
          366±2ms          369±4ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'noise_adaptive')
       4.25±0.04s        4.28±0.1s     1.01  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(27, 'synthesis')
          749±2ms          754±8ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'noise_adaptive')
        282±0.8ms          284±4ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'dense')
          222±3ms          223±3ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'noise_adaptive')
       47.1±0.5ms       47.4±0.2ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'dense')
          428±2ms          430±5ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'dense')
          971±5ms          974±6ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'dense')
          452±1ms          454±5ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'noise_adaptive')
          529±4ms          531±3ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'sabre')
         93.0±1ms         93.3±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'dense')
        190±0.7ms          191±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'dense')
          220±4ms          221±4ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'noise_adaptive')
          624±3ms          626±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'dense')
        130±0.7ms        130±0.6ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'dense')
          1.44±0s       1.45±0.01s     1.00  queko.QUEKOTranspilerBench.time_transpile_bss(0, None)
       14.4±0.3ms       14.4±0.1ms     1.00  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(2, 'synthesis')
        137±0.9ms          137±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'sabre')
          335±2ms          335±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'sabre')
          287±3ms          287±3ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'dense')
          128±2ms          128±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'noise_adaptive')
          272±2ms          271±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'dense')
        107±0.4ms        106±0.6ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'sabre')
          320±2ms        319±0.9ms     1.00  queko.QUEKOTranspilerBench.time_transpile_bss(2, None)
          785±3ms          783±4ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'sabre')
        138±0.7ms        137±0.9ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'dense')
       70.4±0.7ms       70.2±0.6ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'dense')
       57.8±0.4ms       57.6±0.3ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'dense')
          1.24±0s          1.24±0s     1.00  queko.QUEKOTranspilerBench.time_transpile_bss(3, 'sabre')
        120±0.5ms        120±0.8ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'noise_adaptive')
       58.4±0.8ms       58.1±0.9ms     1.00  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(5, 'translator')
       61.1±0.6ms       60.8±0.4ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'dense')
       79.7±0.3ms       79.2±0.4ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'noise_adaptive')
          251±1ms          250±5ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'sabre')
          172±1ms          171±1ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'noise_adaptive')
          485±6ms          480±6ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'noise_adaptive')
          902±9ms          892±4ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'dense')
      9.38±0.07ms       9.27±0.1ms     0.99  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(1, 'synthesis')
          798±2ms          788±2ms     0.99  queko.QUEKOTranspilerBench.time_transpile_bss(3, None)
       46.5±0.8ms       45.8±0.3ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'sabre')
       58.6±0.5ms       57.8±0.4ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'sabre')
         873±20ms         860±30ms     0.99  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(14, 'synthesis')
       68.7±0.6ms         67.5±2ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'sabre')
       93.0±0.9ms       91.3±0.2ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'sabre')
       57.9±0.7ms       56.5±0.3ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'noise_adaptive')
       1.04±0.02s       1.01±0.01s     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'sabre')
          306±5ms          297±2ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'sabre')
          150±3ms        145±0.6ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'sabre')
       2.14±0.03s       2.07±0.04s     0.97  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(20, 'synthesis')
          106±2ms          102±4ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'noise_adaptive')
          171±1ms        159±0.5ms     0.93  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(8, 'translator')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

@mtreinish
Copy link
Member Author

This should be superseded by #11370 which should fix the performance regression but is more general. I'll run some performance numbers before closing this to confirm the issue is fixed by that instead.

@mtreinish
Copy link
Member Author

Closing this as superseded by: #11370

@mtreinish mtreinish closed this Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog mod: transpiler Issues and PRs related to Transpiler performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants