-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build release wheels for tier 1 platforms with PGO #11502
Conversation
This commit adds the infrastructure to build the wheels we publish to PyPI at release time with PGO. Profile guided optimization (PGO) is a compiler technique that uses collected data about a typical execution to inform optimizations the compiler can make about building the output binary. For more details on PGO you can refer to the Rust [1] and Clang [2] documention on the topic. To start the only data collected is from running the qiskit unittests. This should give us some broad coverage as we'll exercise everything as we do in the unittests. This also means as we grow our Rust usage in Qiskit we'll continue to cover that code in the unittests and ensure we have coverage in PGO too. However, longer term we'll potentially want to look at dedicated PGO scripts that exercise Qiskit at larger scale than what we can do in the unittests. [1] https://doc.rust-lang.org/rustc/profile-guided-optimization.html [2] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
One or more of the the following people are requested to review this:
|
I tested the builds with this PR on my fork here: https://github.com/mtreinish/qiskit-core/actions/runs/7426837267 if people want to inspect the build logs and/or download the artifacts and test the compiled wheels. The arm64 failure is because I ran the test run before #11499 merged. |
Pull Request Test Coverage Report for Build 7630513450Warning: This coverage report may be inaccurate.We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle this looks like a fine way of going about starting PGO to me, and I think it's a good idea to get it in, especially now that QuantumCircuit.data
itself is in Rust space, so it's near 100% guaranteed that anybody interacting with Qiskit will use Rust code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Have you done any performance benchmarking yet to see how much using the unit tests for PGO helps for large circuit construction and transpilation, especially now that the CircuitData
stuff is in?
I haven't done anything super exhaustive. I just ran a quick example script: import time
from qiskit import transpile
from qiskit.providers.fake_provider import FakeSherbrooke
from qiskit.circuit.library import EfficientSU2
import numpy as np
rng = np.random.default_rng(seed=12345678942)
start = time.perf_counter()
qc = EfficientSU2(100, reps=1000)
qc.measure_all()
stop = time.perf_counter()
print(f"Build time: {stop - start}")
b_start = time.perf_counter()
qc.assign_parameters(rng.random(len(qc.parameters)), inplace=True)
b_stop = time.perf_counter()
print(f"Bind time: {b_stop - b_start}")
backend = FakeSherbrooke()
t_start = time.perf_counter()
tqc = transpile(qc, backend, seed_transpiler=12342)
t_stop = time.perf_counter()
print(f"Transpile time: {t_stop - t_start}") and used the PGO wheels from the CI run linked in #11502 (comment) and then built a wheel locally from 654a530 (which was the branch point from main when the PGO test run was performed). With PGO it returned:
while without PGO it returned:
Nothing drastic but it is a second or two faster, I've run it a few times and the speed up is roughly the same each time. I can do more testing if you have something in mind, but it'll be tricky to get this running in asv though because the only mechanism I have to really test is manually with the wheel files from that test CI run. |
2.5% speedup in construction and transpile is nothing to sneeze at, really, and I imagine we'll probably get more benefit as more code gets converted to Rust. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks, let's go with this for the 1.0rc1.
Summary
This commit adds the infrastructure to build the wheels we publish to PyPI at release time with PGO. Profile guided optimization (PGO) is a compiler technique that uses collected data about a typical execution to inform optimizations the compiler can make about building the output binary. For more details on PGO you can refer to the Rust [1] and Clang [2] documentation on the topic.
To start the only data collected is from running the qiskit unittests. This should give us some broad coverage as we'll exercise everything as we do in the unittests. This also means as we grow our Rust usage in Qiskit we'll continue to cover that code in the unittests and ensure we have coverage in PGO too. However, longer term we'll potentially want to look at dedicated PGO scripts that exercise Qiskit at larger scale than what we can do in the unittests.
Details and comments
[1] https://doc.rust-lang.org/rustc/profile-guided-optimization.html
[2] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization