Use Rust gates for 2q unitary synthesis #12740

mtreinish · 2024-07-08T21:13:51Z

Summary

This commit builds off of what #12650 did for the 1q decomposer and moves to using rust gates for the 2q decomposer too. This means that the circuit sequence generation is using rust's StandardGate representation directly instead of relying on mapping strings. For places where circuits are generated (calling TwoQubitWeylDecomposition.circuit() or or TwoQubitBasisDecomposer.__call__ without the use_dag flag) the entire circuit is generated in Rust and returned to Python.

Details and comments

This commit builds off of what Qiskit#12650 did for the 1q decomposer and moves to using rust gates for the 2q decomposer too. This means that the circuit sequence generation is using rust's StandardGate representation directly instead of relying on mapping strings. For places where circuits are generated (calling `TwoQubitWeylDecomposition.circuit()` or or `TwoQubitBasisDecomposer.__call__` without the `use_dag` flag) the entire circuit is generated in Rust and returned to Python.

qiskit-bot · 2024-07-08T21:13:56Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core
@kevinhartman
@levbishop
@mtreinish

coveralls · 2024-07-08T21:38:53Z

Pull Request Test Coverage Report for Build 10066245198

Details

154 of 156 (98.72%) changed or added relevant lines in 8 files are covered.
6 unchanged lines in 2 files lost coverage.
Overall coverage increased (+0.03%) to 89.944%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
crates/accelerate/src/two_qubit_decompose.rs	113	114	99.12%
crates/circuit/src/operations.rs	1	2	50.0%

Files with Coverage Reduction	New Missed Lines	%
crates/accelerate/src/two_qubit_decompose.rs	1	90.25%
crates/qasm2/src/lex.rs	5	92.62%

Totals
Change from base Build 10065639500:	0.03%
Covered Lines:	65846
Relevant Lines:	73208

💛 - Coveralls

jakelishman · 2024-07-09T13:48:02Z

Note: this will conflict with #12730.

mtreinish · 2024-07-11T20:52:18Z

I wrote a little script to test 2q unitary synthesis performance:

import statistics
import time

from qiskit.synthesis.two_qubit import TwoQubitBasisDecomposer
from qiskit.circuit.library import CXGate
from qiskit.quantum_info import random_unitary

decomp = TwoQubitBasisDecomposer(CXGate(), euler_basis="ZSXX")
unitaries = [random_unitary(4) for _ in range(10000)]
runtimes = []
for mat in unitaries:
    start = time.perf_counter()
    decomp(mat)
    stop = time.perf_counter()
    runtimes.append(stop - start)
print(f"Mean runtime per unitary: {statistics.geometric_mean(runtimes)} sec.")
print(f"Total runtime for 10k unitary matrices: {sum(runtimes)} sec.")

With 1.1.1 I ran this script on a m1 max mbp with Python 3.12 and it returned:

Mean runtime per unitary: 0.00014786494136577558 sec.
Total runtime for 10k unitary matrices: 1.504167526960373 sec.

Then with this PR applied on the same configuration this script returned:

Mean runtime per unitary: 3.408156984749889e-05 sec.
Total runtime for 10k unitary matrices: 0.361447433475405 sec.

EDIT: Just to showcase how far we've come in Qiskit 1.0.2 running this script returned:

Mean runtime per unitary: 0.00175304350427186 sec.
Total runtime for 10k unitary matrices: 17.648815331980586 sec.

ElePT

I have taken a look and up to what I can see, it makes sense. I am not sure if you'd prefer merging this before or after #12730, so I will not add it to the queue just in case.

ElePT · 2024-07-15T11:41:15Z

crates/accelerate/src/two_qubit_decompose.rs

        let euler_basis: EulerBasis = match euler_basis {
            Some(basis) => EulerBasis::__new__(basis.deref())?,
            None => self.default_euler_basis,
        };
        let target_1q_basis_list: Vec<EulerBasis> = vec![euler_basis];

-        let mut gate_sequence = Vec::new();
+        let mut gate_sequence: WeylCircuitSequence = Vec::with_capacity(21);


Why is the capacity specifically 21?

An arbitrary 2q decomposition is 3 CX plus local operations, which with single-rotation Eulers is up to 3 gates per local per qubit. Naively, this can run to 27 gates (3x CX interspersed into 4x 6 1q local operations), but maybe there's a way through the decomposition that guarantees that the last set of 6 is never needed? I don't have the maths off the top of my head.

I counted it out as 21 for the two qubit decomposer: https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/two_qubit_decompose.rs#L1836-L1842 (but might have miscounted) and was just reusing that number here.

…y-synth

mtreinish · 2024-07-23T20:57:30Z

I have taken a look and up to what I can see, it makes sense. I am not sure if you'd prefer merging this before or after #12730, so I will not add it to the queue just in case.

#12730 has merged now and I've rebased this on top of it now.

jakelishman

This looks generally fine to merge for me (as Elena had already reviewed) - just one minor question that's skippable.

jakelishman · 2024-07-23T21:06:06Z

crates/accelerate/src/two_qubit_decompose.rs

    fn weyl_gate(
        &self,
        simplify: bool,
-        sequence: &mut TwoQubitSequenceVec,
+        sequence: &mut WeylCircuitSequence,
        atol: f64,
        global_phase: &mut f64,
    ) {
        match self.specialization {
            Specialization::MirrorControlledEquiv => {
-                sequence.push(("swap".to_string(), SmallVec::new(), smallvec![0, 1]));
                sequence.push((
-                    "rzz".to_string(),
-                    smallvec![(PI4 - self.c) * 2.],
-                    smallvec![0, 1],
+                    StandardGate::SwapGate,
+                    SmallVec::new(),
+                    smallvec![Qubit(0), Qubit(1)],
+                ));
+                sequence.push((
+                    StandardGate::RZZGate,
+                    smallvec![Param::Float((PI4 - self.c) * 2.)],
+                    smallvec![Qubit(0), Qubit(1)],
                ));
                *global_phase += PI4
            }


I don't mind too much, but is there anything stopping us from skipping the WeylCircuitSequence intermediate allocations, and just directly building a CircuitData?

Not really, I think we could use CircuitData directly here. I don't remember what my thinking was exactly on using the intermediate type here. Maybe I just wanted to use CircuitData::from_standard_gates()?

We can always just look at that in a follow-up.

I did this locally, but let's do this in a follow-up because to do it well I needed to add helper methods to CircuitData to make it easier to manipulate from rust space (unless I missed methods to do the same already) and think it would be good to review those interfaces in isolation (I'm not keen on the clbit interner usage I have there):

diff --git a/crates/accelerate/src/two_qubit_decompose.rs b/crates/accelerate/src/two_qubit_decompose.rs index ed4d32bd3..3a0c395ac 100644 --- a/crates/accelerate/src/two_qubit_decompose.rs +++ b/crates/accelerate/src/two_qubit_decompose.rs @@ -398,8 +398,6 @@ impl Specialization { } } -type WeylCircuitSequence = Vec<(StandardGate, SmallVec<[Param; 3]>, SmallVec<[Qubit; 2]>)>; - #[derive(Clone, Debug)] #[allow(non_snake_case)] #[pyclass(module = "qiskit._accelerate.two_qubit_decompose", subclass)] @@ -430,56 +428,57 @@ impl TwoQubitWeylDecomposition { fn weyl_gate( &self, simplify: bool, - sequence: &mut WeylCircuitSequence, + sequence: &mut CircuitData, atol: f64, global_phase: &mut f64, - ) { + ) -> PyResult<()> { match self.specialization { Specialization::MirrorControlledEquiv => { - sequence.push(( + sequence.push_standard_gate( StandardGate::SwapGate, SmallVec::new(), smallvec![Qubit(0), Qubit(1)], - )); - sequence.push(( + )?; + sequence.push_standard_gate( StandardGate::RZZGate, smallvec![Param::Float((PI4 - self.c) * 2.)], smallvec![Qubit(0), Qubit(1)], - )); + )?; *global_phase += PI4 } Specialization::SWAPEquiv => { - sequence.push(( + sequence.push_standard_gate( StandardGate::SwapGate, SmallVec::new(), smallvec![Qubit(0), Qubit(1)], - )); + )?; *global_phase -= 3. * PI / 4. } _ => { if !simplify || self.a.abs() > atol { - sequence.push(( + sequence.push_standard_gate( StandardGate::RXXGate, smallvec![Param::Float(-self.a * 2.)], smallvec![Qubit(0), Qubit(1)], - )); + )?; } if !simplify || self.b.abs() > atol { - sequence.push(( + sequence.push_standard_gate( StandardGate::RYYGate, smallvec![Param::Float(-self.b * 2.)], smallvec![Qubit(0), Qubit(1)], - )); + )?; } if !simplify || self.c.abs() > atol { - sequence.push(( + sequence.push_standard_gate( StandardGate::RZZGate, smallvec![Param::Float(-self.c * 2.)], smallvec![Qubit(0), Qubit(1)], - )); + )?; } } } + Ok(()) } } @@ -1059,7 +1058,7 @@ impl TwoQubitWeylDecomposition { }; let target_1q_basis_list: Vec<EulerBasis> = vec![euler_basis]; - let mut gate_sequence: WeylCircuitSequence = Vec::with_capacity(21); + let mut gate_sequence = CircuitData::with_capacity(py, 2, 21, Param::Float(0.))?; let mut global_phase: f64 = self.global_phase; let c2r = unitary_to_gate_sequence_inner( @@ -1072,11 +1071,11 @@ impl TwoQubitWeylDecomposition { ) .unwrap(); for gate in c2r.gates { - gate_sequence.push(( + gate_sequence.push_standard_gate( gate.0, gate.1.into_iter().map(Param::Float).collect(), smallvec![Qubit(0)], - )) + )? } global_phase += c2r.global_phase; let c2l = unitary_to_gate_sequence_inner( @@ -1089,11 +1088,11 @@ impl TwoQubitWeylDecomposition { ) .unwrap(); for gate in c2l.gates { - gate_sequence.push(( + gate_sequence.push_standard_gate( gate.0, gate.1.into_iter().map(Param::Float).collect(), smallvec![Qubit(1)], - )) + )? } global_phase += c2l.global_phase; self.weyl_gate( @@ -1101,7 +1100,7 @@ impl TwoQubitWeylDecomposition { &mut gate_sequence, atol.unwrap_or(ANGLE_ZERO_EPSILON), &mut global_phase, - ); + )?; let c1r = unitary_to_gate_sequence_inner( self.K1r.view(), &target_1q_basis_list, @@ -1112,11 +1111,11 @@ impl TwoQubitWeylDecomposition { ) .unwrap(); for gate in c1r.gates { - gate_sequence.push(( + gate_sequence.push_standard_gate( gate.0, gate.1.into_iter().map(Param::Float).collect(), smallvec![Qubit(0)], - )) + )? } global_phase += c2r.global_phase; let c1l = unitary_to_gate_sequence_inner( @@ -1129,13 +1128,14 @@ impl TwoQubitWeylDecomposition { ) .unwrap(); for gate in c1l.gates { - gate_sequence.push(( + gate_sequence.push_standard_gate( gate.0, gate.1.into_iter().map(Param::Float).collect(), smallvec![Qubit(1)], - )) + )? } - CircuitData::from_standard_gates(py, 2, gate_sequence, Param::Float(global_phase)) + gate_sequence.global_phase = Param::Float(global_phase); + Ok(gate_sequence) } } diff --git a/crates/circuit/src/circuit_data.rs b/crates/circuit/src/circuit_data.rs index a325ca4e1..2822fef75 100644 --- a/crates/circuit/src/circuit_data.rs +++ b/crates/circuit/src/circuit_data.rs @@ -96,7 +96,7 @@ pub struct CircuitData { clbits: BitData<Clbit>, param_table: ParamTable, #[pyo3(get)] - global_phase: Param, + pub global_phase: Param, } impl CircuitData { @@ -127,22 +127,8 @@ impl CircuitData { I: IntoIterator<Item = (StandardGate, SmallVec<[Param; 3]>, SmallVec<[Qubit; 2]>)>, { let instruction_iter = instructions.into_iter(); - let mut res = CircuitData { - data: Vec::with_capacity(instruction_iter.size_hint().0), - qargs_interner: IndexedInterner::new(), - cargs_interner: IndexedInterner::new(), - qubits: BitData::new(py, "qubits".to_string()), - clbits: BitData::new(py, "clbits".to_string()), - param_table: ParamTable::new(), - global_phase, - }; - if num_qubits > 0 { - let qubit_cls = QUBIT.get_bound(py); - for _i in 0..num_qubits { - let bit = qubit_cls.call0()?; - res.add_qubit(py, &bit, true)?; - } - } + let mut res = + Self::with_capacity(py, num_qubits, instruction_iter.size_hint().0, global_phase)?; let no_clbit_index = (&mut res.cargs_interner) .intern(InternerKey::Value(Vec::new()))? .index; @@ -164,6 +150,58 @@ impl CircuitData { Ok(res) } + /// Build an empty CircuitData object with an initially allocated instruction capacity + pub fn with_capacity( + py: Python, + num_qubits: u32, + instruction_capacity: usize, + global_phase: Param, + ) -> PyResult<Self> { + let mut res = CircuitData { + data: Vec::with_capacity(instruction_capacity), + qargs_interner: IndexedInterner::new(), + cargs_interner: IndexedInterner::new(), + qubits: BitData::new(py, "qubits".to_string()), + clbits: BitData::new(py, "clbits".to_string()), + param_table: ParamTable::new(), + global_phase, + }; + if num_qubits > 0 { + let qubit_cls = QUBIT.get_bound(py); + for _i in 0..num_qubits { + let bit = qubit_cls.call0()?; + res.add_qubit(py, &bit, true)?; + } + } + Ok(res) + } + + /// Append a standard gate to this CircuitData + pub fn push_standard_gate( + &mut self, + operation: StandardGate, + params: SmallVec<[Param; 3]>, + qargs: SmallVec<[Qubit; 2]>, + ) -> PyResult<()> { + let no_clbit_index = (&mut self.cargs_interner) + .intern(InternerKey::Value(Vec::new()))? + .index; + let params = (!params.is_empty()).then(|| Box::new(params)); + let qubits = (&mut self.qargs_interner) + .intern(InternerKey::Value(qargs.to_vec()))? + .index; + self.data.push(PackedInstruction { + op: operation.into(), + qubits, + clbits: no_clbit_index, + params, + extra_attrs: None, + #[cfg(feature = "cache_pygates")] + py_op: RefCell::new(None), + }); + Ok(()) + } + fn handle_manual_params( &mut self, py: Python,

Yeah, let's follow up. I've tagged for merge.

In the recently merged Qiskit#12740 a path was added for constructing the `CircuitData` in rust when synthesizing to a `QuantumCircuit` directly. However, in that PR this was done through a layer of indirection by first collecting the gates into an intermediate `Vec` and then passing an iterator of that into `CircuitData::from_standard_gates()`. However this resulted in an unecessary allocation for two `Vec`s and it would have been better to just directly construct the `CircuitData` object directly. However, to accomplish this we needed more rust space methods for creating and manipulating a `CircuitData` object as it's mostly being constructed from Python space or using `CircuitData::from_standard_gates()` with an iterator so far. This commit makes those additions and then updates the `TwoQubitWeylDecomposition` code to directly construct the `CircuitData` object instead of using an intermediate `Vec`.

* Use Rust gates for 2q unitary synthesis This commit builds off of what Qiskit#12650 did for the 1q decomposer and moves to using rust gates for the 2q decomposer too. This means that the circuit sequence generation is using rust's StandardGate representation directly instead of relying on mapping strings. For places where circuits are generated (calling `TwoQubitWeylDecomposition.circuit()` or or `TwoQubitBasisDecomposer.__call__` without the `use_dag` flag) the entire circuit is generated in Rust and returned to Python. * Run cargo fmt and black post rebase

* Directly construct CircuitData in TwoQubitWeylDecomposition In the recently merged #12740 a path was added for constructing the `CircuitData` in rust when synthesizing to a `QuantumCircuit` directly. However, in that PR this was done through a layer of indirection by first collecting the gates into an intermediate `Vec` and then passing an iterator of that into `CircuitData::from_standard_gates()`. However this resulted in an unecessary allocation for two `Vec`s and it would have been better to just directly construct the `CircuitData` object directly. However, to accomplish this we needed more rust space methods for creating and manipulating a `CircuitData` object as it's mostly being constructed from Python space or using `CircuitData::from_standard_gates()` with an iterator so far. This commit makes those additions and then updates the `TwoQubitWeylDecomposition` code to directly construct the `CircuitData` object instead of using an intermediate `Vec`. * Use set_global_phase instead of direct assignment * Use slice inputs for push_standard_gate() params and qargs

mtreinish added performance Changelog: None Do not include in changelog synthesis Rust This PR or issue is related to Rust code in the repository mod: transpiler Issues and PRs related to Transpiler labels Jul 8, 2024

mtreinish added this to the 1.2.0 milestone Jul 8, 2024

mtreinish requested review from alexanderivrii, ShellyGarion and a team as code owners July 8, 2024 21:13

ElePT previously approved these changes Jul 15, 2024

View reviewed changes

mtreinish added 2 commits July 23, 2024 16:51

Merge remote-tracking branch 'origin/main' into use-rust-gates-unitar…

14854d0

…y-synth

Run cargo fmt and black post rebase

4dbe881

mtreinish dismissed ElePT’s stale review via 4dbe881 July 23, 2024 20:55

mtreinish requested review from ElePT and jakelishman July 23, 2024 20:56

jakelishman approved these changes Jul 23, 2024

View reviewed changes

jakelishman added this pull request to the merge queue Jul 23, 2024

Merged via the queue into Qiskit:main with commit e362da5 Jul 23, 2024
15 checks passed

mtreinish mentioned this pull request Jul 24, 2024

Directly construct CircuitData in TwoQubitWeylDecomposition #12809

Merged

mtreinish deleted the use-rust-gates-unitary-synth branch September 13, 2024 18:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use Rust gates for 2q unitary synthesis #12740

Use Rust gates for 2q unitary synthesis #12740

mtreinish commented Jul 8, 2024

qiskit-bot commented Jul 8, 2024

coveralls commented Jul 8, 2024 •

edited

Loading

jakelishman commented Jul 9, 2024

mtreinish commented Jul 11, 2024 •

edited

Loading

ElePT left a comment

ElePT Jul 15, 2024

jakelishman Jul 15, 2024

mtreinish Jul 15, 2024

mtreinish commented Jul 23, 2024

jakelishman left a comment

jakelishman Jul 23, 2024

mtreinish Jul 23, 2024

jakelishman Jul 23, 2024

mtreinish Jul 23, 2024

jakelishman Jul 23, 2024

Use Rust gates for 2q unitary synthesis #12740

Use Rust gates for 2q unitary synthesis #12740

Conversation

mtreinish commented Jul 8, 2024

Summary

Details and comments

qiskit-bot commented Jul 8, 2024

coveralls commented Jul 8, 2024 • edited Loading

Pull Request Test Coverage Report for Build 10066245198

Details

💛 - Coveralls

jakelishman commented Jul 9, 2024

mtreinish commented Jul 11, 2024 • edited Loading

ElePT left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mtreinish commented Jul 23, 2024

jakelishman left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coveralls commented Jul 8, 2024 •

edited

Loading

mtreinish commented Jul 11, 2024 •

edited

Loading