Optimized MPI simulator #1515

Merged
merged 11 commits into from
May 12, 2022
8 changes: 8 additions & 0 deletions qiskit/providers/aer/backends/aer_simulator.py
@@ -238,6 +238,13 @@ class AerSimulator(AerBackend):
This option should be set when using option ``blocking_enable=True``
(Default: 0).

* ``chunk_swap_buffer_qubits`` (int): Sets the maximum buffer size, in
qubits (buffer size = 2^chunk_swap_buffer_qubits), used to batch multiple
chunk-swaps across MPI processes. This value should be smaller than
``blocking_qubits``; otherwise batching of multiple chunk-swaps is
disabled. ``blocking_qubits`` - ``chunk_swap_buffer_qubits`` swaps are
applied in a single all-to-all communication. (Default: 15).

* ``batched_shots_gpu`` (bool): This option enables batched execution
of multiple shot simulations on GPU devices for GPU enabled simulation
methods. This optimization is intended for statevector simulations with
@@ -553,6 +560,7 @@ def _default_options(cls):
# cache blocking for multi-GPUs/MPI options
blocking_qubits=None,
blocking_enable=False,
chunk_swap_buffer_qubits=None,
# multi-shots optimization options (GPU only)
batched_shots_gpu=True,
batched_shots_gpu_max_qubits=16,
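
The relation between `blocking_qubits` and `chunk_swap_buffer_qubits` described in the docstring above can be sketched in a few lines of Python. This is only an illustration of the documented arithmetic, not code from the PR: the helper name `chunk_swap_plan` and its return keys are hypothetical, and the unit of the buffer size is left as the docstring gives it.

```python
def chunk_swap_plan(blocking_qubits, chunk_swap_buffer_qubits=15):
    """Summarise the documented relation between the two options.

    Hypothetical helper for illustration only. Returns the buffer size
    (2**chunk_swap_buffer_qubits, unit as in the docstring) and the number
    of swaps grouped into one all-to-all communication, or None for the
    latter when batching is disabled.
    """
    buffer_size = 1 << chunk_swap_buffer_qubits
    if chunk_swap_buffer_qubits < blocking_qubits:
        # Docstring: blocking_qubits - chunk_swap_buffer_qubits swaps
        # are applied in a single all-to-all communication.
        swaps = blocking_qubits - chunk_swap_buffer_qubits
    else:
        # Docstring: otherwise multiple chunk-swaps are disabled.
        swaps = None
    return {"buffer_size": buffer_size, "swaps_per_all_to_all": swaps}


plan = chunk_swap_plan(blocking_qubits=20, chunk_swap_buffer_qubits=15)
disabled = chunk_swap_plan(blocking_qubits=10, chunk_swap_buffer_qubits=15)
print(plan)
print(disabled)
```

With `blocking_qubits=20` and the documented default of 15, five swaps would be grouped per communication; with `blocking_qubits=10` the condition in the docstring is not met and batching is off.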
@@ -0,0 +1,7 @@
---
upgrade:
- |
MPI parallelization for large numbers of qubits is optimized to apply
multiple chunk-swaps as a single all-to-all communication, which can
decrease the amount of data exchanged between MPI processes. This upgrade
improves the scalability of parallelization.
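
One way to see why batching swaps can reduce the data exchanged is a brute-force count on a toy state. The sketch below is not the simulator's implementation: it assumes a state of `num_qubits` qubits split into chunks of `chunk_bits` low-order qubits each, treats a swap as an exchange of two bit positions in the amplitude index, and counts how many amplitudes end up in a different chunk. All function names are hypothetical.

```python
def swap_bits(index, a, b):
    """Return index with bit positions a and b exchanged."""
    if ((index >> a) & 1) != ((index >> b) & 1):
        index ^= (1 << a) | (1 << b)
    return index


def moved_between_chunks(num_qubits, chunk_bits, pairs):
    """Count amplitudes whose chunk changes when all pairs are swapped at once."""
    moved = 0
    for index in range(1 << num_qubits):
        target = index
        for a, b in pairs:
            target = swap_bits(target, a, b)
        # The chunk number is given by the bits above chunk_bits.
        if (target >> chunk_bits) != (index >> chunk_bits):
            moved += 1
    return moved


num_qubits, chunk_bits = 6, 3
pairs = [(0, 3), (1, 4)]  # each pair swaps a qubit inside a chunk with one outside

# Applying the swaps one after another moves data once per swap.
one_by_one = sum(moved_between_chunks(num_qubits, chunk_bits, [p]) for p in pairs)
# Applying both swaps as one combined permutation moves each amplitude at most once.
batched = moved_between_chunks(num_qubits, chunk_bits, pairs)
print(one_by_one, batched)
```

In this toy model each single swap moves half of the amplitudes, while the combined permutation leaves in place every amplitude whose swapped bits already agree, so the batched count is smaller than the sum of the individual counts.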
31 changes: 31 additions & 0 deletions src/simulators/density_matrix/densitymatrix_state.hpp
@@ -313,6 +313,9 @@ class State : public Base::StateChunk<densmat_t> {
//-----------------------------------------------------------------------
//swap between chunks
void apply_chunk_swap(const reg_t &qubits) override;

//apply multiple swaps between chunks
void apply_multi_chunk_swap(const reg_t &qubits) override;
};

//=========================================================================
@@ -1966,6 +1969,34 @@ void State<densmat_t>::apply_chunk_swap(const reg_t &qubits)
BaseState::apply_chunk_swap(qs1);
}

template <class densmat_t>
void State<densmat_t>::apply_multi_chunk_swap(const reg_t &qubits)
{
  reg_t qubits_density;

  //qubits is read as consecutive pairs (qubits[i], qubits[i+1]);
  //i already advances by 2, so it must not be doubled again when indexing
  for(int_t i=0;i<qubits.size();i+=2){
    uint_t q0,q1;
    q0 = qubits[i];
    q1 = qubits[i+1];

    std::swap(BaseState::qubit_map_[q0],BaseState::qubit_map_[q1]);

    if(q1 >= BaseState::chunk_bits_){
      q1 += BaseState::chunk_bits_;
    }
    qubits_density.push_back(q0);
    qubits_density.push_back(q1);

    //NOTE: the values computed below are not appended to qubits_density
    q0 += BaseState::chunk_bits_;
    if(q1 >= BaseState::chunk_bits_){
      q1 += (BaseState::num_qubits_ - BaseState::chunk_bits_*2);
    }
  }

  BaseState::apply_multi_chunk_swap(qubits_density);
}
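
The pair walk and index shift in `apply_multi_chunk_swap` can be mirrored in a short Python sketch. It assumes the input is a flat list `[a0, b0, a1, b1, ...]` read as consecutive pairs, reproduces only the shift applied to the values that are pushed into `qubits_density`, and leaves out the `qubit_map_` update. The function name `remap_row_side` is hypothetical.

```python
def remap_row_side(qubits, chunk_bits):
    """Mirror the index shift applied before values reach qubits_density.

    qubits is a flat list of pairs [a0, b0, a1, b1, ...] (an assumption
    about the calling convention). The second qubit of each pair is shifted
    up by chunk_bits when it lies at or above chunk_bits; the first is
    passed through unchanged, as in the C++ hunk.
    """
    if len(qubits) % 2:
        raise ValueError("qubits must hold an even number of entries")
    remapped = []
    # Step through the list two entries at a time: one pair per iteration.
    for i in range(0, len(qubits), 2):
        q0, q1 = qubits[i], qubits[i + 1]
        if q1 >= chunk_bits:
            q1 += chunk_bits
        remapped += [q0, q1]
    return remapped


print(remap_row_side([0, 2, 1, 5], chunk_bits=2))
```

Stepping by two and indexing with `i` and `i + 1` visits every pair exactly once and never reads past the end of an even-length list.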


//-------------------------------------------------------------------------
} // end namespace DensityMatrix
//-------------------------------------------------------------------------