Add new lightweight backend with performance comparisons #57
Conversation
Codecov Report
@@ Coverage Diff @@
## master #57 +/- ##
============================================
- Coverage 100.00% 62.80% -37.20%
============================================
Files 3 5 +2
Lines 56 121 +65
============================================
+ Hits 56 76 +20
- Misses 0 45 +45
Continue to review full report at Codecov.
Thanks for the PR @ThomasLoke!
)

if operations:
    # XXX: Do we need a copy here? Not sure of the required copy semantics
you do not need a copy here.
Actually, I take that back; I'm not sure. @antalszava, do we need to use np.copy() here?
Yes, checked on a toy example; we'll need copying here. This will also be double-checked once we have tests in place (tests will catch this sort of case).
Edit: on second thought, although self._state is indeed mutated by self.apply_lightning, it will be updated anyway when rotations are handled in the next if. So we could get away with not copying here. Best to do some testing to cover the edge cases. 🙂
"msvc": ["-EHsc", "-O2", "-W1", "-std:c++11"], | ||
"unix": ["-O3", "-W", "-fPIC", "-shared", "-fopenmp"], |
Why do you reduce the warning level?
See remark above:
Tone down warning levels: compiling with Wall under MSVC floods the compilation with warning messages for the Eigen backend, so I toned down the warning levels for my own sanity.
if isinstance(operation, (QubitStateVector, BasisState)):
    if i == 0:
        self._apply_operation(operation)
        del operations[0]
Why the del here?
This is just copy-pasta from the original lightning_qubit.py; well, at least, it was before this commit. My branch isn't that fresh anymore.
That commit will be required here too because the API changed in PennyLane (_apply_operation changed).
private:
    static const std::vector<CplxType> matrix;
public:
    static const std::string label;
    static XGate create(const std::vector<double>& parameters);
    inline const std::vector<CplxType>& asMatrix() {
        return matrix;
    }
A lot of this code is repeated. Can we move matrix, label, and asMatrix() to the AbstractGate interface?
No, they're static class variables, so they need to be defined in the subclass.
Note also that asMatrix() already exists in the superclass as a virtual method, so it is already declared there.
if (Pennylane::XGate::label == label) {
    gate = std::make_unique<Pennylane::XGate>(Pennylane::XGate::create(parameters));
}
else if (Pennylane::YGate::label == label) {
    gate = std::make_unique<Pennylane::YGate>(Pennylane::YGate::create(parameters));
}
else if (Pennylane::ZGate::label == label) {
Should we use a switch case instead?
Also, you could just do return ... instead of gate = ....
Switch cases don't work for strings.
Switch cases don't work for strings.
Huh. I didn't know that.
@@ -16,6 +16,7 @@
"""
It looks like this entire file is just formatting changes. Can we revert them to make this PR smaller?
A lot of this PR does involve formatting changes; see remark 2 above:
Black reformatting: I integrated PyCharm with the black formatter, and it automatically reformats these files every single time I touch them, so I separated these out.
@ThomasLoke, would you be able to git checkout master <filename> for the files that have only formatting changes, and commit+push? It will help out with the code review side.
At the same time, we should make a separate PR to ensure all tests etc. are run through Black, to reduce large formatting diffs in the future.
I'd have hoped that separating it out into a separate commit would have helped with the code review, given that you can review commit-by-commit?
To be clear, I expect PRs to consist of a sequence of well-defined commits that group changes into logical chunks. If you're trying to review the PR based on the overall diff, then yes, I'd expect you'd run into problems.
Yes, our team usually groups changes by PR rather than by individual commits, especially as some comments/suggestions/changes that come up during review might affect subsequent updates.
I'm good to keep these changes; since the only changes were formatting, we don't need to review them.
Hi @ThomasLoke, this is a really great approach, thank you so much for this! 🥇 💯
I've left some comments, mostly for my own understanding. Haven't delved into all details, mostly started zooming in at some parts from a higher level.
Also checked the benchmarks and they look really promising indeed when we have >=10 qubit systems! Below that, the new backend is usually comparable or slightly better than the old one.
I'd have hoped that separating it out into a separate commit would have helped with the code review, given that you can review commit-by-commit?
True, with the heads-up it's for sure easier reviewing now, thanks for the note! 🙂
Separating the formatting changes would, however, also help when looking at the PR later in the future. When tests are included, this addition would grow even more (it might easily be that it will be worth splitting it into smaller pieces at that time). Retrospectively looking into this PR might become cumbersome.
return StateVector((CplxType*)numpyArrayInfo.ptr, numpyArrayInfo.shape[0]);
}

Pennylane::StateVector::StateVector(CplxType* const arr, const size_t length)
Just for my own understanding: any reason we need arr to be a constant pointer?
Though the class field is declared as CplxType* const arr intentionally, it's certainly not required for the function parameter. I'll drop it.
if (numpyArrayInfo.ndim != 1)
    throw std::invalid_argument("NumPy array must be a 1-dimensional array");
if (numpyArrayInfo.itemsize != sizeof(CplxType))
Just wondering: isn't the output of sizeof(CplxType) platform-dependent?
AFAIK, double "should" be an IEEE-754 64-bit floating-point type (though it's not explicitly guaranteed by the C++ standard). And struct padding shouldn't make a difference, whether it's a 32-bit or 64-bit system. So all in all, I think it's unlikely to be platform-dependent.
double "should" be an IEEE-754 64-bit floating point type (though its not explicitly guaranteed by C++ standards)
Yeah, I was curious because of that. 🤔 Agreed, it would more so be implementation-dependent. Would it be worth hardcoding the value here?
Hardcoding the value would be inadvisable; at the end of the day, itemsize must conform to the size of whatever complex type is in use, otherwise the internal numpy array will be misinterpreted when casting to CplxType*.
AFAIK, there aren't any standard fixed-width floating-point representations (unlike integers). However, there does exist a way of checking that the floating-point representation is IEEE-754 compliant (e.g. this).
At the end of the day, however, I'm not sure it's wise to complicate the implementation by trying to support systems in which double isn't IEEE-754 64-bit floating-point compliant. But I'll leave that up to you :)
) {
    unique_ptr<AbstractGate> gate = constructGate(opLabel, opParams);
    const vector<CplxType>& matrix = gate->asMatrix();
    assert(matrix.size() == exp2(opWires.size()) * exp2(opWires.size()));
How come we have such assert statements? These would be unnecessary to have once the code is tested, correct?
To me, I don't see the two (asserts and unit tests) as being dichotomous (i.e. why not both?). Asserts are helpful when it comes to making internal consistency checks. Unit tests are helpful when you want to validate the functionality of the application.
In this particular case, we should at least throw if the number of qubits on which the gate is defined doesn't match opWires.size(); I'm somewhat ambivalent about removing the assert.
In this particular case, we should at least throw if the number of qubits on which the gate is defined doesn't match opWires.size()
Makes sense! The thing with assert statements is that if they fail, the user sees an assert error without a clear message of what was wrong with their inputs. So in that sense throwing might be preferred here.
}
],
"source": [
"from cpuinfo import get_cpu_info\n",
Requires pip install py-cpuinfo
Maybe worth adding a note that executing this cell requires an external library to be installed.
op_param = [o.parameters for o in operations]

state_vector = np.ravel(state)
assert state_vector.flags["C_CONTIGUOUS"]
This should rather be unit tested.
assert state_vector.flags["C_CONTIGUOUS"]
Two things:
- For this in particular, we probably don't even need to test it since it should be guaranteed by the semantics of numpy.ravel(). We can just remove the assert without losing sleep over it.
- On tests in general, I think it's rather moot as long as the current unit test suites only run for the old backend. If we replace the old one altogether, then that simplifies the situation, and we can just add new ones to the current suites. If we don't, then we'll need to figure something else out.
I'm up for keeping this assert, since before we were doing some strange order="F", and it would be good to catch this in case there are any remaining edge cases (which I doubt).
using std::unique_ptr;
using std::vector;

vector<unsigned int> Pennylane::getIndicesExcluding(vector<unsigned int>& excludedIndices, const unsigned int qubits) {
Out of curiosity: is it worth representing indices with unsigned int? My understanding is that the general take is to avoid using them, even if it's almost certain that no arithmetic operations are being done on them. Though I haven't really come across any weird cases.
Probably not. It's true, unsigned indices can be error-prone. I like unsigned indices because of the conceptual correspondence (i.e. valid indices are always unsigned) and because STL containers' size() methods typically return unsigned types (so using unsigned types avoids the compiler warnings about mixing signed and unsigned types). But I can be easily persuaded to ditch it.
for (unsigned int i = 0; i < qubits; i++) {
    indices.insert(indices.end(), i);
}
for (const unsigned int& excludedIndex : excludedIndices) {
Any reason for using a const ref for excludedIndex here?
const to prevent modification of the vector elements (the intent is read-only), and a ref to avoid the implicit copy (which compilers may auto-optimise away anyway).
using std::unique_ptr;
using std::vector;

// FIXME: This should be reworked to use a function dispatch table
function dispatch table
Would that be something like std::map<std::string, func>, where func is the function that returns the gate of the operator? (Something like what is there for the current ops.)
Yeah, that would be the idea. I just never got around to polishing it to do that.
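For concreteness, here is a rough sketch of the std::map-based idea being discussed. It is a simplified, standalone illustration (the "PauliX" label and the stand-in gate types are assumptions); the dispatch table that was actually added to the PR appears further down in the thread:

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-ins for the gate classes used in the PR.
struct AbstractGate { virtual ~AbstractGate() = default; };
struct XGate : AbstractGate { static XGate create(const std::vector<double>&) { return {}; } };

using GateFactory = std::function<std::unique_ptr<AbstractGate>(const std::vector<double>&)>;

// Label-to-factory map: each entry constructs one gate type from its parameters.
static const std::map<std::string, GateFactory> dispatchTable = {
    {"PauliX", [](const std::vector<double>& params) {
        return std::make_unique<XGate>(XGate::create(params));
    }},
    // ... one entry per supported gate ...
};

std::unique_ptr<AbstractGate> constructGate(const std::string& label, const std::vector<double>& params) {
    auto it = dispatchTable.find(label);
    if (it == dispatchTable.end())
        throw std::invalid_argument(label + " is not a supported gate type");
    return it->second(params);
}
```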
Thanks for the reviews!
Separating the formatting changes would, however, also help when looking at the PR later in the future. When tests are included, this addition would grow even more (it might easily be that it will be worth splitting it into smaller pieces at that time). Retrospectively looking into this PR might become cumbersome.
I guess I'm not familiar enough with your workflow to have a strong opinion on it; I typically don't find it any more cumbersome if refactoring changes are landed as a separate commit (rather than squashing every single commit together), and if I just view the PR commit-by-commit (as it was intended to be reviewed). But a simple git revert of the second commit is trivial enough to add if that's what we need.
result = method()
endTime = time.time()
durations.append(endTime - startTime)
gc.collect()
I'm less familiar with the garbage collector for Python than I am with Java, but I added this in on the off-chance that reclamation of resources triggered a stop-the-world GC. Probably won't happen unless it was really starved for memory though.
return vector<unsigned int>(indices.begin(), indices.end());
}

vector<size_t> Pennylane::generateBitPatterns(vector<unsigned int>& qubitIndices, const unsigned int qubits) {
Suggested change:
vector<size_t> Pennylane::generateBitPatterns(vector<unsigned int>& qubitIndices, const unsigned int qubits) {
vector<size_t> Pennylane::generateBitPatterns(const vector<unsigned int>& qubitIndices, const unsigned int qubits) {
* @param qubits number of qubits
* @return decimal value corresponding to all possible bit patterns for the given indices
*/
std::vector<size_t> generateBitPatterns(std::vector<unsigned int>& qubitIndices, const unsigned int qubits);
Suggested change:
std::vector<size_t> generateBitPatterns(std::vector<unsigned int>& qubitIndices, const unsigned int qubits);
std::vector<size_t> generateBitPatterns(const std::vector<unsigned int>& qubitIndices, const unsigned int qubits);
@ThomasLoke went for another round, it's looking great! Left a couple of comments & questions, but no major blockers for me.
Going forward, we were thinking of the following approach:
- In this PR, the new lightweight backend would be added as a stand-alone device (hence the suggestions for a new short_name and the change to the setup.py). Once we've concluded the discussion, the addition will be merged in.
- In a separate PR, our team would add tests for the new backend (and slightly adjust logic if needed). That will also be a good time for us to run some further tests & benchmarks that we already have in place for Python-based PennyLane devices.
- In a third PR, the new backend would be swapped out with the existing one.
What are your thoughts about these steps? 🙂
""" | ||
|
||
name = "Lightning Qubit PennyLane plugin" | ||
short_name = "lightning.qubit" |
Suggested change:
short_name = "lightning.qubit"
short_name = "lightning.qubit.new"
@@ -184,7 +207,9 @@ def build_extensions(self):
"packages": find_packages(where="."),
"package_data": {"pennylane_lightning": ["src/*"]},
"entry_points": {
"pennylane.plugins": ["lightning.qubit = pennylane_lightning:LightningQubit",],
"pennylane.plugins": [
    "lightning.qubit = pennylane_lightning:LightningQubit",
"lightning.qubit = pennylane_lightning:LightningQubit", | |
"lightning.qubit = pennylane_lightning:LightningQubit", | |
"lightning.qubit.new = pennylane_lightning:LightningQubitNew", |
* @return decimal value for the qubit at specified index
*/
inline size_t decimalValueForQubit(const unsigned int qubitIndex, const unsigned int qubits) {
    assert(qubitIndex < qubits);
Suggested change (delete this line):
assert(qubitIndex < qubits);
I'm potentially fine with the assert here. Looking at this, isn't the idea of an assert to catch cases that should never happen?
Yeah, I generally use asserts for internal consistency checks (which signifies programming errors) rather than for validation. But I can be convinced either way.
}

/**
 * Calculates the decimal value for a qubit, assuming a big-endian convention.
decimal value for a qubit
This seems a bit confusing (perhaps just for me); maybe worth rephrasing? Not quite sure what the decimal value for a qubit would be.
I had given this some thought in the past, but could never arrive at a more satisfactory term. It's really just trying to compute something like 0100 -> 2^2 = 4, i.e. the decimal value you'd add if the bit was 1. Happy to take suggestions though.
Makes sense! How about maxDecimalForQubit or basisStateToDecimal? It seems that we need both qubits and qubitIndex to get the result in the big-endian convention (and for assert(qubitIndex < qubits)). Otherwise, if we had assumed a little-endian convention, we could go with exp2(qubitIndex). Also, qubitIndex determines the position in the qubits-long system where we have a 1 (on the other registers we have 0). So basically qubitIndex specifies a computational basis state that we are converting to the decimal representation.
Otherwise, if we had assumed a little-endian convention, we could go with exp2(qubitIndex)
I assumed that wasn't an option, short of having different conventions between the Python side of things (which I assumed interpret qubits in a big-endian fashion) and the C++ backend.
I assumed that wasn't an option, short of having different conventions between the Python side of things (which I assumed interpret qubits in a big-endian fashion) and the C++ backend.
Absolutely, having it big-endian is a great approach.
Otherwise, if we had assumed a little-endian convention, we could go with exp2(qubitIndex)
This comment was just further considering what is being done within the function to get closer to a descriptive name, the current implementation is great as is.
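To make the endianness point concrete, here is a small sketch of what is being computed. The name and signature mirror the snippet above, but the body is an inferred illustration of the big-endian computation (based on the 0100 -> 2^2 = 4 example), not necessarily the PR's exact implementation:

```cpp
#include <cassert>
#include <cstddef>

// Big-endian convention: qubit 0 is the most significant bit of the basis-state index.
// Example: with qubits = 4 and qubitIndex = 1, the basis state is 0100, whose decimal value is 2^2 = 4.
inline size_t decimalValueForQubit(const unsigned int qubitIndex, const unsigned int qubits) {
    assert(qubitIndex < qubits);
    return size_t{1} << (qubits - qubitIndex - 1);
}

// Under a little-endian convention the same quantity would simply be exp2(qubitIndex),
// i.e. size_t{1} << qubitIndex, and the total qubit count would not be needed.
```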
result = method()
endTime = time.time()
durations.append(endTime - startTime)
gc.collect()
Got it. In that case, maybe it's worth going without it. Python will systematically call the garbage collector by default. Since we don't change the garbage collection for Python (and memory should be deallocated from within C++) we can get away with no calls here. Having it could incur some time overhead.
Suggested change (delete this line):
gc.collect()
std::cos(theta / 2) * std::pow(M_E, CplxType(0, (-phi - omega) / 2)), -std::sin(theta / 2) * std::pow(M_E, CplxType(0, (phi - omega) / 2)),
std::sin(theta / 2) * std::pow(M_E, CplxType(0, (-phi + omega) / 2)), std::cos(theta / 2) * std::pow(M_E, CplxType(0, (phi + omega) / 2)) }
This initialization is a bit challenging to read; could we introduce helper members, e.g. for std::cos(theta / 2) and std::sin(theta / 2)?
0, 0, std::cos(theta / 2) * std::pow(M_E, CplxType(0, (-phi - omega) / 2)), -std::sin(theta / 2) * std::pow(M_E, CplxType(0, (phi - omega) / 2)),
0, 0, std::sin(theta / 2) * std::pow(M_E, CplxType(0, (-phi + omega) / 2)), std::cos(theta / 2) * std::pow(M_E, CplxType(0, (phi + omega) / 2)) }
Same comment for introducing helper members as above.
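A sketch of the kind of refactor being suggested, assuming CplxType is std::complex<double> and that this is the standard single-qubit Rot(phi, theta, omega) matrix; std::exp(CplxType(0, x)) is used as an equivalent of the std::pow(M_E, ...) calls above:

```cpp
#include <cmath>
#include <complex>
#include <vector>

using CplxType = std::complex<double>;

// Build the Rot(phi, theta, omega) matrix with named intermediates instead of
// repeating std::cos(theta / 2), std::sin(theta / 2) and the phase factors inline.
std::vector<CplxType> rotMatrix(double phi, double theta, double omega) {
    const double c = std::cos(theta / 2);
    const double s = std::sin(theta / 2);
    const CplxType e00 = std::exp(CplxType(0, (-phi - omega) / 2));
    const CplxType e01 = std::exp(CplxType(0, (phi - omega) / 2));
    const CplxType e10 = std::exp(CplxType(0, (-phi + omega) / 2));
    const CplxType e11 = std::exp(CplxType(0, (phi + omega) / 2));
    return {c * e00, -s * e01,
            s * e10,  c * e11};
}
```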
static const std::vector<CplxType> matrix;
public:
    static const std::string label;
    static XGate create(const std::vector<double>& parameters);
Should the create method be kept, it would be worth providing a comment on what its purpose is.
// limitations under the License.
#include "StateVector.hpp"

Pennylane::StateVector Pennylane::StateVector::create(const pybind11::array_t<CplxType>* numpyArray) {
Curious about merging create into the constructor here too 🤔
This was a somewhat intentional choice: we don't really need a pybind11::array_t<CplxType> in order to initialise what's really just a wrapper around a CplxType* array (especially, say, if we wanted to unit test this). Farming it out into a create method allows the validation checks to be done prior to class instantiation.
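For illustration only, the create-then-construct split described here might look roughly like the sketch below. The ndim check and its message come from the diff above; the itemsize message, member names, and overall class layout are assumptions:

```cpp
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <pybind11/numpy.h>

using CplxType = std::complex<double>;

class StateVector {
public:
    // Validate the NumPy buffer up front, then construct the lightweight wrapper.
    static StateVector create(const pybind11::array_t<CplxType>* numpyArray) {
        const auto info = numpyArray->request();
        if (info.ndim != 1)
            throw std::invalid_argument("NumPy array must be a 1-dimensional array");
        if (info.itemsize != sizeof(CplxType))
            throw std::invalid_argument("NumPy array element size does not match CplxType");
        return StateVector(static_cast<CplxType*>(info.ptr), static_cast<size_t>(info.shape[0]));
    }

    // The constructor only wraps a raw pointer and a length, so it can be
    // exercised in unit tests without going through pybind11 at all.
    StateVector(CplxType* arr, size_t length) : arr(arr), length(length) {}

private:
    CplxType* arr;
    size_t length;
};
```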
Sounds good to me!
Thanks @ThomasLoke, really nice addition and I'm a big fan of getting rid of Eigen and using our own approach. We'd like to merge this in ASAP, hopefully over the next few days, and we just need to decide practical details like whether we replace the old backend immediately or keep both for a time. Will keep you updated. Thanks!
durations.append(endTime - startTime)
gc.collect()

return [statistics.mean(durations), statistics.stdev(durations)]
Nice! I'm excited by this new backend!!
Going forward, it would be interesting to use the benchmarking suite we've been working on recently:
https://github.com/PennyLaneAI/benchmark
Do we have any intuition about speedups in the low qubit setting? E.g., do we expect new and old backends to be about the same speed for <10 qubits?
Do we have any intuition about speedups in the low qubit setting? E.g., do we expect new and old backends to be about the same speed for <10 qubits?
It's hard to say, given that anything less than 10 qubits was difficult to even time in isolation with millisecond resolution, at least for single-gate operations. I think the worst-case scenario for the new backend is using 3-qubit gates, as its implementation of matrix multiplication is wholly unoptimised, whereas Eigen probably uses a specialised kernel for it. That said, we'll want to move away from this approach anyway in favour of more optimised kernels on a per-gate basis, so I don't think this is a big deal.
"""PennyLane Lightning device. | ||
|
||
An extension of PennyLane's built-in ``default.qubit`` device that interfaces with C++ to | ||
perform fast linear algebra calculations |
Suggested change:
perform fast linear algebra calculations
perform fast linear algebra calculations.
Use of this device requires pre-built binaries or compilation from source. Check out the
:doc:`/installation` guide for more details.
In the other version of this file, we have a 50+ qubit warning; does this no longer hold here? On the other hand, we're unlikely to need to go that high anyway 😆
Well, it doesn't need to generate templates at compile time for each possible number of qubits, so it's not restricted by that. The issue is that the address space of a 64-bit machine only goes up to, well, 64 bits, so in theory 64 qubits is as high as it goes. Probably more like 62 or 63 given the need for additional allocations.
supports_inverse_operations=False,
supports_analytic_computation=True,
returns_state=True,
)
Suggested change:
)
)
capabilities.pop("passthru_devices", None)

Would probably get spotted when merging, but just adding it here in case.
}

l_opts = {
    "msvc": [],
    "unix": ["-O3", "-Wall", "-fPIC", "-shared", "-fopenmp"],
    "unix": ["-O3", "-W", "-fPIC", "-shared", "-fopenmp"],
}

if platform.system() == "Darwin":
(Not needed for this PR.) We can eventually remove the Eigen lines from 124 and also the mentions in the docs 🎉
// -------------------------------------------------------------------------------------------------------------

Pennylane::AbstractGate::AbstractGate(int numQubits)
One general comment here: it looks great as is, but there is a risk of divergence between supported operations in PL and supported operations in lightning.qubit.
Currently, we pass a list of operation names, a list of parameters, and a list of wires from the Python side to the C++ side of lightning.qubit. What about passing just a list of matrices and a list of wires? I think matrix construction is fairly efficient even on the Python side.
In this way, we don't have to worry about defining operations and their matrices on the C++ side. But I'm not sure if that will result in a performance hit?
In this way, we don't have to worry about defining operations and their matrices on the C++ side. But I'm not sure if that will result in a performance hit?
The idea is that we'll want to move away from defining gates as matrices altogether in favour of more specialised kernels per-gate, so implementation-wise this all should be kept in C++.
Shouldn't the risk of divergence be minimised by having a test suite that tests the full range of supported operations?
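To illustrate what a specialised per-gate kernel (as opposed to a generic matrix multiplication) could look like, here is a rough sketch for PauliX on one qubit of an n-qubit state, under the big-endian convention discussed above. The function name and structure are hypothetical, not code from this PR:

```cpp
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using CplxType = std::complex<double>;

// Apply PauliX to `qubitIndex` (big-endian) by swapping paired amplitudes directly,
// with no 2x2 matrix multiplication at all.
void applyPauliX(std::vector<CplxType>& state, unsigned int qubitIndex, unsigned int qubits) {
    const size_t stride = size_t{1} << (qubits - qubitIndex - 1);  // distance between |0> and |1> partners
    for (size_t block = 0; block < state.size(); block += 2 * stride) {
        for (size_t offset = 0; offset < stride; ++offset) {
            std::swap(state[block + offset], state[block + stride + offset]);
        }
    }
}
```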
Cool! I'll look to push the remaining review changes tonight once I get the time.
I think I've addressed the reviews; let me know if I've missed anything.
@ThomasLoke thank you so much again for the amazing work, looks good to me! 🏆 💯 🙂
Had a minor suggestion. Also, thanks for the discussions, was really valuable. :)
return dispatchTable;
}

static const map<string, function<unique_ptr<Pennylane::AbstractGate>(const vector<double>&)>> dispatchTable = createDispatchTable();
🥳
auto it = dispatchTable.find(label);
if (it == dispatchTable.end())
    throw std::invalid_argument(label + " is not a supported gate type");

return it->second(parameters);
(minor)
Suggested change:
auto it = dispatchTable.find(label);
if (it == dispatchTable.end())
    throw std::invalid_argument(label + " is not a supported gate type");
return it->second(parameters);
auto dispatchTableIterator = dispatchTable.find(label);
if (dispatchTableIterator == dispatchTable.end())
    throw std::invalid_argument(label + " is not a supported gate type");
return dispatchTableIterator->second(parameters);
template<class GateType>
static void addToDispatchTable(map<string, function<unique_ptr<Pennylane::AbstractGate>(const vector<double>&)>>& dispatchTable) {
    dispatchTable.emplace(GateType::label, [](const vector<double>& parameters) { return std::make_unique<GateType>(GateType::create(parameters)); });
}

static map<string, function<unique_ptr<Pennylane::AbstractGate>(const vector<double>&)>> createDispatchTable() {
Thanks for adding this! Looking 🔥
No worries; the pleasure is all mine :)
Thanks @ThomasLoke! This is a very exciting PR, approved! 💯
One remaining thing is that my understanding of the underlying approach is still not yet solid. Would you mind giving a quick summary of how it works? It'll help me to see the underlying objective outside of the C++ implementation.
Thanks again for this!
Hi @ThomasLoke, the wheels check seems to have started causing a segmentation fault, although none of the changes here seem to have caused it. Will merge this PR into a separate branch for further investigation before merging into master.
TBH there's not that much to it. It accomplishes the same thing as the old backend, that is, it computes the tensor contraction of a specified gate sequence with a state vector. It does this by computing the bit patterns associated with each application of the matrix of the gate, e.g. if applying some two-qubit gate G to qubits 2-3 (out of 4), then the bit patterns to which we apply the matrix are 0XY0, 0XY1, 1XY0, 1XY1 (each denoting a sub-vector of length 4). Most of the apply logic just goes into computing the associated indices for the relevant bit patterns, gathering the sub-vector based on those indices, applying the matrix, and scattering the results back into the state vector.
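As a rough sketch of the index bookkeeping described above (the function name and structure are illustrative and loosely follow the helpers in the PR, not the actual implementation):

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using CplxType = std::complex<double>;

// Gather the sub-vector for one "bit pattern", apply the gate matrix to it,
// and scatter the result back into the state vector.
// externalIndex: decimal value of the fixed bits (with the gate's own positions set to 0,
//                e.g. the X/Y positions in 1XY0 contribute nothing here).
// internalOffsets: decimal values of all bit patterns over the gate's own qubits.
void applyToSubVector(std::vector<CplxType>& state,
                      const std::vector<CplxType>& matrix,       // dim x dim, row-major
                      size_t externalIndex,
                      const std::vector<size_t>& internalOffsets) {
    const size_t dim = internalOffsets.size();
    std::vector<CplxType> gathered(dim), result(dim, 0);
    for (size_t i = 0; i < dim; ++i)                             // gather
        gathered[i] = state[externalIndex + internalOffsets[i]];
    for (size_t i = 0; i < dim; ++i)                             // plain matrix-vector product
        for (size_t j = 0; j < dim; ++j)
            result[i] += matrix[i * dim + j] * gathered[j];
    for (size_t i = 0; i < dim; ++i)                             // scatter
        state[externalIndex + internalOffsets[i]] = result[i];
}
// The full apply would loop this over every bit pattern of the qubits *not* acted on
// (e.g. the four external patterns behind 0XY0, 0XY1, 1XY0, 1XY1 in the example above).
```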
…ranch (#66) * Add new lightweight backend with performance comparisons (#57) * Tone down warning levels * Black reformatting * New backend + benchmark notebook * Add compilation for Eigen backend back in again * Fix compilation with gcc * Refactor construction of StateVector from numpy array into a static method * Move method into utility header * Review changes * More review changes + add function dispatch table * Remove outdated comment * Rename variable Co-authored-by: antalszava <[email protected]> * use old GateFactory.cpp from commit cff579f * revert 6124db8 * use old GateFactory.cpp from commit cff579f Co-authored-by: Thomas Loke <[email protected]>
This PR adds in a new backend and some rudimentary performance comparisons between the new and old backend. If we elect to adopt the new backend, then we may just want to replace the old one altogether.
The old backend represents state vectors as a multi-dimensional tensor in Eigen and applies gates in the form of tensor contractions, which really limits the optimisations that can be performed. The long-term goal with the new backend is to write things in such a way that offers more flexibility when it comes to optimisation choices. For now, it implements gate operations as tensor contractions, so it does much the same work as the Eigen backend, but in future, we can write gate operations in ways that don't require the full matrix representation of a gate.
There are 4 commits in this:
1. Tone down warning levels
2. Black reformatting
3. New backend + benchmark notebook
4. Add compilation for the Eigen backend back in again
Performance comparisons on my machine show about a x20-30 speedup relative to the old backend. For full disclosure, I doubt it's from the backend itself: during development of the new backend, I found that the pybind11 bindings were the largest factor responsible for slowness, because it looked like they were returning a plain Python list instead of a numpy array, which then had to be converted into a numpy array. It's possible the old backend also has a similar problem, although I'm not that familiar with how pybind11 handles conversion between numpy arrays and Eigen vectors; if so, then perhaps a similar speedup can be achieved just using the old backend but reworking the pybind11 bindings. Even if that's the case, I still don't think it's of much long-term value, just because of how representation using Eigen tensors inhibits future optimisation choices (there's a lot of low-hanging optimisation fruit for the new backend that remains to be picked).
On a much more minor note, the new backend also doesn't need template metaprogramming to have arbitrary-rank tensors (although in theory it's hard-limited to 64 qubits, since indices only go up to unsigned 64-bit integers), which is a plus in my book :)
One caveat is that I haven't tested the results of the new backend extensively (beyond timing and inspecting a few state vectors by eye), so help validating the results would be appreciated.