Add new lightweight backend with performance comparisons #57

Merged · 12 commits · Feb 25, 2021

Conversation

Contributor

@ThomasLoke ThomasLoke commented Feb 4, 2021

This PR adds a new backend and some rudimentary performance comparisons between the new and old backends. If we elect to adopt the new backend, then we may just want to replace the old one altogether.

The old backend represents state vectors as a multi-dimensional tensor in Eigen and applies gates in the form of tensor contractions, which really limits the optimisations that can be performed. The long-term goal with the new backend is to write things in such a way that offers more flexibility when it comes to optimisation choices. For now, it implements gate operations as tensor contractions, so it does much the same work as the Eigen backend, but in future, we can write gate operations in ways that don't require the full matrix representation of a gate.

There are four commits in this PR:

  1. Tone down warning levels: Compiling with -Wall under MSVC floods the compilation with warning messages from the Eigen backend, so I toned down the warning levels for my own sanity.
  2. Black reformatting: I integrated PyCharm with the black formatter, and it automatically reformats these files every single time I touch them, so I separated these out.
  3. New backend + benchmark notebook: The main change, which adds in a separate backend and a notebook that plots the average execution time of both.
  4. Add compilation for Eigen backend back in again: I turned off compilation of the Eigen backend during development because it literally takes minutes to compile; the new backend has no external dependencies (yet), so compilation is almost instantaneous.

Performance comparisons on my machine show roughly a 20-30x speedup relative to the old backend. For full disclosure, I doubt it's from the backend itself--during development of the new backend, I found that the pybind11 bindings were the largest factor responsible for slowness, because they looked like they were returning a plain Python list instead of a numpy array, which then had to be converted into a numpy array. It's possible the old backend has a similar problem, although I'm not that familiar with how pybind11 handles conversion between numpy arrays and Eigen vectors; if so, then perhaps a similar speedup could be achieved with the old backend just by reworking the pybind11 bindings. Even if that's the case, I still don't think it's of much long-term value, simply because the representation using Eigen tensors inhibits future optimisation choices (there's a lot of low-hanging optimisation fruit for the new backend that remains to be picked).

On a much more minor note, the new backend also doesn't need template metaprogramming to support arbitrary-rank tensors (although in theory it's hard-limited to 64 qubits, since indices only go up to unsigned 64-bit integers), which is a plus in my book :)

One caveat is that I haven't tested the results of the new backend extensively (beyond timing and inspecting a few state vectors by eye), so help validating the results would be appreciated.


codecov bot commented Feb 4, 2021

Codecov Report

Merging #57 (92cb98c) into master (2ceb93b) will decrease coverage by 37.19%.
The diff coverage is 30.76%.


@@             Coverage Diff              @@
##            master      #57       +/-   ##
============================================
- Coverage   100.00%   62.80%   -37.20%     
============================================
  Files            3        5        +2     
  Lines           56      121       +65     
============================================
+ Hits            56       76       +20     
- Misses           0       45       +45     
Impacted Files                                Coverage   Δ
pennylane_lightning/lightning_benchmark.py      0.00%    <0.00%>    (ø)
pennylane_lightning/lightning_qubit_new.py     44.18%    <44.18%>   (ø)
pennylane_lightning/__init__.py               100.00%    <100.00%>  (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

co9olguy commented Feb 4, 2021

Thanks for the PR @ThomasLoke!

@chaserileyroberts chaserileyroberts self-requested a review February 16, 2021 22:25
)

if operations:
# XXX: Do we need a copy here? Not sure of the required copy semantics


you do not need a copy here.


Actually, I take that back; I'm not sure. @antalszava do we need to use np.copy() here?

Contributor

@antalszava antalszava Feb 20, 2021

Yes, checked on a toy example, we'll need copying here. This will also be double-checked when we have tests in place (tests will catch these sorts of cases).

Edit: on second thought, although self._state is indeed mutated by self.apply_lightning, it will be updated anyway when considering rotations in the next if. So we could get away with not copying here. Best to do some testing to consider some edge cases. 🙂

pennylane_lightning/src/rework/StateVector.cpp (outdated, resolved)
pennylane_lightning/src/rework/StateVector.hpp (outdated, resolved)
Comment on lines +78 to +79
"msvc": ["-EHsc", "-O2", "-W1", "-std:c++11"],
"unix": ["-O3", "-W", "-fPIC", "-shared", "-fopenmp"],


Why do you reduce the warning level?

Contributor Author

See remark above:

Tone down warning levels: Compiling with -Wall under MSVC floods the compilation with warning messages from the Eigen backend, so I toned down the warning levels for my own sanity.

if isinstance(operation, (QubitStateVector, BasisState)):
if i == 0:
self._apply_operation(operation)
del operations[0]


why the del here?

Contributor Author

This is just copy-pasta from the original lightning_qubit.py--well, at least, it was before this commit. My branch isn't that fresh anymore.

Contributor

That commit will be required here too because the API changed in PennyLane (_apply_operation changed).

pennylane_lightning/src/rework/Gates.cpp (outdated, resolved)
Comment on lines +49 to +56
private:
    static const std::vector<CplxType> matrix;
public:
    static const std::string label;
    static XGate create(const std::vector<double>& parameters);
    inline const std::vector<CplxType>& asMatrix() {
        return matrix;
    }


A lot of this code is repeated. Can we move matrix, label and asMatrix() to the AbstractGate interface?

Contributor Author

No, they're static class variables, so they need to be defined in the subclass.

Contributor Author

Note also that asMatrix() already exists in the superclass as a virtual method, so it is already declared there.
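For readers following along, here is a minimal sketch of the pattern being discussed. The AbstractGate declaration is not shown in this diff, so its shape below (and the "PauliX" label value) is assumed; only the subclass members mirror the snippet above.

#include <complex>
#include <string>
#include <vector>

using CplxType = std::complex<double>;

class AbstractGate {
public:
    virtual ~AbstractGate() = default;
    // The accessor is declared once in the interface...
    virtual const std::vector<CplxType>& asMatrix() = 0;
};

class XGate : public AbstractGate {
private:
    // ...but each concrete gate still needs its own static storage, which is
    // why matrix and label are declared (and defined) per subclass.
    static const std::vector<CplxType> matrix;
public:
    static const std::string label;
    const std::vector<CplxType>& asMatrix() override { return matrix; }
};

// Out-of-line definitions of the static members:
const std::vector<CplxType> XGate::matrix{0, 1, 1, 0};
const std::string XGate::label = "PauliX";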

pennylane_lightning/lightning_qubit_new.py (resolved)
pennylane_lightning/src/rework/Apply.cpp (outdated, resolved)
Comment on lines 25 to 31
if (Pennylane::XGate::label == label) {
    gate = std::make_unique<Pennylane::XGate>(Pennylane::XGate::create(parameters));
}
else if (Pennylane::YGate::label == label) {
    gate = std::make_unique<Pennylane::YGate>(Pennylane::YGate::create(parameters));
}
else if (Pennylane::ZGate::label == label) {


Should we use a switch case instead?


Also you could just do return ... instead of gate = ....

Contributor Author

Switch cases don't work for strings.

Member

Switch cases don't work for strings.

Huh. I didn't know that.

@@ -16,6 +16,7 @@
"""


It looks like this entire file is just formatting changes. Can we revert them to make this PR smaller?

Contributor Author

A lot of this PR does involve formatting changes--see remark 2 above:

Black reformatting: I integrated PyCharm with the black formatter, and it automatically reformats these files every single time I touch them, so I separated these out.

Member

@ThomasLoke would you be able to git checkout master <filename> for the files that have only formatting changes, and commit+push? It will help out with the code review side.

At the same time, we should make a separate PR to ensure all tests etc. are blacked, to reduce large formatting diffs in future

Contributor Author

I'd have hoped that separating it out into a separate commit would have helped with the code review, given that you can review commit-by-commit?

Contributor Author

To be clear, I expect PRs to consist of a sequence of well-defined commits that group changes into logical chunks. If you're trying to review the PR based on the overall diff, then yes, I'd expect you'd run into problems.

Member

Yes, our team usually groups changes into logical chunks by PR rather than by commit, especially as some comments/suggestions/changes that come up during review might affect subsequent updates.

Contributor

I'm good to keep these changes; since the only changes were formatting, we don't need to review them.

Contributor

@antalszava antalszava left a comment

Hi @ThomasLoke, this is a really great approach, thank you so much for this! 🥇 💯

I've left some comments, mostly for my own understanding. I haven't delved into all the details; I mostly started zooming in on some parts from a higher level.

Also checked the benchmarks and they look really promising indeed when we have >=10 qubit systems! Below that, the new backend is usually comparable or slightly better than the old one.


I'd have hoped that separating it out into a separate commit would have helped with the code review, given that you can review commit-by-commit?

True, with the heads-up it's for sure easier reviewing now, thanks for the note! 🙂

Separating the formatting changes would, however, also help when looking at the PR later in the future. When tests are included, this addition would grow even more (it might easily be that it will be worth splitting it into smaller pieces at that time). Retrospectively looking into this PR might become cumbersome.


return StateVector((CplxType*)numpyArrayInfo.ptr, numpyArrayInfo.shape[0]);
}

Pennylane::StateVector::StateVector(CplxType* const arr, const size_t length)
Contributor

Just for my own understanding: any reason we need arr to be a constant pointer?

Contributor Author

Though the class field is declared as CplxType* const arr intentionally, it's certainly not required for the function parameter. I'll drop it.


if (numpyArrayInfo.ndim != 1)
throw std::invalid_argument("NumPy array must be a 1-dimensional array");
if (numpyArrayInfo.itemsize != sizeof(CplxType))
Contributor

Just wondering: isn't the output of sizeof(CplxType) platform-dependent?

Contributor Author

AFAIK, double "should" be an IEEE-754 64-bit floating point type (though it's not explicitly guaranteed by the C++ standard). And struct padding shouldn't make a difference, whether it's a 32-bit or 64-bit system. So all-in-all, I think it's unlikely to be platform-dependent.

Contributor

double "should" be an IEEE-754 64-bit floating point type (though it's not explicitly guaranteed by the C++ standard)

Yeah, was curious because of that. 🤔 Agreed, it would more so be implementation dependent. Would it be worth hardcoding the value here?

Contributor Author

Hardcoding the value would be inadvisable--at the end of the day, itemsize must conform to the size of whatever complex type is in use, otherwise the internal numpy array will be misinterpreted when casting to CplxType*.

AFAIK, there aren't any standard fixed-width floating-point representations (unlike integers). However, there does exist a way of checking that the floating-point representation is IEEE-754 compliant (e.g. this).

At the end of the day, however, I'm not sure it's wise to complicate the implementation by trying to support systems in which double isn't IEEE-754 64-bit floating-point compliant. But I'll leave that up to you :)
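For reference, a minimal sketch of the kind of compile-time check alluded to above, assuming CplxType is std::complex<double> (illustrative only, not code from the PR):

#include <climits>
#include <complex>
#include <limits>

using CplxType = std::complex<double>;

// Fail the build on platforms where double is not IEEE-754 binary64, or where
// the complex type is not laid out as two adjacent doubles.
static_assert(std::numeric_limits<double>::is_iec559,
              "double must be an IEEE-754 (IEC 559) floating-point type");
static_assert(sizeof(double) * CHAR_BIT == 64,
              "double must be 64 bits wide");
static_assert(sizeof(CplxType) == 2 * sizeof(double),
              "CplxType must be exactly two doubles wide");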

) {
unique_ptr<AbstractGate> gate = constructGate(opLabel, opParams);
const vector<CplxType>& matrix = gate->asMatrix();
assert(matrix.size() == exp2(opWires.size()) * exp2(opWires.size()));
Contributor

How come we have such assert statements? These would be unnecessary to have once the code is tested, correct?

Contributor Author

@ThomasLoke ThomasLoke Feb 22, 2021

To me, I don't see the two (asserts and unit tests) as being dichotomous (i.e. why not both?). Asserts are helpful when it comes to making internal consistency checks. Unit tests are helpful when you want to validate the functionality of the application.

In this particular case, we should at least throw if the number of qubits on which the gate is defined doesn't match opWires.size(); I'm somewhat ambivalent about removing the assert.

Contributor

In this particular case, we should at least throw if the number of qubits on which the gate is defined doesn't match opWires.size()

Makes sense! The thing with assert statements is that if they fail, then the user sees an assert error without a clear error message of what was wrong with their inputs. So in that sense throwing might be preferred here.
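To make that concrete, a sketch of the kind of check and message being suggested; the numQubits accessor and the exact wording are assumptions, not code from this PR:

// Inside the existing apply routine, after constructing the gate:
unique_ptr<AbstractGate> gate = constructGate(opLabel, opParams);
if (gate->numQubits != static_cast<int>(opWires.size())) {
    throw std::invalid_argument(opLabel + " expects " + std::to_string(gate->numQubits) +
                                " wires, but " + std::to_string(opWires.size()) + " were supplied");
}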

}
],
"source": [
"from cpuinfo import get_cpu_info\n",
Contributor

Requires pip install py-cpuinfo

Contributor

Maybe worth adding a note that executing this cell requires an external library to be installed.

op_param = [o.parameters for o in operations]

state_vector = np.ravel(state)
assert state_vector.flags["C_CONTIGUOUS"]
Contributor

This should rather be unit tested.

Suggested change
assert state_vector.flags["C_CONTIGUOUS"]

Contributor Author

Two things:

  1. For this in particular, we probably don't even need to test it since it should be guaranteed by the semantics of numpy.ravel(). We can just remove the assert without losing sleep over it.
  2. On tests in general, I think it's rather moot as long as the current unit test suites only run for the old backend. If we replace the old one altogether, then that simplifies the situation, and we can just add new ones to the current suites. If we don't, then we'll need to figure something else out.

Contributor

I'm up for keeping this assert, since before we were doing some strange order="F", and it would be good to catch this in case there are any remaining edge cases (which I doubt).


using std::unique_ptr;
using std::vector;

vector<unsigned int> Pennylane::getIndicesExcluding(vector<unsigned int>& excludedIndices, const unsigned int qubits) {
Contributor

Out of curiosity: is it worth representing indices with unsigned int? My understanding is that the general take is to avoid using them even if it's almost certain that no arithmetic operations are being done using them. Though haven't really come across any weird cases.

Contributor Author

@ThomasLoke ThomasLoke Feb 22, 2021

Probably not. It's true, unsigned indices can be error-prone. I like unsigned indices because of the conceptual correspondence (i.e. valid indices are always unsigned) and because STL containers' size() methods typically return unsigned types (so using unsigned types avoids the compiler warnings about mixing signed and unsigned types). But I can be easily persuaded to ditch it.

for (unsigned int i = 0; i < qubits; i++) {
    indices.insert(indices.end(), i);
}
for (const unsigned int& excludedIndex : excludedIndices) {
Contributor

Any reason for using a const ref for excludedIndex here?

Contributor Author

const to prevent modification of the vector elements (the intent is read-only), and a ref to avoid the implicit copy (which compilers may auto-optimise away anyway).

using std::unique_ptr;
using std::vector;

// FIXME: This should be reworked to use a function dispatch table
Contributor

function dispatch table

Would that be something like std::map<std::string, func> where func is the function that returns the gate of the operator? (Something like what is there for the current ops).

Contributor Author

Yeah, that would be the idea. I just never got around to polishing it to do that.
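For illustration, the kind of table being floated here might look like the sketch below; names are illustrative, and the version eventually added to the PR (visible in the later diff hunks) wraps the entries in an addToDispatchTable helper instead.

#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Assumes Pennylane::AbstractGate, XGate, etc. from the PR sources.
using GateFactory =
    std::function<std::unique_ptr<Pennylane::AbstractGate>(const std::vector<double>&)>;

static const std::map<std::string, GateFactory> dispatchTable = {
    {Pennylane::XGate::label,
     [](const std::vector<double>& p) {
         return std::make_unique<Pennylane::XGate>(Pennylane::XGate::create(p));
     }},
    // ... one entry per supported gate label ...
};

// constructGate then reduces to a lookup plus a call:
//     auto it = dispatchTable.find(label);
//     if (it == dispatchTable.end())
//         throw std::invalid_argument(label + " is not a supported gate type");
//     return it->second(parameters);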

Contributor Author

@ThomasLoke ThomasLoke left a comment

Thanks for the reviews!

Separating the formatting changes would, however, also help when looking at the PR later in the future. When tests are included, this addition would grow even more (it might easily be that it will be worth splitting it into smaller pieces at that time). Retrospectively looking into this PR might become cumbersome.

I guess I'm not familiar enough with your workflow to have a strong opinion on it; I typically don't find it any more cumbersome if refactoring changes are landed as a separate commit (rather than squashing every single commit altogether), and if I just view the PR commit-by-commit (as it was intended to be reviewed). But a simple git revert of the second commit is trivial enough to add if that's what we need.

result = method()
endTime = time.time()
durations.append(endTime - startTime)
gc.collect()
Contributor Author

I'm less familiar with the garbage collector for Python than I am with Java, but I added this in on the off-chance that reclamation of resources triggered a stop-the-world GC. Probably won't happen unless it was really starved for memory though.


return vector<unsigned int>(indices.begin(), indices.end());
}

vector<size_t> Pennylane::generateBitPatterns(vector<unsigned int>& qubitIndices, const unsigned int qubits) {
Contributor

Suggested change
vector<size_t> Pennylane::generateBitPatterns(vector<unsigned int>& qubitIndices, const unsigned int qubits) {
vector<size_t> Pennylane::generateBitPatterns(const vector<unsigned int>& qubitIndices, const unsigned int qubits) {

* @param qubits number of qubits
* @return decimal value corresponding to all possible bit patterns for the given indices
*/
std::vector<size_t> generateBitPatterns(std::vector<unsigned int>& qubitIndices, const unsigned int qubits);
Contributor

Suggested change
std::vector<size_t> generateBitPatterns(std::vector<unsigned int>& qubitIndices, const unsigned int qubits);
std::vector<size_t> generateBitPatterns(const std::vector<unsigned int>& qubitIndices, const unsigned int qubits);

Contributor

@antalszava antalszava left a comment

@ThomasLoke went for another round, it's looking great! Left a couple of comments & questions, but no major blockers for me.

Going forward, we were thinking of the following approach:

  1. In this PR the new lightweight backend would be added as a stand-alone device (hence the suggestions for a new short_name and the change to the setup.py). Once we've concluded the discussion, the addition will be merged in.
  2. In a separate PR, our team would add tests for the new backend (and slightly adjust logic if needed). That will also be a good time for us to run some further tests & benchmarks that we already have in place for Python-based PennyLane devices.
  3. In a third PR, the existing backend would be swapped out for the new one.

What are your thoughts about these steps? 🙂

"""

name = "Lightning Qubit PennyLane plugin"
short_name = "lightning.qubit"
Contributor

Suggested change
short_name = "lightning.qubit"
short_name = "lightning.qubit.new"

@@ -184,7 +207,9 @@ def build_extensions(self):
"packages": find_packages(where="."),
"package_data": {"pennylane_lightning": ["src/*"]},
"entry_points": {
"pennylane.plugins": ["lightning.qubit = pennylane_lightning:LightningQubit",],
"pennylane.plugins": [
"lightning.qubit = pennylane_lightning:LightningQubit",
Contributor

Suggested change
"lightning.qubit = pennylane_lightning:LightningQubit",
"lightning.qubit = pennylane_lightning:LightningQubit",
"lightning.qubit.new = pennylane_lightning:LightningQubitNew",

* @return decimal value for the qubit at specified index
*/
inline size_t decimalValueForQubit(const unsigned int qubitIndex, const unsigned int qubits) {
assert(qubitIndex < qubits);
Contributor

Suggested change
assert(qubitIndex < qubits);

Contributor

I'm potentially fine with the assert here. Looking here, isn't the idea of an assert to catch cases that should never happen?

Contributor Author

@ThomasLoke ThomasLoke Feb 25, 2021

Yeah, I generally use asserts for internal consistency checks (which signifies programming errors) rather than for validation. But I can be convinced either way.

}

/**
* Calculates the decimal value for a qubit, assuming a big-endian convention.
Contributor

decimal value for a qubit

This seems a bit confusing (perhaps just to me); it may be worth rephrasing? I'm not quite sure what the decimal value for a qubit would be.

Contributor Author

@ThomasLoke ThomasLoke Feb 23, 2021

I had given this some thought in the past, but could never arrive at a more satisfactory term. It's really just trying to compute something like 0100 -> 2^2 = 4, i.e. the decimal value you'd add if the bit was 1. Happy to take suggestions though.
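In code, the computation being described looks something like the following; this is a minimal sketch matching the signature shown earlier, with an illustrative body rather than the PR's exact implementation.

#include <cassert>
#include <cstddef>

// Decimal weight contributed by the qubit at qubitIndex under a big-endian
// convention (qubit 0 is the most significant bit). Example from above:
// qubitIndex = 1, qubits = 4  ->  bit pattern 0100  ->  2^2 = 4.
inline std::size_t decimalValueForQubit(const unsigned int qubitIndex, const unsigned int qubits) {
    assert(qubitIndex < qubits);
    return std::size_t{1} << (qubits - qubitIndex - 1);
}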

Contributor

Makes sense! How about maxDecimalForQubit or basisStateToDecimal? It seems that we need both qubits and qubitIndex to get the result in the big-endian convention (and for assert(qubitIndex < qubits)). Otherwise, if we had assumed a little-endian convention, we could go with exp2(qubitIndex). Also, qubitIndex determines the index in the qubit long system where we have a 1 (on the other registers we have 0). So basically qubitIndex specifies a computational basis state that we are converting to the decimal representation.

Contributor Author

Otherwise, if we had assumed a little-endian convention, we could go with exp2(qubitIndex)

I assumed that wasn't an option, short of having different conventions between the Python side of things (which I assumed interprets qubits in a big-endian fashion) and the C++ backend.

Contributor

I assumed that wasn't an option, short of having different conventions between the Python side of things (which I assumed interprets qubits in a big-endian fashion) and the C++ backend.

Absolutely, having it big-endian is a great approach.

Otherwise, if we had assumed a little-endian convention, we could go with exp2(qubitIndex)

This comment was just further considering what is being done within the function to get closer to a descriptive name, the current implementation is great as is.

result = method()
endTime = time.time()
durations.append(endTime - startTime)
gc.collect()
Contributor

Got it. In that case, maybe it's worth going without it. Python will systematically call the garbage collector by default. Since we don't change the garbage collection for Python (and memory should be deallocated from within C++) we can get away with no calls here. Having it could incur some time overhead.

Suggested change
gc.collect()

Comment on lines 202 to 203
std::cos(theta / 2) * std::pow(M_E, CplxType(0, (-phi - omega) / 2)), -std::sin(theta / 2) * std::pow(M_E, CplxType(0, (phi - omega) / 2)),
std::sin(theta / 2) * std::pow(M_E, CplxType(0, (-phi + omega) / 2)), std::cos(theta / 2) * std::pow(M_E, CplxType(0, (phi + omega) / 2)) }
Contributor

This initialization is a bit challenging to read, could we introduce helper members e.g., for std::cos(theta / 2) and std::sin(theta / 2)?
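A sketch of the kind of helper variables being suggested (variable names are illustrative, not the code that was merged):

// Hoist the repeated sub-expressions out of the matrix initializer.
const double c = std::cos(theta / 2);
const double s = std::sin(theta / 2);
const CplxType e1 = std::pow(M_E, CplxType(0, (-phi - omega) / 2));
const CplxType e2 = std::pow(M_E, CplxType(0, (phi - omega) / 2));
const CplxType e3 = std::pow(M_E, CplxType(0, (-phi + omega) / 2));
const CplxType e4 = std::pow(M_E, CplxType(0, (phi + omega) / 2));

// The two rows quoted above then read:
//   c * e1, -s * e2,
//   s * e3,  c * e4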

Comment on lines 327 to 328
0, 0, std::cos(theta / 2) * std::pow(M_E, CplxType(0, (-phi - omega) / 2)), -std::sin(theta / 2) * std::pow(M_E, CplxType(0, (phi - omega) / 2)),
0, 0, std::sin(theta / 2) * std::pow(M_E, CplxType(0, (-phi + omega) / 2)), std::cos(theta / 2) * std::pow(M_E, CplxType(0, (phi + omega) / 2)) }
Contributor

Same comment for introducing helper members as above.

static const std::vector<CplxType> matrix;
public:
static const std::string label;
static XGate create(const std::vector<double>& parameters);
Contributor

Should the create method be kept, it would be worth providing a comment on what its purpose is.



// limitations under the License.
#include "StateVector.hpp"

Pennylane::StateVector Pennylane::StateVector::create(const pybind11::array_t<CplxType>* numpyArray) {
Contributor

Curious about merging create into the constructor here too 🤔

Contributor Author

This was a somewhat intentional choice--we don't really need a pybind11::array_t<CplxType> in order to initialise what's really just a wrapper around a CplxType* array (especially, say, if we wanted to unit test this). Farming it out into a create method allows the validation checks to be done prior to class instantiation.
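For context, the separation being described is roughly the following; the validation checks and signatures are the ones visible in this diff, while the constructor body, member names, and the second error message are assumptions.

// Static factory: validates the incoming pybind11 array, then hands the raw
// pointer to a thin constructor that merely wraps it.
Pennylane::StateVector Pennylane::StateVector::create(const pybind11::array_t<CplxType>* numpyArray) {
    pybind11::buffer_info numpyArrayInfo = numpyArray->request();
    if (numpyArrayInfo.ndim != 1)
        throw std::invalid_argument("NumPy array must be a 1-dimensional array");
    if (numpyArrayInfo.itemsize != sizeof(CplxType))
        throw std::invalid_argument("NumPy array element size must match CplxType");
    return StateVector((CplxType*)numpyArrayInfo.ptr, numpyArrayInfo.shape[0]);
}

// Constructor: no pybind11 dependency, so it can be unit tested directly.
Pennylane::StateVector::StateVector(CplxType* const arr, const size_t length)
    : arr(arr), length(length) {}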

@ThomasLoke
Contributor Author

Going forward, we were thinking of the following approach:

Sounds good to me!

Contributor

@trbromley trbromley left a comment

Thanks @ThomasLoke, really nice addition and I'm a big fan of getting rid of Eigen and using our own approach. We'd like to merge this in ASAP, hopefully over the next few days, and we just need to decide practical details like whether we replace the old backend immediately or keep both for a time. Will keep you updated. Thanks!

durations.append(endTime - startTime)
gc.collect()

return [statistics.mean(durations), statistics.stdev(durations)]
Contributor

Nice! I'm excited by this new backend!!

Going forward, it would be interesting to use the benchmarking suite we've been working on recently:
https://github.com/PennyLaneAI/benchmark

Do we have any intuition about speedups in the low qubit setting? E.g., do we expect new and old backends to be about the same speed for <10 qubits?

Contributor Author

Do we have any intuition about speedups in the low qubit setting? E.g., do we expect new and old backends to be about the same speed for <10 qubits?

It's hard to say, given that anything less than 10 qubits was difficult to even time in isolation with a millisecond resolution, at least for single-gate operations. I think the worst-case scenario for the new backend is using 3-qubit gates, as its implementation of matrix multiplication is wholly unoptimised, whereas Eigen probably uses a specialised kernel for it. That said, we'll want to move away from using this approach anyway in favour of more optimised kernels on a per-gate basis, so I don't think this is a big deal.

"""PennyLane Lightning device.

An extension of PennyLane's built-in ``default.qubit`` device that interfaces with C++ to
perform fast linear algebra calculations
Contributor

Suggested change
perform fast linear algebra calculations
perform fast linear algebra calculations.


Use of this device requires pre-built binaries or compilation from source. Check out the
:doc:`/installation` guide for more details.

Contributor

In the other version of this file, we have a 50+ qubit warning - does this no longer hold here? On the other hand, it's unlikely we'll need to go that high anyway 😆

Contributor Author

Well, it doesn't need to generate templates at compile-time for each possible number of qubits, so it's not restricted by that. The issue is that the address space of a 64-bit machine only goes up to, well, 64 bits, so in theory 64 qubits is as high as it goes. Probably more like 62 or 63 qubits given the need for additional allocations.

supports_inverse_operations=False,
supports_analytic_computation=True,
returns_state=True,
)
Contributor

Suggested change
)
)
capabilities.pop("passthru_devices", None)

Would probably get spotted when merging, but just adding here in case


}

l_opts = {
"msvc": [],
"unix": ["-O3", "-Wall", "-fPIC", "-shared", "-fopenmp"],
"unix": ["-O3", "-W", "-fPIC", "-shared", "-fopenmp"],
}

if platform.system() == "Darwin":
Contributor

(Not needed for this PR) We can eventually remove the Eigen lines from line 124 and also the mentions in the docs 🎉



// -------------------------------------------------------------------------------------------------------------

Pennylane::AbstractGate::AbstractGate(int numQubits)
Contributor

One general comment here: it looks great as is, but there is a risk of divergence between supported operations in PL and supported operations in lightning.qubit.

Currently, we pass a list of operation names, a list of parameters, and a list of wires from the Python side to the C++ side of lightning.qubit. What about passing just a list of matrices and a list of wires? I think matrix construction is fairly efficient even on the Python side.

In this way, we don't have to worry about defining operations and their matrices on the C++ side. But I'm not sure if that will result in a performance hit?

Contributor Author

In this way, we don't have to worry about defining operations and their matrices on the C++ side. But I'm not sure if that will result in a performance hit?

The idea is that we'll want to move away from defining gates as matrices altogether in favour of more specialised kernels per-gate, so implementation-wise this all should be kept in C++.

Shouldn't the risk of divergence be minimised by having a test suite that tests the full range of supported operations?

@ThomasLoke
Contributor Author

We'd like to merge this in ASAP, hopefully over the next few days, and we just need to decide practical details like whether we replace the old backend immediately or keep both for a time.

Cool! I'll look to push the remaining review changes tonight once I get the time.

@ThomasLoke
Contributor Author

I think I've addressed the reviews--let me know if I've missed anything.

Contributor

@antalszava antalszava left a comment

@ThomasLoke thank you so much again for the amazing work, looks good to me! 🏆 💯 🙂

Had a minor suggestion. Also, thanks for the discussions, they were really valuable. :)

return dispatchTable;
}

static const map<string, function<unique_ptr<Pennylane::AbstractGate>(const vector<double>&)>> dispatchTable = createDispatchTable();
Contributor

🥳

Comment on lines 59 to 63
auto it = dispatchTable.find(label);
if (it == dispatchTable.end())
throw std::invalid_argument(label + " is not a supported gate type");

return it->second(parameters);
Contributor

(minor)

Suggested change
auto it = dispatchTable.find(label);
if (it == dispatchTable.end())
throw std::invalid_argument(label + " is not a supported gate type");
return it->second(parameters);
auto dispatchTableIterator = dispatchTable.find(label);
if (dispatchTableIterator == dispatchTable.end())
throw std::invalid_argument(label + " is not a supported gate type");
return dispatchTableIterator->second(parameters);

Comment on lines +26 to +31
template<class GateType>
static void addToDispatchTable(map<string, function<unique_ptr<Pennylane::AbstractGate>(const vector<double>&)>>& dispatchTable) {
dispatchTable.emplace(GateType::label, [](const vector<double>& parameters) { return std::make_unique<GateType>(GateType::create(parameters)); });
}

static map<string, function<unique_ptr<Pennylane::AbstractGate>(const vector<double>&)>> createDispatchTable() {
Contributor

Thanks for adding this! Looking 🔥

@ThomasLoke
Contributor Author

No worries--pleasure is all mine :)

Contributor

@trbromley trbromley left a comment

Thanks @ThomasLoke! This is a very exciting PR, approved! 💯

One remaining thing is that my understanding of the underlying approach is still not yet solid. Would you mind giving a quick summary of how it works? It'll help me to see the underlying objective outside of the C++ implementation.

Thanks again for this!

@antalszava antalszava changed the base branch from master to new_backend February 25, 2021 22:40
@antalszava
Contributor

Hi @ThomasLoke, the wheels check has started failing with a segmentation fault, although none of the changes here seem to have caused it. This might be an error with pytest.

Will merge this PR into a separate branch for further investigation before merging into master. The authorship of this contribution will be kept in the process.

@antalszava antalszava merged commit 7348978 into PennyLaneAI:new_backend Feb 25, 2021
@ThomasLoke
Contributor Author

One remaining thing is that my understanding of the underlying approach is still not yet solid. Would you mind giving a quick summary of how it works? It'll help me to see the underlying objective outside of the C++ implementation.

TBH there's not that much to it. It accomplishes the same thing as the old backend, that is, it computes the tensor contraction of a specified gate sequence with a state vector. It does this by computing the bit patterns associated with each application of the gate's matrix, e.g. if applying some two-qubit gate G to qubits 2-3 (out of 4), then the bit patterns to which we apply the matrix are 0XY0, 0XY1, 1XY0, 1XY1 (each denoting a sub-vector of length 4). Most of the apply logic just goes into computing the associated indices for the relevant bit patterns, gathering the sub-vector based on those indices, applying the matrix, and scattering the results back to the state vector.
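A conceptual sketch of that gather/apply/scatter step is below. It is illustrative only: the function and parameter names are not the PR's, and the real code computes the index sets via generateBitPatterns and leaves room for the per-gate optimisations mentioned earlier.

#include <complex>
#include <cstddef>
#include <vector>

using CplxType = std::complex<double>;

// state:           full state vector of length 2^n
// matrix:          row-major 2^k x 2^k gate matrix
// externalIndices: decimal value of each bit pattern over the non-target qubits
//                  (the fixed bits, with zeros in the target positions)
// internalOffsets: decimal value of each bit pattern over the target qubits
void applyGateSketch(std::vector<CplxType>& state,
                     const std::vector<CplxType>& matrix,
                     const std::vector<std::size_t>& externalIndices,
                     const std::vector<std::size_t>& internalOffsets) {
    const std::size_t dim = internalOffsets.size(); // 2^k
    std::vector<CplxType> gathered(dim), result(dim);
    for (std::size_t base : externalIndices) {
        // Gather the sub-vector selected by this bit pattern.
        for (std::size_t i = 0; i < dim; i++)
            gathered[i] = state[base + internalOffsets[i]];
        // Apply the gate matrix to the sub-vector.
        for (std::size_t i = 0; i < dim; i++) {
            result[i] = CplxType(0, 0);
            for (std::size_t j = 0; j < dim; j++)
                result[i] += matrix[i * dim + j] * gathered[j];
        }
        // Scatter the result back into the state vector.
        for (std::size_t i = 0; i < dim; i++)
            state[base + internalOffsets[i]] = result[i];
    }
}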

antalszava added a commit that referenced this pull request Feb 26, 2021
…ranch (#66)

* Add new lightweight backend with performance comparisons (#57)

* Tone down warning levels

* Black reformatting

* New backend + benchmark notebook

* Add compilation for Eigen backend back in again

* Fix compilation with gcc

* Refactor construction of StateVector from numpy array into a static method

* Move method into utility header

* Review changes

* More review changes + add function dispatch table

* Remove outdated comment

* Rename variable

Co-authored-by: antalszava <[email protected]>

* use old GateFactory.cpp from commit cff579f

* revert 6124db8

* use old GateFactory.cpp from commit cff579f

Co-authored-by: Thomas Loke <[email protected]>