diff --git a/doc/source/code-examples/applications.rst b/doc/source/code-examples/applications.rst index 3687a2b79f..aca52c059a 100644 --- a/doc/source/code-examples/applications.rst +++ b/doc/source/code-examples/applications.rst @@ -55,6 +55,7 @@ High-Energy Physics :maxdepth: 1 tutorials/qPDF/qPDF.ipynb + tutorials/anomaly_detection/README.md Quantum Physics @@ -105,6 +106,7 @@ Variational Quantum Circuits tutorials/unary/README.md tutorials/bell-variational/README.md tutorials/qPDF/qPDF.ipynb + tutorials/anomaly_detection/README.md Grover's Algorithm diff --git a/doc/source/code-examples/tutorials/anomaly_detection/README.md b/doc/source/code-examples/tutorials/anomaly_detection/README.md new file mode 120000 index 0000000000..631704e07d --- /dev/null +++ b/doc/source/code-examples/tutorials/anomaly_detection/README.md @@ -0,0 +1 @@ +../../../../../examples/anomaly_detection/README.md \ No newline at end of file diff --git a/doc/source/code-examples/tutorials/anomaly_detection/images b/doc/source/code-examples/tutorials/anomaly_detection/images new file mode 120000 index 0000000000..9bee242a0a --- /dev/null +++ b/doc/source/code-examples/tutorials/anomaly_detection/images @@ -0,0 +1 @@ +../../../../../examples/anomaly_detection/images/ \ No newline at end of file diff --git a/examples/README.md b/examples/README.md index 0a1a1db7fc..efc9eea474 100644 --- a/examples/README.md +++ b/examples/README.md @@ -21,6 +21,7 @@ physics problems. - [Maximal violation of Bell inequalities variationally](bell-variational/README.md) - [Feedback-based ALgorithm for Quantum OptimizatioN - FALQON](falqon/README.md) - [A general Grover model](grover/README.md) +- [Quantum anomaly detection](anomaly_detection/README.md) In the `benchmarks` folder we have included examples concerning: - A generic benchmark script for multiple circuits (`benchmarks/main.py`) diff --git a/examples/anomaly_detection/README.md b/examples/anomaly_detection/README.md new file mode 100644 index 0000000000..3d14b099d3 --- /dev/null +++ b/examples/anomaly_detection/README.md @@ -0,0 +1,89 @@ +# Anomaly detection with variational quantum circuits + +Code at: [https://github.com/qiboteam/qibo/tree/master/examples/anomaly_detection](https://github.com/qiboteam/qibo/tree/master/examples/anomaly_detection). + +## Problem overview + +With the contemporary peak in interest regarding machine learning algorithms for their many applications in scientific research as well as industrial technology, we also have the simultaneous development of quantum computing. A combination of these two fields has lead to the development of quantum machine learning algorithms. +With this example we want to study the quantum version of a classic machine learning algorithm known as [anomaly detection](https://arxiv.org/abs/2007.02500). This algorithm is implemented with an artificial neural network, in particular an [autoencoder](https://arxiv.org/abs/2003.05991). In quantum machine learning, the autoencoder is realised using a variational quantum circuit. The proposed algorithm is not meant to outperform the classical counterpart on classical data. This work aims to demonstrate that it is possible to use quantum variational algorithms for anomaly detection with possible future advantages in the analysis of quantum data. + +## Background + +In this section we want to explain the main elements to understand the proposed quantum anomaly detection algorithm. + +### Anomaly detection + +Anomaly detection is a classification algorithm that allows to identify anomalous data. The advantage in using this machine learning technique is that only a dataset with non anomalous (standard) data samples is required for the training. +To achieve this it's necessary to train a particular artificial neural network (ANN) architecture called autoencoder. An autoencoder is composed of two main parts: encoder and decoder. + +![Autoencoder architecture](images/Fig1.png) + +The encoder compresses initial data down to a small dimension (called latent dimension). The decoder inverts the process to reconstruct the original data from the compressed one. The parameters of the neural network are trained in order to minimize the difference between the initial and reconstructed data. The loss function (also called reconstruction loss) is therefore a measure of how accurately the reconstructed data resembles the original. + +For anomaly detection, the autoencoder is trained only on data samples belonging to the standard class. When the trained model is applied to new samples we expect the loss function to have different values for standard and anomalous data. +By choosing a threshold value for the loss function it is possible to classify an input based on whether its reconstruction loss lands above or below this threshold. The ROC curve (Receiver Operating Characteristic) indicates the true positive rate and false positive rate as a function of the threshold. This can help to set the threshold value in order to maximize true positive classifications and minimize false positives. + +### Variational quantum circuits + +A Variational Quantum Circuit (VQC), also known as parametrized quantum circuit, can be used as the quantum counterpart of classical ANNs. In this kind of circuits the input information is stored in the initial state of the qubits. It can be stored as the phase (phase encoding) or in the states amplitudes (amplitude encoding). The initial state is transformed using rotation gates and entangling gates, usually controlled-not (C-NOT) gates. These gates can be organised in layers, in this circuit architecture one layer is composed of rotation gates (R_x, R_y, R_z) acting on all qubits followed by a series of C-NOT gates coupling neighbouring qubits. The trainable weights are the angles of rotation gates and can be trained using standard backpropagation (implemented with Tensorflow). + +![variational quantum circuit (one layer)](images/Fig2.png) + +A quantum circuit implements a unitary, thus invertible, transformation on the initial state. This represents a great advantage for the autoencoder architecture, as the decoder can be taken as the inverse of the encoder quantum circuit. In order to compress information the encoder circuit has to disentangle and set to zero state a given number of qubits. The loss function is thus taken as the expected measurement values of these qubits. In this way, for the training of the circuit, it is necessary only the encoder. + +![Quantum autoencoder](images/Fig3.png) + +## Algorithm implementation + +This section refers to the optimal algorithm parameters (default). + +Anomaly detection on handwritten digits is carried out on the MNIST dataset using zeros as the standard data and ones as the anomalous data. We compressed the images down to 8X8 pixels, in this way it is possible to encode initial data in six qubits. It is necessary to normalise the initial data array so that it can be encoded as state amplitudes. +The best configuration has been found with six layers and three compressed qubits. For entangling gates we have tested different C-NOT configurations, the one that gave better performance is reported in figure below. +Moreover this configuration requires only nearest neighbour connectivity for six qubits placed in a ring topology. In order to improve the performance, rotation gates with trainable parameters were added at the end of the encoder circuit for the three compressed qubits. A summary of the employed circuit is reported in the next figure. + +![Circuit ansatz](images/Fig4.png) + +### Training + +For the training of the circuit a dataset of 5000 images of zero handwritten digits has been employed. The loss function is the sum of the probabilities of the ground state for the first three qubits, thus these qubits are forced to the |1> state. +Training has been performed for 20 epochs using [Adam optimizer](https://arxiv.org/pdf/1412.6980.pdf), with a dynamic learning rate that spans from 0.4 in the first epochs to 0.001 in the last ones. This variable learning rate has helped reducing the problem of [barren plateaus](https://arxiv.org/pdf/1803.11173.pdf). + +### Performance evaluation + +To test the anomaly detection algorithm after the training phase, we have used 2000 standard images not used in the training and 2000 anomalous images. Figure below shows the loss distribution for the two test datasets. + +![Loss function distribution](images/Fig5.png) + +The ROC curve shows the rate of true positive with respect to the rate of false positive by moving the loss value threshold for the binary classification. + +![ROC curve](images/Fig6.png) + +## How to run an example? + +The code is divided into two parts, training of the circuit (`train.py`) and performance evaluation (`test.py`). + +It is possible to define the following hyper-parameters for the training of the circuit (default have good performance): +- `n_layers` (int): number of ansatz circuit layers (default 6). +- `batch_size` (int): number of samples in one training batch (default 20). +- `nepochs` (int): number of training epochs (default 20). +- `train_size` (int): number of samples used for training, the remainings are used for performance evaluation, total samples are 7000 (default 5000). +- `filename` (str): location and file name where trained parameters are saved (default "parameters/trained_params.npy"). +- `lr_boundaries` (list): epochs when learning rate is reduced, 6 monotone growing values from 0 to nepochs (default [3,6,9,12,15,18]). + +It is possible to define the following hyper-parameters for the performance evaluation of the circuit, `n_layers` must be equal to the one used for training: +- `n_layers` (int): number of ansatz circuit layers (default 6). +- `train_size` (int): number of samples used for training, the remainings are used for performance evaluation, total samples are 7000 (default 5000). +- `filename` (str): location and file name of trained parameters to be tested (default "parameters/trained_params.npy"). +- `plot` (bool): make plots of ROC and loss function distribution (default True). +- `save_loss` (bool): save losses for standard and anomalous data (default False). + +As an example, in order to use 4 layers in the variational quantum ansatz, you should execute the following command for training: + +```bash +python train.py --n_layers 4 +``` +And the following command for performance evaluation: + +```bash +python test.py --n_layers 4 +``` diff --git a/examples/anomaly_detection/data/anomalous_data.npy b/examples/anomaly_detection/data/anomalous_data.npy new file mode 100644 index 0000000000..0d57f3d40d Binary files /dev/null and b/examples/anomaly_detection/data/anomalous_data.npy differ diff --git a/examples/anomaly_detection/data/standard_data.npy b/examples/anomaly_detection/data/standard_data.npy new file mode 100644 index 0000000000..7bd92673e2 Binary files /dev/null and b/examples/anomaly_detection/data/standard_data.npy differ diff --git a/examples/anomaly_detection/images/Fig1.png b/examples/anomaly_detection/images/Fig1.png new file mode 100644 index 0000000000..7c072874d2 Binary files /dev/null and b/examples/anomaly_detection/images/Fig1.png differ diff --git a/examples/anomaly_detection/images/Fig2.png b/examples/anomaly_detection/images/Fig2.png new file mode 100644 index 0000000000..858a00658e Binary files /dev/null and b/examples/anomaly_detection/images/Fig2.png differ diff --git a/examples/anomaly_detection/images/Fig3.png b/examples/anomaly_detection/images/Fig3.png new file mode 100644 index 0000000000..e7f2b8cdf2 Binary files /dev/null and b/examples/anomaly_detection/images/Fig3.png differ diff --git a/examples/anomaly_detection/images/Fig4.png b/examples/anomaly_detection/images/Fig4.png new file mode 100644 index 0000000000..7f2c7a35ec Binary files /dev/null and b/examples/anomaly_detection/images/Fig4.png differ diff --git a/examples/anomaly_detection/images/Fig5.png b/examples/anomaly_detection/images/Fig5.png new file mode 100644 index 0000000000..b559fd425c Binary files /dev/null and b/examples/anomaly_detection/images/Fig5.png differ diff --git a/examples/anomaly_detection/images/Fig6.png b/examples/anomaly_detection/images/Fig6.png new file mode 100644 index 0000000000..f79c15eee5 Binary files /dev/null and b/examples/anomaly_detection/images/Fig6.png differ diff --git a/examples/anomaly_detection/parameters/trained_params.npy b/examples/anomaly_detection/parameters/trained_params.npy new file mode 100644 index 0000000000..e14de68391 Binary files /dev/null and b/examples/anomaly_detection/parameters/trained_params.npy differ diff --git a/examples/anomaly_detection/test.py b/examples/anomaly_detection/test.py new file mode 100644 index 0000000000..6bf03a27b0 --- /dev/null +++ b/examples/anomaly_detection/test.py @@ -0,0 +1,202 @@ +import argparse + +import matplotlib.pyplot as plt +import numpy as np +import tensorflow as tf + +import qibo +from qibo import gates +from qibo.models import Circuit + + +def main(n_layers, train_size, filename, plot, save_loss): + """Implements performance evaluation of a trained circuit. + + Args: + n_layers (int): number of ansatz circuit layers (default 6). + train_size (int): number of samples used for training, the remainings are used for performance evaluation, total samples are 7000 (default 5000). + filename (str): location and file name of trained parameters to be tested (default "parameters/trained_params.npy"). + plot (bool): make plots of ROC and loss function distribution (default True). + save_loss (bool): save losses for standard and anomalous data (default False). + """ + + qibo.set_backend("tensorflow") + + # Circuit ansatz + def make_encoder(n_qubits, n_layers, params, q_compression): + """Create encoder quantum circuit. + + Args: + n_qubits (int): number of qubits in the circuit. + n_layers (int): number of ansatz circuit layers. + params (tf.Tensor): parameters of the circuit. + q_compression (int): number of compressed qubits. + + Returns: + encoder (:class:`qibo.models.circuit.Circuit`): variational quantum circuit. + """ + + index = 0 + encoder = Circuit(n_qubits) + for i in range(n_layers): + for j in range(n_qubits): + encoder.add(gates.RX(j, params[index])) + encoder.add(gates.RY(j, params[index + 1])) + encoder.add(gates.RZ(j, params[index + 2])) + index += 3 + + for j in range(n_qubits): + encoder.add(gates.CNOT(j, (j + 1) % n_qubits)) + + for j in range(q_compression): + encoder.add(gates.RX(j, params[index])) + encoder.add(gates.RY(j, params[index + 1])) + encoder.add(gates.RZ(j, params[index + 2])) + index += 3 + return encoder + + # Evaluate loss function (3 qubit compression) for one sample + def compute_loss_test(encoder, vector): + """Evaluate loss function for one test sample. + + Args: + encoder (:class:`qibo.models.circuit.Circuit`): variational quantum circuit (trained). + vector (tf.Tensor): test sample, in the form of 1d vector. + + Returns: + loss (tf.Variable): loss of the test sample. + """ + reconstructed = encoder(vector) + # 3 qubits compression + loss = ( + reconstructed.probabilities(qubits=[0])[0] + + reconstructed.probabilities(qubits=[1])[0] + + reconstructed.probabilities(qubits=[2])[0] + ) + return loss + + # Other hyperparameters + n_qubits = 6 + q_compression = 3 + + # Load and pre-process data + dataset_np_s = np.load("data/standard_data.npy") + dataset_np_s = dataset_np_s[train_size:] + dataset_s = tf.convert_to_tensor(dataset_np_s) + dataset_np_a = np.load("data/anomalous_data.npy") + dataset_np_a = dataset_np_a[train_size:] + dataset_a = tf.convert_to_tensor(dataset_np_a) + + # Load trained parameters + trained_params_np = np.load(filename) + trained_params = tf.convert_to_tensor(trained_params_np) + + # Create and print encoder circuit + encoder_test = make_encoder(n_qubits, n_layers, trained_params, q_compression) + encoder_test.compile() + print("Circuit model summary") + print(encoder_test.draw()) + + print("Computing losses...") + # Compute loss for standard data + loss_s = [] + for i in range(len(dataset_np_s)): + loss_s.append(compute_loss_test(encoder_test, dataset_s[i]).numpy()) + + # Compute loss for anomalous data + loss_a = [] + for i in range(len(dataset_np_a)): + loss_a.append(compute_loss_test(encoder_test, dataset_a[i]).numpy()) + + if save_loss: + np.save("results/losses_standard_data", loss_s) + np.save("results/losses_anomalous_data", loss_a) + + # Make graphs for performance analysis + if plot: + + """Loss distribution graph""" + plt.hist(loss_a, bins=60, histtype="step", color="red", label="Anomalous data") + plt.hist(loss_s, bins=60, histtype="step", color="blue", label="Standard data") + plt.ylabel("Number of images") + plt.xlabel("Loss value") + plt.title("Loss function distribution (MNIST dataset)") + plt.legend() + plt.savefig("results/loss_distribution.png") + plt.close() + + """Compute ROC curve""" + max1 = np.amax(loss_s) + max2 = np.amax(loss_a) + ma = max(max1, max2) + min1 = np.amin(loss_s) + min2 = np.amin(loss_a) + mi = min(min1, min2) + + tot_neg = len(loss_s) + tot_pos = len(loss_a) + + n_step = 100.0 + n_step_int = 100 + step = (ma - mi) / n_step + fpr = [] + tpr = [] + for i in range(n_step_int): + treshold = i * step + mi + c = 0 + for j in range(tot_neg): + if loss_s[j] > treshold: + c += 1 + false_positive = c / float(tot_neg) + fpr.append(false_positive) + c = 0 + for j in range(tot_pos): + if loss_a[j] > treshold: + c += 1 + true_positive = c / float(tot_pos) + tpr.append(true_positive) + + """Roc curve graph """ + plt.title("Receiver Operating Characteristic") + plt.plot(fpr, tpr) + plt.xlim([0, 1]) + plt.ylim([0, 1]) + plt.ylabel("True Positive Rate") + plt.xlabel("False Positive Rate") + plt.savefig("results/ROC.png") + + +if __name__ == "__main__": + parser = argparse.ArgumentParser() + parser.add_argument( + "--n_layers", + default=6, + type=int, + help="(int): number of ansatz circuit layers", + ) + parser.add_argument( + "--train_size", + default=5000, + type=int, + help="(int): number of samples used for training, the remainings are used for performance evaluation (total samples 7000)", + ) + parser.add_argument( + "--filename", + default="parameters/trained_params.npy", + type=str, + help="(str): location and file name of trained parameters to be tested", + ) + parser.add_argument( + "--plot", + default=True, + type=bool, + help="(bool): make plots of ROC and loss function distribution", + ) + parser.add_argument( + "--save_loss", + default=False, + type=bool, + help="(bool): save losses for standard and anomalous data", + ) + args = parser.parse_args() + main(**vars(args)) diff --git a/examples/anomaly_detection/train.py b/examples/anomaly_detection/train.py new file mode 100644 index 0000000000..a112160ed2 --- /dev/null +++ b/examples/anomaly_detection/train.py @@ -0,0 +1,203 @@ +import argparse +import math + +import numpy as np +import tensorflow as tf +from tensorflow.keras.optimizers import Adam, schedules + +import qibo +from qibo import gates +from qibo.models import Circuit + + +def main(n_layers, batch_size, nepochs, train_size, filename, lr_boundaries): + """Implements training of variational quantum circuit. + + Args: + n_layers (int): number of ansatz circuit layers (default 6). + batch_size (int): number of samples in one training batch (default 20). + nepochs (int): number of training epochs (default 20). + train_size (int): number of samples used for training, the remainings are used for performance evaluation, total samples are 7000 (default 5000). + filename (str): location and file name where trained parameters are saved (default "parameters/trained_params.npy"). + lr_boundaries (list): epochs when learning rate is reduced, 6 monotone growing values from 0 to nepochs (default [3,6,9,12,15,18]). + """ + + qibo.set_backend("tensorflow") + + # Circuit ansatz + def make_encoder(n_qubits, n_layers, params, q_compression): + """Create encoder quantum circuit. + + Args: + n_qubits (int): number of qubits in the circuit. + n_layers (int): number of ansatz circuit layers. + params (tf.Variable): parameters of the circuit. + q_compression (int): number of compressed qubits. + + Returns: + encoder (:class:`qibo.models.circuit.Circuit`): variational quantum circuit. + """ + + index = 0 + encoder = Circuit(n_qubits) + for i in range(n_layers): + for j in range(n_qubits): + encoder.add(gates.RX(j, params[index])) + encoder.add(gates.RY(j, params[index + 1])) + encoder.add(gates.RZ(j, params[index + 2])) + index += 3 + + for j in range(n_qubits): + encoder.add(gates.CNOT(j, (j + 1) % n_qubits)) + + for j in range(q_compression): + encoder.add(gates.RX(j, params[index])) + encoder.add(gates.RY(j, params[index + 1])) + encoder.add(gates.RZ(j, params[index + 2])) + index += 3 + return encoder + + # Evaluate loss function (3 qubit compression) for one sample + @tf.function + def compute_loss(encoder, params, vector): + """Evaluate loss function for one train sample. + + Args: + encoder (:class:`qibo.models.circuit.Circuit`): variational quantum circuit. + params (tf.Variable): parameters of the circuit. + vector (tf.Tensor): train sample, in the form of 1d vector. + + Returns: + loss (tf.Variable): loss of the training sample. + """ + + encoder.set_parameters(params) + reconstructed = encoder(vector) + # 3 qubits compression + loss = ( + reconstructed.probabilities(qubits=[0])[0] + + reconstructed.probabilities(qubits=[1])[0] + + reconstructed.probabilities(qubits=[2])[0] + ) + return loss + + # One optimization step + @tf.function + def train_step(batch_size, encoder, params, dataset): + """Evaluate loss function on one train batch. + + Args: + batch_size (int): number of samples in one training batch. + encoder (:class:`qibo.models.circuit.Circuit`): variational quantum circuit. + params (tf.Variable): parameters of the circuit. + vector (tf.Tensor): train sample, in the form of 1d vector. + + Returns: + loss (tf.Variable): average loss of the training batch. + """ + + loss = 0.0 + with tf.GradientTape() as tape: + for sample in range(batch_size): + loss = loss + compute_loss(encoder, params, dataset[sample]) + loss = loss / batch_size + grads = tape.gradient(loss, params) + optimizer.apply_gradients(zip([grads], [params])) + return loss + + # Other hyperparameters + n_qubits = 6 + q_compression = 3 + + # Load and pre-process data + dataset_np = np.load("data/standard_data.npy") + dataset = tf.convert_to_tensor(dataset_np) + train = dataset[0:train_size] + + # Initialize random parameters + n_params = (n_layers * n_qubits + q_compression) * 3 + params = tf.Variable(tf.random.normal((n_params,))) + + # Create and print encoder circuit + encoder = make_encoder(n_qubits, n_layers, params, q_compression) + print("Circuit model summary") + print(encoder.draw()) + + # Define optimizer parameters + steps_for_epoch = math.ceil(train_size / batch_size) + boundaries = [ + steps_for_epoch * lr_boundaries[0], + steps_for_epoch * lr_boundaries[1], + steps_for_epoch * lr_boundaries[2], + steps_for_epoch * lr_boundaries[3], + steps_for_epoch * lr_boundaries[4], + steps_for_epoch * lr_boundaries[5], + ] + values = [0.4, 0.2, 0.08, 0.04, 0.01, 0.005, 0.001] + learning_rate_fn = tf.keras.optimizers.schedules.PiecewiseConstantDecay( + boundaries, values + ) + optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate_fn) + + # This array contains the parameters at each epoch + trained_params = np.zeros((nepochs, n_params), dtype=float) + + # Training + print("Trained parameters will be saved in: ", filename) + print("Start training") + for epoch in range(nepochs): + tf.random.shuffle(train) + for i in range(steps_for_epoch): + loss = train_step( + batch_size, + encoder, + params, + train[i * batch_size : (i + 1) * batch_size], + ) + trained_params[epoch] = params.numpy() + print("Epoch: %d Loss: %f" % (epoch + 1, loss)) + + # Save parameters of last epoch + np.save(filename, trained_params[-1]) + + +if __name__ == "__main__": + parser = argparse.ArgumentParser() + parser.add_argument( + "--n_layers", + default=6, + type=int, + help="(int): number of ansatz circuit layers", + ) + parser.add_argument( + "--batch_size", + default=20, + type=int, + help="(int): number of samples in one training batch", + ) + parser.add_argument( + "--nepochs", + default=20, + type=int, + help="(int): number of training epochs", + ) + parser.add_argument( + "--train_size", + default=5000, + type=int, + help="(int): number of samples used for training, the remainings are used for performance evaluation (total samples 7000)", + ) + parser.add_argument( + "--filename", + default="parameters/trained_params.npy", + type=str, + help="(str): location and file name where trained parameters are saved", + ) + parser.add_argument( + "--lr_boundaries", + default=[3, 6, 9, 12, 15, 18], + type=list, + help="(list): epochs when learning rate is reduced (6 monotone growing values from 0 to nepochs)", + ) + args = parser.parse_args() + main(**vars(args)) diff --git a/examples/test_examples.py b/examples/test_examples.py index 12e2de30b4..fc59964a9b 100644 --- a/examples/test_examples.py +++ b/examples/test_examples.py @@ -318,3 +318,19 @@ def test_grover_example3(nqubits, num_1): sys.path[-1] = path os.chdir(path) run_script(args, script_name="example3.py") + + +@pytest.mark.parametrize("n_layers", [6]) +@pytest.mark.parametrize("batch_size", [20]) +@pytest.mark.parametrize("nepochs", [7]) +@pytest.mark.parametrize("train_size", [100]) +@pytest.mark.parametrize("filename", ["parameters/test_params"]) +@pytest.mark.parametrize("lr_boundaries", [[1, 2, 3, 4, 5, 6]]) +def test_anomalydetection_train( + n_layers, batch_size, nepochs, train_size, filename, lr_boundaries +): + args = locals() + path = os.path.join(base_dir, "anomaly_detection") + sys.path[-1] = path + os.chdir(path) + run_script(args, script_name="train.py")