Mathematical framework for Quantum Computing
(lecture-4)=
:class: tip
In this lecture we discuss the mathematical framework and tools required to properly understand how quantum systems work. We give a short introduction to the notions of vector spaces and linear operators, and how they connect to physical reality.
Let us go through some of the building blocks of quantum mechanics, and see how the required mathematical framework emerges.
The uncertainty principle is a fundamental principle of quantum mechanics. It states that certain pairs of physical properties can't be measured simultaneously with arbitrary precision, for example a particle's position and momentum. The properties which form such pairs are called conjugate properties. In the coming sections we will discuss more about them. In fact the product of the uncertainties of two conjugate properties is bounded from below. If the conjugate properties are position $x$ and momentum $p$, with uncertainties $\Delta x$ and $\Delta p$, then

$$
\Delta x\, \Delta p \ge \frac{\hbar}{2}
$$

where $\hbar$ is the reduced Planck constant.
In everyday life we use the term state
when we refer to the situation or condition somebody or something is in. We make it more specific by adding adjectives to it, as in a state of weakness, a state of pain, etc. In the case of a physical system, the same idea extends to a comprehensive, quantifiable definition. The state of a physical system refers to something that can provide complete physically accessible information about the system at a given instant of time. This is quantified by first determining the number of independent quantities required for complete information, and then assigning a variable to each of those.
A numerical value for each of those variables defines the state of the system. Formally -
A state is a set of variables describing a system which does not include anything about its history.
This set of variables is assumed to be minimal, so that removing any component leaves only partial information about the system. Take for example the case of a single isolated particle.
Then its position and velocity are two independent minimal properties. If the motion of this particle is constrained along a line, then a single position variable $x$ and a single velocity variable $v$ suffice, and the state is the ordered pair $(x, v)$.
Note that this set has to be ordered so there is no confusion about the meaning of each of its elements. Such ordered sets are essentially what one calls vectors. We discuss vectors in more detail in the coming sections.
Add Figure
A reference frame refers to a chosen point of origin, and a given orientation of coordinate axes in which a position is measured.
:class: tip
The degrees of freedom of a system is a number associated with it. It is the number of independent ways one
can change the system's state. Alternatively, it is also the minimum number of individual quantities required to specify the state.
In case of a single particle, its degrees of freedom is 6 (three position and three velocity components). For a system of $N$ such particles, it is $6N$.
Now consider a quantum particle, and think whether we can use the above mentioned set of positions and velocities/momenta to define its state. The uncertainty principle says we cannot: conjugate properties such as position and momentum cannot simultaneously have definite values.
So how do we define the state of a quantum system? Here one postulates that it is possible to define a state
of a quantum system, represented as some type of abstract object. This object will likely have multiple components associated with it, one for each degree of freedom. It will thus be a vector
of some kind. For the moment we call it an abstract vector, as we don't know its structure. As we go through the sections, we will explore how this structure emerges, and how it works.
So, for the moment, we say that the state of a quantum system is represented by an abstract vector. In due course we will learn several synonyms for this abstract vector defining the state of a quantum system.
Superposition essentially means applying multiple effects/stimuli/attributes to a system/object.
In general, the superposition principle applies anywhere more than one attribute can be imposed on a single object. Formally speaking, the principle states that in all linear systems, the net response caused by two or more stimuli on a system is the sum of the individual responses that would have been caused by each individual stimulus.
The sum of the responses to the individual effects is the same as the response to the sum of the individual effects.
Once we go through the mathematical structure in the coming sections, it will become clear that the superposition principle is essentially a consequence of linearity in the system, or linear behaviour.
In quantum mechanics, superposition applies to states. What that means is that if a system can exist in, say, two states, then it can also exist in any superposition of those two states.
Figure for illustration
The notion of interference in quantum systems is very similar to what it means for waves. When two waves are combined, their instantaneous intensities or displacements are added. The intensity or displacement of the resulting wave can be greater or lower depending on their relative phase.
In case of quantum systems, as we will see, probability amplitudes take the role of intensities or displacements.
In quantum mechanics, a measurement is testing or manipulation of a physical system that yields a numerical result.
This in principle doesn't look much different than a measurement of a classical system. However, in the quantum realm,
the process of measuring
something also changes the underlying system.
Here we briefly discuss the necessary mathematical structures upon which the formulation of quantum mechanics relies. It progressively goes as follows -
```{mermaid}
mindmap
  root(Mathematical Framework)
    Vector Space, Hilbert Space
      Scalar vs Vector
      States as Vector (Bra and Ket)
      Linear combination
      Linear independence
    Inner Product
      Overlap of vectors
      Orthogonality
    Linear Operators
      Commutativity
      Special operators: Unitary, Hermitian, ...
      Probability conservation
    Representation theory
```
Any quantity that can be described by a single numerical value, usually real and sometimes complex, is a scalar. Common examples of scalars are volume, density, speed, energy, mass and time.
Quantities that require multiple numerical values, with each value describing some aspect or attribute of the quantity, are called vectors. The number of these values is called the dimension
of the vector. Common examples of vector quantities from physics include position, velocity, momentum, and force. They all have two common attributes, (i) a magnitude, and (ii) a direction associated with them.
Both scalars and vectors are usually represented with real numbers, and often with complex numbers. So the notions of addition, subtraction, multiplication and division, through which we manipulate numbers, are suitably extended to scalars and vectors.
Scalars: are expressed in terms of a single number, so their algebra is essentially the same as the algebra of the real/complex numbers they are expressed in.
Vectors: are expressed in terms of multiple numbers. So it is not straightforward how to manipulate vectors based on scalars, or to formally extend their algebra from the component numbers. We see below how this is done.
Vector Space, Hilbert Space: A vector space is a mathematical structure that constitutes the necessary components to manipulate vectors in a sensible way. Before we define it, we need to understand a few definitions, namely Set, Map, Binary operation, Group and Field -
Learning the notion of sets and their manipulation provides not only training in the fundamentals of categorisation and organisation, it is a crucial building block of most of mathematics.
In mathematics, a set is defined as a unique collection of well defined objects[^1], and the objects in the collection are called elements of the set.
For example,

- $A=\{a, e, i, o, u\}$ is the set of vowels in the English language.
- $\mathbb{Z} = \{0, \pm1, \pm2, \pm3,\dots\}$ is the set of integers.
It is important to emphasise the importance of uniqueness in a set. It means that in a set, a member exists only once. Thus $\{a, a, b\}$ is the same set as $\{a, b\}$.
Secondly, the order of elements in a set has no meaning, so $\{a, b\} = \{b, a\}$.
A set is expressed in mainly two ways:

- Roster method: This is the method where we list all the elements of the set, in no particular order, and put them in curly braces. The set of vowels, $\{a, e, i, o, u\}$, is one such example. It is the simplest method, but becomes inconvenient for sets with a large number of elements, or an infinite number of elements.
- Set builder/logical method: In this method, instead of trying to list the elements one by one, we name a representative element and describe the specific properties that make it a member of the set. For example, $B=\{a : a~\text{is a vowel}\}$ is an example of the set builder method.

The above is read as "B equals the set of all '$a$' such that '$a$' is a vowel".
A set can be finite, or infinite, that is, the number of elements in a set can be finite, or infinite.
You can manipulate a set by adding or removing elements from it. A set with no elements is called the Null set (or empty set), denoted by $\emptyset$ or $\{\}$.
Sets can be finite or infinite. We all know what being finite is, so a finite set is a set with a finite number of elements, which one could count.
Infinite is a slightly tricky concept in mathematics. To start with, an infinite set is a set that is not finite.
The set of natural numbers $\mathbb{N}=\{1, 2, 3, \dots\}$ is an example of an infinite set. An infinite set whose elements can be matched one-to-one with the natural numbers is called countable; an infinite set for which even this is impossible is called uncountable.
We use the term cardinality to represent a measure of the number of elements in a set. The cardinality of a finite set is the number of elements in it. For infinite sets, the cardinality is symbolic and is established by comparing two sets and defining maps between them.
Imagine we have two sets, A and B, such that every element of A is also an element of B; then we say that A is a subset of B. It is denoted as $A\subseteq B$, and B is called a superset of A.
Naturally, every set is a subset and superset of itself. We call a set A a proper subset of B if every element of A is in B, but at least one element exists in B that is not in A. In this case, B is called a proper superset of A.
For finite sets, a proper subset has strictly fewer elements than the set itself. However for infinite sets, a proper subset may have the same cardinality as the set itself.
*Figure: A is a subset of B, and B is a superset of A.*
*Figure: Visualisation of the sets of numbers, natural numbers $\mathbb{N}$, integers $\mathbb{Z}$, rational numbers $\mathbb{Q}$, real numbers $\mathbb{R}$, and complex numbers $\mathbb{C}$. We have $\mathbb{N}\subset \mathbb{Z}\subset \mathbb{Q}\subset \mathbb{R}\subset \mathbb{C}$.*
```{code-cell} ipython3
:tags: [remove-input]

# Libraries
from matplotlib import pyplot as plt
from matplotlib_venn import venn2

# Basic two-set Venn diagram; subsets = sizes of (A only, B only, A and B)
v = venn2(subsets=(10, 12, 10), set_labels=("A", "B"), alpha=0.5)

# White background, with the axes drawn
plt.gca().set_facecolor('white')
plt.gca().set_axis_on()

# Show it
plt.show()
```
We defined what a set is, and introduced a notion of comparison by defining what a subset and a superset are. There is a lot more one can do with the notion of sets, to manipulate them, to the extent that it looks like everyday algebra.
- Universal set: For a given problem under consideration, a universal set $U$ is the set of all elements considered, held fixed, so that every set defined for the problem is a subset of $U$.
- Complement: The complement of a set A, denoted by $A'$, or sometimes $A^c$, is defined with respect to the universal set: it is the set of all elements of $U$ that are not in A.
```{figure} ./set-union-inters-diff.jpg
Set Union, intersection and set difference.
```
- Union: The union of two sets, say A and B, denoted $A\cup B$, is defined as the set of all elements that belong to either A, or B, or both. For example, if $A=\{1,2,3,4\}$ and $B=\{1,3,5,7\}$, then $A\cup B = \{1,2,3,4,5,7\}$.
- Intersection: The intersection of two sets, say A and B, denoted $A\cap B$, is defined as the set of all elements that belong to both A and B.
- Difference: The set difference of A from B, denoted $A-B$, is the set of all elements of A that are not elements of B.
- Cartesian Product: The Cartesian product of two sets, say A and B, denoted $A\times B$, is the set of all ordered pairs $(a, b)$ such that $a$ belongs to A and $b$ belongs to B.

For example, if $A=\{1,2\}$ and $B=\{x,y\}$, then $A\times B = \{(1,x), (1,y), (2,x), (2,y)\}$.
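To make this concrete, here is a small sketch using Python's built-in `set` type; the sets `A` and `B` are the example sets from the text, and `itertools.product` plays the role of the Cartesian product.

```{code-cell} ipython3
from itertools import product

A = {1, 2, 3, 4}
B = {1, 3, 5, 7}

print(A | B)               # union: {1, 2, 3, 4, 5, 7}
print(A & B)               # intersection: {1, 3}
print(A - B)               # difference: {2, 4}
print(set(product(A, B)))  # Cartesian product: all ordered pairs (a, b)
```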
In mathematics, a map is something that defines a symbolic relation from elements of a set X to elements of another set Y, such that each element of X gets associated to an element of Y. The set X, whose elements get mapped, is called the domain of the map, and the set Y, into which the elements get mapped, is called the range or codomain of the map.
Symbolically, a map is expressed as follows -

- We say that $f$ is a map from X to Y by $f: X\longrightarrow Y$.
- For an element $x$ in X, the element in Y it maps to is expressed as $f(x)$.
- The element $f(x)$ belongs to Y, and is called the image of $x$.
- Map, mapping, and correspondence are synonyms. Usually, if the sets are sets of numbers, maps are also called functions.
- A map between two sets lets us see a holistic relation between them.
- If distinct elements of the domain set X get mapped to distinct elements of the codomain set Y, the map is called injective, or one-to-one. For example, consider the set of integers $\mathbb{Z}$, and a map $f: \mathbb{Z} \longrightarrow \mathbb{Z}$ such that $f(x) = 3x$. This is an injective map from $\mathbb{Z}$ to itself.
- A map in which each element of the codomain set is the image of some element in the domain set is called surjective, or onto.
- A map that is both injective and surjective is called a bijective map. Bijective maps play an important role in mathematics.
- Recall the uncountable sets. Using the concept of a map we can define them more precisely as follows - a set X is uncountable if and only if there exists no injective map from the set X to the set of natural numbers $\mathbb{N}$.
- Consider the set of integers $\mathbb{Z}=\{\dots, -2, -1, 0, 1, 2, \dots\}$ and a map $f:\mathbb{Z}\longrightarrow \mathbb{Z};\quad f(n)=3n$. The range of this map is a proper subset of $\mathbb{Z}$, as the range only contains numbers that are multiples of 3. However, since the map is injective, the subset has the same cardinality as $\mathbb{Z}$.
- Maps can be combined. Instead of exploring general consequences, consider two bijective maps $f, g$ from a set X to itself. If $g$ maps an arbitrary element $a$ in X to $b=g(a)$, and $f$ maps $b$ to $c=f(b)$, then the combination, or composition, of the two maps $f\circ g$ maps $a$ to $f\circ g(a) = f(g(a))$.
```{mermaid}
:align: center

graph LR
    a(["X"])
    b(["X"])
    c(["X"])
    a --g--> b --f--> c
    a --f∘g--> c
```
Binary operations, as the name suggests, are operations that take two objects and combine them to give (usually) one unique object.
In mathematics, a binary operation is defined on a set; it takes two elements of the set and returns one element of the set.
Formally, a binary operation on a set A is a mapping $o: A\times A \longrightarrow A$.
For the binary operation to be well defined, it must assign to every pair of elements $a, b$ in A a unique element of A; this element, written $a\,o\,b$, is the result of the binary operation.
Commutativity A binary operation is said to be commutative if the result of combining does not depend on which element is combined with the other, i.e., if $a\,o\,b = b\,o\,a$ for every $a, b$ in A.
Examples:

- On the set of real numbers $\mathbb{R}$, the usual addition $o(a,b) = a + b$ and the usual multiplication $o(a,b) = ab$ are the most common examples of binary operations.
- Consider a given set X, and the set S of all possible bijective maps from X to X; then the composition of maps $\circ$ is a binary operation on S.
When we have a set, it lets us categorize and organize the elements. Having binary operations defined on a set tells us how a pair of elements of the set results in another element, in effect how combining elements gives us different elements.
The binary operations defined on a set give new structure to the set. A group is one such structure.
A group is a set $A$ together with a binary operation $o$ defined on it, written $(A, o)$, satisfying the following conditions -
- Associativity A binary operation is called associative if $a\,o\,(b\,o\,c) = (a\,o\,b)\,o\,c$ for all elements $a, b, c$ in $A$.
- Existence of Identity There exists an element $e$ in $A$ such that for every element $a\in A$, $e\,o\,a = a$, i.e., combining any element with $e$ results in the same element.
- Existence of Inverse For every element $a\in A$, there exists another element, say $a'$, such that $a'\,o\,a = e$, i.e., combining the two results in the identity element.
The inverse of an element $a$ is usually denoted $a^{-1}$. Note that above we wrote the identity and inverse conditions one-sidedly, as $e\,o\,a = a$ and $a'\,o\,a = e$; one can also consider $a\,o\,e$ and $a\,o\,a'$.
The two expressions are in general different, and could potentially mean the existence of two types of identity elements, say a left identity
and a right identity.
However one can prove, based purely on logic and the knowledge that the operation is associative, that the left and the right identities are the same.
The same question can be posed for the existence of the inverse. The left and the right inverses of an element can be proven to be the same.
The inverse of the inverse of an element is the element itself, $(a^{-1})^{-1} = a$.
Just like sets have subsets, groups have subgroups. A subgroup of a group $(G, o)$ is a subset $S$ of $G$ which forms a group under the same binary operation. In particular -

- The binary operation $o$ defined on the set $G$ is also a binary operation on the subset $S$, i.e., $S$ is closed under $o$.
- The identity of $o$, say $e$, belongs to $S$, and for every $a$ in $S$, $a^{-1}$ also belongs to $S$.
:class: information
The arithmetic operation `+` is a binary operation, as adding any two integers results in another, unique integer. Since the order of adding two integers does not matter, the operation is obviously commutative.
Next, we know that addition of three numbers is associative (otherwise everything from grocery shopping to stock markets would have been a mess! 😀).
Zero is clearly the identity element in the set of integers.
For every number, its negative is the additive inverse.
- What about the sets of rational numbers, real numbers and complex numbers? Do any of these form a group with arithmetic addition `+`, or multiplication `*`? The following table gives a summary of answers; try to figure out why.

| Set A | $(A, +)$ | $(A, \cdot)$ | $(A^*, \cdot)$ |
|---|---|---|---|
| $\mathbb{Z}$ | Group | No | No |
| $\mathbb{Q}$ | Group | No | Yes |
| $\mathbb{R}$ | Group | No | Yes |
| $\mathbb{C}$ | Group | No | Yes |

Here $A^* = A - \{0\}$ denotes the set with the additive identity 0 removed.
The above defines what a group is, and based on the above example it may look like a group is just a formal abstract version of what we already know about numbers from our school algebra/arithmetic knowledge. However this abstraction leads to a generalization of the arithmetic operations to sets of other objects. See the following examples -
:class: information
A permutation is an arrangement of the elements of a set. As we know, a set does not have a notion of order of elements. But if the order of the elements mattered, each permutation, seen as an ordered sequence of elements, would look different, as in the figure below.
```{figure} https://upload.wikimedia.org/wikipedia/commons/4/4c/Permutations_RGB.svg
:align: center
:name: rgb
Each row represents an arrangement of the three balls of colour red, green and blue. There are 6 such arrangements.
```
To abstractify, instead of balls of three colours, consider a set of 3 objects, and without loss of generality call them $a$, $b$ and $c$. The six arrangements, and the mappings of positions they correspond to, are -

| Sequence | (abc) | (acb) | (bac) | (bca) | (cab) | (cba) |
|---|---|---|---|---|---|---|
| Mapping | (123) | (132) | (213) | (231) | (312) | (321) |

If we choose a reference arrangement to be $(abc)$, then each arrangement corresponds to a bijective map from the set $\{a, b, c\}$ to itself.
Compositions of these maps can be computed as for maps in general, by applying one permutation after the other, and one can write out the full operation table for composition.
If we consider the set of these six permutations, with composition as the binary operation, all the group conditions are satisfied: composition is associative, the reference arrangement (the identity permutation) is the identity, and each permutation has an inverse.
This is called the permutation group. We showed the example of three objects, but the group generalises to any number of objects. For $n$ objects the group contains $n!$ permutations, and is known as the symmetric group $S_n$.
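As a quick illustration of how such a group can be explored on a computer, here is a minimal Python sketch; the encoding is an assumption for illustration, with each permutation written as a tuple giving the images of positions 1, 2, 3, matching the 'Mapping' row of the table above.

```{code-cell} ipython3
from itertools import permutations

perms = list(permutations((1, 2, 3)))   # all 6 = 3! arrangements

def compose(f, g):
    """The composition f∘g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(3))

# Closure: composing any two of the six permutations gives one of the six.
assert all(compose(f, g) in perms for f in perms for g in perms)

# Identity element: the reference arrangement (1, 2, 3).
e = (1, 2, 3)
assert all(compose(e, p) == p == compose(p, e) for p in perms)

# Inverses: every permutation composes with some permutation to give e.
assert all(any(compose(p, q) == e for q in perms) for p in perms)

print("The 6 permutations form a group under composition")
```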
:class: information
Modular arithmetic is a system of arithmetic where numbers wrap around after reaching a certain value. A common example is the arithmetic of the `12-hour clock`. In general, consider a positive integer $m > 1$. For any arbitrary integer $a\in \mathbb{Z}$, we can write $a = qm + r$ where $q, r$ are some integers. There is a unique such pair $q, r$ with $0\le r\lt m$, in which case $q$ and $r$ are the quotient and remainder when $a$ is divided by $m$. In this unique representation $r$ can take only $m$ possible values, in the set $Z_m = \{0, 1, 2, \dots, m-1\}$.
On this set, we define modular addition $\oplus$, to distinguish it from the usual addition `+`, as follows:
For any two arbitrary $a, b$ in $Z_m$, $a \oplus b = c$ where $a + b = qm + c$, i.e., $c$ is obtained by computing the remainder when $a+b$ is divided by $m$.
Then $(Z_m, \oplus)$ forms a group, because of the following -

- $\oplus$ is a binary operation; moreover, it is also commutative, and it inherits associativity from ordinary addition.
- 0 is the identity, as $a\oplus 0 = a$ for any $a$.
- For any $a$ in $Z_m$, $0\le a\lt m$, there exists $m-a$, such that $a\oplus (m-a) = 0$, so $m-a$ is the inverse of $a$.

Illustration: Consider the 12-hour clock, $m=12$ and $Z_{12}=\{0, 1, \dots, 11\}$. Then, for example, $7 \oplus 8 = 3$, since $7 + 8 = 15 = 1\cdot 12 + 3$: eight hours after 7 o'clock the clock shows 3.
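Since the group conditions above are all finite checks, they can be verified mechanically. A minimal sketch in Python, assuming the clock modulus $m = 12$:

```{code-cell} ipython3
m = 12                      # assumed modulus: the 12-hour clock
Zm = range(m)

def oplus(a, b):
    return (a + b) % m      # remainder of a + b on division by m

assert all(oplus(a, b) in Zm for a in Zm for b in Zm)             # binary operation
assert all(oplus(a, b) == oplus(b, a) for a in Zm for b in Zm)    # commutative
assert all(oplus(a, oplus(b, c)) == oplus(oplus(a, b), c)
           for a in Zm for b in Zm for c in Zm)                   # associative
assert all(oplus(a, 0) == a for a in Zm)                          # 0 is the identity
assert all(oplus(a, (m - a) % m) == 0 for a in Zm)                # inverses exist

print("(Z_12, ⊕) satisfies all the group conditions")
```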
In mathematics, a field is defined as a set $F$ together with two binary operations, called addition `+` and multiplication `.`, such that the following conditions are satisfied -

- The binary operations `+` and `.` are commutative, i.e., $a + b = b + a$, and $a\cdot b = b\cdot a$ for every $a, b\in F$.
- $(F, +)$ is a group. Let's call `0` its identity for `+`.
- $(F^*, \cdot)$ is also a group, where $F^* = F - \{0\}$ is the set with the identity of `+` removed from it. Let's call the identity of this group `1`.
- The operation `.` distributes over `+`, i.e., $a\cdot (b + c) = (a\cdot b) + (a\cdot c)$ for every $a, b, c \in F$.
Examples:

- The sets of rational ($\mathbb{Q}$), real ($\mathbb{R}$), and complex numbers ($\mathbb{C}$) all form respective fields with the usual arithmetic addition `+` and multiplication `.`: $(\mathbb{Q}, +, \cdot)$, $(\mathbb{R}, +, \cdot)$, $(\mathbb{C}, +, \cdot)$ are all fields, with 0 as additive identity, and 1 as multiplicative identity.
- There are examples of fields on finite sets, but their discussion would take us in a (more than usually) different direction than we intend for these lecture notes. For most of the discussion, the fields of numbers relevant for us will be $(\mathbb{R}, +, \cdot)$ and $(\mathbb{C}, +, \cdot)$.
- Recall modular arithmetic, where $(Z_m, \oplus)$ formed a group for any arbitrary positive integer $m$. Now consider a modular version of the usual arithmetic multiplication, $\odot$, defined as follows. If $a, b$ are two arbitrary integers in $Z_m$, then $a\odot b = c$ where $c$ is the remainder you get when you divide $ab$ by $m$. Does $(Z_m, \oplus, \odot)$ form a field? Let's check -
  - Clearly, $\odot: Z_m\times Z_m \longrightarrow Z_m$ is a binary operation.
  - 1 is the multiplicative identity, as $1 \odot a = a$ for every $a$ in $Z_m$.
  - However, unless $m$ is prime, we have issues with the existence of inverses.
  - Does every element have an inverse? Imagine $a, b$ in $Z_m$ are inverses of each other; then $a\odot b = 1$, which means $ab$ is of the form $ab = qm + 1$, i.e., $b= {qm + 1\over a}$ for some integer $q$. Such a $b$ exists in $Z_m$ for every non-zero $a$ exactly when $m$ is prime; for composite $m$, the elements sharing a factor with $m$ have no inverse. Thus $(Z_m, \oplus, \odot)$ is a field if and only if $m$ is prime.
:class: tip
We need the structure of a field, i.e., a set with at least two binary operations, to construct the expressions that we call polynomials.
All the above mathematical definitions, going through which can perhaps be excruciating, let us define what we need for the underlying math of quantum computing: a vector space, and a Hilbert space. We will see that these two are nearly the same structure, with one difference.
A vector space is a mathematical structure that consists of two sets, which have the following substructures:

- The first set, say V, whose elements are called vectors, is a commutative group $(\mathbf{V}, +_v)$ with addition $+_v$. Let's call its identity $0_v$, the zero vector.
- The second set, say F, whose elements are called scalars, is a field $(F, +_f, \cdot_f)$ with addition $+_f$ and multiplication $\cdot_f$. Let's call $0$ and $1$ the additive and multiplicative identities of the field.
- A binary operation $\cdot : F\times V \longrightarrow V$ called scalar multiplication. This operation combines a scalar and a vector, and gives us a vector. It lets us manipulate vectors through scalars, and satisfies the following -
- Associativity:
  For any arbitrary $a, b$ in F, and any arbitrary $\mathbf{v}$ in V,

  $$
  a\cdot(b\cdot \mathbf{v}) = (a\cdot_f b)\cdot \mathbf{v}
  $$

  This means multiplying a vector successively by two scalars gives the same vector as multiplying the field product of those two scalars with the vector.

- Distribution:
  The scalar multiplication distributes over addition of vectors, and over addition of scalars. That is, for any arbitrary scalars $a, b$ in F and any arbitrary vectors $\mathbf{u, v}$ in V, the following holds

  $$
  a\cdot(\mathbf{u} +_v \mathbf{v}) = a\cdot\mathbf{u} +_v a\cdot\mathbf{v}, \qquad (a +_f b)\cdot\mathbf{v} = a\cdot\mathbf{v} +_v b\cdot\mathbf{v}
  $$

- Multiplying any scalar with the zero vector gives the zero vector, $a\cdot 0_v = 0_v$, and multiplying the zero scalar 0 with any vector gives the zero vector, $0\cdot v = 0_v$.

Because of the associativity and distribution of scalars with vectors, we often use the same symbol `+` for both the vector addition $+_v$ and the scalar addition $+_f$, and plain juxtaposition for both multiplications; the context makes clear which operation is meant.
Let us take a moment to appreciate what makes the structure of a vector space rich. We have a set V of vector elements, and a set F of scalar elements. If one of the sets is infinite, the vector space automatically becomes infinite. (Guess why?)
Examples:
Consider the set of points of 3D space; they are represented by the Cartesian product of three copies of the set of real numbers, $\mathbb{R}^3 = \mathbb{R}\times\mathbb{R}\times\mathbb{R}$, i.e., ordered triples $(x, y, z)$.
If we define vector addition of two arbitrary points component-wise, $(u_1, u_2, u_3) + (v_1, v_2, v_3) = (u_1+v_1, u_2+v_2, u_3+v_3)$, and scalar multiplication as $a\cdot(v_1, v_2, v_3) = (av_1, av_2, av_3)$,
then $(\mathbb{R}^3, +)$ together with the field $(\mathbb{R}, +, \cdot)$ forms a vector space, with $(0,0,0)$ as the zero vector.
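A small sketch of this example, with NumPy arrays standing in for points of $\mathbb{R}^3$ (the particular numbers are arbitrary illustrative choices):

```{code-cell} ipython3
import numpy as np

u = np.array([1.0, 2.0, 3.0])    # a point of R^3
v = np.array([4.0, 5.0, 6.0])    # another point
a = 2.5                          # a scalar from the field R

print(u + v)                      # component-wise vector addition
print(a * v)                      # scalar multiplication
print(np.allclose(u + v, v + u))  # the addition is commutative
print(0 * v)                      # the zero scalar gives the zero vector
```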
Consider a toy example of 'constructing a vector space'. Let there be a set of just two non-zero vectors $\{\mathbf{u}, \mathbf{v}\}$. For vector addition to be a binary operation on this set, we at least need the zero vector, giving $\{0_v, \mathbf{u}, \mathbf{v}\}$.
But what about $\mathbf{u}+\mathbf{u}$? It cannot equal $\mathbf{u}$ itself, as that would force $\mathbf{u} = 0_v$.
Thus the only option within the set is $\mathbf{u}+\mathbf{u} = \mathbf{v}$, and similarly $\mathbf{v}+\mathbf{v} = \mathbf{u}$, which then forces $\mathbf{u}+\mathbf{v} = 0_v$.
This gives the structure of this three-element group, where combining any non-zero element with itself gives the other element, and the two non-zero elements are inverses of each other.
The other option, which gives us more to play with in the context of vector spaces, is to assume that $\mathbf{u}+\mathbf{u}$ is a new element outside the original set, which we can suggestively call $2\mathbf{u}$.
With the introduction of more elements, we have to again make sure that the binary operation is well defined, i.e., addition of any two elements results in an element of the set. Consider now the combinations involving the new element, such as $2\mathbf{u}+\mathbf{u}$, $2\mathbf{u}+\mathbf{v}$, and so on.
So, every new sum forces us to add yet more elements.

- This process of extending is actually endless.
- If we keep doing this over and over again, the new elements added are expressed as some 'combination' of the originally assumed two non-zero vectors.
- We have deceptively used the notion of actual numbers to express the new vectors.

The fully extended set looks like $\{0_v, \mathbf{u}, \mathbf{v}, 2\mathbf{u}, \mathbf{u}+\mathbf{v}, 2\mathbf{v}, \dots\}$.
In a more compact form (albeit terse looking) the set looks like $\{a\mathbf{u} + b\mathbf{v} : a, b~\text{scalars}\}$.
So we essentially had to grow the initial set to contain infinitely many elements to make sure the binary operation stays within the set, which makes the resulting structure a vector space.
Just like groups have subgroups, vector spaces have vector subspaces. Given a vector space $(V, +)$ with a field $F$, a subset $S$ of $V$ is a vector subspace if $S$ is itself a vector space with the same vector addition, field, and scalar multiplication; in particular, $S$ must be closed under addition and scalar multiplication.
We saw from the above example how a finite set of objects can be extended to an infinite set with a vector space structure by using combinations of those finite elements.
These combinations are called linear combinations of vectors, as they are expressed as a sum of terms, each of those terms being a vector with a scalar multiplied to it. More formally, consider an arbitrarily chosen set of $n$ vectors $\mathbf{v_1}, \mathbf{v_2}, \dots, \mathbf{v_n}$
and the scalars $a_1, a_2, \dots, a_n$; then the expression $a_1\mathbf{v_1} + a_2\mathbf{v_2} + \dots + a_n\mathbf{v_n}$ is called a linear combination of these vectors.
Let's try to make some interpretations of this linear combination, and here we recall that in physics we usually call single-valued objects scalars, and multi-valued objects vectors. Thus one number is enough to represent a scalar, while a vector, having multiple attributes, requires multiple numbers.
Given a vector $\mathbf{v}$ and scalars $a, b$ -

- Multiplying a vector with a scalar changes its magnitude.
- $\mathbf{v}$, $a\mathbf{v}$, $b\mathbf{v}$, $(a+b)\mathbf{v}$ all have the same direction.
Given a vector space $(V, +)$ with a field $F$, and a single non-zero vector $\mathbf{v}$ in it, consider the set of all scalar multiples of $\mathbf{v}$, written $F\mathbf{v} = \{a\mathbf{v} : a\in F\}$.
Clearly, $F\mathbf{v}$ is closed under addition and scalar multiplication, so it is a vector subspace of V.
:class: information
Consider a set A with $n$ number of non-zero vectors $A=\{\mathbf{v_1, v_2,\dots,v_n}\}$, all elements of a given vector space, then a set containing all the vectors that can be generated by forming linear combination of vectors in A is called the **linear span** of A. Formally, linear span of the set A is
$$
F_A=F_{\{\mathbf{v_1, v_2, \dots,v_n}\}} = \{a_1\mathbf{v_1}+a_2\mathbf{v_2}+\dots +a_n\mathbf{v_n}: \text{For every}~~a_1, a_2, \dots, a_n \in F\}
$$
Consider the following equation:

$$
a\mathbf{u} + b\mathbf{v} = 0_v
$$

The left part of the expression is a vector that is a linear combination of $\mathbf{u}$ and $\mathbf{v}$, and the right side is the zero vector. Suppose it holds with at least one of the coefficients non-zero. Then -
- $a\mathbf{u}$ and $b\mathbf{v}$ are inverses of each other.
- If $a \ne 0$, then $\mathbf{u} = (-a^{-1}b)\mathbf{v}$, which means $\mathbf{u}$ shares the same direction as $\mathbf{v}$.
- If $b \ne 0$, then $\mathbf{v} = (-b^{-1}a)\mathbf{u}$, which also means $\mathbf{u}$ shares the same direction as $\mathbf{v}$.
- Their linear spans are identical, $F\mathbf{u} = F\mathbf{v}$ (try to see why).
:class: tip
Thus if a linear combination of two vectors can be set to zero with at least one coefficient being non-zero, then the two vectors share the same direction.
What happens if this cannot happen? That is, for the above two vectors, the equation $a\mathbf{u} + b\mathbf{v} = 0_v$ holds only when both $a = 0$ and $b = 0$.
Then the vectors don't share directions, and in a sense are independent of each other. A logically equivalent, but more intuitive, way of saying the above is: if two vectors don't share a direction, i.e., can't be written as scalar multiples of each other, then no linear combination of them with non-zero coefficients can vanish.
:class: information
Imagine we have two vectors $\mathbf{v_1}$, $\mathbf{v_2}$ that don't share direction. We can construct the span of these two vectors
$$
F\mathbf{v_1, v_2} = \{a_1\mathbf{v_1}+a_2\mathbf{v_2}: \text{For every}~~a_1, a_2 \in F\}
$$
It's also easy to see that this is a subspace, and the linear spans $F\mathbf{v_1}$ and $F\mathbf{v_2}$ are contained in it.
Let's extend this line of thinking, and consider three non-zero vectors $\mathbf{v_1}$, $\mathbf{v_2}$, $\mathbf{v_3}$, and ask when a linear combination of them can be the zero vector, $a_1\mathbf{v_1}+a_2\mathbf{v_2}+a_3\mathbf{v_3} = 0_v$ -
-
If all the coefficients
$a_1, a_2, a_3$ are non-zero, any one vector can be written as linear combination of the other two, for example,$\mathbf{v_1} = (-a_1^{-1}a_2)\mathbf{v_2} + (-a_1^{-1}a_3)\mathbf{v_3}$ , assuming$a_1\ne 0$ . -
If one of the coefficients is zero, then the expression reduces to the case of two vectors, and if the other two coefficients are non-zero, those two vectors share a direction.
-
If two of the coefficients are zero, say
$a_1 = a_2 = 0$ , then the third has to be zero automatically, as$a_1\mathbf{v_1}+a_2\mathbf{v_2}+a_3\mathbf{v_3} = a_3\mathbf{v_3} = 0$ , but$\mathbf{v_3}$ is non-zero, so$a_3$ has to be zero. -
If the equation can hold only with all the coefficients zero, then no two of the above three vectors share a direction. One can say that the vectors have directions that are independent of each other.
:class: information
Now, if we have three vectors $\mathbf{v_1}$, $\mathbf{v_2}$ and $\mathbf{v_3}$ with independent directions. We can construct the span of these three vectors
$$
F\mathbf{v_1, v_2, v_3} = \{a_1\mathbf{v_1}+a_2\mathbf{v_2}+a_3\mathbf{v_3}: \text{For every}~~a_1, a_2, a_3 \in F\}
$$
Just like with two vectors, this is a subspace, and the linear spans $F\mathbf{v_1}$, $F\mathbf{v_2}$, $F\mathbf{v_3}$, $F\mathbf{v_1,v_2}$, $F\mathbf{v_2,v_3}$ and $F\mathbf{v_1,v_3}$ are contained in it.
This situation, generalised to an arbitrary number of vectors, is defined as linear independence. Formally, a set of vectors $\{\mathbf{v_1}, \mathbf{v_2}, \dots, \mathbf{v_n}\}$ is called linearly independent if the equation $a_1\mathbf{v_1}+a_2\mathbf{v_2}+\dots+a_n\mathbf{v_n} = 0_v$ holds only when all the coefficients are zero, $a_1 = a_2 = \dots = a_n = 0$.
They are called linearly dependent if there exists a set of coefficients, at least one of them non-zero, for which the linear combination vanishes.
This concept of linear independence enhances the ability to express vectors, and to categorise or manipulate a vector space by defining various types of subspaces.
We saw what linearly independent vectors are, and what the linear span of a collection of vectors is. When we start collecting non-zero vectors into a set A, we can consider the linear span $F_A$ of the set.
Now let's notice that as the number of elements in the set A increases, the linear span also starts to get bigger, in fact, a lot bigger. See the table below, where we illustrate the linear span of a set A for different numbers of elements in it, assuming in each case that the vectors in the set A are linearly independent.
| No. of elements | Set A | Linear span $F_A$ |
|---|---|---|
| 1 | $\{\mathbf{v_1}\}$ | $\{a_1\mathbf{v_1}\}$ |
| 2 | $\{\mathbf{v_1, v_2}\}$ | $\{a_1\mathbf{v_1}+a_2\mathbf{v_2}\}$ |
| 3 | $\{\mathbf{v_1, v_2, v_3}\}$ | $\{a_1\mathbf{v_1}+a_2\mathbf{v_2}+a_3\mathbf{v_3}\}$ |
| n | $\{\mathbf{v_1, v_2, \dots, v_n}\}$ | $\{a_1\mathbf{v_1}+a_2\mathbf{v_2}+\dots+a_n\mathbf{v_n}\}$ |
- For the one-element set $\{\mathbf{v_1}\}$, the linear span $F_A$ contains a different vector for each possible value of the scalar $a_1$.
- For the two-element set $\{\mathbf{v_1, v_2}\}$, the linear span $F_A$ contains a different vector for each possible value of the scalar $a_1$ as well as each possible value of $a_2$.
- For the three-element set, the linear span contains a vector for each possible value of $a_1$, $a_2$ and $a_3$ independently.
- The scalar possibilities get multiplied every time we add another vector to the set.
Clearly, at each stage the linear span $F_A$ grows, while always remaining a subspace of V. Natural questions arise -

- Will there be a stage at which the linear span becomes V, the vector space itself?
- If so, can we keep adding vectors still?
- What happens if $F_A$ never becomes V?
First, when this happens, i.e., the linear span $F_A$ becomes the vector space V itself, the set A is called a basis of the vector space.
Thus a set A of vectors is called a basis if all its vectors are linearly independent, and every vector in the vector space can be written as a linear combination of the vectors of A.
Following theorem sheds light on question 1 and 3.
Every vector space has a basis.
Now question 2. If the set A has become a basis, then every vector can be written as a linear combination of its elements. So there are no more vectors left to add in a way that keeps the set linearly independent.
Without going into details of proving it, though you are encouraged to try, following statements hold for any basis and any vector space.
- A vector space may have multiple bases; however, the number of elements in each basis stays the same.
- A basis may be finite or infinite. The number of elements in the basis is called the dimension of the vector space. If the set A is infinite, then the dimension of the vector space is also infinite.
- A basis is a minimal set. As we saw above, you can't add elements to it; however, if you remove any element from it, it will no longer be a basis.
For a given basis A of a vector space V, since every vector $\mathbf{v}$ in V can be written as a linear combination of the basis vectors, specifying that linear combination specifies the vector.
The linear combination is defined by the scalars $a_1, a_2, \dots, a_n$ appearing in it; these are called the coordinates (or components) of the vector in the basis A.
It is often easier to imagine and interpret even abstract vectors in terms of these scalar coordinates, and this is facilitated by the basis.
Now that we know about vector spaces, let's discuss how their elements are manipulated.
Linear operators are essentially maps from a vector space to another that respect linearity.
For our purposes, we will restrict ourselves to maps from a vector space to itself. In that case, a linear operator, or a linear map, is a map $L: V\longrightarrow V$ such that $L(a\mathbf{u} + b\mathbf{v}) = aL(\mathbf{u}) + bL(\mathbf{v})$ for all vectors $\mathbf{u}, \mathbf{v}$ and scalars $a, b$.
Let's see some simple implications of this. If we choose $a = b = 0$, we find $L(0_v) = 0_v$: a linear map maps the zero vector to the zero vector.
Next, if we choose $b = 0$, we get $L(a\mathbf{u}) = aL(\mathbf{u})$: the image of a scaled vector is the correspondingly scaled image.
The above two deductions might look like just clever word-play, but their impact is profound: the linear relations of the vectors stay preserved under linear maps.
Here is where the use of a basis becomes very handy. For a linear map L, suppose we have a basis $B = \{\mathbf{v_1}, \mathbf{v_2}, \dots, \mathbf{v_n}\}$ of the vector space.
Since B is a basis, any arbitrary vector $\mathbf{u}$ can be written as a linear combination $\mathbf{u} = \sum_j a_j \mathbf{v_j}$.
Then, because of linearity, the image of $\mathbf{u}$ is $L(\mathbf{u}) = \sum_j a_j L(\mathbf{v_j})$, determined entirely by the images $L(\mathbf{v_j})$ of the basis vectors.
This means a linear map is completely specified by how it maps the elements of a basis. For a general set, on the other hand, one has to specify the mapping of every element of the set.
Now let's consider one of the mapped vectors, $L(\mathbf{v_j})$. It is itself a vector in V, so it can be expanded in the basis as $L(\mathbf{v_j}) = \sum_i L_{ij}\,\mathbf{v_i}$.
Then the map L is completely specified in terms of the values $L_{ij}$, which can be arranged as a square array

$$
{\bf L} = \begin{pmatrix} L_{11} & L_{12} & \dots & L_{1n}\\ L_{21} & L_{22} & \dots & L_{2n}\\ \vdots & & & \vdots\\ L_{n1} & L_{n2} & \dots & L_{nn}\end{pmatrix}
$$

This is the matrix representation of the linear map L, and we express an arbitrary vector $\mathbf{u}$ as the column ${\bf X} = (a_1, a_2, \dots, a_n)^T$ of its coordinates.
This is how the matrix representation of linear operators naturally emerges. To see it more clearly, let's compute the image

$$
L(\mathbf{u}) = \sum_j a_j L(\mathbf{v_j}) = \sum_j a_j \sum_i L_{ij}\mathbf{v_i} = \sum_i \Big(\sum_j L_{ij} a_j\Big)\mathbf{v_i}
$$

Thus the image has coordinates $b_i = \sum_j L_{ij}a_j$, which is exactly the matrix-vector product ${\bf b} = {\bf L X}$.
- So once we choose a basis, we can express abstract vectors as regular arrays of numbers, and linear operators as matrices.
- This helps us encode abstract problems on vector spaces as matrix-vector manipulations, which are naturally done efficiently on computers (see the sketch below).
- This is why we map states and operators to regular vectors and matrices in quantum physics and quantum computing.
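Here is a minimal sketch of that idea in Python: a hypothetical map `L` on $\mathbb{R}^2$ is specified only through assumed images of the two basis vectors, and its matrix is assembled from them column by column.

```{code-cell} ipython3
import numpy as np

# Assumed images of the basis vectors under a hypothetical linear map L on R^2
L_e1 = np.array([2.0, 1.0])     # L(e1)
L_e2 = np.array([0.0, 3.0])     # L(e2)

# Column j of the matrix holds the coordinates of L(e_j)
L = np.column_stack([L_e1, L_e2])

u = np.array([4.0, 5.0])        # coordinates a_1 = 4, a_2 = 5
print(L @ u)                    # image via the matrix-vector product
print(4 * L_e1 + 5 * L_e2)      # the same result, by linearity
```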
We now know what a vector space is and how its elements are manipulated by linear operators. We now add one more structure to it, the inner product.
For vector spaces defined over the field of real ($\mathbb{R}$) or complex ($\mathbb{C}$) numbers, one can define an inner product as follows.
An inner product is a map $\langle\cdot, \cdot\rangle : V\times V\longrightarrow F$, combining a pair of vectors into a scalar, that satisfies the following conditions.
For any arbitrary vectors $v, v_1, v_2 \in V$ and scalars $a_1, a_2 \in F$ -
-
Conjugate symmetry: $\langle v_1, v_2\rangle = \langle v_2, v_1\rangle^*$, where $^*$ denotes the complex conjugate of the complex number. If $F=\mathbb{R}$, then this becomes plain symmetry.
Linearity: Given that it is a function of two vectors and is conjugate symmetric (point 1 above), it can be linear in only one slot. Formally, it is linear in the second argument and conjugate linear in the first, i.e., $\langle v, a_1v_1 + a_2v_2\rangle = a_1\langle v, v_1\rangle + a_2\langle v, v_2\rangle$, while $\langle a_1v_1 + a_2v_2, v\rangle = a_1^*\langle v_1, v\rangle + a_2^*\langle v_2, v\rangle$.
-
Positive definiteness: Note that conjugate symmetry dictates $\langle v, v\rangle = \langle v, v \rangle^*$, so the inner product of a vector with itself is a real number. We further demand that for every $v\in V$, the inner product of a vector with itself is non-negative, i.e., $\langle v, v\rangle \ge 0$, and moreover that $\langle v, v\rangle = 0$ if and only if $v=0$, i.e., a non-zero vector's inner product with itself is always a positive non-zero number.
The inner product can be defined in two ways in terms of linearity: (i) as above, where in condition 2 we imposed linearity in the second argument and conjugate linearity in the first, and (ii) where we impose linearity in the first argument and conjugate linearity in the second. The first convention is the common one in physics, and we follow it here.
:class: tip
A vector space that has an inner product defined on it, is called an **Inner product space**.
As we saw in the definition of the inner product, $\langle v, v\rangle$ is a non-negative real number, so its square root is well defined.
The norm of a vector is defined as $|v| = \sqrt{\langle v, v\rangle}$. For a scaled vector $av$ it satisfies

$$
|av| = \sqrt{\langle av, av\rangle} = \sqrt{a^*\langle v, av\rangle} = \sqrt{a^* a\langle v, v\rangle} =\sqrt{|a|^2\langle v, v\rangle} = |a||v|
$$

where $|a|$ is the absolute value of the scalar $a$. The norm generalises the notion of the size, or magnitude, of a vector.
The Cauchy–Schwarz inequality is a theorem which states that the absolute value of the inner product of two vectors in a vector space is always less than or equal to the product of their norms, $|\langle u, v\rangle| \le |u|\,|v|$.
Since the ratio of the inner product to the product of norms of the two vectors is thus bounded above by 1, we can interpret this ratio as the cosine of an angle, $\cos\theta = \frac{|\langle u, v\rangle|}{|u||v|}$, and call this angle the angle between the two vectors.
With this interpretation, the inner product tells us how 'close' two vectors are in direction.
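A short NumPy sketch of the norm, the Cauchy–Schwarz bound, and the angle interpretation; the vectors `u` and `v` are arbitrary illustrative choices.

```{code-cell} ipython3
import numpy as np

u = np.array([1.0, 0.0, 1.0])
v = np.array([1.0, 1.0, 0.0])

inner = np.dot(u, v)
norm_u, norm_v = np.linalg.norm(u), np.linalg.norm(v)

print(abs(inner) <= norm_u * norm_v)      # Cauchy–Schwarz holds
cos_theta = inner / (norm_u * norm_v)
print(np.degrees(np.arccos(cos_theta)))   # angle between u and v: 60.0
```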
With the inner product defined on the vector space, we can extend the notion of distance between two vectors. For this we need a function of two vectors to act as a distance, but not just any function would do. This function has to generalise the concept of distance in real life for it to make sense.
In mathematics, such a function is called a metric. A metric, or distance function, defined on a set $V$ is a map $d: V\times V \longrightarrow \mathbb{R}$ satisfying the following -
- Positive definiteness: For any arbitrary $v_1, v_2 \in V$, $d(v_1, v_2)\ge 0$, and $d(v_1, v_2)=0$ if and only if $v_1 = v_2$.
- Symmetry: the distance function is symmetric, i.e., $d(v_1, v_2) = d(v_2, v_1)$.
- Triangular inequality: For any three elements/vectors $v_1, v_2, v_3 \in V$, the sum of the distances between the first and second, and between the second and third, is always greater than or equal to the distance between the first and third, i.e., we have

  $$
  d(v_1, v_2) + d(v_2, v_3) \ge d(v_1, v_3)
  $$
Any set with a metric defined on it is called a metric space. Its elements can be thought of as points in a geometrical sense, as there is a well defined distance between each pair of them.
With this notion added, the vectors can be seen as 'points' in the vector space. An inner product space has a naturally defined notion of distance through the inner product itself, namely $d(v_1, v_2) = |v_1 - v_2| = \sqrt{\langle v_1 - v_2, v_1 - v_2\rangle}$.
It is easy to check that the above defined distance satisfies the three condition for being a metric.
With this distance defined on it, we have a Hilbert space. Formally,
:class: tip
Hilbert Space is a vector space equipped with an inner product that induces a metric/distance function for which the space is a complete metric space.
Now what is a complete metric space, or equivalently, completeness? We will not go into the details of this, but rather illustrate the concept. We know that vector addition is a binary operation, and similarly usual addition is a binary operation. And by definition, adding two numbers or vectors gives another number/vector.
Within the set of rational numbers $\mathbb{Q}$, the sum of any finite collection of rational numbers
is also a rational number and belongs to the same set. Consider, however, the partial sums $s_N = \sum_{n=0}^{N} \frac{1}{n!} = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \dots + \frac{1}{N!}$.
For every finite $N$ the sum is a rational number. However in the limit $N\to\infty$, the sum converges to the number $e$, which is irrational.
This illustrates that an infinite summation of rational numbers may actually be not rational, but an irrational number. This is what completeness is about: in a complete metric space, the limit of a convergent infinite sequence still lies in the set. Thus $\mathbb{Q}$ is not complete, while $\mathbb{R}$ is.
An infinite series of sums of vectors results in a well defined net displacement vector.
In case of vectors in a vector space, this translates to the property that a convergent linear combination of infinitely many vectors is still a vector in the vector space. Having this property ensures we do not end up in an unphysical state outside the vector space.
Normality, or a vector being normal, just means the vector is of unit size. A vector $v$ is called normal, or normalized, if its norm is one, $|v| = 1$.
Any non-zero vector $v$ can always be normalized as $v\to\frac{1}{|v|}v$
Orthogonality in a vector space is the generalisation of two vectors being perpendicular in coordinate geometry. In this sense, the inner product naturally extends the notion of the ordinary dot product.
So, two vectors, say $v_1$ and $v_2$, are called orthogonal if their inner product is zero, $\langle v_1, v_2\rangle = 0$, corresponding to the angle between them being $90°$.
If two vectors are linearly independent, then the angle between them cannot be zero. Such vectors can be transformed into orthogonal vectors. This process is called Gram-Schmidt Orthogonalization, see the wiki link.
Let's see this with two vectors. Assume we have two linearly independent vectors $v_1$ and $v_2$.
We can choose the first vector $v_1$ as the first member of the orthogonal pair;
then for the second member we take

$$
v_2' = v_2 - \frac{\langle v_1, v_2\rangle}{\langle v_1, v_1 \rangle}v_1
$$

Here we subtracted from $v_2$ its component along $v_1$, so that $\langle v_1, v_2'\rangle = 0$.
$\frac{\langle v_1, v_2\rangle}{\langle v_1, v_1 \rangle}v_1$ is interpreted as the projection of $v_2$ along $v_1$.
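The same subtract-the-projection step, iterated over a list of vectors, gives a minimal Gram-Schmidt sketch in Python; it uses `np.vdot`, which conjugates its first argument, matching our convention for the inner product.

```{code-cell} ipython3
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same space (a minimal sketch)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = v - np.vdot(b, v) * b            # subtract projection on b
        basis.append(v / np.linalg.norm(v))      # normalise the remainder
    return basis

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
b1, b2 = gram_schmidt([v1, v2])

print(np.vdot(b1, b2))                           # ~0: orthogonal
print(np.linalg.norm(b1), np.linalg.norm(b2))    # 1.0 1.0: normalised
```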
In an inner product space, the inner product of two vectors takes a particularly simple form once the vectors are expressed in a basis adapted to the inner product, as we now show.
A basis of an inner product space, in which every vector is normal, and all the vectors are orthogonal to one another, is called an orthonormal basis. It is the basis most convenient to work with in a vector space.
Consider an orthonormal basis $\{v_1, v_2, \dots, v_n\}$, so that $\langle v_i, v_j\rangle = \delta_{ij}$ (1 if $i=j$, 0 otherwise), and two arbitrary vectors expanded in it,
$$
u=\sum_i a_i v_i = a_1v_1 + a_2v_2 +\dots + a_nv_n, \quad\text{and}\quad v=\sum_i b_i v_i = b_1v_1 + b_2v_2 +\dots + b_nv_n
$$
where $a_i$ and $b_i$ are the respective coordinates. Then
$$ \langle u, v\rangle = \left\langle\sum_i a_i v_i, \sum_j b_j v_j \right\rangle = \sum_{ij} a_i^* b_j \langle v_i, v_j\rangle = \sum_i a_i^* b_i = {\bf a^{\dagger}b} $$
where we utilised the linearity of the inner product and the fact that the basis vectors are orthonormal. This shows that in the orthonormal basis, the inner product of the two vectors becomes the usual complex dot product of the corresponding coordinate vectors.
Another advantage one can quickly see: if the basis vectors were not orthonormal, the double summation would not simplify to a single summation, and there would be $n^2$ inner products $\langle v_i, v_j\rangle$ to keep track of instead of $n$ terms.
So from now on, when we refer to a basis, it will be orthonormal basis, unless stated otherwise.
A basis works as a reference in a vector space. When we write a vector as a linear combination of basis elements, the coefficients, or coordinates, basically tell us how much each of the basis vectors needs to be scaled so that the resulting sum is the desired vector.
Depending on what system the vector space describes, a specific basis can mean a specific orientation of the system, a specific frame of reference, or a specific type of measurement. One often changes the orientation, reference, or type of measurement. This means choosing a different basis.
So here we will see how the description of a vector transforms when we change the basis. Consider a basis $\{v_1, \dots, v_n\}$ and a second basis $\{v_1', \dots, v_n'\}$. Each new basis vector, being a vector, can be expanded in the old basis, $v_j' = \sum_i S_{ij}\, v_i$,
where the scalars $S_{ij}$ form the change-of-basis matrix ${\bf S}$. If a vector $u$ has coordinates $a_i$ in the old basis and $a_j'$ in the new one, then $u = \sum_j a_j' v_j' = \sum_{ij} a_j' S_{ij}\, v_i$,
where comparing with $u = \sum_i a_i v_i$ gives $a_i = \sum_j S_{ij} a_j'$, i.e., ${\bf a} = {\bf S}\,{\bf a'}$, or equivalently ${\bf a'} = {\bf S}^{-1}{\bf a}$.
Consider the set of all linear operators on a vector space V; call it $L_v$. On this set we can define the following operations -

- The sum of two linear operators is defined as follows. If $L_1, L_2 \in L_v$ are linear operators, then their sum $L=L_1 + L_2$ is a linear operator such that $L(v) = L_1(v) + L_2(v)$ for every $v\in V$.
- Scalar multiplication of a linear operator is defined through $L=a_1L_1\implies L(v) = a_1 L_1(v)$.
- The zero operator, or map, is $O:V \longrightarrow V$, mapping every vector $v$ to the zero vector, $O(v)=0_v$.
With the above, it is easy to see that the set $L_v$ of linear operators itself forms a vector space, with the zero operator as its zero vector.

- The Identity operator, $I:V\longrightarrow V$, maps every vector to itself, $I(v) = v$.
Linear operators can be combined, or 'multiplied', in the following way. If $L_1, L_2$ are two linear operators, their product is defined through composition, $(L_1L_2)(v) = L_1(L_2(v))$; the result is again a linear operator.
- Invertible operators: Operators are maps, and if they are bijective maps, they can be inverted. The inverse of an operator $L$ is an operator denoted $L^{-1}$ such that $L^{-1}L = I$, i.e., $L^{-1}(L(v))=I(v)=v$ for every $v\in V$.
The adjoint operator is an operator related to a given operator. For a given operator on an inner product space, its adjoint is best defined with the help of the inner product, as follows. If $L$ is an operator, its adjoint $L^{\dagger}$ is the operator satisfying $\langle u, L(v)\rangle = \langle L^{\dagger}(u), v\rangle$ for every $u, v\in V$.
So if we swap the two vectors and conjugate the inner product, then the adjoint operator gives the same scalar as the given operator: $\langle v, L^{\dagger}(u)\rangle = \langle u, L(v)\rangle^*$.
Hermitian operators are of vital importance in linear algebra, vector spaces, and in quantum science. These are operators that are adjoints of themselves. That is, an operator $A$ is Hermitian if $A^{\dagger} = A$.
The matrix representation of a Hermitian operator is conjugate symmetric, i.e., ${\bf A}_{ij} = {\bf A}^*_{ji}$.
Unitary operators are a special type of invertible operators, such that their adjoint is their inverse, i.e., $U^{\dagger}U = UU^{\dagger} = I$.
They are also of vital importance in quantum science, as well as linear algebra. To see why, let us see what happens to the norm of a vector that is transformed by a unitary operator. Say $v' = U(v)$; then $\langle v', v'\rangle = \langle U(v), U(v)\rangle = \langle U^{\dagger}U(v), v\rangle = \langle v, v\rangle$.
That is, the norm of a vector does not change under the effect of a unitary operator. That is why the action of a unitary operator on vectors is interpreted as a 'rotation' of vectors, and is called a unitary rotation.
The concept of eigenvalues and eigenvectors has a rather wide range of applications in physics, as well as in technology.
In a vector space, eigenvalues and eigenvectors are attributes associated with a linear map or operator. We know that a linear operator maps a vector to another vector: say $L$ maps $v$ to the vector $u = L(v)$, which in general points in a different direction.
If it so happens that for a particular vector $v$ the image $L(v)$ is parallel to $v$ itself, i.e., it is just a scaled version of $v$, then that vector plays a special role for the operator.
:class: tip
For a given operator $L$, a vector $v$ is called an **eigenvector** if the vector gets mapped to a vector parallel to itself, i.e.,
$$
L(v) = \lambda v
$$
The corresponding number $\lambda$ with which the eigenvector get scaled is called the **eigenvalue**.
One can ask the following: how many eigenvalues and/or eigenvectors can a linear operator have?
To answer this, and to also understand the nature and purpose of eigenvalues and eigenvectors, let's work with the matrix representations of the linear operator and the vectors. Consider a basis in which the operator $L$ has matrix representation ${\bf L}$ and the eigenvector has coordinate column ${\bf X}$; the eigenvalue equation then reads
$$
{\bf LX} = \lambda{\bf X}, \quad\text{or, equivalently}\quad ({\bf L}-\lambda {\bf I})X = 0
$$
where ${\bf I}$ is the identity matrix. For a non-zero solution ${\bf X}$ to exist, the matrix $({\bf L}-\lambda {\bf I})$ must not be invertible, which means its determinant has to vanish, $\det({\bf L}-\lambda {\bf I}) = 0$.
The determinant of a square matrix is a scalar function of the matrix, expressed as $\det({\bf A})$ or $|{\bf A}|$. For an $n$-dimensional space, the condition $\det({\bf L}-\lambda {\bf I}) = 0$ is a polynomial equation of degree $n$ in $\lambda$, called the characteristic equation, so an operator can have at most $n$ distinct eigenvalues.
If we label the eigenvalues $\lambda_1, \lambda_2, \dots$, then each of them has at least one corresponding eigenvector, obtained by solving $({\bf L}-\lambda_l {\bf I}){\bf X} = 0$.
:class: tip
Eigenvectors corresponding to two distinct eigenvalues are linearly independent.
In fact, if we have a matrix with all distinct eigenvalues, the corresponding eigenvectors form a basis of the vector space, called an eigenbasis.
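A small sketch with NumPy's `np.linalg.eig`, which returns the eigenvalues and eigenvectors of a matrix; the matrix `L` here is an arbitrary illustrative choice.

```{code-cell} ipython3
import numpy as np

L = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(L)     # columns of eigvecs are the eigenvectors
print(eigvals)                          # two distinct eigenvalues: 3 and 1

for lam, x in zip(eigvals, eigvecs.T):
    print(np.allclose(L @ x, lam * x))  # L X = λ X holds for each pair
```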
The Dirac notation, also called Bra-Ket notation, is a notation used in quantum physics to express vectors and inner products. It makes the interpretation of states, observations and inner products easier.

- It uses angle brackets $\langle , \rangle$ and a vertical bar $|$.
- A Ket is a vector of the Hilbert space, denoted $|v\rangle$, and represents a state of a quantum system.
- The action of a linear operator $A$ on a ket is expressed as $A|v\rangle$.
- A Bra is of the form $\langle f|$. It is a map $f: V\longrightarrow \mathbb{C}$, i.e., it maps a vector to a number, usually a complex number.
- With an inner product $\langle , \rangle$ defined on the vector space, each ket has a unique corresponding bra, $\langle v, ~ \rangle \equiv \langle v|$. Basically, a bra is a map that is literally parameterised by a vector of the Hilbert space, and its action on a vector is governed by the defined inner product.
- The inner product is then equivalently, and more conveniently, expressed as the 'bra-ket' $\langle v_1, v_2\rangle \equiv \langle v_1|v_2\rangle$.
- The set of all bras is called the dual space of the vector space, and both can be used to represent the state.
- One also defines the outer product of two vectors as $L=|v_1\rangle \langle v_2|$, which acts as an operator: $L(v)\equiv L|v\rangle = |v_1\rangle\langle v_2|v\rangle = \langle v_2|v\rangle|v_1\rangle$. Here applying $L$ to $|v\rangle$ gives the state $|v_1\rangle$ multiplied by the scalar $\langle v_2|v \rangle$.
- For a basis $\{|v_1\rangle, |v_2\rangle, \dots, |v_n\rangle \}$ there is a corresponding basis for the dual space, a basis of bras $\{\langle v_1|, \langle v_2|, \dots, \langle v_n|\}$.
- The linear combination for a ket is simply expressed as $|v\rangle = \sum_i a_i |v_i\rangle$; however, the corresponding bra has conjugated coefficients, $\langle v| = \sum_i a_i^* \langle v_i|$.
Here we will recall the fundamental notions of quantum mechanics we use when we set up the mathematical framework for a given quantum system.
To describe or study a quantum system, we first start with a set of states that are mutually exclusive with respect to an observation or measurement. Classically, a system and observations of it are two distinct things, without interplay: when we observe something, for example a moving car, or a passing meteor, our 'observation' does not affect the dynamics or behaviour of the system in question. It is because the interaction of the observer and the system is usually so weak that there is no effect of the presence of the observer on the system.
But within the quantum realm, this is not the case. In the strict sense, any observation in the quantum realm is the result of a measurement, which itself is associated with a property of the system, such as its position, momentum, or energy. Secondly, in the quantum realm, properties are quantised; that is, instead of any possible value, there exist discrete values or outcomes that an individual measurement can result in. For example, in the case of a qubit, the measurement of the state results in one of two discrete outcomes.
So, for the observation we think of, there is a discrete set of states the system can be in, and these states are mutually exclusive in the sense of possibility.
If the system is found in one of those states, then whenever the corresponding measurement is repeated, it will always be found in that state. We choose this set as our basis. Imagine there are $n$ such states; we model them as a basis $\{|1\rangle, |2\rangle, \dots, |n\rangle\}$ of an $n$-dimensional Hilbert space.
Each vector in this Hilbert space is a possible state. This is because of the superposition principle, which states that if a quantum system can exist in two states (or more), it can exist in all possible superpositions of the two (or more of those) states. Let's see below what this superposition means in connection with a quantum system.
We have the following probabilistic interpretation of a general state in the Hilbert space.
:class: tip
A quantum state $|s\rangle = c_1|1\rangle + c_2|2\rangle + \dots + c_n|n\rangle$ is a state in which the possibilities of being measured in each one of the basis states $|l\rangle$ coexist simultaneously, until the measurement is done.
The coefficients $c_l$ are called probability amplitudes: the probability of finding the system in the basis state $|l\rangle$ upon measurement is $|c_l|^2$.
The inner product of two states, $\langle s, s'\rangle \equiv \langle s|s'\rangle$, gives the overlap of the two states.
For mutually exclusive states this overlap is zero; hence the mutually exclusive states should be orthogonal to each other. That is why we impose that the basis above is orthonormal, i.e., $\langle l|l'\rangle = \delta_{ll'}$.
Further, let's compute the norm of the above generic state: $\langle s|s\rangle = \sum_l |c_l|^2$.
But the probability argument implies that this last summation equals 1, since the probabilities of all possible outcomes add up to one; so $\langle s|s\rangle = 1$, i.e., physical states are normalised.
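A minimal numerical sketch of these statements, for a hypothetical 3-state system with arbitrarily chosen amplitudes:

```{code-cell} ipython3
import numpy as np

c = np.array([1.0, 1.0j, -1.0])   # arbitrarily chosen amplitudes
s = c / np.linalg.norm(c)         # normalise, so that <s|s> = 1

probs = np.abs(s) ** 2            # Born rule: P(l) = |c_l|^2
print(probs)                      # each outcome here has probability 1/3
print(probs.sum())                # the probabilities add up to 1.0
```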
Let us try to illustrate how quantum measurement works. When we measure a property of a quantum system, the outcome of the measurement should yield a value.
Consider the quantum system imagined in the previous section, with basis states $|1\rangle, |2\rangle, \dots, |n\rangle$. If the system is in one of the basis states, say $|l\rangle$, then the corresponding measurement will always find it in $|l\rangle$, yielding the value associated with that state.
This observation, against our intuition of uncertainty in quantum systems, is actually definite. Is there something special about the system being in a basis state $|l\rangle$? Yes: the basis states are, by construction, states of definite outcome for this measurement.
The uncertainty comes in if, before the measurement, the system was in a superposition state $|s\rangle = c_1|1\rangle + c_2|2\rangle + \dots + c_n|n\rangle$.
Once we observe the system, or measure it, the system collapses into one of the basis states, say $|l\rangle$, with probability $|c_l|^2$.
Mathematically, the above measurement is interpreted as follows.
Every physically measurable quantity is called an observable, and for every observable there is a Hermitian operator, say $A$, whose eigenvectors are the basis states and whose eigenvalues $a_l$ are the possible values of the outcome, $A|l\rangle = a_l|l\rangle$. The measurement is represented by the action of this operator on the state.
The overlap, or inner product, of the resulting state with the initial state gives the outcome of the measurement. The outcome, which is also called the expectation value, is expressed as $\langle A\rangle = \langle s|A|s\rangle$.
- For the $|l\rangle$ state, the outcome is $\langle l|A|l\rangle = a_l\langle l|l\rangle = a_l$, and it is definite.
- For the superposition state, the outcome is probabilistic, with expectation value

$$
\langle s|A|s\rangle = \sum_l |c_l|^2 a_l
$$

which is the same as what one interprets as the average measurement outcome in the previous section.
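A short sketch computing this expectation value numerically, with assumed outcomes $a_l$ and the observable taken diagonal in the measurement basis:

```{code-cell} ipython3
import numpy as np

a = np.array([1.0, 2.0, 3.0])                   # assumed outcomes a_l
A = np.diag(a)                                  # diagonal observable: A|l> = a_l |l>

s = np.array([1.0, 1.0j, -1.0]) / np.sqrt(3)    # the normalised state |s>

print(np.vdot(s, A @ s).real)                   # <s|A|s>
print(np.sum(np.abs(s) ** 2 * a))               # equals sum_l |c_l|^2 a_l = 2.0
```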
Let us finally come to examples of quantum systems that are useful for quantum computing.
A qubit is a quantum system with a two-state basis. A single qubit is one of the simplest quantum systems, with a 2-dimensional Hilbert space.
A qubit is an abstraction of a quantum system with a two dimensional Hilbert space, with a basis $\{|0\rangle, |1\rangle\}$; a general state is $|\psi\rangle = c_0|0\rangle + c_1|1\rangle$.
Note that in order to represent a quantum state, the ket must be normalised, $|c_0|^2 + |c_1|^2 = 1$.
Since the Hilbert space is two dimensional, all the operators on it can be
represented as $2\times 2$ matrices.
Noteworthy operators are the following -
- Identity $I$ and Hadamard $H$ operators. Their matrix representations are the following -

$$
I = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}; \qquad H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}
$$

- Pauli $X, Y, Z$ operators. Their matrix representations are the following -

$$
X = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}; \qquad Y = \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix}; \qquad Z = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}
$$

One assumes a coordinate system whose $z$-axis is along the eigenvectors of $Z$; a general qubit state then corresponds to a point on the unit sphere in this coordinate system, known as the Bloch sphere.
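These properties are easy to verify numerically. A small sketch checking that the operators above are both Hermitian and unitary:

```{code-cell} ipython3
import numpy as np

I = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])

for name, M in [("H", H), ("X", X), ("Y", Y), ("Z", Z)]:
    hermitian = np.allclose(M, M.conj().T)      # M† = M
    unitary = np.allclose(M.conj().T @ M, I)    # M†M = I
    print(name, "Hermitian:", hermitian, "Unitary:", unitary)
```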
In general, when we combine two quantum systems, the basis of the Hilbert space of the combined system is constructed from the Cartesian product of the bases of the component systems.
Let us first consider a system of two qubits. If the basis for one qubit is $\{|0\rangle, |1\rangle\}$, the basis for two qubits is $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$,
where the order of the states in the ordered pair follows the qubit label.
For example, $|01\rangle$ denotes the state in which qubit 1 is in state $|0\rangle$ and qubit 2 is in state $|1\rangle$.
If we have more qubits, the Hilbert space will grow accordingly: each added qubit doubles the number of basis states.
The size of the basis, and hence the dimension of the Hilbert space of $n$ qubits is $2^n$
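In coordinates, the combined basis states are built with the Kronecker (tensor) product; a minimal NumPy sketch:

```{code-cell} ipython3
import numpy as np

ket0 = np.array([1.0, 0.0])      # |0> in coordinates
ket1 = np.array([0.0, 1.0])      # |1>

ket01 = np.kron(ket0, ket1)      # the two-qubit basis state |01>
print(ket01)                     # [0. 1. 0. 0.] -- 4-dimensional

ket011 = np.kron(ket01, ket1)    # three qubits: |011>
print(ket011.size)               # 8 = 2**3 basis states
```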
The quantum states we have referred to so far are what one calls pure states, and they are represented by a normalised vector in a complex Hilbert space. In the Dirac notation, they are described by a ket $|\psi\rangle$.
There is another way to describe the state of a quantum system, which can be generalised to mixed states (defined below), and that is through the density operator. For a system in a pure quantum state $|\psi\rangle$, the density operator is $\rho = |\psi\rangle\langle\psi|$.

- It is easy to see that the trace of the density operator is 1, i.e., $\text{Tr}(\rho)=1$.
- For a pure state, $\rho^2 = \rho$, so $\text{Tr}(\rho^2)=1$.
Suppose we have a quantum system, and wish to prepare it in a state $|\psi\rangle$, but due to imperfections the preparation yields one of the states $|\psi_i\rangle$, each with a classical probability $p_i$ (with $\sum_i p_i = 1$). Such a state is called a mixed state.
What mathematical object should we use to represent such a state? Clearly a single vector does not suffice, so we need something that depends on the weights $p_i$ and the states $|\psi_i\rangle$.
Such states are represented by the density operator

$$
\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|
$$

- For a mixed state, $\rho^2 \ne \rho$, and $\text{Tr}(\rho^2) <1$, while we still have $\text{Tr}(\rho)=1$.
Footnotes

[^1]: See the [Wikipedia article on sets](https://en.wikipedia.org/wiki/Set_(mathematics)) for more details.