L.1 Relationship between linear algebra and quantum computing
- Quantum computing uses linear algebra throughout,
and for nearly everything it does.
Doing quantum computing without linear algebra would be
like taking a course on Shakespeare without knowing English.
- The parts of standard linear algebra we'll most use are:
- Vectors, matrices, matrix-vector multiplication,
matrix-matrix multiplication.
- Span, basis, linear independence.
- Eigenvectors and eigenvalues.
- The new parts of linear algebra (that we'll cover after the
review) involve:
- A new type of notation called the Dirac
(or ket) notation.
- Vectors and matrices that have complex numbers in them.
- A few new concepts such as: outer-products, projectors,
and special types
of matrices called Hermitian and unitary matrices.
What is likely to be a bit confusing:
- The new (Dirac) notation takes some getting used to.
- If you haven't seen complex numbers before, they will take
a little practice.
- Unfortunately, unlike the arrows we've used for regular
real-valued vectors, complex vectors do not have a
convenient visualization. Thus, you'll need to get used
to symbols and abstraction.
Accordingly:
- To avoid the double whammy of new notation and concepts
simultaneously, we'll first review in standard notation
and then review again in the new notation.
- The only way to get used to the new concepts is
to develop facility through practice problems.
L.2 Review of basic linear algebra
Please review in this order:
- Review
Part-I: Vectors,
dot-products, lengths, angles, orthogonality, matrix-vector
multiplication, solving \({\bf Ax} = {\bf b}\) exactly and approximately.
- Review
Part-II:
matrix-matrix multiplication, spaces, span, basis, independence,
orthogonality (again), projections.
- Review
Part-III:
summary of useful basic results, summary of some key theorems.
- Review
Part-IV:
change of basis for vectors and matrices.
The above review sections are from the linear algebra course.
In the remainder, we will selectively review other topics,
covering only those aspects relevant to quantum computing.
If you have not completed the above 4-part review by now, please do so
before continuing here.
L.3 Some useful facts
Useful fact #1: the product \({\bf AB}\) can be written in
terms of the columns of \({\bf B}\) as
$$
{\bf AB} \eql
{\bf A}
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
{\bf b}_1 & {\bf b}_2 & \cdots & {\bf b}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
\eql
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
{\bf A b}_1 & {\bf A b}_2 & \cdots & {\bf A b}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
$$
That is, we stack the individual \({\bf A b}_i\) products (which
are vectors) as columns in the result.
- Let's see how this works in an example. Consider
$$
{\bf A B} \eql
\mat{1 & 2\\ 0 & -3} \mat{\color{dkgreen}{2} & \color{blue}{-3}\\ \color{dkgreen}{0} & \color{blue}{1}}
\eql
\mat{\color{dkgreen}{2} & \color{blue}{-1}\\ \color{dkgreen}{0} & \color{blue}{-3}}
$$
- The first column on the right becomes
$$
{\bf A b}_1 \eql
\mat{1 & 2\\ 0 & -3} \mat{\color{dkgreen}{2} \\ \color{dkgreen}{0}}
\eql
\mat{\color{dkgreen}{2} \\ \color{dkgreen}{0} }
$$
The second column becomes
$$
{\bf A b}_2 \eql
\mat{1 & 2\\ 0 & -3} \mat{\color{blue}{-3}\\ \color{blue}{1}}
\eql
\mat{\color{blue}{-1}\\ \color{blue}{-3}}
$$
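As a quick sanity check, here is a small Python/numpy sketch of this
column-wise view of the product (the code is an addition to these notes,
and assumes numpy is available):

    import numpy as np

    A = np.array([[1, 2],
                  [0, -3]])
    B = np.array([[2, -3],
                  [0, 1]])

    AB = A @ B                 # ordinary matrix-matrix product
    col1 = A @ B[:, 0]         # A times the first column of B
    col2 = A @ B[:, 1]         # A times the second column of B

    print(AB)                  # [[ 2 -1]
                               #  [ 0 -3]]
    # Stacking the A b_i products as columns reproduces AB:
    print(np.allclose(AB, np.column_stack([col1, col2])))   # True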
L.4 Inner-products and outer-products
We're already familiar with an inner-product:
- The inner-product of two vectors (of the same dimension) is
also known as the dot-product.
- An example with
$$
{\bf u} \eql \vecthree{1}{2}{3}
\;\;\;\;\;
{\bf v} \eql \vecthree{0}{-1}{2}
$$
The inner (dot) product is the number
$$
1\times 0 \; + \; 2\times (-1) \; + \; 3\times 2 \eql 4
$$
- We often write the inner product as \({\bf u} \cdot {\bf v}\).
- But there is another way of writing this that will
ultimately be more useful for us:
- Let's write the two vectors as a row times a column:
$$
\mat{1 & 2 & 3} \; \mat{0\\-1\\2}
$$
- Think of the object on the left as a
single-row matrix, the transpose
$$
{\bf u}^T \eql \mat{1 & 2 & 3}
$$
- In effect, we are writing the inner product as:
$$
{\bf u}^T \; {\bf v}
$$
- Why is this useful?
- This puts the inner product on an equal footing with
matrix-vector multiplication, which means larger expressions
can be algebraically simplified using associativity
and distribution.
- That is, think of
$$
{\bf u}^T \; {\bf v}
$$
as this "matrix" product with dimensions as shown:
$$
{\bf u}^T_{1\times n} \; {\bf v}_{n\times 1}
$$
- For example:
$$\eqb{
|{\bf u}+{\bf v}|^2
& \eql &
({\bf u}+{\bf v})^T ({\bf u}+{\bf v}) \\
& \eql &
({\bf u}^T +{\bf v}^T) ({\bf u}+{\bf v}) \\
& \eql &
{\bf u}^T {\bf u} + {\bf u}^T {\bf v}
+ {\bf v}^T {\bf u} + {\bf v}^T {\bf v} \\
& \eql &
|{\bf u}|^2 + 2 {\bf u}^T {\bf v}
+ |{\bf v}|^2\\
}$$
(using \({\bf v}^T {\bf u} = {\bf u}^T {\bf v}\) for real vectors).
If we had instead only the previous inner-product definition,
we would be stuck at the second line.
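As a quick numerical check that writing inner products as
\({\bf u}^T {\bf v}\) behaves as claimed, here is a small Python/numpy
sketch (an addition to these notes):

    import numpy as np

    u = np.array([1.0, 2.0, 3.0])
    v = np.array([0.0, -1.0, 2.0])

    # The inner product, three equivalent ways:
    print(np.dot(u, v))                          # 4.0
    print(u @ v)                                 # 4.0
    print(u.reshape(1, 3) @ v.reshape(3, 1))     # [[4.]]  (row times column)

    # The expansion |u+v|^2 = |u|^2 + 2 u.v + |v|^2:
    lhs = np.linalg.norm(u + v) ** 2
    rhs = np.linalg.norm(u) ** 2 + 2 * np.dot(u, v) + np.linalg.norm(v) ** 2
    print(np.isclose(lhs, rhs))                  # True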
We've seen that a row times a column is a number (the inner product).
We could ask: what do we get if we multiply a column times a row?
Answer: The product of a column times a row is a matrix:
- For example:
$$
\vecthree{1}{2}{3}
\mat{0 & -1 & 2}
\eql
\mat{0 & -1 & 2\\
0 & -2 & 4\\
0 & -3 & 6}
$$
This is simply all possible products of the individual components.
- This is called the outer-product of two vectors.
- If \({\bf A}_{i,j}\) is the i-th row, j-th column element
of the resulting matrix then
$$
{\bf A}_{i,j} \eql {\bf u}_i \; {\bf v}_j
$$
That is, the i-th component of \({\bf u}\) (a number)
multiplied by the j-th component of \({\bf v}\) (another number).
- We don't even need the vectors to be of the same length,
for example:
$$
\vectwo{1}{2}
\mat{0 & -1 & 2}
\eql
\mat{0 & -1 & 2\\
0 & -2 & 4}
$$
- Thus, if \({\bf u}\) and \({\bf v}\) are m-dimensional
and n-dimensional vectors,
the outer-product
\({\bf u} {\bf v}^T\)
has dimensions \(m\times n\).
- As we will see, outer-products are used all over
quantum computing (with different notation that we will encounter).
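Here is a small Python/numpy sketch of the outer-product
(again, an addition to these notes):

    import numpy as np

    u = np.array([1, 2, 3])
    v = np.array([0, -1, 2])

    # Column u times row v^T: a 3x3 matrix of all pairwise products.
    outer = np.outer(u, v)
    print(outer)               # [[ 0 -1  2]
                               #  [ 0 -2  4]
                               #  [ 0 -3  6]]

    # Element (i, j) is u[i] * v[j]:
    print(outer[2, 1] == u[2] * v[1])        # True

    # The vectors need not have the same length:
    print(np.outer([1, 2], v).shape)         # (2, 3)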
L.5 Eigenvectors and eigenvalues: a review
We'll focus on the second interpretation of matrix-vector
multiplication
\(
{\bf A u} \eql {\bf v}
\)
where we'll think of the matrix \({\bf A}\) as
transforming or "acting on" vector \({\bf u}\)
to produce vector \({\bf v}\).
For a typical matrix \({\bf A}\) acting on a typical (randomly chosen)
vector \({\bf u}\), the result \({\bf v} = {\bf A u}\) has a different
length and a different direction than \({\bf u}\).
However, now let's see what the particular matrix
$$
{\bf A} \eql \mat{5 & -3\\ 0 & -1}
$$
does to two particular vectors:
$$
{\bf A} \vectwo{1}{0} \eql \vectwo{5}{0}
\;\;\;\;\;\;
{\bf A} \vectwo{0.5}{1} \eql \vectwo{-0.5}{-1}
$$
We see that
$$
{\bf A} {\bf x}_i \eql \lambda_i {\bf x}_i
$$
Such \({\bf x}_i\) vectors are called eigenvectors of
the matrix \({\bf A}\), each with its associated
eigenvalue \(\lambda_i\) (a number).
Thus, in the above case, \({\bf A}\) has eigenvectors
$$
{\bf x}_1 \eql \vectwo{1}{0}
\;\;\;\;\;
{\bf x}_2 \eql \vectwo{0.5}{1}
$$
with associated eigenvalues
$$
\lambda_1 \eql 5
\;\;\;\;\;
\lambda_2 \eql -1
$$
Here's what to remember about eigenvectors and eigenvalues:
- Note: the matrix \({\bf A}\) in \({\bf Ax}=\lambda{\bf x}\)
needs to be square:
- Consider \({\bf A}_{m\times n} {\bf x}_{n\times 1}\).
- This produces a vector of dimension \(m\times 1\).
- The only way \({\bf Ax}=\lambda{\bf x}\) is if \(m=n\).
- Some matrices will not have (real) eigenvectors.
- For example, a 2D rotation (by any angle other than
\(0^\circ\) or \(180^\circ\)) cannot leave any vector pointing
along its original direction.
- Most math textbooks focus on hand-calculating eigenvalues
using the so-called characteristic equation. In practice,
eigenvectors/values are almost always computed using
numerical algorithms.
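For instance, here is a small Python/numpy sketch (an addition to these
notes) that computes the eigenvalues and eigenvectors of the \(2\times 2\)
matrix from the example above:

    import numpy as np

    # The 2x2 matrix from the example above: eigenvalues 5 and -1,
    # with eigenvectors along (1,0) and (0.5,1).
    A = np.array([[5.0, -3.0],
                  [0.0, -1.0]])

    evals, evecs = np.linalg.eig(A)    # columns of evecs are eigenvectors
    print(evals)                       # [ 5. -1.]  (order may vary)
    print(evecs)                       # columns proportional to (1,0), (0.5,1)

    # Check A x = lambda x for each eigenpair:
    for lam, x in zip(evals, evecs.T):
        print(np.allclose(A @ x, lam * x))    # True, True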
Let's examine the special case when
the matrix of interest is symmetric
(i.e., \({\bf A}^T = {\bf A}\)):
- The spectral theorem (as it's called) tells us that:
- All its eigenvalues \(\lambda_1,\lambda_2,\ldots,\lambda_n\)
(with possible multiplicities) are real.
- There exist \(n\) corresponding
eigenvectors \({\bf x}_1,{\bf x}_2,\ldots,{\bf x}_n\)
such that the matrix
$$
{\bf S}
\defn
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
{\bf x}_1 & {\bf x}_2 & \cdots & {\bf x}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
$$
has orthonormal columns (so the eigenvectors form an
orthonormal basis for the space).
This is not at all obvious. Surprisingly, this is
easier to prove with complex-number matrices.
- Now, suppose we place the eigenvalues from above
diagonally in a matrix \({\bf \Lambda}\):
$$
{\bf \Lambda} \eql
\mat{\lambda_1 & 0 & & 0 \\
0 & \lambda_2 & & 0 \\
\vdots & & \ddots & \\
0 & 0 & & \lambda_n
}
$$
(with the eigenvalues on the diagonal) then
$$
{\bf A S}
\eql
{\bf S \Lambda}
$$
- It is conventional to renumber the indices so that
the eigenvalues are sorted along the diagonal from large to small.
- Since
\({\bf A} {\bf x}_i = \lambda_i {\bf x}_i\),
the i-th column of \({\bf AS}\) is
\({\bf A}{\bf x}_i = \lambda_i {\bf x}_i\)
(by Useful fact #1), which is exactly the i-th column of \({\bf S\Lambda}\).
This confirms
$$
{\bf A S}
\eql
{\bf S \Lambda}
$$
(note that the eigenvalue matrix appears on the right of \({\bf S}\)).
- Then, multiply both sides by \({\bf S}^{-1}\) on the right
to get
$$
{\bf A}
\eql
{\bf S \Lambda S}^{-1}
$$
- Because \({\bf S}\) has orthonormal columns,
\({\bf S}^{-1} = {\bf S}^{T}\), which means we can write
$$
{\bf A}
\eql
{\bf S \Lambda S}^{T}
$$
This is sometimes called the diagonalization of
the matrix \({\bf A}\).
- Let's look at an example of a symmetric matrix:
$$
{\bf A} \eql
\mat{2 & 1 & 3\\
1 & 0 & 1\\
3 & 1 & 2}
$$
For this case, it turns out that the vectors
$$
{\bf x}_1 \; = \; \mat{0.684\\0.255\\0.684}
\;\;\;\;\;\;
{\bf x}_2 \; = \; \mat{0.18\\-0.967\\0.18}
\;\;\;\;\;\;
{\bf x}_3 \; = \; \mat{0.707\\0\\-0.707}
$$
are the orthonormal eigenvectors with corresponding eigenvalues
$$
\lambda_1 = 5.372
\;\;\;\;\;\;
\lambda_2 = -0.372
\;\;\;\;\;\;
\lambda_3 = -1
$$
Which means we can write \({\bf A} = {\bf S \Lambda S}^T\),
where \({\bf S}\) has these eigenvectors as its columns and
\({\bf \Lambda}\) has the eigenvalues on its diagonal.
It turns out we can express the (symmetric) matrix \({\bf A}\)
above using outer-products of the eigenvectors:
- Let's go back to
\(
{\bf A}
\eql
{\bf S \Lambda S}^{T}
\)
and write
$$
{\bf S \Lambda}
\eql
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
{\bf x}_1 & {\bf x}_2 & \cdots & {\bf x}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
\mat{\lambda_1 & 0 & & 0 \\
0 & \lambda_2 & & 0 \\
\vdots & & \ddots & \\
0 & 0 & & \lambda_n
}
\eql
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
\lambda_1{\bf x}_1 & \lambda_2{\bf x}_2 & \cdots & \lambda_n{\bf x}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
$$
- Therefore
$$
{\bf A }
\eql
{\bf S \Lambda} {\bf S}^T
\eql
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
\lambda_1{\bf x}_1 & \lambda_2{\bf x}_2 & \cdots & \lambda_n{\bf x}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
\mat{
\ldots & {\bf x}_1^T & \ldots \\
\ldots & {\bf x}_2^T & \ldots \\
& \vdots & \\
\ldots & {\bf x}_n^T & \ldots \\
}
\eql
\sum_{i=1}^n \lambda_i {\bf x}_i {\bf x}_i^T
$$
- For our \(3\times 3\) matrix \({\bf A}\) example:
$$\eqb{
{\bf A }
& \eql &
\sum_{i=1}^3 \lambda_i {\bf x}_i {\bf x}_i^T \\
& \eql &
\lambda_1 {\bf x}_1 {\bf x}_1^T
+
\lambda_2 {\bf x}_2 {\bf x}_2^T
+
\lambda_3 {\bf x}_3 {\bf x}_3^T \\
& \eql &
5.372 \mat{0.684\\0.255\\0.684} \mat{0.684 & 0.255 & 0.684}\\
& & -
0.372 \mat{0.18\\ -0.967 \\ 0.18} \mat{0.18 & -0.967 & 0.18}\\
& & -1 \mat{0.707\\ 0 \\-0.707} \mat{0.707 & 0 & -0.707}
}$$
- Each outer-product is a matrix. Let's expand the three
products to make this clear:
$$
{\bf A }
\eql
5.372
\mat{0.468 & 0.174 & 0.468\\
0.174 & 0.065 & 0.174\\
0.468 & 0.174 & 0.468}
- 0.372
\mat{0.032 & -0.174 & 0.032\\
-0.174 & 0.935 & -0.174\\
0.032 & -0.174 & 0.032}
- 1
\mat{0.5 & 0 & -0.5\\
0 & 0 & 0\\
-0.5 & 0 & 0.5}
$$
- Why is this useful? Two reasons.
- The first reason is to approximate and simplify a matrix:
- Outside of quantum computing, the terms with the largest-magnitude
eigenvalues often capture most of what "matters" in a matrix.
- Thus, in the above case, the first term can be computed as
$$
5.372
\mat{0.468 & 0.174 & 0.468\\
0.174 & 0.065 & 0.174\\
0.468 & 0.174 & 0.468}
\eql
\mat{2.513 & 0.937 & 2.513\\
0.937 & 0.349 & 0.937\\
2.513 & 0.937 & 2.513}
\; \approx \;
\mat{2 & 1 & 3\\
1 & 0 & 1\\
3 & 1 & 2}
\eql {\bf A}
$$
This single term is thus a rough first (rank-one) approximation of the
original matrix \({\bf A}\); a numerical check of the full decomposition
appears after this list.
- In quantum computing and quantum mechanics, however,
this eigenvector outer-product
expression is an essential part of
the theory:
- The act of measuring, a very strange aspect
of the fundamental theory, will use such outer-products.
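Here is a Python/numpy sketch (an addition to these notes) that computes
the eigen-decomposition of the symmetric \(3\times 3\) example above and
verifies the outer-product sum:

    import numpy as np

    A = np.array([[2.0, 1.0, 3.0],
                  [1.0, 0.0, 1.0],
                  [3.0, 1.0, 2.0]])

    # eigh is meant for symmetric matrices: real eigenvalues,
    # orthonormal eigenvectors (returned as the columns of S).
    evals, S = np.linalg.eigh(A)
    print(evals)        # approx [-1., -0.372, 5.372] (ascending order here)

    # S has orthonormal columns, so S^T S = I and A = S Lambda S^T:
    print(np.allclose(S.T @ S, np.eye(3)))              # True
    print(np.allclose(A, S @ np.diag(evals) @ S.T))     # True

    # Equivalently, A is the sum of lambda_i * x_i x_i^T:
    recon = sum(lam * np.outer(S[:, i], S[:, i])
                for i, lam in enumerate(evals))
    print(np.allclose(A, recon))                        # True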
We'll now turn to some topics that your linear algebra
course might not have covered.
Read through once, and we'll review in class (albeit with a
different notation).
L.6 New topic: Unit length vectors and their importance
A unit-length vector \({\bf v}\) has
$$
|{\bf v}| \eql \sqrt{{\bf v} \cdot {\bf v}} \eql 1
$$
which implies \({\bf v}\cdot {\bf v} = 1\).
The importance of unit-length in quantum computing (and
mechanics):
- Unit-length vectors are the only kind of vector used
in quantum computing (and mechanics).
- The reason is: convenience.
- Unit lengths simplify calculations and terms you have to
carry around.
Let's look at an example of such a simplification:
- Recall how we computed the projection of a vector
\({\bf u}\) on \({\bf v}\):
- The projection of \({\bf u}\) along \({\bf v}\) is:
$$\eqb{
{\bf y}
& \eql &
\alpha {\bf v} \\
& \eql &
\left(\frac{{\bf v}\cdot {\bf u}}{{\bf v}\cdot {\bf v}} \right) {\bf v}
}$$
- When \({\bf v}\) has unit length, this becomes
$$
{\bf y}
\eql
({\bf v}\cdot {\bf u}) \; {\bf v}
$$
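Here is a tiny Python/numpy sketch of this simplification (an addition to
these notes), comparing the general projection formula with the
unit-length shortcut:

    import numpy as np

    u = np.array([3.0, 1.0])
    v = np.array([1.0, 0.0])          # unit length

    # Projection of u along a unit-length v:
    y_unit = np.dot(v, u) * v

    # General formula (no unit-length assumption), same direction w = 2v:
    w = 2.0 * v
    y_general = (np.dot(w, u) / np.dot(w, w)) * w

    print(y_unit, y_general)                  # [3. 0.] [3. 0.]
    print(np.allclose(y_unit, y_general))     # True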
Unit lengths also simplify orthogonal bases:
- The basis
$$
{\bf w}_1 \eql \vectwo{1}{\sqrt{3}}
\;\;\;\;\;\;
{\bf w}_2 \eql \vectwo{-\sqrt{3}}{1}
$$
is orthogonal (Check: \({\bf w}_1 \cdot {\bf w}_2 = 0\)).
- But neither vector is of unit length.
- Which means that if we place them as columns in
a matrix
$$
{\bf X} \eql
\mat{\vdots & \vdots\\
{\bf w}_1 & {\bf w}_2\\
\vdots & \vdots}
\eql
\mat{1 & -\sqrt{3}\\
\sqrt{3} & 1}
$$
then the transpose is not quite the inverse:
$$\eqb{
{\bf X}^{T} {\bf X}
& \eql &
\mat{1 & \sqrt{3}\\
-\sqrt{3} & 1}
\mat{1 & -\sqrt{3}\\
\sqrt{3} & 1}
& \eql &
\mat{4 & 0\\
0 & 4}
}$$
- It is easy to convert an orthogonal basis to
an orthonormal basis: divide each vector by its length.
- Thus, define
$$
{\bf v}_1 \eql \frac{{\bf w}_1}{|{\bf w}_1|}
\eql \vectwo{\frac{1}{2}}{\frac{\sqrt{3}}{2}}
$$
and
$$
{\bf v}_2 \eql \frac{{\bf w}_2}{|{\bf w}_2|}
\eql \vectwo{-\frac{\sqrt{3}}{2}}{\frac{1}{2}}
$$
- Then, if we define
$$
{\bf Y} \eql
\mat{\vdots & \vdots\\
{\bf v}_1 & {\bf v}_2\\
\vdots & \vdots}
\eql
\mat{\frac{1}{2} & -\frac{\sqrt{3}}{2}\\
\frac{\sqrt{3}}{2} & \frac{1}{2}}
$$
we see that
$$
{\bf Y}^{T} {\bf Y}
\eql
\mat{\frac{1}{2} & \frac{\sqrt{3}}{2}\\
-\frac{\sqrt{3}}{2} & \frac{1}{2}}
\mat{\frac{1}{2} & -\frac{\sqrt{3}}{2}\\
\frac{\sqrt{3}}{2} & \frac{1}{2}}
\eql
\mat{1 & 0\\0 & 1}
\eql
{\bf I}
$$
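Here is a Python/numpy check of this normalization step (an addition to
these notes):

    import numpy as np

    w1 = np.array([1.0, np.sqrt(3.0)])
    w2 = np.array([-np.sqrt(3.0), 1.0])

    X = np.column_stack([w1, w2])
    print(X.T @ X)                 # [[4. 0.]
                                   #  [0. 4.]]  orthogonal but not orthonormal

    # Divide each vector by its length:
    v1 = w1 / np.linalg.norm(w1)   # (1/2, sqrt(3)/2)
    v2 = w2 / np.linalg.norm(w2)   # (-sqrt(3)/2, 1/2)
    Y = np.column_stack([v1, v2])

    print(np.allclose(Y.T @ Y, np.eye(2)))    # True: now Y^T = Y^{-1}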
L.7 New topic: Projectors, completeness
Let's revisit projection on unit-length vectors:
the projection of \({\bf u}\) onto a unit-length vector \({\bf v}\) is
\(({\bf v}^T {\bf u}) \; {\bf v} = ({\bf v} {\bf v}^T) \; {\bf u}\),
and so the matrix \({\bf v} {\bf v}^T\) (an outer-product) is called
the projector onto \({\bf v}\).
Now consider projecting a vector \({\bf u}\) onto every
vector in an orthonormal basis
\({\bf v}_1, \ldots,{\bf v}_n \).
We'll next describe a property called completeness for
projectors:
- First, let's recall that a vector is the sum of its
projections on any orthonormal basis:
$$
{\bf u} \eql
({\bf v}_1^T {\bf u}) \; {\bf v}_1
+ \ldots +
({\bf v}_n^T {\bf u}) \; {\bf v}_n
$$
- Why is this true?
- This comes from first writing \({\bf u}\) in terms of the
basis:
$$
{\bf u} \eql
\alpha_1 {\bf v}_1
+ \ldots +
\alpha_n {\bf v}_n
$$
- Now multiply from the left by \({\bf v}_i^T\) and use
orthonormality (\({\bf v}_i^T {\bf v}_j = 0\) for \(j \neq i\),
and \({\bf v}_i^T {\bf v}_i = 1\)) to get
\(\alpha_i = {\bf v}_i^T {\bf u}\).
- Since \({\bf v}_i^T {\bf u}\) is a number we can rewrite
and use associativity
$$\eqb{
{\bf u}
& \eql &
({\bf v}_1^T {\bf u}) \; {\bf v}_1
+ \ldots +
({\bf v}_n^T {\bf u}) \; {\bf v}_n \\
& \eql &
{\bf v}_1 \; ({\bf v}_1^T {\bf u})
+ \ldots +
{\bf v}_n \; ({\bf v}_n^T {\bf u}) \\
& \eql &
({\bf v}_1 {\bf v}_1^T) \; {\bf u}
+ \ldots +
({\bf v}_n {\bf v}_n^T) \; {\bf u} \\
& \eql &
\left({\bf v}_1 {\bf v}_1^T
+ \ldots +
{\bf v}_n {\bf v}_n^T\right) \; {\bf u} \\
}$$
- Which means the sum of projector matrices \({\bf v}_i {\bf v}_i^T\)
is the identity matrix:
$$
{\bf v}_1 {\bf v}_1^T
+ \ldots +
{\bf v}_n {\bf v}_n^T
\eql
{\bf I}
$$
- This is sometimes called the completeness theorem
for projectors.
- We will make frequent use of this result.
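Here is a small Python/numpy check of completeness, using the
orthonormal basis \({\bf v}_1, {\bf v}_2\) from the previous section
(an addition to these notes):

    import numpy as np

    v1 = np.array([0.5, np.sqrt(3.0) / 2.0])
    v2 = np.array([-np.sqrt(3.0) / 2.0, 0.5])

    # Each v_i v_i^T is a projector; their sum is the identity:
    P1 = np.outer(v1, v1)
    P2 = np.outer(v2, v2)
    print(np.allclose(P1 + P2, np.eye(2)))       # True (completeness)

    # Consequently, any u is the sum of its projections:
    u = np.array([2.0, -1.0])
    print(np.allclose(P1 @ u + P2 @ u, u))       # True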
That concludes the review of linear algebra.
We will return to some of these concepts when we introduce
the Dirac notation.
When we do, we will work almost exclusively with unit-length
complex vectors, the fundamental building block for
everything in quantum computing and quantum mechanics.