r.9 Summary of useful results
We will tersely list, in no particular order, a number of results we've
established that will be useful in the future:
- Properties of matrix multiplication:
$$\eqb{
{\bf A B} & \;\; \neq \;\; & {\bf B A} &
\;\;\;\;\mbox{Typically not commutative} \\
{\bf A (B C)} & \eql & {\bf (A B) C} &
\;\;\;\;\mbox{Always associative} \\
{\bf A (B + C)} & \eql & {\bf (A B) + (A C)} &
\;\;\;\;\mbox{Distributes over addition} \\
{\bf (A + B)C} & \eql & {\bf (A C) + (B C)} &
\;\;\;\;\mbox{Distributes over addition} \\
\alpha {\bf (A B)} & \eql & {\bf (\alpha A) B}
\eql {\bf A (\alpha B)} &
\;\;\;\;\mbox{Scalars move freely} \\
}$$
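The identities above are easy to spot-check numerically. A minimal sketch, assuming numpy is available (the random matrices and the seed are arbitrary illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))
alpha = 2.5

print(np.allclose(A @ B, B @ A))                      # typically False: not commutative
print(np.allclose(A @ (B @ C), (A @ B) @ C))          # True: associative
print(np.allclose(A @ (B + C), A @ B + A @ C))        # True: distributes (left)
print(np.allclose((A + B) @ C, A @ C + B @ C))        # True: distributes (right)
print(np.allclose(alpha * (A @ B), (alpha * A) @ B))  # True: scalars move freely
```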
- A different way to view matrix-matrix multiplication:
$$
{\bf A}
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
{\bf b}_1 & {\bf b}_2 & \cdots & {\bf b}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
\eql
\mat{ & & & \\
\vdots & \vdots & \vdots & \vdots\\
{\bf A b}_1 & {\bf A b}_2 & \cdots & {\bf A b}_n\\
\vdots & \vdots & \vdots & \vdots\\
& & &
}
$$
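A quick numerical confirmation of this column-by-column view (a sketch assuming numpy; the shapes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))

AB = A @ B
# Column j of A B equals A applied to column j of B.
print(all(np.allclose(AB[:, j], A @ B[:, j]) for j in range(B.shape[1])))   # True
```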
- Reverse order rules:
$$\eqb{
({\bf AB})^{T} & \eql & {\bf B}^{T} {\bf A}^{T}
& \;\;\;\; \mbox{Transpose of a product} \\
({\bf AB})^{-1} & \eql & {\bf B}^{-1} {\bf A}^{-1}
& \;\;\;\; \mbox{Inverse of a product (when both inverses exist)} \\
}$$
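Again, a small numerical spot-check (a sketch assuming numpy; random square matrices are invertible with probability 1, so the inverse rule applies here):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.allclose((A @ B).T, B.T @ A.T))                # transpose of a product
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A))) # inverse of a product
```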
- Relationship between dot-product and angle between two
vectors:
$$
\cos(\theta)
\eql
\frac{{\bf v} \cdot {\bf u}}{
|{\bf v}| |{\bf u}|}
$$
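For example, the angle between \((1,0)\) and \((1,1)\) comes out to 45 degrees (a sketch assuming numpy):

```python
import numpy as np

v = np.array([1.0, 0.0])
u = np.array([1.0, 1.0])

cos_theta = (v @ u) / (np.linalg.norm(v) * np.linalg.norm(u))
print(np.degrees(np.arccos(cos_theta)))    # 45.0
```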
- If \({\bf A}\) has orthonormal columns, then
\({\bf A}^T {\bf A} = {\bf I}\); when \({\bf A}\) is also square,
this makes \({\bf A}^{-1} = {\bf A}^{T}\).
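A quick check, using the Q factor of a QR factorization as a convenient source of orthonormal columns (a sketch assuming numpy; the matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # Q has orthonormal columns

print(np.allclose(Q.T @ Q, np.eye(3)))             # Q^T Q = I
print(np.allclose(np.linalg.inv(Q), Q.T))          # square Q, so its inverse is its transpose
```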
- If \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) is a collection
of vectors, \({\bf A}\)
is the matrix that has these vectors as columns,
and columns \(i_1,i_2, \ldots, i_k\) are the pivot
columns of \(RREF({\bf A})\),
then \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\)
are linearly independent vectors.
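For instance (a sketch assuming sympy; the matrix is a made-up example in which the third column is the sum of the first two):

```python
import sympy as sp

A = sp.Matrix([[1, 0, 1],
               [0, 1, 1],
               [1, 1, 2]])    # column 3 = column 1 + column 2

R, pivots = A.rref()          # rref() returns (RREF matrix, tuple of pivot column indices)
print(pivots)                 # (0, 1): the first two columns form an independent set
```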
- The rowspace dimension (minimum number of vectors needed
to span the rowspace) is the same as the colspace dimension:
$$\mbox{dim}(\mbox{rowspace}({\bf A}))
\eql \mbox{dim}(\mbox{colspace}({\bf A}))
$$
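Numerically (a sketch assuming numpy; the product below is just an arbitrary way to build a rank-3 matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 7))    # a 5x7 matrix of rank 3

# The column rank of A equals the column rank of A^T, i.e. the row rank of A.
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))      # 3 3
```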
- A collection of mutually-orthogonal nonzero vectors is linearly independent.
- If the columns of \({\bf A}_{m\times n}\)
are linearly independent, then \( ({\bf A}^T {\bf A})^{-1} \) exists.
- This is useful in the least squares solution
$$
\hat{\bf x} = ({\bf A}^T {\bf A})^{-1} {\bf A}^T {\bf b}
$$
- In most applications, we'll have many more rows
than columns, and so the columns are very likely to be independent.
- Also, if the system \({\bf Ax}={\bf b}\) has a
solution, the least squares solution is an exact solution.
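As a hedged numerical sketch of the least squares formula above (assuming numpy; the tall random system is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((100, 3))     # many more rows than columns
b = rng.standard_normal(100)

# Normal-equations form of the least squares solution.
x_hat = np.linalg.inv(A.T @ A) @ A.T @ b

# Compare against numpy's built-in least squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_lstsq))    # True
```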
- The Gram-Schmidt algorithm takes a collection of
linearly independent vectors and produces an orthogonal
(orthonormal, if tweaked) basis for the span of those vectors.
- Another way of saying it: it turns a non-orthogonal basis
into an orthogonal one.
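A minimal implementation sketch of (orthonormalized) Gram-Schmidt, assuming numpy and linearly independent columns; the two-column example is arbitrary:

```python
import numpy as np

def gram_schmidt(V):
    # Classical Gram-Schmidt on the columns of V (assumed linearly independent).
    # Returns a matrix whose columns are an orthonormal basis for colspace(V).
    Q = np.zeros_like(V, dtype=float)
    for j in range(V.shape[1]):
        q = V[:, j].astype(float)
        for i in range(j):
            q = q - (Q[:, i] @ V[:, j]) * Q[:, i]   # remove components along earlier q's
        Q[:, j] = q / np.linalg.norm(q)             # normalize (the orthonormal "tweak")
    return Q

V = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = gram_schmidt(V)
print(np.allclose(Q.T @ Q, np.eye(2)))              # True: orthonormal columns
```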
-
The following are equivalent (any one implies the other two):
- \({\bf Q}\) has orthonormal columns.
- \(|{\bf Qx}| \eql |{\bf x}|\)
\(\rhd\)
Transformation by \({\bf Q}\) preserves lengths.
- \(({\bf Qx}) \cdot ({\bf Qy}) \eql
{\bf x}\cdot{\bf y}\)
\(\rhd\)
Transformation by \({\bf Q}\) preserves dot-products.
- The product of two orthogonal matrices is orthogonal.
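All of these properties, and the closure under products, can be spot-checked at once (a sketch assuming numpy; the matrices and vectors are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
Q1, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # two orthogonal matrices
Q2, _ = np.linalg.qr(rng.standard_normal((4, 4)))
x, y = rng.standard_normal(4), rng.standard_normal(4)

print(np.isclose(np.linalg.norm(Q1 @ x), np.linalg.norm(x)))   # lengths preserved
print(np.isclose((Q1 @ x) @ (Q1 @ y), x @ y))                  # dot-products preserved
print(np.allclose((Q1 @ Q2).T @ (Q1 @ Q2), np.eye(4)))         # the product is orthogonal
```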
r.10 Summary of some key theorems
We will not list all the important theorems nor every part of a
theorem, just the most important ones (and most important parts):
- The main result that ties together inverses and equation
solutions for square matrices:
The following are equivalent (each
implies any of the others) for a real matrix \({\bf A}_{n\times n}\):
- \({\bf A}\) is invertible (the inverse exists).
- \({\bf A}^T\) is invertible.
- \(RREF({\bf A}) \eql {\bf I}\)
- \(\mbox{rank}({\bf A}) = n\).
- The rows are linearly independent.
- The columns are linearly independent.
- \(\mbox{nullspace}({\bf A})=\{{\bf 0}\}\)
\(\rhd\)
\({\bf Ax}={\bf 0}\) has \({\bf x}={\bf 0}\) as the only solution.
- \({\bf Ax}={\bf b}\) has a unique solution.
- \(\mbox{colspace}({\bf A})=\mbox{rowspace}({\bf A})=\mathbb{R}^n\)
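A few of these equivalences, checked on a small invertible matrix (a sketch assuming numpy; the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
n = A.shape[0]

print(np.linalg.matrix_rank(A) == n)                 # rank(A) = n
print(np.allclose(np.linalg.inv(A) @ A, np.eye(n)))  # the inverse exists
print(np.linalg.solve(A, np.array([1.0, 2.0])))      # Ax = b has a (unique) solution
print(np.linalg.solve(A, np.zeros(n)))               # Ax = 0 has only the solution x = 0
```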
-
For any matrix \({\bf A}_{m\times n}\)
$$
\mbox{rank}({\bf A}) = \mbox{rank}({\bf A}^T{\bf A})
$$
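Numerically (a sketch assuming numpy; the shape is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 4))

print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T @ A))   # True
```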
- The singular value decomposition:
any real matrix \({\bf A}_{m\times n}\) with rank \(r\)
can be written as the product
$$
{\bf A} \eql
{\bf U}
{\bf \Sigma}
{\bf V}^T
$$
where \({\bf U}_{m\times m}\) and \({\bf V}_{n\times n}\) are real orthogonal matrices and
\({\bf \Sigma}_{m\times n}\) is a real matrix whose only nonzero entries are the singular values \(\sigma_1, \sigma_2, \ldots, \sigma_r\) along its diagonal:
$$
{\bf \Sigma} \eql
\mat{
\sigma_1 & 0 & \ldots & 0 & \ldots & 0\\
0 & \sigma_2 & \ldots & 0 & \ldots & 0\\
\vdots & \vdots & \ddots & \vdots & & \vdots\\
0 & 0 & \ldots & \sigma_r & \ldots & 0\\
\vdots & \vdots & & \vdots & \ddots & \vdots\\
0 & 0 & \ldots & 0 & \ldots & 0\\
}
$$
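The decomposition is easy to exercise numerically (a sketch assuming numpy; the 4x3 random matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=True)   # s holds the singular values

Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)                        # place sigma_1,...,sigma_r on the diagonal
print(np.allclose(A, U @ Sigma @ Vt))             # A = U Sigma V^T
print(np.allclose(U.T @ U, np.eye(4)),
      np.allclose(Vt @ Vt.T, np.eye(3)))          # U and V are orthogonal
```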
- For every linear transformation \(T\) there is an equivalent matrix
\({\bf A}\) such that \(T({\bf x}) = {\bf Ax}\) for all \({\bf x}\).
- This last point is worth understanding carefully:
- First, think of a transformation \(T({\bf x})\) as
a rule that acts on a vector \({\bf x}\).
- An example: square the components of a vector:
$$
T(x_1, x_2, \ldots, x_n) \eql (x_1^2, x_2^2, \ldots, x_n^2)
$$
- Thus, a transformation produces another vector.
- A linear transformation satisfies the property:
$$
T(\alpha{\bf x} + \beta{\bf y}) \eql
\alpha T({\bf x}) + \beta T({\bf y})
$$
for all scalars \(\alpha, \beta\) and all
vectors \({\bf x}, {\bf y}\). (The squaring example above is not linear.)
- Let's now expand \({\bf x}\) in terms of a basis like the
standard basis and apply the linearity of \(T\):
$$\eqb{
T({\bf x})
& \eql &
T(x_1 {\bf e}_1 + x_2 {\bf e}_2 + \ldots + x_n {\bf e}_n) \\
& \eql &
x_1 T({\bf e}_1) + x_2 T({\bf e}_2) + \ldots + x_n T({\bf e}_n)\\
}$$
- What this means:
once a vector is expressed as a linear combination
of the standard vectors, a linear transformation is forced to produce the
same linear combination of the transformed standard vectors.
- Thus, a linear transformation's action on the standard vectors
completely determines its action on any vector.
- These \(T({\bf e}_i)\)'s become the columns of the
equivalent matrix (a small numerical sketch follows this list).
- Why should we care about linear transformations when we
could instead work with their matrix representations?
- The linear transformation version is useful in proofs.
- Linear transformations generalize beyond real vectors
to functions.
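The numerical sketch promised above (assuming numpy; both maps are hypothetical examples, not from the notes): the squaring map fails the linearity test, while a genuinely linear map is completely captured by the matrix whose columns are the \(T({\bf e}_i)\)'s.

```python
import numpy as np

def T_square(x):
    return x ** 2                           # squares each component: not linear

def T_linear(x):
    return np.array([2 * x[1], -x[0]])      # a (made-up) linear map on R^2

x = np.array([1.0, 2.0])
print(np.allclose(T_square(2 * x), 2 * T_square(x)))    # False: fails T(2x) = 2 T(x)

# Build the equivalent matrix column-by-column from T(e_1), T(e_2).
e = np.eye(2)
A = np.column_stack([T_linear(e[:, 0]), T_linear(e[:, 1])])
print(np.allclose(T_linear(x), A @ x))                  # True: T(x) = Ax
```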
Finally, let's point out two important things to remember:
- Consider \(\mathbb{R}^n\), the set of all n-component
vectors:
- Then, \(\mathbb{R}^n\) needs exactly \(n\)
vectors for a basis.
- That is, \(n\) linearly independent vectors are necessary
and sufficient for a basis.
- If \({\bf w}\) is in the span of linearly independent
vectors \({\bf v}_1, {\bf v}_2,\ldots,{\bf v}_m\), then
there's a unique linear combination that expresses
\({\bf w}\) in terms of the \({\bf v}_i\)'s.
- To prove this, try two different linear combinations and
subtract. Then use the linear independence of the \({\bf v}_i\)'s.
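Spelling that hint out (a standard argument along the lines above): suppose \({\bf w}\) could be written in two ways, and subtract:
$$\eqb{
{\bf w} & \eql & a_1 {\bf v}_1 + a_2 {\bf v}_2 + \ldots + a_m {\bf v}_m \\
{\bf w} & \eql & b_1 {\bf v}_1 + b_2 {\bf v}_2 + \ldots + b_m {\bf v}_m \\
{\bf 0} & \eql & (a_1 - b_1) {\bf v}_1 + (a_2 - b_2) {\bf v}_2 + \ldots + (a_m - b_m) {\bf v}_m \\
}$$
Because the \({\bf v}_i\)'s are linearly independent, every coefficient \(a_i - b_i\) must be zero, so \(a_i = b_i\): the two combinations were the same one all along.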