\( \newcommand{\blah}{blah-blah-blah} \newcommand{\eqb}[1]{\begin{eqnarray*}#1\end{eqnarray*}} \newcommand{\eqbn}[1]{\begin{eqnarray}#1\end{eqnarray}} \newcommand{\bb}[1]{\mathbf{#1}} \newcommand{\mat}[1]{\begin{bmatrix}#1\end{bmatrix}} \newcommand{\nchoose}[2]{\left(\begin{array}{c} #1 \\ #2 \end{array}\right)} \newcommand{\defn}{\stackrel{\vartriangle}{=}} \newcommand{\rvectwo}[2]{\left(\begin{array}{c} #1 \\ #2 \end{array}\right)} \newcommand{\rvecthree}[3]{\left(\begin{array}{r} #1 \\ #2\\ #3\end{array}\right)} \newcommand{\rvecdots}[3]{\left(\begin{array}{r} #1 \\ #2\\ \vdots\\ #3\end{array}\right)} \newcommand{\vectwo}[2]{\left[\begin{array}{r} #1\\#2\end{array}\right]} \newcommand{\vecthree}[3]{\left[\begin{array}{r} #1 \\ #2\\ #3\end{array}\right]} \newcommand{\vecfour}[4]{\left[\begin{array}{r} #1 \\ #2\\ #3\\ #4\end{array}\right]} \newcommand{\vecdots}[3]{\left[\begin{array}{r} #1 \\ #2\\ \vdots\\ #3\end{array}\right]} \newcommand{\eql}{\;\; = \;\;} \definecolor{dkblue}{RGB}{0,0,120} \definecolor{dkred}{RGB}{120,0,0} \definecolor{dkgreen}{RGB}{0,120,0} \newcommand{\bigsp}{\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;} \newcommand{\plss}{\;\;+\;\;} \newcommand{\miss}{\;\;-\;\;} \newcommand{\implies}{\Rightarrow\;\;\;\;\;\;\;\;\;\;\;\;} \)


Module 6: Theory, Part I

Independence, Rank, Basis


Module objectives

 
By the end of this module, you should be able to

 


6.0   A matrix as data

 

Multivariate data:

  • In many applications, one measures a number of variables simultaneously.

  • For example: a weather measurement at a location could include these 6 variables: temperature, pressure, wind, humidity, cloud cover, precipitation.

  • Assuming all the measurements can be converted to numeric form, each measurement of a set of variables can be treated as a vector.

  • For example, suppose a single measurement of some 6-variable system, made at a certain time, yielded the values \(12.5, 13.5, 15.25, 12.55, 13, 11.5\).

  • This is a single, multivariate data sample.

  • Then, we might get another sample of the six quantities, such as \(13.2, 13.5, 16.65, 12.69, 14.4, 11.85\).

  • We will place each sample as a column in a matrix.

  • Thus, the above two samples would look like: $$ \mat{ 12.5 & 13.2 \\ 13.5 & 13.5 \\ 15.25 & 16.65 \\ 12.55 & 12.69 \\ 13 & 14.4 \\ 11.5 & 11.85 \\ } $$

  • Here is an example with 8 samples, each as one column in the data matrix we're calling \({\bf X}\): $$ {\bf X} \eql \mat{ 12.500 & 13.200 & 12.270 & 13.710 & 14.781 & 12.299 & 10.134 & 8.962\\ 13.500 & 13.500 & 12.600 & 14.400 & 16.080 & 12.384 & 9.197 & 7.488\\ 15.250 & 16.650 & 13.440 & 19.020 & 23.682 & 13.174 & 4.063 & -0.845\\ 12.550 & 12.690 & 12.234 & 13.062 & 13.780 & 12.175 & 10.786 & 10.039\\ 13.000 & 14.400 & 12.540 & 15.420 & 17.562 & 12.598 & 8.268 & 5.923\\ 11.500 & 11.850 & 11.835 & 11.655 & 11.351 & 11.957 & 12.468 & 12.737 } $$

  • We'll use this notation:
    • Let \({\bf x}_i = \) the i-th column.
    • \({\bf x}_i(j) = \) the j-th entry in the i-th column.

  • For example: $$ {\bf x}_3 \eql \mat{12.270\\ 12.600\\ 13.440\\ 12.234\\ 12.540\\ 11.835} $$ and \({\bf x}_3(5) \eql 12.540\).

  • Thus, the data matrix \({\bf X}\) consisting of \(n\) samples, each of dimension \(m\), can be viewed as $$ {\bf X} \eql \mat{ & & & \\ \vdots & \vdots & \ldots & \vdots \\ {\bf x}_1 & {\bf x}_2 & \ldots & {\bf x}_n\\ \vdots & \vdots & \ldots & \vdots \\ & & & } $$ where each column vector has \(m\) elements (numbers) $$ {\bf x}_i = ({\bf x}_i(1), {\bf x}_i(2), \ldots, {\bf x}_i(m)) $$
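
  • As an aside, here's a minimal Python sketch of this notation, assuming numpy is available (an assumption, not part of the notes). Note that the notes use 1-based indices while numpy arrays are 0-based:

        import numpy as np

        # The two samples from earlier, placed as the columns of a 6 x 2 data matrix.
        X = np.array([
            [12.5,  13.2],
            [13.5,  13.5],
            [15.25, 16.65],
            [12.55, 12.69],
            [13.0,  14.4],
            [11.5,  11.85],
        ])

        x2 = X[:, 1]       # x_2, the 2nd column (numpy index 1)
        print(x2)          # [13.2  13.5  16.65 12.69 14.4  11.85]
        print(X[2, 1])     # x_2(3), the 3rd entry of the 2nd column: 16.65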
 

Why place data into a matrix?

  • We will later, when studying the singular value decomposition (SVD), see that linear algebra techniques are the key to applications like latent semantic analysis, recommendation engines, and image search.

  • As an example, let's apply the RREF algorithm to the above \(6 \times 8\) matrix to get: $$ RREF({\bf X}) \eql \mat{ 1 & 0 & 0 & 0.60 & 1.55 & -0.36 & -1.97 & -2.82\\ 0 & 1 & 0 & 1.40 & 2.32 & 0.12 & -1.81 & -2.86 \\ 0 & 0 & 1 & -1 & -2.87 & 1.24 & 4.78 & 6.68 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 } $$ We see that some rows have "disappeared".

    That is, the data appears reduced to three rows.

  • The matrix is what we'd call a low rank matrix, as opposed to a full rank matrix.

  • Later, we will apply the SVD to the matrix to get transformed data \({\bf Y}\): $$ {\bf Y} \eql \mat{ 32.08 & 33.77 & 30.59 & 35.94 & 40.26 & 30.44 & 21.91 & 17.31\\ 0.49 & 1.36 & -0.95 & 3.15 & 6.64 & -1.2 & -7.98 & -11.63\\ -0.66 & 0.31 & -0.14 & 0.17 & 0.09 & 0.1 & 0.08 & 0.05\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 } $$ We see that the magnitudes of the numbers in the first row dwarf the others, suggesting some type of importance.

    This will turn out to be an effective way to reduce the dimension of data sets.
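
  • As an optional computational aside, here's a minimal Python sketch (assuming numpy is available) that checks the rank and the singular values of the \(6 \times 8\) matrix \({\bf X}\) above. Because the printed entries are rounded, the smallest singular values may come out tiny rather than exactly zero:

        import numpy as np

        X = np.array([
            [12.500, 13.200, 12.270, 13.710, 14.781, 12.299, 10.134,  8.962],
            [13.500, 13.500, 12.600, 14.400, 16.080, 12.384,  9.197,  7.488],
            [15.250, 16.650, 13.440, 19.020, 23.682, 13.174,  4.063, -0.845],
            [12.550, 12.690, 12.234, 13.062, 13.780, 12.175, 10.786, 10.039],
            [13.000, 14.400, 12.540, 15.420, 17.562, 12.598,  8.268,  5.923],
            [11.500, 11.850, 11.835, 11.655, 11.351, 11.957, 12.468, 12.737],
        ])

        # Numerical rank: with a modest tolerance this should report 3,
        # matching the three nonzero rows of the RREF shown above.
        print(np.linalg.matrix_rank(X, tol=1e-2))

        # Singular values (part of the SVD, studied later): expect only
        # about three to be significantly nonzero, with the first dominating.
        print(np.linalg.svd(X, compute_uv=False))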

  • But as background to understanding the SVD, we'll first need to understand the concepts of linear independence, rank and basis.
 


6.1   Linearly independent vectors

 

Consider this example:

  • Suppose we have three vectors $$ {\bf u} \eql \vecthree{2}{-1}{1} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; {\bf v} \eql \vecthree{0}{1}{1} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; {\bf w} \eql \vecthree{6}{1}{7} $$ and ask the question are there scalars \(\alpha, \beta\) such that $$ {\bf w} \eql \alpha {\bf u} + \beta {\bf v}? $$ That is, can \({\bf w}\) be expressed as a linear combination of \({\bf u}\) and \({\bf v}\)?
     

    In-Class Exercise 1: How does one address the question for the above example?
     

  • With the above exercise, you saw that \({\bf w}\) can indeed be expressed as a linear combination of \({\bf u}\) and \({\bf v}\): $$ {\bf w} \eql 3 {\bf u} + 4 {\bf v} $$ Here, \(\alpha = 3, \beta=4\).
     

    In-Class Exercise 2: Can \({\bf u}\) be expressed in terms of \({\bf v}\) and \({\bf w}\)?
     

  • On the other hand, \({\bf z}=(6,1,8)\) cannot be expressed as a linear combination of \({\bf u}\) and \({\bf v}\).
     

    In-Class Exercise 3: Show that this is the case.
     

    In-Class Exercise 4: Can \({\bf u}\) above be expressed as a linear combination of \({\bf v}\) and \({\bf z}\)? Or \({\bf v}\) as a linear combination of \({\bf u}\) and \({\bf z}\)?
     

  • One could say \({\bf w}\) depends in a linear fashion on \({\bf u}\) and \({\bf v}\).

  • In contrast, there is no way that \({\bf z}\) is linearly dependent on \({\bf u}\) and \({\bf v}\).
 

We are led to a definition of linear independence:

  • One way to define linear independence is to first define dependence for a particular vector, along the lines of:
    \({\bf w}\) is linearly dependent on \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) if there exist scalars \(\alpha_1,\alpha_2,\ldots,\alpha_n\) such that $$ {\bf w} \eql \alpha_1 {\bf v}_1 + \alpha_2 {\bf v}_2 + \ldots + \alpha_n {\bf v}_n $$

  • Observe: if, say, \(\alpha_2 \neq 0\), we can algebraically rearrange to see that $$ {\bf v}_2 \eql \beta_1 {\bf v}_1 + \beta_2 {\bf w} + \ldots + \beta_n {\bf v}_n $$ (for a different set of coefficients)
    • That is, there's nothing one-directional about \({\bf w}\)'s dependence.
    • Any vector whose coefficient in such a relationship is nonzero can be "solved for" in terms of the others, so dependence is really a property of the whole collection rather than of one particular vector.
    • Therefore, it's often more useful to define linear independence directly, and for a collection of vectors.

  • Definition: Vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) are linearly independent if the only solution to the equation $$ \alpha_1 {\bf v}_1 + \alpha_2 {\bf v}_2 + \ldots + \alpha_n {\bf v}_n \eql {\bf 0} $$ is \(\alpha_1 = \alpha_2 = \ldots = \alpha_n = 0\).

  • Let's see how this works for an individual vector in the collection:
    • Suppose \({\bf v}_1\) were linearly dependent on the others.
    • Then, we'd be able to write $$ {\bf v}_1 = \beta_2 {\bf v}_2 + \ldots + \beta_n {\bf v}_n $$
    • Then, $$ - {\bf v}_1 + \beta_2 {\bf v}_2 + \ldots + \beta_n {\bf v}_n \eql {\bf 0} $$ is a solution in which at least one coefficient (that of \({\bf v}_1\), namely \(-1\)) is not zero.
    • So, by the definition, the collection is not linearly independent, which matches our intuition about dependence.

  • Another way of thinking about it:
    • Suppose we have $$ \alpha_1 {\bf v}_1 + \alpha_2 {\bf v}_2 + \ldots + \alpha_n {\bf v}_n \eql {\bf 0} $$ where at least one \(\alpha_i \neq 0\).
    • Suppose, for example, \(\alpha_2 \neq 0\) (among possibly other non-zero scalars).
    • Then, we could write $$ - \alpha_2 {\bf v}_2 \eql \alpha_1 {\bf v}_1 + \ldots + \alpha_n {\bf v}_n $$
    • Which, because \(\alpha_2 \neq 0\), means that we can divide: $$ {\bf v}_2 \eql \frac{\alpha_1}{-\alpha_2} {\bf v}_1 + \ldots + \frac{\alpha_n}{-\alpha_2} {\bf v}_n $$
    • That is, \({\bf v}_2\) is dependent on the other vectors.

  • The weirdness of the zero vector:
    • Suppose \({\bf v}_1 = {\bf 0} = (0,0,...,0)\).
    • Then, consider $$ {\bf v}_1 = \beta_2 {\bf v}_2 + \ldots + \beta_n {\bf v}_n $$
    • Here, we can make \(\beta_2 = \beta_3 = ... = \beta_n = 0\) so that $$ {\bf 0} = 0 {\bf v}_2 + \ldots + 0 {\bf v}_n $$
    • That is, the zero vector can always be interpreted as "dependent" on any other collection of vectors.
    • If we were to use the definition, we would write: $$ (-1) {\bf 0} + 0 {\bf v}_2 + \ldots + 0 {\bf v}_n \eql {\bf 0} $$
    • Thus, no collection of vectors that includes the zero vector can be linearly independent.

  • If that seems strange, it is because we read more than mathematically necessary into the word "dependent".
 

Another example:

  • Consider the vectors $$ {\bf v}_1 \eql \vecthree{2}{-1}{1} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; {\bf v}_2 \eql \vecthree{0}{1}{1} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; {\bf v}_3 \eql \vecthree{6}{1}{8} $$

  • By the definition, they would be independent if $$ \alpha_1 {\bf v}_1 + \alpha_2 {\bf v}_2 + \alpha_3 {\bf v}_3 \eql {\bf 0} $$ implies \(\alpha_1 = \alpha_2 = \alpha_3 = 0\).

  • Let's write this in column form: $$ \alpha_1 \vecthree{2}{-1}{1} \plss \alpha_2 \vecthree{0}{1}{1} \plss \alpha_3 \vecthree{6}{1}{8} \eql {\bf 0} $$

  • But this can be written in matrix form with the vectors as columns: $$ \mat{2 & 0 & 6\\ -1 & 1 & 1\\ 1 & 1 & 8} \vecthree{\alpha_1}{\alpha_2}{\alpha_3} \eql \vecthree{0}{0}{0} $$

  • We've now reduced the independence question to the familiar \({\bf A x}= {\bf b}\) (here with \({\bf b} = {\bf 0}\))!

  • In general, for \(n\) vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\), we'd put the vectors in columns $$ \mat{ & & & \\ \vdots & \vdots & \ldots & \vdots \\ {\bf v}_1 & {\bf v}_2 & \ldots & {\bf v}_n\\ \vdots & \vdots & \ldots & \vdots \\ & & & } \mat{\alpha_1\\ \alpha_2\\ \alpha_3\\ \vdots\\ \alpha_n} \eql \mat{0\\ 0\\ 0\\ \vdots\\ 0} $$ The vectors will be a linearly independent collection if the only solution to the above equation is the zero solution: \({\bf\alpha} = {\bf 0}\).

  • At this point, several interesting questions arise:
    • What would happen if we computed the RREF of the matrix?
    • What do we know about the solution to \({\bf A x}= {\bf 0}\)?
      (In this case: \({\bf A \alpha}= {\bf 0}\))
 

Recall (Proposition 5.7):

  • When \(RREF({\bf A}) = {\bf I}\), the only solution to \({\bf A x}= {\bf 0}\) is \({\bf x}= {\bf 0}\).

  • That is, when every row and column has a single pivot.

  • Just as importantly, it goes the other way too: if \({\bf x}= {\bf 0}\) is the only solution, then \(RREF({\bf A}) = {\bf I}\).
 

Now we can connect the two:

  • Thus, vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) are linearly independent if and only if the RREF of the matrix $$ \mat{ & & & \\ \vdots & \vdots & \ldots & \vdots \\ {\bf v}_1 & {\bf v}_2 & \ldots & {\bf v}_n\\ \vdots & \vdots & \ldots & \vdots \\ & & & } $$ is the identity matrix. (Here we are assuming there are as many vectors as dimensions, so that the matrix is square; in general, the vectors are independent exactly when every column of the RREF has a pivot.)

  • Note that we haven't proved anything new. That hard work was done earlier when we showed that \(RREF({\bf A}) = {\bf I}\) if and only if the only solution to \({\bf A x}= {\bf 0}\) is \({\bf x}= {\bf 0}\).

  • Instead, we've just reinterpreted the definition.

  • That is, we write the condition $$ \alpha_1 {\bf v}_1 + \alpha_2 {\bf v}_2 + \ldots + \alpha_n {\bf v}_n \eql {\bf 0} $$ as $$ \mat{ & & & \\ \vdots & \vdots & \ldots & \vdots \\ {\bf v}_1 & {\bf v}_2 & \ldots & {\bf v}_n\\ \vdots & \vdots & \ldots & \vdots \\ & & & } \mat{\alpha_1\\ \alpha_2\\ \alpha_3\\ \vdots\\ \alpha_n} \eql \mat{0\\ 0\\ 0\\ \vdots\\ 0} $$ and use the fact proved earlier that \({\bf \alpha}={\bf 0}\) if and only if the RREF is the identity matrix.

  • The interpretation in terms of the RREF gives us a method to determine whether a collection of vectors is linearly independent.
    • Put the vectors in a matrix.
    • Compute the RREF.
    • See if the RREF = \({\bf I}\).
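
  • As a quick sanity check of this method, here's a minimal Python sketch (assuming sympy is available) applied to the two examples above: \({\bf u}, {\bf v}, {\bf w}=(6,1,7)\) versus \({\bf u}, {\bf v}, {\bf z}=(6,1,8)\):

        from sympy import Matrix, eye

        u, v = [2, -1, 1], [0, 1, 1]
        w, z = [6, 1, 7], [6, 1, 8]

        # Put the vectors in a matrix as columns, compute the RREF,
        # and see whether the RREF is the identity.
        A_uvw = Matrix([u, v, w]).T
        A_uvz = Matrix([u, v, z]).T

        print(A_uvw.rref()[0] == eye(3))   # False: w = 3u + 4v, so dependent
        print(A_uvz.rref()[0] == eye(3))   # True: u, v, z are independent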
 

What does this tell us when there is dependence among the vectors?

  • So far, it would appear that all the RREF will do is tell us whether or not there is dependence.

  • But with a little more work, as we will see, the RREF will identify independent vectors.
 


6.2   RREF and independent vectors

 

As a preliminary step, let's prove the following about any RREF's rows and columns:

  • Proposition 6.1: Treating each row of any RREF as a vector, the pivot rows are independent. All other rows are dependent on the pivot rows.

  • Proposition 6.2: Treating each column of an RREF as a vector, the pivot columns are independent. All other columns are dependent on the pivot columns.

  • To understand the statements, let's look at an example: $$ \mat{ 1 & 0 & -1 & 3 & 0\\ 0 & 1 & 2 & 1 & 4\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0} $$

  • First, let's look at the rows:
    • There are four row vectors: $$\eqb{ {\bf r}_1 & \eql & (1,0,-1,3,0)\\ {\bf r}_2 & \eql & (0,1,2,1,4)\\ {\bf r}_3 & \eql & (0,0,0,0,0)\\ {\bf r}_4 & \eql & (0,0,0,0,0) }$$
    • Proposition 6.1 says that \({\bf r}_1, {\bf r}_2\) are independent (as a group).
    • And that \({\bf r}_3, {\bf r}_4\) are each dependent on \({\bf r}_1, {\bf r}_2\).
     

    In-Class Exercise 5: Use the definition of linear independence to show both parts of Proposition 6.1 are true for this example, and then generalize to prove 6.1 for all RREFs.
     

  • Recall:
    • We usually think of dependence as a direct relationship.
    • But if the zero vector is one of the vectors, its coefficient can be nonzero in a combination that adds up to \({\bf 0}\).
    • That is, no collection with the zero vector can be independent.

  • Now let's understand what Proposition 6.2 is saying (about the columns):
    • The RREF's columns form 5 vectors: $$ {\bf c}_1 \eql \vecfour{1}{0}{0}{0} \;\;\;\;\;\;\;\;\; {\bf c}_2 \eql \vecfour{0}{1}{0}{0} \;\;\;\;\;\;\;\;\; {\bf c}_3 \eql \vecfour{-1}{2}{0}{0} \;\;\;\;\;\;\;\;\; {\bf c}_4 \eql \vecfour{3}{1}{0}{0} \;\;\;\;\;\;\;\;\; {\bf c}_5 \eql \vecfour{0}{4}{0}{0} $$
    • Proposition 6.2 says that vectors \({\bf c}_1, {\bf c}_2\) are independent.
    • And that any of \({\bf c}_3, {\bf c}_4, {\bf c}_5\) can be expressed as a linear combination of \({\bf c}_1, {\bf c}_2\).
     

    In-Class Exercise 6: Use the definition of linear independence to show both parts of Proposition 6.2 are true for this example, and then generalize to prove 6.2 for all RREFs.
     

 

Next, let's look at an important idea: rank

  • Suppose we have the following vectors $$ {\bf v}_1 \eql \vecfour{1}{-1}{0}{0} \;\;\;\;\;\;\;\;\; {\bf v}_2 \eql \vecfour{1}{0}{1}{0} \;\;\;\;\;\;\;\;\; {\bf v}_3 \eql \vecfour{1}{1}{2}{0} \;\;\;\;\;\;\;\;\; {\bf v}_4 \eql \vecfour{0}{1}{1}{0} \;\;\;\;\;\;\;\;\; {\bf v}_5 \eql \vecfour{3}{-1}{2}{0} $$

  • Consider \({\bf v}_2\) and \({\bf v}_4\). Then, it's easy to see that
    • \({\bf v}_2\) and \({\bf v}_4\) are independent.
    • \({\bf v}_1 = {\bf v}_2 - {\bf v}_4\)
    • \({\bf v}_3 = {\bf v}_2 + {\bf v}_4\)
    • \({\bf v}_5 = 3{\bf v}_2 - {\bf v}_4\)
    This suggests that we need at least two of these vectors to express the others.

  • Similarly, it's easy to show that
    • \({\bf v}_1\) and \({\bf v}_2\) are independent.
    • \({\bf v}_3 = -{\bf v}_1 + 2{\bf v}_2\)
    • \({\bf v}_4 = -{\bf v}_1 + {\bf v}_2\)
    • \({\bf v}_5 = {\bf v}_1 + 2{\bf v}_2\)
     

    In-Class Exercise 7: Show that the above is true given what was shown earlier for \({\bf v}_2\) and \({\bf v}_4\).
     

  • More such tedious calculations with these vectors reveal that the largest independent subset possible from the vectors above is of size 2.
    • No subset of size 1 is enough to express the others.
    • Any subset of size 3 or larger will have dependence.

  • Now, any subset of a linearly independent collection is linearly independent.
     

    In-Class Exercise 8: Prove the following: if the vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) are independent, then so is any subset of these vectors.
     

  • Thus, for any general collection of vectors, what we need to do is identify the largest subset that's independent.

  • In some ways this characterizes the amount of independence present in a collection.

  • Definition: The rank of a collection of vectors is the size of the largest subset of independent vectors.
 

And now for two remarkable results:

  • Suppose that \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) is a collection of vectors and \({\bf A}\) is a matrix that has these vectors as columns.

  • Proposition 6.3: If columns \(i_1,i_2, \ldots, i_k\) are the pivot columns of \(RREF({\bf A})\), then \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\) are independent vectors amongst the set.

  • Proposition 6.4: The rank of the collection \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) is the number of pivot columns of \(RREF({\bf A})\).

  • The statements are a bit complicated, with double subscripts, so let's understand what they're saying through an example:
    • Consider the vectors $$ {\bf v}_1 \eql \vecfour{1}{-1}{0}{0} \;\;\;\;\;\;\;\;\; {\bf v}_2 \eql \vecfour{1}{0}{1}{0} \;\;\;\;\;\;\;\;\; {\bf v}_3 \eql \vecfour{1}{1}{2}{0} \;\;\;\;\;\;\;\;\; {\bf v}_4 \eql \vecfour{0}{1}{1}{0} \;\;\;\;\;\;\;\;\; {\bf v}_5 \eql \vecfour{3}{-1}{2}{0} $$
    • Let's put them in a matrix as columns: $$ \mat{1 & 1 & 1 & 0 & 3\\ -1 & 0 & 1 & 1 & -1\\ 0 & 1 & 2 & 1 & 2\\ 0 & 0 & 0 & 0 & 0} $$
    • The RREF turns out to be: $$ \mat{{\bf 1} & 0 & -1 & -1 & 1\\ 0 & {\bf 1} & 2 & 1 & 2\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0} $$
    • Then, \({\bf v}_1, {\bf v}_2\) are a linearly independent subset, from Proposition 6.3.
    • Note: \({\bf v}_1, {\bf v}_2\) are not columns of the RREF: they are columns of \({\bf A}\).
    • Proposition 6.4 says that no other linearly independent subset is larger.
    • That is, the rank of the set is 2.
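
  • Here's a minimal Python sketch (assuming sympy is available) of how Propositions 6.3 and 6.4 are used in practice for this example:

        from sympy import Matrix

        # The five vectors v_1, ..., v_5 as the columns of A.
        A = Matrix([
            [ 1, 1, 1, 0,  3],
            [-1, 0, 1, 1, -1],
            [ 0, 1, 2, 1,  2],
            [ 0, 0, 0, 0,  0],
        ])

        R, pivots = A.rref()
        print(pivots)                      # (0, 1): the 1st and 2nd columns (0-indexed)

        # Proposition 6.3: the ORIGINAL columns at the pivot positions
        # form an independent subset (here, v_1 and v_2).
        independent_subset = [A.col(j) for j in pivots]

        # Proposition 6.4: the rank is the number of pivot columns.
        print(len(pivots), A.rank())       # 2 2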

  • We will prove 6.3 in two parts:
    • Part I: any vector corresponding to a non-pivot column can be expressed in terms of the vectors corresponding to the pivot columns, i.e., \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\)
    • Part II: \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\) are linearly independent.

  • Proposition 6.3: If columns \(i_1,i_2, \ldots, i_k\) are the pivot columns of \(RREF({\bf A})\), where \({\bf A}\) is formed with vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) as its columns, then \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\) are independent vectors amongst the set.

  • Proof of Part I: Suppose each vector has \(m\) dimensions. Now consider the solution to \({\bf Ax}={\bf 0}\), where \({\bf A}\) has \(m\) rows and \(n\) columns. First, let's look at the case when \({\bf Ax}={\bf 0}\) has \({\bf x}={\bf 0}\) as the only solution:
    • In this case, all the columns are pivot columns.
    • And therefore \({\bf x}={\bf 0}\) is the only solution to \(x_1{\bf v}_1 + x_2{\bf v}_2 + \ldots + x_n{\bf v}_n = {\bf 0}\), because this is just a restatement of \({\bf Ax}={\bf 0}\).
    • This means all of \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) are independent, and the proposition is true.
    Now suppose \({\bf Ax}={\bf 0}\) has a solution \({\bf x}\neq {\bf 0}\):
    • In this case, the pivot columns \(i_1,i_2, \ldots, i_k\) correspond to basic variables.
    • Now let \(x_j\) be any free variable in the (non-zero) solution of \({\bf Ax}={\bf 0}\).
    • We'll write the solution as: $$ x_1{\bf v}_1 + x_2{\bf v}_2 + \ldots + x_n{\bf v}_n = {\bf 0} $$ and express $$ x_j{\bf v}_j = -x_1{\bf v}_1 - x_2{\bf v}_2 - \ldots - x_{j-1}{\bf v}_{j-1} - x_{j+1}{\bf v}_{j+1} - \ldots - x_n{\bf v}_n $$
    • Set \(x_j=1\) and all the other free variables to \(0\).
    • This results in $$ {\bf v}_j = -x_{i_1}{\bf v}_{i_1} - x_{i_2}{\bf v}_{i_2} - \ldots - x_{i_k}{\bf v}_{i_k} $$
    • That is, \({\bf v}_j\) is expressed as a linear combination of \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\).
    • In this way, any vector corresponding to a non-pivot column can be expressed in terms of \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\).

  • Before proving Part II, let's see why it's needed at all:
    • We know that the pivot columns of the RREF are independent
            \(\rhd\) You proved this earlier in Proposition 6.2.
    • But this doesn't mean that the original columns corresponding to these in \({\bf A}\) are independent.
    • Also, just because we expressed the other vectors in terms of these doesn't mean they (the columns of \({\bf A}\) corresponding to pivots) are independent.
    • That is, there could be dependence between them.

  • The proof of Part II is a bit subtle.

  • Let's first illustrate the idea through an example:
    • Suppose we have \(n=5\) vectors, and that the first three columns are pivot columns while the next two are non-pivot columns.
    • Now consider \({\bf Ax} = {\bf 0}\). That is, $$ x_1{\bf v}_1 + x_2{\bf v}_2 + x_3{\bf v}_3 + x_4{\bf v}_4 + x_5{\bf v}_5 = {\bf 0} $$
    • We're interested in the case \({\bf x}\neq{\bf 0}\) (because the case \({\bf x}={\bf 0}\) is covered above).
    • The first three in \({\bf x}=(x_1,x_2,x_3,x_4,x_5)\) are basic variables, whereas \(x_4,x_5\) are free variables.
    • Suppose now that we set \(x_4=x_5=0\). Does this force \(x_1=x_2=x_3=0\)? If it did, then the only solution to $$ x_1{\bf v}_1 + x_2{\bf v}_2 + x_3{\bf v}_3 \eql {\bf 0} $$ would be \(x_1=x_2=x_3=0\), meaning \({\bf v}_1, {\bf v}_2, {\bf v}_3\) are independent. How do we argue that it does imply \(x_1=x_2=x_3=0\)? Here's where the insight comes in.
    • Suppose we found a solution where \(x_1\neq 0\). Then, \({\bf Ax}={\bf 0}\) has a nonzero solution with \(x_4=x_5=0\).
    • Which means \(RREF({\bf A}){\bf x}={\bf 0}\) has the same solution (with \(x_1\neq 0\)), which contradicts the fact that the RREF's pivot columns are independent.

  • The last step in reasoning above was a bit complicated so let's illustrate that with our running example:
    • Suppose that, with the free variables set to 0, there were a non-zero solution with \(x_1 \neq 0\); it would look like this: $$ \mat{1 & 1 & 1 & 0 & 3\\ -1 & 0 & 1 & 1 & -1\\ 0 & 1 & 2 & 1 & 2\\ 0 & 0 & 0 & 0 & 0} \mat{x_1\neq 0\\ x_2 \\ x_3=0\\ x_4=0\\ x_5=0} \eql \mat{0\\ 0\\ 0\\ 0} $$
    • But this would imply the same solution for the RREF (because the free variables are set to 0): $$ \mat{{\bf 1} & 0 & -1 & -1 & 1\\ 0 & {\bf 1} & 2 & 1 & 2\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0} \mat{x_1\neq 0\\ x_2 \\ x_3=0\\ x_4=0\\ x_5=0} \eql \mat{0\\ 0\\ 0\\ 0} $$ But this is not possible, since the RREF's pivot columns are independent.

  • Proof of Part II: Let's formalize the above argument (in steps):
    • Let \(j_1, j_2, \ldots, j_{n-k}\) be the indices of the non-pivot columns. Thus, \(x_{j_1},x_{j_2},\ldots,x_{j_{n-k}}\) are free variables.
    • The indices \(i_1, i_2, \ldots, i_k\) are the pivot columns and so \(x_{i_1},x_{i_2},\ldots,x_{i_k}\) are basic variables.
    • Let \({\bf x}\) be a vector where the free variables are set to zero and then consider \({\bf Ax}={\bf 0}\).
    • If there's a non-zero solution, this means there's some variable \(x_{i_p}\neq 0\) amongst the basic variables.
    • Now for the key insight: this same solution must work for the equations \(RREF({\bf A}){\bf x}={\bf 0}\) since row reductions do not change the set of solutions.
    • Thus, we will have found a linear combination of the RREF columns with one nonzero coefficient that equals the zero vector (remembering that the coefficients of the non-pivot columns are already zero by construction). But this is a contradiction since the RREF's pivot columns are independent.

  • Phew!

  • What about Proposition 6.4?
    • We can see that because \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\) are linearly independent, the rank of the collection \({\bf v}_{1}, {\bf v}_{2}, \ldots, {\bf v}_{n}\) is at least \(k\).
    • But could it be larger than \(k\)?
    • That is, what if there is some subset other than \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\) that's independent and larger than \(k\)?
    • To make this more concrete: suppose the first three vectors correspond to the pivot columns. Could it be that the next four vectors are independent amongst themselves even though each of them depends on the first three?

  • Unfortunately, as it turns out, the proof is not that simple and will have to await a more dramatic result.
 


6.3   Span, space and dimension

 

Let us revisit an idea we've seen before: span. Recall that \(\mbox{span}({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n)\) is the set of all vectors that can be written as linear combinations \(\alpha_1 {\bf v}_1 + \alpha_2 {\bf v}_2 + \ldots + \alpha_n {\bf v}_n\).

 

We will now define an accompanying term: subspace

  • Definition: a subspace is a set of vectors closed under linear combinations.

  • Typically, we use boldface capitals to denote subspaces.

  • What this means:
    • Suppose the set \({\bf W}\) is a subspace.
    • Pick any two vectors \({\bf u}, {\bf v} \in {\bf W}\).
    • Every linear combination \(\alpha {\bf u} +\beta {\bf v}\) is in \({\bf W}\).

  • At first, this sounds endless:
    • You pick two vectors and put all their linear combinations in \({\bf W}\).
    • Then you pick any of those linear combinations and put their linear combinations in \({\bf W}\).
    • And so on.

  • It is true that the set is infinite, but there are boundaries.

  • For example, we saw that \({\bf u}=(1,1,0), {\bf v}=(-1,1,0), {\bf w}=(0,-1,0)\) generate any vector in the (x,y)-plane through linear combinations.

  • You can't generate anything outside the plane from these three.

  • At the same time, the (x,y)-plane is a subspace.

  • Even more importantly, just two of them are enough to generate the subspace (for the plane example).
 

This leads to the next important concept: dimension of a subspace

  • Definition: The dimension of a subspace is the minimum number of vectors needed to generate the subspace via linear combinations.

  • We saw earlier that for the subspace generated by \({\bf u}=(1,1,0), {\bf v}=(-1,1,0), {\bf w}=(0,-1,0)\)
    • \({\bf u}=(1,1,0), {\bf v}=(-1,1,0)\) are enough to generate the subspace
    • Alternatively, \({\bf e}_1=(1,0,0), {\bf e}_2=(0,1,0)\) are sufficient.
    • No single vector by itself can generate the (x,y)-plane.
       

      In-Class Exercise 12: Why?

  • Thus, the dimension of the (x,y)-plane is 2.
          \(\rhd\) Exactly 2 vectors are needed and sufficient.

  • Notation: dim(\({\bf W}\)) = the dimension of subspace \({\bf W}\).

  • Note: the word dimension has two meanings:
    1. The dimension of a vector: the number of elements (size) of the vector, as in \({\bf v}=(1,3,2,0,0,9)\) has 6 dimensions.
    2. The dimension of a space: the minimum number of vectors needed to span the space.

  • To better understand what a subspace is, let's also consider an example of an infinite collection of vectors that's not a subspace:
    • Consider all the 3D vectors in the positive quadrant of the (x,y)-plane.
    • Thus, all vectors of the form \((x,y,0)\) where \(x\geq 0, y\geq 0\).
    • If we pick \({\bf u}=(1,2,0)\) and \({\bf v}=(1,1,0)\) as two vectors from this set, are all their linear combinations in the set (the positive quadrant)?
    • Clearly not: \(-3{\bf u} + 1{\bf v} = (-2,-5,0)\) is not in the positive quadrant.
 

Now for an interesting result:

  • Proposition 6.5: If \(\mbox{span}({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n)\) has dimension \(n\), the vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) are linearly independent.

  • Let's first see what this means through an example.

  • Consider \({\bf W}=\mbox{span}({\bf u}, {\bf v}, {\bf w})\) where \({\bf u}=(1,1,0), {\bf v}=(-1,1,0), {\bf w}=(0,-1,0)\).
    • We saw earlier that \(\mbox{dim}({\bf W})=2\).
    • By Proposition 6.5, \({\bf u}, {\bf v}, {\bf w}\) should not be independent.
    • Which is indeed the case.
     

    In-Class Exercise 13: Show that \({\bf u}, {\bf v}\) are independent and that \({\bf w}\) is a linear combination of \({\bf u}\) and \({\bf v}\).
     

    In-Class Exercise 14: Prove Proposition 6.5. Hint: start by assuming that one vector among \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) is dependent on the others. Can the span be generated by the others?
     

  • What 6.5 is really saying: if you have a bunch of vectors that span a space and if there's dependence between them, you need fewer of them (the independent ones) to achieve the same span.
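
  • A minimal Python sketch (assuming sympy is available) that checks this for the example: the span of \({\bf u}, {\bf v}, {\bf w}\) has dimension 2, which is less than 3, so by Proposition 6.5 the three vectors cannot be independent:

        from sympy import Matrix

        u, v, w = [1, 1, 0], [-1, 1, 0], [0, -1, 0]
        A = Matrix([u, v, w]).T       # the three vectors as columns

        # dim(span(u, v, w)) is the rank of A; here it is 2 < 3.
        print(A.rank())               # 2

        # Equivalently, the RREF has only two pivot columns.
        print(A.rref()[1])            # (0, 1)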
 

With this background, we are in a position to prove a remarkable result.
 


6.4   Row space and column space of a matrix

 

First, some definitions:

  • Definition: The rowspace of a matrix is the span of its rows, with the rows treated as vectors.

  • Definition: The column space of a matrix, abbreviated as colspace, is the span of its columns, with the columns treated as vectors.

  • Thus, \(\mbox{dim}(\mbox{rowspace}({\bf A}))\) is the minimum number of vectors needed to span the rowspace.

  • Similarly, \(\mbox{dim}(\mbox{colspace}({\bf A}))\) is the minimum number of vectors needed to span the column space.
 

Now for a surprising and powerful result:

  • Theorem 6.6: \(\mbox{dim}(\mbox{rowspace}({\bf A})) = \mbox{dim}(\mbox{colspace}({\bf A})) \)

  • Let's pause for a moment and understand what it's saying, via an example: $$ {\bf A} \eql \mat{1 & 1 & 1 & 0 & 3\\ -1 & 0 & 1 & 1 & -1\\ 0 & 1 & 2 & 1 & 2\\ 0 & 0 & 0 & 0 & 0} $$

  • Here, the rowspace is the set of all vectors that can be generated by $$\eqb{ {\bf r}_1 & \eql & (1,1,1,0,3)\\ {\bf r}_2 & \eql & (-1,0,1,1,-1)\\ {\bf r}_3 & \eql & (0,1,2,1,2) }$$
    • Since \({\bf r}_4 = (0,0,0,0,0)\) doesn't add anything to a linear combination, it's not needed.
    • Notice the row space is in 5D (the vectors are 5-dimensional).
    • It will turn out that \(\mbox{dim}(\mbox{rowspace}({\bf A}))=2\).
    • Thus, even though the vectors are in 5D, fewer are needed to span the rowspace.
    • This is like the 3D examples where two vectors were enough for the plane.

  • The colspace is the span of $$\eqb{ {\bf c}_1 & \eql & (1,-1,0,0)\\ {\bf c}_2 & \eql & (1,0,1,0)\\ {\bf c}_3 & \eql & (1,1,2,0)\\ {\bf c}_4 & \eql & (0,1,1,0)\\ {\bf c}_5 & \eql & (3,-1,2,0) }$$
    • Here, the colspace is in 4D (the vectors in it are 4-dimensional).
    • It will turn out that \(\mbox{dim}(\mbox{colspace}({\bf A}))=2\).
    • Thus, even though the vectors are in 4D, fewer are needed to span the colspace.

  • At this point, it's worth looking at the RREF: $$ \mat{{\bf 1} & 0 & -1 & -1 & 1\\ 0 & {\bf 1} & 2 & 1 & 2\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0} $$

  • If you've guessed that the number of pivot rows (which equals the number of pivot columns) is the dimension, you would be right.

  • But we need to prove this.

  • An important caution:
    • Look at the space spanned by the pivot columns above.
    • It's the plane \((x_1,x_2,0,0)\).
    • This is NOT the space spanned by the columns \({\bf c}_1,\ldots,{\bf c}_5\). For example, the vector \({\bf c}=(2,1,3,0) = {\bf c}_2 + {\bf c}_3\) is in the colspace but not in the space spanned by the pivot columns.
    • However, the dimensions of these two spaces are the same (equal to 2): meaning, at most two vectors are needed to span.
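
  • Here's a minimal Python sketch (assuming sympy is available) illustrating the caution: \({\bf c}=(2,1,3,0)\) is in \(\mbox{colspace}({\bf A})\) but not in the span of the RREF's pivot columns, even though the two spaces have the same dimension:

        from sympy import Matrix

        A = Matrix([
            [ 1, 1, 1, 0,  3],
            [-1, 0, 1, 1, -1],
            [ 0, 1, 2, 1,  2],
            [ 0, 0, 0, 0,  0],
        ])
        c = Matrix([2, 1, 3, 0])

        # c is in colspace(A) exactly when appending c does not raise the rank.
        print(A.rank() == A.row_join(c).rank())   # True: c = c_2 + c_3

        # The RREF's pivot columns span only vectors of the form (x1, x2, 0, 0).
        R, pivots = A.rref()
        P = R.extract(list(range(R.rows)), list(pivots))   # the pivot columns of the RREF
        print(P.rank() == P.row_join(c).rank())   # False: c is not in their span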
 

We will prove Theorem 6.6 in steps:

  • First, recall the three row operations in creating an RREF:
    • Swapping two rows.
    • Scaling a row by a number.
    • Adding to a row a scalar multiple of another row.

  • Now consider an \(m\times n\) matrix \({\bf A}\) with rows \({\bf r}_1, \ldots, {\bf r}_m\) and its rowspace: \(\mbox{span}({\bf r}_1, \ldots, {\bf r}_m)\).

  • Suppose we apply a row operation to \({\bf A}\) and \({\bf A}^\prime\) is the resulting matrix.
     

    In-Class Exercise 15: Show that the rowspace of \({\bf A}^\prime\) is the same as the rowspace of \({\bf A}\). Do this for each of three types of row operations.
     

  • Since the rowspace does not change with row operations, the rowspace of a matrix \({\bf A}\) is the same as the rowspace of its RREF.

  • That is, $$ \mbox{rowspace}({\bf A}) \eql \mbox{rowspace}(RREF({\bf A})) $$

  • Thus $$ \mbox{dim}(\mbox{rowspace}({\bf A})) \eql \mbox{dim}(\mbox{rowspace}(RREF({\bf A}))) $$

  • But, in the RREF, only the pivot rows are nonzero.

  • And the pivot rows are independent.
     

    In-Class Exercise 16: Why are the pivot rows independent?
     

  • Thus, \(\mbox{dim}(\mbox{rowspace}({\bf A}))\) = the number of pivot rows.

  • Now, we know that the number of pivot rows is the same as the number of pivot columns.

  • What we do know is that $$ \mbox{dim}(\mbox{colspace}(RREF({\bf A}))) \eql \mbox{number of pivot columns} $$ We also know that, in general, $$ \mbox{colspace}({\bf A}) \; \neq \; \mbox{colspace}(RREF({\bf A})) $$ And thus, we cannot (yet) conclude $$ \mbox{dim}(\mbox{colspace}({\bf A})) \eql \mbox{number of pivot columns} $$

  • Another way of thinking about this issue:
    • We showed that row ops don't change the row space.
    • But row ops can change the column space, and so the column space of the matrix is not necessarily the same as the column space of its RREF.

  • What we did show earlier (Proposition 6.3): $$\mbox{dim}(\mbox{colspace}({\bf A})) \geq \mbox{dim}(\mbox{colspace}(RREF({\bf A}))) $$ Which means $$\mbox{dim}(\mbox{colspace}({\bf A})) \geq \mbox{dim}(\mbox{rowspace}({\bf A})) $$ Why? Recall:
    • The columns of A corresponding to the pivots are independent.
    • Thus, the dimension of the colspace is at least that much.
    • The number of pivots is the dimension of the rowspace (since \(\mbox{rowspace}({\bf A})=\mbox{rowspace}(RREF{\bf A})\)).

  • To prove the last step, we'll use a really simple, clever trick.

  • For any matrix \({\bf A}\), consider its transpose \({\bf A}^T\).

  • For example: $$ {\bf A} \eql \mat{1 & 1 & 1 & 0 & 3\\ -1 & 0 & 1 & 1 & -1\\ 0 & 1 & 2 & 1 & 2\\ 0 & 0 & 0 & 0 & 0} \;\;\;\;\;\;\;\; {\bf A}^T \eql \mat{1 & -1 & 0 & 0\\ 1 & 0 & 1 & 0\\ 1 & 1 & 2 & 0\\ 0 & 1 & 1 & 0\\ 3 & -1 & 2 & 0} $$

  • Clearly, $$\eqb{ \mbox{colspace}({\bf A}) & \eql & \mbox{rowspace}({\bf A}^T)\\ \mbox{rowspace}({\bf A}) & \eql & \mbox{colspace}({\bf A}^T)\\ }$$

  • So, it must be that $$ \mbox{dim}(\mbox{colspace}({\bf A})) \eql \mbox{dim}(\mbox{rowspace}({\bf A}^T)) \;\; \leq \;\; \mbox{dim}(\mbox{colspace}({\bf A}^T)) \eql \mbox{dim}(\mbox{rowspace}({\bf A})) $$

  • So, we have both:
    • \( \mbox{dim}(\mbox{rowspace}({\bf A})) \leq \mbox{dim}(\mbox{colspace}({\bf A})) \)
            \(\rhd\) (via the independence of pivot columns)
    • \( \mbox{dim}(\mbox{rowspace}({\bf A})) \geq \mbox{dim}(\mbox{colspace}({\bf A})) \)
            \(\rhd\) (via the transpose)

  • In other words $$ \mbox{dim}(\mbox{rowspace}({\bf A})) \eql \mbox{dim}(\mbox{colspace}({\bf A})) \bigsp\Box $$

  • Which proves Theorem 6.6.
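
  • Theorem 6.6 is easy to check numerically for the running example; here's a minimal Python sketch (assuming sympy is available). Of course, checking one example is only a sanity check, not a proof:

        from sympy import Matrix

        A = Matrix([
            [ 1, 1, 1, 0,  3],
            [-1, 0, 1, 1, -1],
            [ 0, 1, 2, 1,  2],
            [ 0, 0, 0, 0,  0],
        ])

        # rank(A) = dim(colspace(A)), and
        # rank(A^T) = dim(colspace(A^T)) = dim(rowspace(A)).
        print(A.rank(), A.T.rank())     # 2 2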

  • The proof of Proposition 6.4 is now trivial because we've proved something far stronger.

  • Let's remind ourselves ...
    Proposition 6.4: The rank of the collection \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) is the number of pivot columns of \(RREF({\bf A})\).
     

    In-Class Exercise 17: Complete the proof of Proposition 6.4.

 


Finally, let's reinterpret the familiar \({\bf Ax}={\bf b}\):

  • Consider this example: $$ \mat{2 & -3 & 2\\ -2 & 1 & 0\\ 1 & 1 & -1} \vecthree{x_1}{x_2}{x_3} \eql \vecthree{5}{-1}{0} $$

  • Written as a linear combination, this is the same as: $$ x_1 \vecthree{2}{-2}{1} \plss x_2 \vecthree{-3}{1}{1} \plss x_3 \vecthree{2}{0}{-1} \eql \vecthree{5}{-1}{0} $$

  • Suppose we call the column vectors \({\bf v}_1, {\bf v}_2, {\bf v}_3\): $$ {\bf v}_1 \eql \vecthree{2}{-2}{1} \;\;\;\;\;\;\;\; {\bf v}_2 \eql \vecthree{-3}{1}{1} \;\;\;\;\;\;\;\; {\bf v}_3 \eql \vecthree{2}{0}{-1} $$

  • Thus, the question
    Is there an \({\bf x}=(x_1,x_2,x_3)\) such that \({\bf Ax}={\bf b}\)?
    is the same as the question
    Are there scalars \(x_1,x_2,x_3\) such that \({\bf b} = x_1{\bf v}_1 + x_2{\bf v}_2 + x_3{\bf v}_3\)?

  • That is, is \({\bf b} \in \mbox{colspace}({\bf A})\)?

  • If there's no solution, the vector \({\bf b} \notin \mbox{colspace}({\bf A})\).
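
  • Here's a minimal Python sketch (assuming sympy is available) of this reinterpretation for the \(3 \times 3\) example above:

        from sympy import Matrix

        A = Matrix([
            [ 2, -3,  2],
            [-2,  1,  0],
            [ 1,  1, -1],
        ])
        b = Matrix([5, -1, 0])

        # b is in colspace(A) exactly when Ax = b has a solution, i.e., when
        # appending b as an extra column does not increase the rank.
        print(A.rank() == A.row_join(b).rank())   # True

        # Here A happens to be invertible, so the solution is unique.
        x = A.solve(b)
        print(x.T)        # x = (2, 3, 5), so b = 2 v_1 + 3 v_2 + 5 v_3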
 

In-Class Exercise 18: Prove or disprove the following. Suppose \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) is a collection of \(m\)-dimensional linearly independent vectors and \({\bf u} \in \mbox{span}({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n)\). Then there is exactly one linear combination of \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) that will produce \({\bf u}\). Consider both cases, \(m=n\) and \(m\neq n\).
 

 


6.5   Basis

 

We've encountered this idea before, but let's revisit:

  • Definition: A basis for a subspace is any minimal collection of vectors that span the subspace.

  • For example, consider the (x,y) plane in 3D.

  • We saw earlier that $$ {\bf u}=(1,1,0), \;\; {\bf v}=(-1,1,0), \;\; {\bf w}=(0,-1,0) $$ span this subspace, but are too many.

  • Also:
    • \({\bf u}\) and \({\bf v}\) are sufficient to span the subspace.
    • No single vector can span the subspace.

  • Thus, the collection \(\{ {\bf u}, {\bf v}\}\) forms a basis for the (x,y)-plane.
     

    In-Class Exercise 19: What is the more natural basis (3D vectors) for the (x,y) plane?

 

Computing a basis:

  • Suppose we are given any collection of vectors, not necessarily independent, and we want to compute a basis of the space spanned by the vectors.

  • That is, suppose we are given vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\).

  • Define the span as $$ {\bf W} \eql \mbox{span} ({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n) $$

  • We'd like a basis for the subspace \({\bf W}\).

  • Based on what we now know about the rowspace and colspace, there are two different ways of identifying a basis for \({\bf W}\).

  • Method 1: use rows
    • Create a matrix \({\bf A}\) with \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) as its rows.
    • Compute \(RREF({\bf A})\).
    • The pivot rows form a basis for the space.

  • Why does this work?
    • The rowspace is exactly \({\bf W}\) by design.
    • The pivot rows span the rowspace.
            \(\rhd\) No more vectors are needed.
    • The dimension of the rowspace is the number of pivot rows
            \(\rhd\) At least that many vectors are needed.

  • Note: the basis so produced might have no vectors in common with the set \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\).

  • Method 2: use columns
    • Create a matrix \({\bf A}\) with \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) as its columns.
    • Compute \(RREF({\bf A})\).
    • The original columns corresponding to the pivot columns form a basis for the colspace.

  • Why does this work?
    • The colspace is exactly \({\bf W}\) by design.
    • Suppose there are \(k\) pivot columns, at positions \(i_1, i_2, \ldots, i_k\).
    • Proposition 6.3 showed that \({\bf v}_{i_1}, {\bf v}_{i_2}, \ldots, {\bf v}_{i_k}\) are independent.
    • Theorem 6.6 showed that \(\mbox{dim}({\bf W}) = k\).
    • So, no smaller basis is possible.

  • One advantage of the column method is that the basis is formed from a subset of \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\).
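
  • Here's a minimal Python sketch of both methods (assuming sympy is available), using the five vectors \({\bf v}_1, \ldots, {\bf v}_5\) from Section 6.2:

        from sympy import Matrix

        vectors = [
            [1, -1, 0, 0],   # v_1
            [1,  0, 1, 0],   # v_2
            [1,  1, 2, 0],   # v_3
            [0,  1, 1, 0],   # v_4
            [3, -1, 2, 0],   # v_5
        ]

        # Method 1: vectors as rows; the nonzero (pivot) rows of the RREF form a basis.
        A_rows = Matrix(vectors)
        R, pivots = A_rows.rref()
        row_basis = [R.row(i) for i in range(len(pivots))]

        # Method 2: vectors as columns; the ORIGINAL columns at the pivot
        # positions form a basis (here, v_1 and v_2).
        A_cols = Matrix(vectors).T
        R2, pivots2 = A_cols.rref()
        col_basis = [A_cols.col(j) for j in pivots2]

        print(row_basis)
        print(col_basis)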
 

Note the difference between rows and columns:

  • For rows:
    • The rowspace of the original matrix \({\bf A}\) is the same as the rowspace of the \(RREF({\bf A})\).
    • That is, any vector that's a linear combination of the rows of \({\bf A}\) is also a linear combination (different scalars) of the rows of \(RREF({\bf A})\).

  • This is not true for columns, as we've seen:
    • The colspace of \({\bf A}\) is not necessarily the same as the colspace of \(RREF({\bf A})\).
 

Let's connect dimension and basis directly:

  • Recall, for a subspace \({\bf W}\)
    • \(\mbox{dim}({\bf W})\) is the minimum number of vectors needed to span \({\bf W}\).
    • A basis is one such minimum collection.

  • Thus, the size of a basis is the dimension.

  • Are the vectors in a basis independent?

  • Proposition 6.7: The vectors in a basis are linearly independent.
    Proof: If \({\bf u}_1, {\bf u}_2, \ldots, {\bf u}_n\) form a basis for subspace \({\bf W}\) but are not all linearly independent, then it is possible to express some \({\bf u}_j\) in terms of the others, as in $$ {\bf u}_j \eql \beta_1 {\bf u}_1 + \ldots + \beta_{j-1}{\bf u}_{j_1} + \beta_{j+1}{\bf u}_{j+1} + \ldots + \beta_n{\bf u}_n $$ Since \({\bf u}_1, {\bf u}_2, \ldots, {\bf u}_n\) form a basis, any \({\bf w} \in {\bf W}\) can be expressed in terms of these vectors, as in: $$ {\bf w} \eql \alpha_1 {\bf u}_1 + \ldots + \alpha_{j-1}{\bf u}_{j_1} + \alpha_j{\bf u}_j + \alpha_{j+1}{\bf u}_{j+1} + \ldots + \alpha_n{\bf u}_n $$ Substituting for \({\bf u}_j\): $$\eqb{ {\bf w} & \eql & \alpha_1 {\bf u}_1 + \ldots + \alpha_{j-1}{\bf u}_{j-1} + (\beta_1 {\bf u}_1 + \ldots + \beta_{j-1}{\bf u}_{j_1} + \beta_{j+1}{\bf u}_{j+1} + \ldots + \beta_n{\bf u}_n) + \alpha_{j+1}{\bf u}_{j+1} + \ldots + \alpha_n{\bf u}_n\\ & \eql & (\alpha_1+\beta_1) {\bf u}_1 + \ldots + (\alpha_{j-1}+\beta_{j-1}){\bf u}_{j-1} + (\alpha_{j+1}+\beta_{j+1}){\bf u}_{j+1} + \ldots + (\alpha_n + \beta_n){\bf u}_n\\ & \eql & \gamma_1 {\bf u}_1 + \ldots + \gamma_{j-1}{\bf u}_{j-1} + \gamma_{j+1}{\bf u}_{j+1} + \ldots + \gamma_n {\bf u}_n\\ & \eql & \mbox{ a linear combination of } {\bf u}_1 + \ldots + {\bf u}_{j-1} + {\bf u}_{j+1} + \ldots + {\bf u}_{n} }$$ with coefficients \(\gamma_i \defn (\alpha_i + \beta_i)\). That is, we've found a smaller set of vectors that spans \({\bf W}\), a contradiction. \(\Box\)
 


6.6   Orthogonality and linear independence

 

Recall what it means for two vectors to be orthogonal:

  • Vectors \({\bf u}\) and \({\bf v}\) are orthogonal if \({\bf u} \cdot {\bf v} = 0\).
          \(\rhd\) The dot-product is zero.

  • The intuition?
    • The angle \(\theta\) between \({\bf u}\) and \({\bf v}\) satisfies $$ \cos(\theta) \eql \frac{{\bf v} \cdot {\bf u}}{ |{\bf v}| |{\bf u}|} $$
    • So \(\cos(\theta) = 0\), i.e., \(\theta = 90^\circ\), exactly when \({\bf u} \cdot {\bf v} = 0\).
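
  • A tiny Python sketch of this computation (assuming numpy is available), using \({\bf u}=(1,1,0)\) and \({\bf v}=(-1,1,0)\) from earlier, which happen to be orthogonal:

        import numpy as np

        u = np.array([1.0, 1.0, 0.0])
        v = np.array([-1.0, 1.0, 0.0])

        cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        print(cos_theta)                          # 0.0, since the dot product is zero
        print(np.degrees(np.arccos(cos_theta)))   # 90.0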
 

Orthogonal vectors are linearly independent:

  • Proposition 6.8: If nonzero vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) are pairwise orthogonal, they are linearly independent.
     

    In-Class Exercise 20: Prove this result:

    • Apply the definition of linear independence to \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\).
    • Then compute the dot product with \({\bf v}_1\) on both sides.
    • What does this imply for the value of \(\alpha_1\)?
    • Complete the proof.
     

    In-Class Exercise 21: Does the same proof work for complex vectors? Recall that the dot-product for complex vectors is a bit different. If the proof fails, why does it fail? If the proof works, explain all the steps in terms of complex vectors.

 


6.7   Summary and study tips

 

Let's summarize a few key points in this module:

  • In this module, we developed a bit of the theory of linear algebra.

  • Spaces (or subspaces), span, rank, dimension, and basis are key concepts in linear algebra.

  • Although the concepts are usually defined for collections of vectors, they're all related through matrices and their RREFs.

  • First consider linear independence (LI):
    • Informally, if $$ {\bf v}_2 \eql 3 {\bf v}_1 + 4 {\bf v}_3 $$ then \({\bf v}_2\) is a linear combination of \({\bf v}_1\) and \({\bf v}_3\), and therefore they aren't independent.
    • Formally, nonzero vectors \({\bf v}_1, {\bf v}_2, \ldots, {\bf v}_n\) are independent if the only solution to the equation $$ \alpha_1 {\bf v}_1 + \alpha_2 {\bf v}_2 + \ldots + \alpha_n {\bf v}_n \eql {\bf 0} $$ is \(\alpha_1 = \alpha_2 = \ldots = \alpha_n = 0\).
    • Or in matrix form (with the vectors as columns): $$ \mat{ & & & \\ \vdots & \vdots & \ldots & \vdots \\ {\bf v}_1 & {\bf v}_2 & \ldots & {\bf v}_n\\ \vdots & \vdots & \ldots & \vdots \\ & & & } \mat{\alpha_1\\ \alpha_2\\ \alpha_3\\ \vdots\\ \alpha_n} \eql \mat{0\\ 0\\ 0\\ \vdots\\ 0} $$ implies \({\bf \alpha} = {\bf 0}\)
    • Which is true exactly when every column of the RREF has a pivot (i.e., the RREF is full column rank).

  • Let's look at an example of dependence.
    • Consider the vectors $$ {\bf v}_1 \eql \vecfour{1}{-1}{0}{0} \;\;\;\;\;\;\;\;\; {\bf v}_2 \eql \vecfour{1}{0}{1}{0} \;\;\;\;\;\;\;\;\; {\bf v}_3 \eql \vecfour{1}{1}{2}{0} \;\;\;\;\;\;\;\;\; {\bf v}_4 \eql \vecfour{0}{1}{1}{0} \;\;\;\;\;\;\;\;\; {\bf v}_5 \eql \vecfour{3}{-1}{2}{0} $$
    • Put them into matrix \({\bf A}\) as columns and compute \(RREF({\bf A})\): the pivot columns turn out to be the first two, so \({\bf v}_1, {\bf v}_2\) form a largest independent subset, and the rank of the collection is 2.

  • The span of a collection of vectors is all the vectors you get as linear combinations of them: $$\mbox{span}({\bf u},{\bf v}) \eql \{ {\bf w}: {\bf w}= \alpha{\bf u} + \beta{\bf v} \}$$

  • Consider the rowspace (span of the rows) and the colspace (span of the columns) via our example: both turn out to have dimension 2.

  • What does this have to do with the RREF? The number of pivots of \(RREF({\bf A})\) equals both \(\mbox{dim}(\mbox{rowspace}({\bf A}))\) and \(\mbox{dim}(\mbox{colspace}({\bf A}))\) (Theorem 6.6), and this common number is the rank of \({\bf A}\).

  • Lastly, an important result: any collection of orthogonal vectors is linearly independent.

  • The most important concepts with practical value are: rank, linear-independence, basis, and what happens as a result of orthogonality.

  • The others (subspace, dimension) are useful in the theory.
 

Study tips:

  • This module has many concepts as well as many theoretical results.

  • Remember: one of the best ways to grapple with a theoretical result is to try to write your own small example that illustrates it.


© 2016, Rahul Simha