Section 9 Important Definitions

9.1 Vector Spaces

span

A set of vectors \(\mathsf{v}_1, \mathsf{v}_2, \ldots, \mathsf{v}_n\) span a vector space \(V\) if for every \(\mathsf{v} \in V\) there exists a set of scalars (weights) \(c_1, c_2, \ldots, c_n \in \mathbb{R}\) such that \[ \mathsf{v} = c_1 \mathsf{v}_1 + c_2 \mathsf{v}_2 + \cdots + c_n \mathsf{v}_n. \] Connection to Matrices: If \(A = [\mathsf{v}_1 \mathsf{v}_2 \cdots \mathsf{v}_n]\) is the matrix with these vectors in the columns, then this is the same as saying that \(\mathsf{x} = [c_1, \ldots, c_n]^{\top}\) is a solution to \(A \mathsf{x} = \mathsf{v}\).
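
For example, the vectors \(\mathsf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\) and \(\mathsf{v}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\) span \(\mathbb{R}^2\), since any \(\mathsf{v} = \begin{bmatrix} a \\ b \end{bmatrix}\) can be written as \[ \mathsf{v} = (a - b) \mathsf{v}_1 + b \mathsf{v}_2. \]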

linear independence

A set of vectors \(\mathsf{v}_1, \mathsf{v}_2,\ldots, \mathsf{v}_n\) are linearly independent if the only way to write \[ \mathsf{0} = c_1 \mathsf{v}_1 + c_2 \mathsf{v}_2 + \cdots + c_n \mathsf{v}_n \] is with \(c_1 = c_2 = \cdots = c_n = 0\).
Connection to Matrices: If \(A = [\mathsf{v}_1 \mathsf{v}_2 \cdots \mathsf{v}_n]\) is the matrix with these vectors in the columns, then this is the same as saying that \(A \mathsf{x} = \mathsf{0}\) has only the trivial solution.
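
For example, \(\mathsf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\) and \(\mathsf{v}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\) are linearly independent: if \(c_1 \mathsf{v}_1 + c_2 \mathsf{v}_2 = \mathsf{0}\), then \(c_1 + c_2 = 0\) and \(c_2 = 0\), which forces \(c_1 = c_2 = 0\).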

linear dependence

Conversely, a set of vectors \(\mathsf{v}_1, \mathsf{v}_2, \ldots, \mathsf{v}_n\) are linearly dependent if there exist scalars \(c_1, c_2, \ldots, c_n \in \mathbb{R}\) that are not all equal to 0 such that \[ \mathsf{0} = c_1 \mathsf{v}_1 + c_2 \mathsf{v}_2 + \cdots + c_n \mathsf{v}_n. \] This is called a dependence relation among the vectors.
Connection to Matrices: If \(A = [\mathsf{v}_1 \mathsf{v}_2 \cdots \mathsf{v}_n]\) is the matrix with these vectors in the columns, then this is the same as saying that \(\mathsf{x} = [c_1, c_2, \ldots, c_n]^{\top}\) is a nontrivial solution to \(A \mathsf{x} = \mathsf{0}\).
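
For example, \(\mathsf{v}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}\) and \(\mathsf{v}_2 = \begin{bmatrix} 2 \\ 4 \end{bmatrix}\) are linearly dependent, with dependence relation \[ 2 \mathsf{v}_1 - \mathsf{v}_2 = \mathsf{0}. \]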

linear transformation

A function \(T: \mathbb{R}^n \to \mathbb{R}^m\) is a linear transformation when:

  • \(T(\mathsf{u} + \mathsf{v}) = T(\mathsf{u}) + T(\mathsf{v})\) for all \(\mathsf{u}, \mathsf{v} \in \mathbb{R}^n\) (preserves addition)
  • \(T(c \mathsf{u} ) = c T(\mathsf{u})\) for all \(\mathsf{u} \in \mathbb{R}^n\) and \(c \in \mathbb{R}\) (preserves scalar multiplication)

It follows from these that also \(T(\mathsf{0}) = \mathsf{0}\).
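
For example, \(T \left( \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right) = \begin{bmatrix} x_1 + x_2 \\ 2 x_1 \end{bmatrix}\) is a linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}^2\); it is given by the matrix \(A = \begin{bmatrix} 1 & 1 \\ 2 & 0 \end{bmatrix}\), so that \(T(\mathsf{x}) = A \mathsf{x}\). In contrast, \(T(x) = x + 1\) is not linear because \(T(0) \ne 0\).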

one-to-one

A function \(T: \mathbb{R}^n \to \mathbb{R}^m\) is one-to-one when:

for all \(\mathsf{y} \in \mathbb{R}^m\) there is at most one \(\mathsf{x} \in \mathbb{R}^n\) such that \(T(\mathsf{x}) = \mathsf{y}\).

onto

A function \(T: \mathbb{R}^n \to \mathbb{R}^m\) is onto when:

for all \(\mathsf{y} \in \mathbb{R}^m\) there is at least one \(\mathsf{x} \in \mathbb{R}^n\) such that \(T(\mathsf{x}) = \mathsf{y}\).
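
For example, \(T(\mathsf{x}) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \mathsf{x}\) from \(\mathbb{R}^2\) to \(\mathbb{R}^3\) is one-to-one but not onto (nothing maps to \([0, 0, 1]^{\top}\)), while \(T(\mathsf{x}) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathsf{x}\) from \(\mathbb{R}^3\) to \(\mathbb{R}^2\) is onto but not one-to-one.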

subspace

A subset \(S \subseteq \mathbb{R}^n\) is a subspace when:

  • \(\mathsf{u} + \mathsf{v} \in S\) for all \(\mathsf{u}, \mathsf{v} \in S\) (closed under addition)
  • \(c \mathsf{u} \in S\) for all \(\mathsf{u}\in S\) and \(c \in \mathbb{R}\) (closed under scalar multiplication)

It follows from these that also \(\mathsf{0} \in S\).
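
For example, the line \(S = \left\{ \begin{bmatrix} t \\ 2t \end{bmatrix} : t \in \mathbb{R} \right\}\) is a subspace of \(\mathbb{R}^2\), since sums and scalar multiples of vectors on this line stay on the line. The line \(\left\{ \begin{bmatrix} t \\ 2t + 1 \end{bmatrix} : t \in \mathbb{R} \right\}\) is not a subspace because it does not contain \(\mathsf{0}\).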

basis

A basis of a vector space (or subspace) \(V\) is a set of vectors \(\mathcal{B} = \{\mathsf{v}_1, \mathsf{v}_2, \ldots, \mathsf{v}_n\}\) in \(V\) such that

  • \(\mathsf{v}_1, \mathsf{v}_2, \ldots, \mathsf{v}_n\) span \(V\)
  • \(\mathsf{v}_1, \mathsf{v}_2, \ldots, \mathsf{v}_n\) are linearly independent

Equivalently, one can say that \(\mathcal{B} = \{\mathsf{v}_1, \mathsf{v}_2, \ldots, \mathsf{v}_n\}\) is a basis of \(V\) if for every vector \(\mathsf{v} \in V\) there is a unique set of scalars \(c_1, \ldots, c_n\) such that \[ \mathsf{v} = c_1 \mathsf{v}_1 + c_2 \mathsf{v}_2 + \cdots + c_n \mathsf{v}_n. \] (The existence of such scalars comes from the span; their uniqueness comes from linear independence.)
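
For example, \(\mathcal{B} = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}\) is a basis of \(\mathbb{R}^2\): these vectors span \(\mathbb{R}^2\) and are linearly independent, so every \(\mathsf{v} = \begin{bmatrix} a \\ b \end{bmatrix}\) has the unique expansion \(\mathsf{v} = (a - b) \begin{bmatrix} 1 \\ 0 \end{bmatrix} + b \begin{bmatrix} 1 \\ 1 \end{bmatrix}\).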

dimension

The dimension of a subspace \(W\) is the number of vectors in any basis of \(W\). This is also the smallest number of vectors needed to span the subspace.
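
For example, the subspace \(W = \mbox{span} \left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right)\) of \(\mathbb{R}^3\) has dimension 2, since these two vectors form a basis of \(W\).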

9.2 Matrices

invertible

The square \(n \times n\) matrix \(A\) is invertible when there exists an \(n \times n\) matrix \(A^{-1}\) such that \(A A^{-1} = I = A^{-1} A\). The Invertible Matrix Theorem collects over two dozen equivalent conditions, each of which guarantees that \(A\) is invertible.
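
For example, \(A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}\) is invertible with \(A^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}\), as multiplying the two matrices confirms. A square matrix with linearly dependent columns, such as \(\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\), is not invertible.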

null space

The null space \(\mbox{Nul}(A) \subset \mathbb{R}^n\) of the \(m \times n\) matrix \(A\) is the set of solutions to the homogeneous equation \(A \mathsf{x} = \mathsf{0}\). We also write this as \[ \mbox{Nul}(A) = \{ \mathsf{x} \in \mathbb{R}^n : A \mathsf{x} = \mathsf{0} \}. \] Connection to Linear Transformations: If \(T(\mathsf{x}) = A \mathsf{x}\), then the kernel of \(T\) is the null space of matrix \(A\).
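
For example, for \(A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\), solving \(A \mathsf{x} = \mathsf{0}\) gives \(x_1 = -2 x_2\), so \[ \mbox{Nul}(A) = \mbox{span} \left( \begin{bmatrix} -2 \\ 1 \end{bmatrix} \right). \]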

column space

The column space \(\mbox{Col}(A) \subset \mathbb{R}^m\) of the \(m \times n\) matrix \(A\) is the set of all linear combinations of the columns of \(A\). For \(A = \begin{bmatrix} \mathsf{a}_1 & \mathsf{a}_2 & \cdots & \mathsf{a}_n \end{bmatrix}\), we have \[ \mbox{Col}(A) = \mbox{span} ( \mathsf{a}_1, \mathsf{a}_2, \ldots , \mathsf{a}_n ). \] We also write this as \[ \mbox{Col}(A) = \{ \mathsf{b} \in \mathbb{R}^m : \mathsf{b} = A \mathsf{x} \mbox{ for some } \mathsf{x} \in \mathbb{R}^n \}. \] Connection to Linear Transformations: If \(T(\mathsf{x}) = A \mathsf{x}\), then the range (also called the image) of \(T\) is the column space of matrix \(A\).
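
For example, for \(A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\), the second column is twice the first, so \[ \mbox{Col}(A) = \mbox{span} \left( \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right), \] which is a line in \(\mathbb{R}^2\).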

rank

The rank of the \(m \times n\) matrix \(A\) is the dimension of the column space of \(A\). This is also the number of pivot columns of the matrix.
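
For example, \(A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\) has rank 1: it has one pivot column, and \(\mbox{Col}(A)\) is a line. In general, \(\mbox{rank}(A) + \dim \mbox{Nul}(A) = n\) (here \(1 + 1 = 2\)).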

eigenvalue and eigenvector

For a square \(n \times n\) matrix \(A\), the scalar \(\lambda \in \mathbb{R}\) is an eigenvalue for \(A\) when there exists a nonzero vector \(\mathsf{x} \in \mathbb{R}^n\) such that \(A \mathsf{x} = \lambda \mathsf{x}\). The nonzero vector \(\mathsf{x}\) is an eigenvector for the eigenvalue \(\lambda\). The collection of all of these eigenvalues and eigenvectors is called the eigensystem of \(A\).
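
For example, for \(A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\) we have \[ A \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \mbox{and} \quad A \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \] so \(\lambda_1 = 3\) and \(\lambda_2 = 1\) are eigenvalues of \(A\) with eigenvectors \([1, 1]^{\top}\) and \([1, -1]^{\top}\).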

diagonalization

A square \(n \times n\) matrix \(A\) is diagonalizable when \(A = P D P^{-1}\), where \(D\) is a diagonal matrix and \(P\) is an invertible matrix. In this case, the eigenvalues of \(A\) are the diagonal entries of \(D\) and their corresponding eigenvectors are the columns of \(P\).
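
Continuing the example above, \(A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\) is diagonalizable with \[ P = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad D = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}, \] and one can check that \(A = P D P^{-1}\).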

dominant eigenvalue

The eigenvalue \(\lambda\) of the square matrix \(A\) is the dominant eigenvalue when \(| \lambda | > | \mu |\) for every other eigenvalue \(\mu\) of \(A\). The dominant eigenvalue determines the long-term behavior of \(A^t\) as \(t \rightarrow \infty\).
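
In the example above, \(\lambda_1 = 3\) is the dominant eigenvalue of \(A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\) because \(|3| > |1|\). For large \(t\), the vector \(A^t \mathsf{x}\) grows like \(3^t\) and points approximately in the direction of the eigenvector \([1, 1]^{\top}\) for almost every starting vector \(\mathsf{x}\).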

9.3 Orthogonality

length

The length of a vector \(\mathsf{v}\) is \[ \| \mathsf{v} \| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}. \]
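
For example, \(\mathsf{v} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}\) has length \(\| \mathsf{v} \| = \sqrt{9 + 16} = 5\).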

distance and angle

The distance between vectors \(\mathsf{u}\) and \(\mathsf{v}\) is \[ \mbox{dist}(\mathsf{u},\mathsf{v}) = \| \mathsf{u} - \mathsf{v} \|. \] The angle \(\theta\) between these vectors is determined by \[ \cos \theta = \frac{\mathsf{u} \cdot \mathsf{v}}{ \| \mathsf{u} \| \, \| \mathsf{v} \|}. \]
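
For example, with \(\mathsf{u} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\) and \(\mathsf{v} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\), we have \(\mbox{dist}(\mathsf{u},\mathsf{v}) = \left\| \begin{bmatrix} 0 \\ -1 \end{bmatrix} \right\| = 1\) and \(\cos \theta = \frac{1}{1 \cdot \sqrt{2}}\), so \(\theta = \pi/4\).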

orthogonal

The vectors \(\mathsf{u}\) and \(\mathsf{v}\) are orthogonal when \(\mathsf{u} \cdot \mathsf{v} = 0\). This means that either one of them is the zero vector, or they are perpendicular to one another.
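
For example, \(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\) and \(\begin{bmatrix} 2 \\ -1 \end{bmatrix}\) are orthogonal since \(1 \cdot 2 + 2 \cdot (-1) = 0\).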

orthogonal complement

If \(W \subset \mathbb{R}^n\) is a subspace, then its orthogonal complement \(W^{\perp}\) is the set of all vectors in \(\mathbb{R}^n\) that are orthogonal to every vector in \(W\). We also write \[ W^{\perp} = \{ \mathsf{v} \in \mathbb{R}^n : \mathsf{v} \cdot \mathsf{w} = 0 \mbox{ for all } \mathsf{w} \in W \}. \]
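
For example, if \(W = \mbox{span} \left( \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right)\) in \(\mathbb{R}^2\), then \(W^{\perp} = \mbox{span} \left( \begin{bmatrix} 2 \\ -1 \end{bmatrix} \right)\), the line of all vectors orthogonal to \(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\).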

orthonormal set

A collection of vectors \(\mathsf{u}_1, \mathsf{u}_2, \ldots, \mathsf{u}_k\) is an orthonormal set when every vector has length 1 and the vectors are pairwise orthogonal.
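
For example, \(\mathsf{u}_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}\) and \(\mathsf{u}_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}\) form an orthonormal set: each has length 1 and \(\mathsf{u}_1 \cdot \mathsf{u}_2 = 0\).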

orthogonal matrix

A square \(n \times n\) matrix \(P\) is an orthogonal matrix when its columns are an orthonormal set. As a result, we have \(P^{-1} = P^{\top}\).
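
For example, \(P = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\) is an orthogonal matrix, and one can check that \(P^{\top} P = I\).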

projection and residual

The orthogonal projection of a vector \(\mathsf{y}\) onto a subspace \(W\) is the unique vector \(\hat{\mathsf{y}} \in W\) such that \(\mathsf{z} = \mathsf{y} - \hat{\mathsf{y}} \in W^{\perp}\). The vector \(\mathsf{z}\) is called the residual vector for the projection.
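
For example, to project \(\mathsf{y} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}\) onto the line \(W = \mbox{span} \left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right)\), the formula \(\hat{\mathsf{y}} = \frac{\mathsf{y} \cdot \mathsf{w}}{\mathsf{w} \cdot \mathsf{w}} \, \mathsf{w}\) with \(\mathsf{w} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\) gives \(\hat{\mathsf{y}} = \begin{bmatrix} 5/2 \\ 5/2 \end{bmatrix}\), and the residual \(\mathsf{z} = \mathsf{y} - \hat{\mathsf{y}} = \begin{bmatrix} -1/2 \\ 1/2 \end{bmatrix}\) is orthogonal to \(\mathsf{w}\).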

9.4 Spectral Decomposition

orthogonal diagonalization

Every symmetric \(n \times n\) matrix is orthogonally diagonalizable, meaning that we have \(A = P D P^{\top}\) where \(D\) is a diagonal matrix and \(P\) is an orthogonal matrix. The diagonal entries of \(D\) are the eigenvalues of \(A\) and the columns of \(P\) are the corresponding orthonormal eigenvectors. Furthermore, the eigenvalues of \(A\) are real numbers.
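
For example, the symmetric matrix \(A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\) is orthogonally diagonalizable as \[ A = P D P^{\top} \quad \mbox{with} \quad P = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad D = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}. \]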

spectral decomposition

A symmetric matrix \(A\) can be written as a linear combination of rank 1 matrices derived from the orthonormal eigensystem of \(A\). In particular, we have \[ A = \lambda_1 \mathsf{u}_1 \mathsf{u}_1^{\top} + \lambda_2 \mathsf{u}_2 \mathsf{u}_2^{\top} + \cdots + \lambda_n \mathsf{u}_n \mathsf{u}_n^{\top}. \] This linear combination of rank 1 matrices is called the spectral decomposition of \(A\).
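
Continuing the example above, with \(\mathsf{u}_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}\) and \(\mathsf{u}_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}\), \[ \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = 3 \, \mathsf{u}_1 \mathsf{u}_1^{\top} + 1 \, \mathsf{u}_2 \mathsf{u}_2^{\top} = 3 \cdot \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + 1 \cdot \frac{1}{2} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}. \]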

singular value decomposition (SVD)

Any \(m \times n\) matrix \(A\) of rank \(r\) can be factored into a singular value decomposition \(A = U \Sigma V^{\top}\) where

  • \(U\) is an \(m \times m\) orthogonal matrix,
  • \(\Sigma\) is an \(m \times n\) matrix whose only nonzero entries are the positive numbers \(\sigma_1, \ldots , \sigma_r\), which appear in decreasing order on the diagonal, and
  • \(V\) is an \(n \times n\) orthogonal matrix.

The nonzero entries of \(\Sigma\) are called the singular values of \(A\). The columns of \(U\) are the left singular vectors and the rows of \(V^{\top}\) are the right singular vectors.
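
For example, the rank 1 matrix \(A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\) has the singular value decomposition \[ A = U \Sigma V^{\top} \quad \mbox{with} \quad U = V = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}, \] so its single singular value is \(\sigma_1 = 2\).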

SVD spectral decomposition

Any \(m \times n\) matrix \(A\) of rank \(r\) can be written as a linear combination of rank 1 matrices derived from the singular value decomposition of \(A\). In particular, we have \[ A = \sigma_1 \mathsf{u}_1 \mathsf{v}_1^{\top} + \sigma_2 \mathsf{u}_2 \mathsf{v}_2^{\top} + \cdots + \sigma_r \mathsf{u}_r \mathsf{v}_r^{\top}. \] This linear combination of rank 1 matrices is called the (SVD) spectral decomposition of \(A\).
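
In the example above, \(r = 1\) and \[ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \sigma_1 \mathsf{u}_1 \mathsf{v}_1^{\top} = 2 \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \end{bmatrix}. \] Truncating this sum after \(k < r\) terms gives a rank \(k\) approximation of \(A\).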