# Week 10 - Vector Spaces, Orthogonality, and Linear Least-Squares

## 1. Solving underdetermined systems

• Important attributes of a linear system $Ax = b$ and associated matrix A:
• (example:$\left(\begin{array}{c c c c}1 & 3 & 1 & 2 \\ 2 & 6 & 4 & 8 \\ 0 & 0 & 2 & 4\end{array}\right)\left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \\ \chi_3\end{array}\right) = \left(\begin{array}{c} 1 \\ 3 \\ 1\end{array}\right)$)
• The row-echelon form of the system.
• The pivots.
• the first nonzero entry in each row of the row-echelon form: here $1$ (column 0) and $2$ (column 2).
• The free variables.
• the variables corresponding to the columns that have no pivots: $\chi_1, \chi_3$
• The dependent variables.
• the variables corresponding to the columns that have pivots: $\chi_0, \chi_2$
• A specific solution.
• Often called a particular solution.
• The most straightforward way is to set the free variables equal to zero
• => $\chi_1 = \chi_3 = 0$
• => $\left(\begin{array}{c c c c}1 & 3 & 1 & 2 \\ 0 & 0 & 2 & 4\end{array}\right)\left(\begin{array}{c} \chi_0 \\ 0 \\ \chi_2 \\ 0\end{array}\right) = \left(\begin{array}{c} 1 \\ 1\end{array}\right)$
• => $x_p = \left(\begin{array}{c} 1/2 \\ 0 \\ 1/2 \\ 0\end{array}\right)$
• A basis for the null space.
• Often called the kernel of the matrix.
• $\chi_0 + 3\chi_1 + \chi_2 + 2\chi_3 = 0,\ 2\chi_2 + 4\chi_3 = 0$ => $\chi_2 = -2\chi_3,\ \chi_0 = -3\chi_1 - \chi_2 - 2\chi_3 = -3\chi_1$
• $\begin{bmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \\ \chi_3 \end{bmatrix} = \chi_1 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + \chi_3\begin{bmatrix} 0 \\ 0 \\ -2 \\ 1\end{bmatrix}$
• So the basis for $\mathcal{N}(A)$ gives $\mathcal{N}(A) = \text{Span }(\begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 0 \\ -2 \\ 1\end{bmatrix})$
• A general solution.
• Often called a complete solution.
• given by:
• $\begin{bmatrix} 1/2 \\ 0 \\ 1/2 \\ 0 \end{bmatrix} + \beta_0 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + \beta_1 \begin{bmatrix} 0 \\ 0 \\ -2 \\ 1\end{bmatrix}$
• A basis for the column space, $\mathcal{C}(A)$.
• Often called the range of the matrix.
• Its dimension is equal to the number of dependent variables.
• The columns that have pivots in the row-echelon form are linearly independent. The corresponding columns in the original matrix are also linearly independent: here columns 0 and 2, so $\mathcal{C}(A) = \text{Span }(\begin{bmatrix}1 \\ 2 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 4 \\ 2\end{bmatrix})$
• A basis for the row space, $\mathcal{R}(A) = \mathcal{C}(A^T)$.
• The row space is the subspace of all vectors that can be created by taking linear combinations of the rows of a matrix.
• List the rows that have pivots in the row-echelon form as column vectors: $\begin{bmatrix}1 \\ 3 \\ 1 \\ 2\end{bmatrix}, \begin{bmatrix}0 \\ 0 \\ 2 \\ 4\end{bmatrix}$
• Notice these correspond to the first and third rows of $A$.
• The dimension of the row and column space.
• = number of pivots
• = 2
• The rank of the matrix.
• = number of pivots
• = 2
• The dimension of the null space.
• = the number of columns without pivots
• = 2
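The attributes above can be checked numerically. A minimal sketch using NumPy (the particular solution and null-space basis vectors are taken from the derivation above):

```python
import numpy as np

A = np.array([[1, 3, 1, 2],
              [2, 6, 4, 8],
              [0, 0, 2, 4]], dtype=float)
b = np.array([1, 3, 1], dtype=float)

# Rank = number of pivots = 2
print(np.linalg.matrix_rank(A))  # 2

# Particular solution found by setting the free variables to zero
x_p = np.array([0.5, 0, 0.5, 0])
print(np.allclose(A @ x_p, b))  # True

# Basis vectors for the null space N(A)
n1 = np.array([-3, 1, 0, 0], dtype=float)
n2 = np.array([0, 0, -2, 1], dtype=float)
print(np.allclose(A @ n1, 0), np.allclose(A @ n2, 0))  # True True

# Any general solution x_p + b0*n1 + b1*n2 also solves Ax = b
x = x_p + 2.0 * n1 - 1.5 * n2
print(np.allclose(A @ x, b))  # True
```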

## 2. Orthogonal Vectors & Orthogonal Spaces

• Vectors x and y are considered to be orthogonal (perpendicular) if they meet at a right angle: $x^T y = 0$

### 2.1. Normal Vector

• The normal vector, often simply called the "normal," to a surface is a vector which is perpendicular to the surface at a given point.
• For example:
• Define the plane by the equation $ax + by + cz = d$.
• Vector $\vec{n} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ is normal to the plane.
• Vector $\vec{x_0} = \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix}$ points to a fixed point on the plane.
• Vector $\vec{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$ points to an arbitrary point on the plane.
• Then $\vec{x} - \vec{x_0}$ lies in the plane and is perpendicular to $\vec{n}$
• Then $(\vec{x} - \vec{x_0})^T \vec{n} = 0$
• $\begin{bmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{bmatrix}^T \begin{bmatrix} a \\ b \\ c \end{bmatrix} = 0$
• $a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$
• So we can represent the plane by $ax + by + cz = ax_0 + by_0 + cz_0$.
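As a concrete sketch (the particular normal and point here are hypothetical values chosen for illustration), the plane equation follows directly from the normal and one known point:

```python
import numpy as np

# Hypothetical example: plane through x0 with normal n
n = np.array([1.0, 2.0, 3.0])    # normal vector (a, b, c)
x0 = np.array([1.0, 0.0, 2.0])   # a known point on the plane

# Plane equation: a*x + b*y + c*z = d, with d = n . x0
d = n @ x0  # 1*1 + 2*0 + 3*2 = 7

# Any x = x0 + v with v perpendicular to n stays on the plane
v = np.array([3.0, 0.0, -1.0])   # (3,0,-1) . (1,2,3) = 0
x = x0 + v
print(np.isclose(n @ x, d))  # True
```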

#### Cross Product

• The cross product, or vector product, is a binary operation on two vectors in three-dimensional space ($\mathbb{R}^3$), denoted by the symbol $\times$.
• $\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} \times \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} a_1 b_2 - a_2 b_1 \\ a_2 b_0 - a_0 b_2 \\ a_0 b_1 - a_1 b_0 \end{bmatrix}$
• Given two linearly independent vectors a and b, the cross product, a × b, is a vector that is perpendicular to both a and b and thus normal to the plane containing them.
• Because $\begin{bmatrix} a_1 b_2 - a_2 b_1 \\ a_2 b_0 - a_0 b_2 \\ a_0 b_1 - a_1 b_0 \end{bmatrix}^T \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = 0$ and $\begin{bmatrix} a_1 b_2 - a_2 b_1 \\ a_2 b_0 - a_0 b_2 \\ a_0 b_1 - a_1 b_0 \end{bmatrix}^T \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = 0$
• So we can use vectors a and b to get n. ($n = a \times b$)
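A quick numerical check of the component formula (NumPy's built-in `np.cross` implements the same definition):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 1.0, 4.0])

# Cross product via the component formula above
n = np.array([a[1]*b[2] - a[2]*b[1],
              a[2]*b[0] - a[0]*b[2],
              a[0]*b[1] - a[1]*b[0]])

print(np.allclose(n, np.cross(a, b)))  # True: matches NumPy's built-in
print(n @ a, n @ b)  # both 0: n is perpendicular to a and b
```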

#### Visualizing a column space as a plane in $\mathbb{R}^3$

• For example:
• $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & 4 & 3 \\ 3 & 4 & 1 & 2 \end{bmatrix}$. $\text{rref }(A) = \begin{bmatrix} 1 & 0 & 3 & 2 \\ 0 & 1 & -2 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
• rref: reduced row-echelon form.
• $\mathcal{C}(A) = \text{Span }(\begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix}, \begin{bmatrix}1 \\ 1 \\ 4\end{bmatrix} )$
• Let $n$ be the normal vector to $\mathcal{C}(A)$, and let $\begin{bmatrix}x \\ y \\ z\end{bmatrix}$ point to an arbitrary point on the plane. Then:
• Using the cross product, we get $n = \begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix} \times \begin{bmatrix}1 \\ 1 \\ 4\end{bmatrix} = \begin{bmatrix}5 \\ -1 \\ -1\end{bmatrix}$
• $n \cdot (\begin{bmatrix}x \\ y \\ z\end{bmatrix} - \begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix}) = 0$
• $5x - y - z = 0$ <=> $\mathcal{C}(A)$
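A sanity check that every column of $A$ (and hence every vector in $\mathcal{C}(A)$) satisfies $5x - y - z = 0$:

```python
import numpy as np

A = np.array([[1, 1, 1, 1],
              [2, 1, 4, 3],
              [3, 4, 1, 2]], dtype=float)

# Normal to the column space: n = (1,2,3) x (1,1,4) = (5,-1,-1)
n = np.cross(A[:, 0], A[:, 1])
print(n)  # [ 5. -1. -1.]

# n . v = 0 for every column v of A, i.e. 5x - y - z = 0 on C(A)
print(n @ A)  # [0. 0. 0. 0.]
```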

### 2.2. Orthogonal Spaces

• Definition: Let $V, W \subset \mathbb{R}^n$ be subspaces. Then $V$ and $W$ are said to be orthogonal iff $v \in V$ and $w \in W$ implies $v^T w = 0$. Denoted by $V \perp W$
• Definition: Given subspace $V \subset \mathbb{R}^n$, the set of all vectors in $\mathbb{R}^n$ that are orthogonal to $V$ is denoted by $V^{\perp}$ (pronounced as “V-perp”).

### 2.3. Fundamental Spaces

• Recall some definitions. Let $A \in \mathbb{R}^{m \times n}$ and have k pivots. Then:
• Column space: $\mathcal{C}(A) = \{y|y = Ax\}\subset \mathbb{R}^m$.
• dimension: k
• Null space: $\mathcal{N}(A) = \{x|Ax = 0\} \subset \mathbb{R}^n$.
• dimension: n - k
• here $0$ is the zero vector $\in \mathbb{R}^m$ (since $Ax \in \mathbb{R}^m$)
• Row space: $\mathcal{R}(A) = \mathcal{C}(A^T) =\{y|y = A^T x\} \subset \mathbb{R}^n$.
• dimension: k
• Left null space: $\mathcal{N}(A^T) = \{x|x^T A = 0^T\} \subset \mathbb{R}^m$.
• dimension: m - k
• here $0^T$ is the zero row vector with $n$ entries (since $x^T A$ has $n$ entries)
• Theorem: Let $A \in \mathbb{R}^{m \times n}$. Then:

• $\mathcal{R}(A) \perp \mathcal{N}(A)$.
• every $x \in \mathbb{R}^n$ can be written as $x = x_r + x_n$ where $x_r \in \mathcal{R}(A)$ and $x_n \in \mathcal{N}(A)$.
• $A$ is a one-to-one, onto mapping from $\mathcal{R}(A)$ to $\mathcal{C}(A)$.
• $\mathcal{N}(A^T)$ is orthogonal to $\mathcal{C}(A)$ and the dimension of $\mathcal{N}(A^T)$ equals $m-r$, where $r$ is the dimension of $\mathcal{C}(A)$.
• For example: $A = \begin{bmatrix}2 & -1 & -3 \\ -4 & 2 & 6\end{bmatrix}$
• $T(x) = Ax$, $T: \mathbb{R}^3 \rightarrow \mathbb{R}^2$
• $\mathcal{C}(A) = \text{Span }(\begin{bmatrix} 2 \\ -4 \end{bmatrix}) \subseteq \mathbb{R}^2$
• $\mathcal{N}(A^T) = \text{Span }(\begin{bmatrix} 2 \\ 1 \end{bmatrix}) \subseteq \mathbb{R}^2$
• $\mathcal{N}(A) = \text{Span }(\begin{bmatrix} \frac{1}{2} \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} \frac{3}{2} \\ 0 \\ 1 \end{bmatrix}) \subseteq \mathbb{R}^3$
• $\mathcal{R}(A) = \mathcal{C}(A^T) = \text{Span }(\begin{bmatrix} 2 \\ -1 \\ -3 \end{bmatrix}) \subseteq \mathbb{R}^3$
• => $\mathcal{R}(A) \perp \mathcal{N}(A)$
• => $\mathcal{C}(A) \perp \mathcal{N}(A^T)$
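The orthogonality of the fundamental spaces can be verified numerically for this example, using the basis vectors derived above:

```python
import numpy as np

A = np.array([[2, -1, -3],
              [-4, 2, 6]], dtype=float)

# Basis vectors from the example above
r  = np.array([2, -1, -3], dtype=float)   # spans R(A)
n1 = np.array([0.5, 1, 0])                # basis vector of N(A)
n2 = np.array([1.5, 0, 1])                # basis vector of N(A)
c  = np.array([2, -4], dtype=float)       # spans C(A)
l  = np.array([2, 1], dtype=float)        # spans N(A^T)

# n1, n2 really lie in the null space: A n = 0
print(np.allclose(A @ n1, 0), np.allclose(A @ n2, 0))  # True True

print(r @ n1, r @ n2)  # 0.0 0.0 : R(A) is perpendicular to N(A)
print(c @ l)           # 0.0     : C(A) is perpendicular to N(A^T)
```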

## 3. Approximating a Solution

• Find a line $y = \gamma_0 + \gamma_1 x$ that best fits these points:
• $x = \left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \\ \chi_3 \end{array}\right) = \left(\begin{array}{c}1 \\ 2 \\ 3 \\ 4\end{array}\right) \text{ and } y = \left(\begin{array}{c} \psi_0 \\ \psi_1 \\ \psi_2 \\ \psi_3 \end{array}\right) = \left(\begin{array}{c}1.97 \\ 6.97 \\ 8.89 \\ 10.01\end{array}\right)$
• Clearly, no line can pass through all of these points, so what is the best approximation?
• Set $A = \begin{bmatrix}1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4\end{bmatrix}, b = \begin{bmatrix}1.97 \\ 6.97 \\ 8.89 \\ 10.01\end{bmatrix}$
• We've learned before that $Ax=b$ has a solution iff $b \in \mathcal{C}(A)$. In other words, $b$ lies in the span of the columns of $A$.
• So, here we are solving $Ax \approx b$.
• Let $z$ be the projection of $b$ onto $\mathcal{C}(A)$, so that $A\hat{x} = z$.

• We can get

• $b = z + w$ where $w^T v = 0$ for all $v \in \mathcal{C}(A)$.
• Also $w \in \mathcal{C}(A)^{\perp}$ => $w \in \mathcal{N}(A^T)$. So, $A^Tw = 0$ (same as $w^T A = 0^T$), which means
• $0 = A^Tw = A^T(b - z) = A^T(b - A \hat{x})$
• Rewrite it, we get $A^TA \hat{x} = A^T b$.
• This is known as the normal equation associated with the problem $A\hat{x} \approx b$.
• When $A^TA$ is nonsingular (which holds exactly when $A$ has linearly independent columns), then
• $\hat{x} = (A^T A)^{-1} A^T b$
• And the vector $z \in \mathcal{C}(A)$ closest to $b$ is given by
• $z = A \hat{x} = A (A^T A)^{-1} A^T b$
• This shows that if $A$ has linearly independent columns, then $z = A \hat{x} = A (A^T A)^{-1} A^T b$ is the vector in the column space closest to $b$. This is the projection of $b$ onto the column space of $A$.
• And the “best” solution is known as the “linear least-squares” solution.
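The worked example above can be solved directly with the normal equations. A minimal NumPy sketch (for comparison, `np.linalg.lstsq` computes the same least-squares solution):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.97, 6.97, 8.89, 10.01])

# Design matrix for the line y = g0 + g1 * x
A = np.column_stack([np.ones_like(x), x])

# Solve the normal equations A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)  # [0.45  2.604]

# Same result from NumPy's least-squares routine
x_hat2, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_hat2))  # True

# z = A x_hat is the projection of b onto C(A);
# the residual b - z is orthogonal to the column space
z = A @ x_hat
print(np.allclose(A.T @ (b - z), 0))  # True
```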

## 4. Words

• orthogonality [,ɔ:θɔɡə'næləti] n. the property of being orthogonal; mutual perpendicularity