Week 10 - Vector Spaces, Orthogonality, and Linear Least-Squares

1. Solving underdetermined systems

  • Important attributes of a linear system $Ax = b$ and associated matrix $A$:
    • (example: $\left(\begin{array}{c c c c}1 & 3 & 1 & 2 \\ 2 & 6 & 4 & 8 \\ 0 & 0 & 2 & 4\end{array}\right)\left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \\ \chi_3\end{array}\right) = \left(\begin{array}{c} 1 \\ 3 \\ 1\end{array}\right)$)
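    • The row-echelon form of the system.
      • here: $\left(\begin{array}{c c c c | c}1 & 3 & 1 & 2 & 1 \\ 0 & 0 & 2 & 4 & 1 \\ 0 & 0 & 0 & 0 & 0\end{array}\right)$
    • The pivots.
      • the first nonzero entry in each nonzero row of the row-echelon form: 1, 2.
    • The free variables.
      • the variables whose columns have no pivots: $\chi_1, \chi_3$
    • The dependent variables.
      • the variables whose columns have pivots: $\chi_0, \chi_2$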
    • A specific solution.
      • Often called a particular solution.
      • The most straightforward way is to set the free variables equal to zero
        • => $\chi_1 = \chi_3 = 0$
        • => $\left(\begin{array}{c c c c}1 & 3 & 1 & 2 \\ 0 & 0 & 2 & 4\end{array}\right)\left(\begin{array}{c} \chi_0 \\ 0 \\ \chi_2 \\ 0\end{array}\right) = \left(\begin{array}{c} 1 \\ 1\end{array}\right)$
        • => $x_p = \left(\begin{array}{c} 1/2 \\ 0 \\ 1/2 \\ 0\end{array}\right)$
    • A basis for the null space.
      • Often called the kernel of the matrix.
      • $\chi_0 + 3\chi_1 + \chi_2 + 2\chi_3 = 0$, $2\chi_2 + 4\chi_3 = 0$ => $\chi_2 = -2\chi_3$, $\chi_0 = -3\chi_1$
      • $\begin{bmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \\ \chi_3 \end{bmatrix} = \chi_1 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + \chi_3\begin{bmatrix} 0 \\ 0 \\ -2 \\ 1\end{bmatrix}$
      • So these two vectors form a basis for the null space: $\mathcal{N}(A) = \text{Span}\left(\begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 0 \\ -2 \\ 1\end{bmatrix}\right)$
    • A general solution.
      • Often called a complete solution.
      • given by:
        • $\begin{bmatrix} 1/2 \\ 0 \\ 1/2 \\ 0 \end{bmatrix} + \beta_0 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + \beta_1 \begin{bmatrix} 0 \\ 0 \\ -2 \\ 1\end{bmatrix}$, where $\beta_0$ and $\beta_1$ are arbitrary scalars
    • A basis for the column space, $\mathcal{C}(A)$.
      • Often called the range of the matrix.
      • The number of basis vectors equals the number of dependent variables.
      • The columns that have pivots in them are linearly independent. The corresponding columns in the original matrix are also linearly independent:
        • $\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}$
    • A basis for the row space, $\mathcal{R}(A) = \mathcal{C}(A^T)$.
      • The row space is the subspace of all vectors that can be created by taking linear combinations of the rows of a matrix.
      • List the rows that have pivots in the row echelon form as column vectors:
        • $\begin{bmatrix} 1 \\ 3 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 2 \\ 4 \end{bmatrix}$
        • Notice these are the first and third row of A.
    • The dimension of the row and column space.
      • = number of pivots
      • = 2
    • The rank of the matrix.
      • = number of pivots
      • = 2
    • The dimension of the null space.
      • = the number of non-pivot columns
      • = 2
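
The claims in this section can be checked numerically. The following is a minimal sketch in plain Python (the helper names `matvec` and `general_solution` are illustrative choices, not part of the notes): it verifies that the particular solution solves $Ax = b$, that both null-space basis vectors are mapped to zero, and that the general solution still solves the system for an arbitrary choice of $\beta_0, \beta_1$.

```python
from fractions import Fraction as F

# Matrix and right-hand side from the example above.
A = [[1, 3, 1, 2],
     [2, 6, 4, 8],
     [0, 0, 2, 4]]
b = [1, 3, 1]

def matvec(M, v):
    """Multiply matrix M (given as a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

x_p = [F(1, 2), 0, F(1, 2), 0]   # particular solution (free variables set to zero)
n0 = [-3, 1, 0, 0]               # null-space basis vector for chi_1
n1 = [0, 0, -2, 1]               # null-space basis vector for chi_3

def general_solution(beta0, beta1):
    """Return x_p + beta0 * n0 + beta1 * n1 for scalars beta0, beta1."""
    return [p + beta0 * u + beta1 * v for p, u, v in zip(x_p, n0, n1)]

# The particular solution solves Ax = b.
print(matvec(A, x_p) == b)
# Both basis vectors lie in the null space (A n = 0).
print(matvec(A, n0), matvec(A, n1))
# Any choice of beta0, beta1 (here 7 and -2, picked arbitrarily) still solves Ax = b.
print(matvec(A, general_solution(7, -2)) == b)
```

Exact fractions are used instead of floating point so that the equality checks are not affected by rounding.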

2. Orthogonal Vectors & Orthogonal Spaces

  • Vectors $x$ and $y$ are considered to be orthogonal (perpendicular) if they meet at a right angle, that is, if $x^T y = 0$.

2.1. Normal Vector

  • The normal vector, often simply called the "normal," to a surface is a vector which is perpendicular to the surface at a given point.
  • For example:
    • Write the plane in the form $ax + by + cz = d$
      • Vector $\vec{n} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ is normal to the plane.
      • Vector $\vec{x_0} = \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix}$ points to a known point on the plane.
      • Vector $\vec{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$ points to an arbitrary point on the plane.
    • Then $\vec{x} - \vec{x_0}$ lies in the plane and is therefore perpendicular to $\vec{n}$
    • Then $(\vec{x} - \vec{x_0})^T \vec{n} = 0$
    • $\begin{bmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{bmatrix}^T \begin{bmatrix} a \\ b \\ c \end{bmatrix} = 0$
    • $a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$
    • So we can use $ax + by + cz = ax_0 + by_0 + cz_0$ to represent the plane, that is, $d = ax_0 + by_0 + cz_0$.

Cross Product

  • The cross product (or vector product) is a binary operation on two vectors in three-dimensional space ($\mathbb{R}^3$) and is denoted by the symbol ×.
    • $\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} \times \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} a_1 b_2 - a_2 b_1 \\ a_2 b_0 - a_0 b_2 \\ a_0 b_1 - a_1 b_0 \end{bmatrix}$
  • Given two linearly independent vectors $a$ and $b$, the cross product $a \times b$ is a vector that is perpendicular to both $a$ and $b$ and thus normal to the plane containing them.
    • Because $\begin{bmatrix} a_1 b_2 - a_2 b_1 \\ a_2 b_0 - a_0 b_2 \\ a_0 b_1 - a_1 b_0 \end{bmatrix}^T \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = 0$ and $\begin{bmatrix} a_1 b_2 - a_2 b_1 \\ a_2 b_0 - a_0 b_2 \\ a_0 b_1 - a_1 b_0 \end{bmatrix}^T \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = 0$
  • So we can use vectors $a$ and $b$ to get the normal: $n = a \times b$.
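
As a quick sanity check of the formula, here is a small sketch in plain Python. The function names `cross` and `dot` and the two sample vectors are illustrative assumptions, chosen only to exercise the definition above.

```python
def cross(a, b):
    """Cross product of two 3-vectors, following the component formula above."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(u, v):
    """Dot product u^T v."""
    return sum(x * y for x, y in zip(u, v))

# Two linearly independent sample vectors (arbitrary choice for illustration).
a = [2, 0, 1]
b = [1, 3, -1]

n = cross(a, b)
# n should be perpendicular to both a and b, so both dot products should be zero.
print(n, dot(n, a), dot(n, b))
```

Swapping the arguments flips the sign of the result, so `cross(b, a)` is the negation of `cross(a, b)`; either one serves as a normal to the plane.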

Visualizing a column space as a plane in R3

  • For example:
  • $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & 4 & 3 \\ 3 & 4 & 1 & 2 \end{bmatrix}$, $\text{rref}(A) = \begin{bmatrix} 1 & 0 & 3 & 2 \\ 0 & 1 & -2 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
    • rref: reduced row-echelon form.
  • $\mathcal{C}(A) = \text{Span}\left(\begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix}, \begin{bmatrix}1 \\ 1 \\ 4\end{bmatrix}\right)$
  • Let $n$ be the normal vector to $\mathcal{C}(A)$, and let the vector $\begin{bmatrix}x \\ y \\ z\end{bmatrix}$ point to an arbitrary point on that plane. Then:
    • Using the cross product, we get $n = \begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix} \times \begin{bmatrix}1 \\ 1 \\ 4\end{bmatrix} = \begin{bmatrix}5 \\ -1 \\ -1\end{bmatrix}$
    • $n \cdot \left(\begin{bmatrix}x \\ y \\ z\end{bmatrix} - \begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix}\right) = 0$
    • $5x - y - z = 0$ <=> $\begin{bmatrix}x \\ y \\ z\end{bmatrix} \in \mathcal{C}(A)$
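
Since every column of $A$ lies in $\mathcal{C}(A)$, every column should satisfy the plane equation. A minimal sketch in plain Python (helper names `cross` and `dot` are illustrative) recomputes the normal from the two pivot columns and evaluates $n^T a_j$ for each column $a_j$:

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(u, v):
    """Dot product u^T v."""
    return sum(x * y for x, y in zip(u, v))

# Matrix from the example above, stored as a list of rows.
A = [[1, 1, 1, 1],
     [2, 1, 4, 3],
     [3, 4, 1, 2]]

columns = [list(col) for col in zip(*A)]   # the four columns of A

# Normal to the plane spanned by the two pivot columns.
n = cross(columns[0], columns[1])

# Each column a_j should satisfy n^T a_j = 0 if it lies in the plane.
residuals = [dot(n, col) for col in columns]
print(n, residuals)
```

All four residuals being zero is consistent with the non-pivot columns being linear combinations of the pivot columns, as the rref shows.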

2.2. Orthogonal Spaces

  • Definition: Let $V, W \subset \mathbb{R}^n$ be subspaces. Then $V$ and $W$ are said to be orthogonal iff $v \in V$ and $w \in W$ implies $v^T w = 0$. This is denoted by $V \perp W$.
  • Definition: Given subspace $V \subset \mathbb{R}^n$, the set of all vectors in $\mathbb{R}^n$ that are orthogonal to $V$ is denoted by $V^{\perp}$ (pronounced as “V-perp”).

2.3. Fundamental Spaces

  • Recall some definitions. Let $A \in \mathbb{R}^{m \times n}$ and have $k$ pivots. Then:
    • Column space: $\mathcal{C}(A) = \{y \mid y = Ax\} \subset \mathbb{R}^m$.
      • dimension: $k$
    • Null space: $\mathcal{N}(A) = \{x \mid Ax = 0\} \subset \mathbb{R}^n$.
      • dimension: $n - k$
      • here $0$ is the zero vector in $\mathbb{R}^m$
    • Row space: $\mathcal{R}(A) = \mathcal{C}(A^T) = \{y \mid y = A^T x\} \subset \mathbb{R}^n$.
      • dimension: $k$
    • Left null space: $\mathcal{N}(A^T) = \{x \mid x^T A = 0^T\} \subset \mathbb{R}^m$.
      • dimension: $m - k$
      • here $0$ is the zero vector in $\mathbb{R}^n$
  • Theorem: Let $A \in \mathbb{R}^{m \times n}$. Then:
    • $\mathcal{R}(A) \perp \mathcal{N}(A)$.
    • every $x \in \mathbb{R}^n$ can be written as $x = x_r + x_n$ where $x_r \in \mathcal{R}(A)$ and $x_n \in \mathcal{N}(A)$.
    • $A$ is a one-to-one, onto mapping from $\mathcal{R}(A)$ to $\mathcal{C}(A)$.
    • $\mathcal{N}(A^T)$ is orthogonal to $\mathcal{C}(A)$ and the dimension of $\mathcal{N}(A^T)$ equals $m - r$, where $r$ is the dimension of $\mathcal{C}(A)$.
  • For example: $A = \begin{bmatrix}2 & -1 & -3 \\ -4 & 2 & 6\end{bmatrix}$
    • $T(x) = Ax$, $T: \mathbb{R}^3 \rightarrow \mathbb{R}^2$
    • $\mathcal{C}(A) = \text{Span}\left(\begin{bmatrix} 2 \\ -4 \end{bmatrix}\right) \subseteq \mathbb{R}^2$
    • $\mathcal{N}(A^T) = \text{Span}\left(\begin{bmatrix} 2 \\ 1 \end{bmatrix}\right) \subseteq \mathbb{R}^2$
    • $\mathcal{N}(A) = \text{Span}\left(\begin{bmatrix} \frac{1}{2} \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} \frac{3}{2} \\ 0 \\ 1 \end{bmatrix}\right) \subseteq \mathbb{R}^3$
    • $\mathcal{R}(A) = \mathcal{C}(A^T) = \text{Span}\left(\begin{bmatrix} 2 \\ -1 \\ -3 \end{bmatrix}\right) \subseteq \mathbb{R}^3$
      • => $\mathcal{R}(A) \perp \mathcal{N}(A)$
      • => $\mathcal{C}(A) \perp \mathcal{N}(A^T)$
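
The orthogonality relations and dimension counts in this example can be confirmed directly. The sketch below uses plain Python with exact fractions; the variable names (`col_basis`, `null_basis`, and so on) are illustrative and simply hold the basis vectors listed above.

```python
from fractions import Fraction as F

A = [[2, -1, -3],
     [-4, 2, 6]]
m, n = len(A), len(A[0])   # A is m x n = 2 x 3

def dot(u, v):
    """Dot product u^T v."""
    return sum(x * y for x, y in zip(u, v))

# Bases read off from the example above.
col_basis = [[2, -4]]
left_null_basis = [[2, 1]]
row_basis = [[2, -1, -3]]
null_basis = [[F(1, 2), 1, 0], [F(3, 2), 0, 1]]

columns = [list(c) for c in zip(*A)]

# A x = 0 for every null-space basis vector (each row of A is orthogonal to x).
null_ok = all(dot(row, x) == 0 for row in A for x in null_basis)
# x^T A = 0^T for every left-null-space basis vector (each column is orthogonal to x).
left_null_ok = all(dot(col, x) == 0 for col in columns for x in left_null_basis)

# Row space is orthogonal to null space; column space to left null space.
row_perp_null = all(dot(r, x) == 0 for r in row_basis for x in null_basis)
col_perp_left = all(dot(c, x) == 0 for c in col_basis for x in left_null_basis)

# Dimension counts: k, n - k, k, m - k with k = number of pivots.
k = len(col_basis)
dims_ok = (len(row_basis) == k
           and len(null_basis) == n - k
           and len(left_null_basis) == m - k)

print(null_ok, left_null_ok, row_perp_null, col_perp_left, dims_ok)
```

Checking orthogonality on basis vectors is enough, because the dot product is linear in each argument and so the property extends to all linear combinations.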

3. Approximating a Solution

  • Find a line $y = \gamma_0 + \gamma_1 x$ to interpolate these points:
    • $x = \left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \\ \chi_3 \end{array}\right) = \left(\begin{array}{c}1 \\ 2 \\ 3 \\ 4\end{array}\right) \text{ and } y = \left(\begin{array}{c} \psi_0 \\ \psi_1 \\ \psi_2 \\ \psi_3 \end{array}\right) = \left(\begin{array}{c}1.97 \\ 6.97 \\ 8.89 \\ 10.01\end{array}\right)$
  • Clearly, no single line passes through all of these points, so what is the best approximation?
  • Set $A = \begin{bmatrix}1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4\end{bmatrix}, b = \begin{bmatrix}1.97 \\ 6.97 \\ 8.89 \\ 10.01\end{bmatrix}$, so that the unknown is the coefficient vector $\begin{bmatrix}\gamma_0 \\ \gamma_1\end{bmatrix}$.
  • We've learned before that $Ax = b$ has a solution iff $b \in \mathcal{C}(A)$. In other words, $b$ must lie in $\text{Span}(a_1, a_2, \ldots, a_n)$, the span of the columns of $A$.
  • Here $b \notin \mathcal{C}(A)$, so we are solving $Ax \approx b$ instead.
  • Let $z$ be the projection of $b$ onto $\mathcal{C}(A)$, and let $\hat{x}$ satisfy $A\hat{x} = z$.

  • We can write

    • $b = z + w$ where $w^T v = 0$ for all $v \in \mathcal{C}(A)$.
  • Also $w \in \mathcal{C}(A)^{\perp}$ => $w \in \mathcal{N}(A^T)$. So $A^T w = 0$ (same as $w^T A = 0^T$), which means
    • $0 = A^T w = A^T(b - z) = A^T(b - A \hat{x})$
    • Rearranging, we get $A^T A \hat{x} = A^T b$.
    • This is known as the normal equation associated with the problem $A\hat{x} \approx b$.
  • If $A^T A$ is nonsingular (which is the case when $A$ has linearly independent columns), then
    • $\hat{x} = (A^T A)^{-1} A^T b$
  • And the vector $z \in \mathcal{C}(A)$ closest to $b$ is given by
    • $z = A \hat{x} = A (A^T A)^{-1} A^T b$
  • This shows that if $A$ has linearly independent columns, then $z = A \hat{x} = A (A^T A)^{-1} A^T b$ is the vector in the column space closest to $b$. This is the projection of $b$ onto the column space of $A$.
  • And the “best” solution $\hat{x}$ is known as the “linear least-squares” solution.
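
To make the normal equation concrete, the sketch below applies it to the four data points of this section. It is written in plain Python with exact fractions; solving the resulting 2×2 system by Cramer's rule is a choice made here for brevity, not something the notes prescribe, and all variable names are illustrative.

```python
from fractions import Fraction as F

# Data points from the example above (parsed exactly from decimal strings).
xs = [1, 2, 3, 4]
b = [F("1.97"), F("6.97"), F("8.89"), F("10.01")]

# Each row of A is (1, x_i), so that A @ (gamma0, gamma1) = gamma0 + gamma1 * x_i.
A = [[1, x] for x in xs]
At = [list(col) for col in zip(*A)]   # rows of A^T are the columns of A

def dot(u, v):
    """Dot product u^T v."""
    return sum(p * q for p, q in zip(u, v))

# Form the normal equation (A^T A) x_hat = A^T b.
AtA = [[dot(r, c) for c in At] for r in At]
Atb = [dot(r, b) for r in At]

# Solve the 2x2 system with Cramer's rule (valid because det != 0 here).
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
gamma0 = (Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1]) / det
gamma1 = (AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0]) / det

# z is the projection of b onto C(A); w = b - z is the residual.
z = [gamma0 + gamma1 * x for x in xs]
w = [bi - zi for bi, zi in zip(b, z)]

print(float(gamma0), float(gamma1))
# The residual should be orthogonal to every column of A, i.e. A^T w = 0.
print([dot(col, w) for col in At])
```

For larger or ill-conditioned problems, forming $A^T A$ explicitly is usually avoided in practice in favor of a QR-based solver, but for this small example the normal equation is the most direct illustration of the derivation.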

4. References

5. Words

  • orthogonality [,ɔ:θɔɡə'næləti] n. (math) the property of being orthogonal; mutual perpendicularity
