Definition. Let $T:V\to V$ be a linear map on an inner product space $V$. We say that $T$ is a projection if $T^2=T$, and that $T$ is an orthogonal projection if $T$ is a projection and the direct sum \[
V=\ker T \oplus \range T
\] is orthogonal, i.e., $\ker T\perp \range T$.
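For example, on $\R^2$ with the dot product, the map $T(x,y)=(x+y,0)$ satisfies $T^2=T$, so it is a projection; but $\ker T=\spann\{(1,-1)\}$ is not perpendicular to $\range T=\spann\{(1,0)\}$, so this projection is not orthogonal.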
Proposition. Let $W$ be a finite dimensional subspace of an inner product space $V$ and let $\{w_1,\dots,w_k\}$ be an orthogonal basis of $W$. Then the onto linear map $P:V\to W$ defined, for each $v\in V$, by\begin{equation}\label{defined}
Pv=\sum_{i=1}^k \frac{\inner{v,w_i}w_i}{\|w_i\|^2}
\end{equation} is an orthogonal projection. Here $\|v\|=\sqrt{\inner{v,v}}$.
Let $P$ be the orthogonal projection in (\ref{defined}); one can easily verify the following simple properties:
- $P^2=P$ (i.e., $P$ is really a projection).
- $\inner{u,Pv}=\inner{Pu,v}$ for all $u,v\in V$.
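Indeed, the first property follows at once from the orthogonality of $\{w_1,\dots,w_k\}$: for any $v\in V$, \[
P(Pv)=\sum_{j=1}^k \frac{\inner{Pv,w_j}w_j}{\|w_j\|^2}=\sum_{j=1}^k \frac{\inner{v,w_j}w_j}{\|w_j\|^2}=Pv,
\] since $\inner{Pv,w_j}=\sum_{i=1}^k \frac{\inner{v,w_i}\inner{w_i,w_j}}{\|w_i\|^2}=\inner{v,w_j}$.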
Problem 1. Suppose $\{a_1,\dots,a_k\}$ and $\{b_1,\dots,b_k\}$ are two orthogonal bases of the same finite dimensional subspace $W$, and define \begin{align*}
Pv&=\sum_{i=1}^k \frac{\inner{v,a_i} a_i}{\|a_i\|^2}\\
P'v&=\sum_{i=1}^k \frac{\inner{v,b_i} b_i}{\|b_i\|^2}
\end{align*} for each $v\in V$.
Fix a $v\in V$. The maps $P$ and $P'$ are both orthogonal projections onto $W$, so by the properties of orthogonal projections it is easy to check that for all $w\in W$, \[
v-Pv\perp w\quad \text{and}\quad v-P'v\perp w
\] (one may need to recall that both of the linear maps $P,P':V\to W$ are onto!), i.e., we have \[
\range (I-P)\perp W\quad\text{and}\quad \range (I-P')\perp W,
\] hence, by the Pythagorean theorem, we have for every $w\in W$,\[
\|v-w\|^2 =\|v-Pv+\underbrace{Pv-w}_{\in W}\|^2=\|v-Pv\|^2+ \|Pv-w\|^2\ge \|v-Pv\|^2.
\] Similarly, $\|v-w\|\ge \|v-P'v\|$, hence both $Pv,P'v\in W$ minimize the following:\[
\boxed{\|v-w\|,\quad w\in W.}
\] Since the minimizer is unique (cf. my note 6, example 30), we must have $Pv=P'v$.
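As a quick numerical illustration of this basis-independence (a minimal numpy sketch, not part of the argument; the plane and both orthogonal bases below are my own choices):

```python
import numpy as np

def proj(v, basis):
    # Orthogonal projection of v onto span(basis); the basis vectors
    # are assumed pairwise orthogonal (not necessarily unit length).
    return sum(np.dot(v, w) / np.dot(w, w) * w for w in basis)

v = np.array([3.0, 4.0, 5.0])
# Two different orthogonal bases of the same plane span{(1,0,1),(0,1,0)}:
basis1 = [np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])]
basis2 = [np.array([1.0, 1.0, 1.0]), np.array([1.0, -2.0, 1.0])]

print(proj(v, basis1))  # [4. 4. 4.]
print(proj(v, basis2))  # [4. 4. 4.]
```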
Problem 2. (a) By the Gram-Schmidt process we orthogonalize $u_1$ and $u_2$ to \[
v_1=\matrixx{1\\2\\3}\quad\text{and}\quad v_2=\frac{1}{7}\matrixx{12\\3\\-6}.
\] where $\|v_1\|^2=14$ and $\|v_2\|^2 = \frac{27}{7}$. We use this orthogonal basis of $V$ to construct the orthogonal projection: for each $x\in \R^3$,\begin{equation}\label{projection formula}
Px=\frac{\inner{x,v_1}v_1}{\|v_1\|^2}+\frac{\inner{x,v_2}v_2}{\|v_2\|^2},
\end{equation} then $Px$ will be the minimizer of the distance \[
\|x-v\|,\quad v\in \spann \left\{\matrixx{1\\2\\3},\matrixx{4\\5\\6}\right\}.
\] Thus the $u\in V$ we need is $u=Py$, since $\|y-Py\|$ is the least possible among $\{\|y-v\|:v\in V\}$. Now \begin{align*}
u&=\frac{\inner{y,v_1}v_1}{\|v_1\|^2}+\frac{\inner{y,v_2}v_2}{\|v_2\|^2}\\
&= \frac{\matrixx{7\\4\\7} \cdot \matrixx{1\\2\\3}}{14}\matrixx{1\\2\\3}+\frac{\frac{1}{7} \matrixx{7\\4\\7} \cdot \matrixx{12\\3\\-6}}{\frac{27}{7}}\cdot \frac{1}{7}\matrixx{12\\3\\-6}\\
&= \frac{36}{14}\matrixx{1\\2\\3}+\frac{2}{7}\matrixx{12\\3\\-6} = \matrixx{6\\6\\6}.
\end{align*}
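For readers who want to double-check the arithmetic, a few lines of numpy reproduce the answer (offered only as a verification sketch, not as part of the required solution):

```python
import numpy as np

u1 = np.array([1.0, 2.0, 3.0])
u2 = np.array([4.0, 5.0, 6.0])
y  = np.array([7.0, 4.0, 7.0])

# Gram-Schmidt: v1 = u1, and v2 = u2 minus its component along v1.
v1 = u1
v2 = u2 - np.dot(u2, v1) / np.dot(v1, v1) * v1   # equals (1/7)*(12, 3, -6)

# Orthogonal projection of y onto span{v1, v2}.
u = np.dot(y, v1) / np.dot(v1, v1) * v1 + np.dot(y, v2) / np.dot(v2, v2) * v2
print(u)  # [6. 6. 6.]
```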
(b) Let's construct $A=\matrixx{1&4\\ 2&5\\3&6}$, so that $\col A=V$. The system $Ax=y$ has no solution; the best we can do is to choose an element $a\in \col A$ that is closest to $y$. To do this, consider the normal equations\begin{equation}\label{solve this}
A^TAx=A^Ty,
\end{equation} whose solution $x_0$ satisfies $\|y-v\|\ge \|y-Ax_0\|$ for all $v\in \col A$. Thus what we want is $u=Ax_0$.
Let's solve (\ref{solve this}); a direct computation shows that (\ref{solve this}) is the same as \[
\matrixx{14&32\\32&77}x=\matrixx{36\\90},
\] and a direct calculation yields $x_0=\matrixx{-2\\2}$. Hence \[
u=Ax_0 =\matrixx{1&4\\2&5\\3&6} \matrixx{-2\\2}=\matrixx{6\\6\\6}.
\]
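The same minimizer can be recovered numerically; here is a small numpy check (np.linalg.lstsq computes the least-squares solution directly):

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])
y = np.array([7.0, 4.0, 7.0])

# Solve the normal equations A^T A x = A^T y ...
x0 = np.linalg.solve(A.T @ A, A.T @ y)
print(x0)      # [-2.  2.]
print(A @ x0)  # [6. 6. 6.]

# ... or use the built-in least-squares solver.
x1, *_ = np.linalg.lstsq(A, y, rcond=None)
print(A @ x1)  # [6. 6. 6.]
```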
(c) An element $a\in \col A$ minimizes $\|y-v\|$ over $v\in \col A$ if and only if \[
y-a\perp v,\quad \forall v\in \col A.
\] Essentially this is the idea we used to construct the normal equations in (b). But this ``orthogonality method" also works for minimization problems that are not set in $\R^n$. In particular, it works for finite dimensional function spaces (in which the language of matrices is not readily applicable, unless one takes the trouble to translate everything into matrices), say the collection of all polynomials of degree at most $n$.
It is worth illustrating how simple the orthogonality method is.
Let $a=c_1\matrixx{1\\2\\3}+c_2\matrixx{4\\5\\6}$ be the minimizer (we avoid the letters $x,y$, which are already in use); then \[
\innerr{y-a,\matrixx{1\\2\\3}}=\innerr{y-a,\matrixx{4\\5\\6}} =0\implies \begin{cases}
14c_1+32c_2=36,\\
32c_1+77c_2=90,
\end{cases}
\] thus $c_1=-2$ and $c_2=2$, so $u=\matrixx{6\\6\\6}$.
(d) The projection matrix is given by \[
P=A(A^TA)^{-1}A^T =\frac{1}{6}\matrixx{5&2&-1\\2&2&2\\-1&2&5},
\] so $u=Py = \matrixx{6\\6\\6}$.
Of course the computation of $P$ above is extremely tedious.
Given an orthogonal basis, why don't we just find the standard matrix of the orthogonal projection directly from that basis? Recalling the formula in (\ref{projection formula}), we have (after simplification) \[
P(x) =\frac{x\cdot \matrixx{1\\2\\ 3}}{14}\matrixx{1\\2\\3}+\frac{x\cdot \matrixx{4\\1\\-2}}{21}\matrixx{4\\1\\-2},
\] so we get\begin{align*}
Pe_1& = \frac{1}{14}\matrixx{1\\2\\3}+\frac{4}{21}\matrixx{4\\1\\-2}=\frac{1}{6}\matrixx{5\\2\\-1},\\
Pe_2 &=\frac{2}{14}\matrixx{1\\2\\3}+\frac{1}{21}\matrixx{4\\1\\-2}=\frac{1}{6}\matrixx{2\\2\\2},\\
Pe_3&=\frac{3}{14}\matrixx{1\\2\\3}+\frac{(-2)}{21}\matrixx{4\\1\\-2}=\frac{1}{6}\matrixx{-1\\2\\5},
\end{align*} so the standard matrix of $P$ is \[
[P]=\frac{1}{6}\matrixx{5&2&-1\\2&2&2\\-1&2&5}.
\]
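Both computations of $[P]$ are easy to confirm numerically (again a mere sanity check):

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])

P = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.round(6 * P))                # [[ 5.  2. -1.] [ 2.  2.  2.] [-1.  2.  5.]]
print(P @ np.array([7.0, 4.0, 7.0]))  # [6. 6. 6.]
```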
Remark. If the given $v_1,v_2\in \R^3$ are already orthonormal at the beginning, then the projection matrix is extremely easy to compute. Namely, let $A=\matrixx{v_1&v_2}$; then, since $A^TA=I$, \[
P= A(A^TA)^{-1} A^T = AA^T.
\]
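For instance, normalizing the orthogonal basis found in (a) and forming $AA^T$ gives the same projection matrix (a small numpy sketch):

```python
import numpy as np

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 1.0, -2.0])

# The columns of A are orthonormal, so A^T A = I and P = A A^T.
A = np.column_stack([v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)])
P = A @ A.T
print(np.round(6 * P))  # again (1/6)*[[5, 2, -1], [2, 2, 2], [-1, 2, 5]]
```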
Problem 3. Let $V=\{p\in \mathcal P_3:p(0)=p'(0)=0\}$ and define on $\mathcal P_3$ the inner product \[
\inner{p,q}=\int_0^1p(x)q(x)\,dx,
\] then the norm induced by this inner product is $\|p\|=\brac{\inner{p,p}}^{1/2}=\brac{\int_0^1 p^2\,dx}^{1/2}$, so the problem is to find a $p_0\in V$ that minimizes \[
\|2+3x-p\|,\quad p\in V.
\] Of course this solution can be easily obtained by orthogonal projection or by the orthogonality of the "minimizer" :).
Method 1 (Projection). Since $V=\spann\{x^2,x^3\}$, we try to orthogonalize these two vectors:
Let $u_1=x^2$ and $u_2=x^3$.
Choose $v_1=u_1$; then $\|v_1\|^2 = \int_0^1 x^4\,dx = \frac{1}{5}$.
Next we compute a $v_2$ that is orthogonal to $v_1$: \[
v_2=u_2-\frac{\inner{u_2,v_1}v_1}{\|v_1\|^2} = x^3 -\frac{\brac{\int_0^1 (x^3\cdot x^2)\,dx }x^2}{\frac{1}{5}}=x^3-\frac{5}{6}x^2.
\] Also we have $\|v_2\|^2 =\frac{1}{252}$.
Finally we do orthogonal projection to obtain $p_0$: \[
p_0=\frac{\inner{2+3x,v_1}v_1}{\|v_1\|^2}+\frac{\inner{2+3x,v_2}v_2}{\|v_2\|^2}=-\frac{203}{10}x^3+24x^2.
\]
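The integrals here are easy to botch by hand; a short sympy script (purely a check on the computation above) reproduces $p_0$:

```python
import sympy as sp

x = sp.symbols('x')

def inner(p, q):
    # The inner product <p, q> = integral of p(x) q(x) over [0, 1].
    return sp.integrate(p * q, (x, 0, 1))

# Gram-Schmidt on {x^2, x^3} with respect to this inner product.
v1 = x**2
v2 = x**3 - inner(x**3, v1) / inner(v1, v1) * v1   # x^3 - (5/6) x^2

f = 2 + 3*x
p0 = inner(f, v1) / inner(v1, v1) * v1 + inner(f, v2) / inner(v2, v2) * v2
print(sp.expand(p0))  # -203*x**3/10 + 24*x**2
```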
Method 2 (Orthogonality). $p_0\in V$ is a minimizer if and only if \[
\inner{2+3x-p_0,x^2}=\inner{2+3x-p_0,x^3}=0,
\] Set $p_0=a_2x^2+a_3x^3$; we get \[
\begin{cases}
\frac{a_2}{5}+\frac{a_3}{6}=\frac{17}{12},\\
\frac{a_2}{6}+\frac{a_3}{7}=\frac{11}{10},
\end{cases}
\] and on solving the equations we get $a_2=24$ and $a_3=-203/10$.
Problem 4. We have \begin{align*}
A\text{ is injective}&\iff \nul A=\{0\}\\
&\iff (\nul A)^\perp = (\{0\})^\perp \\
&\iff \col A^T =\R^n\\
&\iff A\text{ is surjective}.
\end{align*} Here the second $\iff$ follows from the following result: the forward direction is immediate, and since $((\nul A)^\perp)^\perp=\nul A$ by the theorem, the direction can go backwards.

Theorem 1. Let $W$ be a finite dimensional subspace of an inner product space $V$, then \[
(W^\perp)^\perp =W.
\]
Problem 5. The implication says that\begin{equation}\label{we will return}
(\spann\{a_1\})^\perp \cap (\spann\{a_2\})^\perp \cap\cdots \cap (\spann\{a_m\})^\perp\subseteq (\spann\{b\})^\perp.
\end{equation} Actually we see that \begin{align*}
&x\in (\spann\{a_1\})^\perp \cap (\spann\{a_2\})^\perp \cap\cdots \cap (\spann\{a_m\})^\perp\\
&\iff \inner{x, \alpha_1a_1}=\inner{x,\alpha_2a_2}=\cdots =\inner{x,\alpha_m a_m}=0,\quad \forall\alpha_i\in\R, \forall i\\
&\iff \inner{x, \alpha_1a_1+\alpha_2 a_2+\cdots +\alpha_m a_m}=0,\quad \forall \alpha_i\in \R,\forall i\\
&\iff \inner{ x,v}=0,\quad \forall v\in \spann\{a_1,a_2,\dots,a_m\}\\
&\iff x\in (\spann\{a_1,a_2,\dots,a_m\})^\perp,
\end{align*} so (\ref{we will return}) can be written as\begin{equation}\label{take perp finally}
(\spann\{a_1,a_2,\dots,a_m\})^\perp\subseteq (\spann\{b\})^\perp.
\end{equation} Next we observe the following result (exercise: this follows quite directly from the definition):

Theorem 2. Let $X,Y$ be two subsets of an inner product space $V$; then \[
X\subseteq Y\implies X^\perp \supseteq Y^\perp.
\]

Hence, applying Theorem 2 and then Theorem 1 to (\ref{take perp finally}), \[
\spann \{a_1,a_2,\dots,a_m\}=((\spann\{a_1,a_2,\dots,a_m\})^\perp)^\perp \supseteq ( (\spann\{b\})^\perp)^\perp = \spann\{b\},
\] i.e., $b\in \spann\{a_1,a_2,\dots,a_m\}$.
Problem 6. (Here $A$ is a square matrix with $A^2=A$, and we show that (i) $A$ is an orthogonal projection if and only if (ii) $A$ is symmetric.) Recall the definition of orthogonal projection at the very beginning of this post.

Warning: that a matrix $A$ is an orthogonal projection doesn't mean it is an orthogonal matrix ($A^TA=I$).
(ii) $\Rightarrow$ (i): Suppose $A$ is symmetric; we show that $\nul A\perp \col A$. Let $x\in \nul A$ and $y\in \col A$; then $y=Ay_0$ for some $y_0\in \R^n$, so \[
\inner{x,y}=\inner{x,Ay_0}=\inner{A^Tx,y_0}=\inner{Ax,y_0}=\inner{0,y_0}=0,
\] as desired.
(i) $\Rightarrow$ (ii): Since $A^2=A$, we have $\R^n = \nul A+\col A$ (write $x=(x-Ax)+Ax$ and note that $A(x-Ax)=0$). As $A$ is an orthogonal projection, we have $\nul A\perp \col A$. Let $\{u_1,\dots,u_k\}$ be an orthonormal basis of $\nul A$ and let $\{v_1,\dots,v_m\}$ be an orthonormal basis of $\col A$, then \[
\alpha =\{u_1,\dots,u_k,v_1,\dots,v_m\}
is an orthonormal basis of $\R^n$; moreover, each $u_i$ and each $v_j$ is an eigenvector of $A$ (with eigenvalue $0$ and $1$ respectively). If we let $P=\matrixx{u_1&\cdots &u_k&v_1&\cdots &v_m}$, then \[
P^{-1}AP =\matrixx{0&\\ &\ddots \\&&0\\&&&1\\&&&&\ddots\\&&&&&1}=D,
\]so $A=PDP^{-1}$. Since $\alpha$ is orthonormal, $P^TP=I$, i.e., $P^{-1}=P^T$, so we have $A=PDP^T$ and\[
A^T = (PDP^T)^T=PD^TP^T = PDP^T=A.
\]
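As a concrete check of this equivalence (a minimal numpy sketch, reusing the projection matrix from Problem 2(d)): a symmetric idempotent matrix passes both tests, and its eigenvalues are exactly $0$ and $1$, matching the diagonalization above.

```python
import numpy as np

# The orthogonal projection matrix from Problem 2(d).
P = np.array([[ 5.0, 2.0, -1.0],
              [ 2.0, 2.0,  2.0],
              [-1.0, 2.0,  5.0]]) / 6

print(np.allclose(P @ P, P))              # True: P is a projection
print(np.allclose(P, P.T))                # True: P is symmetric
print(np.round(np.linalg.eigvalsh(P), 6)) # [0. 1. 1.]
```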