東京大学情報理工学系研究科コンピュータ科学専攻 2017年2月実施問題1

Author

kainoj

Description

For a positive integer $p$, the $p$-norm $\|\boldsymbol{x}\|_p$ of an $n$-dimensional real vector

\[ \boldsymbol{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]

is defined by

\[ \|\boldsymbol{x}\|_p := \left( \sum_{j=1}^n |x_j|^p \right)^{1/p}. \]

Answer the following questions

(1) Prove that

\[ \|\boldsymbol{x}\|_2 \le \|\boldsymbol{x}\|_1 \le \sqrt{n} \|\boldsymbol{x}\|_2 \]

holds for every $n$-dimensional real vector $\boldsymbol{x}$. You may use the Cauchy-Schwarz inequality:

\[ |\boldsymbol{x} \cdot \boldsymbol{y}| \le \|\boldsymbol{x}\|_2 \|\boldsymbol{y}\|_2 \]

for any $n$-dimensional real vectors $\boldsymbol{x}$ and $\boldsymbol{y}$. Here $\boldsymbol{x} \cdot \boldsymbol{y}$ stands for the inner product of vectors $\boldsymbol{x}$ and $\boldsymbol{y}$.

(2) Define the $p$-norm $\|A\|_p$ of an $n \times n$ real matrix $A$ by

\[ \|A\|_p := \max_{\boldsymbol{x} \neq \boldsymbol{0}} \frac{\|Ax\|_p}{\|\boldsymbol{x}\|_p}, \]

where $\boldsymbol{x}$ ranges over the set of $n$-dimensional real vectors.

(2.1) Prove that if $\|A\|_p < 1$ then $\lim_{k \to \infty} \|A^k x_0\|_p = 0$ for every $n$-dimensional real vector $x_0$.
(2.2) Suppose that $A$ is an $n \times n$ real symmetric matrix. Prove that $\|A\|_2$ is the maximum of the absolute values of the eigenvalues of $A$.

(3) Consider solving an $n$-dimensional linear system $A \boldsymbol{x} = \boldsymbol{b}$, where $A$ is a non-singular real symmetric matrix, and $\boldsymbol{x}$ and $\boldsymbol{b}$ are unknown and constant real vectors, respectively.

Given an initial vector $\boldsymbol{x}^{(0)}$, the vector $\boldsymbol{x}^{(j)}$ $(j = 1, 2, \ldots)$ is computed by

\[ \boldsymbol{x}^{(j)} = \boldsymbol{b} + (I - A)\boldsymbol{x}^{(j-1)}, \]

where $I$ stands for the identity matrix.

Give a necessary and sufficient condition on $A$ such that the sequence $\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}, \ldots$ converges to the true solution for every initial vector $\boldsymbol{x}^{(0)}$.

Kai

(1)

\[ ||x||_2 = \sqrt{\sum_{i=1}^{n} x_{i}^2} \leq \sum_{i=1}^{n} \sqrt{x_{i}^2} = \sum_{i=1}^{n} |x_{i}| = ||x||_{1} \]

On the other hand, let $y\in \mathbb{R}^n$ be a unit vector:

\[ \begin{aligned} ||x||_1 = | x \cdot y | &\leq ||x||_{2} ||y||_{2} \\ &= ||x||_{2} \sqrt{1+1+\dots+1} \\ &= \sqrt{n} ||x||_{2} \end{aligned} \]

The first inequality comes from Cauchy-Schwarz inequality: $|xy| \leq ||x||_2 ||y||_2$.

(2)

(2.1)

Matrix $A$ defines some transformation and $||A||_p$ measures how original vector will be stretched by the transformation. If $||A||_p < 1$ then the transformation \textit{shrinks} the vector. By applying the transformation multiple times, we will shrink the vector more and more. In particular, applying it infinitely many times will shrink the vector to zero. Formally:

\[ \begin{aligned} 0 &\leq \lim_{k \to \infty} ||A^k x_0||_p \\ &\leq \lim_{k \to \infty} ||A^k||_p || x_0||_p \\ &= || x_0||_p \lim_{k \to \infty} ||A^k||_p \\ &= || x_0||_p \: 0 \\ &= 0 \end{aligned} \]

First equality holds because norm is always non-negative, second is a property of a norm. Moreover, limit $\lim_{k \to \infty} ||A^k||_p = 0$, because:

\[ \lim_{k \to \infty} ||A^k||_p \leq \lim_{k \to \infty} (||A||_p)^k =0 \]

NB: $||AB||_p \leq ||A||_p ||B||_p$ for any matrices $A,B \in \mathbb{R}^{n\times n}$.

(2.2)

Since $A$ is symmetric, then it can be diagonalized $A=Q^T\Lambda Q$, where $Q$ is orthogonal matrix (i.e. $Q^T Q = I $). Note that

\[ \max_{x} \frac{||Ax||_p}{||x||_p} = \max_{x: ||x||_p = 1} ||Ax||_p \]

Let $x$ be a unit vector which maximizes the above. We have:

\[ \begin{aligned} ||Ax||_2 &= (Ax)^T (Ax) \\ &= x^T Q^T \Lambda Q Q^T \Lambda Q x \\ &= x^T Q^T \Lambda^2 Q x \end{aligned} \]

By substituting $y = Qx$:

\[ \begin{aligned} ||Ax||_2 &= y^T \Lambda^2 y \\ &= \sum \lambda_i^2 y_i^2 \\ &\leq \sum \lambda_{max} y_i^2 \\ &=\lambda_{max} \sum y_i^2 \\ &= \lambda_{max} \: y^T y \\ &= \lambda_{max} \end{aligned} \]

Because $y^T y = (Qx)^T(Qx) = x^T Q^T Q x = x^T x = 1$. On the other hand, let $q$ be eigenvector corresponding to $\lambda_{max}$. Of course $||q||_2 =1$, because $Q$ is orthogonal.

\[ \begin{aligned} ||Aq||_2 &\geq ||Aq||_2 \\ &= ||\lambda_{max} q ||_2 \\ &= \lambda_{max} ||q||_2 \\ &= \lambda_{max} \end{aligned} \]

(3)

Let $e^{(j)} = x^{(j)} - x$ be the error vector. We want $\lim_{j\to \infty} e^{(j)} = 0$. In other words, we want $e^{(j+1)} < e^{(j)}$ for all sufficiently big $j$.

\[ \begin{aligned} e^{(j+1)} = x^{(j+1)} - x &= b - x + (I -A) x^{(j)} \\ &= Ax - x + (I -A) x^{(j)} \\ &= -(I-A)x + (I -A) x^{(j)} \\ &= (I -A)(x^{(j)} - x) \\ \end{aligned} \]

Here, we obtain: $e^{(j+1)} = (I -A) e^{(j)}$. Taking a norm both sides:

\[ \begin{aligned} ||e^{(j+1)}|| &= ||(I -A) e^{(j)}|| \\ &\leq ||I -A|| \: || e^{(j)}|| \end{aligned} \]

Thus, the sequence converge to real solution if

\[ ||I -A|| < 1 \]

Knowledge

More reading: http://runge.math.smu.edu/Courses/Math3315_Spring10/iterative_linear.pdf

東京大学 情報理工学系研究科 コンピュータ科学専攻 2017年2月実施 問題1