Skip to content

東京大学 情報理工学系研究科 コンピュータ科学専攻 2017年2月実施 問題1

Author

kainoj

Description

For a positive integer \(p\), the \(p\)-norm \(\|\boldsymbol{x}\|_p\) of an \(n\)-dimensional real vector

\[ \boldsymbol{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]

is defined by

\[ \|\boldsymbol{x}\|_p := \left( \sum_{j=1}^n |x_j|^p \right)^{1/p}. \]

Answer the following questions

(1) Prove that

\[ \|\boldsymbol{x}\|_2 \le \|\boldsymbol{x}\|_1 \le \sqrt{n} \|\boldsymbol{x}\|_2 \]

holds for every \(n\)-dimensional real vector \(\boldsymbol{x}\). You may use the Cauchy-Schwarz inequality:

\[ |\boldsymbol{x} \cdot \boldsymbol{y}| \le \|\boldsymbol{x}\|_2 \|\boldsymbol{y}\|_2 \]

for any \(n\)-dimensional real vectors \(\boldsymbol{x}\) and \(\boldsymbol{y}\). Here \(\boldsymbol{x} \cdot \boldsymbol{y}\) stands for the inner product of vectors \(\boldsymbol{x}\) and \(\boldsymbol{y}\).

(2) Define the \(p\)-norm \(\|A\|_p\) of an \(n \times n\) real matrix \(A\) by

\[ \|A\|_p := \max_{\boldsymbol{x} \neq \boldsymbol{0}} \frac{\|Ax\|_p}{\|\boldsymbol{x}\|_p}, \]

where \(\boldsymbol{x}\) ranges over the set of \(n\)-dimensional real vectors.

  • (2.1) Prove that if \(\|A\|_p < 1\) then \(\lim_{k \to \infty} \|A^k x_0\|_p = 0\) for every \(n\)-dimensional real vector \(x_0\).
  • (2.2) Suppose that \(A\) is an \(n \times n\) real symmetric matrix. Prove that \(\|A\|_2\) is the maximum of the absolute values of the eigenvalues of \(A\).

(3) Consider solving an \(n\)-dimensional linear system \(A \boldsymbol{x} = \boldsymbol{b}\), where \(A\) is a non-singular real symmetric matrix, and \(\boldsymbol{x}\) and \(\boldsymbol{b}\) are unknown and constant real vectors, respectively.

Given an initial vector \(\boldsymbol{x}^{(0)}\), the vector \(\boldsymbol{x}^{(j)}\) \((j = 1, 2, \ldots)\) is computed by

\[ \boldsymbol{x}^{(j)} = \boldsymbol{b} + (I - A)\boldsymbol{x}^{(j-1)}, \]

where \(I\) stands for the identity matrix.

Give a necessary and sufficient condition on \(A\) such that the sequence \(\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}, \ldots\) converges to the true solution for every initial vector \(\boldsymbol{x}^{(0)}\).

Kai

(1)

\[ ||x||_2 = \sqrt{\sum_{i=1}^{n} x_{i}^2} \leq \sum_{i=1}^{n} \sqrt{x_{i}^2} = \sum_{i=1}^{n} |x_{i}| = ||x||_{1} \]

On the other hand, let \(y\in \mathbb{R}^n\) be a unit vector:

\[ \begin{aligned} ||x||_1 = | x \cdot y | &\leq ||x||_{2} ||y||_{2} \\ &= ||x||_{2} \sqrt{1+1+\dots+1} \\ &= \sqrt{n} ||x||_{2} \end{aligned} \]

The first inequality comes from Cauchy-Schwarz inequality: \(|xy| \leq ||x||_2 ||y||_2\).

(2)

(2.1)

Matrix \(A\) defines some transformation and \(||A||_p\) measures how original vector will be stretched by the transformation. If \(||A||_p < 1\) then the transformation \textit{shrinks} the vector. By applying the transformation multiple times, we will shrink the vector more and more. In particular, applying it infinitely many times will shrink the vector to zero. Formally:

\[ \begin{aligned} 0 &\leq \lim_{k \to \infty} ||A^k x_0||_p \\ &\leq \lim_{k \to \infty} ||A^k||_p || x_0||_p \\ &= || x_0||_p \lim_{k \to \infty} ||A^k||_p \\ &= || x_0||_p \: 0 \\ &= 0 \end{aligned} \]

First equality holds because norm is always non-negative, second is a property of a norm. Moreover, limit \(\lim_{k \to \infty} ||A^k||_p = 0\), because:

\[ \lim_{k \to \infty} ||A^k||_p \leq \lim_{k \to \infty} (||A||_p)^k =0 \]

NB: \(||AB||_p \leq ||A||_p ||B||_p\) for any matrices \(A,B \in \mathbb{R}^{n\times n}\).

(2.2)

Since \(A\) is symmetric, then it can be diagonalized \(A=Q^T\Lambda Q\), where \(Q\) is orthogonal matrix (i.e. $Q^T Q = I $). Note that

\[ \max_{x} \frac{||Ax||_p}{||x||_p} = \max_{x: ||x||_p = 1} ||Ax||_p \]

Let \(x\) be a unit vector which maximizes the above. We have:

\[ \begin{aligned} ||Ax||_2 &= (Ax)^T (Ax) \\ &= x^T Q^T \Lambda Q Q^T \Lambda Q x \\ &= x^T Q^T \Lambda^2 Q x \end{aligned} \]

By substituting \(y = Qx\):

\[ \begin{aligned} ||Ax||_2 &= y^T \Lambda^2 y \\ &= \sum \lambda_i^2 y_i^2 \\ &\leq \sum \lambda_{max} y_i^2 \\ &=\lambda_{max} \sum y_i^2 \\ &= \lambda_{max} \: y^T y \\ &= \lambda_{max} \end{aligned} \]

Because \(y^T y = (Qx)^T(Qx) = x^T Q^T Q x = x^T x = 1\). On the other hand, let \(q\) be eigenvector corresponding to \(\lambda_{max}\). Of course \(||q||_2 =1\), because \(Q\) is orthogonal.

\[ \begin{aligned} ||Aq||_2 &\geq ||Aq||_2 \\ &= ||\lambda_{max} q ||_2 \\ &= \lambda_{max} ||q||_2 \\ &= \lambda_{max} \end{aligned} \]

(3)

Let \(e^{(j)} = x^{(j)} - x\) be the error vector. We want \(\lim_{j\to \infty} e^{(j)} = 0\). In other words, we want \(e^{(j+1)} < e^{(j)}\) for all sufficiently big \(j\).

\[ \begin{aligned} e^{(j+1)} = x^{(j+1)} - x &= b - x + (I -A) x^{(j)} \\ &= Ax - x + (I -A) x^{(j)} \\ &= -(I-A)x + (I -A) x^{(j)} \\ &= (I -A)(x^{(j)} - x) \\ \end{aligned} \]

Here, we obtain: \(e^{(j+1)} = (I -A) e^{(j)}\). Taking a norm both sides:

\[ \begin{aligned} ||e^{(j+1)}|| &= ||(I -A) e^{(j)}|| \\ &\leq ||I -A|| \: || e^{(j)}|| \end{aligned} \]

Thus, the sequence converge to real solution if

\[ ||I -A|| < 1 \]

Knowledge

More reading: http://runge.math.smu.edu/Courses/Math3315_Spring10/iterative_linear.pdf