University of Tokyo, Graduate School of Information Science and Technology, Department of Computer Science, February 2017 examination, Problem 1
Author
Description
For a positive integer \(p\), the \(p\)-norm \(\|\boldsymbol{x}\|_p\) of an \(n\)-dimensional real vector \(\boldsymbol{x} = (x_1, \ldots, x_n)^T\)
is defined by
\[
\|\boldsymbol{x}\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}.
\]
Answer the following questions.
(1) Prove that
\[
\|\boldsymbol{x}\|_2 \leq \|\boldsymbol{x}\|_1 \leq \sqrt{n}\, \|\boldsymbol{x}\|_2
\]
holds for every \(n\)-dimensional real vector \(\boldsymbol{x}\). You may use the Cauchy-Schwarz inequality:
\[
|\boldsymbol{x} \cdot \boldsymbol{y}| \leq \|\boldsymbol{x}\|_2 \, \|\boldsymbol{y}\|_2
\]
for any \(n\)-dimensional real vectors \(\boldsymbol{x}\) and \(\boldsymbol{y}\). Here \(\boldsymbol{x} \cdot \boldsymbol{y}\) stands for the inner product of vectors \(\boldsymbol{x}\) and \(\boldsymbol{y}\).
(2) Define the \(p\)-norm \(\|A\|_p\) of an \(n \times n\) real matrix \(A\) by
\[
\|A\|_p = \max_{\boldsymbol{x} \neq \boldsymbol{0}} \frac{\|A\boldsymbol{x}\|_p}{\|\boldsymbol{x}\|_p},
\]
where \(\boldsymbol{x}\) ranges over the set of nonzero \(n\)-dimensional real vectors.
- (2.1) Prove that if \(\|A\|_p < 1\) then \(\lim_{k \to \infty} \|A^k \boldsymbol{x}_0\|_p = 0\) for every \(n\)-dimensional real vector \(\boldsymbol{x}_0\).
- (2.2) Suppose that \(A\) is an \(n \times n\) real symmetric matrix. Prove that \(\|A\|_2\) is the maximum of the absolute values of the eigenvalues of \(A\).
(3) Consider solving an \(n\)-dimensional linear system \(A \boldsymbol{x} = \boldsymbol{b}\), where \(A\) is a non-singular real symmetric matrix, and \(\boldsymbol{x}\) and \(\boldsymbol{b}\) are unknown and constant real vectors, respectively.
Given an initial vector \(\boldsymbol{x}^{(0)}\), the vector \(\boldsymbol{x}^{(j)}\) \((j = 1, 2, \ldots)\) is computed by
\[
\boldsymbol{x}^{(j)} = (I - A)\, \boldsymbol{x}^{(j-1)} + \boldsymbol{b},
\]
where \(I\) stands for the identity matrix.
Give a necessary and sufficient condition on \(A\) such that the sequence \(\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}, \ldots\) converges to the true solution for every initial vector \(\boldsymbol{x}^{(0)}\).
Kai
(1)
First,
\[
\|\boldsymbol{x}\|_1^2 = \left( \sum_{i=1}^{n} |x_i| \right)^2 = \sum_{i=1}^{n} |x_i|^2 + \sum_{i \neq j} |x_i|\,|x_j| \geq \sum_{i=1}^{n} |x_i|^2 = \|\boldsymbol{x}\|_2^2,
\]
so \(\|\boldsymbol{x}\|_2 \leq \|\boldsymbol{x}\|_1\). On the other hand, let \(\boldsymbol{y} \in \mathbb{R}^n\) be the unit vector \(\boldsymbol{y} = (1/\sqrt{n}, \ldots, 1/\sqrt{n})^T\), and let \(\boldsymbol{x}' = (|x_1|, \ldots, |x_n|)^T\). Then
\[
\|\boldsymbol{x}\|_1 = \sqrt{n}\, (\boldsymbol{x}' \cdot \boldsymbol{y}) \leq \sqrt{n}\, \|\boldsymbol{x}'\|_2 \|\boldsymbol{y}\|_2 = \sqrt{n}\, \|\boldsymbol{x}\|_2.
\]
The inequality comes from the Cauchy-Schwarz inequality \(|\boldsymbol{x}' \cdot \boldsymbol{y}| \leq \|\boldsymbol{x}'\|_2 \|\boldsymbol{y}\|_2\), together with \(\|\boldsymbol{x}'\|_2 = \|\boldsymbol{x}\|_2\) and \(\|\boldsymbol{y}\|_2 = 1\).
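The chain \(\|\boldsymbol{x}\|_2 \leq \|\boldsymbol{x}\|_1 \leq \sqrt{n}\,\|\boldsymbol{x}\|_2\) can be sanity-checked numerically; a minimal sketch in plain Python over random test vectors (illustration only, not part of the proof):

```python
# Check ||x||_2 <= ||x||_1 <= sqrt(n) * ||x||_2 on random vectors.
import math
import random

random.seed(0)
n = 5
for _ in range(1000):
    x = [random.gauss(0, 1) for _ in range(n)]
    l1 = sum(abs(t) for t in x)                 # 1-norm
    l2 = math.sqrt(sum(t * t for t in x))       # 2-norm
    assert l2 <= l1 + 1e-12
    assert l1 <= math.sqrt(n) * l2 + 1e-12
print("norm inequality holds on 1000 random vectors")
```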
(2)
(2.1)
The matrix \(A\) defines a linear transformation, and \(\|A\|_p\) measures how much that transformation can stretch a vector. If \(\|A\|_p < 1\), then the transformation *shrinks* every vector, so applying it repeatedly shrinks the vector more and more; in the limit the vector goes to zero. Formally:
\[
0 \leq \|A^k \boldsymbol{x}_0\|_p \leq \|A^k\|_p \, \|\boldsymbol{x}_0\|_p.
\]
The first inequality holds because a norm is always non-negative; the second follows from the definition of the matrix norm. Moreover, \(\lim_{k \to \infty} \|A^k\|_p = 0\), because
\[
\|A^k\|_p \leq \|A\|_p^k \xrightarrow{\;k \to \infty\;} 0 \qquad (\text{since } \|A\|_p < 1),
\]
so by the squeeze theorem \(\lim_{k \to \infty} \|A^k \boldsymbol{x}_0\|_p = 0\).
NB: \(||AB||_p \leq ||A||_p ||B||_p\) for any matrices \(A,B \in \mathbb{R}^{n\times n}\).
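The shrinking behaviour can also be observed numerically. The sketch below (plain Python; the diagonal matrix and starting vector are arbitrary illustrative choices) uses a \(2 \times 2\) matrix with \(\|A\|_2 = 0.5 < 1\):

```python
# Illustration of (2.1): A = diag(0.5, 0.25) has ||A||_2 = 0.5 < 1,
# so ||A^k x0||_2 should decay geometrically to zero.
import math

A = [[0.5, 0.0], [0.0, 0.25]]
x = [3.0, -7.0]  # arbitrary x0

def matvec(M, v):
    """Plain matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

norms = []
for k in range(60):
    x = matvec(A, x)
    norms.append(math.sqrt(sum(t * t for t in x)))

# Each application shrinks the 2-norm by a factor of at most ||A||_2 = 0.5.
assert all(b <= 0.5 * a + 1e-15 for a, b in zip(norms, norms[1:]))
assert norms[-1] < 1e-15
print("||A^k x0||_2 after 60 steps:", norms[-1])
```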
(2.2)
Since \(A\) is symmetric, it can be diagonalized as \(A = Q^T \Lambda Q\), where \(Q\) is an orthogonal matrix (i.e. \(Q^T Q = I\)) and \(\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)\) holds the eigenvalues of \(A\). Note that
\[
\|A\|_2 = \max_{\|\boldsymbol{x}\|_2 = 1} \|A\boldsymbol{x}\|_2.
\]
Let \(\boldsymbol{x}\) be a unit vector which maximizes the above. We have:
\[
\|A\boldsymbol{x}\|_2^2 = \boldsymbol{x}^T A^T A \boldsymbol{x} = \boldsymbol{x}^T Q^T \Lambda^2 Q \boldsymbol{x}.
\]
By substituting \(\boldsymbol{y} = Q\boldsymbol{x}\):
\[
\|A\boldsymbol{x}\|_2^2 = \boldsymbol{y}^T \Lambda^2 \boldsymbol{y} = \sum_{i=1}^{n} \lambda_i^2 y_i^2 \leq \left( \max_i |\lambda_i| \right)^2 \sum_{i=1}^{n} y_i^2 = \left( \max_i |\lambda_i| \right)^2,
\]
because \(\boldsymbol{y}^T \boldsymbol{y} = (Q\boldsymbol{x})^T (Q\boldsymbol{x}) = \boldsymbol{x}^T Q^T Q \boldsymbol{x} = \boldsymbol{x}^T \boldsymbol{x} = 1\). Hence \(\|A\|_2 \leq \max_i |\lambda_i|\). On the other hand, let \(\boldsymbol{q}\) be an eigenvector corresponding to an eigenvalue \(\lambda_{\max}\) of maximum absolute value. Of course \(\|\boldsymbol{q}\|_2 = 1\), because \(\boldsymbol{q}\) is a row of the orthogonal matrix \(Q\). Then \(\|A\boldsymbol{q}\|_2 = |\lambda_{\max}| \, \|\boldsymbol{q}\|_2 = |\lambda_{\max}|\), so \(\|A\|_2 \geq \max_i |\lambda_i|\), and equality follows.
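As a numerical sanity check of (2.2), the sketch below estimates \(\|A\|_2\) for a small symmetric matrix by power iteration (a technique not used in the proof above; the matrix is an arbitrary example with eigenvalues \(3\) and \(1\), so the estimate should approach \(3\)):

```python
# For the symmetric matrix A = [[2, 1], [1, 2]], eigenvalues are 3 and 1,
# so by (2.2) we expect ||A||_2 = 3. Power iteration finds the dominant
# eigenvector; ||Ax||_2 at the unit iterate then approximates ||A||_2.
import math

A = [[2.0, 1.0], [1.0, 2.0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def norm2(v):
    return math.sqrt(sum(t * t for t in v))

x = [1.0, 0.3]  # arbitrary start, not orthogonal to the dominant eigenvector
for _ in range(100):
    y = matvec(A, x)
    x = [t / norm2(y) for t in y]  # renormalize each step

estimate = norm2(matvec(A, x))
assert abs(estimate - 3.0) < 1e-9
print("||A||_2 estimate:", estimate)
```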
(3)
Let \(\boldsymbol{e}^{(j)} = \boldsymbol{x}^{(j)} - \boldsymbol{x}\) be the error vector, where \(\boldsymbol{x}\) is the true solution. Convergence for every initial vector means \(\lim_{j \to \infty} \boldsymbol{e}^{(j)} = \boldsymbol{0}\) for every \(\boldsymbol{e}^{(0)}\).
Subtracting \(\boldsymbol{x} = (I - A)\boldsymbol{x} + \boldsymbol{b}\) from \(\boldsymbol{x}^{(j+1)} = (I - A)\boldsymbol{x}^{(j)} + \boldsymbol{b}\), we obtain \(\boldsymbol{e}^{(j+1)} = (I - A)\, \boldsymbol{e}^{(j)}\), hence \(\boldsymbol{e}^{(j)} = (I - A)^j \boldsymbol{e}^{(0)}\). Taking the norm of both sides:
\[
\|\boldsymbol{e}^{(j)}\|_2 \leq \|I - A\|_2^j \, \|\boldsymbol{e}^{(0)}\|_2.
\]
Since \(I - A\) is symmetric, (2.2) gives \(\|I - A\|_2 = \max_i |1 - \lambda_i|\), where \(\lambda_1, \ldots, \lambda_n\) are the eigenvalues of \(A\). Thus the sequence converges to the true solution for every initial vector if \(\max_i |1 - \lambda_i| < 1\). Conversely, if \(|1 - \lambda_i| \geq 1\) for some \(i\), choosing \(\boldsymbol{x}^{(0)}\) so that \(\boldsymbol{e}^{(0)}\) is a corresponding eigenvector gives \(\|\boldsymbol{e}^{(j)}\|_2 = |1 - \lambda_i|^j \|\boldsymbol{e}^{(0)}\|_2 \not\to 0\). Therefore the necessary and sufficient condition is that every eigenvalue \(\lambda\) of \(A\) satisfies \(|1 - \lambda| < 1\), i.e. \(0 < \lambda < 2\).
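The condition can be illustrated numerically; below is a sketch in plain Python with an arbitrary symmetric \(A\) whose eigenvalues (\(1.4\) and \(0.6\)) lie in \((0, 2)\), so the iteration should converge even from a poor initial guess:

```python
# Illustration of (3): x^(j+1) = (I - A) x^(j) + b converges when all
# eigenvalues of A lie in (0, 2). Here A = [[1, 0.4], [0.4, 1]] has
# eigenvalues 1.4 and 0.6; A, b, x^(0) are arbitrary illustrative choices.
import math

A = [[1.0, 0.4], [0.4, 1.0]]
b = [1.0, 2.0]

x = [10.0, -10.0]  # deliberately bad initial guess x^(0)
for _ in range(200):
    # x <- (I - A) x + b, computed componentwise
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    x = [x[i] - Ax[i] + b[i] for i in range(2)]

# The residual ||A x - b||_2 should be numerically zero after convergence.
residual = math.sqrt(sum((sum(A[i][j] * x[j] for j in range(2)) - b[i]) ** 2
                         for i in range(2)))
assert residual < 1e-10
print("converged, residual:", residual)
```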
Knowledge
More reading: http://runge.math.smu.edu/Courses/Math3315_Spring10/iterative_linear.pdf