東京大学新領域創成科学研究科メディカル情報生命専攻 2018年8月実施問題8

Author

Description

If an $n \times n$ real symmetric matrix $\mathbf{A}$ satisfies condition $\mathbf{v}^T \mathbf{A} \mathbf{v} > 0$ for any non-zero $n$-dimensional real column vector $\mathbf{v}$, $\mathbf{A}$ is called a positive definite matrix ($\mathbf{v}^T$ represents the transpose of $\mathbf{v}$). Answer the following questions.

(1) Show the diagonal elements $A_{kk} \ (k = 1, \ldots, n)$ of real positive definite matrix $\mathbf{A}$ are positive.

(2) Show the eigenvalues of real symmetric matrix $\mathbf{A}$ are real.

(3) Show the eigenvalues of positive definite matrix $\mathbf{A}$ are positive.

(4) Let $S = \{\mathbf{v} \in \mathbb{R}^n \mid \mathbf{v}^T \mathbf{v} = 1 \}$ be the set of non-zero $n$-dimensional real column vectors $\mathbf{v}$ of unit length. Show $\lambda_1 = \max_{\mathbf{v} \in S} \mathbf{v}^T \mathbf{A} \mathbf{v}$ is the largest eigenvalue of positive definite matrix $\mathbf{A}$.

(5) Suppose the eigenvectors of positive definite matrix $\mathbf{A}$ are all different. Furthermore, suppose you know the largest eigenvalue $\lambda_1$ and its associated eigenvector $\mathbf{v}_1$. Explain how to compute the second largest eigenvalue $\lambda_2$ using $\lambda_1$ and $\mathbf{v}_1$ without computing the third largest or smaller eigenvalues.

如果一个 $n \times n$ 实对称矩阵 $\mathbf{A}$ 满足对于任意非零 $n$ 维实列向量 $\mathbf{v}$，有 $\mathbf{v}^T \mathbf{A} \mathbf{v} > 0$，则称 $\mathbf{A}$ 为正定矩阵 ($\mathbf{v}^T$ 表示 $\mathbf{v}$ 的转置)。回答以下问题。

(1) 证明实正定矩阵 $\mathbf{A}$ 的对角元素 $A_{kk} \ (k = 1, \ldots, n)$ 是正的。

(2) 证明实对称矩阵 $\mathbf{A}$ 的特征值是实数。

(3) 证明正定矩阵 $\mathbf{A}$ 的特征值是正的。

(4) 设 $S = \{\mathbf{v} \in \mathbb{R}^n \mid \mathbf{v}^T \mathbf{v} = 1 \}$ 为非零 $n$ 维实列向量 $\mathbf{v}$ 的单位长度集合。证明 $\lambda_1 = \max_{\mathbf{v} \in S} \mathbf{v}^T \mathbf{A} \mathbf{v}$ 是正定矩阵 $\mathbf{A}$ 的最大特征值。

(5) 假设正定矩阵 $\mathbf{A}$ 的特征向量都不相同。此外，假设你知道最大特征值 $\lambda_1$ 及其关联的特征向量 $\mathbf{v}_1$。解释如何在不计算第三大或更小的特征值的情况下，使用 $\lambda_1$ 和 $\mathbf{v}_1$ 计算第二大特征值 $\lambda_2$。

Kai

(1)

Consider the standard basis vector $\mathbf{e}_k$ where the $k$-th element is 1 and all other elements are 0. Then,

\[ \mathbf{e}_k^T \mathbf{A} \mathbf{e}_k = A_{kk}. \]

Since $\mathbf{A}$ is positive definite, we have:

\[ \mathbf{e}_k^T \mathbf{A} \mathbf{e}_k > 0. \]

Thus,

\[ A_{kk} > 0. \]

(2)

For a real symmetric matrix $\mathbf{A}$, consider its eigenvalue equation:

\[ \mathbf{A} \mathbf{x} = \lambda \mathbf{x}, \]

where $\mathbf{x}$ is an eigenvector and $\lambda$ is the corresponding eigenvalue. Since $\mathbf{A}$ is symmetric, we have:

\[ \mathbf{x}^T \mathbf{A} \mathbf{y} = \mathbf{y}^T \mathbf{A} \mathbf{x} \]

for any vectors $\mathbf{x}$ and $\mathbf{y}$. Setting $\mathbf{y} = \mathbf{x}$, we get:

\[ \mathbf{x}^T \mathbf{A} \mathbf{x} = \mathbf{x}^T (\lambda \mathbf{x}) = \lambda (\mathbf{x}^T \mathbf{x}). \]

Since $\mathbf{x}^T \mathbf{x} > 0$ (as $\mathbf{x} \neq 0$), $\lambda$ must be real.

(3)

Let $\mathbf{A}$ be a positive definite matrix and $\lambda$ be an eigenvalue with corresponding eigenvector $\mathbf{x}$:

\[ \mathbf{A} \mathbf{x} = \lambda \mathbf{x}. \]

Taking the transpose of $\mathbf{x}$, we have:

\[ \mathbf{x}^T \mathbf{A} \mathbf{x} = \lambda \mathbf{x}^T \mathbf{x}. \]

Since $\mathbf{A}$ is positive definite, $\mathbf{x}^T \mathbf{A} \mathbf{x} > 0$ and $\mathbf{x}^T \mathbf{x} > 0$. Thus,

\[ \lambda > 0. \]

(4)

Definitions and Setup

Let $\mathbf{A}$ be an $n \times n$ real symmetric positive definite matrix.
The set $S = \{\mathbf{v} \in \mathbb{R}^n \mid \mathbf{v}^T \mathbf{v} = 1 \}$ represents the set of unit vectors in $\mathbb{R}^n$.
We aim to show that the maximum value of the quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ over the set $S$ is the largest eigenvalue $\lambda_1$ of $\mathbf{A}$.

Step-by-Step Proof

Spectral Theorem Application: Since $\mathbf{A}$ is a real symmetric matrix, by the spectral theorem, it can be diagonalized as:

\[ \mathbf{A} = \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^T, \]

where $\mathbf{Q}$ is an orthogonal matrix (i.e., $\mathbf{Q}^T = \mathbf{Q}^{-1}$) and $\mathbf{\Lambda}$ is a diagonal matrix containing the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $\mathbf{A}$, with $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n > 0$ since $\mathbf{A}$ is positive definite.

Quadratic Form Transformation: For any unit vector $\mathbf{v} \in \mathbb{R}^n$ (i.e., $\mathbf{v} \in S$), we can express $\mathbf{v}$ in terms of the orthonormal basis formed by the columns of $\mathbf{Q}$:

$$ \mathbf{v} = \mathbf{Q} \mathbf{y},

$$

where $\mathbf{y}$ is a unit vector ($\mathbf{y}^T \mathbf{y} = 1$) in $\mathbb{R}^n$. Substituting this into the quadratic form gives:

\[ \mathbf{v}^T \mathbf{A} \mathbf{v} = (\mathbf{Q} \mathbf{y})^T \mathbf{A} (\mathbf{Q} \mathbf{y}) = \mathbf{y}^T \mathbf{Q}^T \mathbf{A} \mathbf{Q} \mathbf{y} = \mathbf{y}^T \mathbf{\Lambda} \mathbf{y}. \]

Maximization over Unit Vectors: The expression $\mathbf{y}^T \mathbf{\Lambda} \mathbf{y}$ can be written as:

\[ \mathbf{y}^T \mathbf{\Lambda} \mathbf{y} = \sum_{i=1}^n \lambda_i y_i^2, \]

where $y_i$ are the components of the vector $\mathbf{y}$. Given that $\sum_{i=1}^n y_i^2 = 1$ (since $\mathbf{y}$ is a unit vector), we need to maximize the weighted sum of the eigenvalues with respect to $y_i^2$.

Eigenvalue Maximization:

To maximize $\sum_{i=1}^n \lambda_i y_i^2$, note that the maximum value occurs when $y_1^2 = 1$ and $y_2^2 = y_3^2 = \cdots = y_n^2 = 0$ (because $\lambda_1$ is the largest eigenvalue). Hence,

\[ \max_{\mathbf{y}^T \mathbf{y} = 1} \sum_{i=1}^n \lambda_i y_i^2 = \lambda_1. \]

Therefore,

\[ \max_{\mathbf{v} \in S} \mathbf{v}^T \mathbf{A} \mathbf{v} = \lambda_1. \]

Conclusion: The maximum value of the quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ over the unit sphere $S$ is indeed the largest eigenvalue $\lambda_1$ of the positive definite matrix $\mathbf{A}$.

(5)

Given:

$\mathbf{A}$ is a positive definite matrix.
$\mathbf{A}$ has distinct eigenvalues $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$.
The largest eigenvalue $\lambda_1$ and its associated eigenvector $\mathbf{v}_1$ are known.

We aim to find the second largest eigenvalue $\lambda_2$ using $\lambda_1$ and $\mathbf{v}_1$.

Orthogonal Projection Method

Orthogonality of Eigenvectors: Since $\mathbf{A}$ is symmetric, its eigenvectors corresponding to distinct eigenvalues are orthogonal. Thus, $\mathbf{v}_1$ is orthogonal to all other eigenvectors of $\mathbf{A}$.
Defining the Subspace Orthogonal to $\mathbf{v}_1$: Let $S_1$ be the subspace of $\mathbb{R}^n$ consisting of vectors orthogonal to $\mathbf{v}_1$:

\[ S_1 = \{\mathbf{v} \in \mathbb{R}^n \mid \mathbf{v}^T \mathbf{v}_1 = 0 \}. \]

Rayleigh Quotient in the Subspace: The Rayleigh quotient $R(\mathbf{v})$ for a vector $\mathbf{v}$ is given by:

\[ R(\mathbf{v}) = \frac{\mathbf{v}^T \mathbf{A} \mathbf{v}}{\mathbf{v}^T \mathbf{v}}. \]

We need to maximize this quotient over the subspace $S_1$.

Projection Method: To find $\lambda_2$, we consider the effect of $\mathbf{v}_1$ and deflate $\mathbf{A}$ by projecting it onto the subspace orthogonal to $\mathbf{v}_1$.

Define the matrix $\mathbf{A}'$ as:

\[ \mathbf{A}' = \mathbf{A} - \lambda_1 \mathbf{v}_1 \mathbf{v}_1^T. \]

This matrix $\mathbf{A}'$ removes the influence of $\lambda_1$ and $\mathbf{v}_1$. The matrix $\mathbf{A}'$ still has the same eigenvectors as $\mathbf{A}$, but the eigenvalue $\lambda_1$ associated with $\mathbf{v}_1$ is replaced with 0.

Finding $\lambda_2$: The second largest eigenvalue $\lambda_2$ of $\mathbf{A}$ will be the largest eigenvalue of $\mathbf{A}'$ in the subspace orthogonal to $\mathbf{v}_1$.

To find $\lambda_2$, consider any vector $\mathbf{u}$ in $S_1$:

\[ \mathbf{u} = \mathbf{v} - (\mathbf{v}^T \mathbf{v}_1) \mathbf{v}_1. \]

Here, $\mathbf{u}$ is the projection of $\mathbf{v}$ onto the subspace orthogonal to $\mathbf{v}_1$. Since $\mathbf{v}$ is in $S_1$, $\mathbf{u} = \mathbf{v}$.

Maximizing the Rayleigh Quotient: The Rayleigh quotient in the subspace $S_1$ for the matrix $\mathbf{A}'$ becomes:

\[ R'(\mathbf{v}) = \frac{\mathbf{v}^T \mathbf{A}' \mathbf{v}}{\mathbf{v}^T \mathbf{v}} = \frac{\mathbf{v}^T (\mathbf{A} - \lambda_1 \mathbf{v}_1 \mathbf{v}_1^T) \mathbf{v}}{\mathbf{v}^T \mathbf{v}}. \]

Since $\mathbf{v} \perp \mathbf{v}_1$, we have $\mathbf{v}^T \mathbf{v}_1 = 0$, and thus:

\[ R'(\mathbf{v}) = \frac{\mathbf{v}^T \mathbf{A} \mathbf{v}}{\mathbf{v}^T \mathbf{v}}. \]

Therefore, the maximum value of $R'(\mathbf{v})$ in the subspace $S_1$ gives $\lambda_2$:

\[ \lambda_2 = \max_{\mathbf{v} \in S_1} \frac{\mathbf{v}^T \mathbf{A} \mathbf{v}}{\mathbf{v}^T \mathbf{v}}. \]

Conclusion: By maximizing the Rayleigh quotient in the subspace orthogonal to the eigenvector $\mathbf{v}_1$ associated with $\lambda_1$, we find the second largest eigenvalue $\lambda_2$.

Knowledge

正定矩阵特征值和特征向量 Rayleigh商正交性

难点思路

计算次大特征值的过程可能是一个难点，尤其是如何正确地使用正交性和 Rayleigh 商。

解题技巧和信息

正定矩阵的定义和性质非常重要，尤其是对特征值的影响。
计算特征值时，Rayleigh 商是一个有效工具。
正交性在分解和简化问题中非常有用。

重点词汇

Positive definite matrix 正定矩阵
Eigenvalue 特征值
Eigenvector 特征向量
Rayleigh quotient Rayleigh 商
Orthogonal 正交的

参考资料

Linear Algebra and Its Applications by Gilbert Strang, Chapter 6.
Introduction to Linear Algebra by Gilbert Strang, Chapter 7.

東京大学 新領域創成科学研究科 メディカル情報生命専攻 2018年8月実施 問題8