The University of Tokyo, Graduate School of Frontier Sciences, Department of Computational Biology and Medical Sciences, Examination Administered August 2017, Problem 8
Description
Let \(\mathbf{A}\) be an \(n \times m\) real matrix with positive rank \(r\). Such a matrix has a singular value decomposition \(\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T\), where \(\mathbf{U}\) and \(\mathbf{V}\) are \(n \times r\), \(m \times r\) real matrices, respectively, and satisfy \(\mathbf{U}^T \mathbf{U} = \mathbf{I}_r\), \(\mathbf{V}^T \mathbf{V} = \mathbf{I}_r\) (\(\mathbf{I}_d\): \(d \times d\) unit matrix, \(\mathbf{M}^T\): transpose of matrix \(\mathbf{M}\)). \(\mathbf{\Sigma}\) is an \(r \times r\) real diagonal matrix whose diagonal elements \(\Sigma_{kk} = \sigma_k\) (\(k = 1, \ldots, r\)) satisfy \(\sigma_1 \geq \cdots \geq \sigma_r > 0\).
(1) Describe all the positive eigenvalues and associated normalized eigenvectors of matrix \(\mathbf{A}^T \mathbf{A}\).
(2) Let \(T_{\mathbf{A}}: \mathbb{R}^m \to \mathbb{R}^n\) be a linear mapping defined by \(T_{\mathbf{A}} (\mathbf{x}) = \mathbf{A} \mathbf{x}\). Describe the conditions on \(n, m, r\) such that \(T_{\mathbf{A}}\) is surjective. Also, describe the conditions on \(n, m, r\) such that \(T_{\mathbf{A}}\) is injective.
(3) The pseudoinverse of \(\mathbf{A}\) is defined by \(\mathbf{A}^+ = \mathbf{V} \mathbf{\Sigma}^{-1} \mathbf{U}^T\). Let \(\mathbf{B} = (\mathbf{I}_m - \mathbf{A}^+ \mathbf{A})\) and define the linear mapping \(T_{\mathbf{B}}: \mathbb{R}^m \to \mathbb{R}^m\) by \(T_{\mathbf{B}} (\mathbf{x}) = \mathbf{B} \mathbf{x}\). Show that the image \(\mathrm{Im}(T_{\mathbf{B}}) = \{\mathbf{B} \mathbf{x} \mid \mathbf{x} \in \mathbb{R}^m\}\) is linearly isomorphic to the kernel \(\mathrm{Ker}(T_{\mathbf{A}}) = \{\mathbf{x} \in \mathbb{R}^m \mid \mathbf{A} \mathbf{x} = \mathbf{0}_d\}\) (\(\mathbf{0}_d\): \(d\)-dimensional zero vector).
(4) Show that \(\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2\) (\(\mathbf{x}_1 = \mathbf{B} \mathbf{x}\), \(\mathbf{x}_2 = (\mathbf{x} - \mathbf{x}_1)\)) is an orthogonal decomposition.
(5) For a given \(\mathbf{b} \in \mathbb{R}^n\), let \(\mathbf{x}_0 = \mathbf{A}^+ \mathbf{b} \in \mathbb{R}^m\). Show that \(\mathbf{x} = \mathbf{x}_0\) minimizes \((\mathbf{A} \mathbf{x} - \mathbf{b})^T (\mathbf{A} \mathbf{x} - \mathbf{b})\). (Hint: \(\mathbf{A} \mathbf{x} - \mathbf{b} = \mathbf{A} (\mathbf{x} - \mathbf{x}_0) + (\mathbf{A} \mathbf{x}_0 - \mathbf{b})\))
Solution
(1)
Given the singular value decomposition (SVD) of \(\mathbf{A}\) as \(\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T\), we can express \(\mathbf{A}^T \mathbf{A}\) as
\[
\mathbf{A}^T \mathbf{A} = \mathbf{V} \mathbf{\Sigma} \mathbf{U}^T \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T = \mathbf{V} \mathbf{\Sigma}^2 \mathbf{V}^T,
\]
using \(\mathbf{U}^T \mathbf{U} = \mathbf{I}_r\). The matrix \(\mathbf{\Sigma}^2\) is diagonal with diagonal elements \(\sigma_k^2\) (\(k = 1, \ldots, r\)). Multiplying on the right by \(\mathbf{V}\) and using \(\mathbf{V}^T \mathbf{V} = \mathbf{I}_r\) gives \(\mathbf{A}^T \mathbf{A} \mathbf{V} = \mathbf{V} \mathbf{\Sigma}^2\), i.e., \(\mathbf{A}^T \mathbf{A} \mathbf{v}_k = \sigma_k^2 \mathbf{v}_k\) for each column \(\mathbf{v}_k\) of \(\mathbf{V}\). Thus the positive eigenvalues of \(\mathbf{A}^T \mathbf{A}\) are exactly the \(\sigma_k^2\), and the associated normalized eigenvectors are the columns \(\mathbf{v}_k\) of \(\mathbf{V}\) (normalized since \(\mathbf{V}^T \mathbf{V} = \mathbf{I}_r\)).
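As a numerical sanity check (an illustrative NumPy sketch, not part of the proof; the 5×3 Gaussian matrix is an arbitrary full-rank example), the positive eigenvalues of \(\mathbf{A}^T \mathbf{A}\) should match the squared singular values, with the columns of \(\mathbf{V}\) as eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))        # arbitrary example: n=5, m=3, rank r=3 a.s.

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD, s sorted descending
w, _ = np.linalg.eigh(A.T @ A)                     # eigenvalues, ascending order

# Positive eigenvalues of A^T A are exactly the squared singular values.
assert np.allclose(np.sort(w)[::-1], s**2)

# The columns of V are the associated normalized eigenvectors: (A^T A) V = V Sigma^2.
V = Vt.T
assert np.allclose(A.T @ A @ V, V @ np.diag(s**2))
```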
(2)
Surjective (onto): The mapping \(T_{\mathbf{A}}: \mathbb{R}^m \to \mathbb{R}^n\) is surjective if and only if \(\mathrm{Im}(T_{\mathbf{A}})\), whose dimension is \(\mathrm{rank}(\mathbf{A}) = r\), is all of \(\mathbb{R}^n\), i.e., \(\mathbf{A}\) has full row rank. Since \(r \leq \min(n, m)\), the condition is \(r = n \leq m\).
Injective (one-to-one): The mapping \(T_{\mathbf{A}}\) is injective if and only if \(\mathrm{Ker}(T_{\mathbf{A}})\) contains only the zero vector. By rank-nullity, \(\dim \mathrm{Ker}(T_{\mathbf{A}}) = m - r\), so the condition is full column rank, \(r = m \leq n\).
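These rank conditions are easy to probe numerically; below is a minimal NumPy sketch (the shapes and the random seed are arbitrary illustrative choices):

```python
import numpy as np

def is_surjective(A: np.ndarray) -> bool:
    """T_A is onto R^n iff rank(A) equals n (full row rank, forcing n <= m)."""
    return np.linalg.matrix_rank(A) == A.shape[0]

def is_injective(A: np.ndarray) -> bool:
    """T_A is one-to-one iff rank(A) equals m (full column rank, forcing m <= n)."""
    return np.linalg.matrix_rank(A) == A.shape[1]

rng = np.random.default_rng(1)
print(is_surjective(rng.standard_normal((2, 4))))  # True: r = n = 2 <= m = 4
print(is_injective(rng.standard_normal((4, 2))))   # True: r = m = 2 <= n = 4
```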
(3)
The pseudoinverse \(\mathbf{A}^+\) is defined as \(\mathbf{A}^+ = \mathbf{V} \mathbf{\Sigma}^{-1} \mathbf{U}^T\). Consider \(\mathbf{B} = \mathbf{I}_m - \mathbf{A}^+ \mathbf{A}\).
We need to show that \(\mathrm{Im}(T_{\mathbf{B}})\) is linearly isomorphic to \(\mathrm{Ker}(T_{\mathbf{A}})\). First, using \(\mathbf{U}^T \mathbf{U} = \mathbf{I}_r\) and \(\mathbf{V}^T \mathbf{V} = \mathbf{I}_r\),
\[
\mathbf{A} \mathbf{B} = \mathbf{A} - \mathbf{A} \mathbf{A}^+ \mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T - \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T \mathbf{V} \mathbf{\Sigma}^{-1} \mathbf{U}^T \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T - \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T = \mathbf{0},
\]
so \(\mathbf{A} (\mathbf{B} \mathbf{x}) = \mathbf{0}\) for every \(\mathbf{x} \in \mathbb{R}^m\). Thus, \(\mathrm{Im}(\mathbf{B}) \subseteq \mathrm{Ker}(\mathbf{A})\).
Now, consider \(\mathbf{x} \in \mathrm{Ker}(\mathbf{A})\). Then \(\mathbf{A} \mathbf{x} = \mathbf{0}\), and
\[
\mathbf{B} \mathbf{x} = \mathbf{x} - \mathbf{A}^+ \mathbf{A} \mathbf{x} = \mathbf{x} - \mathbf{A}^+ \mathbf{0} = \mathbf{x},
\]
so \(\mathbf{x} = \mathbf{B} \mathbf{x} \in \mathrm{Im}(\mathbf{B})\). Therefore \(\mathrm{Im}(\mathbf{B}) = \mathrm{Ker}(\mathbf{A})\); since the two subspaces are equal, the identity map is a linear isomorphism between them.
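Both inclusions can be checked numerically; here is a NumPy sketch (the wide 3×5 random matrix is an arbitrary example, chosen so that \(\mathrm{Ker}(\mathbf{A})\) is nontrivial):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))           # n=3, m=5: dim Ker(A) = 2 a.s.
B = np.eye(5) - np.linalg.pinv(A) @ A     # B = I_m - A^+ A

# A B = 0, so Im(B) is contained in Ker(A).
assert np.allclose(A @ B, 0)

# Conversely, B fixes kernel vectors: the last rows of V^T in a full SVD
# span Ker(A); pick one such x and check that B x = x.
_, _, Vt = np.linalg.svd(A)               # full SVD, Vt is 5 x 5
x = Vt[-1]
assert np.allclose(A @ x, 0) and np.allclose(B @ x, x)
```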
(4)
Given \(\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2\) where \(\mathbf{x}_1 = \mathbf{B} \mathbf{x}\) and \(\mathbf{x}_2 = \mathbf{x} - \mathbf{x}_1 = \mathbf{A}^+ \mathbf{A} \mathbf{x}\), we must show that \(\mathbf{x}_1^T \mathbf{x}_2 = 0\).
Note that \(\mathbf{A}^+ \mathbf{A} = \mathbf{V} \mathbf{\Sigma}^{-1} \mathbf{U}^T \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T = \mathbf{V} \mathbf{V}^T\), so \(\mathbf{B} = \mathbf{I}_m - \mathbf{V} \mathbf{V}^T\) is symmetric (\(\mathbf{B}^T = \mathbf{B}\)) and idempotent:
\[
\mathbf{B}^2 = \mathbf{I}_m - 2 \mathbf{V} \mathbf{V}^T + \mathbf{V} \mathbf{V}^T \mathbf{V} \mathbf{V}^T = \mathbf{I}_m - \mathbf{V} \mathbf{V}^T = \mathbf{B}.
\]
Hence
\[
\mathbf{x}_1^T \mathbf{x}_2 = (\mathbf{B} \mathbf{x})^T (\mathbf{x} - \mathbf{B} \mathbf{x}) = \mathbf{x}^T \mathbf{B}^T \mathbf{x} - \mathbf{x}^T \mathbf{B}^T \mathbf{B} \mathbf{x} = \mathbf{x}^T \mathbf{B} \mathbf{x} - \mathbf{x}^T \mathbf{B}^2 \mathbf{x} = 0.
\]
Thus \(\mathbf{x}_1\) and \(\mathbf{x}_2\) are orthogonal, and \(\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2\) is an orthogonal decomposition.
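Numerically, \(\mathbf{B}\) behaves as an orthogonal projector and the two components are perpendicular; a quick NumPy sketch (sizes and seed are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))
B = np.eye(5) - np.linalg.pinv(A) @ A

x = rng.standard_normal(5)
x1 = B @ x            # component in Ker(A)
x2 = x - x1           # equals A^+ A x, the row-space component

# B is symmetric and idempotent (an orthogonal projector) ...
assert np.allclose(B, B.T) and np.allclose(B @ B, B)
# ... so the two components are orthogonal.
assert np.isclose(x1 @ x2, 0)
```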
(5)
Let \(\mathbf{x}_0 = \mathbf{A}^+ \mathbf{b}\). Following the hint, decompose the residual as
\[
\mathbf{A} \mathbf{x} - \mathbf{b} = \mathbf{A} (\mathbf{x} - \mathbf{x}_0) + (\mathbf{A} \mathbf{x}_0 - \mathbf{b}).
\]
Note that \(\mathbf{A} \mathbf{x}_0 = \mathbf{A} \mathbf{A}^+ \mathbf{b}\) need not equal \(\mathbf{b}\) in general; instead we show that the two terms are orthogonal. Since \(\mathbf{A} \mathbf{A}^+ = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T \mathbf{V} \mathbf{\Sigma}^{-1} \mathbf{U}^T = \mathbf{U} \mathbf{U}^T\),
\[
\mathbf{A}^T (\mathbf{A} \mathbf{x}_0 - \mathbf{b}) = \mathbf{A}^T (\mathbf{U} \mathbf{U}^T - \mathbf{I}_n) \mathbf{b} = (\mathbf{V} \mathbf{\Sigma} \mathbf{U}^T \mathbf{U} \mathbf{U}^T - \mathbf{V} \mathbf{\Sigma} \mathbf{U}^T) \mathbf{b} = \mathbf{0}_m,
\]
so the cross term vanishes:
\[
(\mathbf{A} (\mathbf{x} - \mathbf{x}_0))^T (\mathbf{A} \mathbf{x}_0 - \mathbf{b}) = (\mathbf{x} - \mathbf{x}_0)^T \mathbf{A}^T (\mathbf{A} \mathbf{x}_0 - \mathbf{b}) = 0.
\]
Therefore
\[
(\mathbf{A} \mathbf{x} - \mathbf{b})^T (\mathbf{A} \mathbf{x} - \mathbf{b}) = \|\mathbf{A} (\mathbf{x} - \mathbf{x}_0)\|^2 + \|\mathbf{A} \mathbf{x}_0 - \mathbf{b}\|^2 \geq \|\mathbf{A} \mathbf{x}_0 - \mathbf{b}\|^2,
\]
with equality at \(\mathbf{x} = \mathbf{x}_0\). Hence \(\mathbf{x} = \mathbf{x}_0\) minimizes \((\mathbf{A} \mathbf{x} - \mathbf{b})^T (\mathbf{A} \mathbf{x} - \mathbf{b})\).
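The conclusion agrees with NumPy's built-in least-squares solver; a sketch (the 6×3 system and right-hand side are arbitrary examples, and \(\mathbf{b}\) generically lies outside \(\mathrm{Im}(\mathbf{A})\)):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)                 # generically b is not in Im(A)

x0 = np.linalg.pinv(A) @ b                 # x0 = A^+ b

# x0 matches the minimizer returned by NumPy's least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x0, x_lstsq)

# And no perturbed point achieves a smaller residual.
x = x0 + rng.standard_normal(3)
assert np.linalg.norm(A @ x0 - b) <= np.linalg.norm(A @ x - b)
```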
Knowledge
Singular value decomposition, linear mappings, pseudoinverse (generalized inverse), orthogonal decomposition, linear algebra
Key Vocabulary
- singular value decomposition (SVD)
- pseudoinverse (generalized inverse)
- surjective
- injective
- orthogonal decomposition