Skip to content

東京大学 情報理工学系研究科 コンピュータ科学専攻 2022年2月実施 問題1

Author

zephyr

Description

For a matrix denoted by \(\mathbf{M}\), we write \(\mathbf{M}_{i:j,k:l}\) for the submatrix formed from the rows \(i\) through \(j\) and the columns from \(k\) through \(l\) of the matrix. We also denote \(\mathbf{M}_{i:i,k:l}\) and \(\mathbf{M}_{i:j,k:k}\) by \(\mathbf{M}_{i,k:l}\) and \(\mathbf{M}_{i:j,k}\), respectively. (Note that they are row and column vectors, respectively.)

Assume below that \(\mathbf{A}\) is an \(n \times n\) nonsingular matrix which can be LU factorized. In addition, assume that a positive integer \(w\) exists such that all entries in the \(k\)-th sub- and super-diagonal of \(\mathbf{A}\) are zero if \(k > w\). Namely, if \(|i - j| > w\), \(A_{i,j}\) is zero.

We denote the LU factorization of \(\mathbf{A}\) by \(\mathbf{A} = \mathbf{LU}\), where \(\mathbf{L}\) is a lower triangular matrix with unit diagonal elements, and \(\mathbf{U}\) is an upper triangular matrix.

The following algorithm P computes \(\mathbf{L}\) and \(\mathbf{U}\) from the input \(\mathbf{A}\) in-place in such a way that the strictly lower triangular part and the upper triangular part of \(\mathbf{A}\) will be \(\mathbf{L}\) and \(\mathbf{U}\), respectively. In other words, \(A_{i,j}\) will be \(L_{i,j}\) for \(i > j\), and \(A_{i,j}\) will be \(U_{i,j}\) for \(i \leq j\), when it terminates.

Algorithm P

1
2
3
4
for (k = 1; k < n; k = k + 1) {
    A[k+1:n,k] <- A[k+1:n,k] * (1 / A[k,k]);
    A[k+1:n,k+1:n] <- A[k+1:n,k+1:n] - A[k+1:n,k] * A[k,k+1:n];
}

Answer the following questions.

(1) Compute the LU factorization of the following matrix.

\[ \begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 1 & 0 \\ 0 & -2 & 2 & 1 \\ 0 & 0 & -3 & 3 \end{pmatrix} \]

(2) Prove that \(L_{i,j} = 0\) and \(U_{i,j} = 0\) if \(|i - j| > w\).

(3) Assume that \(n \gg 1\), \(w \ll n\), and \(m \gg n\). We wish to solve a set of linear systems,

\[ \mathbf{A}\mathbf{x}_i = \mathbf{b}_i, \quad i = 1, 2, \ldots, m, \]

where \(\mathbf{b}_i \, (i = 1, 2, \ldots, m)\) are the constant vectors, and \(\mathbf{x}_i \, (i = 1, 2, \ldots, m)\) are the unknown vectors. Explain the relative merits of the following methods (I) and (II) with respect to the amount of arithmetic operations.

  • (I) Compute \(\mathbf{A}^{-1}\) first, and then compute each \(\mathbf{x}_i\) as \(\mathbf{A}^{-1} \mathbf{b}_i\).
  • (II) Compute the LU factorization of \(\mathbf{A}\) first, and then solve \(\mathbf{L} \mathbf{y}_i = \mathbf{b}_i\) for \(\mathbf{y}_i\). Solve \(\mathbf{U} \mathbf{x}_i = \mathbf{y}_i\) for \(\mathbf{x}_i\) lastly.

对于矩阵 \(\mathbf{M}\),我们写 \(\mathbf{M}_{i:j,k:l}\) 表示从矩阵的第 \(i\) 行到第 \(j\) 行以及第 \(k\) 列到第 \(l\) 列构成的子矩阵。我们还将 \(\mathbf{M}_{i:i,k:l}\)\(\mathbf{M}_{i:j,k:k}\) 分别表示为 \(\mathbf{M}_{i,k:l}\)\(\mathbf{M}_{i:j,k}\)。(注意它们分别是行向量和列向量。)

假设 \(\mathbf{A}\) 是一个 \(n \times n\) 的非奇异矩阵,可以进行 LU 分解。另外,假设存在一个正整数 \(w\),使得当 \(|i - j| > w\) 时,\(\mathbf{A}\) 在第 \(k\) 条下对角线和超对角线上的所有元素都为零。即,如果 \(|i - j| > w\),则 \(A_{i,j}\) 为零。

我们用 \(\mathbf{A} = \mathbf{LU}\) 来表示 \(\mathbf{A}\) 的 LU 分解,其中 \(\mathbf{L}\) 是具有单位对角元素的下三角矩阵,\(\mathbf{U}\) 是上三角矩阵。

下面的算法 P 从输入 \(\mathbf{A}\) 中就地计算 \(\mathbf{L}\)\(\mathbf{U}\),以这样的方式,\(\mathbf{A}\) 的严格下三角部分和上三角部分将分别是 \(\mathbf{L}\)\(\mathbf{U}\)。换句话说,当它终止时,\(A_{i,j}\) 将为 \(L_{i,j}\)(当 \(i > j\))和 \(U_{i,j}\)(当 \(i \leq j\))。

算法 P

1
2
3
4
for (k = 1; k < n; k = k + 1) {
    A[k+1:n,k] <- A[k+1:n,k] * (1 / A[k,k]);
    A[k+1:n,k+1:n] <- A[k+1:n,k+1:n] - A[k+1:n,k] * A[k,k+1:n];
}

回答以下问题。

(1) 计算以下矩阵的 LU 分解。

\[ \begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 1 & 0 \\ 0 & -2 & 2 & 1 \\ 0 & 0 & -3 & 3 \end{pmatrix} \]

(2) 证明 \(L_{i,j} = 0\)\(U_{i,j} = 0\) 如果 \(|i - j| > w\)

(3) 假设 \(n \gg 1\)\(w \ll n\),并且 \(m \gg n\)。我们希望求解一组线性系统,

\[ \mathbf{A} \mathbf{x}_i = \mathbf{b}_i, \quad i = 1, 2, \ldots, m, \]

其中 \(\mathbf{b}_i \, (i = 1, 2, \ldots, m)\) 是常量向量,\(\mathbf{x}_i \, (i = 1, 2, \ldots, m)\) 是未知向量。解释以下方法 (I) 和 (II) 在算术运算量方面的相对优点。

  • (I) 首先计算 \(\mathbf{A}^{-1}\),然后将每个 \(\mathbf{x}_i\) 计算为 \(\mathbf{A}^{-1} \mathbf{b}_i\)
  • (II) 首先计算 \(\mathbf{A}\) 的 LU 分解,然后解 \(\mathbf{L} \mathbf{y}_i = \mathbf{b}_i\) 得到 \(\mathbf{y}_i\)。最后解 \(\mathbf{U} \mathbf{x}_i = \mathbf{y}_i\)

Kai

(1)

We start by applying the given algorithm P to compute the LU factorization of \(\mathbf{A}\).

Step-by-step computation:

  1. Initial matrix \(\mathbf{A}\):
\[ \mathbf{A} = \begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 1 & 0 \\ 0 & -2 & 2 & 1 \\ 0 & 0 & -3 & 3 \end{pmatrix} \]
  1. First iteration (\(k = 1\)):
  2. Update \(A[2:4,1]\) by dividing by \(A[1,1]\):
\[ A[2:4,1] \leftarrow A[2:4,1] \times (1 / A[1,1]) = \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} \times 1 = \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} \]
  • Update \(A[2:4,2:4]\):
\[ A[2:4,2:4] \leftarrow A[2:4,2:4] - A[2:4,1] \times A[1,2:4] = \begin{pmatrix} 1 & 1 & 0 \\ -2 & 2 & 1 \\ 0 & -3 & 3 \end{pmatrix} - \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} \times \begin{pmatrix} 1 & 0 & 0 \end{pmatrix} \]

Simplifying:

\[ A[2:4,2:4] = \begin{pmatrix} 2 & 1 & 0 \\ -2 & 2 & 1 \\ 0 & -3 & 3 \end{pmatrix} \]
  1. Second iteration (\(k = 2\)):
  2. Update \(A[3:4,2]\) by dividing by \(A[2,2]\):
\[ A[3:4,2] \leftarrow A[3:4,2] \times (1 / A[2,2]) = \begin{pmatrix} -2 \\ 0 \end{pmatrix} \times \frac{1}{2} = \begin{pmatrix} -1 \\ 0 \end{pmatrix} \]
  • Update \(A[3:4,3:4]\):
\[ A[3:4,3:4] \leftarrow A[3:4,3:4] - A[3:4,2] \times A[2,3:4] = \begin{pmatrix} 2 & 1 \\ -3 & 3 \end{pmatrix} - \begin{pmatrix} -1 \\ 0 \end{pmatrix} \times \begin{pmatrix} 1 & 0 \end{pmatrix} \]
1
Simplifying:
\[ A[3:4,3:4] = \begin{pmatrix} 3 & 1 \\ -3 & 3 \end{pmatrix} \]
  1. Third iteration (\(k = 3\)):
  2. Update \(A[4,3]\) by dividing by \(A[3,3]\):
\[ A[4,3] \leftarrow A[4,3] \times (1 / A[3,3]) = -3 \times \frac{1}{3} = -1 \]
  • Update \(A[4,4]\):
\[ A[4,4] \leftarrow A[4,4] - A[4,3] \times A[3,4] = 3 - (-1) \times 1 = 4 \]

The resulting matrices \(\mathbf{L}\) and \(\mathbf{U}\) are:

\[ \mathbf{L} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}, \quad \mathbf{U} = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 4 \end{pmatrix} \]

(2)

Given that \(\mathbf{A}\) is banded with bandwidth \(w\), meaning \(A_{i,j} = 0\) for \(|i - j| > w\). The LU factorization \(\mathbf{A} = \mathbf{LU}\) maintains this band structure:

  • Lower Triangular Matrix \(\mathbf{L}\):
  • By definition, \(\mathbf{L}\) has non-zero entries only in the lower triangular part and on the diagonal. For \(|i - j| > w\), \(L_{i,j} = 0\) because \(A_{i,j} = 0\) for these entries. Hence, \(\mathbf{L}\) also maintains the band structure.

  • Upper Triangular Matrix \(\mathbf{U}\):

  • Similarly, \(\mathbf{U}\) has non-zero entries only in the upper triangular part. For \(|i - j| > w\), \(U_{i,j} = 0\) because \(A_{i,j} = 0\) for these entries. Hence, \(\mathbf{U}\) maintains the band structure.

Therefore, \(L_{i,j} = 0\) and \(U_{i,j} = 0\) if \(|i - j| > w\).

(3)

Method (I): Compute \(\mathbf{A}^{-1}\) first, then \(\mathbf{x}_i = \mathbf{A}^{-1} \mathbf{b}_i\)

  • Steps:
  • Compute \(\mathbf{A}^{-1}\), which requires \(O(n^3)\) operations.
  • Compute each \(\mathbf{x}_i = \mathbf{A}^{-1} \mathbf{b}_i\), which requires \(O(n^2)\) operations per \(\mathbf{x}_i\).
  • Total operations for \(m\) systems: \(O(n^3) + m \cdot O(n^2) = O(n^3 + mn^2)\).

Method (II): Compute the LU factorization, then solve \(\mathbf{L} \mathbf{y}_i = \mathbf{b}_i\) and \(\mathbf{U} \mathbf{x}_i = \mathbf{y}_i\)

  • Steps:
  • Compute the LU factorization of \(\mathbf{A}\), which requires \(O(n^3)\) operations.
  • Solve \(\mathbf{L} \mathbf{y}_i = \mathbf{b}_i\) and \(\mathbf{U} \mathbf{x}_i = \mathbf{y}_i\) for each \(\mathbf{x}_i\), which requires \(O(n^2)\) operations per \(\mathbf{x}_i\).

  • Total operations for \(m\) systems: \(O(n^3) + m \cdot O(n^2) = O(n^3 + mn^2)\).

Comparison

  • Arithmetic Operations:
  • Both methods have the same asymptotic complexity \(O(n^3 + mn^2)\).

  • Efficiency in Practice:

  • Method (II) is generally more efficient and numerically stable because it avoids directly computing the inverse of \(\mathbf{A}\), which can be computationally expensive and prone to numerical errors.

  • Storage Requirements:

  • Method (I) requires storing \(\mathbf{A}^{-1}\), which can be memory-intensive.
  • Method (II) only requires storing the LU factors, which is typically more storage-efficient.

Conclusion: Method (II) is preferred due to its numerical stability and potentially lower storage requirements, even though both methods have the same theoretical computational complexity.

Knowledge

LU分解 稀疏矩阵 复杂度分析 递归 内存效率

难点解题思路

对大型稀疏矩阵,直接计算逆矩阵通常会引入数值不稳定性,且计算开销较大。LU 分解通过三角矩阵简化了问题,解决了计算复杂度和数值稳定性的问题。

解题技巧和信息

对于稀疏矩阵的 LU 分解,注意带状结构的保持,有助于减少计算量。在求解线性方程组时,优先考虑分解矩阵的方法以优化计算效率。

重点词汇

  • LU factorization LU 分解
  • band matrix 带状矩阵
  • numerical stability 数值稳定性
  • computational complexity 计算复杂度

参考资料

  1. Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations. Johns Hopkins University Press. Chap. 3.
  2. Trefethen, L. N., & Bau, D. (1997). Numerical Linear Algebra. SIAM. Chap. 21.