東京大学 新領域創成科学研究科 メディカル情報生命専攻 2017年8月実施 問題12
Author
Description
For two strings
Here,
(1) Let
(2) Compute matrices
(3) Suppose matrix
(4) Suppose matrices
对于长度为
这里,
(1) 令
(2) 计算
(3) 假设给定矩阵
(4) 假设给定矩阵
Kai
(1)
Recurrence for
The value of
- If
, then . - If
, then .
This can be summarized as:
Recurrence for
The value of
- If
, then . - If
, then .
This can be summarized as:
(2)
Matrix
Let's fill the matrix
x\y | A | C | A | C | G |
---|---|---|---|---|---|
A | 1 | 1 | 1 | 1 | 1 |
C | 1 | 2 | 2 | 2 | 2 |
T | 1 | 2 | 2 | 2 | 2 |
G | 1 | 2 | 2 | 2 | 3 |
G | 1 | 2 | 2 | 2 | 3 |
Matrix
Let's fill the matrix
x\y | G | C | A | C | A |
---|---|---|---|---|---|
G | 1 | 1 | 1 | 1 | 1 |
G | 1 | 1 | 1 | 1 | 1 |
T | 1 | 1 | 1 | 1 | 1 |
C | 1 | 2 | 2 | 2 | 2 |
A | 1 | 2 | 3 | 3 | 3 |
(3)
To extract one of the longest common subsequences from the matrix
Explanation
- Start from the bottom-right corner of the matrix
, i.e., . - Compare characters of
and : - If
, include in the result and move diagonally to . - If
, move in the direction that gives the larger value (either up or left).
- If
- Continue this process until reaching the top-left corner of the matrix.
- The result will be one of the longest common subsequences.
Pseudocode
function getLCS(x, y, alpha)
i = length(x)
j = length(y)
lcs = ""
while i > 0 and j > 0
if x[i] == y[j]
lcs = x[i] + lcs
i = i - 1
j = j - 1
else if alpha[i-1, j] >= alpha[i, j-1]
i = i - 1
else
j = j - 1
return lcs
(4)
Given matrices
Explanation
- For each position
in : - Combine the length of the prefix up to
( ) with the length of the suffix from onwards ( ) as the length of the common subsequence. - If
, since is included in both the prefix and suffix, subtract 1 from the total length.
- Combine the length of the prefix up to
- The maximum value obtained through this process gives the desired length.
Pseudocode
function maxLengthWithPosition(x, y, alpha, beta, i)
maxLength = 0
for j = 1 to length(y)
currentLength = alpha[i, j] + beta[n-i+1, m-j+1]
if x[i] == y[j]
currentLength = currentLength - 1
maxLength = max(maxLength, currentLength)
return maxLength
Knowledge
动态规划 最长公共子序列 递归
难点思路
对于递归关系的理解和矩阵填充的具体实现可能会比较复杂,需要仔细考虑每一步的递推关系。
解题技巧和信息
- 动态规划表格填充方法:先初始化,然后按照递推关系逐步填充。
- 递归关系需要对字符串字符的匹配情况进行详细考虑,以确保递推关系的正确性。
重点词汇
- common subsequence 公共子序列
- recurrence relation 递推关系
- prefix 前缀
- suffix 后缀
- dynamic programming 动态规划
参考资料
- Introduction to Algorithms, 3rd Edition, Cormen et al., Chapter 15.