東京大学 新領域創成科学研究科 メディカル情報生命専攻 2024年1月実施 問題12
Author
Description
ナノポアシークエンサーは DNA 配列を電流値の変化を通して読み取ることができる。
ある DNA 配列を読み取ったところ、時間
DNA 配列を読み取る速度にはばらつきがあり急な変動があるため、基準配列を探す場合には時間軸方向へのずれを許したい。
このため、

(1)
(2)
(3)
(4)
(5) 基準配列
Kai
1. Finding for
When
- Initialize
to 1. - For each
, compute the dissimilarity . - Find the index
that minimizes .
The resulting
Algorithm
Input: A sequence {A_t}, reference value B_1
Output: Index p_1 that minimizes dissimilarity
1. Initialize min_dissimilarity = infinity
2. Initialize p_1 = 1
3. For i = 1 to n do
d_i = (B_1 - A_i)^2
If d_i < min_dissimilarity then
min_dissimilarity = d_i
p_1 = i
4. Return p_1
2. Finding for
When
- Initialize minimum dissimilarity to infinity and
to 1. - For each
, do: - For each
, do: - Compute the dissimilarity
. - If
, update , , and .
- Compute the dissimilarity
- For each
Algorithm
Input: A sequence {A_t}, reference values {B_1, B_2}, window W
Output: Indices p_1, p_2 that minimize dissimilarity
1. Initialize min_dissimilarity = infinity
2. Initialize p_1, p_2 = 1
3. For i = 1 to n do
For j = i to min(i + W, n) do
d_ij = (B_1 - A_i)^2 + (B_2 - A_j)^2
If d_ij < min_dissimilarity then
min_dissimilarity = d_ij
p_1 = i
p_2 = j
4. Return p_1, p_2
3. Physical Meaning of
The parameter
4. Recursive Formula for
The minimum dissimilarity
5. Algorithm for Minimum Dissimilarity for
To calculate the minimum dissimilarity between the sequence
- Initialize a 2D array
of size with infinity, where represents the minimum dissimilarity for the first elements of the reference sequence mapped to the first elements of the output sequence. - Set
for all . - For each
, do: - For each
, do: - Set
.
- Set
- For each
Time Complexity
The time complexity of this algorithm is
is the length of the reference signal, is the length of the sequence , is the maximum allowed shift between adjacent points.
Knowledge
难点思路
本题的主要难点在于第 4 和第 5 问,需要正确地推导出动态规划的递推关系,并设计出高效的算法。关键是要理解如何处理时间轴偏移的约束条件,以及如何在保证正确性的同时优化算法的时间复杂度。
解题技巧
对于序列对齐问题,动态规划是一种有效的解决方案。特别是在处理时间序列中的波动和对齐问题时,可以灵活调整对齐窗口
重点词汇
- Dissimilarity: 相异度
- Reference Signal: 参考信号
- Window: 窗口
参考资料
- Dynamic Time Warping Algorithm: Introduction to the dynamic time warping algorithm, Wikipedia.
- Time Series Analysis and Its Applications, Third Edition, Robert H. Shumway and David S. Stoffer.