跳到主要内容

京都大学 情報学研究科 知能情報学専攻 2024年2月実施 基礎科目 F2-2

Author

itsuitsuki, 祭音Myyura (assisted by ChatGPT 5.4 Thinking)

Description (English)

We represent an array of length , whose elements are alphabet characters. We call a pattern. An element of can be accessed by , where is an index. denotes a contiguous subarray of starting from index to inclusively, where . denotes the -character prefix of , while is the empty string and . The prefix function for a pattern returns an array of length , such that each element with an index is computed by

where denotes that is a suffix of . That is, is the length of the longest prefix of that is also a proper suffix of . Note that a proper suffix cannot be the whole string.

(1) Compute the results of the prefix function for the pattern aabaacaabaa.

(2) Algorithm 1 is a pseudo-code of an algorithm for computing the results of the prefix function for a pattern . Fill the blanks (a), (b), and (c).

Algorithm 1 COMPUTE-PREFIX-FUNCTION(P):

m = P.length
let π be a new array for keeping the results of the prefix function
π[1] = 0
k = 0
for q = 2 to m do
while k > 0 and P[k + 1] ≠ P[q] do
(a)
end while
if P[k + 1] = P[q] then
(b)
end if
(c)
end for
return π

We represent an array of length and an array of length . The elements of both and are alphabet characters. We call a text and a pattern. We say that a pattern occurs with a shift in a text if and (that is, if , for ). The string-matching problem is the problem of finding all shifts with which a given pattern occurs in a given text .

(3) Algorithm 2 is a pseudo-code of an algorithm for the string-matching problem utilizing the results of the prefix function computed with Algorithm 1. Fill the blanks (d), (e), and (f).

Algorithm 2 STRING-MATCHING(T, P):

n = T.length
m = P.length
π = COMPUTE-PREFIX-FUNCTION(P)
q = 0
for i = 1 to n do
while q > 0 and P[q + 1] ≠ T[i] do
(d)
end while
if P[q + 1] == T[i] then
(e)
end if
if q == m then
print "Pattern occurs with shift" i - m
(f)
end if
end for

(4) Show the time complexity of Algorithm 2 with reasons.

Kai

Q.1

(1) Prefix values for P = aabaacaabaa

Let

P = a a b a a c a a b a a
1 2 3 4 5 6 7 8 9 10 11

The prefix-function values are

π[1..11] = 0, 1, 0, 1, 2, 0, 1, 2, 3, 4, 5.

A compact table is:

q1234567891011
P[q]aabaacaabaa
π[q]01012012345

(2) Fill the blanks of Algorithm 1

This is the standard prefix-function procedure.

  • (a)
k = π[k]
  • (b)
k = k + 1
  • (c)
π[q] = k

Q2

(3) Fill the blanks of Algorithm 2

This is the standard KMP string-matching algorithm.

  • (d)
q = π[q]
  • (e)
q = q + 1
  • (f)
q = π[q]

(4) Time complexity of Algorithm 2

Once the array π has already been computed, Algorithm 2 runs in

O(n)

time.

Reason:

  • The outer for loop runs n times.
  • Inside the loop, q can increase by at most 1 in each iteration.
  • Every time the while loop executes, q is replaced by π[q], which is strictly smaller than q.
  • Therefore the total number of decreases of q over the whole execution is at most the total number of increases of q, which is O(n).

Hence the total running time of Algorithm 2 is linear in the text length.

If the preprocessing step COMPUTE-PREFIX-FUNCTION(P) is also included, then the total time becomes

O(m + n).