東京大学 情報理工学系研究科 コンピュータ科学専攻 2021年2月実施 問題3
Author
Description
Let \(\Sigma\) be the set \(\{a, b\}\) of letters. Given two languages \(L_1, L_2 \subseteq \Sigma^*\), we define \(L_1 \triangleleft L_2\) by:
For example, if \(L_1 = \{ab, bb\}\) and \(L_2 = \{aa, abb, bbab\}\), then
For a finite automaton \(\mathbf{A}\), we write \(\mathbf{L(A)}\) for the language accepted by \(\mathbf{A}\).
Answer the following questions.
(1) Let \(L_3 = \{aa, b, bb\}\) and \(L_4 = \{a, b, ab, bb, aaa, bbab\}\). Give the set \(L_3 \triangleleft L_4\).
(2) Let \(L_5\) and \(L_6\) be the languages expressed by the regular expressions \((a^*b)^*\) and \((abba)^*\), respectively. Express \(L_5 \triangleleft L_6\) by using a regular expression.
(3) Let \(\mathbf{A_1} = (Q_1, \Sigma, \delta_1, q_{1,0}, F_1)\) and \(\mathbf{A_2} = (Q_2, \Sigma, \delta_2, q_{2, 0}, F_2)\) be deterministic finite automata. Here, \(Q_i, \delta_i, q_{i, 0},\) and \(F_i\) are the set of states, the transition function, the initial state, and the set of final states of \(\mathbf{A_i}\) (\(i \in \{1, 2\}\)), respectively. Assume that the transition functions \(\delta_i : Q_i \times \Sigma \rightarrow Q_i\) (\(i \in \{1, 2\}\)) are total. Give a non-deterministic finite automaton that accepts \(\mathbf{L(A_1)} \triangleleft \mathbf{L(A_2)}\), with a brief explanation. You may use \(\epsilon\)-transitions.
(4) Answer whether the following statement is true:
- "For every context-free language \(L\) and regular language \(R\), \(L \triangleleft R\) is a regular language."
Also, give a proof sketch if the answer is yes, and give a counterexample if the answer is no.
设 \(\Sigma\) 为字母集合 \(\{a, b\}\)。给定两个语言 \(L_1, L_2 \subseteq \Sigma^*\),我们定义 \(L_1 \triangleleft L_2\) 如下:
例如,如果 \(L_1 = \{ab, bb\}\) 且 \(L_2 = \{aa, abb, bbab\}\),则
对于一个有限自动机 \(\mathbf{A}\),我们用 \(\mathbf{L(A)}\) 表示 \(\mathbf{A}\) 接受的语言。回答以下问题。
(1) 设 \(L_3 = \{aa, b, bb\}\) 和 \(L_4 = \{a, b, ab, bb, aaa, bbab\}\)。给出集合 \(L_3 \triangleleft L_4\)。
(2) 设 \(L_5\) 和 \(L_6\) 分别由正则表达式 \((a^*b)^*\) 和 \((abba)^*\) 表示。用正则表达式表示 \(L_5 \triangleleft L_6\)。
(3) 设 \(\mathbf{A_1} = (Q_1, \Sigma, \delta_1, q_{1, 0}, F_1)\) 和 \(\mathbf{A_2} = (Q_2, \Sigma, \delta_2, q_{2, 0}, F_2)\) 为确定性有限自动机。这里,\(Q_i, \delta_i, q_{i, 0},\) 和 \(F_i\) 是 \(\mathbf{A_i}\) 的状态集合、转换函数、初态和终态集合(\(i \in \{1, 2\}\))。假设转换函数 \(\delta_i : Q_i \times \Sigma \rightarrow Q_i\)(\(i \in \{1, 2\}\))是全函数。给出一个非确定性有限自动机,该自动机接受 \(\mathbf{L(A_1)} \triangleleft \mathbf{L(A_2)}\),并简要解释。你可以使用 \(\epsilon\)-转换。
(4) 回答以下陈述是否正确:
- “对于每个上下文无关语言 \(L\) 和正则语言 \(R\),\(L \triangleleft R\) 是正则语言。”
如果答案是肯定的,请给出证明草图;如果答案是否定的,请给出反例。
Kai
(1)
Let \(L_3 = \{aa, b, bb\}\) and \(L_4 = \{a, b, ab, bb, aaa, bbab\}\). We need to find the set \(L_3 \triangleleft L_4\).
We check each element \(v \in L_3\):
- For \(v = aa\):
- \(v \cdot w = aa \cdot w \in L_4\)
- Possible \(w\) are \(\epsilon, a\)
- For \(v = b\):
- \(v \cdot w = b \cdot w \in L_4\)
- Possible \(w\) are \(\epsilon,b, ab, bab\)
- For \(v = bb\):
- \(v \cdot w = bb \cdot w \in L_4\)
- Possible \(w\) are \(\epsilon, ab\)
Collecting all possible \(w\):
(2)
Let \(L_5 = (a^*b)^*\) and \(L_6 = (abba)^*\). We need to express \(L_5 \triangleleft L_6\) using a regular expression.
\(L_5 \triangleleft L_6 = \{w \in \Sigma^* \mid \exists v \in L_5 \text{ such that } vw \in L_6\}\)
Let's analyze this step by step:
- Words in \(L_5\) are of the form \((a^{n_1}ba^{n_2}b…a^{n_k}b)\) where \(k \geq 0\) and \(n_i \geq 0\) for all \(i\).
- Words in \(L_6\) are repetitions of \(abba\).
- The possible prefixes from \(L_5\) that could start a word in \(L_6\) are:
- \(\epsilon\) (empty string)
- \(ab\)
- \(abb\)
Now, let's consider what \(w\) would be in each case:
- If \(v = \epsilon\), then \(w\) can be any word in \(L_6\), i.e., \((abba)^*\)
- If \(v = ab\), then \(w\) must start with \(ba\) and then continue with \((abba)^*\), i.e., \(ba(abba)^*\)
- If \(v = abb\), then \(w\) must start with \(a\) and then continue with \((abba)^*\), i.e., \(a(abba)^*\)
Therefore, \(L_5 \triangleleft L_6\) can be expressed as the union of these possibilities:
\(L_5 \triangleleft L_6 = (abba)^* \cup ba(abba)^* \cup a(abba)^*\)
This can be written more compactly as:
\(L_5 \triangleleft L_6 = (\epsilon + ba + a)(abba)^*\)
or
\(L_5 \triangleleft L_6 = (ba \cup a)(abba)^*\)
This regular expression captures all possible suffixes that, when concatenated with a word from \(L_5\), result in a word from \(L_6\).
(3)
To construct a non-deterministic finite automaton (NFA) that accepts \(\mathbf{L(A_1)} \triangleleft \mathbf{L(A_2)}\), we can follow these steps:
a) Start with the structure of \(\mathbf{A_1}\).
b) Add \(\epsilon\)-transitions from each state of \(\mathbf{A_1}\) to the initial state of \(\mathbf{A_2}\).
c) Make all final states of \(\mathbf{A_1}\) non-final.
d) Keep the final states of \(\mathbf{A_2}\) as final.
Formally, we can define the NFA \(\mathbf{A} = (Q, \Sigma, \delta, q_{1,0}, F)\) where:
- \(Q = Q_1 \cup Q_2\)
- \(\Sigma\) is the same as before
- \(\delta(q, a) = \delta_1(q, a)\) for \(q \in Q_1, a \in \Sigma\)
- \(\delta(q, a) = \delta_2(q, a)\) for \(q \in Q_2, a \in \Sigma\)
- \(\delta(q, \epsilon) = \{q | q \in Q_2\}\) for \(q \in Q_1\)
- \(q_{1,0}\) is the initial state of \(\mathbf{A_1}\)
- \(F = F_2\)
This NFA simulates \(\mathbf{A_1}\) until it non-deterministically decides to switch to \(\mathbf{A_2}\) using an \(\epsilon\)-transition, then continues in \(\mathbf{A_2}\) until it reaches a final state of \(\mathbf{A_2}\).
(4)
The statement "For every context-free language \(L\) and regular language \(R\), \(L \triangleleft R\) is a regular language" is false.
Counterexample:
Let \(L = \{a^mb^nc^n \mid m, n \geq 0\}\) (a context-free language that is not regular)
Let \(R = a^*b^*c^*\) (a regular language)
Then \(L \triangleleft R = \{b^nc^n \mid n \geq 0\}\), which is not a regular language.
This counterexample shows that the statement is false in general.
Knowledge
正则语言 上下文无关语言 NFA 正则表达式
难点思路
问题 3 和 4 可能对考生来说比较困难。
对于问题 3,关键是理解如何通过 NFA 的非确定性来模拟 \(\triangleleft\) 操作。我们使用 \(\epsilon\)-转移来允许在 \(\mathbf{A_1}\) 的任何状态 " 猜测 " 是否应该开始匹配 \(\mathbf{A_2}\) 的部分。
对于问题 4,难点在于构造合适的反例。需要选择一个非正则的上下文无关语言,并找到一个合适的正则语言,使得它们的 \(\triangleleft\) 操作结果仍然是非正则的。
解题技巧和信息
- 对于涉及新定义操作的题目,先仔细理解定义,然后尝试用具体的例子来理解操作的含义。
- 在处理正则表达式时,考虑语言中单词的结构和可能的前缀。
- 构造 NFA 时,利用非确定性和 \(\epsilon\)-转移来模拟复杂的操作。
- 在证明语言类别相关的陈述时,寻找反例通常比证明正确性更容易。尝试使用经典的非正则上下文无关语言作为起点。
重点词汇
- finite automaton 有限自动机
- non-deterministic finite automaton (NFA) 非确定性有限自动机
- regular expression 正则表达式
- context-free language 上下文无关语言
- \(\epsilon\)-transition \(\epsilon\)-转移
- counterexample 反例
参考资料
- Introduction to Automata Theory, Languages, and Computation by John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman. Chapter 2 (Finite Automata) and Chapter 4 (Context-Free Grammars).
- Sipser, M. (2012). Introduction to the Theory of Computation. Cengage Learning. Chapter 1 (Regular Languages) and Chapter 2 (Context-Free Languages).