東京大学情報理工学系研究科コンピュータ科学専攻 2021年2月実施問題3

Author

Description

Let \(\Sigma\) be the set \(\{a, b\}\) of letters. Given two languages \(L_1, L_2 \subseteq \Sigma^*\), we define \(L_1 \triangleleft L_2\) by:

\[ L_1 \triangleleft L_2 = \{w \in \Sigma^* \mid \exists v \in L_1.vw \in L_2\}. \]

For example, if \(L_1 = \{ab, bb\}\) and \(L_2 = \{aa, abb, bbab\}\), then

\[ L_1 \triangleleft L_2 = \{b, ab\}. \]

For a finite automaton \(\mathbf{A}\), we write \(\mathbf{L(A)}\) for the language accepted by \(\mathbf{A}\).

Answer the following questions.

(1) Let \(L_3 = \{aa, b, bb\}\) and \(L_4 = \{a, b, ab, bb, aaa, bbab\}\). Give the set \(L_3 \triangleleft L_4\).

(2) Let \(L_5\) and \(L_6\) be the languages expressed by the regular expressions \((a^*b)^*\) and \((abba)^*\), respectively. Express \(L_5 \triangleleft L_6\) by using a regular expression.

(3) Let \(\mathbf{A_1} = (Q_1, \Sigma, \delta_1, q_{1,0}, F_1)\) and \(\mathbf{A_2} = (Q_2, \Sigma, \delta_2, q_{2, 0}, F_2)\) be deterministic finite automata. Here, \(Q_i, \delta_i, q_{i, 0},\) and \(F_i\) are the set of states, the transition function, the initial state, and the set of final states of \(\mathbf{A_i}\) (\(i \in \{1, 2\}\)), respectively. Assume that the transition functions \(\delta_i : Q_i \times \Sigma \rightarrow Q_i\) (\(i \in \{1, 2\}\)) are total. Give a non-deterministic finite automaton that accepts \(\mathbf{L(A_1)} \triangleleft \mathbf{L(A_2)}\), with a brief explanation. You may use \(\epsilon\)-transitions.

(4) Answer whether the following statement is true:

"For every context-free language \(L\) and regular language \(R\), \(L \triangleleft R\) is a regular language."

Also, give a proof sketch if the answer is yes, and give a counterexample if the answer is no.

设 \(\Sigma\) 为字母集合 \(\{a, b\}\)。给定两个语言 \(L_1, L_2 \subseteq \Sigma^*\)，我们定义 \(L_1 \triangleleft L_2\) 如下：

\[ L_1 \triangleleft L_2 = \{w \in \Sigma^* \mid \exists v \in L_1.vw \in L_2\}。 \]

例如，如果 \(L_1 = \{ab, bb\}\) 且 \(L_2 = \{aa, abb, bbab\}\)，则

\[ L_1 \triangleleft L_2 = \{b, ab\}。 \]

对于一个有限自动机 \(\mathbf{A}\)，我们用 \(\mathbf{L(A)}\) 表示 \(\mathbf{A}\) 接受的语言。回答以下问题。

(1) 设 \(L_3 = \{aa, b, bb\}\) 和 \(L_4 = \{a, b, ab, bb, aaa, bbab\}\)。给出集合 \(L_3 \triangleleft L_4\)。

(2) 设 \(L_5\) 和 \(L_6\) 分别由正则表达式 \((a^*b)^*\) 和 \((abba)^*\) 表示。用正则表达式表示 \(L_5 \triangleleft L_6\)。

(3) 设 \(\mathbf{A_1} = (Q_1, \Sigma, \delta_1, q_{1, 0}, F_1)\) 和 \(\mathbf{A_2} = (Q_2, \Sigma, \delta_2, q_{2, 0}, F_2)\) 为确定性有限自动机。这里，\(Q_i, \delta_i, q_{i, 0},\) 和 \(F_i\) 是 \(\mathbf{A_i}\) 的状态集合、转换函数、初态和终态集合（\(i \in \{1, 2\}\)）。假设转换函数 \(\delta_i : Q_i \times \Sigma \rightarrow Q_i\)（\(i \in \{1, 2\}\)）是全函数。给出一个非确定性有限自动机，该自动机接受 \(\mathbf{L(A_1)} \triangleleft \mathbf{L(A_2)}\)，并简要解释。你可以使用 \(\epsilon\)-转换。

(4) 回答以下陈述是否正确：

“对于每个上下文无关语言 \(L\) 和正则语言 \(R\)，\(L \triangleleft R\) 是正则语言。”

如果答案是肯定的，请给出证明草图；如果答案是否定的，请给出反例。

Kai

(1)

Let \(L_3 = \{aa, b, bb\}\) and \(L_4 = \{a, b, ab, bb, aaa, bbab\}\). We need to find the set \(L_3 \triangleleft L_4\).

\[ L_3 \triangleleft L_4 = \{ w \in \Sigma^* \mid \exists v \in L_3 \text{ such that } vw \in L_4 \} \]

We check each element \(v \in L_3\):

For \(v = aa\):
\(v \cdot w = aa \cdot w \in L_4\)
Possible \(w\) are \(\epsilon, a\)
For \(v = b\):
\(v \cdot w = b \cdot w \in L_4\)
Possible \(w\) are \(\epsilon,b, ab, bab\)
For \(v = bb\):
\(v \cdot w = bb \cdot w \in L_4\)
Possible \(w\) are \(\epsilon, ab\)

Collecting all possible \(w\):

\[ L_3 \triangleleft L_4 = \{\epsilon, a, b, ab, bab\} \]

(2)

Let \(L_5 = (a^*b)^*\) and \(L_6 = (abba)^*\). We need to express \(L_5 \triangleleft L_6\) using a regular expression.

\(L_5 \triangleleft L_6 = \{w \in \Sigma^* \mid \exists v \in L_5 \text{ such that } vw \in L_6\}\)

Let's analyze this step by step:

Words in \(L_5\) are of the form \((a^{n_1}ba^{n_2}b…a^{n_k}b)\) where \(k \geq 0\) and \(n_i \geq 0\) for all \(i\).
Words in \(L_6\) are repetitions of \(abba\).
The possible prefixes from \(L_5\) that could start a word in \(L_6\) are:
\(\epsilon\) (empty string)
\(ab\)
\(abb\)

Now, let's consider what \(w\) would be in each case:

If \(v = \epsilon\), then \(w\) can be any word in \(L_6\), i.e., \((abba)^*\)
If \(v = ab\), then \(w\) must start with \(ba\) and then continue with \((abba)^*\), i.e., \(ba(abba)^*\)
If \(v = abb\), then \(w\) must start with \(a\) and then continue with \((abba)^*\), i.e., \(a(abba)^*\)

Therefore, \(L_5 \triangleleft L_6\) can be expressed as the union of these possibilities:

\(L_5 \triangleleft L_6 = (abba)^* \cup ba(abba)^* \cup a(abba)^*\)

This can be written more compactly as:

\(L_5 \triangleleft L_6 = (\epsilon + ba + a)(abba)^*\)

or

\(L_5 \triangleleft L_6 = (ba \cup a)(abba)^*\)

This regular expression captures all possible suffixes that, when concatenated with a word from \(L_5\), result in a word from \(L_6\).

(3)

To construct a non-deterministic finite automaton (NFA) that accepts \(\mathbf{L(A_1)} \triangleleft \mathbf{L(A_2)}\), we can follow these steps:

a) Start with the structure of \(\mathbf{A_1}\).

b) Add \(\epsilon\)-transitions from each state of \(\mathbf{A_1}\) to the initial state of \(\mathbf{A_2}\).

c) Make all final states of \(\mathbf{A_1}\) non-final.

d) Keep the final states of \(\mathbf{A_2}\) as final.

Formally, we can define the NFA \(\mathbf{A} = (Q, \Sigma, \delta, q_{1,0}, F)\) where:

\(Q = Q_1 \cup Q_2\)
\(\Sigma\) is the same as before
\(\delta(q, a) = \delta_1(q, a)\) for \(q \in Q_1, a \in \Sigma\)
\(\delta(q, a) = \delta_2(q, a)\) for \(q \in Q_2, a \in \Sigma\)
\(\delta(q, \epsilon) = \{q | q \in Q_2\}\) for \(q \in Q_1\)
\(q_{1,0}\) is the initial state of \(\mathbf{A_1}\)
\(F = F_2\)

This NFA simulates \(\mathbf{A_1}\) until it non-deterministically decides to switch to \(\mathbf{A_2}\) using an \(\epsilon\)-transition, then continues in \(\mathbf{A_2}\) until it reaches a final state of \(\mathbf{A_2}\).

(4)

The statement "For every context-free language \(L\) and regular language \(R\), \(L \triangleleft R\) is a regular language" is false.

Counterexample:

Let \(L = \{a^mb^nc^n \mid m, n \geq 0\}\) (a context-free language that is not regular)

Let \(R = a^*b^*c^*\) (a regular language)

Then \(L \triangleleft R = \{b^nc^n \mid n \geq 0\}\), which is not a regular language.

This counterexample shows that the statement is false in general.

Knowledge

正则语言上下文无关语言 NFA 正则表达式

难点思路

问题 3 和 4 可能对考生来说比较困难。

对于问题 3，关键是理解如何通过 NFA 的非确定性来模拟 \(\triangleleft\) 操作。我们使用 \(\epsilon\)-转移来允许在 \(\mathbf{A_1}\) 的任何状态 " 猜测 " 是否应该开始匹配 \(\mathbf{A_2}\) 的部分。

对于问题 4，难点在于构造合适的反例。需要选择一个非正则的上下文无关语言，并找到一个合适的正则语言，使得它们的 \(\triangleleft\) 操作结果仍然是非正则的。

解题技巧和信息

对于涉及新定义操作的题目，先仔细理解定义，然后尝试用具体的例子来理解操作的含义。
在处理正则表达式时，考虑语言中单词的结构和可能的前缀。
构造 NFA 时，利用非确定性和 \(\epsilon\)-转移来模拟复杂的操作。
在证明语言类别相关的陈述时，寻找反例通常比证明正确性更容易。尝试使用经典的非正则上下文无关语言作为起点。

重点词汇

finite automaton 有限自动机
non-deterministic finite automaton (NFA) 非确定性有限自动机
regular expression 正则表达式
context-free language 上下文无关语言
\(\epsilon\)-transition \(\epsilon\)-转移
counterexample 反例

参考资料

Introduction to Automata Theory, Languages, and Computation by John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman. Chapter 2 (Finite Automata) and Chapter 4 (Context-Free Grammars).
Sipser, M. (2012). Introduction to the Theory of Computation. Cengage Learning. Chapter 1 (Regular Languages) and Chapter 2 (Context-Free Languages).

東京大学 情報理工学系研究科 コンピュータ科学専攻 2021年2月実施 問題3