Theorem: Any regex-decidable language is NFA-decidable.
Proof
Recall the formal definition:
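R is a regular expression deciding language \(L(R) \subseteq \Sigma^*\) if one of the following holds:
\(R = a\) for some \(a \in \Sigma\), and \(L(R) = \{a\}\)
\(R = \emptystring\), and \(L(R) = \{\emptystring\}\)
\(R = \emptyset\), and \(L(R) = \{\}\)
\(R = (R_1) \cup (R_2)\) for \(R_1, R_2\) regexes, and \(L(R) = L(R_1) \cup L(R_2)\)
\(R = (R_1)(R_2)\) for \(R_1, R_2\) regexes, and \(L(R) = L(R_1) \circ L(R_2)\)
\(R = (R_1)^*\) for \(R_1\) a regex, and \(L(R) = L(R_1)^*\)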
This inductive definition implies a “tree structure” that we can exploit to construct an NFA!
Inductive proof:
For any regular expression \(R\), we can construct an NFA \(N\) such that \(L(N) = L(R)\).
The base cases of the inductive definition are the base cases of the proof!
If \(R = a\) for some \(a \in \Sigma\), then \(L(R) = \{a\}\) decided by the NFA
If \(R = \emptystring\), then \(L(R) = \{\emptystring\}\) decided by the NFA
If \(R = \emptyset\), then \(L(R) = \{\}\) decided by the NFA
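A minimal sketch of NFAs for the three base cases, assuming the usual textbook constructions: two states joined by an \(a\)-transition, a single accepting start state, and a single non-accepting start state. The `NFA` tuple, the `EPS` label, and all function names are hypothetical and chosen only for this illustration.

```python
from typing import NamedTuple

EPS = ""  # label for epsilon-transitions (a convention of this sketch)

class NFA(NamedTuple):
    states: frozenset
    delta: dict        # maps (state, symbol) -> set of next states
    start: int
    accept: frozenset

def nfa_symbol(a):
    # Two states joined by a single a-transition: accepts exactly {a}.
    return NFA(frozenset({0, 1}), {(0, a): {1}}, 0, frozenset({1}))

def nfa_empty_string():
    # One state that is both start and accepting: accepts only the empty string.
    return NFA(frozenset({0}), {}, 0, frozenset({0}))

def nfa_empty_set():
    # One non-accepting state and no transitions: accepts nothing.
    return NFA(frozenset({0}), {}, 0, frozenset())

def eps_closure(nfa, states):
    # All states reachable from `states` using only epsilon-transitions.
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in nfa.delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def accepts(nfa, w):
    # Track the set of states the NFA could be in after each symbol of w.
    current = eps_closure(nfa, {nfa.start})
    for c in w:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, c), set())
        current = eps_closure(nfa, moved)
    return bool(current & nfa.accept)
```

For instance, `accepts(nfa_symbol("a"), "a")` holds, while `nfa_empty_set()` rejects every input, including the empty string.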
The recursive cases of the definition are the inductive steps of the proof!
If \(R = (R_1) \cup (R_2)\) for \(R_1, R_2\) regexes, then \(L(R) = L(R_1) \cup L(R_2)\).
\(L(R_1)\) and \(L(R_2)\) are both NFA-decidable by the inductive hypothesis.
By closure under union, so is \(L(R) = L(R_1) \cup L(R_2)\)!
If \(R = (R_1)(R_2)\) for \(R_1, R_2\) regexes, then \(L(R) = L(R_1) \circ L(R_2)\).
\(L(R_1)\) and \(L(R_2)\) are both NFA-decidable by the inductive hypothesis.
By closure under concatenation, so is \(L(R) = L(R_1) \circ L(R_2)\)!
If \(R = (R_1)^*\) for \(R_1\) a regex, then \(L(R) = L(R_1)^*\).
\(L(R_1)\) is NFA-decidable by the inductive hypothesis.
By closure under Kleene star, so is \(L(R) = L(R_1)^*\)!
This completes the proof!
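The induction can be read as a recursive procedure over the regex's tree structure. The sketch below is one way to realize it, assuming a Thompson-style construction in which every fragment has a single start and a single accept state glued together with \(\emptystring\)-transitions; the proof itself only appeals to the closure properties. The tuple encoding of regexes (`"sym"`, `"eps"`, `"empty"`, `"union"`, `"concat"`, `"star"`) and all function names are hypothetical.

```python
import itertools

EPS = ""  # label for epsilon-transitions (a convention of this sketch)

def build(regex, delta, fresh):
    """Add an NFA fragment for `regex` to `delta`; return its (start, accept) states."""
    s, t = next(fresh), next(fresh)

    def edge(p, label, q):
        delta.setdefault((p, label), set()).add(q)

    kind = regex[0]
    if kind == "sym":            # R = a
        edge(s, regex[1], t)
    elif kind == "eps":          # R = empty string
        edge(s, EPS, t)
    elif kind == "empty":        # R = empty set: t is unreachable from s
        pass
    elif kind == "union":        # R = (R1) U (R2): branch into either fragment
        for sub in regex[1:]:
            s1, t1 = build(sub, delta, fresh)
            edge(s, EPS, s1)
            edge(t1, EPS, t)
    elif kind == "concat":       # R = (R1)(R2): run R1's fragment, then R2's
        s1, t1 = build(regex[1], delta, fresh)
        s2, t2 = build(regex[2], delta, fresh)
        edge(s, EPS, s1)
        edge(t1, EPS, s2)
        edge(t2, EPS, t)
    elif kind == "star":         # R = (R1)*: skip, or loop through R1's fragment
        s1, t1 = build(regex[1], delta, fresh)
        edge(s, EPS, t)
        edge(s, EPS, s1)
        edge(t1, EPS, s1)
        edge(t1, EPS, t)
    else:
        raise ValueError(f"unknown regex node: {kind!r}")
    return s, t

def regex_to_nfa(regex):
    delta = {}
    start, accept = build(regex, delta, itertools.count())
    return delta, start, accept

def accepts(nfa, w):
    delta, start, accept = nfa

    def closure(states):
        # All states reachable using only epsilon-transitions.
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, EPS), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for c in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, c), set())
        current = closure(moved)
    return accept in current

# (a U b)* a b : strings over {a, b} that end in "ab"
R = ("concat",
     ("star", ("union", ("sym", "a"), ("sym", "b"))),
     ("concat", ("sym", "a"), ("sym", "b")))
N = regex_to_nfa(R)
```

Each branch of `build` mirrors one case of the inductive definition: the three base cases add at most one edge, and the three recursive cases first build the sub-fragments (the inductive hypothesis) and then wire them together.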
Theorem: Any NFA-decidable language is regex-decidable.
Starting point: “Expression Automata” (or “Generalized NFAs” in Sipser)
Label the transition arrows of an NFA with regular expressions that match substrings of the input rather than individual symbols.
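To make the idea of regex-labeled arrows concrete, here is a small sketch of an acceptance check for an expression automaton. It assumes the labels are written in Python's `re` syntax (so union is `|` rather than \(\cup\)) and that the automaton is given as a dictionary from state pairs to patterns; the function name, state names, and example labels are all hypothetical.

```python
import re

def ea_accepts(edges, start, accept, w):
    """edges maps (src, dst) -> regex pattern in Python `re` syntax.

    The input w is accepted if it can be cut into consecutive substrings,
    each fully matching the label of the next arrow on some path from
    `start` to `accept`.
    """
    seen = set()
    stack = [(start, 0)]          # (current state, number of symbols consumed)
    while stack:
        state, pos = stack.pop()
        if (state, pos) in seen:
            continue
        seen.add((state, pos))
        if state == accept and pos == len(w):
            return True
        for (src, dst), pattern in edges.items():
            if src != state:
                continue
            # Try every substring w[pos:end] as the piece read along this arrow.
            for end in range(pos, len(w) + 1):
                if re.fullmatch(pattern, w[pos:end]):
                    stack.append((dst, end))
    return False

# A three-state example: s --ab*--> i, i --ba--> i, i --(a|b)--> a
edges = {("s", "i"): "ab*", ("i", "i"): "ba", ("i", "a"): "a|b"}
```

With these labels, `"abbbaa"` is accepted by reading `abb`, then `ba`, then `a`, whereas `"a"` is rejected because the arrow into the accept state must consume one more symbol.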
Proof idea:
Then \(L(N) = L(E) = L(R)\)!
In our first step, we modify \(N\) so that its start/accept states already have the desired form:
Given \(N = (Q, \Sigma, \Delta, q_0, F)\),
construct \(N' = (Q', \Sigma, \Delta', s, F')\):
\(L(N') = L(N)\) because:
A sequence \(r_1, r_2, \ldots, r_n\) is a computation sequence of \(N\) accepting \(w\) if and only if \(s, r_1, r_2, \ldots, r_n, a\) is a computation sequence of \(N'\) accepting \(w\).
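A sketch of this first step, assuming the usual construction: \(Q' = Q \cup \{s, a\}\), \(F' = \{a\}\), and \(\Delta'\) extends \(\Delta\) with an \(\emptystring\)-transition from \(s\) to \(q_0\) and from every old accept state to \(a\). The function names, the tuple layout, and the example NFA are hypothetical; the final lines compare \(N\) and \(N'\) on all short inputs as a sanity check, not as a proof.

```python
from itertools import product

EPS = ""  # label for epsilon-transitions (a convention of this sketch)

def add_start_and_accept(states, delta, start, accepting, s="s", a="a"):
    """Return (Q', Delta', s, {a}) with a fresh start state and one fresh accept state."""
    assert s not in states and a not in states, "new state names must be fresh"
    new_delta = {key: set(targets) for key, targets in delta.items()}
    new_delta.setdefault((s, EPS), set()).add(start)   # s --eps--> q0
    for f in accepting:
        new_delta.setdefault((f, EPS), set()).add(a)   # f --eps--> a for each old accept state
    return set(states) | {s, a}, new_delta, s, {a}

def accepts(nfa, w):
    _, delta, start, accepting = nfa

    def closure(states):
        # All states reachable using only epsilon-transitions.
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, EPS), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for c in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, c), set())
        current = closure(moved)
    return bool(current & set(accepting))

# Example N: strings over {0, 1} that end in 1.
N = ({"q0", "q1"},
     {("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"}},
     "q0",
     {"q1"})
N_prime = add_start_and_accept(*N)

# Compare the two automata on every string of length at most 4.
words = ["".join(p) for n in range(5) for p in product("01", repeat=n)]
agree = all(accepts(N, w) == accepts(N_prime, w) for w in words)
```

Because the only new transitions are \(\emptystring\)-moves out of \(s\) and into \(a\), the new start state has no incoming arrows and the new accept state has no outgoing ones, which is the form the later simplification steps rely on.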
We then apply incremental simplifications that remove the states “in the box” one by one:
How does this work in general?
The final simplification step operates on an EA that looks like:
We can rip state \(i\) out of the diagram, yet still accept the strings whose computation sequences pass through it, by changing the connection between \(s\) and \(a\):
Both EAs accept exactly strings of the form
When more than three states remain:
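One common way to state the general step, as in Sipser's GNFA construction, is: when ripping out a state \(i\), every remaining pair of states \(p, q\) gets the new label \(R_{pq} \cup R_{pi} (R_{ii})^* R_{iq}\). The sketch below applies that rule repeatedly until only \(s\) and \(a\) remain. It assumes labels are strings in Python's `re` syntax (`|` for union, `(?:...)` for grouping) and that a missing arrow stands for \(\emptyset\); all function and state names are hypothetical.

```python
import re

def union(r1, r2):
    # A missing arrow (None) denotes the empty language.
    if r1 is None:
        return r2
    if r2 is None:
        return r1
    return f"(?:{r1})|(?:{r2})"

def concat(*parts):
    if any(p is None for p in parts):
        return None                     # concatenating with the empty language
    return "".join(f"(?:{p})" for p in parts if p != "")

def star(r):
    # The star of the empty language contains only the empty string.
    return "" if r is None else f"(?:{r})*"

def rip(edges, states, victim):
    """Remove `victim`, rerouting every path p -> victim -> q into the label on p -> q."""
    remaining = [q for q in states if q != victim]
    loop = star(edges.get((victim, victim)))
    new_edges = {}
    for p in remaining:
        for q in remaining:
            detour = concat(edges.get((p, victim)), loop, edges.get((victim, q)))
            label = union(edges.get((p, q)), detour)
            if label is not None:
                new_edges[(p, q)] = label
    return new_edges, remaining

def to_regex(edges, states, start="s", accept="a"):
    # Rip out inner states one by one until only start and accept remain.
    while len(states) > 2:
        victim = next(q for q in states if q not in (start, accept))
        edges, states = rip(edges, states, victim)
    return edges.get((start, accept))

# Example EA: strings over {0, 1} ending in 1, with fresh start s and accept a.
states = ["s", "q0", "q1", "a"]
edges = {("s", "q0"): "", ("q0", "q0"): "0|1", ("q0", "q1"): "1", ("q1", "a"): ""}
pattern = to_regex(edges, states)
```

The resulting `pattern` is verbose because no simplification is attempted, but it can be checked against sample inputs with `re.fullmatch(pattern, w)`.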