A right-regular grammar (RRG) is a CFG whose rules are all of the form: \[ X \to a Y \qquad \text{or} \qquad X \to \emptystring \]
Theorem: Every DFA-decidable language is RRG-decidable (and thus CFG-decidable).
Given any DFA \(D\), we can convert it into an RRG \(G\) such that \(L(G) = L(D)\):
\(G\) generates string \(w\) iff \(D\) accepts \(w\).
Proof
We now prove \(L(G) = L(D)\).
For \(X \in Q, a \in \Sigma\): \(X \to a Y\), where \(Y = \delta(X, a)\)
For \(X \in F\): \(X \to \emptystring\)
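The two rule schemas above can be sketched in Python. This is a minimal illustration, not the course's reference code: the function names `dfa_to_rrg` and `generates`, the tuple encoding of rules, and the example DFA (even number of 1s) are all assumptions made for the sketch.

```python
def dfa_to_rrg(states, alphabet, delta, start, accepting):
    """Return (start_variable, rules) for a right-regular grammar.

    Each rule is a pair (X, body): body is (a, Y) for X -> aY,
    or () for X -> emptystring.
    """
    rules = []
    for X in states:
        for a in alphabet:
            # X -> aY with Y = delta(X, a)
            rules.append((X, (a, delta[(X, a)])))
    for X in accepting:
        # X -> emptystring for each accepting state
        rules.append((X, ()))
    return start, rules


def generates(start, rules, w):
    """Check whether the grammar derives w by applying rules left to right."""
    current = {start}
    for symbol in w:
        current = {body[1] for X, body in rules
                   if X in current and body and body[0] == symbol}
    # The derivation can finish only from a variable with an emptystring rule.
    return any(X in current and body == () for X, body in rules)


# Hypothetical DFA accepting binary strings with an even number of 1s.
states = ["E", "O"]
alphabet = ["0", "1"]
delta = {("E", "0"): "E", ("E", "1"): "O",
         ("O", "0"): "O", ("O", "1"): "E"}
start, rules = dfa_to_rrg(states, alphabet, delta, "E", ["E"])
```

The variables of the grammar are the DFA's states, so a derivation of \(w\) visits the same sequence of names as the DFA's computation on \(w\).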
Proof
Thus \(x \in L(G)\), showing \(L(D) \subseteq L(G)\).
Theorem: Any RRG-decidable language is NFA-decidable (and thus DFA-decidable).
Proof
Corollary: RRGs, DFAs, and NFAs have equivalent computational (decision) power.
Note this is almost the same as the DFA-to-RRG conversion in reverse; nondeterministic transitions like \(X \stackrel{0}{\to} Y\) and \(X \stackrel{0}{\to} Z\) come from rules like \(X \to 0Y \or 0Z\).
What if there are also rules like \(X \to Y\)? These correspond to \(\emptystring\)-transitions \(X \stackrel{\emptystring}{\to} Y\).
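The RRG-to-NFA direction, including rules of the form \(X \to Y\), can be sketched as follows. The rule encoding, the function names, and the example grammar are assumptions chosen for illustration only.

```python
def rrg_to_nfa(rules, start):
    """Build NFA pieces from right-regular rules.

    Rule bodies: (a, Y) for X -> aY, (Y,) for X -> Y, () for X -> emptystring.
    """
    transitions = {}   # (state, symbol) -> set of next states
    eps = {}           # state -> states reachable by one emptystring-transition
    accepting = set()
    for X, body in rules:
        if len(body) == 2:
            a, Y = body
            transitions.setdefault((X, a), set()).add(Y)
        elif len(body) == 1:
            eps.setdefault(X, set()).add(body[0])
        else:
            accepting.add(X)
    return transitions, eps, start, accepting


def eps_closure(states, eps):
    """All states reachable from `states` using only emptystring-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def nfa_accepts(nfa, w):
    transitions, eps, start, accepting = nfa
    current = eps_closure({start}, eps)
    for symbol in w:
        moved = set()
        for q in current:
            moved |= transitions.get((q, symbol), set())
        current = eps_closure(moved, eps)
    return bool(current & accepting)


# Hypothetical grammar: X -> 0Y | 0Z | Y,  Y -> 1Y | emptystring,  Z -> 0Z
rules = [("X", ("0", "Y")), ("X", ("0", "Z")), ("X", ("Y",)),
         ("Y", ("1", "Y")), ("Y", ()), ("Z", ("0", "Z"))]
nfa = rrg_to_nfa(rules, "X")
```

In this example the two rules \(X \to 0Y\) and \(X \to 0Z\) produce a nondeterministic choice on reading 0, and \(X \to Y\) produces an \(\emptystring\)-transition.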
A left-regular grammar (LRG) is a CFG whose rules are all of the form: \[ X \to Y a \qquad \text{or} \qquad X \to \emptystring \]
\[\begin{align*} A &\to 0B \\ B &\to A1 \\ A &\to \emptystring \end{align*}\]
\(\setbuild{0^n1^n}{n \in \N}\)… not regular!
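To see where \(0^n1^n\) comes from, it may help to trace one derivation in the grammar above, assuming \(A\) is the start variable. For \(n = 2\):

\[ A \Rightarrow 0B \Rightarrow 0A1 \Rightarrow 00B1 \Rightarrow 00A11 \Rightarrow 0011 \]

Each round \(A \Rightarrow 0B \Rightarrow 0A1\) adds one 0 on the left and one 1 on the right, and the final step uses \(A \to \emptystring\).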
Today:
Theorem: Any regex-decidable language is NFA-decidable.
Proof
Recall the formal definition:
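\(R\) is a regular expression deciding language \(L(R) \subseteq \Sigma^*\) if one of the following holds:
\(R = a\) for some \(a \in \Sigma\), and \(L(R) = \{a\}\)
\(R = \emptystring\), and \(L(R) = \{\emptystring\}\)
\(R = \emptyset\), and \(L(R) = \{\}\)
\(R = (R_1) \cup (R_2)\) for \(R_1, R_2\) regexes, and \(L(R) = L(R_1) \cup L(R_2)\)
\(R = (R_1)(R_2)\) for \(R_1, R_2\) regexes, and \(L(R) = L(R_1) \circ L(R_2)\)
\(R = (R_1)^*\) for \(R_1\) a regex, and \(L(R) = L(R_1)^*\)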
This inductive definition implies a “tree structure” that we can exploit to construct an NFA!
Inductive proof:
For any regular expression \(R\), we can construct an NFA \(N\) such that \(L(N) = L(R)\).
The base cases of the inductive definition are the base cases of the proof!
If \(R = a\) for some \(a \in \Sigma\), then \(L(R) = \{a\}\) decided by the NFA
If \(R = \emptystring\), then \(L(R) = \{\emptystring\}\) decided by the NFA
If \(R = \emptyset\), then \(L(R) = \{\}\) decided by the NFA
The recursive cases of the definition are the inductive steps of the proof!
If \(R = (R_1) \cup (R_2)\) for \(R_1, R_2\) regexes, then \(L(R) = L(R_1) \cup L(R_2)\).
\(L(R_1)\) and \(L(R_2)\) are both NFA-decidable by the inductive hypothesis.
By closure of NFA-decidability under union, so is \(L(R) = L(R_1) \cup L(R_2)\)!
If \(R = (R_1)(R_2)\) for \(R_1, R_2\) regexes, then \(L(R) = L(R_1) \circ L(R_2)\).
\(L(R_1)\) and \(L(R_2)\) are both NFA-decidable by the inductive hypothesis.
By closure of NFA-decidability under concatenation, so is \(L(R) = L(R_1) \circ L(R_2)\)!
If \(R = (R_1)^*\) for \(R_1\) a regex, then \(L(R) = L(R_1)^*\).
\(L(R_1)\) is NFA-decidable by the inductive hypothesis.
By closure of NFA-decidability under Kleene star, so is \(L(R) = L(R_1)^*\)!
This completes the proof!
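The induction translates directly into a recursive procedure over the regex's tree structure. The sketch below uses a Thompson-style construction with one start and one accept state per fragment, which may differ in detail from the closure constructions used earlier in the course. The tuple encoding of regexes and the names `build`, `closure`, and `matches` are assumptions for illustration.

```python
import itertools

_fresh = itertools.count()  # supplies unused state names 0, 1, 2, ...


def build(R, edges):
    """Return (start, accept) of an NFA fragment for regex R, adding to edges.

    edges is a list of (source, label, target); label None marks an
    emptystring-transition.
    """
    s, t = next(_fresh), next(_fresh)
    kind = R[0]
    if kind == "sym":        # base case R = a
        edges.append((s, R[1], t))
    elif kind == "eps":      # base case R = emptystring
        edges.append((s, None, t))
    elif kind == "empty":    # base case R = emptyset: no path from s to t
        pass
    elif kind == "union":
        for sub in R[1:]:
            s1, t1 = build(sub, edges)
            edges.extend([(s, None, s1), (t1, None, t)])
    elif kind == "concat":
        s1, t1 = build(R[1], edges)
        s2, t2 = build(R[2], edges)
        edges.extend([(s, None, s1), (t1, None, s2), (t2, None, t)])
    elif kind == "star":
        s1, t1 = build(R[1], edges)
        # Skip the body, enter it, repeat it, or leave it.
        edges.extend([(s, None, t), (s, None, s1),
                      (t1, None, s1), (t1, None, t)])
    else:
        raise ValueError(f"unknown regex node: {kind}")
    return s, t


def closure(states, edges):
    """States reachable from `states` via emptystring-transitions only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def matches(R, w):
    """Build the NFA for R and simulate it on w."""
    edges = []
    s, t = build(R, edges)
    current = closure({s}, edges)
    for symbol in w:
        moved = {dst for src, label, dst in edges
                 if src in current and label == symbol}
        current = closure(moved, edges)
    return t in current


# Hypothetical example: (0 ∪ 1)* 1, binary strings ending in 1.
ends_in_one = ("concat",
               ("star", ("union", ("sym", "0"), ("sym", "1"))),
               ("sym", "1"))
```

Each branch of `build` corresponds to one case of the inductive definition: three base cases and three recursive cases.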
True or False: Collapsing two NFA states connected by an \(\emptystring\)-transition always preserves the NFA's behavior.
Theorem: Any NFA-decidable language is regex-decidable.
Starting point: “Expression Automata” (EA; “Generalized NFAs” in Sipser)
Label the transition arrows of an EA with regular expressions that match substrings of the input rather than individual symbols.
Proof idea:
Then \(L(N) = L(E) = L(R)\)!
In our first step, we modify \(N\) so that:
Given \(N = (Q, \Sigma, \Delta, q_0, F)\),
construct \(N' = (Q', \Sigma, \Delta', s, F')\):
\(L(N') = L(N)\) because:
there is a computation sequence \(r_1, r_2, \ldots, r_n\) of \(N\) that accepts \(w\) if and only if \(s, r_1, r_2, \ldots, r_n, a\) is a computation sequence of \(N'\) that accepts \(w\).
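One way to realize this first step is sketched below. It assumes the construction adds an \(\emptystring\)-transition from the new start state \(s\) to \(q_0\) and from every state of \(F\) to the new accept state \(a\), which is consistent with the computation sequences described above. The function name and the dictionary encoding of \(\Delta\) are hypothetical.

```python
def add_fresh_start_and_accept(Q, Delta, q0, F, s="s", a="a"):
    """Return (Q', Delta', s, F') with a new start s and a single new accept a.

    Delta maps (state, label) to a set of states; label None marks an
    emptystring-transition. The input NFA is left unmodified.
    """
    assert s not in Q and a not in Q, "the new state names must be fresh"
    Q2 = set(Q) | {s, a}
    Delta2 = {key: set(targets) for key, targets in Delta.items()}
    # s --emptystring--> q0
    Delta2.setdefault((s, None), set()).add(q0)
    for f in F:
        # f --emptystring--> a for each old accepting state f
        Delta2.setdefault((f, None), set()).add(a)
    return Q2, Delta2, s, {a}


# Hypothetical NFA accepting strings of the form 1 0*.
Q = {"p", "q"}
Delta = {("p", "1"): {"q"}, ("q", "0"): {"q"}}
Q2, Delta2, start2, F2 = add_fresh_start_and_accept(Q, Delta, "p", {"q"})
```

After this step the start state has no incoming arrows and the unique accept state has no outgoing arrows.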
We then apply incremental simplifications that remove the states “in the box” (not \(s\) or \(a\)) one by one:
How does this work in general?
The final simplification step operates on an EA that looks like:
We can rip state \(i\) out of the diagram but still accept strings whose computation sequences go through \(i\), by changing the transition between \(s\) and \(a\):
Both EAs accept exactly strings of the form \(w \in L(W)\), or \(x y_1 y_2 \dots y_k z\) for some \(k \in \N\) where \(x \in L(X)\), each \(y_i \in L(Y)\), and \(z \in L(Z)\).
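Written as a single label, and assuming \(W\) labels the direct arrow from \(s\) to \(a\), \(X\) the arrow from \(s\) to \(i\), \(Y\) the self-loop on \(i\), and \(Z\) the arrow from \(i\) to \(a\), the arrow that remains would carry the regex \((W) \cup (X)(Y)^*(Z)\), since

\[ L\big((W) \cup (X)(Y)^*(Z)\big) = L(W) \cup \big(L(X) \circ L(Y)^* \circ L(Z)\big) \]

The case \(k = 0\) corresponds to skipping the self-loop entirely.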
When more than three states remain:
Other pairs of states, e.g., \((s,r)\) or \((q,q)\), are not listed because either the first state has no transition to \(i\) or the second has no transition from \(i\).
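The general rip-out step can be sketched as a function that updates only the pairs with an arrow into \(i\) and an arrow out of \(i\). This is an illustrative sketch: labels are stored as strings in Python `re` syntax (with `|` standing in for \(\cup\)), and the function name, state names, and example automaton are all hypothetical.

```python
import re


def rip_state(labels, i):
    """Remove state i from an expression automaton.

    labels maps (p, r) to a regex string for the arrow p -> r; a missing
    key means there is no arrow.
    """
    loop = labels.get((i, i))
    star = f"({loop})*" if loop is not None else ""
    incoming = [(p, x) for (p, r), x in labels.items() if r == i and p != i]
    outgoing = [(r, z) for (p, r), z in labels.items() if p == i and r != i]
    # Arrows that do not touch i are kept unchanged.
    new_labels = {pair: x for pair, x in labels.items() if i not in pair}
    for p, x in incoming:
        for r, z in outgoing:
            detour = f"({x}){star}({z})"   # p -> i, loop on i, i -> r
            if (p, r) in new_labels:
                new_labels[(p, r)] = f"({new_labels[(p, r)]})|{detour}"
            else:
                new_labels[(p, r)] = detour
    return new_labels


# Hypothetical EA for "even number of 1s", with fresh start s and accept a.
# The empty string "" as a label plays the role of an emptystring-transition.
labels = {("s", "E"): "", ("E", "a"): "",
          ("E", "E"): "0", ("E", "O"): "1",
          ("O", "O"): "0", ("O", "E"): "1"}
labels = rip_state(labels, "O")
labels = rip_state(labels, "E")
regex = labels[("s", "a")]   # the only arrow left
```

The result can be checked against sample strings with `re.fullmatch(regex, w)`. The regex roughly doubles in nesting with each ripped state, which hints at why the output can be much longer than the automaton.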
Theorem: (not proven in this class) There are NFAs with \(n\) states such that any equivalent regex has length at least \(2^{n-1}\)!
Let \(L\) be a language; the following are equivalent:
From now on, we simply use the term regular.