Automaton simulator usage instructions

This is a help file for the automaton simulator here. It can be used to create and test deterministic and nondeterministic finite automata, regular expressions, context-free grammars, and (deterministic) Turing machines.

This application is mainly tested with the Chrome web browser, and it may have problems with other web browsers.

Please send questions/bug reports to This application was written in Elm.

How to use

Play around! Most features are self-explanatory. Below are some notes about aspects that may require more detailed explanation.

Entering input

If the option "Run immediately" is selected, then all changes to input and automaton descriptions are immediately processed. If you uncheck this box, then a "Run" button will appear, which must be pressed to see if the automaton description is valid, and if so, what is the result of running the automaton on the input. This may be useful if the simulator is taking a long time to process each change, for instance if the Turing machine has an infinite loop.

To load a text file from your computer into the editor window, click the "Choose file" button next to "Upload:". To save the contents of the editor to a file on your computer, click the "Download" button. Pressing the "Load Default" button will replace the editor with a description of the default automaton for that type.

The contents of the editor are automatically saved locally in your browser's local storage, so when you first open the page, the editor should display the same thing you saw last time the page was open. However, this is merely for convenience and should not be relied upon: it is strongly recommended to save the file manually when you are done working. If the app is updated, for instance, a bug or a change in storage format may erase your stored data.

Clearing local storage if app does not load

There may be rare circumstances where the app does not load due to a bug in how it loads the automaton from your browser's local storage. To clear local storage (which should reset the app), follow these steps.

Open the simulator web page.
Open Developer Tools by pressing Ctrl+Shift+I in Chrome or Firefox.
Once Developer Tools are open, select "Application" in Chrome or "Storage" in Firefox.
Click "Local Storage" on the left.
Right-click https://web.cs.ucdavis.edu/ and select "Clear".
Refresh the page.

Example input files

The easiest way to explain the input file format is by example. The files no-000-end.dfa, contains-010.regex, 0-three-from-end.nfa, 0s1s2s.nfa, balanced-parens.cfg, double.tm, and palindrome-single-tape.tm are examples of input files for the finite automata, regular expression, context-free grammar, and Turing machine simulators.

Turing machine simulation time limit

Turing machines are run for at most 10,000 steps.

Turing machine left move on leftmost tape cell

If a Turing machine attempts to move a tape head left when it is already on the leftmost tape cell, then the tape head stays where it is instead.

Step-by-step simulation

The finite automata and Turing machine simulators have a back and forward button to step through individual transitions of the machine. In both cases the current state is highlighted. In the case of Turing machines, the tape(s) are shown with the tape head position(s) highlighted.

In the case of finite automata, the input is shown with a caret symbol ^ placed in between the symbol just read and the symbol about to be read. For example, in reading the string 0010, the finite automaton will have four transitions and visit five states total, with the "read position" being indicated in each case as:

^0 0 1 0
0^0 1 0
0 0^1 0
0 0 1^0
0 0 1 0^

If the input field and editor are not selected, then the , and . (comma and period) keys can also be used to do forward and backward transitions. This can be useful to quickly jump several transitions ahead by holding down the key.

Output conventions for Turing machines

Turing machines have two different notions of "output".

Boolean output, which is indicated by whether the final configuration is in the accept state or the reject state.
string output, used to define Turing machines that compute functions f: Σ^* → Σ^*.

The string output in the final configuration is represented on the last tape. The output is defined to be the string from the tape head to the first blank to the right (represented by an underscore _), not including the blank. This means that although the input is from the input alphabet, the output could have nonblank symbols from the tape alphabet.

For instance, suppose the content of the last tape is bb_bbabbaba_abb_bbb__ and the tape head at the leftmost a. Then the output string is abbaba. If the content of the last tape is _abbaba_abb_bbb__ and the tape head is at position 0 (in other words, if the tape head is scanning a blank symbol), then the output string is the empty string ε.

Notes about file format

Whitespace is mostly ignored. Since whitespace otherwise gets trimmed away, whitespace characters cannot be in any alphabet.

State names can be multi-character strings, with characters that are either alphanumeric, an underscore
_
, a hyphen
-
, parentheses
(
and
)
, or a prime (apostophe)
'
. Alphabet symbols should be single characters, which can be any symbol from the keyboard except for the following, which have special syntactical meaning in the automata files:
- curly braces
  {
  and
  }
  , used to enclose sets
- comma
  ,
  , used to separate lists of items (such as each state in the set of states)
- pipe
  |
  , used for nondeterministic productions in CFGs and union in Regex's (this only applies to CFG and Regex symbols; DFA, NFA, and TM can use
  |
  as an alphabet symbol)
- question mark
  ?
  , used as a special character for Turing machine wildcard transition rules
- whitespace characters (e.g., space, tab, newline)

In regular expressions, only the following characters are allowed: the operators ( ) * + |, alphanumeric characters, and (because of a homework problem involving email addresses) the characters . and @. Note that | is used to mean ∪. Unlike most regular expression libraries in programming languages, spaces are ignored (including newlines), which can help you to visually separate different parts of the regular expression.
Warning: Comments are poorly supported in regular expressions. If you are having trouble, try removing all comments.

The various parts of the automaton (
states
,
input_alphabet
, etc.) must be specified in the same order given in the example files.

.nfa
files are almost the same syntax as
.dfa
files, with two exceptions:
- You can write
  q, -> r;
  to represent an ε-transition from state
  q
  to
  r
  . See 0s1s2s.nfa for an example.
- You can write
  q,0 -> {r,s,t};
  to represent a nondeterministic transition to any of states
  r,s,t
  (from state
  q
  while reading a
  0
  ). See 0-three-from-end.nfa for an example. You can also write several lines to specify multiple output states, such as
  
  q,0 -> r;
  
  q,0 -> s;
  
  q,0 -> t;

.regex
files can define variables with subexpressions. For example

A = 0|1;

B = A2;

34B | 56A

is equivalent to the regex
34((0|1)2) | 56(0|1)
. Each subexpression may reference previous variables. The last expression appears after the last semicolon (with no variable definition or equals sign). This can help with avoiding typing long subexpressions multiple times. Also, you can use longer variable names for subexpressions, which is helpful when the input alphabet contains all single letters. It will not work properly to use single-letter variable names such as A or b in this case. For example

Sigma = q|w|e|r|t|y|u|i|o|p|a|s|d|f|g|h|j|k|l|z|x|c|v|b|n|m|Q|W|E|R|T|Y|U|I|O|P|A|S|D|F|G|H|J|K|L|Z|X|C|V|B|N|M;

Sigma* abc Sigma* abc Sigma* abc Sigma*

matches any word with three appearances of
abc
in it.

Warning: Do not use newlines within a subexpression. Most of the automata simulator is robust to newlines, but using newlines within a subexpression will put the symbols \n into your final regex. Put each subexpression on a single line.

However, be careful with this feature! It works by simply substituting each subexpression in all the remaining ones. So it is possible (and inadvisable) to write a short regex that will blow up the memory requirements exponentially, e.g.,

A = 0123456789;

B = AAAAAAAAAA;

C = BBBBBBBBBB;

D = CCCCCCCCCC;

E = DDDDDDDDDD;

F = EEEEEEEEEE;

G = FFFFFFFFFF;

G

which defines a regular expression with ten million symbols in it.

.cfg
files use character
|
to represent multiple production rules on the same line, e.g.
S -> A|B;
is equivalent to
S -> A;
and
S -> B;
. Note that an ε-production can be included by, e.g.,
S -> A|;
which is equivalent to
S -> A;
and
S -> ;

A k-tape TM is specified with transition rules using strings of length k for the input symbols, the output symbols, and moves. For example, the transition rule
q,001 -> r,111,RSL;
is for a 3-tape TM and specifies that in state
q
, reading a
0
on the first tape, a
0
on the second tape, and a
1
on the third tape, the TM should change to state
r
, write a
1
on the first tape and move its tape head right, a
1
on the second tape and don't move its tape head, and a
1
on the third tape and move its tape head left.

DFA transition functions must be defined explicitly on all inputs. In other words, if you have |Q|=k states and |Σ|=m symbols, there must be exactly k·m transitions for δ, one for each pair (q,b), where q ∈ Q and b ∈ Σ.

In contrast, NFA and TM transition functions do not need to be fully specified; they take default values for any inputs not specified. If Δ(q,b) is left unspecified for an NFA, then it is assumed Δ(q,b) = Ø. If δ(q,b) is left unspecified for a TM, then it is assumed δ(q,b) = (q_r,b,S), where q_r is the reject state.

The files count-a-wildcard.tm and w-marker-w-wildcard.tm show how to use the wildcard symbol
?
to avoid typing many transitions that do the same thing. The symbol matching the wildcard can either be written over (if a non-wildcard symbol appears in the output, e.g.,
q,0? -> r,11,RS;
) or copied (if
?
appears in the output, e.g.,
q,0? -> r,1?,RS;
). For each tape, if a wildcard appears as an output, then it must also appear as an input for the same tape, and in this case it means "copy whatever symbol matched the wildcard in the input". For example, the following is not allowed:
q,00 -> r,1?,RS;
.

If there is a wildcard-containing transition that matches a wildcard-free transition, the latter takes precedence. For example, count-a-wildcard.tm uses the wildcard-free transitions
q1,a_ -> q1,a0,RR;
and
q1,__ -> qA,__,SS;
, if they match, before using the wildcard transition
q1,?_ -> q1,?_,RS;
.

In fact, it is required to use this feature if there is a state
q
such that two wildcard transitions with input state
q
overlap, in the sense that they both match the same tuple of input symbols. For example,
q,0?1 -> r,111,RS;
and
q,??1 -> t,000,LL;
both match tuples
001
and
011
. To disambiguate what to do if
001
or
011
is encountered when in state
q
, there must be wildcard-free transitions with input state
q
for both
001
and
011
. Thus the case above would be an error if there were not also transitions of the form
q,001 -> ...
and
q,011 -> ...
.