ECS 120 Theory of Computation
The Time Hierarchy Theorem and the Complexity Class P
Julian Panetta
University of California, Davis

Recall: Analyzing Function Growth (Asymptotic Analysis)

Big-O notation (“\(f \le g\)”):
Given nondecreasing \(f, g : \N \to \R^+\), we write \(f = O(g)\) if there exists \(c \in \N\) such that \[ f(n) \le c \cdot g(n) \quad \text{for all } n \] We call \(g\) an asymptotic upper bound for \(f\).

Little-O notation (“\(f < g\)”):
Given nondecreasing \(f, g : \N \to \R^+\), we write \(f = o(g)\) if \[ \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0 \]

  • Notations building on these definitions:
    • “\(f \ge g\)”: \(\quad f(n) = \Omega(g(n)) \iff g(n) = O(f(n))\)
    • “\(f > g\)”: \(\quad f(n) = \omega(g(n)) \iff g(n) = o(f(n))\)
    • “\(f \sim g\)”: \(\quad f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \text{ and } g(n) = O(f(n))\)

Some Practice

Notation                         Meaning
\(f(n) = O(g(n))\)               “\(f \le g\)”
\(f(n) = o(g(n))\)               “\(f < g\)”
\(f(n) = \Omega(g(n))\)          “\(f \ge g\)”
\(f(n) = \omega(g(n))\)          “\(f > g\)”
\(f(n) = \Theta(g(n))\)          “\(f \sim g\)”

Which of the following statements are correct? (A numeric sanity check follows the list.)

  • \(n \log n = O(n^2)\)
  • \(n \log \log n = \Omega(n^2)\)
  • \(2 n^c = \Theta(2^{c \log n})\)
  • \(2^{0.0000001 n} = o(n^{100000})\)
  • \(\log(n) = \omega(\sqrt{n})\)
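
One way to check claims like these numerically is to track \(\log f(n) - \log g(n)\) as \(n\) grows: a trend toward \(-\infty\) suggests \(f = o(g)\), a trend toward \(+\infty\) suggests \(f = \omega(g)\). A minimal Python sketch (working in log space avoids overflow; note the sign for the exponential-vs-polynomial pair flips between \(n = 10^{12}\) and \(n = 10^{15}\)):

    import math

    # Probe whether f = o(g) by watching log(f(n)) - log(g(n)) as n grows:
    # a trend toward -infinity means f(n)/g(n) -> 0.
    def log_ratio(log_f, log_g, ns):
        return [round(log_f(n) - log_g(n), 1) for n in ns]

    ns = [10**3, 10**6, 10**9, 10**12, 10**15]

    # n log n vs n^2: trend -> -infinity, so n log n = O(n^2) holds.
    print(log_ratio(lambda n: math.log(n) + math.log(math.log(n)),
                    lambda n: 2 * math.log(n), ns))

    # 2^(0.0000001 n) vs n^100000: trend -> +infinity, so it is NOT o(n^100000).
    print(log_ratio(lambda n: 0.0000001 * n * math.log(2),
                    lambda n: 100000 * math.log(n), ns))

    # log n vs sqrt(n): trend -> -infinity, so log n = o(sqrt(n)), not omega.
    print(log_ratio(lambda n: math.log(math.log(n)),
                    lambda n: 0.5 * math.log(n), ns))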

Time Complexity Classes

  • Can every problem be solved in \(O(n)\), \(O(n^2)\), or maybe \(O(n^{100})\) time
    if we simply find the right algorithm?

  • It turns out the answer is no:
    for any time bound \(t(n)\), there are problems that can’t be solved in \(O(t(n))\) time,
    but can be if we allow more time.

    “Given more time, a TM can solve more problems!”

  • We formalize this by introducing notation for “problems solvable in \(O(t(n))\) time.”

    Let \(t: \N \to \N^+\) be a time bound. We define the time complexity class \[ \texttt{TIME}(t(n)) = \setbuild{L \subseteq \binary^*}{L \text{ is a language decided by a TM with running time } O(t(n))} \]

    \(\texttt{TIME}(t(n)) \subseteq \powerset(\binary^*)\)

Let \(\texttt{REG}\) denote the class of regular languages.
True or false: \(\texttt{REG} \subseteq \texttt{TIME}(n)\)

  • True
  • False

True or false: for all time bounds \(t_1(n)\) and \(t_2(n)\) such that \(t_1(n) = O(t_2(n))\),
\(\texttt{TIME}(t_1(n)) \subseteq \texttt{TIME}(t_2(n))\).

  • True
  • False

The Time Hierarchy Theorem

Time Hierarchy Theorem:
Let \(t_1(n)\) and \(t_2(n)\) be time bounds, with \(t_2\) time constructible, such that \(t_1(n) \log t_1(n) = o(t_2(n))\).
Then \(\texttt{TIME}(t_1(n)) \subsetneq \texttt{TIME}(t_2(n))\).

There is a problem that can be solved in \(O(t_2(n))\) time, but not \(O(t_1(n))\) time!

Consequences:

  • \(\texttt{TIME}(n^c) \subsetneq \texttt{TIME}(n^{c + 1})\) for all \(c \ge 1\) (hypothesis checked below)
  • \(\texttt{TIME}(n^c) \subsetneq \texttt{TIME}(2^n)\)
  • \(\texttt{TIME}(2^{cn}) \subsetneq \texttt{TIME}(2^{(c + 1) n})\) for all \(c\)
  • There is an infinite hierarchy of time complexity classes!
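
For instance, the theorem’s hypothesis holds in the first consequence because \[ t_1(n) \log t_1(n) = n^c \log(n^c) = c\, n^c \log n = o(n^{c + 1}) = o(t_2(n)). \]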

The Complexity Class \(\P\)

A special time complexity class containing problems with “efficient” solutions:

We denote by \(\P\) the class of languages decidable in polynomial time by a (deterministic) Turing machine: \[ \P = \bigcup_{k = 1}^\infty \texttt{TIME}(n^k) \]

  • Why this notion of efficiency?
    • It is robust to changes in our model of computation.
      (All “reasonable” models of computation are polynomially equivalent.)
    • Exponential-time algorithms are often “brute-force” approaches, while polynomial-time algorithms
      (when they exist) typically require deeper insight into the problem (e.g., dynamic programming).
    • \(\P\) roughly captures the class of problems that are realistically solvable in practice.

Robustness of the Definition of \(\P\)

Recall our model of computation time:

The time complexity of a decider \(M\) is the time bound function \(t : \N \to \N^+\) defined as \[ t(n) = \max_{x \in \binary^n} \texttt{time}_M(x), \]

where \(\texttt{time}_M(x)\) is the number of configurations that \(M\) visits on input \(x\).

  • Specific choices made in this definition:
    • \(M\) is a single-tape, deterministic Turing machine.
    • \(n\) is the length of the input’s encoding as a binary string.
  • What if we make different choices?
    • If we use a two-tape Turing machine, we gain at most a quadratic speedup.
      (Optional Section 9.3: any \(O(t(n))\)-time two-tape TM can be simulated by an \(O(t(n)^2)\)-time single-tape TM.)
      Therefore membership in \(\P\) is unchanged (\(O(n^k) \to O(n^{2k})\)).
    • Any “reasonable” input encoding also does not change the class \(\P\).

Invariance to “Reasonable” Input Encoding

  • Let \(\encoding{\mathcal{X}}_1\) and \(\encoding{\mathcal{X}}_2\) be two different encodings of input object \(\mathcal{X}\) into binary strings.
    Define an “encoding blowup factor” from \(\encoding{\mathcal{X}}_1\) to \(\encoding{\mathcal{X}}_2\) as: \[ f_{1 \to 2}(n) = \max_{\mathcal{X} \text{ s.t. } |\encoding{\mathcal{X}}_1| = n} |\encoding{\mathcal{X}}_2| \]

  • If \(f_{1 \to 2}(n) = O(n^c)\) and \(f_{2 \to 1}(n) = O(n^c)\) for some constant \(c\), the two encodings yield the same definition of \(\P\).
    (Assuming encoding/decoding takes polynomial time.)

  • Examples of reasonable encodings for a graph \(G = (V, E)\) (a code sketch follows the example):

    [Figure: example graph \(G\) with vertices \(1, 2, 3, 4\) and edges \(\{1,2\}, \{2,3\}, \{3,1\}, \{1,4\}\)]
    • Node and edge lists:
      \(\encoding{G}_1 = \texttt{ascii\_to\_binary}((1,2,3,4)((1,2),(2,3),(3,1),(1,4)))\)

    • Binary adjacency matrix: \[ \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ \end{bmatrix} \quad \Longrightarrow \quad \encoding{G}_2 = 0111101011001000 \]
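
A minimal Python sketch of the adjacency-matrix encoding \(\encoding{G}_2\) (the function name and the 1-indexed vertex convention are illustrative choices):

    def adjacency_encoding(num_vertices, edges):
        """Encode an undirected graph as the row-major bitstring of its adjacency matrix."""
        matrix = [[0] * num_vertices for _ in range(num_vertices)]
        for u, v in edges:
            matrix[u - 1][v - 1] = matrix[v - 1][u - 1] = 1  # vertices are 1-indexed
        return "".join(str(bit) for row in matrix for bit in row)

    # Reproduces the bitstring above; note the length is n^2 for n vertices,
    # a polynomial blowup relative to the node/edge-list encoding.
    assert adjacency_encoding(4, [(1, 2), (2, 3), (3, 1), (1, 4)]) == "0111101011001000"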

  • Reasonable and unreasonable encodings of \(n \in \N^+\):

    \[ \begin{aligned} \encoding{n}_1 &= \texttt{bin}(n) \in \{0, 1\}^* \\ |\encoding{n}_1| &= \lfloor \log_2(n) \rfloor + 1 = O(\log n) \\ \encoding{n}_2 &= 1^n \\ |\encoding{n}_2| &= n \end{aligned} \]

    \(f_{1 \to 2}(n) = O(2^n)\)

    Dangers of unreasonably inefficient encodings:
    • Using exponential time to do simple arithmetic!
    • Mistaking truly exponential-time algorithms for polynomial-time ones, because the inflated input size hides the blowup.
  • Reasonable encodings of the list of strings \((010, 1001, 11)\) (a code sketch follows the list):
    • ASCII with comma delimiters: \(8 \cdot (9 + 2) = 88\) bits (9 string characters plus 2 commas)
    • More efficient “bit-doubling” encoding: 00 11 00 01 11 00 00 11 01 11 11 (22 bits)
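
A minimal Python sketch of the bit-doubling encoding (the mapping \(0 \mapsto 00\), \(1 \mapsto 11\), delimiter \(\mapsto 01\) is one consistent choice that reproduces the 22-bit string above):

    def encode_list(strings):
        """Bit-doubling encoding: 0 -> 00, 1 -> 11, delimiter -> 01."""
        code = {"0": "00", "1": "11"}
        return "01".join("".join(code[b] for b in s) for s in strings)

    def decode_list(bits):
        """Invert the encoding by reading two bits at a time."""
        strings, current = [], []
        for i in range(0, len(bits), 2):
            pair = bits[i:i + 2]
            if pair == "01":            # delimiter: finish the current string
                strings.append("".join(current))
                current = []
            else:                       # "00" decodes to 0, "11" to 1
                current.append("0" if pair == "00" else "1")
        strings.append("".join(current))
        return strings

    assert encode_list(["010", "1001", "11"]) == "0011000111000011011111"
    assert decode_list("0011000111000011011111") == ["010", "1001", "11"]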

Definition of Input Size

  • Formally, the “\(n\)” in time bound \(t(n)\) is the length \(n = |x|\) of the binary encoding \(x = \encoding{\mathcal{X}}\).
  • We generally want to think of time complexity at a higher level
    • For a graph algorithm: the complexity in terms of the number of vertices and edges.
    • For a list-processing algorithm: the complexity in terms of the number of elements.
  • The previous slide shows this distinction doesn’t change membership in \(\P\), provided that \(|\encoding{\mathcal{X}}|\) is a polynomial function of the “intuitive input size.”
  • Sometimes we still must be careful…
    (e.g., when factoring an \(n\)-bit integer, the input size is the number of bits \(n\), not the integer’s magnitude; see the estimate below)
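
For example, factoring an \(n\)-bit integer \(N \approx 2^n\) by trial division up to \(\sqrt{N}\) performs roughly \[ \sqrt{N} \approx \sqrt{2^n} = 2^{n/2} \] divisions: polynomial in the value \(N\), but exponential in the input size \(n\).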

Practicality of Problems in \(\P\)

  • The robustness of \(\P\) comes from ignoring the exponents of polynomials.
  • In practice, this exponent can matter quite a lot!
    • In algorithms or scientific computing we care very much about \(O(n^2)\) vs. \(O(n^3)\)
    • Even \(O(n^4)\) may be considered too slow for a given application.
  • However:
    • Once we have a polynomial time algorithm (even \(n^{100}\)), we can often find a better one!
    • Over time, problems in \(\P\) become much easier to solve due to Moore’s law.
      • Suppose we have an algorithm with time bound \(t(n) = a n^{k}\).
      • Assume we can afford to solve an input of size \(n_0\) in year 0,
        and the number of operations we can afford to run grows by multiplier \(m\) each year.
      • In year \(y\), we can afford to solve inputs of size \(n_0 m^{y / k}\). This is growing exponentially!
    • In contrast, the size of inputs we can process with exponential-time algorithms grows only linearly over time (compare the two in the numeric sketch below).
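
A minimal numeric sketch of this comparison (the yearly multiplier \(m = 2\), starting size \(n_0 = 1000\), and exponent \(k = 3\) are illustrative assumptions):

    import math

    # Ops budget grows by factor m per year.
    # Polynomial t(n) = a*n^k:  affordable n grows like n0 * m^(y/k)  (exponential in y).
    # Exponential t(n) = a*2^n: affordable n grows like n0 + y*log2(m) (linear in y).
    m, n0, k = 2, 1000, 3
    for y in (0, 10, 20, 30):
        poly_n = n0 * m ** (y / k)
        exp_n = n0 + y * math.log2(m)
        print(f"year {y:2d}: poly-time size ~ {poly_n:9.0f}, exp-time size ~ {exp_n:6.1f}")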

Identifying Time Complexity Classes Within \(\P\) Graphically

  • Suppose we care about the different time complexity classes \(O(n^k)\) within \(\P\):

    • How can we visualize their differences?
    • How can we experimentally verify that our code has the expected time complexity?
  • We can try plotting the runtime of our algorithm against different input sizes.
    Example: 3sum

    [Figure: 3sum runtime vs. input size]

  • More specifically, we can use a log-log plot, on which a polynomial time bound \(t(n) = a n^k\) appears as a straight line with slope \(k\).
    Example: 3sum (a sketch of the experiment follows)

    [Figure: 3sum timing experiment on a log-log plot]
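
A minimal sketch of such a timing experiment, assuming the naive \(O(n^3)\) 3sum algorithm (the input distribution and plotting details are illustrative, not the course’s exact setup):

    import random
    import time
    import matplotlib.pyplot as plt

    def three_sum_exists(nums):
        """Naive O(n^3) check: is there a triple summing to zero?"""
        n = len(nums)
        for i in range(n):
            for j in range(i + 1, n):
                for k in range(j + 1, n):
                    if nums[i] + nums[j] + nums[k] == 0:
                        return True
        return False

    sizes = [50, 100, 200, 400]
    times = []
    for n in sizes:
        # Positive inputs have no zero-sum triple, forcing the worst case.
        nums = [random.randint(1, 10**9) for _ in range(n)]
        start = time.perf_counter()
        three_sum_exists(nums)
        times.append(time.perf_counter() - start)

    # On a log-log plot, t(n) = a*n^k appears as a line with slope k (~3 here).
    plt.loglog(sizes, times, "o-")
    plt.xlabel("input size n")
    plt.ylabel("runtime (seconds)")
    plt.title("3sum timing experiment")
    plt.show()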