ECS 120 Theory of Computation
Asymptotic analysis (cont’d) and time complexity
Julian Panetta
University of California, Davis

Last time: Asymptotic analysis

Given nondecreasing \(f, g : \N \to \R^+\), we write \(f = O(g)\) if there is \(c \in \N\) so that, for all \(n \in \N\), \[ f(n) \le c \cdot g(n). \] We call \(g\) an asymptotic upper bound for \(f\). (Like saying \(f \le g\).)

  • This is “Big-O” notation: \(f = O(g)\) means \(f\) grows no faster than \(g\).
  • Important cases:
    • Polynomial bounds: \(f = O(n^c)\) for \(c > 0\) (“\(f\) is polynomially bounded”)
      • Examples: \(n^2, n^3, n^{2.2}, n^{1000}\)
      • Larger than polynomial: \(n^{\log(n)}\) (exponent is not constant; grows with \(n\))
    • Exponential bounds: \(f = O(2^{c n^\delta})\) for some \(c, \delta > 0\)
      • Examples: \(2^n, 2^{100n}, 2^{0.01 n}, (2^n)^2 = 2^{2n}, 2^{n^2}, 2^{\sqrt{n}}, e^{n^2} (=2^{c n^2} \text{ for } c=\log_2 e)\)
      • Also: \(a^{n^\delta}\) for any \(a > 1\) and \(\delta > 0\)

Would be more technically accurate to write \(f \in O(g)\). Many functions are \(O(g)\), and \(f\) is only equal to one of them.
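
Such claims can be spot-checked numerically. Below is a small Python sketch (illustrative, not from the lecture; `witnesses_big_o` is a made-up helper) that tests whether a candidate constant \(c\) witnesses \(f = O(g)\) on a finite sample; such a check can suggest, but never prove, an asymptotic bound.

```python
def witnesses_big_o(f, g, c, n_max=10**6):
    """Return True if f(n) <= c * g(n) for every sampled n up to n_max.
    A finite sample can only suggest f = O(g); it cannot prove it."""
    n = 1
    while n <= n_max:
        if f(n) > c * g(n):
            return False
        n *= 2  # geometric sampling keeps the check fast
    return True

print(witnesses_big_o(lambda n: 10 * n**2, lambda n: n**3, c=10))  # True: 10n^2 = O(n^3)
print(witnesses_big_o(lambda n: 2**n, lambda n: n**10, c=10))      # False: 2^n outgrows c * n^10
```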

Last time: Asymptotic analysis

Given nondecreasing \(f, g : \N \to \R^+\), we write \(f = o(g)\) if \(\lim\limits_{n \to \infty} \frac{f(n)}{g(n)} = 0\). (Like saying \(f < g\).)

  • This is “Little-O” notation: \(f = o(g)\) means \(f\) grows strictly slower than \(g\).
  • We can say algorithm \(A\) is (asymptotically) faster than algorithm \(B\) if \(t_A(n) = o(t_B(n))\).
  • Warning: \(f = O(g)\) together with \(g \ne O(f)\) does not imply \(f = o(g)\), as the following counterexample shows:
data/images/tm/little-o-big-o-counterexample.svg
  • \(f = O(g)\) since \(f(n) \le g(n)\) for all \(n\).
  • \(g \ne O(f)\) since, for any \(c > 0\), \(g(n) > c \cdot f(n)\) for infinitely many \(n\).
  • \(f \ne o(g)\) since \(\frac{f(n)}{g(n)} = 1\) for infinitely many \(n\).
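
A concrete nondecreasing pair with exactly this behavior (a Python sketch; the functions drawn in the figure may differ) takes \(g(n) = n\) and lets \(f\) be a staircase that repeatedly catches up to \(g\), then falls far behind:

```python
def f(n):
    """Largest number of the form 2^(2^k) that is <= n (valid for n >= 2):
    a nondecreasing staircase that equals n whenever n = 2^(2^k), but has
    fallen to about sqrt(n) just before each jump."""
    k = 0
    while 2 ** (2 ** (k + 1)) <= n:
        k += 1
    return 2 ** (2 ** k)

def g(n):
    return n

# f = O(g) with c = 1; f(n)/g(n) = 1 infinitely often, so f != o(g);
# g(n)/f(n) is unbounded along another subsequence, so g != O(f).
for n in (16, 255, 65536, 2**32 - 1):
    print(n, f(n) / g(n), g(n) / f(n))
```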

Strategies for comparing growth rates

  • To show \(f = o(g)\) you can always apply the definition and prove: \(\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0\).
  • But usually there is an easier shortcut that you can apply; a numeric sanity check of these shortcuts appears after this list.
    1. You might notice that \(g(n)\) is of the form \(f(n) \cdot h(n)\) with an unbounded \(h(n)\).
      • Example: \(\quad f(n) = n, \quad g(n) = n \log n, \fragment{\quad g(n) = f(n) \cdot \overbrace{\log n}^{h(n)}}\)
      • Since \(\lim\limits_{n\to\infty} \log n = \infty\) (\(\log n\) is unbounded), we conclude \(n = o(n \log(n))\).
      • Similarly \(\quad n = o(n^2),\quad n^2=o(n^3),\quad n^3=o(n^{3.1}),\quad \dots\)
    2. Simplify by removing constants and lower-order terms (without changing the growth rate):
      • Examples: \(10 n^7 + 100 n^6 \log n + n^2 + 10n = O(n^7), \qquad\quad 2^n + n^{100} + 2^n = O(2^n)\)
      • This is helpful for the common case of an algorithm with multiple stages of different complexities.
    3. Applying a log or raising to a power less than \(1\) shrinks the growth rate.
      \(\log(n) = o(n), \quad \fragment{\sqrt{n} = o(n),} \quad \fragment{\log(n^4) = o(n^4)}\) (actually \(\log(n^4) = 4 \log(n) = O(\log(n))\))
    4. Try taking a log of both functions since \(\log(f(n)) = o(\log(g(n)))\) implies \(f(n) = o(g(n))\).
      • Example: compare \(n^n\) to \(2^{n^2}\) \(\log(n^n) = \fragment{n \log n},\ \log(2^{n^2}) = \fragment{n^2,} \fragment{\text{ and } n \log n = o(n^2)} \fragment{\implies n^n = o(2^{n^2})}\)
      • Warning: \(f(n) = o(g(n))\) does not imply \(\log(f(n)) = o(\log(g(n)))\) (Counterexample: \(2^n = o(2^{2n})\), but \(n \ne o(2n)\))
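
The shortcuts above lend themselves to quick numeric sanity checks; here is a Python sketch (illustrative only, since finite samples suggest a limit rather than prove it):

```python
import math

# Strategy 1: n = o(n log n), since n / (n log n) = 1 / log n -> 0.
for n in (10, 10**3, 10**6):
    print(n / (n * math.log(n)))

# Strategy 4 (the log trick): compare n^n with 2^(n^2) through logarithms,
# which also sidesteps computing the astronomically large values themselves.
for n in (10, 100, 1000):
    log_f = n * math.log(n)      # ln(n^n)
    log_g = n * n * math.log(2)  # ln(2^(n^2))
    print(log_f / log_g)         # tends to 0, so n^n = o(2^(n^2))
```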

Asymptotic analysis facts to remember

  • \(1 = o(\log \log n)\) (any unbounded function outgrows any constant)
  • \(\log \log n = o(\log n)\)
  • \(\log_a n = O(\log_b n)\) for any \(a,b > 1\), since \(\log_a n = \frac{\log_b n}{\log_b a}\) and \(\log_b a = O(1)\)
  • \(t(n) = o(t(n)^c)\) for any \(c > 1\) (assuming \(t(n) \ne O(1)\))
    • e.g., \(\log n = o(\log^2 n)\)
  • \(\log^c n = o(n^k)\) for any \(c,k > 0\)
  • \(\sqrt{n} = o(n)\)
  • \(n^c = o(n^k)\) if \(c < k \quad\) (\(\sqrt{n} = o(n)\) is special case \(c=0.5,k=1\))
    • This is a special case of \(t(n) = o(t(n)^c)\); e.g., \(n^2 = o(n^3)\) because \((n^2)^{1.5} = n^{2 \cdot 1.5} = n^3\).
  • \(n^c = o(2^{n^\delta})\) for any \(c > 0\) and any \(\delta > 0\)
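
Two of these facts, checked numerically in Python (a sketch; note in the second check how late the asymptotics kick in):

```python
import math

# log log n = o(log n): the ratio drifts toward 0. (Natural log is used;
# by the log_a n = O(log_b n) fact above, the base only changes constants.)
for n in (10**2, 10**8, 10**32):
    print(math.log(math.log(n)) / math.log(n))

# log^c n = o(n^k) even for c = 3 and k = 0.1, but the ratio *rises* until
# about n = e^30 ~ 10^13 and only then falls: small-n experiments can
# badly mislead about asymptotic behavior.
for n in (10**4, 10**13, 10**40, 10**80):
    print(math.log(n) ** 3 / n ** 0.1)
```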

Convenient notation for “reverse” relations

\[ \begin{aligned} f = O(g) &\iff (\exists c \in \N) (\forall n)\ f(n) \le c \cdot g(n) & f \text{ "$\le$" } g \\ f = o(g) &\iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0 & f \text{ "$<$" } g \\ f = \Omega(g) &\iff g = O(f) & f \text{ "$\ge$" } g \\ f = \omega(g) &\iff g = o(f) & f \text{ "$>$" } g \\ f = \Theta(g) &\iff f = O(g) \text{ and } f = \Omega(g) & f \text{ "$=$" } g \end{aligned} \]

  • Very common to see authors write \(f = O(g)\) when they really mean \(f = \Theta(g)\).
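  • For example, \(3n^2 + n = \Theta(n^2)\): we have \(3n^2 + n = O(n^2)\) (take \(c = 4\), since \(3n^2 + n \le 4n^2\) for \(n \ge 1\)) and \(3n^2 + n = \Omega(n^2)\) (since \(n^2 \le 3n^2 + n\)).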

Some practice

Notation Meaning
\(\quad f = O(g)\) \(f \text{ "$\le$" } g\)
\(\quad f = o(g)\) \(f \text{ "$<$" } g\)
\(\quad f = \Omega(g)\) \(f \text{ "$\ge$" } g\)
\(\quad f = \omega(g)\) \(f \text{ "$>$" } g\)
\(\quad f = \Theta(g)\) \(f \text{ "$=$" } g\)

Which of the following statements are correct?

  • \(n \log n = O(n^2)\)
  • \(n \log \log n = \Omega(n^2)\)
  • \(2 n^c = \Theta(2^{c \log n})\)
  • \(2^{0.0000001 n} = o(n^{100000})\)
  • \(\log(n) = \omega(\sqrt{n})\)

Time complexity classes

  • Can every problem be solved in \(O(n)\), \(O(n^2)\), or maybe \(O(n^{100})\) time
    if we simply find the right algorithm?

  • It turns out, no:
    for any time bound \(t\), there are problems that can’t be solved in \(O(t)\) time,
    but can be solved if we allow more time.

    “Given more time, a TM can solve more problems!”

  • We formalize this by introducing notation for “problems solvable in \(O(t)\) time.”

    Let \(t: \N \to \N^+\) be a time bound. We define the time complexity class \[ \fragment{\time{t} =} \fragment{\setbuild{L \subseteq \binary^*}{\fragment{L \text{ is decided by a TM with running time } O(t)}}} \]

    \(\time{t} \subseteq \powerset(\binary^*)\)

Let \(\REG\) denote the class of regular languages.
True or false: \(\REG \subseteq \time{n}\)

  • True
  • False

For all time bounds \(t_1\) and \(t_2\) such that \(t_1 = O(t_2)\),
\(\time{t_1} \subseteq \time{t_2}\)

  • True
  • False

Time complexity classes

Time Hierarchy Theorem:
Let \(t_1(n)\) and \(t_2(n)\) be time bounds such that \(t_1(n) \log(n) = o(t_2(n))\).
Then \(\time{t_1} \subsetneq \time{t_2}\).

There is a problem that can be solved in \(O(t_2(n))\) time, but not \(O(t_1(n))\) time!

Consequences:

  • \(\time{n^c} \subsetneq \time{n^{k}}\) for all \(c < k\), e.g., there are problems solvable in quadratic time, but not linear time.
  • \(\time{n^c} \subsetneq \time{2^{n^\delta}}\) for all \(c, \delta > 0\)
  • \(\time{2^{cn}} \subsetneq \time{2^{k n}}\) for all \(c < k\)
  • There is an infinite hierarchy of time complexity classes!
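
As a quick check that the first consequence satisfies the theorem's hypothesis: taking \(t_1(n) = n^c\) and \(t_2(n) = n^k\) with \(c < k\) gives \[ t_1(n) \log(n) = n^c \log n = o(n^k), \] since \(\log n = o(n^{k-c})\) by the facts listed earlier.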

The complexity class \(\P\)

A special time complexity class containing problems with “efficient” solutions:

We denote by \(\P\) the class of languages decidable in polynomial time by a (deterministic) Turing machine: \[ \P = \bigcup_{k = 1}^\infty \time{n^k} \]
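
For example, a language decidable in \(O(n^2 \log n)\) time is in \(\P\): since \(n^2 \log n = O(n^3)\), it belongs to \(\time{n^3}\), one of the classes in the union.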

  • Why this notion of efficiency?
    • It is robust to changes in our model of computation.
      (All “reasonable” models of computation are polynomially equivalent.)
    • Exponential time algorithms are often “brute-force” approaches, while polynomial-time algorithms
      (when they exist) involve deeper insights into the problem. (E.g., Dynamic programming)
    • \(\P\) roughly captures the class of problems that are realistically solvable in practice.

Robustness of the definition of \(\P\)

Recall our model of computation time:

The time complexity of a decider \(M\) is the time bound function \(t : \mathbb{N} \to \mathbb{N}^+\) defined as \[ t(n) = \max_{x \in \binary^n} \texttt{time}_M(x), \]

where \(\texttt{time}_M(x)\) is the number of configurations that \(M\) visits on input \(x\).

  • Specific choices made in this definition:
    • \(M\) is a single-tape, deterministic Turing machine.
    • \(n\) is the length of the input’s encoding as a binary string.
  • What if we make different choices?
    • If we use a two-tape Turing machine, we would gain at most a quadratic speedup.
      Optional Section 9.3: any \(O(t)\) two-tape TM can be simulated by an \(O(t^2)\) single-tape TM.
      Therefore membership in \(\P\) is unchanged (\(O(n^k) \to O(n^{2k})\))
    • Any “reasonable” input encoding also does not change the class \(\P\).
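
To make the step-counting model concrete, here is a minimal Python sketch of a single-tape simulator that reports the number of configurations visited; the machine format and the example machine are hypothetical, not from the lecture.

```python
def run_and_count(delta, q_accept, q_reject, x, blank="_"):
    """Simulate a single-tape TM whose transition function is given as
    delta[(state, symbol)] = (new_state, write_symbol, move), move in {-1, +1}.
    Return (accepted, number of configurations visited), i.e. time_M(x)."""
    tape = dict(enumerate(x))
    state, head, steps = "q0", 0, 1  # the initial configuration counts
    while state not in (q_accept, q_reject):
        symbol = tape.get(head, blank)
        state, tape[head], move = delta[(state, symbol)]
        head = max(head + move, 0)  # the head cannot move left of cell 0
        steps += 1
    return state == q_accept, steps

# Example: scan right and accept at the first blank. This machine decides
# the language {0,1}* and visits n + 2 configurations on inputs of length n,
# so its time complexity is t(n) = n + 2 = O(n).
delta = {("q0", "0"): ("q0", "0", +1),
         ("q0", "1"): ("q0", "1", +1),
         ("q0", "_"): ("qA", "_", +1)}
print(run_and_count(delta, "qA", "qR", "010110"))  # (True, 8)
```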

Invariance to “reasonable” input encodings

  • Let \(\encoding{\mathcal{X}}_1\) and \(\encoding{\mathcal{X}}_2\) be two different encodings of input object \(\mathcal{X}\) into binary strings.
    Define an “encoding blowup factor” from \(\encoding{\mathcal{X}}_1\) to \(\encoding{\mathcal{X}}_2\) as: \[ f_{1 \to 2}(n) = \max_{\mathcal{X} \text{ s.t. } |\encoding{\mathcal{X}}_1| = n} |\encoding{\mathcal{X}}_2| \]

  • If \(f_{1 \to 2}(n) = O(n^c)\) and \(f_{2 \to 1}(n) = O(n^c)\) (i.e., each encoding length is within a polynomial of the other), each encoding yields the same definition of \(\P\).
    (Assuming encoding/decoding takes polynomial time.)

  • Example of reasonable encodings for graph \(G = (V, E)\):

    data/images/tm/example_graph.svg
    • Node and edge lists:
      \(\encoding{G}_1 = \texttt{ascii\_to\_binary}(((1,2,3,4), ((1,2),(2,3),(3,1),(1,4))))\)

    • Binary adjacency matrix: \[ \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ \end{bmatrix} \quad \Longrightarrow \quad \encoding{G}_2 = 0111101011001000 \]
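
Both encodings are easy to generate; the Python sketch below is illustrative (the exact ASCII formatting of the first encoding is an assumption, so its length may differ slightly from the slide's).

```python
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 1), (1, 4)]

# Encoding 1: the node and edge lists as ASCII text, each byte spelled in binary.
text = repr((tuple(V), tuple(E)))
enc1 = "".join(format(byte, "08b") for byte in text.encode("ascii"))

# Encoding 2: the adjacency matrix, one bit per entry, in row-major order.
adj = [[1 if (u, v) in E or (v, u) in E else 0 for v in V] for u in V]
enc2 = "".join(str(bit) for row in adj for bit in row)

print(len(enc1), len(enc2))  # each length is polynomial in the other
print(enc2)                  # 0111101011001000, matching the slide
```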