ECS 120 Theory of Computation
NP-Completeness and Reductions
Julian Panetta
University of California, Davis

Recall \(\NP\): Problems with efficiently verifiable solutions

A polynomial-time verifier for a language \(A\) is a Turing machine \(V\) such that: \[ A = \setbuild{x \in \binary^*}{\left(\exists w \in \binary^{\le |x|^k}\right) \; V \text{ accepts } \encoding{x, w}} \] for some constant \(k\). Furthermore, \(V\) must run in time \(O(|x|^c)\) for some constant \(c\).

Here, \(w\) is called a witness or certificate or proof for \(x\).

\(\NP\) is the class of languages that have polynomial-time verifiers.
We call the language decided by the verifier for \(A \in \NP\) the verification language of \(A\), denoted \(\verifier{A}\).

Recall: Example problems in \(\NP\)

\[\probHamPath = \setbuild{\encoding{G}}{ G \text{ is a directed graph with a Hamiltonian path }}\]

\[ \probClique = \setbuild{\encoding{G, k}}{ G \text{ is an undirected graph with a } k\text{-clique}} \]

\[ \probSubsetSum = \setbuild{\encoding{C, t}}{ C \text{ is a } \textbf{multiset} \text{ of integers, and } (\exists S \subseteq C) \; t = \sum_{y \in S} y} \]
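For instance, here is a minimal sketch of a polynomial-time verifier for \(\probSubsetSum\), taking the witness to be the subset \(S\) itself (the function name and the use of Python lists for the multiset are illustrative assumptions, not part of the formal definition):

from collections import Counter
from typing import List

def verify_subset_sum(C: List[int], t: int, S: List[int]) -> bool:
    # The witness S claims to be a sub-multiset of C that sums to t.
    remaining = Counter(C)
    remaining.subtract(Counter(S))        # a negative count means S uses an element too often
    is_submultiset = all(count >= 0 for count in remaining.values())
    return is_submultiset and sum(S) == t  # both checks run in polynomial time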

The class \(\coNP\)

  • The complements of \(\probHamPath, \probClique, \probSubsetSum\) are not obviously in \(\NP\)
    (and are believed not to be)
    • Why not?
    • Suppose \(G\) has no Hamiltonian path: what’s the short, easy-to-check proof?
    • The only obvious way is the same brute-force approach used to decide \(\probHamPath\): enumerate all \(n!\) permutations of the nodes and check that none of them is a Hamiltonian path.
  • The class of languages whose complements are in \(\NP\) is called \(\coNP\).
  • We don’t know whether \(\NP = \coNP\) (we doubt it), but note:
    • \(\P = \mathsf{coP}\) (why? see the sketch after this list)
    • So proving \(\NP \ne \coNP\) would imply \(\P \ne \NP\).

    Proving \(\coNP \ne \NP\) is at least as hard as proving \(\P \ne \NP\).
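Why \(\P = \mathsf{coP}\): a polynomial-time decider always halts with a yes/no answer, so flipping that answer decides the complement language in the same running time. A minimal sketch, assuming \(A\) is a polynomial-time decider for some \(L \in \P\):

def decide_complement(x: str) -> bool:
    # A is a (hypothetical) polynomial-time decider for L; negating its
    # answer decides the complement of L, so P is closed under complement.
    return not A(x)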

\(\P \subseteq \NP\)

Theorem: \(\P \subseteq \NP\)

Proof:

  • Let \(L \in \P\); then there is a polynomial-time algorithm \(A\) deciding whether \(x \in L\).
  • The following is a verifier for \(L\):
def V(x: str, w: str) -> bool:
    # Ignore the witness w and simply run the polynomial-time decider A.
    return A(x)
  • Why does this work?
    • \(x \in L \iff A \text{ accepts } x \iff (\exists w \in \binary^{|x|})\ V \text{ accepts } \encoding{x,w}.\)
    • In other words, the verifier simply ignores the witness and “verifies” directly by calling the decider; a concrete instance is sketched after this list.
  • Somewhat more obvious that \(\P \subseteq \NP\) using the “nondeterministic polynomial-time Turing machine” characterization of \(\NP\):
    • A deterministic algorithm is a special case of a nondeterministic algorithm that never makes guesses.
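As a concrete (hypothetical) instance of this construction, take the language of binary strings that end in 0, i.e., encodings of even numbers, which is clearly in \(\P\):

def A(x: str) -> bool:
    # Polynomial-time decider: a binary string encodes an even number
    # exactly when its last bit is 0.
    return x.endswith("0")

def V(x: str, w: str) -> bool:
    # Verifier built from the decider: the witness w is ignored entirely.
    return A(x)

print(V("1010", ""))   # True:  x is in the language, so every witness "works"
print(V("1011", "0"))  # False: x is not in the language, so no witness helps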

The (literal) million-dollar question: \(\P = \NP\)?

[Diagram: either \(\P\) is properly contained in \(\NP\), or \(\P = \NP\).]
  • We showed that \(\P \subseteq \NP\), so either
    • \(\P \subsetneq \NP\); or
    • \(\P = \NP\).
  • We don’t know which is true, but most conjecture that \(\P \subsetneq \NP\).
  • The \(\P = \NP\) question is one of the most important open problems in computer science.
  • Informally, it’s asking: “Is finding solutions as easy as verifying that they work?”
    • Intuitively it does seem like it should be harder to find a solution than to verify one…
    • Many important problems are not known to be in \(\P\), but for some reason, most important problems we encounter in practice have the “\(\NP\) property”: You know a good solution when you see one.

\(\NP\) problems are decidable in exponential time

  • For many problems in \(\NP\) we don’t know how to decide them in polynomial time.
  • But at worst, we can use their efficient verifiers to decide them by brute-force search: try all possible witness strings \(w\) and check if any work. How long does this take?

Theorem: \(\NP \subseteq \EXP.\)

Proof:

from itertools import product

k = 7  # witness-length exponent; depends on the language L

def binary_strings_of_length(n: int):
    return ("".join(bits) for bits in product("01", repeat=n))

def exhaustive(x: str) -> bool:
    # Try every witness w with |w| <= |x|^k; accept iff V accepts some <x, w>.
    for i in range(len(x) ** k + 1):
        for w in binary_strings_of_length(i):
            if V(x, w):
                return True
    return False
  • Given problem \(L \in \NP\), let \(V\) be its verifier.
  • Thus \(x \in L \iff (\exists w \in \binary^{\le |x|^k}) \; V \text{ accepts } \encoding{x, w}\).
  • Here \(k\) is some constant, and \(V\) runs in time \(O(|x|^c)\) for some other constant \(c\).
  • We can thus decide \(L\) by:
    • running \(V\) on all possible strings \(w\) of length \(\le |x|^k\)
    • accepting \(x\) if any of the \(\encoding{x, w}\) are accepted by \(V\).
  • The number of strings we must try is \(2^0 + 2^1 + \ldots + 2^{|x|^k} = 2^{|x|^k + 1} - 1 = O(2^{|x|^k}).\)
  • The total runtime is therefore \(O(|x|^c \cdot 2^{|x|^k}) = O(2^{2 |x|^k})\), since \(|x|^c \le 2^{|x|^k}\) for all sufficiently large \(|x|\); thus \(L \in \EXP\).

\(\NP\)-completeness

  • The practical benefit of studying \(\NP\) is being able to identify when a problem you want to solve is fundamentally “hard.”
    • Maybe you just haven’t found the right algorithm yet and should keep trying?
    • But if you can prove your problem is “at least as hard” as the hardest problems in \(\NP\), then you can be sure you’re not missing an obvious trick.
    • If \(\P \ne \NP\) (as many believe), then your problem would have no efficient solution.
  • Problems in \(\NP\) that are “at least as hard” as every other problem in \(\NP\) are called
    \(\NP\)-complete.
    • Intuitively: given an \(\NP\)-complete problem \(A\), any other problem in \(\NP\) can be solved by translating it into an instance of \(A\) and then solving that instance (see the sketch after this list).
    • This process is called a reduction from the other problem to \(A\).
  • Powerful theoretical tool: by showing a problem is \(\NP\)-complete, you show solving it for large inputs is “probably” intractable.
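A minimal sketch of that pattern, with hypothetical names: f is a polynomial-time translation (reduction) from instances of some problem B to instances of the \(\NP\)-complete problem \(A\), and decide_A is an assumed algorithm for \(A\):

def decide_B(x: str) -> bool:
    # Translate the instance x of B into an instance f(x) of A, then solve that.
    # If f runs in polynomial time and decide_A were efficient, decide_B would be too.
    return decide_A(f(x))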

Example \(\NP\)-complete problem: Boolean satisfiability

  • Recall Boolean formulas: expressions involving Boolean variables and the logical operators \(\land\), \(\lor\), and \(\neg\). Example: \[ \phi = (x \land y) \lor (z \land \neg y) \]
  • Represent \(\TRUE\) by \(1\) and \(\FALSE\) by \(0\).
  • Operations defined by truth tables:

Negation/NOT

\[ \begin{array}{c|c} x & \neg x \\ \hline 0 & 1 \\ 1 & 0 \end{array} \]

Conjunction/AND

\[ \begin{array}{cc|c} x & y & x \land y \\ \hline 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{array} \]

Disjunction/OR

\[ \begin{array}{cc|c} x & y & x \lor y \\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{array} \]
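As a quick check of these truth tables on the earlier example \(\phi = (x \land y) \lor (z \land \neg y)\), the formula can be written directly with Python's Boolean operators (purely illustrative):

def phi(x: bool, y: bool, z: bool) -> bool:
    # (x AND y) OR (z AND (NOT y))
    return (x and y) or (z and (not y))

print(phi(False, False, True))  # True:  the assignment x=0, y=0, z=1 satisfies phi
print(phi(False, True, True))   # False: with x=0 and y=1, both clauses fail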

Example \(\NP\)-complete problem: Boolean satisfiability

More formally, we can define Boolean formulas inductively (like with regular expressions):

Given a finite set of variables \(V\), a Boolean formula \(\phi\) is defined as:

  • Base cases:
    1. Any \(x \in V\) is a Boolean formula.
    2. 0 and 1 are each Boolean formulas.
  • Inductive cases:
    Let \(\phi_1\) and \(\phi_2\) be Boolean formulas. Then the following are also Boolean formulas:
    1. \(\neg \phi_1\)
    2. \(\phi_1 \land \phi_2\)
    3. \(\phi_1 \lor \phi_2\)

Precedence: \(\neg\) has highest precedence, followed by \(\land\), then \(\lor\).
Parentheses can be used to override this.

We can use this recursive structure to efficiently parse and evaluate Boolean formulas.
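A minimal sketch of such a recursive evaluator, representing formulas as nested tuples (the representation is an assumption made here for illustration; any parse tree would do):

def evaluate(phi, assignment: dict) -> bool:
    # A formula is a variable name (str), a constant 0 or 1, or a tuple
    # ("not", phi1), ("and", phi1, phi2), or ("or", phi1, phi2).
    if isinstance(phi, str):                     # base case 1: a variable
        return assignment[phi]
    if phi in (0, 1):                            # base case 2: a constant
        return bool(phi)
    op = phi[0]
    if op == "not":
        return not evaluate(phi[1], assignment)
    if op == "and":
        return evaluate(phi[1], assignment) and evaluate(phi[2], assignment)
    if op == "or":
        return evaluate(phi[1], assignment) or evaluate(phi[2], assignment)
    raise ValueError(f"malformed formula: {phi!r}")

# phi = (x AND y) OR (z AND (NOT y))
phi = ("or", ("and", "x", "y"), ("and", "z", ("not", "y")))
print(evaluate(phi, {"x": False, "y": False, "z": True}))  # True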

Example \(\NP\)-complete problem: Boolean satisfiability

A Boolean formula \(\phi\) is satisfiable if there is an assignment of truth values to its variables that makes the formula evaluate to \(1\).

  • Example: \[ \phi = (x \land y) \lor (z \land \neg y) \] is satisfiable because setting \(x = 0, y = 0, z = 1\) yields \(\phi(001) = 1\).

  • We introduce the Boolean satisfiability problem:

    \[ \probSAT = \setbuild{\encoding{\phi}}{ \phi \text{ is a satisfiable Boolean formula }} \]

  • We know \(\probSAT \in \NP\) because we can easily implement a polynomial-time decider for:

    \[ \verifier{\probSAT} = \setbuild{\encoding{\phi, w}}{ \phi \text{ is a Boolean formula, and } \phi(w) = 1 } \]

    Just evaluate \(\phi\) on the assignment \(w\)!
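A sketch of that verifier, reusing a recursive evaluator like the one above; the witness \(w\) is read as one bit per variable (this encoding of the assignment is an illustrative assumption):

def verify_sat(phi, variables: list, w: str) -> bool:
    # The i-th bit of the witness w gives the truth value of the i-th variable.
    if len(w) != len(variables):
        return False
    assignment = {var: (bit == "1") for var, bit in zip(variables, w)}
    return evaluate(phi, assignment)  # evaluate() as sketched earlier; polynomial time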

  • Cook-Levin Theorem: \(\probSAT\) is \(\NP\)-complete; in particular, \(\probSAT \in \P \iff \P = \NP\).
    • In other words: we could prove \(\P = \NP\) by finding a polynomial-time algorithm for \(\probSAT\).
    • \(\probSAT\) is “at least as hard” as any other problem in \(\NP\).