ECS 120 Theory of Computation
NP-Completeness and Reductions
Julian Panetta
University of California, Davis

Recall \(\NP\): Problems with efficiently verifiable solutions

A polynomial-time verifier for a language \(A\) is a Turing machine \(V\) such that: \[ A = \setbuild{x \in \binary^*}{\left(\exists w \in \binary^{\le |x|^k}\right) \; V \text{ accepts } \encoding{x, w}} \] for some constant \(k\). Furthermore, \(V\) must run in time \(O(|x|^c)\) for some constant \(c\).

Here, \(w\) is called a witness or certificate or proof for \(x\).

\(\NP\) is the class of languages that have polynomial-time verifiers.
We call the language decided by the verifier for \(A \in \NP\) the verification language of \(A\), denoted \(\verifier{A}\).

Recall: Example problems in \(\NP\)

\[\probHamPath = \setbuild{\encoding{G}}{ G \text{ is a directed graph with a Hamiltonian path }}\]

\[ \probClique = \setbuild{\encoding{G, k}}{ G \text{ is an undirected graph with a } k\text{-clique}} \]

\[ \probSubsetSum = \setbuild{\encoding{C, t}}{ C \text{ is a } \textbf{multiset} \text{ of integers, and } (\exists S \subseteq C) \; t = \sum_{y \in S} y} \]
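For instance, here is a minimal sketch of a polynomial-time verifier for \(\probSubsetSum\), taking the witness to be the subset \(S\) itself (the function name and the use of Python lists for the multiset are illustrative assumptions, not part of the formal definition):

from collections import Counter
from typing import List

def verify_subset_sum(C: List[int], t: int, S: List[int]) -> bool:
    # The witness S claims to be a sub-multiset of C that sums to t.
    remaining = Counter(C)
    remaining.subtract(Counter(S))        # a negative count means S uses an element too often
    is_submultiset = all(count >= 0 for count in remaining.values())
    return is_submultiset and sum(S) == t  # both checks run in polynomial time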

The class \(\coNP\)

  • The complements of \(\probHamPath, \probClique, \probSubsetSum\) are not obviously in \(\NP\)
    (and are believed not to be)
    • Why not?
    • Suppose \(G\) has no Hamiltonian path: what’s the short, easy-to-check proof?
    • The only obvious way is the same brute-force approach used to decide \(\probHamPath\): enumerate all \(n!\) permutations of the nodes and check that none of them is a Hamiltonian path.
  • The class of languages whose complements are in \(\NP\) is called \(\coNP\).
  • We don’t know whether \(\NP = \coNP\) (we doubt it), but note:
    • \(\P = \mathsf{coP}\) (why? see the sketch after this list)
    • So proving \(\NP \ne \coNP\) would imply \(\P \ne \NP\).

    Proving \(\coNP \ne \NP\) is at least as hard as proving \(\P \ne \NP\).
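Why \(\P = \mathsf{coP}\): a polynomial-time decider always halts with a yes/no answer, so flipping that answer decides the complement language in the same running time. A minimal sketch, assuming \(A\) is a polynomial-time decider for some \(L \in \P\):

def decide_complement(x: str) -> bool:
    # A is a (hypothetical) polynomial-time decider for L; negating its
    # answer decides the complement of L, so P is closed under complement.
    return not A(x)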

\(\P \subseteq \NP\)

Theorem: \(\P \subseteq \NP\)

Proof:

  • Let \(L \in \P\); then there is a polynomial-time algorithm \(A\) deciding whether \(x \in L\).
  • The following is a verifier for \(L\):
def V(x: str, w: str) -> bool:
    # Ignore the witness w and simply run the polynomial-time decider A.
    return A(x)
  • Why does this work?
    • \(x \in L \iff A \text{ accepts } x \iff (\exists w \in \binary^{|x|})\ V \text{ accepts } \encoding{x,w}.\)
    • In other words, the verifier simply ignores the witness and “verifies” directly by calling the decider; a concrete instance is sketched after this list.
  • Somewhat more obvious that \(\P \subseteq \NP\) using the “nondeterministic polynomial-time Turing machine” characterization of \(\NP\):
    • A deterministic algorithm is a special case of a nondeterministic algorithm that never makes guesses.
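As a concrete (hypothetical) instance of this construction, take the language of binary strings that end in 0, i.e., encodings of even numbers, which is clearly in \(\P\):

def A(x: str) -> bool:
    # Polynomial-time decider: a binary string encodes an even number
    # exactly when its last bit is 0.
    return x.endswith("0")

def V(x: str, w: str) -> bool:
    # Verifier built from the decider: the witness w is ignored entirely.
    return A(x)

print(V("1010", ""))   # True:  x is in the language, so every witness "works"
print(V("1011", "0"))  # False: x is not in the language, so no witness helps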

The (literal) million-dollar question: \(\P = \NP\)?

[Diagram: either \(\P\) is properly contained in \(\NP\), or \(\P = \NP\).]
  • We showed that \(\P \subseteq \NP\), so either
    • \(\P \subsetneq \NP\); or
    • \(\P = \NP\).
  • We don’t know which is true, but most conjecture that \(\P \subsetneq \NP\).
  • The \(\P = \NP\) question is one of the most important open problems in computer science.
  • Informally, it’s asking: “Is finding solutions as easy as verifying that they work?”
    • Intuitively it does seem like it should be harder to find a solution than to verify one…
    • Many important problems are not known to be in \(\P\), but for some reason, most important problems we encounter in practice have the “\(\NP\) property”: You know a good solution when you see one.

\(\NP\) problems are decidable in exponential time

  • For many problems in \(\NP\) we don’t know how to decide them in polynomial time.
  • But at worst, we can use their efficient verifiers to decide them by brute-force search: try all possible witness strings \(w\) and check if any work. How long does this take?

Theorem: \(\NP \subseteq \EXP.\)

Proof:

from itertools import product

k = 7  # witness-length exponent; depends on the language L

def binary_strings_of_length(n: int):
    return ("".join(bits) for bits in product("01", repeat=n))

def exhaustive(x: str) -> bool:
    # Try every witness w with |w| <= |x|^k; accept iff V accepts some <x, w>.
    for i in range(len(x) ** k + 1):
        for w in binary_strings_of_length(i):
            if V(x, w):
                return True
    return False
  • Given problem \(L \in \NP\), let \(V\) be its verifier.
  • Thus \(x \in L \iff (\exists w \in \binary^{\le |x|^k}) \; V \text{ accepts } \encoding{x, w}\).
  • Here \(k\) is some constant, and \(V\) runs in time \(O(|x|^c)\) for some other constant \(c\).
  • We can thus decide \(L\) by:
    • running \(V\) on all possible strings \(w\) of length \(\le |x|^k\)
    • accepting \(x\) if any of the \(\encoding{x, w}\) are accepted by \(V\).
  • The number of strings we must try is \(2^0 + 2^1 + \ldots + 2^{|x|^k} = 2^{|x|^k + 1} - 1 = O(2^{|x|^k}).\)
  • The total runtime is therefore \(O(|x|^c \cdot 2^{|x|^k}) = O(2^{2 |x|^k})\), since \(|x|^c \le 2^{|x|^k}\) for all sufficiently large \(|x|\); thus \(L \in \EXP\).

\(\NP\)-completeness

  • The practical benefit of studying \(\NP\) is being able to identify when a problem you want to solve is fundamentally “hard.”
    • Maybe you just haven’t found the right algorithm yet and should keep trying?
    • But if you can prove your problem is “at least as hard” as the hardest problems in \(\NP\), then you can be sure you’re not missing an obvious trick.
    • If \(\P \ne \NP\) (as many believe), then your problem would have no efficient solution.
  • Problems in \(\NP\) that are “at least as hard” as every other problem in \(\NP\) are called
    \(\NP\)-complete.
    • Intuitively: given an \(\NP\)-complete problem \(A\), any other problem in \(\NP\) can be solved by translating it into an instance of \(A\) and then solving that instance (see the sketch after this list).
    • This process is called a reduction from the other problem to \(A\).
  • Powerful theoretical tool: by showing a problem is \(\NP\)-complete, you show solving it for large inputs is “probably” intractable.
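A minimal sketch of that pattern, with hypothetical names: f is a polynomial-time translation (reduction) from instances of some problem B to instances of the \(\NP\)-complete problem \(A\), and decide_A is an assumed algorithm for \(A\):

def decide_B(x: str) -> bool:
    # Translate the instance x of B into an instance f(x) of A, then solve that.
    # If f runs in polynomial time and decide_A were efficient, decide_B would be too.
    return decide_A(f(x))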

Example \(\NP\)-complete problem: Boolean satisfiability

  • Recall Boolean formulas: expressions involving Boolean variables and the logical operators \(\land\), \(\lor\), and \(\neg\). Example: \[ \phi = (x \land y) \lor (z \land \neg y) \]
  • Represent \(\TRUE\) by \(1\) and \(\FALSE\) by \(0\).
  • Operations defined by truth tables:

Negation/NOT

\[ \begin{array}{c|c} x & \neg x \\ \hline 0 & 1 \\ 1 & 0 \end{array} \]

Conjunction/AND

\[ \begin{array}{cc|c} x & y & x \land y \\ \hline 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{array} \]

Disjunction/OR

\[ \begin{array}{cc|c} x & y & x \lor y \\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{array} \]
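As a quick check of these truth tables on the earlier example \(\phi = (x \land y) \lor (z \land \neg y)\), the formula can be written directly with Python's Boolean operators (purely illustrative):

def phi(x: bool, y: bool, z: bool) -> bool:
    # (x AND y) OR (z AND (NOT y))
    return (x and y) or (z and (not y))

print(phi(False, False, True))  # True:  the assignment x=0, y=0, z=1 satisfies phi
print(phi(False, True, True))   # False: with x=0 and y=1, both clauses fail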

Example \(\NP\)-complete problem: Boolean satisfiability

More formally, we can define Boolean formulas inductively (like with regular expressions):

Given a finite set of variables \(V\), a Boolean formula \(\phi\) is defined as:

  • Base cases:
    1. Any \(x \in V\) is a Boolean formula.
    2. 0 and 1 are each Boolean formulas.
  • Inductive cases:
    Let \(\phi_1\) and \(\phi_2\) be Boolean formulas. Then the following are also Boolean formulas:
    1. \(\neg \phi_1\)
    2. \(\phi_1 \land \phi_2\)
    3. \(\phi_1 \lor \phi_2\)

Precedence: \(\neg\) has highest precedence, followed by \(\land\), then \(\lor\).
Parentheses can be used to override this.

We can use this recursive structure to efficiently parse and evaluate Boolean formulas.
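A minimal sketch of such a recursive evaluator, representing formulas as nested tuples (the representation is an assumption made here for illustration; any parse tree would do):

def evaluate(phi, assignment: dict) -> bool:
    # A formula is a variable name (str), a constant 0 or 1, or a tuple
    # ("not", phi1), ("and", phi1, phi2), or ("or", phi1, phi2).
    if isinstance(phi, str):                     # base case 1: a variable
        return assignment[phi]
    if phi in (0, 1):                            # base case 2: a constant
        return bool(phi)
    op = phi[0]
    if op == "not":
        return not evaluate(phi[1], assignment)
    if op == "and":
        return evaluate(phi[1], assignment) and evaluate(phi[2], assignment)
    if op == "or":
        return evaluate(phi[1], assignment) or evaluate(phi[2], assignment)
    raise ValueError(f"malformed formula: {phi!r}")

# phi = (x AND y) OR (z AND (NOT y))
phi = ("or", ("and", "x", "y"), ("and", "z", ("not", "y")))
print(evaluate(phi, {"x": False, "y": False, "z": True}))  # True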

Example \(\NP\)-complete problem: Boolean satisfiability

A Boolean formula \(\phi\) is satisfiable if there is an assignment of truth values to its variables that makes the formula evaluate to \(1\).

  • Example: \[ \phi = (x \land y) \lor (z \land \neg y) \] is satisfiable because setting \(x = 0, y = 0, z = 1\) yields \(\phi(001) = 1\).

  • We introduce the Boolean satisfiability problem:

    \[ \probSAT = \setbuild{\encoding{\phi}}{ \phi \text{ is a satisfiable Boolean formula }} \]

  • We know \(\probSAT \in \NP\) because we can easily implement a polynomial-time decider for:

    \[ \verifier{\probSAT} = \setbuild{\encoding{\phi, w}}{ \phi \text{ is a Boolean formula, and } \phi(w) = 1 } \]

    Just evaluate \(\phi\) on the assignment \(w\)!
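A sketch of that verifier, reusing a recursive evaluator like the one above; the witness \(w\) is read as one bit per variable (this encoding of the assignment is an illustrative assumption):

def verify_sat(phi, variables: list, w: str) -> bool:
    # The i-th bit of the witness w gives the truth value of the i-th variable.
    if len(w) != len(variables):
        return False
    assignment = {var: (bit == "1") for var, bit in zip(variables, w)}
    return evaluate(phi, assignment)  # evaluate() as sketched earlier; polynomial time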

  • Cook-Levin Theorem: \(\probSAT\) is \(\NP\)-complete; in particular, \(\probSAT \in \P \iff \P = \NP\).
    • In other words: we could prove \(\P = \NP\) by finding a polynomial-time algorithm for \(\probSAT\).
    • \(\probSAT\) is “at least as hard” as any other problem in \(\NP\).