ECS 120 Theory of Computation
NP-Completeness and Reductions
Julian Panetta
University of California, Davis

Recall \(\NP\): Problems with Efficiently Verifiable Solutions

A polynomial-time verifier for a language \(A\) is a Turing machine \(V\) such that: \[ A = \setbuild{x \in \binary^*}{\left(\exists w \in \binary^{\le |x|^k}\right) \; V \text{ accepts } \encoding{x, w}} \] for some constant \(k\). Furthermore, \(V\) must run in time \(O(|x|^c)\) for some constant \(c\).

Here, \(w\) is called a witness or certificate or proof for \(x\).

\(\NP\) is the class of languages that have polynomial-time verifiers.
We call the language decided by the verifier for \(A \in \NP\) the verification language of \(A\), denoted \(\verifier{A}\).

Recall Example Problems in \(\NP\)

\[\probHamPath = \setbuild{\encoding{G}}{ G \text{ is a directed graph with a Hamiltonian path }}\]

\[ \probClique = \setbuild{\encoding{G, k}}{ G \text{ is an undirected graph with a } k\text{-clique}} \]

\[ \probSubsetSum = \setbuild{\encoding{C, t}}{ C \text{ is a \textbf{multiset} of integers, and } (\exists S \subseteq C) \; t = \sum_{y \in S} y} \]
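For concreteness, a polynomial-time verifier for \(\probSubsetSum\) only has to check a proposed subset against the instance. The sketch below is illustrative (the function name and representation of \(C\) as a Python list are assumptions, not fixed by the slides):

```python
from collections import Counter

def verify_subset_sum(C, t, S):
    """Verifier sketch for SubsetSum: accept iff the witness S is a
    sub-multiset of C whose elements sum to t.
    (Hypothetical names/representation, for illustration only.)"""
    counts_C, counts_S = Counter(C), Counter(S)
    # S may use each value at most as many times as it occurs in C.
    if any(counts_S[y] > counts_C[y] for y in counts_S):
        return False
    return sum(S) == t
```

Both checks take time polynomial in the size of the instance, as the definition of a verifier requires.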

The Class \(\coNP\)

  • The complements of \(\probClique\) and \(\probSubsetSum\) are not obviously in \(\NP\)
    (and are believed not to be)
    • Why not?
  • The class of languages whose complements are in \(\NP\) is called \(\coNP\).
  • We don’t know whether \(\NP = \coNP\) (we doubt it), but note:
    • \(\P = \text{co-}\P\) (why?)
    • So proving \(\NP \ne \coNP\) would imply \(\P \ne \NP\).

    “Proving \(\coNP \ne \NP\) is at least as hard as proving \(\P \ne \NP\).”

The \(\P\) vs \(\NP\) Question

[Figure: Venn diagrams of the two possibilities, \(\P \subsetneq \NP\) or \(\P = \NP\)]
  • It is obvious that \(\P \subseteq \NP\), so either
    • \(\P \subsetneq \NP\); or
    • \(\P = \NP\).
  • We don’t know which is true, but many conjecture that \(\P \subsetneq \NP\).
  • The \(\P = \NP\) question is one of the most important open problems in computer science.
  • Informally, it’s asking: “Is finding solutions as easy as verifying that they work?”
  • Intuitively it does seem like it should be harder to find a solution than to verify one…

\(\NP\) Problems are Decidable in Exponential Time

  • We don’t know how to solve many problems in \(\NP\) efficiently.
  • But at worst, we can use their efficient verifiers to solve them by brute-force search:
    try all possible witness strings \(w\) and check if any work. How long does this take?

Theorem: \(\NP \subseteq \EXP\)

Proof:

  • Given problem \(A \in \NP\), let \(V\) be its verifier.
  • Thus \(x \in A \iff \exists w \in \binary^{\le |x|^k} \; V \text{ accepts } \encoding{x, w}\).
  • Here \(k\) is some constant, and \(V\) runs in time \(O(|x|^c)\) for some other constant \(c\).
  • We can thus decide \(A\) by:
    • running \(V\) on all possible strings \(w\) of length \(\le |x|^k\)
    • accepting if any of the \(\encoding{x, w}\) are accepted by \(V\).
  • The number of strings we must try is \(2^0 + 2^1 + \ldots + 2^{|x|^k} \fragment{= 2^{|x|^k + 1} - 1} \fragment{= O(2^{|x|^k}).}\)
  • The total runtime is therefore \(O(|x|^c \cdot 2^{|x|^k}) \fragment{= O(2^{2 |x|^k}).}\)
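The search described in this proof can be sketched directly. Here `V` stands for the assumed polynomial-time verifier and `k` for its witness-length bound; the function name is hypothetical:

```python
from itertools import product

def brute_force_decide(x, V, k):
    """Decide membership of x by trying every binary witness w with
    |w| <= |x|**k and running the assumed verifier V on (x, w).
    Exponentially many candidate witnesses give the exponential runtime."""
    bound = len(x) ** k
    for length in range(bound + 1):          # lengths 0, 1, ..., |x|^k
        for bits in product("01", repeat=length):
            if V(x, "".join(bits)):
                return True                  # some witness works: x in A
    return False                             # no witness works: x not in A
```

The inner loop is exactly the sum \(2^0 + 2^1 + \cdots + 2^{|x|^k}\) of verifier calls counted in the proof.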

\(\NP\)-Completeness

  • The practical benefit of studying \(\NP\) is being able to identify when a problem you want to solve is fundamentally “hard.”
    • Maybe you just haven’t found the right algorithm yet and should keep trying?
    • But if you can prove your problem is “at least as hard” as the hardest problems in \(\NP\), then you can be sure you’re not missing an obvious trick.
    • If \(\P \ne \NP\) (as many believe), then your problem would have no efficient solution.
  • Problems in \(\NP\) that are “at least as hard” as every other problem in \(\NP\) are called
    \(\NP\)-complete.
    • Intuitively: given an \(\NP\)-complete problem \(A\), any other problem in \(\NP\) can be solved by translating it into an instance of \(A\) and then solving that.
    • This process is called a reduction from that other problem to \(A\).
  • Powerful theoretical tool: by showing a problem is \(\NP\)-complete, you show solving it for large inputs is “probably” intractable.

Example \(\NP\)-Complete Problem: Boolean Satisfiability

  • Recall Boolean formulas: expressions involving Boolean variables and the logical operators \(\land\), \(\lor\), and \(\neg\). Example: \[ \fragment{\phi = (x \land y) \lor (z \land \neg y)} \]
  • Represent \(\TRUE\) by \(1\) and \(\FALSE\) by \(0\).
  • Operations defined by truth tables:

Negation/NOT

\[ \begin{array}{c|c} x & \neg x \\ \hline 0 & 1 \\ 1 & 0 \end{array} \]

Conjunction/AND

\[ \begin{array}{cc|c} x & y & x \land y \\ \hline 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{array} \]

Disjunction/OR

\[ \begin{array}{cc|c} x & y & x \lor y \\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{array} \]

Example \(\NP\)-Complete Problem: Boolean Satisfiability

More formally, we can define Boolean formulas inductively (like with regular expressions):

Given a finite set of variables \(V\), a Boolean formula \(\phi\) is defined as:

  • Base cases:
    1. Any \(x \in V\) is a Boolean formula.
    2. 0 and 1 are each Boolean formulas.
  • Inductive cases:
    Let \(\phi_1\) and \(\phi_2\) be Boolean formulas. Then the following are also Boolean formulas:
    1. \(\neg \phi_1\)
    2. \(\phi_1 \land \phi_2\)
    3. \(\phi_1 \lor \phi_2\)

Precedence: \(\neg\) has highest precedence, followed by \(\land\), then \(\lor\).
Parentheses can be used to override this.

We can use this recursive structure to efficiently parse and evaluate Boolean formulas.
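As a sketch of such an evaluator, assuming formulas are represented as nested tuples (this representation is an illustrative choice, not fixed by the slides):

```python
def evaluate(phi, assignment):
    """Recursively evaluate a Boolean formula, mirroring the inductive
    definition. phi is a variable name (str), a constant 0/1, or a tuple
    ('not', phi1), ('and', phi1, phi2), or ('or', phi1, phi2)."""
    if isinstance(phi, str):
        return assignment[phi]      # base case 1: a variable
    if phi in (0, 1):
        return phi                  # base case 2: a constant
    op = phi[0]
    if op == "not":
        return 1 - evaluate(phi[1], assignment)
    if op == "and":
        return evaluate(phi[1], assignment) & evaluate(phi[2], assignment)
    if op == "or":
        return evaluate(phi[1], assignment) | evaluate(phi[2], assignment)
    raise ValueError("malformed formula")
```

Each subformula is visited once, so evaluation takes time linear in the size of \(\phi\).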

Example \(\NP\)-Complete Problem: Boolean Satisfiability

A Boolean formula \(\phi\) is satisfiable if there is an assignment of truth values to its variables that makes the formula evaluate to \(1\).

  • Example: \[ \phi = (x \land y) \lor (z \land \neg y) \] is satisfiable because setting \(x = 0, y = 0, z = 1\) yields \(\phi(001) = 1\).

  • We introduce the Boolean satisfiability problem:

    \[ \probSAT = \setbuild{\encoding{\phi}}{ \phi \text{ is a satisfiable Boolean formula }} \]

  • We know \(\probSAT \in \NP\) because we can easily implement a polynomial-time decider for:

    \[ \verifier{\probSAT} = \setbuild{\encoding{\phi, w}}{ \phi \text{ is a Boolean formula, and } \phi(w) = 1 } \]

    Just evaluate \(\phi\) on the assignment \(w\)!

Example \(\NP\)-Complete Problem: Boolean Satisfiability

  • Cook-Levin Theorem: \(\probSAT \in \P \iff \P = \NP\).
    • In other words: we could prove \(\P = \NP\) by finding a polynomial-time algorithm for \(\probSAT\).
    • This is a consequence of \(\probSAT\) being \(\NP\)-complete.
    • \(\probSAT\) is “at least as hard” as any other problem in \(\NP\).

Reductions

  • The way you show problem \(B\) is at least as difficult as problem \(A\) is by exhibiting a reduction from \(A\) to \(B\).
    • Intuitively: we can solve an instance of \(A\) by converting it into an “equivalent” instance of \(B\).
    • Provided the conversion process is efficient, a fast algorithm for \(B\) enables an efficient algorithm for \(A\).
  • Since our notion of “efficiency” is “computable in polynomial time,” we are specifically concerned with polynomial-time reductions.

Simple Reduction Example: \(\probIndSet\) to \(\probClique\)

  • Recall the \(k\)-clique problem: \[ \probClique = \setbuild{\encoding{G, k}}{ G \text{ is an undirected graph with a } k\text{-clique}} \]

  • We now introduce a closely related problem, the \(k\)-independent set problem: \[ \probIndSet = \setbuild{\encoding{G, k}}{ G \text{ is an undirected graph with a } k\text{-independent set}} \]

  • Given input \(\encoding{G_i, k}\) for \(\probIndSet\),
    how can we convert it to input \(\encoding{G_c, k}\) for \(\probClique\) such that:

    \[ \encoding{G_i, k} \in \probIndSet \iff \encoding{G_c, k} \in \probClique \]

    [Figure: a \(k\)-independent set in \(G_i\) corresponds to a \(k\)-clique in the complement graph \(G_c\)]

Definition of Polynomial-Time Reducibility

A function \(f: \binary^* \to \binary^*\) is polynomial-time computable if there exists a polynomial-time algorithm that, given input \(x\), computes \(f(x)\).

Let \(A, B \subseteq \binary^*\) be decision problems. We say \(A\) is polynomial-time reducible to \(B\)
if there is a polynomial-time computable function \(f\) such that for all \(x \in \binary^*\):

\[ \fragment{x \in A \iff f(x) \in B} \]

We denote this by \(A \le^P B\).

\(A \le^P B\) means: “B is at least as hard as A” (to within a polynomial-time factor).

Example Reduction to Bound Complexity of \(\probClique\)

from itertools import combinations

def reduction_from_independent_set_to_clique(G, k):
    # Build the complement graph Gc: same vertex set, with an edge {u, v}
    # exactly where G has none. A k-independent set of G is then a
    # k-clique of Gc, and vice versa. Edges are represented as 2-element
    # sets; combinations(V, 2) already yields only pairs with u != v.
    V, E = G
    Ec = [{u, v} for (u, v) in combinations(V, 2) if {u, v} not in E]
    Gc = (V, Ec)
    return (Gc, k)

# Hypothetical polynomial-time algorithm for Independent Set,
# assuming a polynomial-time algorithm for Clique exists:
def independent_set(G, k):
    Gp, kp = reduction_from_independent_set_to_clique(G, k)
    return clique_algorithm(Gp, kp)

# Hypothetical polynomial-time algorithm for Clique
def clique_algorithm(G, k):
    raise NotImplementedError("Not implemented yet...")
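As a quick sanity check of the complement-graph construction on a toy instance (the helper name `complement_edges` is illustrative):

```python
from itertools import combinations

def complement_edges(V, E):
    """Edges of the complement graph: every vertex pair with no edge in E."""
    return [{u, v} for (u, v) in combinations(V, 2) if {u, v} not in E]

# Path graph 1-2-3: the non-adjacent pair {1, 3} is a 2-independent set
# in G, and becomes the unique edge (a 2-clique) of the complement graph.
V = {1, 2, 3}
E = [{1, 2}, {2, 3}]
Ec = complement_edges(V, E)
```

The conversion examines each of the \(\binom{|V|}{2}\) vertex pairs once, so the reduction runs in polynomial time, as the definition of \(\le^P\) requires.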