ECS 120 Theory of Computation
NP Completeness and Reductions, Pt 4
Julian Panetta
University of California, Davis

Restatement of the Cook-Levin Theorem

Cook-Levin Theorem: \(\probSAT\) and \(\probThreeSAT\) are NP-complete.

  • Proof: Optional Section 10.10 (and ECS 220)
  • This is proved “directly” by exhibiting a reduction from a generic problem \(A \in \NP.\)
  • Once we have these “base” NP-complete problems, we can build up a library of
    NP-complete problems by reducing known NP-complete problems to other problems in \(\NP\).

Theorem: \(\probIndSet\) and \(\probClique\) are NP-complete.

Proof:

  • \(\probThreeSAT \le^\P \probIndSet\) as proved earlier, so \(\probIndSet\) is NP-hard.
  • \(\probIndSet \le^\P \probClique\) as proved earlier, so \(\probClique\) is NP-hard.
  • \(\probClique, \probIndSet \in \NP\) due to polynomial-time verifiers (shown for \(\probClique\) here).
  • Thus, both \(\probIndSet\) and \(\probClique\) are NP-complete.
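
A polynomial-time verifier of the kind used in this proof can be sketched as follows. This is only an illustration: the function name `verify_clique`, the edge-list encoding of the graph, and the use of a vertex list as the certificate are assumptions, not notation from the lecture.

```python
from itertools import combinations

def verify_clique(vertices, edges, k, certificate):
    """Accept iff certificate lists k distinct vertices that are pairwise adjacent.

    Assumed encoding: vertices is an iterable of labels and edges is an
    iterable of 2-tuples of labels (undirected).
    """
    adjacent = {frozenset(e) for e in edges}
    chosen = set(certificate)
    # The certificate must name exactly k distinct vertices of the graph.
    if len(certificate) != k or len(chosen) != k or not chosen <= set(vertices):
        return False
    # O(k^2) pair checks, each a set lookup: polynomial in the input size.
    return all(frozenset(pair) in adjacent for pair in combinations(chosen, 2))

# A triangle 1-2-3 with an extra vertex 4 attached to 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
is_clique = verify_clique([1, 2, 3, 4], edges, 3, [1, 2, 3])
not_clique = verify_clique([1, 2, 3, 4], edges, 3, [2, 3, 4])
```

An analogous verifier for \(\probIndSet\) would instead check that no pair of chosen vertices is adjacent.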

More reductions: \(\probThreeSAT \le^\P \probSubsetSum\)

\(\probSubsetSum = \setbuild{\encoding{C, t}}{ C \text{ is a multiset of integers, and } (\exists S \subseteq C) \; t = \sum_{y \in S} y}\)

\(\probThreeSAT = \setbuild{\encoding{\phi}}{\phi \text{ is a satisfiable 3-CNF formula}}\)

Theorem: \(\probThreeSAT \le^\P \probSubsetSum\).

Proof:

  • Given a 3-CNF formula \(\phi\), we must output \((C, t)\) such that: \[ \phi \text{ is satisfiable } \iff (\exists S \subseteq C) \; t = \sum_{y \in S} y \]

  • Suppose \(\phi\) has \(m\) variables \(x_1,\dots,x_m\) and \(l\) clauses \(c_1,\dots,c_l\).

  • Let \(C = \{y_1,z_1,y_2,z_2,\dots,y_m,z_m,g_1,h_1,g_2,h_2,\dots,g_l,h_l\}\) be a multiset of \(2m + 2l\) integers and let \(t\) be an integer, defined as follows.

  • For example, if \(\phi = (x_1 \vee \overline{x_2} \vee x_3) \wedge (\overline{x_1} \vee x_2 \vee x_3) \wedge (x_1 \vee \overline{x_3} \vee x_4)\), here are the integers in base 10:

\[ \begin{array}{c|cccc|ccc} & x_1 & x_2 & x_3 & x_4 & c_1 & c_2 & c_3 \\ \hline y_1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\ z_1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\ y_2 & & 1 & 0 & 0 & 0 & 1 & 0 \\ z_2 & & 1 & 0 & 0 & 1 & 0 & 0 \\ y_3 & & & 1 & 0 & 1 & 1 & 0 \\ z_3 & & & 1 & 0 & 0 & 0 & 1 \\ y_4 & & & & 1 & 0 & 0 & 1 \\ z_4 & & & & 1 & 0 & 0 & 0 \\ \hline g_1 & & & & & 1 & 0 & 0 \\ h_1 & & & & & 1 & 0 & 0 \\ g_2 & & & & & & 1 & 0 \\ h_2 & & & & & & 1 & 0 \\ g_3 & & & & & & & 1 \\ h_3 & & & & & & & 1 \\ \hline \hline t & 1 & 1 & 1 & 1 & 3 & 3 & 3 \end{array} \]

  • More generally, define each integer in \(C\) using \(m + l\) digits in base \(10\):
    • \(y_i\) and \(z_i\) have a 1 at index \(i\), 0 at indices \(\{1,\dots,m\} \setminus \{i\}\). (top-left block)
    • \(y_i\) has a 1 at index \(m+j\) iff clause \(c_j\) contains literal \(x_i\). (top-right block)
    • \(z_i\) has a 1 at index \(m+j\) iff clause \(c_j\) contains literal \(\overline{x_i}\). (top-right block)
    • For \(j = 1,\dots,l\), \(g_j\) and \(h_j\) both have a 1 only at index \(m + j\). (bottom-right block)
  • Finally, set target sum \(t\) to have a 1 in each of the first \(m\) indices and a 3 in each of the last \(l\) indices.
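  • The total number of digits written is \((m+l)(2(m+l)+1) = O((m+l)^2)\): there are \(2(m+l)+1\) integers (including \(t\)) with \(m+l\) digits each, so \((C, t)\) can be computed in polynomial time.
  • Each of the first \(m\) columns in the table has exactly two 1’s, and each of the last \(l\) columns has at most five (one per literal of \(c_j\), plus \(g_j\) and \(h_j\)). Every column sum is therefore less than 10, so there are no carries in the sum of any subset of \(C\).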
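
The construction above can be sketched in code. This is a minimal illustration under assumed conventions that are not from the lecture: a clause is a tuple of nonzero integers, with `i` standing for \(x_i\) and `-i` for \(\overline{x_i}\), and the multiset \(C\) is returned as a dictionary from row names to integers. The function name `three_sat_to_subset_sum` is hypothetical.

```python
def three_sat_to_subset_sum(num_vars, clauses):
    """Build (C, t) from a 3-CNF formula with num_vars variables.

    Assumed encoding: clauses is a list of tuples of nonzero ints,
    where i means x_i and -i means the negation of x_i.
    """
    m, l = num_vars, len(clauses)

    def number(positions):
        # Index p (1-based, counted from the left) has weight 10**(m + l - p).
        return sum(10 ** (m + l - p) for p in positions)

    C = {}
    for i in range(1, m + 1):
        # y_i marks the clauses containing x_i; z_i those containing its negation.
        C[f"y{i}"] = number([i] + [m + j for j, c in enumerate(clauses, 1) if i in c])
        C[f"z{i}"] = number([i] + [m + j for j, c in enumerate(clauses, 1) if -i in c])
    for j in range(1, l + 1):
        C[f"g{j}"] = C[f"h{j}"] = number([m + j])
    # Target: digit 1 at the first m indices, digit 3 at the last l indices.
    t = number(range(1, m + 1)) + 3 * number(range(m + 1, m + l + 1))
    return C, t

# The example formula from the table.
phi = [(1, -2, 3), (-1, 2, 3), (1, -3, 4)]
C, t = three_sat_to_subset_sum(4, phi)
```

For the example formula this reproduces the rows of the table, for instance `C["y1"] == 1000101` and `t == 1111333`.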

Proof: \((\implies)\): Suppose \(\phi\) is satisfied by \(x=x_1,\dots,x_m\).

  • For each variable \(x_i\), if \(x_i = 1\), include \(y_i\) in subset \(S\), otherwise include \(z_i\).
    • Because exactly one of \(y_i\) or \(z_i\) is in \(S\), the first \(m\) digits of the sum are 1.
  • For each clause \(c_j\):
    • Since \(x\) satisfies \(\phi\), one, two, or three literals in \(c_j\) are true under \(x\).
    • If exactly one literal in \(c_j\) is true under \(x\), include both \(g_j\) and \(h_j\) in \(S\).
    • If exactly two literals in \(c_j\) are true under \(x\), include exactly one of \(g_j\) or \(h_j\) in \(S\).
    • If exactly three literals in \(c_j\) are true under \(x\), include neither \(g_j\) nor \(h_j\) in \(S\).
    • This ensures the last \(l\) digits of the sum are 3.
  • So \(t = \sum_{y \in S} y\).
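
As a concrete check of this direction, the sketch below builds \(S\) from a satisfying assignment of the example formula and sums it. The integers are copied from the example table. The clause encoding (`i` for \(x_i\), `-i` for \(\overline{x_i}\)), the helper name `subset_from_assignment`, and the particular assignment chosen are assumptions made for illustration.

```python
# Integers from the example table, keyed by row name.
C = {"y1": 1000101, "z1": 1000010, "y2": 100010, "z2": 100100,
     "y3": 10110, "z3": 10001, "y4": 1001, "z4": 1000,
     "g1": 100, "h1": 100, "g2": 10, "h2": 10, "g3": 1, "h3": 1}
t = 1111333
phi = [(1, -2, 3), (-1, 2, 3), (1, -3, 4)]

def subset_from_assignment(clauses, x):
    """x maps each variable index to 0 or 1 and is assumed to satisfy every clause."""
    # Pick y_i when x_i = 1 and z_i when x_i = 0.
    S = [f"y{i}" if x[i] else f"z{i}" for i in sorted(x)]
    for j, clause in enumerate(clauses, 1):
        true_literals = sum(1 for lit in clause if x[abs(lit)] == int(lit > 0))
        # 1 true literal: add g_j and h_j; 2: add g_j only; 3: add neither.
        S += [f"g{j}", f"h{j}"][: 3 - true_literals]
    return S

x = {1: 1, 2: 1, 3: 0, 4: 0}  # one satisfying assignment of phi
S = subset_from_assignment(phi, x)
total = sum(C[name] for name in S)
```

Under this assignment \(c_1\) and \(c_2\) each have one true literal and \(c_3\) has two, so \(S\) contains \(g_1, h_1, g_2, h_2\) and \(g_3\), and `total` equals `t`.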

Proof: \((\impliedby)\): Suppose \((\exists S \subseteq C)\ t=\sum_{y \in S} y\).

  • To get digit 1 at each of the first \(m\) indices of \(t\), for each variable \(x_i\), exactly one of \(y_i\) or \(z_i\) must be in \(S\).
  • If \(y_i \in S\), set \(x_i = 1\); otherwise set \(x_i = 0\). We must show this assignment satisfies \(\phi\).
  • Consider clause \(c_j\). Rows \(g_j\) and \(h_j\) contribute at most 2 to the digit at index \(m+j\), so to get digit 3 there, \(S\) must contain some \(y_i\) or \(z_i\) that has a 1 at index \(m+j\).
    • If that element is \(y_i\), then \(c_j\) contains literal \(x_i\) and we set \(x_i=1\), so \(x_i\) satisfies \(c_j\).
    • If that element is \(z_i\), then \(c_j\) contains literal \(\overline{x_i}\) and, since \(y_i \notin S\), we set \(x_i=0\), so \(\overline{x_i}\) satisfies \(c_j\).
  • Since this holds for all clauses, \(\phi\) is satisfied by the assignment.
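
This direction can be checked exhaustively on the example, since \(C\) has only 14 elements. The sketch below enumerates every subset summing to \(t\), recovers the assignment from each, and tests it against \(\phi\). The integers are copied from the example table; the clause encoding and the helper names `assignment_from_subset` and `satisfies` are assumptions for illustration, and brute force is only feasible for tiny instances.

```python
from itertools import combinations

# Integers from the example table, keyed by row name.
C = {"y1": 1000101, "z1": 1000010, "y2": 100010, "z2": 100100,
     "y3": 10110, "z3": 10001, "y4": 1001, "z4": 1000,
     "g1": 100, "h1": 100, "g2": 10, "h2": 10, "g3": 1, "h3": 1}
t = 1111333
phi = [(1, -2, 3), (-1, 2, 3), (1, -3, 4)]

def assignment_from_subset(num_vars, S):
    """Set x_i = 1 if y_i is in S, otherwise x_i = 0."""
    return {i: int(f"y{i}" in S) for i in range(1, num_vars + 1)}

def satisfies(clauses, x):
    # Literal i is true when x[i] == 1; literal -i is true when x[i] == 0.
    return all(any(x[abs(lit)] == int(lit > 0) for lit in clause) for clause in clauses)

# All 2**14 subsets of row names whose integers sum to t.
solutions = [S for r in range(len(C) + 1)
             for S in combinations(C, r)
             if sum(C[name] for name in S) == t]
recovered = [assignment_from_subset(4, S) for S in solutions]
all_satisfy = all(satisfies(phi, x) for x in recovered)
```

Every recovered assignment should satisfy \(\phi\), which is what the argument above predicts.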

More reductions: \(\probSAT \le^\P \probThreeSAT\)

\(\probSAT = \setbuild{\encoding{\phi}}{\phi \text{ is a satisfiable Boolean formula}} \quad \probThreeSAT = \setbuild{\encoding{\phi}}{\phi \text{ is a satisfiable 3-CNF formula}}\)

Theorem: \(\probSAT \le^\P \probThreeSAT\).

Proof:

  • Given Boolean formula \(\phi\), we must output 3-CNF formula \(\psi\) so that \(\phi\) is satisfiable \(\iff \psi\) is satisfiable.
  • Obvious approach: convert \(\phi\) to equivalent 3-CNF formula \(\psi\). (i.e., \(\phi(x) = \psi(x)\) for all assignments \(x\).)
    • Problem: \(\psi\) can be exponentially larger than \(\phi\). (see ECS 220)
  • Instead, we construct \(\psi\) so that \(\phi\) is satisfiable \(\iff \psi\) is satisfiable, but \(\phi\) and \(\psi\) are not equivalent.
    • In fact \(\psi\) has more variables than \(\phi\)!
  • Terminology: each Boolean operator (\(\wedge\), \(\vee\), \(\neg\)) in \(\phi\) is called a gate.
    • For example, \(\phi = (x_1 \wedge \overline{x_2}) \vee (x_3 \wedge x_4)\) has four gates.
[Figure: data/images/complexity/formula.svg]

  • For each input variable and each gate of the Boolean formula \(\phi\), create a variable in the 3-CNF formula \(\psi\) to represent it.
    • In the example, \(\phi\) has variables \(x_1,x_2,x_3,x_4\) and gates \(g_1,g_2,g_3,g_4\), so \(\psi\) will have variables \(x_1,x_2,x_3,x_4,g_1,g_2,g_3,g_4\).
  • For each two-input gate \(g_i\), add four clauses to \(\psi\) that assert “\(g_i\) is functioning properly”. For example, for an \(\wedge\) gate \(g_2\) with inputs \(x_1\) and \(x_2\):
    • \((\overline{x_1} \wedge \overline{x_2}) \implies \overline{g_2}\) expresses “If \(x_1=0\) and \(x_2=0\), then \(g_2=0\)”.
    • \((\overline{x_1} \wedge {x_2}) \implies \overline{g_2}\) expresses “If \(x_1=0\) and \(x_2=1\), then \(g_2=0\)”.
    • \(({x_1} \wedge \overline{x_2}) \implies \overline{g_2}\) expresses “If \(x_1=1\) and \(x_2=0\), then \(g_2=0\)”.
    • \(({x_1} \wedge {x_2}) \implies {g_2}\) expresses “If \(x_1=1\) and \(x_2=1\), then \(g_2=1\)”.
  • Wait… aren’t clauses supposed to be \(\vee\)’s of literals?
    • Yes! Remember \(p \implies q\) is equivalent to \(\overline{p} \vee q\).
    • So \((\overline{x_1} \wedge \overline{x_2}) \implies \overline{g_2}\) is equivalent to \(\overline{(\overline{x_1} \wedge \overline{x_2})} \vee \overline{g_2} = x_1 \vee x_2 \vee \overline{g_2}\).
  • Finally, if \(g_k\) is the output gate, add clause \((g_k \vee g_k \vee g_k)\) to \(\psi\).
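
The four clauses for a two-input gate can be generated mechanically from the gate's truth table. The sketch below is one way to do it, with an assumed representation that is not from the lecture: a literal is a pair `(name, value)` that is true when the variable `name` has that value, so `("x1", 1)` is \(x_1\) and `("x1", 0)` is \(\overline{x_1}\). The function name `gate_clauses` is hypothetical.

```python
from itertools import product

def gate_clauses(out, in1, in2, op):
    """Four 3-literal clauses asserting that out equals op(in1, in2).

    A literal is a pair (variable_name, wanted_value); this encoding is an
    assumption made for illustration.
    """
    clauses = []
    for a, b in product([0, 1], repeat=2):
        # "(in1 = a and in2 = b) implies out = op(a, b)" is equivalent to
        # "in1 != a  or  in2 != b  or  out = op(a, b)".
        clauses.append(((in1, 1 - a), (in2, 1 - b), (out, int(op(a, b)))))
    return clauses

# An AND gate g2 with inputs x1 and x2, as in the example above.
and_clauses = gate_clauses("g2", "x1", "x2", lambda a, b: a & b)
```

The first clause produced is `(("x1", 1), ("x2", 1), ("g2", 0))`, that is, \(x_1 \vee x_2 \vee \overline{g_2}\), matching the conversion shown above.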

  • (\(\implies\)) Assume \(\phi\) is satisfied by assignment \(x=x_1\dots x_n\).
    • Then assigning variables \(x_i\) in \(\psi\) according to \(x\), and each gate variable \(g_j\) according to the output of that gate on inputs defined by \(x\), satisfies all clauses in \(\psi\).
  • (\(\impliedby\)) Assume \(\psi\) is satisfied by assignment \(w=x_1 \dots x_n g_1 \dots g_k\). We claim \(x=x_1 \dots x_n\) satisfies \(\phi\).
    • Consider gate \(g_j\) with inputs that are either variables \(x_i\) or outputs of other gates.
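    • Since all clauses in \(\psi\) are satisfied, the clauses corresponding to \(g_j\) ensure
      that the value of \(g_j\) under \(w\) is the correct logical function of the values of its inputs under \(w\).
    • By induction on the structure of \(\phi\), each gate variable \(g_j\) therefore has, under \(w\), the value that gate outputs when \(\phi\) is evaluated on \(x\).
    • Since \((g_k \vee g_k \vee g_k)\) is a clause, the output gate variable \(g_k\) has value 1 under \(w\), so the output gate of \(\phi\) outputs 1 on \(x\).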
    • Therefore, \(\phi\) is satisfied by assignment \(x=x_1 \dots x_n\).
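
The whole reduction can be tested by brute force on small formulas. The sketch below is an illustration under several assumptions that are not from the lecture: formulas are nested tuples such as `("and", f, g)`, gate variables are numbered in the order the gates are visited (which may differ from the figure), input variables are not named like `g1`, a literal `(name, value)` is true when `name` has that value, and one-input \(\neg\) gates get two clauses padded with a repeated literal.

```python
from itertools import product

def variables(f):
    """Names of the input variables of a formula tree."""
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(child) for child in f[1:]))

def evaluate(f, x):
    """Evaluate a tree: a variable name, ("not", f), ("and", f, g) or ("or", f, g)."""
    if isinstance(f, str):
        return x[f]
    if f[0] == "not":
        return 1 - evaluate(f[1], x)
    a, b = evaluate(f[1], x), evaluate(f[2], x)
    return a & b if f[0] == "and" else a | b

def to_3cnf(f):
    """Return the clauses of psi, with one new variable per gate of f."""
    clauses, gates = [], []

    def visit(node):
        if isinstance(node, str):
            return node
        inputs = [visit(child) for child in node[1:]]
        g = f"g{len(gates) + 1}"
        gates.append(g)
        if node[0] == "not":
            # Assumed handling of one-input gates: g must differ from its input.
            (a,) = inputs
            clauses.extend([((a, 0), (g, 0), (g, 0)), ((a, 1), (g, 1), (g, 1))])
        else:
            a, b = inputs
            for va, vb in product([0, 1], repeat=2):
                out = va & vb if node[0] == "and" else va | vb
                clauses.append(((a, 1 - va), (b, 1 - vb), (g, out)))
        return g

    top = visit(f)
    clauses.append(((top, 1),) * 3)  # the output gate must be 1
    return clauses

def cnf_holds(clauses, w):
    return all(any(w[v] == b for v, b in clause) for clause in clauses)

def brute_force_sat(names, accepts):
    names = sorted(names)
    return any(accepts(dict(zip(names, values)))
               for values in product([0, 1], repeat=len(names)))

phi = ("or", ("and", "x1", ("not", "x2")), ("and", "x3", "x4"))
psi = to_3cnf(phi)
psi_vars = {v for clause in psi for v, _ in clause}
phi_sat = brute_force_sat(variables(phi), lambda x: evaluate(phi, x) == 1)
psi_sat = brute_force_sat(psi_vars, lambda w: cnf_holds(psi, w))
```

For the example formula, \(\psi\) has eight variables and both searches report satisfiable. On an unsatisfiable input such as `("and", "x1", ("not", "x1"))` both should report unsatisfiable.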