Restatement of the Cook-Levin Theorem

Cook-Levin Theorem: \(\probSAT\) and \(\probThreeSAT\) are NP-complete.

Proof: See optional Section 10.10 (and ECS 220).
This is proved “directly” by exhibiting a reduction from a generic problem \(A \in \NP\) to \(\probSAT\).
Once we have these “base” NP-complete problems, we can build up a library of
NP-complete problems by finding reductions from them to other problems in \(\NP\).

Theorem: \(\probIndSet\) and \(\probClique\) are NP-complete.

Proof:

\(\probThreeSAT \le^\P \probIndSet\) as proved earlier, and \(\probThreeSAT\) is NP-hard by the Cook-Levin Theorem, so \(\probIndSet\) is NP-hard.
\(\probIndSet \le^\P \probClique\) as proved earlier, so \(\probClique\) is NP-hard.
\(\probClique, \probIndSet \in \NP\) due to polynomial-time verifiers (shown for \(\probClique\) here).
Thus, both \(\probIndSet\) and \(\probClique\) are NP-complete.
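
To make the "polynomial-time verifier" step concrete, here is a minimal sketch of a verifier for \(\probClique\). It assumes the instance is a graph given as an edge list together with a size \(k\), and the certificate is a list of vertices; the function name and the input encoding are illustrative choices, not taken from the notes.

```python
from itertools import combinations

def verify_clique(edges, k, certificate):
    """Accept iff `certificate` lists k distinct, pairwise adjacent vertices."""
    adjacent = {frozenset(e) for e in edges}  # undirected edges as 2-element sets
    vertices = set(certificate)
    if len(certificate) != k or len(vertices) != k:
        return False  # wrong size, or a repeated vertex
    # Check all O(k^2) pairs; each lookup is a constant-time set membership test.
    return all(frozenset((u, v)) in adjacent for u, v in combinations(vertices, 2))
```

The check looks at every pair of certificate vertices once, so its running time is polynomial in the size of the graph. A verifier for \(\probIndSet\) would be the same except that it requires every pair to be non-adjacent.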

More reductions: \(\probThreeSAT \le^\P \probSubsetSum\)

\(\probSubsetSum = \setbuild{\encoding{C, t}}{ C \text{ is a multiset of integers, and } (\exists S \subseteq C) \; t = \sum_{y \in S} y}\)

\(\probThreeSAT = \setbuild{\encoding{\phi}}{\phi \text{ is a satisfiable 3-CNF formula}}\)

Theorem: \(\probThreeSAT \le^\P \probSubsetSum\).

Proof:

Given a 3-CNF formula \(\phi\), we must output \((C, t)\) such that: \[ \phi \text{ is satisfiable } \iff (\exists S \subseteq C) \; t = \sum_{y \in S} y \]
Suppose \(\phi\) has \(m\) variables \(x_1,\dots,x_m\) and \(l\) clauses \(c_1,\dots,c_l\).
Let \(C = \{y_1,z_1,y_2,z_2,\dots,y_m,z_m,g_1,h_1,g_2,h_2,\dots,g_l,h_l\}\) be a multiset of \(2m + 2l\) integers, and let \(t\) be an integer, all defined as follows.
For example, if \(\phi = (x_1 \vee \overline{x_2} \vee x_3) \wedge (\overline{x_1} \vee x_2 \vee x_3) \wedge (x_1 \vee \overline{x_3} \vee x_4)\), here are the integers in base 10:

\[ \begin{array}{c|cccc|cccc} & x_1 & x_2 & x_3 & x_4 & c_1 & c_2 & c_3 \\ \hline y_1 & 1, & 0 & 0 & 0, & 1 & 0 & 1 \\ z_1 & 1, & 0 & 0 & 0, & 0 & 1 & 0 \\ y_2 & & 1 & 0 & 0, & 0 & 1 & 0 \\ z_2 & & 1 & 0 & 0, & 1 & 0 & 0 \\ y_3 & & & 1 & 0, & 1 & 1 & 0 \\ z_3 & & & 1 & 0, & 0 & 0 & 1 \\ y_4 & & & & 1 & 0 & 0 & 1 \\ z_4 & & & & 1 & 0 & 0 & 0 \\ \hline g_1 & & & & & 1 & 0 & 0 \\ h_1 & & & & & 1 & 0 & 0 \\ g_2 & & & & & & 1 & 0 \\ h_2 & & & & & & 1 & 0 \\ g_3 & & & & & & & 1 \\ h_3 & & & & & & & 1 \\ \hline \hline t & 1 & 1 & 1 & 1 & 3 & 3 & 3 \end{array} \]

Proof: (\(m=\) number of variables, \(l=\) number of clauses)

\(\phi = (x_1 \vee \overline{x_2} \vee x_3) \wedge (\overline{x_1} \vee x_2 \vee x_3) \wedge (x_1 \vee \overline{x_3} \vee x_4)\)

\[ \begin{array}{c|cccc|cccc} & x_1 & x_2 & x_3 & x_4 & c_1 & c_2 & c_3 \\ \hline y_1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\ z_1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\ y_2 & & 1 & 0 & 0 & 0 & 1 & 0 \\ z_2 & & 1 & 0 & 0 & 1 & 0 & 0 \\ y_3 & & & 1 & 0 & 1 & 1 & 0 \\ z_3 & & & 1 & 0 & 0 & 0 & 1 \\ y_4 & & & & 1 & 0 & 0 & 1 \\ z_4 & & & & 1 & 0 & 0 & 0 \\ \hline g_1 & & & & & 1 & 0 & 0 \\ h_1 & & & & & 1 & 0 & 0 \\ g_2 & & & & & & 1 & 0 \\ h_2 & & & & & & 1 & 0 \\ g_3 & & & & & & & 1 \\ h_3 & & & & & & & 1 \\ \hline \hline t & 1 & 1 & 1 & 1 & 3 & 3 & 3 \end{array} \]

More generally, define each integer in \(C\) using \(m + l\) digits in base \(10\):
- \(y_i\) and \(z_i\) have a 1 at index \(i\), 0 at indices \(\{1,\dots,m\} \setminus \{i\}\). (top-left block)
- \(y_i\) has a 1 at index \(m+j\) iff clause \(c_j\) contains literal \(x_i\). (top-right block)
- \(z_i\) has a 1 at index \(m+j\) iff clause \(c_j\) contains literal \(\overline{x_i}\). (top-right block)
- For \(j = 1,\dots,l\), \(g_j\) and \(h_j\) both have a 1 only at index \(m + j\). (bottom-right block)
Finally, set target sum \(t\) to have a 1 in each of the first \(m\) indices and a 3 in each of the last \(l\) indices.
The total number of digits is \((m+l)(2(m+l)+1) = O((m+l)^2)\), so \((C, t)\) can be computed in polynomial time.
Since each column in the table has exactly two or five 1’s, every column sum is at most \(5 < 10\), so there are no carries in the sum of any subset of \(C\).
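
The construction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the notes: the function name `three_sat_to_subset_sum` is hypothetical, a clause is encoded as a list of nonzero integers (\(i\) for \(x_i\), \(-i\) for \(\overline{x_i}\)), and each clause is assumed to contain three distinct literals.

```python
def three_sat_to_subset_sum(m, clauses):
    """Return (C, t) for a 3-CNF formula over variables x_1..x_m.

    C is ordered y_1, z_1, ..., y_m, z_m, g_1, h_1, ..., g_l, h_l.
    """
    l = len(clauses)
    width = m + l  # number of base-10 digits in each integer

    def place(index):
        # Value of a single 1 at digit `index` (1-based, counted from the left).
        return 10 ** (width - index)

    C = []
    for i in range(1, m + 1):
        y = place(i)  # top-left block: 1 at index i
        z = place(i)
        for j, clause in enumerate(clauses, start=1):
            if i in clause:    # c_j contains literal x_i
                y += place(m + j)
            if -i in clause:   # c_j contains the negation of x_i
                z += place(m + j)
        C += [y, z]
    for j in range(1, l + 1):
        C += [place(m + j), place(m + j)]  # g_j and h_j
    # Target: digit 1 in the first m positions, digit 3 in the last l positions.
    t = sum(place(i) for i in range(1, m + 1))
    t += sum(3 * place(m + j) for j in range(1, l + 1))
    return C, t
```

Calling `three_sat_to_subset_sum(4, [[1, -2, 3], [-1, 2, 3], [1, -3, 4]])` on the example formula reproduces the rows of the table, with \(y_1 = 1000101\) and \(t = 1111333\).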

Proof: \((\implies)\): Suppose \(\phi\) is satisfied by \(x=x_1,\dots,x_m\).

For each variable \(x_i\), if \(x_i = 1\), include \(y_i\) in subset \(S\), otherwise include \(z_i\).
- Because exactly one of \(y_i\) or \(z_i\) is in \(S\), the first \(m\) digits of the sum are 1.
For each clause \(c_j\):
- Since \(x\) satisfies \(\phi\), one, two, or three literals in \(c_j\) are true under \(x\).
- If exactly one literal in \(c_j\) is true under \(x\), include both \(g_j\) and \(h_j\) in \(S\).
- If exactly two literals in \(c_j\) are true under \(x\), include exactly one of \(g_j\) or \(h_j\) in \(S\).
- If exactly three literals in \(c_j\) are true under \(x\), include neither \(g_j\) nor \(h_j\) in \(S\).
- This ensures the last \(l\) digits of the sum are 3.
So \(t = \sum_{y \in S} y\).
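
As a check of this direction on the example formula, the following sketch builds \(S\) from an assignment exactly as described above. The table rows are hard-coded as base-10 integers; the names `Y`, `Z`, `G`, `T`, `CLAUSES`, and `subset_from_assignment` are illustrative, and clauses use the assumed encoding \(i\) for \(x_i\) and \(-i\) for \(\overline{x_i}\).

```python
# Rows of the example table, written as base-10 integers.
Y = {1: 1000101, 2: 100010, 3: 10110, 4: 1001}
Z = {1: 1000010, 2: 100100, 3: 10001, 4: 1000}
G = {1: 100, 2: 10, 3: 1}  # h_j has the same value as g_j
T = 1111333
CLAUSES = [[1, -2, 3], [-1, 2, 3], [1, -3, 4]]

def subset_from_assignment(assignment):
    """Map a satisfying assignment {i: bool} to the list of integers in S."""
    S = []
    for i in sorted(Y):
        # Exactly one of y_i, z_i is chosen, giving digit 1 at index i.
        S.append(Y[i] if assignment[i] else Z[i])
    for j, clause in enumerate(CLAUSES, start=1):
        true_literals = sum(1 for lit in clause if assignment[abs(lit)] == (lit > 0))
        if true_literals == 0:
            raise ValueError("assignment does not satisfy clause %d" % j)
        # Top up the clause digit to 3 using g_j and, if needed, h_j.
        S += [G[j]] * (3 - true_literals)
    return S
```

For instance, with \(x_1 = x_2 = 1\) and \(x_3 = x_4 = 0\), clauses \(c_1\) and \(c_2\) each have one true literal and \(c_3\) has two, so \(S\) contains \(y_1, y_2, z_3, z_4, g_1, h_1, g_2, h_2\), and one of \(g_3, h_3\), which sum to \(1111333\).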

Proof: \((\impliedby)\): Suppose \((\exists S \subseteq C)\ t=\sum_{y \in S} y\).

To get digit 1 at each of the first \(m\) indices of \(t\), for each variable \(x_i\), exactly one of \(y_i\) or \(z_i\) must be in \(S\).
If \(y_i \in S\), set \(x_i = 1\); otherwise set \(x_i = 0\); we must show this assignment satisfies \(\phi\).
Consider clause \(c_j\). Since rows \(g_j\) and \(h_j\) contribute at most two 1’s at index \(m+j\), to get digit 3 at index \(m+j\) of \(t\), \(S\) must contain some \(y_i\) or \(z_i\) that has a 1 at index \(m+j\). That is, for some \(i\), either \(y_i \in S\) and \(c_j\) contains literal \(x_i\), or \(z_i \in S\) and \(c_j\) contains literal \(\overline{x_i}\).
- If \(y_i \in S\) then we set \(x_i=1\), so \(x_i\) satisfies \(c_j\).
- If \(z_i \in S\) then we set \(x_i=0\), so \(\overline{x_i}\) satisfies \(c_j\).
Since this holds for all clauses, \(\phi\) is satisfied by the assignment.
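
This direction can also be checked exhaustively on the example, since \(C\) has only 14 elements. The sketch below enumerates every subset that sums to \(t\), decodes an assignment from it by the rule above, and tests the assignment against \(\phi\). The row names and helper functions are illustrative, and clauses again use the assumed encoding \(i\) for \(x_i\) and \(-i\) for \(\overline{x_i}\).

```python
from itertools import combinations

CLAUSES = [[1, -2, 3], [-1, 2, 3], [1, -3, 4]]
ROWS = {
    "y1": 1000101, "z1": 1000010, "y2": 100010, "z2": 100100,
    "y3": 10110, "z3": 10001, "y4": 1001, "z4": 1000,
    "g1": 100, "h1": 100, "g2": 10, "h2": 10, "g3": 1, "h3": 1,
}
T = 1111333

def decode(S):
    """Read an assignment off a set S of row names: x_i = 1 iff y_i is in S."""
    return {i: ("y%d" % i) in S for i in range(1, 5)}

def satisfies(assignment, clauses):
    """True iff every clause has at least one literal made true by the assignment."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def solutions():
    """Yield every subset of rows (as a set of names) whose values sum to T."""
    names = list(ROWS)
    for size in range(len(names) + 1):
        for S in combinations(names, size):
            if sum(ROWS[name] for name in S) == T:
                yield set(S)
```

Brute force over all \(2^{14}\) subsets is only feasible because the example is tiny; it is a sanity check of the argument, not part of the reduction.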

More reductions: \(\probSAT \le^\P \probThreeSAT\)

\(\probSAT = \setbuild{\encoding{\phi}}{\phi \text{ is a satisfiable Boolean formula}} \quad \probThreeSAT = \setbuild{\encoding{\phi}}{\phi \text{ is a satisfiable 3-CNF formula}}\)

Theorem: \(\probSAT \le^\P \probThreeSAT\).

Proof:

Given Boolean formula \(\phi\), we must output 3-CNF formula \(\psi\) so that \(\phi\) is satisfiable \(\iff \psi\) is satisfiable.
Obvious approach: convert \(\phi\) to equivalent 3-CNF formula \(\psi\). (i.e., \(\phi(x) = \psi(x)\) for all assignments \(x\).)
- Problem: \(\psi\) can be exponentially larger than \(\phi\). (see ECS 220)
Instead, we construct \(\psi\) so that \(\phi\) is satisfiable \(\iff \psi\) is satisfiable, but \(\phi\) and \(\psi\) are not equivalent.
- In fact \(\psi\) has more variables than \(\phi\)!
Terminology: each Boolean operator (\(\wedge\), \(\vee\), \(\neg\)) in \(\phi\) is called a gate.
- For example, \(\phi = (x_1 \wedge \overline{x_2}) \vee (x_3 \wedge x_4)\) has four gates.

For each input and each gate in Boolean formula \(\phi\), create variables in 3-CNF \(\psi\) to represent them.
- In the example, \(\phi\) has variables \(x_1,x_2,x_3,x_4\) and gates \(g_1,g_2,g_3,g_4\), so \(\psi\) will have variables \(x_1,x_2,x_3,x_4,g_1,g_2,g_3,g_4\).
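For each two-input gate \(g_i\), add four clauses to \(\psi\) that assert “\(g_i\) is functioning properly”, one for each combination of input values. (A one-input \(\neg\) gate needs only two.) For example, if \(g_2\) is an AND gate with inputs \(x_1\) and \(x_2\):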
- \((\overline{x_1} \wedge \overline{x_2}) \implies \overline{g_2}\) expresses “If \(x_1=0\) and \(x_2=0\), then \(g_2=0\)”.
- \((\overline{x_1} \wedge {x_2}) \implies \overline{g_2}\) expresses “If \(x_1=0\) and \(x_2=1\), then \(g_2=0\)”.
- \(({x_1} \wedge \overline{x_2}) \implies \overline{g_2}\) expresses “If \(x_1=1\) and \(x_2=0\), then \(g_2=0\)”.
- \(({x_1} \wedge {x_2}) \implies {g_2}\) expresses “If \(x_1=1\) and \(x_2=1\), then \(g_2=1\)”.
Wait… aren’t clauses supposed to be \(\vee\)’s of literals?
- Yes! Remember \(p \implies q\) is equivalent to \(\overline{p} \vee q\).
- So \((\overline{x_1} \wedge \overline{x_2}) \implies \overline{g_2}\) is equivalent to \(\overline{(\overline{x_1} \wedge \overline{x_2})} \vee \overline{g_2} = x_1 \vee x_2 \vee \overline{g_2}\).
Finally, if \(g_k\) is the output gate, add clause \((g_k \vee g_k \vee g_k)\) to \(\psi\).
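
The four clauses for a two-input gate can be generated mechanically from its truth table. The sketch below is one way to do it; the function name `gate_clauses` and the representation of a literal as a pair `(variable_name, is_positive)` are assumptions made for illustration.

```python
from itertools import product

def gate_clauses(out, a, b, op):
    """Return four 3-literal clauses asserting out == op(a, b).

    `op` is a Python function on two booleans, e.g. lambda p, q: p and q.
    """
    clauses = []
    for va, vb in product([False, True], repeat=2):
        # Implication: (a == va and b == vb) implies (out == op(va, vb)).
        # Rewritten with (p implies q) == (not p or q) and De Morgan's law,
        # this is a disjunction: a's literal negated, b's literal negated,
        # and out with the polarity the gate should produce.
        clauses.append([(a, not va), (b, not vb), (out, bool(op(va, vb)))])
    return clauses
```

For an AND gate, `gate_clauses("g2", "x1", "x2", lambda p, q: p and q)` returns the clause \(x_1 \vee x_2 \vee \overline{g_2}\) first, matching the conversion worked out above.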

(\(\implies\)) Assume \(\phi\) is satisfied by assignment \(x=x_1\dots x_n\).
- Then assigning variables \(x_i\) in \(\psi\) according to \(x\), and each gate variable \(g_j\) according to the output of that gate on inputs defined by \(x\), satisfies all clauses in \(\psi\).
(\(\impliedby\)) Assume \(\psi\) is satisfied by assignment \(w=x_1 \dots x_n g_1 \dots g_k\). We claim \(x=x_1 \dots x_n\) satisfies \(\phi\).
- Consider gate \(g_j\), whose inputs are either variables \(x_i\) or outputs of other gates.
- Since all clauses in \(\psi\) are satisfied, the clauses corresponding to \(g_j\) ensure
  that the value of \(g_j\) under \(w\) is the correct logical function of the values of its inputs under \(w\).
- By induction on the structure of \(\phi\), the value of every gate variable under \(w\) equals
  the output of that gate when \(\phi\) is evaluated on \(x\).
- Since \((g_k \vee g_k \vee g_k)\) is a clause, the output gate \(g_k\) has value 1 under \(w\).
- Therefore, \(\phi\) is satisfied by assignment \(x=x_1 \dots x_n\).
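
Both directions can be checked by brute force on the example \(\phi = (x_1 \wedge \overline{x_2}) \vee (x_3 \wedge x_4)\). The sketch below assumes a particular gate numbering, which the text of the notes does not fix: \(g_1 = \neg x_2\), \(g_2 = x_1 \wedge g_1\), \(g_3 = x_3 \wedge x_4\), \(g_4 = g_2 \vee g_3\) (the output gate). It also assumes two-literal clauses for the \(\neg\) gate are padded to three literals by repetition, in the same spirit as \((g_k \vee g_k \vee g_k)\). All names are illustrative.

```python
from itertools import product

def implications(out, inputs, op):
    """One clause per input combination, asserting out == op(*inputs).
    A literal is a pair (variable_name, is_positive)."""
    clauses = []
    for values in product([False, True], repeat=len(inputs)):
        clause = [(name, not v) for name, v in zip(inputs, values)]
        clause.append((out, bool(op(*values))))
        while len(clause) < 3:
            clause.append(clause[-1])  # pad to three literals by repetition
        clauses.append(clause)
    return clauses

def phi(a):
    return bool((a["x1"] and not a["x2"]) or (a["x3"] and a["x4"]))

# psi under the assumed gate numbering, plus the output clause.
PSI = (
    implications("g1", ["x2"], lambda p: not p)
    + implications("g2", ["x1", "g1"], lambda p, q: p and q)
    + implications("g3", ["x3", "x4"], lambda p, q: p and q)
    + implications("g4", ["g2", "g3"], lambda p, q: p or q)
    + [[("g4", True)] * 3]
)

def satisfied(clauses, a):
    return all(any(a[name] == positive for name, positive in clause)
               for clause in clauses)

def extends_to_psi(x):
    """True iff some values of g1..g4 extend x to an assignment satisfying PSI."""
    for gs in product([False, True], repeat=4):
        a = dict(x, **dict(zip(["g1", "g2", "g3", "g4"], gs)))
        if satisfied(PSI, a):
            return True
    return False
```

Under these assumptions \(\psi\) has 15 clauses over 8 variables, and an assignment \(x\) satisfies \(\phi\) exactly when it extends to a satisfying assignment of \(\psi\), which is the claim proved above.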
