NP-Complete Problems
====================

Some problems are thought to require more time to solve than can be bounded by any polynomial. Their solutions are often of the "try every possible solution" variety. If we are searching over n-bit integers, then there is a solution space of 2^n candidates to search. If we are interested in minimizing the cost of a traveling salesperson visiting n cities, then there are n! possible routings. Problems of this type are often called intractable because, even for modest values of n, exhaustive search is beyond the reach of any computer. Problems which can be solved in polynomial time are considered tractable.

This sounds like a ridiculous way to partition problems: surely a 2^(sqrt(n)) algorithm is more useful than an n^100 one. But in fact 2^(sqrt(n)) gets large pretty fast, and most polynomial-time algorithms met in practice run in n^3 time or less. So, in practice, there appears to be a real gap between tractable and intractable problems.

The theory of "NP-Completeness" has been developed to give evidence that certain problems are indeed intractable. The theory was developed relative to Turing machines and so has a somewhat funny flavor. The theory addressed the problem of whether a language is "decidable". Because of this artifact, we address these topics in terms of "decision problems".

Decision problems all take some input (such as the description of a graph), and then return one of {yes, no}. A typical problem is "Does the graph have a certain property?"

P is the set of all decision problems that can be answered in polynomial time (i.e., the problem has a solution algorithm that runs in O(n^k) time for some fixed constant k, where n is the size of the input).

Examples: Is x even? Does graph G have a cycle?

NP is the set of all decision problems with short proofs for "yes" answers. "Short" means that the proof can be verified in polynomial time, which requires both the length of the proof and the time to process it to be bounded by a polynomial in the size of the input.

Examples: Does graph G have a cycle?
(Proof is a list of edges making a cycle in G.)
Is x composite? (Proof is a pair a, b > 1 such that a * b = x.)

P \subseteq NP
--------------

It is not hard to see that there are problems in NP which are not known to be in P. A standard example is factoring, phrased as a decision problem: "Does x have a nontrivial factor no larger than k?" No polynomial-time algorithm is known, but checking a proof (the factor itself) is trivial. (Plain compositeness testing was long cited as such an example, but it has been known to be in P since the AKS primality test of 2002.)

Because of the formal definition of NP (see your book), it is also clear that no problem is in P but not in NP. This means that either P = NP, or P \subset NP (strictly). Nobody has been able to prove which is the case, but there is good evidence for the latter.

There is a set of problems, NPC (NP-Complete), which are in NP and have the property that if any one of them could be solved in polynomial time, then all problems in NP could be too. It is hard to believe that all problems in NP could be solved in polynomial time, so it is strongly believed that no problem in NPC is in P (which would imply P != NP). Thus, showing that a problem is in NPC is very good evidence that there is no polynomial-time solution for it.

Our approach to showing that a problem is hard is therefore to show that it is in NPC. Our next lecture will focus on how to do this.

NP-complete
-----------

A problem Q is NP-complete if it is
  1. NP-hard, and
  2. in NP (Q \in NP).

NP-hard
-------

A problem Q is NP-hard if every problem in NP can be reduced to it in polynomial time.

Reductions
----------

P reduces to Q (written P \leq_P Q) if P can be solved using Q as a subroutine. Here P and Q name decision problems, not the class P. More formally, T is a polynomial reduction from P to Q if
  1. T translates inputs of type P into ones of type Q in polynomial time, and
  2. P(x) is correctly answered by Q(T(x)).
This means that if P polynomially reduces to Q, then we can build a P-solver as P(x) = Q(T(x)).

Our goal is to use NP-completeness as our standard of proof that a problem is intractable. To show a problem is NP-complete, we need to show, either directly or via reduction, that every problem in NP can be reduced to it.
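The construction P(x) = Q(T(x)) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name solver_via_reduction and the toy problems are invented here. Q is the "Is x even?" problem mentioned earlier, P is taken to be "Is x odd?", and the translation T(x) = x + 1 stands in for a real polynomial-time reduction.

```python
from typing import Callable, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def solver_via_reduction(
    translate: Callable[[X], Y],
    decide_q: Callable[[Y], bool],
) -> Callable[[X], bool]:
    """Build a decider for P out of a translation T and a decider for Q.

    The returned function computes P(x) = Q(T(x)).
    """
    def decide_p(x: X) -> bool:
        return decide_q(translate(x))

    return decide_p


# Toy instance (hypothetical, for illustration only).
def is_even(y: int) -> bool:
    """Decider for Q: 'Is y even?'"""
    return y % 2 == 0


def add_one(x: int) -> int:
    """Translation T: x is odd exactly when x + 1 is even."""
    return x + 1


# A decider for P ('Is x odd?') that only ever calls the Q decider.
is_odd = solver_via_reduction(add_one, is_even)
```

If the translation and the Q decider both run in polynomial time, so does the composed P decider, which is why a fast algorithm for Q would imply a fast algorithm for P.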
Going via reduction is easiest, but we need a known hard problem to reduce from. If we know that P is hard, then we can use a reduction to show that Q is hard, too. The logic is as follows: if P is known to be hard, and we can solve P using Q as a subroutine, then Q must be hard, too, since a fast algorithm for Q would give a fast algorithm for P. Thus, once we know of a hard problem P (one that is NP-complete), any problem Q in NP that we can reduce P to must also be hard (NP-complete).

"Circuit-satisfiability" (CIRCUIT-SAT) is a decision problem that any problem in NP can be reduced to.

Problem: A boolean circuit is a combination of AND, OR and NOT gates (with the size being the number of gates plus the number of wires).
Decision: Is boolean circuit C satisfiable (i.e., is there a combination of inputs that causes the output to be 1)?

To show this problem is NP-complete we must do two things:
  1. Show CIRCUIT-SAT is in NP, and
  2. Give a polynomial method to solve *any* other problem in NP using a CIRCUIT-SAT decider as a subroutine.

#1 is easy. All we need to do is define a certificate format and a verification algorithm. The algorithm needs to verify that the certificate proves a particular circuit is satisfiable. The obvious certificate is a list of inputs for the circuit. The circuit can be simulated on those inputs in polynomial time.

#2 is harder. We need to show how to take an arbitrary problem P in NP and solve instances of P using a circuit-satisfiability solver (all in polynomial time). To do so, we note that P must have a verifier in order to be in NP. That verifier takes a problem instance x and a certificate y and returns either yes or no. To determine the solution for P on input x, we are really asking whether a certificate y exists which causes the verifier to output 1 for input x. To see if problem P answers 1 on x, we build a circuit that has x hard-wired in, takes y as its input wires, simulates the execution of the verifier, and outputs the verifier's answer as its answer. Because the verifier runs in polynomial time, this circuit has polynomial size and can be built in polynomial time. The circuit is satisfiable exactly when some certificate y makes the verifier accept, that is, exactly when P answers 1 on x.
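The certificate check from step #1 can be sketched as follows. The circuit encoding is an assumption made for this example, not a format from these notes: a circuit is a list of gates in topological order, each written as (output wire, gate type, input wires), and a certificate is a dictionary assigning a truth value to every input wire. The function names are likewise hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# (output wire, gate type, input wires); gates are assumed to be listed
# in topological order, so every input wire is defined before it is used.
Gate = Tuple[str, str, Tuple[str, ...]]


def evaluate_circuit(gates: List[Gate], inputs: Dict[str, bool], output: str) -> bool:
    """Simulate the circuit once; the work is linear in gates plus wires."""
    wires = dict(inputs)
    for out, op, args in gates:
        vals = [wires[a] for a in args]
        if op == "AND":
            wires[out] = all(vals)
        elif op == "OR":
            wires[out] = any(vals)
        elif op == "NOT":
            wires[out] = not vals[0]
        else:
            raise ValueError(f"unknown gate type: {op}")
    return wires[output]


def verify_certificate(
    gates: List[Gate],
    input_names: Sequence[str],
    output: str,
    certificate: Dict[str, bool],
) -> bool:
    """Accept only if the certificate assigns every input and the output is 1."""
    if set(certificate) != set(input_names):
        return False
    return evaluate_circuit(gates, certificate, output)


# Example circuit: out = (a AND b) OR (NOT c)
EXAMPLE_GATES: List[Gate] = [
    ("g1", "AND", ("a", "b")),
    ("g2", "NOT", ("c",)),
    ("out", "OR", ("g1", "g2")),
]
```

The verifier never searches for an assignment; it only replays the one it is handed, which is what keeps it polynomial.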
Two NP-Completeness proofs
==========================

3-SAT Reduces to CLIQUE
-----------------------

The reduction goes as follows: For each of the m clauses, create 3 vertices, each representing one of the three literals in the clause. Next, create an edge between every pair of vertices that are not in the same group of three and do not contradict each other (that is, one is not the negation of the other). Now, an m-clique represents a group of m literals, each from a different clause, none of which contradict one another. Setting all of these literals true satisfies the 3-SAT formula. Conversely, a satisfying assignment makes at least one literal true in every clause, and picking one such literal per clause gives an m-clique. So the formula is satisfiable exactly when the graph has a clique of size m.

SAT Reduces to SET-INTERSECTION
-------------------------------

The \textit{set intersection\/} problem is defined as follows: Given finite sets $A_1, A_2, \ldots, A_m$ and $B_1, B_2, \ldots, B_n$, is there a set $T$ such that
\begin{eqnarray*}
|T \cap A_i| \geq 1 & & \mbox{for $i = 1, 2, \ldots, m$, and} \\
|T \cap B_j| \leq 1 & & \mbox{for $j = 1, 2, \ldots, n$?}
\end{eqnarray*}
Show that the set intersection problem is \textbf{NP}-complete. (\textit{Hint: SAT is \textbf{NP}-hard and may be useful in your reduction. Recall that the SAT problem is to determine whether a boolean CNF formula $\phi$ is satisfiable.\/})

\textit{First we demonstrate that set intersection is in \textbf{NP} by giving a certificate format and a verification algorithm for certificates. The certificate is simply a set $T$ that satisfies the constraints. The verification algorithm checks that the certificate is no larger than the sum of the sizes of all $A_i$ and $B_j$ (elements of $T$ that lie in none of these sets can be discarded without affecting any constraint). It then verifies the constraints. The verification for each set takes no longer than the size of the certificate times the size of the set being verified. All of this is polynomial in size and running time.}

\textit{Next we need to show how to reduce from SAT to set intersection. We are given an instance of the SAT problem in the form of a CNF formula $\phi$, and we need to show how to determine whether it is satisfiable using a set intersection solver as a subroutine.
For the $i$-th clause of $\phi$, put each of the literals of the clause into set $A_i$ (e.g., if the $i$-th clause were $(a \vee \overline{b})$, then $A_i$ would be defined as $\{a,\overline{b}\}$). For the $j$-th distinct variable in the formula $\phi$, put the variable and its negation into set $B_j$ (e.g., if the $j$-th variable were $b$, then $B_j$ would be defined as $\{b,\overline{b}\}$). This construction takes time polynomial in the size of $\phi$. Now, a set intersection solver will be able to find an appropriate set $T$ if and only if at least one literal from each clause can be placed into $T$ while no literal and its negation are both in $T$. Such a $T$ represents a collection of literals that, if all are assigned ``true'', satisfy $\phi$; conversely, the literals made true by any satisfying assignment of $\phi$ form a valid $T$.}
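The SAT to set intersection construction above can be sketched in Python. The encoding is an assumption made for this example: literals are nonzero integers in the DIMACS style, so 3 stands for a variable and -3 for its negation. The function names are hypothetical, and the exhaustive solver is included only so the reduction can be tried on tiny formulas; it takes exponential time and is not part of the reduction.

```python
from itertools import combinations
from typing import FrozenSet, List, Tuple

Clause = List[int]  # e.g. [1, -2] encodes (x1 OR NOT x2)


def sat_to_set_intersection(
    cnf: List[Clause],
) -> Tuple[List[FrozenSet[int]], List[FrozenSet[int]]]:
    """Translate a CNF formula into a set intersection instance.

    A_i holds the literals of clause i; B_j holds variable j and its negation.
    """
    a_sets = [frozenset(clause) for clause in cnf]
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    b_sets = [frozenset({v, -v}) for v in variables]
    return a_sets, b_sets


def brute_force_set_intersection(
    a_sets: List[FrozenSet[int]],
    b_sets: List[FrozenSet[int]],
) -> bool:
    """Try every candidate T (exponential; for checking small cases only)."""
    universe = sorted(set().union(*a_sets, *b_sets))
    for size in range(len(universe) + 1):
        for candidate in combinations(universe, size):
            t = set(candidate)
            hits_every_a = all(len(t & a) >= 1 for a in a_sets)
            at_most_one_per_b = all(len(t & b) <= 1 for b in b_sets)
            if hits_every_a and at_most_one_per_b:
                return True
    return False


def is_satisfiable(cnf: List[Clause]) -> bool:
    """Decide SAT as Q(T(x)), with the brute-force solver standing in for Q."""
    return brute_force_set_intersection(*sat_to_set_intersection(cnf))
```

For example, is_satisfiable([[1, -2], [2, 3]]) builds the sets {1, -2} and {2, 3} for the clauses and {1, -1}, {2, -2}, {3, -3} for the variables, then asks the solver whether a suitable T exists.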