So far we’ve seen problems that can be solved efficiently (in polynomial time).
Next we’ll look at problems that we do not know how to solve efficiently, but whose solutions we can verify efficiently.
These belong to the class of languages/problems NP (nondeterministic polynomial time).
\(\NP\): solvable in polynomial time by a nondeterministic Turing machine.
To prove a problem is in NP, we need to exhibit a polynomial-time verifier for it.
Let’s see some examples!
Given a directed graph \(G\), check if a Hamiltonian path exists.
\[\probHamPath = \setbuild{\encoding{G}}{ G \text{ is a directed graph with a Hamiltonian path }}\]
Is \(\probHamPath \in \P\)? We don’t know of any polynomial-time algorithm for it!
However, we can efficiently verify a solution that is given to us: \[ \fragment{\verifier{\probHamPath} = \setbuild{\encoding{G, p}}{ G \text{ is a directed graph with the Hamiltonian path } p }} \]
def ham_path_verify(G, p):
    V, E = G
    # verify each pair of adjacent nodes in p is connected by an edge in `E`
    for i in range(len(p) - 1):
        if (p[i], p[i+1]) not in E:
            return False
    # verify p and V have same number of nodes
    if len(p) != len(V):
        return False
    # verify each node appears at most once in p
    if len(set(p)) != len(p):
        return False
    return True
Why is this algorithm polynomial-time?
Therefore \(\probHamPath \in \NP\).
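To make the gap between verifying and solving concrete, here is a minimal sketch that runs the verifier on a small graph and then uses it inside a brute-force decider. The graph encoding (a list of nodes plus a set of directed edge pairs), the example graphs, and the helper name `ham_path_decide_slow` are assumptions made for illustration.

```python
from itertools import permutations

def ham_path_verify(G, p):
    V, E = G
    # every consecutive pair of nodes in p must be joined by a directed edge
    for i in range(len(p) - 1):
        if (p[i], p[i + 1]) not in E:
            return False
    # p must have as many nodes as V, with no node repeated
    if len(p) != len(V):
        return False
    if len(set(p)) != len(p):
        return False
    return True

def ham_path_decide_slow(G):
    # hypothetical helper: try every ordering of V as a candidate witness
    V, E = G
    return any(ham_path_verify(G, list(p)) for p in permutations(V))

# assumed encoding: (list of nodes, set of directed edges)
G = (["a", "b", "c", "d"], {("a", "b"), ("b", "c"), ("c", "d"), ("d", "b")})
H = (["a", "b", "c"], {("a", "b"), ("a", "c")})

print(ham_path_verify(G, ["a", "b", "c", "d"]))  # True
print(ham_path_verify(G, ["a", "c", "b", "d"]))  # False: no edge (a, c)
print(ham_path_decide_slow(G))                   # True
print(ham_path_decide_slow(H))                   # False
```

Each call to the verifier is fast, but the decider may call it on all \(|V|!\) orderings, so this sketch does not show that \(\probHamPath \in \P\).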
\[ \probComposites = \setbuild{\encoding{n}}{ n \in \N^+ \text{ and } n = p q \text{ for some integers } p, q \ge 2} \]
Is \(\probComposites \in \P\)?
Surprisingly, yes! But this is not obvious: it follows from the AKS primality test, discovered in 2002.
But it’s easy to prove that \(\probComposites \in \NP\).
What is the verification language? \[ \verifier{\probComposites} = \left\{\fragment{\encoding{n, d} \; \big| \; n, d \in \N^+,} \; \fragment{d \text{ divides } n,} \; \fragment{\text{ and } \; 1 < d < n}\right\} \]
What’s an algorithm for deciding \(\verifier{\probComposites}\)?
def composite_verify(n: int, d: int) -> bool:
    return 1 < d < n and n % d == 0
What’s the time complexity in terms of the number of bits in \(n\) and \(d\)?
def composites_decide_slow(n: int) -> bool:
    for d in range(2, n):
        if composite_verify(n, d):
            return True
    return False
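The brute-force decider is slow when time is measured in the number of bits of \(n\): a \(b\)-bit prime forces roughly \(2^b\) calls to the verifier. The sketch below counts those calls. The helper name `candidates_tried` is hypothetical, and the two inputs are primes chosen for illustration.

```python
def composite_verify(n: int, d: int) -> bool:
    return 1 < d < n and n % d == 0

def candidates_tried(n: int) -> int:
    """Hypothetical helper: count candidate divisors tried before the search stops."""
    tried = 0
    for d in range(2, n):
        tried += 1
        if composite_verify(n, d):
            break
    return tried

# 251 and 65521 are prime, so no candidate succeeds and all n - 2 are tried
for n in [251, 65521]:
    print(n.bit_length(), candidates_tried(n))
# prints:
# 8 249
# 16 65519
```

Doubling the input length from 8 to 16 bits multiplies the work by about 256, while each individual call to `composite_verify` stays cheap.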
Now let’s formally define the class \(\NP\).
A polynomial-time verifier for a language \(A\) is a Turing machine \(V\) such that: \[ \fragment{A = \setbuild{x \in \binary^*}{\fragment{\left(\exists w \in \binary^{\le |x|^k}\right)} \; \fragment{V \text{ accepts } \encoding{x, w}}}} \] for some constant \(k\). Furthermore, \(V\) must run in time \(O(|x|^c)\) for some constant \(c\).
\(\NP\) is the class of languages that have polynomial-time verifiers.
We call the language decided by the verifier for \(A \in \NP\) the verification language of \(A\), denoted \(\verifier{A}\).
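The definition can be read as a recipe for a (slow) decider: try every witness \(w\) up to the length bound and accept if the verifier accepts any of them. Here is a minimal sketch of that idea. The function names, the toy verifier, and the choice of witness bound \(|x|\) (that is, \(k = 1\)) are assumptions for illustration, not part of the definition.

```python
from itertools import product

def decide_by_brute_force(verify, x, max_witness_len):
    # try every binary string w with |w| <= max_witness_len
    for length in range(max_witness_len + 1):
        for bits in product("01", repeat=length):
            if verify(x, "".join(bits)):
                return True
    return False

def composite_verify_bits(x: str, w: str) -> bool:
    # toy verifier (assumed encoding): x and w are binary numerals for n and d
    if not x or not w:
        return False
    n, d = int(x, 2), int(w, 2)
    return 1 < d < n and n % d == 0

print(decide_by_brute_force(composite_verify_bits, "1111", 4))  # 15 = 3 * 5: True
print(decide_by_brute_force(composite_verify_bits, "1101", 4))  # 13 is prime: False
```

The loop visits up to \(2^{|x|^k + 1} - 1\) witnesses, so a polynomial-time verifier only yields an exponential-time decider this way.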
A clique in a graph \(G\) is a subset of vertices such that every two distinct vertices in it are connected by an edge.
A \(k\)-clique is a clique with \(k\) vertices.
\[ \probClique = \setbuild{\encoding{G, k}}{ G \text{ is an undirected graph with a } k\text{-clique}} \]
Theorem: \(\probClique \in \NP\).
Proof: the following is a polynomial-time verifier, deciding \[ \verifier{\probClique} = \setbuild{\encoding{G, \fragment{k, C}}}{ G \text{ is an undirected graph } \fragment{\text{with a } k\text{-clique } C}} \]
from itertools import combinations as subsets
def clique_verifier(G, k, C):
    V, E = G
    # verify C is the correct size, and k is not too large
    if len(C) != k or k > len(V):
        return False
    # verify each pair of nodes in C shares an edge
    # (G is undirected, so accept the edge in either orientation)
    for (u, v) in subsets(C, 2):
        if (u, v) not in E and (v, u) not in E:
            return False
    return True
Why is this algorithm polynomial-time?
Is the witness short enough?
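As a rough check on both questions, the sketch below runs a clique verifier on a small graph and counts the pairs it inspects. The graph encoding (a list of nodes plus a set of edge pairs, each undirected edge stored once in some orientation) and the helper `pairs_checked` are assumptions for illustration.

```python
from itertools import combinations

def clique_verifier(G, k, C):
    V, E = G
    # C must have exactly k nodes, and k cannot exceed the number of nodes
    if len(C) != k or k > len(V):
        return False
    for u, v in combinations(C, 2):
        # assumption: an undirected edge may be stored in either orientation
        if (u, v) not in E and (v, u) not in E:
            return False
    return True

def pairs_checked(C):
    # hypothetical helper: number of pairs the verifier inspects, k(k-1)/2
    return len(list(combinations(C, 2)))

G = ([1, 2, 3, 4], {(1, 2), (1, 3), (2, 3), (3, 4)})

print(clique_verifier(G, 3, [1, 2, 3]))  # True
print(clique_verifier(G, 3, [2, 3, 4]))  # False: 2 and 4 are not adjacent
print(pairs_checked([1, 2, 3]))          # 3
```

The witness \(C\) lists at most \(|V|\) nodes, and the verifier inspects \(k(k-1)/2 \le |V|^2\) pairs, so both the witness length and the running time are polynomial in the size of \(\encoding{G, k}\).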
“Given a collection of integers, can we select some of them that sum to a target integer \(t\)?”
\[ \probSubsetSum = \setbuild{\encoding{C, t}}{ C \text{ is a } \textbf{multiset} \text{ of integers, and } (\exists S \subseteq C) \; t = \sum_{y \in S} y} \]
Example: \(\encoding{\{4, 4, 11, 16, 21, 27\}, 29} \in \probSubsetSum\),
but \(\encoding{\{4, 11, 16\}, 13} \notin \probSubsetSum\).
Theorem: \(\probSubsetSum \in \NP\).
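Proof: the following is a polynomial-time verifier:
from collections import Counter
def subset_sum_verify(C, t, S):
    if sum(S) != t:  # check sum
        return False
    # verify that S is a sub-multiset of C:
    # no element may be used more often than it occurs in C
    available = Counter(C)
    for x, count in Counter(S).items():
        if count > available[x]:
            return False
    return True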
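The two examples above can be checked mechanically. The following sketch pairs a verifier that counts multiplicities with a brute-force search over all \(2^{|C|}\) candidate witnesses. Representing \(C\) and \(S\) as Python lists and the helper name `subset_sum_decide_slow` are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def subset_sum_verify(C, t, S):
    if sum(S) != t:
        return False
    # S must be a sub-multiset of C: Counter subtraction keeps only the
    # elements that S uses more often than C contains them
    return not (Counter(S) - Counter(C))

def subset_sum_decide_slow(C, t):
    # hypothetical helper: try every selection of positions in C as a witness
    for r in range(len(C) + 1):
        for S in combinations(C, r):
            if subset_sum_verify(C, t, S):
                return True
    return False

print(subset_sum_verify([4, 4, 11, 16, 21, 27], 29, [4, 4, 21]))  # True
print(subset_sum_verify([4, 11, 16], 8, [4, 4]))                  # False: only one 4 in C
print(subset_sum_decide_slow([4, 4, 11, 16, 21, 27], 29))         # True
print(subset_sum_decide_slow([4, 11, 16], 13))                    # False
```

The second call shows why multiplicity matters: a plain membership test would accept the witness `[4, 4]` even though \(C\) contains only one 4.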