Let \(\encoding{\mathcal{X}}_1\) and \(\encoding{\mathcal{X}}_2\) be two different encodings of input object \(\mathcal{X}\) into binary strings.
Define an “encoding blowup factor” from \(\encoding{\mathcal{X}}_1\) to \(\encoding{\mathcal{X}}_2\) as: \[
f_{1 \to 2}(n) = \max_{\mathcal{X} \text{ s.t. } |\encoding{\mathcal{X}}_1| = n} |\encoding{\mathcal{X}}_2|
\]
If \(f_{1 \to 2}(n) = O(n^c)\) and \(f_{2 \to 1}(n) = O(n^c)\) (i.e., each encoding length is within a polynomial of the other), each encoding yields the same definition of \(\P\).
(Assuming encoding/decoding takes polynomial time.)
Example of reasonable encodings for graph \(G = (V, E)\):
Node and edge lists:
\(\encoding{G}_1 = \texttt{ascii\_to\_binary}(((1,2,3,4), ((1,2),(2,3),(3,1),(1,4))))\)
Binary adjacency matrix: \[ \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ \end{bmatrix} \quad \Longrightarrow \quad \encoding{G}_2 = 0111101011001000 \]
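The conversion between the two encodings above can be sketched in code. This is a minimal illustration (the function name and helper are ours, not from the slides), assuming an undirected graph given as a node list and edge list:

```python
def edge_list_to_matrix_encoding(nodes: list[int], edges: list[tuple[int, int]]) -> str:
    """Convert a (node list, edge list) graph into the adjacency-matrix bit string."""
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    bits = [["0"] * n for _ in range(n)]
    for (u, v) in edges:
        bits[index[u]][index[v]] = "1"
        bits[index[v]][index[u]] = "1"  # undirected: the matrix is symmetric
    # Concatenate the rows into one binary string
    return "".join("".join(row) for row in bits)

# The example graph from the slide:
edge_list_to_matrix_encoding([1, 2, 3, 4], [(1, 2), (2, 3), (3, 1), (1, 4)])
# → "0111101011001000"
```

Both directions of this conversion run in polynomial time, and the output length is polynomial in the input length, so the two encodings define the same class \(\P\).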
Reasonable and unreasonable encodings of \(n \in \N^+\):
\[ \begin{aligned} \encoding{n}_1 &= \texttt{bin}(n) \in \{0, 1\}^* \\ |\encoding{n}_1| &= \lfloor \log_2(n) \rfloor + 1 = O(\log n) \\ \encoding{n}_2 &= 1^n \\ |\encoding{n}_2| &= n \end{aligned} \]
\(f_{1 \to 2}(n) = O(2^n)\)
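A quick sketch (ours, not from the slides) makes the blowup concrete: a number whose binary encoding has length \(\ell\) can be as large as \(2^\ell - 1\), so its unary encoding can be exponentially longer.

```python
def binary_len(n: int) -> int:
    """Length of the binary encoding of n."""
    return n.bit_length()        # = floor(log2 n) + 1

def unary_len(n: int) -> int:
    """Length of the unary encoding 1^n of n."""
    return len("1" * n)          # = n

# A binary string of length 8 can encode n as large as 2^8 - 1 = 255,
# whose unary encoding has length 255: exponential blowup.
binary_len(255)   # → 8
unary_len(255)    # → 255
```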
If we were to care about the different time complexity classes \(O(n^k)\) within \(\P\)
We can try plotting the runtime of our algorithm against different input sizes.
More specifically, we can use a log-log plot:
Example: 3sum. Since \(T(n) = \Theta(n^k)\) gives \(\log T(n) = k \log n + O(1)\), a slope of \(k\) on a log-log plot indicates time complexity \(O(n^k)\).
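The slope estimate can be sketched directly from two (input size, runtime) measurements; the timings below are made up for illustration (this helper is ours, not from the slides):

```python
import math

def loglog_slope(n1: float, t1: float, n2: float, t2: float) -> float:
    """Estimate the exponent k from two runtime measurements on a log-log plot."""
    return (math.log(t2) - math.log(t1)) / (math.log(n2) - math.log(n1))

# Hypothetical timings: doubling n multiplies the runtime by 8,
# which is consistent with a cubic algorithm.
loglog_slope(1000, 1.0, 2000, 8.0)   # → 3.0 (approximately)
```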
We denote by \(\P\) the class of languages decidable in polynomial time by a (deterministic) Turing machine: \(\P = \bigcup_{k = 1}^\infty \time{n^k}\)
Example problem in \(\P\):
3Sum Problem:
Given a set \(A\) of \(n\) integers (\(A \subset \Z\), \(|A| = n\)), check if there are three elements that sum to zero (i.e., \(a + b + c = 0\) for \(a, b, c \in A\)).
def three_sum_1(A: list[int]) -> bool:
    for i in range(len(A)):
        for j in range(i, len(A)):
            for k in range(j, len(A)):
                if A[i] + A[j] + A[k] == 0:
                    return True
    return False
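For example (the definition is repeated here so the snippet is self-contained; the inputs are ours):

```python
def three_sum_1(A: list[int]) -> bool:
    # Brute-force O(n^3) check, as on the slide; indices may repeat.
    for i in range(len(A)):
        for j in range(i, len(A)):
            for k in range(j, len(A)):
                if A[i] + A[j] + A[k] == 0:
                    return True
    return False

three_sum_1([-5, 2, 3, 7])   # → True  (-5 + 2 + 3 == 0)
three_sum_1([1, 2, 4])       # → False (no triple sums to zero)
```

The three nested loops give runtime \(O(n^3)\), so 3Sum is in \(\P\).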
We’ll next show that the following problems are in \(\P\) by exhibiting and analyzing algorithms:
Integers \(a, b \in \N^+\) are called relatively prime if their greatest common divisor \(\gcd(a, b) = 1\).
\[ \probRelprime = \setbuild{\encoding{a, b}}{ a,b \in \N^+ \text{ and } \gcd(a, b) = 1 } \]
Theorem: \(\probRelprime \in \P\)
Proof: We can compute \(\gcd(a, b)\) efficiently using the Euclidean algorithm (c. 300 BC!):
def gcd(a: int, b: int) -> int:
    # Order so a ≥ b (simplifies analysis)
    if a < b:
        a, b = b, a
    while b > 0:
        a = a % b
        a, b = b, a
    return a

def rel_prime(a: int, b: int) -> bool:
    return gcd(a, b) == 1
Simply checking each integer up to \(b\) would take time exponential in the input length \(|\encoding{a, b}| = O(\log a + \log b)\)!
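The contrast can be made concrete by counting loop iterations (this instrumented variant is ours, not from the slides): the Euclidean algorithm takes \(O(\log \min(a, b))\) iterations, while trial division up to \(b\) would take \(b\) iterations, which is exponential in the length of \(b\)'s binary encoding.

```python
def gcd_iterations(a: int, b: int) -> int:
    """Count how many loop iterations the Euclidean algorithm performs."""
    if a < b:
        a, b = b, a
    count = 0
    while b > 0:
        a, b = b, a % b   # one Euclidean step: replace (a, b) with (b, a mod b)
        count += 1
    return count

gcd_iterations(12, 8)                    # → 2
# Even for ~18-digit inputs, only a few dozen iterations are needed,
# whereas trial division would need ~10^9 iterations here:
gcd_iterations(10**18 + 9, 10**9 + 7)    # well under 100
```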
Directed reachability problem:
\[ \probPath = \setbuild{\encoding{G, s, t}}{ G \text{ is a directed graph containing a path from node } s \text{ to } t } \]
Theorem: \(\probPath \in \P \hspace{1em}\) (Use a breadth-first search)
import collections
from typing import TypeVar

Node = TypeVar('Node')
type Graph[Node] = tuple[list[Node], list[tuple[Node, Node]]]

def path(G: Graph[Node], s: Node, t: Node) -> bool:
    if s == t:
        return True
    V, E = G
    seen = {s}
    queue = collections.deque([s])
    while len(queue) > 0:
        node = queue.popleft()
        neighbors = [v for (u, v) in E if u == node]
        for v in neighbors:
            if v == t:
                return True
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False
How many times does the while loop run? At most \(n\) (each node enters the queue at most once). How long does computing neighbors take (each iteration)? \(m\) steps (each edge is checked once). How many times does the for loop run? At most \(m\) in total (the neighbors lists are disjoint between different while loop iterations). This is \(O(n \cdot m + m)\).
Could we do better?
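Yes: one standard improvement (a sketch, not from the slides) is to precompute adjacency lists once, so BFS never rescans the whole edge list. This brings the runtime down to \(O(n + m)\).

```python
import collections

def path_fast(G, s, t) -> bool:
    """BFS with precomputed adjacency lists; same (V, E) graph representation."""
    V, E = G
    adj = collections.defaultdict(list)
    for (u, v) in E:
        adj[u].append(v)          # build adjacency lists once: O(m)
    seen = {s}
    queue = collections.deque([s])
    while queue:
        node = queue.popleft()
        if node == t:
            return True
        for v in adj[node]:       # each edge is now examined at most once overall
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

G = ([1, 2, 3, 4], [(1, 2), (2, 3)])
path_fast(G, 1, 3)   # → True
path_fast(G, 3, 1)   # → False (edges are directed)
```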
Given an undirected graph \(G\), check if it is connected.
\[\probConnected = \setbuild{\encoding{G}}{ G \text{ is connected} }\]
Theorem: \(\probConnected \in \P\)
Simple approach for handling undirected graphs:
convert them to an equivalent directed graph (of proportional size) in polynomial time.
Node = TypeVar('Node')
type Graph[Node] = tuple[list[Node], list[tuple[Node, Node]]]

def add_reverse_edges(G: Graph[Node]) -> Graph[Node]:
    V, E = G
    reverse_edges = [(v, u) for (u, v) in E if (v, u) not in E]
    return (V, E + reverse_edges)
Now just check if all pairs of nodes are connected by a path:
import itertools

def connected(G: Graph[Node]) -> bool:
    V, E = add_reverse_edges(G)
    for (s, t) in itertools.combinations(V, 2):
        if not path((V, E), s, t):
            return False
    return True
How many iterations of the for loop? \(O(n^2)\) (one per pair of nodes). path takes polynomial time per iteration (previous slides), so the total runtime is polynomial.
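In fact a single search suffices (a sketch, ours, not from the slides): \(G\) is connected iff every node is reachable from any one start node in the symmetrized graph, so one BFS over adjacency lists gives an \(O(n + m)\) check.

```python
import collections

def connected_fast(G) -> bool:
    """Check connectivity with a single BFS; same (V, E) graph representation."""
    V, E = G
    if not V:
        return True               # convention: the empty graph is connected
    adj = collections.defaultdict(list)
    for (u, v) in E:              # symmetrize: treat every edge as undirected
        adj[u].append(v)
        adj[v].append(u)
    seen = {V[0]}
    queue = collections.deque([V[0]])
    while queue:
        node = queue.popleft()
        for w in adj[node]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(V)    # connected iff BFS reached every node

connected_fast(([1, 2, 3], [(1, 2), (2, 3)]))   # → True
connected_fast(([1, 2, 3], [(1, 2)]))           # → False (node 3 is isolated)
```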