ECS 120 Theory of Computation

Diagonalization and Undecidability of the Halting Problem

Julian Panetta

University of California, Davis

Recall: Sizes of sets

For any sets \(A\) and \(B\), we say \(|A| \ge |B|\) if and only if there exists an onto function \(f: A \to B\).

Corollary: \(|B| > |A|\) if no onto function \(f: A \to B\) exists.
Examples comparing sizes of infinite sets:
- \(|\N^+| = |\N| = |\Z| = |\Q^+| = |\Q| = |\binary^*|\)
- \(|(0, 1)| = |\R^+| = |\R| = |\{0,1\}^\infty| = |\mathcal{P}(\N)| = |\mathcal{P}(\{0,1\}^*)|\)
Are all infinite sets actually the same size?
No! \(\R\) is strictly larger than \(\N\)!
This was proved by Cantor’s diagonalization argument.

Cantor’s diagonalization technique

The following theorem changed the course of science:

Theorem: Let \(X\) be any set. Then there is no onto function \(f: X \to \powerset(X)\).

Proof:

Let \(f: X \to \powerset(X)\); we must show that \(f\) is not onto.
This means we must show that for some \(D \subseteq X\), for all \(a \in X\), \(f(a) \ne D\).
Define the set: \[ D = \setbuild{a \in X}{a \notin f(a)} \]

This is weird/tricky to parse but it’s well defined!
Let \(a \in X\) be arbitrary. Since \(D \in \powerset(X)\), it suffices to show that \(D \ne f(a)\).
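By the definition of \(D\), \(a \in D \iff a \notin f(a)\), so exactly one of \(D\) and \(f(a)\) contains \(a\).
Therefore, \(D \ne f(a)\). Since \(a\) was arbitrary, \(D\) is not in the image of \(f\), so \(f\) is not onto.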
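
The construction of \(D\) can be checked by brute force on a small finite set. The sketch below is only an illustration, not part of the proof: the helper names `powerset`, `diagonal_set`, and `every_function_misses_its_diagonal` are made up for this example, and a function \(f\) is represented as a Python dict.

```python
from itertools import combinations, product

def powerset(xs):
    """Return all subsets of xs as a list of frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def diagonal_set(X, f):
    """Cantor's set D = {a in X : a not in f(a)}, with f given as a dict."""
    return frozenset(a for a in X if a not in f[a])

def every_function_misses_its_diagonal(X):
    """Check, for every f: X -> P(X), that D is not in the image of f."""
    X = list(X)
    subsets = powerset(X)
    # Each choice of one subset per element of X is one function f.
    for images in product(subsets, repeat=len(X)):
        f = dict(zip(X, images))
        if diagonal_set(X, f) in f.values():
            return False
    return True

# One hypothetical f on X = {0, 1, 2}.
X = {0, 1, 2}
f = {0: frozenset({1}), 1: frozenset({0, 1, 2}), 2: frozenset()}
D = diagonal_set(X, f)  # 0 and 2 are not in their own images, 1 is
```

For this \(f\), `D` is the set \(\{0, 2\}\), which is none of \(f(0), f(1), f(2)\). Calling `every_function_misses_its_diagonal({0, 1, 2})` repeats the check for all 512 functions on a three-element set.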

Why is this called “diagonalization”?

The name “diagonalization” comes from a way of visualizing the construction of \(D\) from a given (arbitrary) \(f\).
Let’s lay out the elements of \(X\) along the columns of a table, and create one row for each set \(f(a) \in \powerset(X)\) that an element \(a \in X\) gets mapped to.
For example, if \(X = \N\) we can construct a table like:

	0	1	2	3	\(\cdots\)	k
\(f(0) = \{1, 3, \ldots\}\)	0	1	0	1	\(\cdots\)
\(f(1) = \{0, 1, 2, 3, \ldots\}\)	1	1	1	1	\(\cdots\)
\(f(2) = \{2, \ldots\}\)	0	0	1	0	\(\cdots\)
\(f(3) = \{0, 2, \ldots\}\)	1	0	1	0	\(\cdots\)
\(\vdots\)					\(\ddots\)
\(f(k) = D\)	1	0	0	1	\(\cdots\)	??

Each cell value indicates whether the “column element” is in the row’s set.
Row \(k\) (corresponding to \(D\)) is constructed by taking each diagonal element and flipping it!
Assigning a value to \((k, k)\) using this rule is impossible!

We can also say that \(D\) can’t be any row of this table, since it differs from the row for \(f(a)\) in column \(a\).
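
As a rough illustration of the flipping rule, the sketch below applies it to the first four rows and columns of the example table. The function name `flipped_diagonal` is made up for this example, and a finite corner of the table is of course only a stand-in for the infinite one.

```python
def flipped_diagonal(rows):
    """Given rows[i][j] in {0, 1}, return the row whose j-th entry is 1 - rows[j][j]."""
    return [1 - rows[j][j] for j in range(len(rows))]

# First four rows and columns of the example table for X = N.
rows = [
    [0, 1, 0, 1],  # f(0) = {1, 3, ...}
    [1, 1, 1, 1],  # f(1) = {0, 1, 2, 3, ...}
    [0, 0, 1, 0],  # f(2) = {2, ...}
    [1, 0, 1, 0],  # f(3) = {0, 2, ...}
]

d = flipped_diagonal(rows)

# d disagrees with row j in column j, so it cannot equal any row.
differs = [d[j] != rows[j][j] for j in range(len(rows))]
```

Here `d` starts with the same four entries as the row for \(D\) in the table, and every entry of `differs` is true.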

Countable and uncountable sets

A set \(X\) is countable if \(|X| \le |\N|\). There are two possible cases:

- \(|X| < |\N|\), i.e., \(X\) is finite.
- \(|X| = |\N|\), i.e., \(X\) is infinite but can be put in a one-to-one correspondence with \(\N\).

A set \(X\) is uncountable if \(|X| > |\N|\).

In other words: a set is countable if it can be enumerated, \[ X = \{x_0, x_1, x_2, \ldots\} \] so that any given element can be found at some finite index \(n\).

This enumeration implicitly defines an onto function
\(f: \N \to X\).
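
For a concrete enumeration, here is a minimal sketch for the set of binary strings, listed shortest first. The names `binary_strings` and `index_of` are made up for this example; the ordering is one convenient choice among many.

```python
from itertools import count, islice, product

def binary_strings():
    """Yield every binary string exactly once, shortest first: '', '0', '1', '00', ..."""
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def index_of(s):
    """Return the finite index n at which the enumeration above yields s."""
    # There are 2**len(s) - 1 strictly shorter strings, and int(s, 2) ranks s
    # among the strings of its own length.
    return (2 ** len(s) - 1) + (int(s, 2) if s else 0)

first = list(islice(binary_strings(), 7))
```

Every binary string shows up at the finite index computed by `index_of`, so the map from \(n\) to the \(n\)-th string is onto.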
Observation: \(\powerset(\N)\) is uncountable (apply Cantor’s theorem with \(X = \N\)).

Uncountability of \(\R\)

Observation: \(\R\) is uncountable (\(|\R| > |\N|\)).
One possible proof: show that \(|\R| \ge |\powerset(\N)|\) and use the fact that \(|\powerset(\N)| > |\N|\).
- Do this by exhibiting an onto function \(f: \R \to \powerset(\N)\).
- Let \(f\) take the decimal expansion of \(x \in \R\)
  and map it to the set of indices of the nonzero digits (counting digits from left to right, starting at index 0 and skipping the decimal point).
- For example, \(f(10.2030450) = \{0, 2, 4, 6, 7\}\).
- This function is onto:
  - For any (potentially infinite) subset \(S \subseteq \N\),
    we can construct a real number \(x\) whose decimal expansion has non-zero digits at the indices in \(S\).
  - For example, given \(S = \{0, 1, 2, 5, 9\}\), we can construct \(x = 1.110010001\) so that \(f(x) = S\).
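
The two examples above can be reproduced with a short sketch. It makes simplifying assumptions: decimal expansions are finite Python strings, \(S\) is finite, and the helper names `nonzero_digit_indices` and `expansion_with_nonzero_at` are made up for this example. An infinite \(S\) would need an infinite expansion, which strings cannot represent.

```python
def nonzero_digit_indices(expansion):
    """Map a finite decimal expansion (a string) to the set of indices of its nonzero digits.

    Indices count digits from left to right, skipping the decimal point.
    """
    digits = [c for c in expansion if c.isdigit()]
    return {i for i, c in enumerate(digits) if c != "0"}

def expansion_with_nonzero_at(S):
    """Build an expansion with digit 1 at each index in the finite set S and 0 elsewhere."""
    if not S:
        return "0"
    digits = "".join("1" if i in S else "0" for i in range(max(S) + 1))
    # Place the decimal point after the first digit, as in the example above.
    return digits[0] + ("." + digits[1:] if len(digits) > 1 else "")

indices = nonzero_digit_indices("10.2030450")
x = expansion_with_nonzero_at({0, 1, 2, 5, 9})
```

With these definitions, `indices` is \(\{0, 2, 4, 6, 7\}\) and `x` is the string `1.110010001`, matching the examples.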
Another possible proof: directly show \(|\R| > |\N|\) using diagonalization!
Continuum Hypothesis: there is no set \(A\) such that \(|\N| < |A| < |\R|\).
- This hypothesis can neither be proved nor disproved from ZFC, the standard axioms of set theory
  (assuming ZFC itself is consistent).
- In other words, the hypothesis is independent of ZFC.

Undecidable languages

Finally getting back to computation:

Observation: There exist undecidable languages \(L \subseteq \binary^*\).

Proof:

The set of binary strings is countable, i.e., \(|\binary^*| = |\N|\).
The set of all total Turing machines (halting algorithms) is therefore countable
(each can be encoded as a binary string).
Since each total Turing machine decides a single language, there are
at most countably many decidable languages.
However, the set of all binary languages is the uncountable set \(\powerset(\binary^*)\).
So there are uncountably many languages that cannot be decided by any algorithm!

But we still haven’t proved any specific language to be undecidable!

Undecidability of the halting problem

We are finally ready for Turing’s diagonalization argument proving the undecidability of:

\[ \probHALT = \setbuild{\encoding{M, w}}{M \text{ is a Turing machine that halts on } w} \]

Theorem: \(\probHALT\) is undecidable.

Proof:

Assume for contradiction that \(\probHALT\) is decidable by the algorithm:
```
def H(M, w):
    """Hypothetical decider: return True if M halts on input w, else False."""
    raise NotImplementedError()
```
Define the following algorithm (analogous to the set \(D\) we constructed earlier):
```
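import inspect

def D(M):
    # Ask the assumed decider H whether M halts when given its own source code.
    M_src = inspect.getsource(M)
    if H(M, M_src):
        while True:  # H says M halts on its own source, so D loops forever
            pass
    else:
        return       # H says M loops on its own source, so D halts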
```
D loops if TM \(M\) halts on \(\encoding{M}\) (if \(\encoding{M, \encoding{M}} \in \probHALT\)),
and halts if TM \(M\) loops on \(\encoding{M}\) (if \(\encoding{M, \encoding{M}} \notin \probHALT\)).

What does D(D) do?
- If D halts on itself, then H(D, D_src) returns True, so D loops forever (a contradiction!).
- If D loops on itself, then H(D, D_src) returns False, so D halts (a contradiction!).
We conclude that H must not exist, so \(\probHALT\) is undecidable!
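
To see the argument in action, the sketch below builds \(D\) from a few concrete candidate deciders and checks that each one answers wrongly about \(D\) run on itself. It is only an illustration under stated assumptions: `make_D`, `refutes`, and the three candidates are hypothetical names for this example, programs are passed as function objects rather than source strings, and \(D\) reports its behaviour as a string instead of really looping so that the example terminates.

```python
def make_D(H):
    """Build the diagonal program D from a candidate halting decider H.

    D returns "loops" or "halts" to describe what the real D would do,
    instead of actually entering an infinite loop.
    """
    def D(M):
        return "loops" if H(M, M) else "halts"
    return D

def refutes(H):
    """Return True if H's answer about D run on D disagrees with what D does."""
    D = make_D(H)
    predicted = "halts" if H(D, D) else "loops"
    actual = D(D)
    return predicted != actual

# Three hypothetical (and necessarily incorrect) candidate deciders.
candidates = {
    "always_halts": lambda M, w: True,
    "never_halts": lambda M, w: False,
    "halts_on_lambdas": lambda M, w: M.__name__ == "<lambda>",
}

results = {name: refutes(H) for name, H in candidates.items()}
```

Every entry of `results` is `True`: whatever a candidate answers on \((D, D)\), the \(D\) built from it does the opposite. The proof shows the same thing for every possible \(H\), not just these three.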

Expressing the previous argument in Turing machine notation:

\[\begin{align*} D \text{ halts on }& \encoding{D} \\ &\iff H \text{ accepts }\encoding{D, \encoding{D}} \quad \quad \color{green} (\text{Since } H \text{ solves }\probHALT) \\ &\iff D \text { loops on } \encoding{D} \quad \quad \quad \; \; \, \color{green} (\text{Implementation of } D) \end{align*}\]

This argument doesn’t just show \(H\) can’t analyze this weird algorithm \(D\).

It shows \(H\) itself cannot exist!

Why is this called diagonalization?

Let’s consider the (countable) set of all Turing machines and try running each on
strings that encode Turing machines.

These inputs probably don’t make sense, but the TMs will nonetheless either loop or halt.

We can organize the results in a table:

	\(\encoding{M_0}\)	\(\encoding{M_1}\)	\(\encoding{M_2}\)	\(\encoding{M_3}\)	\(\cdots\)	\(\encoding{D}\)
\(M_0\)	loop	halt	loop	halt	\(\cdots\)
\(M_1\)	halt	halt	halt	halt	\(\cdots\)
\(M_2\)	loop	loop	halt	loop	\(\cdots\)
\(M_3\)	halt	loop	halt	loop	\(\cdots\)
\(\;\,\vdots\)					\(\ddots\)
\(D\)	halt	loop	loop	halt	\(\cdots\)	??

The row for machine \(D\) is constructed by taking each diagonal result and flipping it!

Assigning a value to \((D, \encoding{D})\) using this rule is impossible!
