Complexity

We consider the running time of an algorithm as a function of the length of the string representing its input

Definition: Let $M$ be a deterministic Turing machine that halts on all inputs. The time complexity of $M$ is the function $f : \mathbb{N} \to \mathbb{N}$ , where $f (n)$ is the maximum number of steps that $M$ uses on any input of length $n$

We say $M$ runs in time $f (n)$ and that $M$ is an $f (n)$ time Turing machine

Coming up with precise expressions for time complexity is burdensome and generally unnecessary

Definition: We say that $f (n) = O (g (n))$ if there exist positive integers $c$ and $n_{0}$ such that $f (n) \leq c g (n)$ for every $n \geq n_{0}$ , in which case $g (n)$ is an asymptotic upper bound for $f (n)$
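As a worked instance of this definition, consider $f (n) = 5 n^{3} + 2 n^{2} + 22 n + 6$ (an illustrative polynomial chosen here, not one used elsewhere in these notes). For $n \geq 10$ we have $2 n^{2} + 22 n + 6 \leq n^{3}$ , so

$$5 n^{3} + 2 n^{2} + 22 n + 6 \leq 6 n^{3} \quad \text{for every } n \geq 10$$

and therefore $f (n) = O (n^{3})$ , witnessed by $c = 6$ and $n_{0} = 10$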

Bounds of the form $n^{c}$ for $c > 0$ are called polynomial bounds, and bounds of the form $2^{(n^{\delta})}$ for real $\delta > 0$ are called exponential bounds

Definition: We say that $f (n) = o (g (n))$ if $\lim_{n \to \infty} \frac{f (n)}{g (n)} = 0$

The difference between big- $O$ and small- $o$ is analogous to the difference between $\leq$ and $<$

How much time does a single-tape Turing machine need to decide $A = \{ 0^{k} 1^{k} \mid k \geq 0 \}$ ?

To measure this more clearly, we give low-level descriptions

$M_{1} =$ “On input string $w$ :

Scan across the tape and reject if a $0$ is found to the right of a $1$
Repeat if both $0$ s and $1$ s remain on the tape:
Scan across the tape, crossing off a single $0$ and a single $1$
If neither $0$ s nor $1$ s remain on the tape, accept; otherwise, reject”

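We analyze the stages separately: the first and last stages each take $O (n)$ steps, and the repeat loop runs at most $n / 2$ times with each scan taking $O (n)$ steps, so $M_{1}$ runs in $O (n^{2})$ time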
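One rough way to see the quadratic growth is to simulate $M_{1}$ and count how many tape cells it scans. The sketch below is a simplified model, not a real Turing machine: it assumes the input is over the alphabet {0, 1}, charges one full scan of the tape for the first stage, for each pass of the loop, and for the final stage, and the name `m1_decide` is made up for illustration.

```python
def m1_decide(w: str) -> tuple[bool, int]:
    """Simulate M_1 on w and return (accepted, cells_scanned).

    Assumption: w contains only the characters '0' and '1'.
    """
    tape = list(w)
    steps = len(tape)  # first stage: one scan to check the 0...01...1 shape
    if "10" in w:      # a 0 appears to the right of a 1
        return False, steps

    # Loop: each pass crosses off a single 0 and a single 1.
    while "0" in tape and "1" in tape:
        tape[tape.index("0")] = "x"
        tape[tape.index("1")] = "x"
        steps += len(tape)  # charge one full scan per pass

    # Final stage: one more scan to see whether anything is left over.
    steps += len(tape)
    accepted = "0" not in tape and "1" not in tape
    return accepted, steps


if __name__ == "__main__":
    for k in (1, 2, 4, 8, 16):
        print(k, m1_decide("0" * k + "1" * k))
```

Under this accounting an input $0^{k} 1^{k}$ of length $n = 2k$ costs $n (k + 2)$ scanned cells, which grows like $n^{2} / 2$.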

Definition: The time complexity class $TIME (t (n))$ is the collection of all languages that are decidable by an $O (t (n))$ time Turing machine

So we know at least $A \in TIME (n^{2})$ , but can we do better?

$M_{2} =$ “On input string $w$ :

Scan across the tape and reject if a $0$ is found to the right of a $1$
Repeat if both $0$ s and $1$ s remain on the tape:
Scan across the tape and reject if the total remaining $0$ s and $1$ s is odd
Scan across the tape, crossing off every other $0$ starting with the first $0$ and then every other $1$ starting with the first $1$
If neither $0$ s nor $1$ s remain on the tape, accept; otherwise, reject”

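Each pass through the loop at least halves the number of remaining $0$ s and $1$ s, so there are at most $1 + \log_{2} n$ passes, each taking $O (n)$ steps; the complexity of $M_{2}$ is therefore $O (n \log n)$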
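To see where the logarithmic factor comes from, it can help to count the passes of the loop directly. The following sketch abstracts the tape down to two counters (an assumption made for brevity, since crossing off every other symbol starting with the first leaves half of them, rounded down); `m2_decide` is a hypothetical name, and the input is assumed to be over {0, 1}.

```python
def m2_decide(w: str) -> tuple[bool, int]:
    """Model M_2 on w and return (accepted, loop_passes).

    Assumption: w contains only '0' and '1'. The tape is abstracted to the
    counts of 0s and 1s that have not been crossed off yet.
    """
    if "10" in w:  # a 0 appears to the right of a 1
        return False, 0

    zeros, ones = w.count("0"), w.count("1")
    passes = 0
    while zeros > 0 and ones > 0:
        passes += 1
        if (zeros + ones) % 2 == 1:  # odd total remaining: reject
            return False, passes
        # Crossing off every other symbol, starting with the first,
        # leaves floor(count / 2) symbols of each kind.
        zeros //= 2
        ones //= 2

    return zeros == 0 and ones == 0, passes


if __name__ == "__main__":
    for k in (1, 2, 4, 8, 1024):
        print(k, m2_decide("0" * k + "1" * k))
```

On an input with $2^{m}$ zeros followed by $2^{m}$ ones the loop runs $m + 1$ times, and each pass corresponds to a constant number of $O (n)$ scans on the real machine.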

In fact, any language that can be decided in $o (n \log n)$ time on a single-tape Turing machine is regular (see Kobayashi)

With a second tape, we can solve this in linear time, by treating the second tape as a stack
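A minimal sketch of the two-tape idea, with a Python list standing in for the second tape used as a stack. The function name `two_tape_decide` is hypothetical, and the single left-to-right loop is meant to mirror a single pass of the input head rather than model a Turing machine exactly.

```python
def two_tape_decide(w: str) -> bool:
    """Decide {0^k 1^k | k >= 0} in one left-to-right pass over w."""
    stack = []        # stands in for the second tape
    seen_one = False
    for symbol in w:
        if symbol == "0":
            if seen_one:       # a 0 to the right of a 1
                return False
            stack.append("0")  # copy the 0 onto the second tape
        elif symbol == "1":
            seen_one = True
            if not stack:      # more 1s than 0s
                return False
            stack.pop()        # match this 1 against one stored 0
        else:
            return False       # symbol outside the alphabet {0, 1}
    return not stack           # accept only if every 0 was matched


if __name__ == "__main__":
    for w in ("", "01", "0011", "001", "011", "10"):
        print(repr(w), two_tape_decide(w))
```

Each input symbol triggers a constant amount of work, which is the source of the linear bound.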

Even though these models of computation are equivalent in computability theory, they still affect the complexity of languages

Theorem: Every $t (n)$ time multitape Turing machine has an equivalent $O (t^{2} (n))$ time single-tape Turing machine

This is because of how we can simulate a multitape machine with one tape; we store the tapes consecutively and scan across the entire tape (which can require up to a multiple of $t (n)$ space) to decide our next move, so each of the $t (n)$ simulated steps costs $O (t (n))$ single-tape steps

Definition: The running time of a nondeterministic Turing machine $N$ is the function $f : \mathbb{N} \to \mathbb{N}$ where $f (n)$ is the maximum number of steps that $N$ uses on any branch of its computation on any input of length $n$

Theorem: Every $t (n)$ time nondeterministic single-tape Turing machine has an equivalent $2^{O (t (n))}$ time deterministic single-tape Turing machine

Without going into too much depth, this is an intuitive result that reflects how we can simulate a search through the machine’s decision tree

The Class P

Exponential time algorithms are effectively unusable, and typically arise from brute-force searching the space of solutions

To separate such algorithms from practical ones, we rely on the observation that all reasonable deterministic computational models are polynomially equivalent, where reasonable loosely refers to models that closely approximate running times on actual computers

Note that we’ve already shown a polynomial equivalence between single and multi-tape Turing machines

This discussion focuses on aspects of complexity that are unaffected by polynomial differences, which allows us to discuss some important fundamental properties of algorithms

In practice, the difference between $n$ and $n^{3}$ is very important, but complexity theory also takes a much broader look at runtime complexity

Definition: $P$ is the class of languages that are decidable in polynomial time on a deterministic single-tape Turing machine, $P = ⋃_{k} TIME (n^{k})$

$P$ is invariant for all reasonable models of computation and corresponds roughly to the class of problems that are realistically solvable on a computer

This threshold has proven to be very useful: once a polynomial time algorithm is found for a problem, its running time can almost always be reduced to a point of practical utility

To actually describe algorithms, we give high-level descriptions without reference to particular computational models, in order to avoid tedious details of tapes and head motions

We still use $⟨ \cdot ⟩$ to refer to a reasonable encoding method for an object, where all reasonable encodings of an object are polynomially equivalent

To encode graphs, we often use an adjacency matrix

Theorem: $PATH = \{ ⟨ G, s, t ⟩ \mid G \text{ is a directed graph that has a directed path from } s \text{ to } t \}$ is in $P$

We can use a breadth-first search from $s$ to decide $PATH$ , since it marks every node reachable from $s$ in time polynomial in the number of nodes
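A short sketch of that search, assuming the graph is given as an adjacency matrix `adj` where `adj[u][v]` is true when there is an edge from node `u` to node `v`, and nodes are numbered from 0. The function name `path` and this representation are illustrative choices.

```python
from collections import deque


def path(adj: list[list[bool]], s: int, t: int) -> bool:
    """Return True if the directed graph has a directed path from s to t."""
    n = len(adj)
    marked = [False] * n  # nodes already discovered
    marked[s] = True
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in range(n):
            if adj[u][v] and not marked[v]:
                marked[v] = True
                queue.append(v)
    return False


if __name__ == "__main__":
    # Edges: 0 -> 1, 1 -> 2; node 3 is isolated.
    adj = [
        [False, True, False, False],
        [False, False, True, False],
        [False, False, False, False],
        [False, False, False, False],
    ]
    print(path(adj, 0, 2), path(adj, 2, 0), path(adj, 0, 3))
```

Every node enters the queue at most once and each dequeue scans one row of the matrix, so the work is $O (n^{2})$ for $n$ nodes.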

Theorem: $RELPRIME = \{ ⟨ x, y ⟩ \mid x \text{ and } y \text{ are relatively prime} \}$ is in $P$

We cannot simply search through every possible divisor, since the magnitude of a number $k$ is exponential in the length of its binary encoding, but we can use the well-known Euclidean algorithm for calculating the greatest common divisor of $x$ and $y$ , and conclude that they are relatively prime if $gcd (x, y) = 1$
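For reference, a compact sketch of the Euclidean algorithm on nonnegative integers; the names `gcd` and `relprime` are chosen here for illustration.

```python
def gcd(x: int, y: int) -> int:
    """Greatest common divisor of nonnegative integers via Euclid's algorithm."""
    while y != 0:
        x, y = y, x % y  # replace (x, y) with (y, x mod y)
    return x


def relprime(x: int, y: int) -> bool:
    """Return True if x and y are relatively prime."""
    return gcd(x, y) == 1


if __name__ == "__main__":
    print(relprime(10, 21), relprime(10, 22), gcd(48, 18))
```

Every two iterations at least halve the larger value, so the number of iterations is proportional to the number of bits in the inputs, which is what makes the procedure polynomial in the length of $⟨ x, y ⟩$ .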

Theorem: Every context-free language is a member of $P$

We use a dynamic programming approach on a grammar in Chomsky normal form, where each subproblem records the variables that generate the substring from position $i$ through position $j$ of the input string

There are $O (n^{2})$ subproblems, and we solve each one by considering every possible split of its substring into two smaller parts, of which there are at most $n$ , so for a fixed grammar this takes $O (n^{3})$ time
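The table-filling idea can be sketched as follows, assuming the grammar is already in Chomsky normal form. The representation is a hypothetical one picked for brevity: `terminal_rules` maps each terminal to the set of variables that produce it, `binary_rules` is a list of triples `(a, b, c)` standing for a rule $A \to B C$ , and `accepts_empty` says whether the start variable may produce the empty string.

```python
def cyk(w, start, terminal_rules, binary_rules, accepts_empty=False):
    """Return True if the CNF grammar generates w."""
    n = len(w)
    if n == 0:
        return accepts_empty

    # table[i][j] holds the variables that generate w[i..j] (inclusive).
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(w):
        table[i][i] = set(terminal_rules.get(ch, ()))

    for length in range(2, n + 1):       # substring length
        for i in range(n - length + 1):  # start position
            j = i + length - 1           # end position
            for k in range(i, j):        # split into w[i..k] and w[k+1..j]
                for a, b, c in binary_rules:
                    if b in table[i][k] and c in table[k + 1][j]:
                        table[i][j].add(a)

    return start in table[0][n - 1]


if __name__ == "__main__":
    # Illustrative CNF grammar for {0^k 1^k | k >= 1}:
    #   S -> A B | A T,  T -> S B,  A -> 0,  B -> 1
    terminal_rules = {"0": {"A"}, "1": {"B"}}
    binary_rules = [("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")]
    for w in ("01", "0011", "001", "10"):
        print(w, cyk(w, "S", terminal_rules, binary_rules))
```

The three nested loops over `length`, `i`, and `k` give the cubic factor; the innermost loop over rules depends only on the grammar, which is fixed.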

The Class NP

Many problems have not yielded polynomial time algorithms, and so far can essentially only be solved with brute-force search

Let’s start with an example

A Hamiltonian path in a directed graph $G$ is a directed path that goes through each node of $G$ exactly once

$HAMPATH = \{ ⟨ G, s, t ⟩ \mid G \text{ is a directed graph with a Hamiltonian path from } s \text{ to } t \}$

We can easily make an exponential time algorithm for $HAMPATH$ , but no one knows if a polynomial time solution exists
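One such brute-force algorithm simply tries every ordering of the nodes. The sketch below assumes an adjacency matrix `adj` with nodes numbered from 0; the name `hampath_brute_force` is made up for illustration, and the loop may examine up to $n!$ orderings.

```python
from itertools import permutations


def hampath_brute_force(adj: list[list[bool]], s: int, t: int) -> bool:
    """Search all node orderings for a Hamiltonian path from s to t."""
    n = len(adj)
    if n == 0:
        return False
    for order in permutations(range(n)):
        if order[0] != s or order[-1] != t:
            continue
        # Check that consecutive nodes in the ordering are joined by edges.
        if all(adj[u][v] for u, v in zip(order, order[1:])):
            return True
    return False


if __name__ == "__main__":
    # Edges: 0 -> 1, 1 -> 2, 0 -> 2.
    adj = [
        [False, True, True],
        [False, False, True],
        [False, False, False],
    ]
    print(hampath_brute_force(adj, 0, 2), hampath_brute_force(adj, 0, 1))
```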

However, $HAMPATH$ has polynomial verifiability, meaning we can verify whether a potential solution is valid in polynomial time

Definition: A verifier for a language $A$ is an algorithm $V$ , where $A = \{ w \mid V \text{ accepts } ⟨ w, c ⟩ \text{ for some string } c \}$

A polynomial time verifier runs in polynomial time in the length of $w$ , and a language is polynomially verifiable if it has a polynomial time verifier

$c$ acts as additional information, essentially a certificate or proof that $w$ belongs to $A$

In the case of $HAMPATH$ , $c$ would look like an actual Hamiltonian path from $s$ to $t$
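A verifier for this certificate might look like the sketch below, which assumes the graph is an adjacency matrix `adj` with nodes numbered from 0 and the certificate `c` is a list of node indices. The name `verify_hampath` and both representations are assumptions made for illustration.

```python
def verify_hampath(adj: list[list[bool]], s: int, t: int, c: list[int]) -> bool:
    """Check that certificate c is a Hamiltonian path from s to t in the graph."""
    n = len(adj)
    if n == 0:
        return False
    # c must list every node exactly once.
    if len(c) != n or sorted(c) != list(range(n)):
        return False
    # c must start at s and end at t.
    if c[0] != s or c[-1] != t:
        return False
    # Consecutive nodes in c must be joined by an edge.
    return all(adj[u][v] for u, v in zip(c, c[1:]))


if __name__ == "__main__":
    # Edges: 0 -> 1, 1 -> 2, 0 -> 2.
    adj = [
        [False, True, True],
        [False, False, True],
        [False, False, False],
    ]
    print(verify_hampath(adj, 0, 2, [0, 1, 2]), verify_hampath(adj, 0, 1, [0, 2, 1]))
```

Unlike a search over all orderings, this check does an amount of work polynomial in the number of nodes.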

Definition: $NP$ is the class of languages that have polynomial time verifiers

$NP$ comes from nondeterministic polynomial time, an alternative characterization of the class

Theorem: A language is in $NP$ $⟺$ it is decided by some nondeterministic polynomial time Turing machine

For the forward direction of this theorem, $A \in NP$ means $A$ has a polynomial time verifier $V$ , which runs in time $n^{k}$ , so we can nondeterministically select a certificate $c$ of length at most $n^{k}$ and run $V$ on $⟨ w, c ⟩$

For the other direction, we build a verifier that treats $c$ as a description of the nondeterministic choice to be made at each step, and simulates the machine on $w$ along that single branch

Definition: $NTIME (t (n))$ is the set of languages decided by some $O (t (n))$ time nondeterministic Turing machine

$NP = ⋃_{k} NTIME (n^{k})$

The class $NP$ is insensitive to the choice of reasonable nondeterministic computational model

$coNP$ contains the languages that are complements of languages in $NP$ , or problems where you must verify something is not present

It is unknown whether $coNP$ is different from $NP$

Even more than this, we are unable to prove the existence of a single language in $NP$ that is not in $P$ , even though it seems almost obvious

The question of whether $P = NP$ is one of the greatest unsolved problems in theoretical computer science and modern mathematics

Most researchers believe they are not equal, since enormous effort has failed to produce polynomial time algorithms for certain problems in $NP$ ; however, proving that the classes differ is still beyond reach

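We can prove $NP \subseteq EXPTIME = ⋃_{k} TIME (2^{n^{k}})$ by searching through all possible certificates, but this is fairly intuitive and does not tell us whether $NP$ is contained in a smaller deterministic time class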

Binyamin's Notes
