Probabilistic Algorithms

A probabilistic algorithm uses the outcome of a random process to influence execution, which can be helpful for certain kinds of problems

Why would this ever be useful? Sometimes calculating the best choice takes too much time, and estimating it may introduce a bias that invalidates the result

For example, determining information about individuals in a large population can be done more quickly if we use random sampling

Definition: A probabilistic Turing machine $M$ is a type of nondeterministic Turing machine in which each nondeterministic step is called a coin-flip step and has two legal next moves

We assign a probability to each branch $b$ of $M$'s computation on input $w$ as $\Pr[b] = 2^{- k}$ , where $k$ is the number of coin-flip steps that occur on branch $b$

Then, $\Pr[M \text{ accepts } w] = \sum_{b \text{ is accepting}} \Pr[b]$

Definition: $M$ decides language $A$ with error probability $ϵ$ if $w \in A ⟹ \Pr[M \text{ accepts } w] \geq 1 - ϵ$ and $w \notin A ⟹ \Pr[M \text{ rejects } w] \geq 1 - ϵ$

We also consider error probability bounds as a function of the input length $n$ , like $ϵ = 2^{- n}$ , which is an exponentially small error

We measure time and space complexity as with any nondeterministic Turing machine, taking the worst case computation branch on each input

Definition: $BPP$ is the class of languages decidable by probabilistic polynomial time Turing machines with an error probability of at most a constant $C$

The class is the same for any constant $0 < C < \frac{1}{2}$ by virtue of the amplification lemma; $C = \frac{1}{3}$ is the conventional choice

Lemma: Let $ϵ$ be a fixed constant with $0 < ϵ < \frac{1}{2}$ ; then for any polynomial $p (n)$ , a probabilistic polynomial time Turing machine $M_{1}$ with error probability $ϵ$ has an equivalent probabilistic polynomial time Turing machine $M_{2}$ with error probability $2^{- p (n)}$

The gist of this result is that we can repeat the same computation a polynomial number of times (a constant multiple of $p (n)$ , where the constant depends on $ϵ$ ) and take the majority result, which reduces the error exponentially and still runs in polynomial time
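To see how quickly majority voting shrinks the error, the following is a minimal sketch that computes the exact probability that at least half of a number of independent runs are wrong. The function name `majority_error` and its parameters are illustrative assumptions, not part of the lemma, and ties are counted as errors to stay on the safe side

```python
from fractions import Fraction
from math import comb


def majority_error(eps, runs):
    """Probability that at least half of `runs` independent runs err,
    when each run errs independently with probability `eps`."""
    eps = Fraction(eps)
    first_bad = (runs + 1) // 2  # smallest number of wrong runs that spoils the vote
    return sum(
        comb(runs, wrong) * eps**wrong * (1 - eps) ** (runs - wrong)
        for wrong in range(first_bad, runs + 1)
    )


if __name__ == "__main__":
    for runs in (1, 3, 5, 11, 21, 41):
        print(runs, float(majority_error(Fraction(1, 3), runs)))
```

With $ϵ = \frac{1}{3}$ , a single run errs with probability $\frac{1}{3}$ and the majority of three runs errs with probability $\frac{7}{27}$ ; the printed values keep falling as the number of runs grows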

Primality

While a polynomial time algorithm for testing primality exists, the most practical algorithms are probabilistic

The most obvious way to test primality is to search for factors, but trial division takes time exponential in the length of the input, and no probabilistic polynomial time algorithm for finding factors is known

Definition: For any $p$ greater than 1, we say that two numbers are equivalent modulo $p$ if they differ by a multiple of $p$ , in which case we write $x \equiv y \pmod{p}$

Every number is equivalent modulo $p$ to some member of the set $Z_{p} = \{0, \dots, p - 1\}$ , and for convenience we also write $Z_{p}^{+} = \{1, \dots, p - 1\}$

The main idea of this algorithm uses Fermat’s little theorem

Theorem: If $p$ is prime and $a \in Z_{p}^{+}$ , then $a^{p - 1} \equiv 1 (mod p)$

We use this to derive a Fermat test, where we say $p$ passes the Fermat test at $a$ if $a^{p - 1} \equiv 1 (mod p)$

Definition: A number is pseudoprime if it passes Fermat tests for all smaller a’s relatively prime to it

The pseudoprime numbers are identical to the prime numbers with the exception of the infrequent Carmichael numbers

$PSEUDOPRIME =$ “On input $p$ :

1. Select $a_{1}, \dots, a_{k}$ randomly in $Z_{p}^{+}$
2. Compute $a_{i}^{p - 1} \bmod p$ for each $i$
3. If all computed values are 1, accept; otherwise, reject”

Because a number that is not pseudoprime must fail at least half of all Fermat tests (not shown here), the probability that it passes all $k$ randomly selected tests, giving a false positive, is at most $2^{- k}$

Modular exponentiation is computable in polynomial time, so this is a probabilistic polynomial time algorithm
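The pseudoprimality test above can be sketched in a few lines of Python, relying on the built-in three-argument `pow` for modular exponentiation. The names `passes_fermat` and `pseudoprime`, the default `k=20`, and the `rng` parameter are assumptions made for this illustration, and the input is assumed to satisfy $p \geq 2$

```python
import random


def passes_fermat(p, a):
    """True if p passes the Fermat test at a, i.e. a^(p-1) mod p == 1."""
    return pow(a, p - 1, p) == 1


def pseudoprime(p, k=20, rng=random):
    """Accept if p passes the Fermat test at k random elements of Z_p^+.
    Assumes p >= 2."""
    bases = [rng.randrange(1, p) for _ in range(k)]
    return all(passes_fermat(p, a) for a in bases)


if __name__ == "__main__":
    print(pseudoprime(97))        # a prime always passes
    print(passes_fermat(15, 2))   # 2^14 mod 15 is 4, so 15 fails at a = 2
    print(passes_fermat(561, 2))  # the Carmichael number 561 passes at a = 2
```

The last line illustrates why this is only a pseudoprimality test: 561 is composite, yet it passes the Fermat test at every base relatively prime to it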

To convert this to a full primality algorithm, we use the principle that the number 1 has exactly two square roots, 1 and -1, modulo any odd prime $p$ , whereas for many composite numbers, including all Carmichael numbers, 1 has four or more square roots
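A brute-force check makes the square-root principle concrete. This is a small sketch for tiny moduli only, and the helper name `square_roots_of_one` is an assumption introduced here

```python
def square_roots_of_one(n):
    """All b in {1, ..., n-1} with b*b congruent to 1 modulo n (brute force)."""
    return [b for b in range(1, n) if (b * b) % n == 1]


if __name__ == "__main__":
    print(square_roots_of_one(7))         # prime: only 1 and -1 (that is, 6)
    print(square_roots_of_one(15))        # composite: four square roots
    print(len(square_roots_of_one(561)))  # Carmichael number 3 * 11 * 17
```

Modulo 7 the only square roots of 1 are 1 and 6, modulo 15 they are 1, 4, 11 and 14, and modulo 561 there are eight of them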

Because of this, if $p$ passes the Fermat test at $a$ , so that $a^{p - 1} \bmod p = 1$ , we can look for a square root of 1 other than $\pm 1$ by repeatedly halving the exponent, starting with $a^{(p - 1) /2} \bmod p$ , for as long as the exponent is even and the result is still 1; if the first result different from 1 is not $- 1$ , then $p$ is composite

$PRIME =$ “On input $p$ :

1. If $p$ is even, accept if $p = 2$ and reject otherwise
2. Select $a_{1}, \dots, a_{k}$ randomly in $Z_{p}^{+}$
3. For each $i$ from 1 to $k$ :
4. Compute $a_{i}^{p - 1} \bmod p$ and reject if different from 1
5. Let $p - 1 = s \cdot 2^{l}$ where $s$ is odd, and compute the sequence $a_{i}^{s \cdot 2^{0}}, \dots, a_{i}^{s \cdot 2^{l}}$ modulo $p$
6. If some element of this sequence is not 1, find the last element that is not 1 and reject if that element is not $- 1$
7. If the test has not rejected at this point, accept”
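The algorithm above translates fairly directly into Python. The following is a sketch under some assumptions: the names `is_witness` and `prime`, the default `k=20`, the `rng` parameter, and the handling of inputs below 2 are choices made for this illustration and are not part of the description

```python
import random


def is_witness(a, p):
    """True if a shows that the odd number p > 2 is composite."""
    if pow(a, p - 1, p) != 1:
        return True  # test 1: p fails the Fermat test at a
    s, l = p - 1, 0
    while s % 2 == 0:  # write p - 1 = s * 2^l with s odd
        s //= 2
        l += 1
    sequence = [pow(a, s * 2**i, p) for i in range(l + 1)]
    not_one = [x for x in sequence if x != 1]
    # test 2: the last element different from 1 must be -1, i.e. p - 1
    return bool(not_one) and not_one[-1] != p - 1


def prime(p, k=20, rng=random):
    """Accept primes always; accept a composite with probability at most 2^-k."""
    if p < 2:
        return False  # assumption: inputs below 2 are rejected
    if p % 2 == 0:
        return p == 2
    bases = [rng.randrange(1, p) for _ in range(k)]
    return not any(is_witness(a, p) for a in bases)


if __name__ == "__main__":
    print(prime(7919))         # a prime is always accepted
    print(is_witness(2, 561))  # 561 passes Fermat at 2 but fails test 2
```

For the Carmichael number 561, the base 2 passes the Fermat test, yet the sequence of test 2 contains 67, a square root of 1 that is neither 1 nor -1, so 2 is a witness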

Let’s demonstrate that this works as intended, and the maximum error probability is $2^{- k}$

We say $a_{i}$ is a (compositeness) witness if either test 1 or test 2 rejects (stages 4 and 6, respectively)

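First, we see that if $p$ is an odd prime number, then $\Pr[PRIME \text{ accepts } p] = 1$

$a_{i}$ cannot be a test 1 witness because of Fermat’s little theorem
$a_{i}$ cannot be a test 2 witness because this would imply that some $b \not\equiv \pm 1 \pmod{p}$ satisfies $b^{2} - 1 \equiv 0 \pmod{p}$ , which leads to $(b - 1) (b + 1) = c p$ for some positive integer $c$ ; since $b - 1$ and $b + 1$ are both strictly between 0 and $p$ , a multiple of $p$ would be a product of numbers smaller than $p$ , meaning $p$ cannot be prime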

To prove the error bound, we use the Chinese remainder theorem, which says that if $p$ and $q$ are relatively prime, then each number $r \in Z_{pq}$ corresponds to a unique pair $(a, b)$ such that $r \equiv a (mod p)$ and $r \equiv b (mod q)$
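The correspondence given by the Chinese remainder theorem can be computed explicitly. This is a minimal sketch in which the function name `crt` and its argument order are assumptions; it relies on `pow(p, -1, q)` for the modular inverse, which requires Python 3.8 or later and relatively prime $p$ and $q$

```python
def crt(a, p, b, q):
    """Return the unique r in Z_{pq} with r = a (mod p) and r = b (mod q).
    Assumes p and q are relatively prime."""
    p_inverse = pow(p, -1, q)  # inverse of p modulo q
    # r = a + p * u stays congruent to a modulo p; choose u so that r = b (mod q)
    u = ((b - a) * p_inverse) % q
    return (a + p * u) % (p * q)


if __name__ == "__main__":
    print(crt(2, 3, 3, 5))  # 8, since 8 mod 3 == 2 and 8 mod 5 == 3
```

Running it over every pair $(a, b)$ with $p = 3$ and $q = 5$ yields each of the 15 elements of $Z_{15}$ exactly once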

We show that if $p$ is an odd composite number, then $\Pr[a \text{ is a witness}] \geq \frac{1}{2}$ for a random $a \in Z_{p}^{+}$ , by finding a unique witness for each nonwitness

In every nonwitness, the test 2 sequence consists of either all 1s or all 1s along with a -1 at some position

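Among all nonwitnesses with a -1 in some test 2 position, we find the nonwitness $h$ in which -1 appears in the largest position $j$ , such that $h^{s \cdot 2^{j}} \equiv - 1 \pmod{p}$ ; at least one such nonwitness exists, because $- 1$ itself has a -1 in position 0 ( $s$ is odd)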

Because $p$ is composite, either $p$ is a power of a prime or we can write $p$ as the product of two relatively prime numbers $q$ and $r$ ; we consider the latter case first

The Chinese remainder theorem tells us that some number $t$ exists such that,
$t \equiv h (mod q)$
$t \equiv 1 (mod r)$

Therefore,
$t^{s \cdot 2^{j}} \equiv - 1 (mod q)$
$t^{s \cdot 2^{j}} \equiv 1 (mod r)$

Since $x \equiv a \pmod{p} ⟺ x \equiv a \pmod{q} \text{ and } x \equiv a \pmod{r}$ , we see that $t^{s \cdot 2^{j}} \not\equiv \pm 1 \pmod{p}$ and $t^{s \cdot 2^{j + 1}} \equiv 1 \pmod{p}$ , so $t$ is a witness

Now, note that for each nonwitness $d$ , $d^{s \cdot 2^{j}} \equiv \pm 1 (mod p)$ and $d^{s \cdot 2^{j + 1}} \equiv 1 (mod p)$ , due to the way we chose $j$

Then, $(d t)^{s \cdot 2^{j}} \not\equiv \pm 1 \pmod{p}$ and $(d t)^{s \cdot 2^{j + 1}} \equiv 1 \pmod{p}$ , so $d t \bmod p$ is a witness

If $d_{1}$ and $d_{2}$ are distinct nonwitnesses, then $d_{1} t \bmod p \neq d_{2} t \bmod p$ , since if $t d_{1} \bmod p = t d_{2} \bmod p$ then multiplying both sides by $t^{s \cdot 2^{j + 1} - 1}$ gives $d_{1} = t^{s \cdot 2^{j + 1}} d_{1} \bmod p = t^{s \cdot 2^{j + 1}} d_{2} \bmod p = d_{2}$

Thus the number of witnesses is at least the number of nonwitnesses, which proves our bound in the case where $p$ is not a prime power

Otherwise $p = q^{e}$ with $q$ prime and $e > 1$ , in which case we let $t = 1 + q^{e - 1}$

$t^{p} = (1 + q^{e - 1})^{p} = 1 + p \cdot q^{e - 1} + \text{multiples of higher powers of } q^{e - 1}$ , and every term after the first is divisible by $p$ , so $t^{p} \equiv 1 \pmod{p}$

Hence $t$ is a test 1 witness, because if $t^{p - 1} \equiv 1 \pmod{p}$ , then $t^{p} \equiv t \not\equiv 1 \pmod{p}$ , contradicting $t^{p} \equiv 1 \pmod{p}$

Then we obtain our other witnesses as above, and our bound is proven

Theorem: $PRIMES \in BPP$

This specific algorithm has a one-sided error, in the sense that rejections are absolute

This is a common feature, so there is a special complexity class

Definition: $RP$ is the class of languages decidable by probabilistic polynomial time Turing machines where inputs in the language are accepted with a probability of at least $\frac{1}{2}$ , and inputs not in the language are rejected with a probability of $1$

Our primality algorithm shows that $COMPOSITES \in RP$

Again, we can use a probability amplification technique on these algorithms

Read-Once Branching Programs

We examine equivalence testing for branching programs as an interesting example of a problem that apparently can be solved in polynomial time only with the help of randomness

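Definition: A branching program is a directed acyclic graph where all nodes are labeled by variables, except for two output nodes labeled 0 and 1; the nodes labeled by variables are called query nodes, each query node has two outgoing edges, one labeled 0 and the other labeled 1, and one node is designated the start node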

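A branching program determines a Boolean function of its variables: starting at the start node, we follow at each query node the outgoing edge labeled with the value of that node's variable until we reach an output node, whose label is the output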

Branching programs are related to the class $L$ in a way that is analogous to the relationship between Boolean circuits and the class $P$

The problem of testing equivalence for branching programs is $coNP$ -complete

Here we consider a read-once branching program, which can query each variable at most one time on any execution path

Theorem: $EQ_{ROBP}$ is in $BPP$

Our algorithm works by assigning random values to the variables

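Instead of Boolean values, we allow non-Boolean assignments to the variables and let values propagate through the graph: the start node is assigned 1; a query node labeled $x_{i}$ with value $a$ assigns $a x_{i}$ to its outgoing 1-edge and $a (1 - x_{i})$ to its outgoing 0-edge; and every other node is assigned the sum of the values on its incoming edges. The value at the output node labeled 1 is a polynomial $p (x_{1}, \dots, x_{m})$ that agrees with the branching program on Boolean inputs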

Let $F$ be a finite field with at least $3 m$ elements (the constant 3 is arbitrary and sets the error bound of $\frac{1}{3}$ ), where $m$ is the number of variables of the read-once branching programs we’re investigating

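$D =$ “On input $⟨ B_{1}, B_{2} ⟩$ , two read-once branching programs with assigned polynomials $p_{1}$ and $p_{2}$ :

1. Select $a_{1}, \dots, a_{m}$ at random from $F$
2. Evaluate $p_{1}$ and $p_{2}$ at $a_{1}, \dots, a_{m}$
3. If $p_{1} (a_{1}, \dots, a_{m}) = p_{2} (a_{1}, \dots, a_{m})$ , accept; otherwise, reject”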
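The algorithm $D$ can be sketched in Python by propagating field values through each program. Everything about the encoding here is an assumption made for illustration: a program is a dict that lists its query nodes in topological order (start node first) as `name -> (variable, child0, child1)`, the names `"0"` and `"1"` stand for the output nodes, and the field is the integers modulo a prime `q`

```python
import random

# Two read-once programs for x0 XOR x1 (different variable orders) and one for AND
XOR_A = {"s": (0, "a", "b"), "a": (1, "0", "1"), "b": (1, "1", "0")}
XOR_B = {"s": (1, "a", "b"), "a": (0, "0", "1"), "b": (0, "1", "0")}
AND = {"s": (0, "0", "a"), "a": (1, "0", "1")}


def evaluate(program, assignment, q):
    """Value of the program's polynomial at `assignment`, computed modulo q."""
    value = {name: 0 for name in program}
    value["0"] = value["1"] = 0
    value[next(iter(program))] = 1  # the start node is assigned 1
    for name, (variable, child0, child1) in program.items():
        a, x = value[name], assignment[variable]
        value[child1] = (value[child1] + a * x) % q        # 1-edge carries a * x
        value[child0] = (value[child0] + a * (1 - x)) % q  # 0-edge carries a * (1 - x)
    return value["1"]


def d(program1, program2, m, q, rng=random):
    """One run of D on programs over m variables; assumes q is prime and q >= 3 * m."""
    assignment = [rng.randrange(q) for _ in range(m)]
    return evaluate(program1, assignment, q) == evaluate(program2, assignment, q)


if __name__ == "__main__":
    print(d(XOR_A, XOR_B, m=2, q=7))  # equivalent programs are always accepted
    print(d(XOR_A, AND, m=2, q=7))    # usually False; True with probability <= 1/3
```

On Boolean inputs `evaluate` reproduces the Boolean function of the program, for example `evaluate(XOR_A, [1, 0], 7)` gives 1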

We show this decides $EQ_{ROBP}$ with an error probability of at most $\frac{1}{3}$

Because $B$ is read-once, we may write $p$ as a sum of product terms $y_{1} y_{2} \dots y_{m}$ where each $y_{i}$ is $x_{i}, (1 - x_{i}),$ or 1, and where each product term corresponds to a path in $B$ from the start node to the output node labeled 1

We can then rewrite $p$ into an equivalent polynomial $q$ such that no product term contains $y_{i} = 1$ , by repeatedly splitting each such term into the sum of two terms, one with $y_{i} = x_{i}$ and one with $y_{i} = 1 - x_{i}$

The result is that $q$ contains a product term for each assignment on which $B$ evaluates to 1

If $B_{1}$ and $B_{2}$ are equivalent then the polynomials $q_{1}$ and $q_{2}$ must be the same, and therefore $p_{1}$ and $p_{2}$ are equal on every assignment

To bound the probability that $D$ rejects nonequivalent branching programs, we need some results on polynomials

First, a degree- $d$ polynomial $p$ on a single variable $x$ either has at most $d$ roots or is everywhere equal to 0 (we can prove this simply by induction)

Lemma: Let $F$ be a finite field with $f$ elements and let $p$ be a nonzero polynomial on $x_{1}, \dots, x_{m}$ in which each variable has degree at most $d$ ; if $a_{1}, \dots, a_{m}$ are selected randomly in $F$ , then $\Pr[p (a_{1}, \dots, a_{m}) = 0] \leq m d / f$

We prove this inductively

For $m = 1$ , $p$ has at most $d$ roots so the probability that $a_{1}$ is one of them is at most $d / f$

Assume true for $m - 1$
Let $x_{1}$ be one of $p$'s variables, and let $p_{i}$ be the polynomial comprising the terms of $p$ containing $x_{1}^{i}$ , but where $x_{1}^{i}$ has been factored out

Then, $p = p_{0} + x_{1} p_{1} + x_{1}^{2} p_{2} + \dots + x_{1}^{d} p_{d}$

If $p (a_{1}, \dots, a_{m}) = 0$ then either all $p_{i}$ evaluate to 0 on $a_{2}, \dots, a_{m}$ , or some $p_{i}$ does not and $a_{1}$ is a root of the single variable polynomial obtained by evaluating $p_{0}, \dots, p_{d}$ on $a_{2}, \dots, a_{m}$

Because $p$ is nonzero, some $p_{j}$ is a nonzero polynomial. The probability that all $p_{i}$ evaluate to 0 is at most the probability that this $p_{j}$ evaluates to 0, which by our induction hypothesis is at most $(m - 1) d / f$

If some $p_{i}$ doesn’t evaluate to 0, then $p$ reduces to a nonzero polynomial in the single variable $x_{1}$ , and our base case shows that $a_{1}$ is a root of it with probability at most $d / f$

Thus the probability that $a_{1}, \dots, a_{m}$ is a root of the polynomial is at most $(m - 1) d / f + d / f = m d / f$ , proving our lemma
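For a small field the lemma can be checked exhaustively. The following sketch counts the roots of one example polynomial over the integers modulo 5; the helper `root_fraction` and the polynomial `example` are assumptions chosen only for illustration, and `f` is assumed to be prime so that arithmetic modulo `f` forms a field

```python
from fractions import Fraction
from itertools import product


def root_fraction(poly, m, f):
    """Fraction of points in F^m (F = integers mod prime f) where poly is 0."""
    roots = sum(1 for point in product(range(f), repeat=m) if poly(*point) % f == 0)
    return Fraction(roots, f**m)


def example(x, y):
    """A nonzero polynomial in m = 2 variables, each of degree at most d = 1."""
    return x * y + x + 1


if __name__ == "__main__":
    m, d, f = 2, 1, 5
    print(root_fraction(example, m, f), "<=", Fraction(m * d, f))
```

Here $x y + x + 1 = 0$ forces $x \neq 0$ and then determines $y$ uniquely, so there are 4 roots among the 25 points, comfortably below the bound $m d / f = \frac{2}{5}$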

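If $B_{1}$ and $B_{2}$ are not equivalent, they disagree on some Boolean assignment, so $p_{1} - p_{2}$ is a nonzero polynomial in which each variable has degree at most 1 (the programs are read-once). Applying the lemma with $d = 1$ and $f \geq 3 m$ , the probability that our random assignment is a root of $p_{1} - p_{2}$ is at most $m / (3 m) = \frac{1}{3}$ , meaning $D$ rejects with a probability of at least $\frac{2}{3}$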

Pseudorandom Generators

It’s important to note that true randomness is generally difficult or impossible to obtain

Practical implementations use pseudorandom generators, which are deterministic algorithms whose output appears random
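The deterministic nature of a pseudorandom generator is easy to observe with Python's standard `random` module, used here only as a familiar example; the seed value 2024 is arbitrary. Two generators started from the same seed produce exactly the same sequence of coin flips

```python
import random

# Two independent generator objects seeded with the same (arbitrary) value
first = random.Random(2024)
second = random.Random(2024)

flips_first = [first.randrange(2) for _ in range(16)]
flips_second = [second.randrange(2) for _ in range(16)]

if __name__ == "__main__":
    print(flips_first)
    print(flips_first == flips_second)  # True: the output is fully determined by the seed
```

This generator is designed for simulation rather than for the strong indistinguishability guarantees discussed in these notes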

Pseudorandom generators can never be truly random, but may satisfy various statistical tests

Proving that an algorithm works equally well with a pseudorandom generator is generally difficult, and the claim does not always hold

Pseudorandom generators have been devised that can produce results indistinguishable from truly random results by any test that operates in polynomial time, assuming that one-way functions exist

Binyamin's Notes
