A probabilistic algorithm uses the outcome of a random process to influence execution, which can be helpful for certain kinds of predictions

Why would this ever be useful? Sometimes calculating the best choice takes too much time, and estimating it always introduces bias

For example, determining information about individuals in a large population can be done quicker if we use random sampling

Definition: A probabilistic Turing machine is a type of nondeterministic Turing machine in which each nondeterministic step is called a coin-flip step and has two legal next moves

We assign a probability to each branch of ‘s computation on input as where is the number of coin-flip steps that occur on branch

Then,

Definition: decides language A with probability if and

We also consider error probability bounds as a function of , like which is exponentially small error

We measure time and space complexity as with any nondeterministic Turing machine, taking the worst case computation branch on each input

Definition: is the class of languages decidable by probabilistic polynomial time Turing machines with an error probability of

This class is equivalent for any by virtue of the amplification lemma

Lemma: Let be a fixed constant with , then for any polynomial , a probabilistic Turing machine with error has an equivalent machine with error

The gist of this result is we can simply repeat the same computation times and take the majority result, which reduces the error and ultimately still runs in polynomial time

Primality

While a polynomial time algorithm for testing primality exists, the most practical algorithms are probabilistic

The most obvious way to test primality is by testing for factors, but this is not known to be possible in probabilistic polynomial time

Definition: For any greater than 1, we say that two numbers are equivalent modulo p if they differ by a multiple of , in which case we write

Every number is equivalent modulo to some member of the set , and for convenient we also write

The main idea of this algorithm uses Fermat’s little theorem

Theorem: If is prime and , then

We use this to derive a Fermat test, where we say passes the Fermat test at if

Definition: A number is pseudoprime if it passes Fermat tests for all smaller a’s relatively prime to it

The pseudoprime numbers are identical to the prime numbers with the exception of the infrequent Carmichael numbers

“On input :

  1. Select randomly in
  2. Compute for each
  3. Accept if all computed values are 1”

Because a number that is not pseudoprime it must fail at least half of the tests (not shown here), the probability of a false positive is at most

Modular exponentiation is computable in polynomial time, so this is a probabilistic polynomial time algorithm

To convert this to a full primality algorithm, we use the principle that the number 1 has exactly two square roots, 1 and -1, modulo any prime , but will always have four or more for many composite numbers, including all Carmichael numbers

Because of this, if passes the Fermat test at , , and we can calculate the square root of by dividing the degree of the exponent by 2 until we get a different number (starting with )

“On input :

  1. If is even, accept if and reject otherwise
  2. Select randomly in
  3. For each from 1 to :
  4. Compute and reject if different from 1
  5. Let where is odd and compute
  6. If some element of this sequence is not 1, find the last element that is not 1 and reject if that element is not
  7. If the test has not rejected at this point, accept

Let’s demonstrate that this works as intended, and the maximum error probability is

We say is a (compositeness) witness if either test 1 or 2 rejects (stage 4 and 6 respectively)

First, we see that if is an odd prime number, then

cannot be a test 1 witness because of Fermat’s little theorem
cannot be a test 2 witness because this would imply some , which leads to , meaning cannot be prime

To prove the error bound, we use the Chinese remainder theorem, which says that if and are relatively prime, then each number corresponds to a unique pair such that and

We show that for a random by finding a unique witness for each nonwitness

In every nonwitness, the test 2 sequence consists of either all 1s or all 1s along with a -1 at some position

Among all nonwitnesses with a -1 in some test 2 position, we find the witness in which -1 appears in the largest position , such that

Because is composite, either is the power of a prime or we can write as the product of and relatively prime

The Chinese remainder theorem tells us that some number exists such that,

Therefore,

Since , we see that and , so is a witness

Now, note that for each nonwitness , and , due to the way we chose

Then, and , so is a witness

If and are distinct nonwitnesses, then , since if then

Thus we’ve proven our analysis in the case where is not a prime number

Otherwise with prime and , in which case we let

which is by definition equivalent to

Hence is a test 1 witness, because if , then

Then we obtain our other witnesses as above, and our bound is proven

Theorem:

This specific algorithm has a one-sided error, in the sense that rejections are absolute

This is a common feature, so there is a special complexity class

Definition: is the class of language decidable by probabilistic polynomial time Turing machines where inputs in the language are accepted with a probability of at least , and inputs not in the language are rejected with a probability of

Our primality algorithm shows that

Again, we can use a probability amplification technique on these algorithms

Read-Once Branching Programs

We examine branching programs as an interesting example of a problem that can apparently only be solved in polynomial time with probabilism

Definition: A branching program is a directed acyclic graph where all query nodes are labeled by variables with two outgoing edges labeled 0 and 1, except for two output nodes

A branching program determines a Boolean function on the values of the query nodes

Branching programs are related to the class in a way that is analogous to the relationship between Boolean circuits and the class

The problem of testing equivalence for branching programs is -complete

Here we consider a read-once branching program, which can query each variable at most one time on any execution path

Theorem: is in

Our algorithm works by assigning random values to the variables

Instead of Boolean values, we modify our scheme to handle non-Boolean assignments to the variables and allow the values to propagate through the graph, summing the edges flowing into a single node, such that the result is a polynomial

Let be a finite field with at least elements (the coefficient is arbitrary), where is the number of variables of the read-once branching programs we’re investigating

“On input :

  1. Select at random from
  2. Accept if

We show this decides with an error probability of at most

Because is read-once, we may write as a sum of product terms where each is or 1, and where each product term corresponds to a path in from the start node to the output node labeled 1

We can then rewrite into an equivalent polynomial such that no product terms contain , by repeatedly splitting terms into the sum of a term with and

The result is that contains a product term for each assignment on which evaluates to 1

If and are equivalent then the polynomials and must be the same, and therefore and are equal on every assignment

To show the probability that rejects nonequivalent branching programs, we need some results on polynomials

First, a degree- polynomial on a single variable either has at most roots or is everywhere equal to 0 (we can prove this simply by induction)

Lemma: Now let be a finite field with elements and let be a nonzero polynomial on , with degrees at most ; if are selected randomly in , then

We prove this inductively

For , has at most roots so the probability that is one of them is at most

Assume true for
Let be one of ‘s variables, and let be the polynomial comprising the terms of containing the terms of containing but where has been factored out

Then,

If then either all evaluate to 0 or some doesn’t and is a root of the single variable polynomial obtained by evaluating on

The probability that all evaluate to 0 is at most the probability that some evaluates to 0, which by our induction hypothesis is at most

If some doesn’t evaluate to 0, then this reduces to a nonzero polynomial in the single variable , which our base case shows has probability at most

Thus the probability that is a root of the polynomial is at most proving our lemma

This means the probability that our random assignment is a root of is at most , meaning rejects with a probability of at least

Pseudorandom Generators

It’s important to note that true randomness is generally difficult or impossible to obtain

Practical implementations use pseudorandom generators, which are deterministic algorithms whose output appears random

Pseudorandom generators can never be truly random, but may satisfy various statistical tests

Proving an algorithm works equally well with pseudorandom generators is difficult, and may not always hold

Pseudorandom generators have been devised that can produce results indistinguishable from truly random results by any test that operates in polynomial time, assuming that one-way functions exist