## Saturday, April 27, 2013

### The definitions of pseudoprimes

The idea behind the AKS primality tests is not an original idea from its authors, but it can be traced back to Pierre de Fermat, in the 1600.

The fundamental idea that Fermat had, and that is used in the AKS test, but also in the Miller-Rabin and Solovay-Strassen probabilistic tests, is the definition of a "pseudoprime in base $$b$$". Basically Fermat applied finite-group algebra to obtain information about the primality of a number, using the following definition:

Let $$n \in \mathbb{N}$$ be an odd, non-prime number and let $$a \in \mathbb{N}$$ such that $$GCD(n, a) = 1$$, then $$n$$ is a pseudoprime in base $$a$$ if:
$a^{n-1} \equiv 1 \bmod n$
We can easily see that this definition is strongly related to Fermat's little theorem.

This definition gives an immediate idea on how to check if a number is a prime: we simply take some bases $$a$$ at random and check if the given number is a pseudoprime in that base. If it isn't, then the number is not a prime, otherwise it may be a prime number. The more bases we test and the bigger gets the probability of $$n$$ being a prime number.

Unfortunately this test doesn't work, since there exist the so-called Charmicael numbers which are composite numbers that are pseudoprime in every base.

The Solovay-Strassen and Miller-Rabin probabilistic tests follow exactly this idea. They give a stronger definition of pseudoprime and test if the given number is a pseudoprime with different bases.

For example the Solovay-Strassen test uses this definition of pseudoprime, called Euler's pseudoprime:

Let $$n \in \mathbb{N}$$ be an odd, non-prime, number and let $$b \in \mathbb{N}$$ be a number such that $$GCD(n,b) = 1$$, then $$n$$ is an Euler pseudoprime in base $$b$$ if:
$b^{\frac{n-1}{2}} \equiv \left(\frac{b}{n}\right) \bmod n$

Where $$\left(\frac{a}{b}\right)$$ is the Jacobi Symbol.

It's quite easy to show that if $$n$$ is an Euler pseudoprime in base $$b$$ then it is also a pseudoprime in base $$b$$, but, most importantly, it holds the following theorem:

Let $$n \in \mathbb{N}$$ be an odd, non-prime, number. Let
$B = \left\{b \in \mathbb{N} \mid b < n \wedge b^{\frac{n-1}{2}} \equiv \left(\frac{b}{n}\right) \bmod n \right\}$

Then $$|B| \leq \frac{\phi(n)}{2}$$.

From this theorem it immediately follows this corollary:

Let $$n \in \mathbb{N}$$ be an odd number and let $$m \in \mathbb{N}$$ be a positive number. Then, the probability that $$n$$ is not a prime when it is an Euler pseudoprime for $$m$$ different bases in the range $$\{1, \dots, n-1\}$$ is less than or equal to $$2^{-m}$$.

this means that simply testing for about $$50$$ or $$100$$ bases gives a really precise answer. The Miller-Rabin uses an even stronger definition of pseudoprime, called Strong pseudoprime:

Let $$n \in \mathbb{N}$$ be an odd, non-prime, number and let $$b < n$$ be a natural number coprime with $$n$$. Let $$s, t \in \mathbb{N}$$ be numbers such as $$n = 2^st + 1$$ with $$t$$ odd. Then $$n$$ is a Strong pseudoprime in base $$b$$ if:
$b^t \equiv 1 \bmod n$
Or if:
$\exists r \in \mathbb{N}, r < s \mid b^{2^rt} \equiv -1 \bmod n$

It's easy to show that every Strong pseudoprime is an Euler's and Fermat's pseudoprime and it also holds that if $$n$$ is an Euler pseudoprime, then it is a Strong pseudoprime only if $$n \equiv 3 \bmod 4$$. Hence this definition of pseudoprime is strictly stronger than the definition of Euler pseudoprime.

In fact we can show that if $$n$$ is an odd number and it is a Strong pseudoprime for $$m$$ different bases, then the probability that $$n$$ is not a prime are less than or equal to $$4^{-m}$$.

### The AKS definition of pseudoprimes

Also the AKS uses this kind of definition of a pseudoprime. The difference between this test and the former algorithms is that the AKS pseudoprime are related to a polynomial extension of Fermat's little theorem:

Let $$a \in \mathbb{Z}_0, n \in \mathbb{N}, n \geq 2$$ such that $$a$$ and $$n$$ are coprime. Then $$n$$ is prime if and only if:
$(x + a)^n \equiv x^n + a \bmod n$

The problem with this theorem is that the polynomial has an exponential number of terms, hence it doesn't provide an efficient way to check for primality.

The idea of the AKS authors is to reduce the above definition in a smaller polynomial Ring. In particular what they wanted to obtain was something like the following:

Pseudo-theorem: Let $$a \in \mathbb{Z}_0, n \in \mathbb{N}, n \geq 2$$ such that $$a$$ and $$n$$ are coprime and let $$r \in \mathbb{N}$$ be a "big enough number" with $$r \leq \log^k{n}$$ for some fixed $$k \in \mathbb{N}$$. Then $$n$$ is prime if and only if:
$(x + a)^n \equiv x^n + a \; \text{in} \; \mathbb{Z}_n[x]/(x^r - 1)$

Unfortunately such a theorem doesn't hold. But it holds a similar theorem where, instead of using a single base $$b$$, it's enough to test a "small" number of bases.
In particular this is the AKS theorem:

Let $$n > 1 \in \mathbb{N}$$ such that $$n$$ is not a power of a prime with exponent bigger than $$1$$. Let $$r < n \in \mathbb{N}$$ such that:
$r \leq [16\log_2^5{n}] + 1, \qquad ord_r(n) > 4\log_2^2{n}$
If $$n$$ doesn't have prime factors smaller or equal to $$r$$, then $$n$$ is prime if and only if:
$(X + a)^n \equiv X^n + a \; \text{in} \; \mathbb{Z}_n [X]/(X^r-1) \quad \forall a = 1, \dots, [2\sqrt{\phi(r)}\log_2{n}]$
So this immediately suggest a pseudocode for testing primality:

def test_AKS(n):
if n < 2:
return False
elif n == 2:
return True
if find_int_power(n):
return False
r = find_r(n)
for divisor in range(2, r+1):
if n % divisor == 0:
return False
if n < r**2:
return True

a_upper_bound = int(2 * phi(r) ** .5 * math.log(n, 2)) + 1
for a in range(1, a_upper_bound+1):
if ((x + a) ** n) != (x ** n + a) % (x ** r - 1):
return False
return True 


(Where the condition on the if block inside the last loop is actually pseudocode where the polynomial modulus applies to the whole thing. Remember: we cannot evaluate something like $$(x + a)^n$$.)

The important fact about this algorithm is that we can prove to have a complexity of $$\mathcal{O}(\log^{12}{n})$$ and hence it proves that testing primality is a problem in $$\mathcal{P}$$.

The problem about this algorithm is the really big exponent of the complexity. Furthermore the constants hidden in the big-O notation are pretty big, due to the complexity of operations with polynomials. Using an efficient implementation of polynomials is fundamental.