15-451 Algorithms 11/16/00 * Number-theoretic algs - GCD - fast exponentiation - Euler's theorem - RSA ============================================================================== Today and next time will talk about number-theoretic algorithms. A collection of uses. One of most important is in cryptography, RSA. Hard and Easy problems: ====================== FACTORING: Given number n, find a number d (1 < d < n) that divides n if one exists. No known fast algorithm (for non-quantum computers anyway). Note: goal is polynomial time in log(n). I.e., in the number of bits it takes to write down the input. Easy to factor in time linear in n. Even sqrt(n) by trial division. Quadratic sieve alg is roughly 2^{sqrt(log n loglog n)}. Number-field sieve is 2^{O((log n)^{1/3}(log log n)^{2/3})}. Largest RSA challenge number factored so far is 512 bits (155 digits). Number Factored Size in Bits Date Factored Algorithm Used =============== ============ ============= ============== RSA-100 330 April 1991 QS RSA-110 364 April 1992 QS RSA-120 397 June 1993 QS RSA-129 425 April 1994 QS RSA-130 430 April 1996 NFS RSA-140 463 February 1999 NFS RSA-155 512 August 1999 NFS (source: certicom.com and rsa.com) PRIMALITY TESTING: Given number n, is it prime or composite? This *can* be done quickly by a randomized algorithm: if n is prime then for sure will say prime, but if n is composite, it is only guaranteed to say composite with prob 1 - 1/2^{100}. More easy problems: - inverse mod n: given a,n compute a^{-1} mod n. - GCD: given a,b compute GCD(a,b) - exponentiation: given a,b,n compute a^b mod n. Obvious fact: COMPOSITENESS (is n composite) is in NP. Proof is a factor between 1 and n. Nonobvious fact: PRIMALITY (is n prime) is also in NP. I.e., there exist short proofs that a number is prime. We'll get to this later. This is why factoring is believed to *not* be NP-complete. Decision problem for factoring is: "given n and k, does n have a divisor between 1 and k". This is clearly in NP: proof is the divisor. But, it is also in NP: proof is the prime factorization of n, along with a proof that each of these factors is prime. Nonetheless, still no fast algorithm known. Cryptosystems like RSA are based on fact that primality testing and other easy problems above are easy, but factoring seems hard. Will get to RSA later once we've built up some of foundations. Start with some of things we *can* do: GCD === GCD(a,b) is the largest d s.t. d divides a, and d divides b. I.e., both a and b are multiples of d. gcd(a,0)=a. E.g., what's GCD of 51 and 9? Can we compute GCD(a,b) quickly - in time poly(log(a) + log(b))? Yes. Classic algorithm over 2000 years old called Euclid's alg. Based on observation that if a>b, then GCD(a,b) = GCD(b, a mod b). Note: I'm using "a mod b" to mean smallest nonnegative of form a-kb for integer k. Proof: if d|a (so a = a'*d) and d|b (so b = b'*d) then a-kb = d(a' - kb'), so d|(a mod b) too. Similarly, if d|b and d|a-kb then d|a. So, this is the algorithm GCD(a,b) { if (b==0) return a; else return GCD (b, a % b); } E.g., GCD(51, 9) = GCD(9,6) = GCD(6,3) = GCD(3,0) = 3. Can anyone see quick argument that running time is O(log a)? E.g., looking 2 steps at a time? [If b < a/2 then after just 1 step the largest is cut in half. If b > a/2 then a mod b < a/2 so in 2 steps the largest is cut in half] Extended GCD: also computes integers x and y such that d = ax + by. Idea: can compute with same algorithm. Recursively, running on b and a-kb, we compute x', y',d such that d = bx' + (a-kb)y'. This means that d = ay' + b(x'-ky'). This seems like a curiosity but we'll see how this will be really useful. Now, turn to modular arithmetic. Definitions: Z_n = {0,1,2,...,n-1} Z_n^* = {a in Z_n : gcd(a,n) = 1}. If n is prime, then Z_n^* = {1,2,...,n-1} Z_n^* is a group under multiplication: if you multiply two numbers relatively prime to n, you get another number relatively prime to n. Also, each a in Z_n^* has an inverse: a number b such that ab = 1 mod n. Question: why is this true and how can we compute inverse quickly? E.g., what is 5^{-1} mod 17? Here's how: compute extended GCD of n,a. Get x,y such that nx+ay=1. So, ay = 1 mod n: y is the inverse of a. E.g., EGCD(17,5) calls EGCD(5,2) where 2 = 17-3*5. This returns x'= 1, y' = -2. So, our answer is x = -2, y = 1 - 3*(-2) = 7. So, 5^{-1} = 7 mod 17. Fast Exponentiation =================== Given a, d, n, can we compute a^d mod n in time poly(log(n), log(d))? Answer: yes, by repeated doubling. Nice way: Call the routine below with x=1 power(a,d,x) \\ invariant is result = (a^d)*x mod n { if (d==0) return x; if d is odd, return power(a,d-1,x*a % n); if d is even, return power(a^2 % n, d/2, x); } Every two steps, d is cut by factor of 2, so # iterations is O(log d). The Euler Phi Function ====================== Defn: phi(n) = |Z_n^*|. E.g., if p is prime, then phi(p) = p-1. If n=p*q, what is phi(n)? phi(n) = (p-1)(q-1). Can see this by listing the numbers {1,...,n} and then removing the q multiples of p, then removing the p multiples of q, and then realizing that we removed n twice. As an aside, in general, if n= p1^e1 * p2^e2 *...* pk^ek, then: phi(n) = (p1 - 1)*p1^{e1 - 1}*...*(pk - 1)*pk^{ek - 1} Euler's Theorem: for any a in Z_n^*, a^{phi(n)} = 1 (mod n). Easier fact: "Fermat's little theorem": a^{p-1} = 1 mod p, when p is prime, a in Z_p^*. Proof of Fermat: we'll prove a^p = a (mod p) and do it by induction on a. OK for a=1. (a+1)^p = a^p + p*a^{p-1} + (p choose 2)a^{p-1} + ... + (p choose p-1)a + 1 = a^p + 1 mod p. By induction, a^p = a mod p, so (a+1)^p = a+1 mod p. Proof of Euler's theorem ------------------------ Definition: order(a) = smallest t such that a^t = 1 (mod n). Note that {1, a, a^2, ..., a^{t-1}} is a subgroup of Z_n^*: it's closed under multiplication and inverses. Now we can use a basic fact from group theory that THE SIZE OF ANY SUBGROUP DIVIDES THE SIZE OF THE GROUP. Proof: Say H is a subgroup of G and y is not in H. Then the coset yH = {yh : h in H} is a set of size |H| (if y*h1 = y*h2 then h1 = h2) and is disjoint from H (if y*h1 = h2 then y = h2*h2^{-1}, which is in H by H's group closure properties). Furthermore, all cosets are disjoint (if z*h1 = y*h2 then z = y*h3 for some h3 in H). So, if t is the order of a, then phi(n) = b*t for some b, and a^{phi(n)} = (a^t)^b = 1 (mod n).