Thursday, September 6, 2012

Understanding RSA

This is a explanation of RSA that I wrote about a year ago for a discrete math class.  I've shared this a few times so I'm posting it here in hopes it will be useful to others. Please post comments if something is unclear or if you think you've found a mistake.


A key is a number that can be used with a cryptographic algorithm to encrypt or decrypt a message.  In symmetric cryptosystems (AES, DES, RC5, etc) the same key is used for encryption and decryption.  In asymmetric cryptosystems like RSA, there are two keys: one key is used to encrypt while another is used to decrypt.

A user's public key is used to encrypt messages intended for that user.  The user can decrypt those messages using his private key but an attacker with only knowledge of his public key cannot decrypt the message.  This allows the user to share the encryption key publicly for anyone that might want to send him a secret message (hence “public”).  The user must keep his decryption key secret (private).  

The Algorithm

In RSA, each user picks two large (1,000 bits or larger) prime numbers called p and q and multiplies them together to get n.   n is the modulus.  Encryption and decryption in RSA are done modulo n, which means we divide any result by n and only keep the remainder.  E.g. 7 mod 5 = 2 because 7 divided by 5 is 1 remainder 2.

The user also needs to compute the Euler Phi function on n: ϕ (n).  The Euler Phi function returns the number of numbers less than n that are relatively prime to n (they share no common factors).  This is easy for the user because if p and q are prime, then  

ϕ(n) = (p – 1)(q – 1).

To create public and private keys, the user picks a public exponent e that is relatively prime to ϕ (n) and computes a decryption exponent d that is the multiplicative inverse of e mod ϕ (n).  More simply

ed mod ϕ(n) = 1.  

The decryption exponent can be computed using the extended Euclidean algorithm.  The Euclidean algorithm is used to find the greatest common denominator (GCD) of two numbers.  In the extended Euclidean algorithm, we can do some extra computations to find two numbers u and v such that gcd(x,y) = ux + vy. 

If, and only if, x and y are relatively prime, we have gcd(x,y) = ux + vy = 1.  In RSA, this is done by taking e and ϕ(n) and finding u and v such that:

 ue + v∙ϕ(n) = 1

Taken mod ϕ(n):
 ue + v∙ϕ(n) mod  ϕ(n) = ue mod ϕ(n) = 1.   

u is the multiplicative inverse of e mod ϕ(n) and is called d in the RSA algorithm.

The user can share his public key, the modulus n and the exponent e written (e,n).  Any sender that wants to send him a message can encrypt the message m by calculating me mod n.  The user can decrypt this message be calculating (me )d mod n = m.  To understand why this works, we need to understand how exponents are evaluated.  For any number x, and exponents a and b, 

(xa )b  = xa b 


x ∙ xa = xa + 1
Now that we understand exponents, we can use Euler's Theorem: If gcd(x,n) = 1, then:

x ϕ(n) mod n = 1 mod n

What this means is that if we raise x to a power that is a multiple of ϕ(n) and take the result mod n, the result is the same as raising x to the power 0:

x 0 mod ϕ(n) mod n = x 0 mod n = 1 mod n

 It might not be obvious how this would be useful, but consider the case where we have:

 x 1 mod ϕ(n) mod n = x ∙x ϕ(n) mod n = x ∙x0 mod n = x mod n

 In RSA, since e∙d mod ϕ(n) = 1 mod ϕ(n):

m e∙d mod n = m e∙d mod ϕ(n) mod n = m 1 mod ϕ(n) mod n = m

In order for an attacker to compute the decryption exponent, he must compute ϕ(n) which means he must factor n.   

The security of RSA rests on the difficulty of factoring n to learn the decryption exponent or of finding m given me mod n.  Since e is known, an attacker could also attempt to find the value e∙d given a known value of m.  Finding the value  e∙d such that  m e∙d = 1 mod n is called the discrete logarithm problem and is the basis for Diffie-Hellman key exchange.  Factoring numbers that are the product of large primes or finding discrete logarithms are both difficult problems.  Given large enough parameters, these problems cannot be solved in any reasonable time frame given current algorithms and computing hardware.  They can, however, be solved in a time that is much faster than brute force.  Factoring a 1024-bit RSA modulus is considered equivalent to brute forcing a symmetric key of about 80-bits.  Check out the Keylength website to compare key sizes.

An RSA Example

p = first prime = 17
q = second prime = 23
n = modulus = 17 ∙ 23 = 391
e  = public exponent = 31
Euler Phi function = ϕ(n) = (17 – 1)(23 – 1) = 16 x 22 = 352
m = message = 101
c = ciphertext
p = plaintext

We can compute c:
c = m31 mod n = 10131 mod 391 = 186

Just for fun, let’s decrypt that.  First, we need to compute the multiplicative inverse of e mod ϕ (n).  I’m lazy so I used an online calculator at 2000 Clicks.  I found the multiplicative inverse of 31 mod ϕ(n)  = 159.  So:

p = cd mod n =186∙159 mod 391 = 101

To learn more, I strongly recommend Understanding Cryptography by Christof Paar and Jan Pelzl.  You can also check out my post on learning cryptography for additional resources.

1 comment:

  1. Its really difficult to understand this algorithm. I read this post several times and finally got to know the basic idea behind it. Thank you for this excellent detail.