RSA Explained (With Examples)

Motivation

RSA (Rivest-Shamir-Adleman) is one of the first public-key cryptosystems and is widely used for secure data transmission. In such a cryptosystem, the encryption key is public and is different from the decryption key which is kept secret.

If I want to comprehend zero-knowledge proofs, then understanding the granddaddy of public-key cryptosystems is a must.

Background Maths

Exponential Rules ¹

The two rules used later in this post are:

$$a^{b + c} = a^b \cdot a^c \qquad (a^b)^c = a^{bc}$$

Division Theorem ²

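The division theorem gives us a formal way to express the relationship between the dividend, divisor, quotient, and remainder, i.e. for a dividend $a$ and a divisor $b > 0$ there are unique integers $q$ (quotient) and $r$ (remainder) such that

$$a = bq + r, \quad 0 \le r < b$$

Example: $17 \div 5 = 3$ remainder $2$.

We can write the equation as $17 = 5 \times 3 + 2$.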

Modulo Arithmetic

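$a \equiv b \pmod n$ states that $a$ and $b$ both have the same remainder after division by $n$.

Modular arithmetic and the division theorem are closely related: say we have $a = 17$ and $n = 5$; plugging those values into the division theorem $a = nq + r$ gives us $17 = 5 \times 3 + 2$.

We can write that as $17 \equiv 2 \pmod 5$, or $17 \bmod 5 = 2$.

We can also go backwards:

$$17 \equiv 2 \pmod 5 \implies 17 = 5k + 2 \text{ for some integer } k$$

More generally:

$$a \equiv b \pmod n \iff a = kn + b \text{ for some integer } k$$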
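As a quick check of the link between the division theorem and congruences, here is a minimal Python sketch. The values 17 and 5 and the helper name `congruent` are illustrative assumptions, not part of the original post.

```python
# Division theorem: a = b*q + r with 0 <= r < b
a, b = 17, 5
q, r = divmod(a, b)
assert a == b * q + r and 0 <= r < b

def congruent(x, y, n):
    """Return True when x and y leave the same remainder modulo n."""
    return (x - y) % n == 0

print(q, r)                  # 3 2
print(congruent(17, 2, 5))   # True
print(congruent(17, 3, 5))   # False
```

Checking `(x - y) % n == 0` is the same statement as $x = kn + y$ for some integer $k$.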

Greatest Common Divisor (gcd)

The greatest common divisor between two numbers is the largest integer that will divide both numbers.

Example: gcd(3, 9) = 3.

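If both numbers are distinct primes, then the gcd will always be 1. If only one of the numbers is a prime $p$, the gcd is 1 unless $p$ divides the other number, in which case the gcd is $p$ (as in the example above).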

Multiplicative Inverse

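A multiplicative inverse for a number $x$, denoted by $x^{-1}$, is a number that when multiplied by $x$ yields the multiplicative identity $1$.

In modular arithmetic modulo $n$, only numbers $x$ with $\gcd(x, n) = 1$ have a multiplicative inverse, i.e. there exists $x^{-1}$ such that $x \cdot x^{-1} \equiv 1 \pmod n$.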
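To make the existence condition concrete, here is a small sketch using Python's built-in `pow`, which accepts a negative exponent together with a modulus from Python 3.8 onwards. The helper name `mod_inverse` is hypothetical and only there for illustration.

```python
from math import gcd

def mod_inverse(x, n):
    """Return the inverse of x modulo n, or None when x and n are not coprime."""
    if gcd(x, n) != 1:
        return None
    return pow(x, -1, n)  # modular inverse via built-in pow (Python 3.8+)

print(mod_inverse(3, 7))  # 5, because 3 * 5 = 15 = 2 * 7 + 1
print(mod_inverse(2, 4))  # None, because gcd(2, 4) = 2
```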

Euler's Totient Function

In number theory, Euler's totient function counts the positive integers up to a given integer n that are relatively prime to n.

In other words, the totient function (often represented as $\phi(n)$ for a number $n$) calculates the number of integers between $1$ and $n$ whose gcd with $n$ is equal to $1$.

More concretely in code:

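from math import gcd

def totient(n):
    # Count the integers in 1..n that share no factor with n.
    # The range must start at 1, since gcd(1, n) == 1 for every n.
    total = 0
    for i in range(1, n + 1):
        if gcd(i, n) == 1:
            total = total + 1
    return total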

If $n$ is a prime number, then $\phi(n) = n - 1$.

One important thing to note is that the totient function is multiplicative: for coprime $a$ and $b$,

$$\phi(ab) = \phi(a)\,\phi(b)$$

This will come into use when we calculate $\phi(n)$ where $n = pq$.
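The product rule can be checked by brute force. The sketch below assumes two small primes, 23 and 31, picked only for illustration, and uses a compact totient that counts from 1 to n inclusive.

```python
from math import gcd

def totient(n):
    """Brute-force Euler totient: count i in 1..n with gcd(i, n) == 1."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)

p, q = 23, 31  # two small primes, chosen only for illustration
assert totient(p) == p - 1
assert totient(q) == q - 1
assert totient(p * q) == totient(p) * totient(q) == 660

# The product rule needs coprime arguments: it fails for 2 and 4
print(totient(2 * 4), totient(2) * totient(4))  # 4 2
```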

Euler's Theorem ³

$$a^{\phi(n)} \equiv 1 \pmod n$$

The formula above holds true if the gcd of $a$ and $n$ is $1$ (i.e. they are coprime).
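Euler's theorem is easy to check numerically for a small modulus. This is a minimal sketch, assuming $n = 10$ as an arbitrary example and a brute-force `totient` helper.

```python
from math import gcd

def totient(n):
    """Brute-force Euler totient: count i in 1..n with gcd(i, n) == 1."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)

n = 10
phi = totient(n)  # 4: the integers 1, 3, 7, 9
for a in range(1, n):
    if gcd(a, n) == 1:
        assert pow(a, phi, n) == 1  # Euler's theorem holds for coprime a

print(pow(3, phi, n))  # 1, since 3**4 = 81
print(pow(2, phi, n))  # 6, the theorem does not apply because gcd(2, 10) = 2
```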

RSA

Introduction

Say we have a message $m$, a public key $e$, and a secret key $d$: we can encrypt $m$ with $e$ to get the cipher $c$, and decrypt $c$ with $d$.

Algorithm

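1. Generate two large random prime numbers, $p$ and $q$.
2. Calculate $n = pq$.
3. Calculate the totient of $n$: $\phi(n) = (p - 1)(q - 1)$.
4. Generate a public key $e$ that satisfies the two constraints: $1 < e < \phi(n)$ and $\gcd(e, \phi(n)) = 1$.
5. Calculate the multiplicative inverse $d$ of $e$ (this will be the private key) such that $de \equiv 1 \pmod{\phi(n)}$.
6. The generated public key is $(e, n)$, and the generated private key is $(d, n)$.
7. Given message $m$, $m^e \bmod n$ yields the encrypted message, and $m^{ed} \bmod n$ yields the decrypted message.
   - This is because $ed \equiv 1 \pmod{\phi(n)}$
   - Therefore, $m^{ed} \bmod n = m$, giving us our original message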
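The steps above can be traced with numbers small enough to check by hand. The sketch below assumes toy values ($p = 5$, $q = 11$, $e = 3$, $m = 7$) chosen for illustration only, and works on a single integer message rather than text.

```python
from math import gcd

p, q = 5, 11               # toy primes; real keys use primes hundreds of digits long
n = p * q                  # 55
phi_n = (p - 1) * (q - 1)  # 40
e = 3
assert 1 < e < phi_n and gcd(e, phi_n) == 1
d = pow(e, -1, phi_n)      # 27, since 3 * 27 = 81 = 2 * 40 + 1 (Python 3.8+)

m = 7                      # the message must be an integer smaller than n
c = pow(m, e, n)           # encrypt: 7**3 = 343 = 6 * 55 + 13
recovered = pow(c, d, n)   # decrypt
print(d, c, recovered)     # 27 13 7
```

Three-argument `pow` reduces modulo `n` at every step, so it stays fast even when the exponent is very large.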

Why RSA Works

Calculating Modulus

Our modulus, $n$, is calculated by multiplying the two prime numbers $p$ and $q$. This step acts like a one-way function: $n$ is easy to calculate given $p$ and $q$, but $p$ and $q$ are hard to compute given $n$.

What's scary to me is that computing the prime factors $p$ and $q$ is only believed to be a hard problem, meaning that if someone found out how to calculate $p$ and $q$ given $n$ in polynomial time, RSA-based encryption as we know it (e.g. in SSL) would break.

This is essential to RSA's security: given a composite number ($n$), it is considered a hard problem to determine the prime factors ($p$, $q$).
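The asymmetry shows up even with the simplest factoring method. The sketch below uses plain trial division (a deliberately naive approach, not what real attacks use) and a hypothetical helper name `smallest_factor_pair`. Multiplying takes one step, while recovering the factors takes on the order of $\sqrt{n}$ steps.

```python
def smallest_factor_pair(n):
    """Factor n = p * q by trial division; takes about sqrt(n) steps."""
    i = 2
    while i * i <= n:
        if n % i == 0:
            return i, n // i
        i += 1
    return None  # n is prime or n < 2

print(23 * 31)                    # 713: one multiplication
print(smallest_factor_pair(713))  # (23, 31): found after 22 trial divisions
```

For the primes used in practice, the square root of $n$ is itself a number hundreds of digits long, which is why this loop never finishes.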

Totient Function

$\phi(n) = \phi(pq) = \phi(p)\,\phi(q) = (p - 1)(q - 1)$. This is because the totient of a prime number $p$ is simply $p - 1$, and the totient function is multiplicative for coprime arguments ($\phi(pq) = \phi(p)\,\phi(q)$).

Public Key

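The public key $e$ is a number that is randomly chosen between $1$ and $\phi(n)$, and has to satisfy $\gcd(e, \phi(n)) = 1$. We need $\gcd(e, \phi(n)) = 1$ in order for a multiplicative inverse (our secret key) to exist ($d \equiv e^{-1} \pmod{\phi(n)}$).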
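One way to pick such a number is to draw random candidates until one is coprime to $\phi(n)$. This is a minimal sketch, assuming the toy primes 23 and 31; the function name `choose_public_exponent` is hypothetical.

```python
import secrets
from math import gcd

def choose_public_exponent(phi_n):
    """Pick a random e with 1 < e < phi_n and gcd(e, phi_n) == 1."""
    while True:
        e = 2 + secrets.randbelow(phi_n - 2)  # uniform in 2..phi_n-1
        if gcd(e, phi_n) == 1:
            return e

phi_n = (23 - 1) * (31 - 1)  # 660, from the toy primes 23 and 31
e = choose_public_exponent(phi_n)
print(e, gcd(e, phi_n))  # a random e coprime to 660, followed by 1
```

Many real implementations skip the random draw and commonly fix $e = 65537$, only checking that it is coprime to $\phi(n)$.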

Secret Key

The secret key is calculated using the formula: $d \equiv e^{-1} \pmod{\phi(n)}$. It can be computed using the extended Euclidean algorithm ⁴ given the parameters $e$ and $\phi(n)$.
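A recursive version of the extended Euclidean algorithm makes the computation of $d$ explicit. This is a sketch, not the only way to write it; the wrapper name `private_exponent` is hypothetical, and the final `% phi_n` is needed because the algorithm may return a negative coefficient.

```python
def xgcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = xgcd(b, a % b)
    # g = b*x + (a - (a // b) * b)*y, regrouped in terms of a and b
    return g, y, x - (a // b) * y

def private_exponent(e, phi_n):
    g, x, _ = xgcd(e, phi_n)
    if g != 1:
        raise ValueError("e and phi_n must be coprime")
    return x % phi_n  # x may be negative; reduce it into 0..phi_n-1

d = private_exponent(7, 660)
print(d, (d * 7) % 660)  # 283 1
```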

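Why RSA satisfies the proof of correctness even though the key generation is based on $\phi(n)$ and not $n$

Given message $m$, public key $e$, and secret key $d$.

To encrypt a message we do:

$$c = m^e \bmod n \tag{1}$$

And to decrypt said encrypted message, we do:

$$m = c^d \bmod n \tag{2}$$

Our key generation uses the following formula:

$$de \equiv 1 \pmod{\phi(n)} \tag{3}$$

Which can be rewritten as (re: the general form $a \equiv b \pmod n \iff a = kn + b$):

$$de = k\phi(n) + 1 \tag{4}$$

Substituting equation (1) into equation (2), and using equation (4) for the exponent, yields:

$$c^d \equiv (m^e)^d \equiv m^{de} \equiv m^{k\phi(n) + 1} \pmod n \tag{5}$$

Which can be rewritten as (re: exponential rules):

$$m^{k\phi(n) + 1} \equiv \left(m^{\phi(n)}\right)^k \cdot m \pmod n \tag{6}$$

Rewriting using Euler's Theorem ($m^{\phi(n)} \equiv 1 \pmod n$, which holds when $m$ is coprime to $n$) gives us:

$$\left(m^{\phi(n)}\right)^k \cdot m \equiv 1^k \cdot m \equiv m \pmod n \tag{7}$$

And so, we have just proved that RSA satisfies the proof of correctness even though the key generation is based on $\phi(n)$ and not $n$.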

Conclusion

You don't need to be a math wizard to understand RSA ;-)

Example

"""
2019-01-06 Kendrick Tan
RSA

Rivest–Shamir–Adleman (RSA) is a process that allows
two parties to exchange secret information with
each other over an insecure line (e.g. the internet)

Party A sends Party B its public key.
Party B uses the public key to encrypt the message they want to send
Party A receives the encrypted message, decrypts it using their private key
"""

def gcd(a, b):
    """
    Greatest Common Divisor
    """
    m = min(a, b)

    for i in range(m, 0, -1):
        if a % i == 0 and b % i == 0:
            return i

    return 1

def xgcd(a, b):
    """
    Extended Euclidean Algorithm

    return (g, x, y) such that a*x + b*y = g = gcd(a, b)
    """
    x0, x1, y0, y1 = 0, 1, 1, 0
    while a != 0:
        q, b, a = b // a, a, b % a
        y0, y1 = y1, y0 - q * y1
        x0, x1 = x1, x0 - q * x1
    return b, x0, y0

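def encrypt(msg, e, n):
    # Encrypt each character's code point separately: c = m**e mod n
    return ''.join([chr(pow(ord(c), e, n)) for c in msg])

def decrypt(msg, d, n):
    # Decrypt each character's code point separately: m = c**d mod n
    return ''.join([chr(pow(ord(c), d, n)) for c in msg])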

## 1. Choose two distinct prime numbers p and q
p = 23
q = 31

## 2. Calculate n = p*q
n = p*q

## 3. Calculate the totient: phi(n) = (p - 1)*(q - 1)
phi_n = (p - 1) * (q - 1)

## 4.1 Choose integer e such that 1 < e < phi_n
e = 7
assert 1 < e < phi_n

## 4.2 Assert greatest-common-divisor (gcd) between e and phi_n = 1
## i.e. e and phi_n share no factors other than 1
assert gcd(e, phi_n) == 1

## 5. Compute d to satisfy the congruence relation d * e = 1 mod phi_n
## i.e. de = 1 + k * phi_n

## goal is to find d such that e*d = 1 mod phi_n
## the extended Euclidean algorithm calculates x and y such that ax + by = gcd(a, b)
## Let a = e, b = phi_n, therefore:
## gcd(e, phi_n) = 1
## is equal to
## e*x + phi_n*y = 1
## take mod phi_n
## (e*x + phi_n*y) mod phi_n = 1 mod phi_n
## = e*x = 1 mod phi_n
_, d, _ = xgcd(e, phi_n)
## xgcd may return a negative x; reduce it into the range 0..phi_n-1
d = d % phi_n

assert (d * e % phi_n) == 1

## 6. Encrypt a message using the public key (e)
## c = m**e % n
orig_msg = 'hello world'
enc_msg = encrypt(orig_msg, e, n)
assert orig_msg != enc_msg

## 7. Decrypt the message using the private key (d)
## m = c**d % n
dec_msg = decrypt(enc_msg, d, n)

assert orig_msg == dec_msg

print(f'original message: {orig_msg}')
print(f'encrypted message: {enc_msg}')
print(f'decrypted message: {dec_msg}')

"""
This works because we know that:

d*e = 1 mod phi_n
d*e = k*phi_n + 1

c = m**e mod n
m = c**d mod n (sub c)
  = (m**e mod n)**d mod n
  = m**(d * e) mod n
  = m**(k*phi_n + 1) mod n
  = (m**phi_n)**k * m**1 mod n # note: m**phi_n = 1 mod n
  = (1 mod n)**k * m**1 mod n
  = 1**k * m**1 mod n
  = m mod n

https://crypto.stackexchange.com/questions/1789/why-is-rsa-encryption-key-based-on-modulo-varphin-rather-than-modulo-n
"""

Kendrick