# RSA Explained (With Examples)

# Motivation

RSA (Rivest-Shamir-Adleman) is one of the first public-key cryptosystems and is widely used for secure data transmission. In such a cryptosystem, the encryption key is public and is different from the decryption key which is kept secret.

If I wanted to comprehend zero knowledge proofs, then understanding the grand-daddy of public-key cryptosystems is a must.

# Background Maths

## Exponential Rules _{[1]}

$$
\begin{align} \label{eq:exponent_rule}
g^{a-b} &= \dfrac{g^a}{g^b} \newline
g^{a+b} &= g^a g^b \newline
{(g^a)}^b &= g^{ab} \newline
(g^a\ mod\ p)^{b}\ mod\ p &= g^{ab}\ mod\ p
\end{align}
$$

## Division Theorem _{[2]}

The division theorem gives us a formal way to proof the equivalence relationship between the **divident, divisor, quotient,** and the **remainder**. i.e.

$$
\begin{align}
divident &\equiv quotient \cdot divisor + remainder \newline
n &\equiv k \cdot q + r
\end{align}
$$

**Example**: \( 9/2 = 4 \ remainder \ 1 \)

We can write the equation as \(9 = 2⋅4 + 1\).

## Modulo Arithmetic

\(a = b \ mod \ n\) states that \(a\) and \(b\) both have the same remainder after division with \(n\).

Modular arithemtic and the divisor theorem are closely related – say we have \(k = 2, q = 7, r = 3\), plugging those values into \(n = kq + r\) gives us \(n = 17\).

We can write that as \(17 = 3 \ mod \ 7 \), or \( 17 / 7 = 2 \ remainder \ 3 \).

We can also go backwards:

\(17 = 3 \ mod \ 7\)

\(17 = k⋅7 + 3\)

More generally:

$$
\begin{align}
n &= r \space mod \space q \newline \label{eq:modulo_division_format}
n &= k ⋅ q + r
\end{align}
$$

## Greatest Common Divisor (gcd)

The greatest common divisor between two numbers is the largest integer that will divide both numbers.

**Example:** gcd(3, 9) = 3.

If one of the numbers in the gcd is a prime number, then the gcd will *always* be 1.

## Multiplicative Inverse

A multiplicative inverse for a number \( x \), denoted by \( x^{-1} \), is a number when multiplied by \(x\) yields the multiplicative identity \(1\).

\(x⋅x^{-1} = 1\)

In modulo arithmetic, only numbers whose \(gcd(x, n) = 1\) has a multiplicative inverse, i.e there exists

$$
\begin{align}
x \cdot x^{-1} = 1\ mod\ n,\ \forall gcd(x, n) = 1 \label{eq:multiplicative_inverse_modulus}
\end{align}
$$

## Euler’s Totient Function

In number theory, Euler’s totient function counts the positive integers up to a given integer n that are relatively prime to n.

In other words, the totient function (often represented as *ϕ*) for number \(n\) calculates the number of integers between \(2\) and \(n\) whose gcd is equal to \(1\).

More concretely in code:

```
def totient(n):
total = 0
for i in range(2, n):
if gcd(i, n) == 1:
total = total + 1
return total
```

If \(n\) is a prime number, then \(ϕ(n) = n - 1\).

One important thing to note is that multiplication in the totient function is associative:

$$
\begin{align}
\phi(a \cdot b) &= \phi(a) \cdot \phi(b) \label{eq:totient_multiplication_associative}
\end{align}
$$

This will come into use when we calculate \(ϕ(n)\) where \(n = pq\).

## Euler’s Theorem _{[3]}

$$
\begin{align}
a^{ϕ(n)} ≡ 1 \space mod \space n \label{eq:eulers_theorem}
\end{align}
$$

The following formula above holds true if the gcd between \(n\) and \(a\) is \(1\) (a.k.a coprime).

# RSA

## Introduction

Say we have message *M*, with public key *P**k*, and secret key *S**k*, we can encrypt *M* with *P**k* as cipher *C* and decrypt *C* with *S**k*.

## Algorithm

- Generate two randomly large
**prime**numbers \(p\), and \(q\). - Calculate \(n = pq\).
- Calculate totient of n \(ϕ(n) = (p - 1)⋅(q - 1)\).
- Generate
*public key*\(e\) that satifies the two constraints:- \(3 < e < ϕ(n)\)
- \(gcd(e, ϕ(n)) = 1\)

- Calculate the multiplicative inverse of \(e, d\) (this will be the private key) such that \(ed = 1 \ mod \ ϕ(n)\)
- The generated public key is \( (e, n) \), and the generated private key is \( (d, n) \).
- Given message \(m\), \( m^{e} \ mod \ n \) yields the encrypted message, and \( m^{ed} \ mod \ n \) yields the decrypted message.
- This is because \(ed = 1 \ mod \ n \)
- Therefore, \(m^{ed} \ mod \ n = m^1 \ mod \ n \), giving us our original message

## Why RSA Works

### Calculating Modulus \( n \)

Our modulus, \( n \) is calculated by multiplying the two prime numbers \( p \) and \( q \). This step sort of acts like a one way function, easy to calculate \( n \) given \( p \) and \( q \), but hard to compute \( p \) and \( q \) given \( n \).

Whats scary to me is that computing the prime factors \( p \) and \( q \) is only considered a **hard enough** problem, meaning that if someone found out how to calculate \( p \) and \( q \) given \( n \) with polynomial complexity, all encryption as we know it (e.g. SSL) will break.

**This is essential to RSA’s security as given a composite number (\( n \)), it is considered a hard problem to determine the prime factors (\( p \), \( q \)).**

### Totient Function \(ϕ(n)\)

\( ϕ(n) = (p - 1)(q - 1) \ when \ p,q \ is \ prime \). This is because the totient of a prime number \(p\) is simply \( p - 1 \), and that multiplication is associative in the totient function (\ref{eq:totient_multiplication_associative}).

### Public Key \(e\)

Public Key \(e\) is a number that is randomly chosen between \( 3 < e < ϕ(n) \), and has to satisfy \( gcd(e, ϕ(n)) = 1 \). We need \( gcd(e, ϕ(n)) = 1 \) in order for a multiplicative inverse (our secret key) to exist (\ref{eq:multiplicative_inverse_modulus}).

### Secret Key \(d\)

The secret key \( d \) is calculated using the formula: \( d⋅e = 1 \ mod \ ϕ(n) \). This process can be calculated using the extended euclidean algorithm _{[4]} given parameters \( e \ and \ ϕ(n) \)

### Why RSA satifies proof of correctness even though the key generation is based on \(mod \ ϕ(n)\) and not \(mod \ n\)

Given message \(m\), public key \(e\), and secret key \(d\).

The encrypt a message we do:

$$
\begin{align}
m^{e} \space mod \space n
\end{align}
$$

And to decrypt said encrypted message, we do:

$$
\begin{align}
m^{ed} \space mod \space n \label{eq:decrypted_message_formula}
\end{align}
$$

Our key generation uses the following formula:

$$
\begin{align}
d \cdot e = 1 \space mod \space \phi(n)
\end{align}
$$

Which can be rewritten as (re: equation \ref{eq:modulo_division_format}):

$$
\begin{align}
d \cdot e = k \cdot \phi(n) + 1 \label{eq:modulus_division_substitute}
\end{align}
$$

Substituting equation \ref{eq:modulus_division_substitute} into equation \ref{eq:decrypted_message_formula} yields us:

$$
\begin{align}
m^{k \cdot \phi(n) + 1} \space & mod \space n
\end{align}
$$

Which can be rewritten as (re: exponential rules \ref{eq:exponent_rule}):

$$
\begin{align}
(m^{\phi(n)})^k \cdot m^k \space & mod \space n
\end{align}
$$

Rewriting using Euler’s Theorem (\(a^{ϕ(n)} = 1 \ mod \ n\), equation \ref{eq:eulers_theorem}) gives us:

$$
\begin{align}
(1 \space mod \space n)^k \cdot m^1 \space & mod \space n \newline
1^k \cdot m^1 \space & mod \space n \newline
\end{align}
$$

And so, we have just proved that RSA satisfies the proof of correctness even though the key generation is based on \(mod ϕ(n)\) and not \(n\)

$$
\begin{align}
\therefore m^{e \cdot d} \space mod \space n = m^1 \space mod \space n
\end{align}
$$

# Conclusion

You don’t need to be a math wizard to understand RSA ;-)

# Example

```
"""
2019-01-06 Kendrick Tan
RSA
Rivest–Shamir–Adleman (RSA) is a process that allows
two parties to exchange secret information within
each other over an insecure line (e.g. the internet)
Party A sends Party B it's public key.
Party B uses the public key to encrypt the message they want to send
Party A receives encrypted message, decrypts it using their private key
"""
def gcd(a, b):
"""
Greatest Common Divisor
"""
m = min(a, b)
for i in range(m, 0, -1):
if a % i == 0 and b % i == 0:
return i
return 1
def xgcd(a, b):
"""
Extended Euclidean Distance
return (g, x, y) such that a*x + b*y = g = gcd(x, y)
"""
x0, x1, y0, y1 = 0, 1, 1, 0
while a != 0:
q, b, a = b // a, a, b % a
y0, y1 = y1, y0 - q * y1
x0, x1 = x1, x0 - q * x1
return b, x0, y0
def encrypt(msg, e, n):
return ''.join([chr(ord(c)**e % n) for c in msg])
def decrypt(msg, d, n):
return ''.join([chr(ord(c)**d % n) for c in msg])
# 1. Choose two distinct prime numbers p and q
p = 23
q = 31
# 2. Calculate n = p*q
n = p*q
# 3. Calculate the totient: phi(n) = (p - 1)*(q - 1)
phi_n = (p - 1) * (q - 1)
# 4.1 Choose integer e such that 1 < e < phi_n
e = 7
assert 1 < e < phi_n
# 4.2 Assert greatest-common-divisor (gcd) between e and phi_n = 1
# i.e. e and phi_n share no factors other than 1
assert gcd(e, phi_n) == 1
# 5. Compure d to satisgy the congruence relation d * e = 1 mod phi_n
# i.e. de = 1 + k * phi_n
# goal is to find d such that e*d = 1 mod phi_n
# EED calculates x and y such that ax + by = gcd(a, b)
# Let a = e, b = phi_n, therefore:
# gcd(e, phi_n) = 1
# is equal to
# e*x + phi_n*y = 1
# take mod phi_n
# (e*x + phi_ny*y) mod phi_n = 1 mod phi_n
# = e*x = 1 mod phi_n
_, d, _ = xgcd(e, phi_n)
assert (d * e % phi_n) == 1
# 6. Encrypt a message using the public key (e)
# c = m**e % n
orig_msg = 'hello world'
enc_msg = encrypt(orig_msg, e, n)
assert orig_msg != enc_msg
# 7. Decrypt number using the private key (d)
# m = c**e % n
dec_msg = decrypt(enc_msg, d, n)
assert orig_msg == dec_msg
print(f'original message: {orig_msg}')
print(f'encrypted message: {enc_msg}')
print(f'decrypted message: {dec_msg}')
"""
This works because we know that:
d*e = 1 mod phi_n
d*e = k*phi_n + 1
c = m**e mod n
m = c**d mod n (sub c)
= (m**e mod n)**d mod n
= m**(d * e) mod n
= m**(k*phi_n + 1) mod n
= (m**(phi_n)**k)*m**1 mod n # note: m^(phi_n) = 1 mod n
= (1 mod n)**k * m**1 mod n
= 1**k * m**1 mod n
= m mod n
https://crypto.stackexchange.com/questions/1789/why-is-rsa-encryption-key-based-on-modulo-varphin-rather-than-modulo-n
"""
```