Number Theory - Part 1

05 Oct 2014

Number theory is an integral part of cryptography. It allows us to efficiently introduce a great deal of computational complexity into cryptosystems and is a really interesting subject to study.

In this part, I cover prime numbers, modular arithmetic, GCD, Modular Multiplicative Inverse, and Zn*. They’re all fundamental areas of asymmetric cryptography which you’ll see time and time again.

Recap: Prime Numbers

A prime number is an integer greater than 1, such that it is only divisible by the number 1 and itself. {2,3,5,7,…}

Why use prime numbers?

Prime numbers are important in cryptographic algorithms because they essentially allow us to create a one-way function. This is because the prime factorization of large numbers is currently too impractical to be a viable attack, and so adversaries are unable to calculate private keys.

Using RSA as an example, the product of two prime numbers is used as the public key (this is a composite value). Knowing the original prime numbers allows us to also create the private key, but since factoring the product of two large prime numbers (the public key) is too difficult, any adversary would be incapable* of finding out the original prime numbers from the public key, and would therefore be unable to calculate the private key. So, there’s a one-way function in that we can compute a public key from two prime numbers, but can’t compute the prime numbers from the public key. (You can find a lot more information about RSA on the Wikipedia page, or keep checking this blog for an RSA post.)

*There are various attacks on plain RSA which mean the cipher text can be decrypted revealing the plain text (e.g. it’s a deterministic cryptosystem; it doesn’t introduce any randomness, just jumbles everything up, allowing an adversary to perform a chosen plain text attack- in fact, plain RSA isn’t secure at all, and isn’t even a proper method of encryption!) http://en.wikipedia.org/wiki/RSA_(cryptosystem)#Attacks_against_plain_RSA

Modular Arithmetic

Or “remainder arithmetic” if you like, it means we can be certain that no matter what kind of mathematical operation we apply to a number, the result will always be within some range (0, … , n). Modular arithmetic is particularly useful because it introduces some difficult problems, i.e. discrete logarithm, and calculating the e^th root, where no efficient solutions exist. We can take advantage of these in order to prevent adversaries gaining access to the private key.

If we consider this operation:

5 mod 2

This translates to “what’s the remainder, when we divide 5 by 2?”

5 = 2 x 2 remainder 1, so 5 mod 2 = 1

You’ll see modular arithmetic a lot in public-key cryptography; it’s usually the case that you’re performing modulo on some very large prime number. For example, in the Diffie–Hellman key exchange, both parties perform modulo on a common large prime number, and eventually come to a shared secret key.

Note: DH is based on the discrete logarithm assumption.

Addition

5 + 3 (mod 7)
  = 8 (mod 7)
  = 8 = 7 * 1 remainder 1
  = 8 (mod 7) = 1

20 + 19 (mod 13)
  = 39 (mod 13)
  = 39 = 13 * 3 remainder 0
  = 39 (mod 13) = 0

Subtraction

With subtraction we go backwards. So just like with addition where we wrapped round to 0 after we reached n-1, when we go lower than 0, we wrap round to n-1.

10 – 13 (mod 7)
  = -3 (mod 7)
  = -3 -2 -1 0 1 2 3 4 5 6
  = 4
  = -3 (mod 7) = 4

5 – 13 (mod 3)
  = -8 (mod 3)
  = -8 -7 -6 -5 -4 -3 -2 -1 0 1 2
  = 1
  = -8 (mod 7) = 1

Greatest Common Divisor (GCD)

This shows us the highest number that any two numbers divide into.

For example,
gcd(5, 20) = 5
gcd(9, 21) = 3

Here’s something you’ll see crop up quite a bit:

for all x,y in Zn, there exists some values a and b in Zn such that
a*x + b*y = gcd(x, y)

This is important because it means that whatever the GCD of x and y is, we can find some values a and b, that when multiplied with x and y, give us the same answer. Being able to determine the values a and b is useful because it allows us to compute the modular multiplicative inverse.

Recap: Inverse

Let X be an integer. X’s inverse would be the value which when multiplied with X results in the value 1 (X/1).

10 * (1/10) = 10 * 0.1 = 1 // The inverse of 10 is 0.1
5 * (1/5) = 5 * 0.25 = 1 // The inverse of 5 is 0.25

Being able to calculate the inverse of a number is particularly useful when we want to invert some encrypted message to get back to its original value (as is the case with RSA).

Modular Multiplicative Inverse

Just like with normal inversion, modular inversion is the process of multiplying a value with another to result in the number 1, but now the result is based on the modulo operation

For example:
  x * y = 1 (mod N)
shows that y is the modular multiplicative inverse of x modulo N.

Here are some examples:

Note: The inverse of a value X is denoted as X^-1

13^-1 in Z₄ = 1
  13 * 1 (mod 4) = 1
23^-1 in Z₁₉ = 5
  23 * 5 (mod 19) = 1

How do I calculate the Modular Inverse?

So we know gcd(x,y) can be shown as the equation a*x + b*y, and we also know that the gcd can be used to determine the modular multiplicative inverse. In this section I cover how to resolve the modular inverse using some of Euclid’s wizardry (the extended Euclidean algorithm).

Note: for the gcd and egcd algorithms see the Code Examples section.

x has a multiplicative inverse modulo N, if and only if:

gcd(x, N) = 1

To find the modular inverse, we need to look at the a and b values…

gcd(x, N) = 1 = a*x + b*N

This shows us that a*x = 1 in Zn, and therefore that a = x^-1 in Zn

Code Examples

Here’s the GCD algorithm implemented in Python. Note that the standard GCD algorithm will only show us what the greatest common divisor is (it will not show us the values of a and b- for this we need the extended Euclidean algorithm).

# calculates the greatest common divisor
def gcd(x,y):
    if x < 0 or y < 0:
        raise ValueError("Invalid arguments: x and y must not be < 0");

    if y == 0:
        # can't mod 0
        return x;
    # gcd(x, y) = gcd(y, x mod y)
    return gcd(y, x % y);

Here’s the Extended Greatest Common Divisor algorithm in Python (or egcd). This gives us the greatest common divisor, but will also produce the a and b values, which enables us to determine the modular multiplicative inverse.

# calculates d,a,b where d = gcd(x,y) and a, b are used in a*x + b*y = d
def egcd(x, y):
    if x < 0 or y < 0:
        raise ValueError("Invalid arguments: x and y must not be < 0");

    if y == 0:
        # y == 0 is the base case! return d,a,b (x, 1, 0)
        return [x, 1, 0];
    # recursive call- where x is y, and y is x mod y
    div = egcd(y, x % y)
    # d doesn't change, it's just the value of gcd(x, y)
    d = div[0];
    # setting a to the previous b value
    a = div[2];
    # setting the new b to be a - (x/y*b)
    b = div[1] - (x / y * div[2]);
    return [d, a, b];

Zn*

Zn is used to represent the integers {0, … , n-1}.

Zn* is the set of invertible elements in Zn (those which are co-prime to n)

Recap:

R = set of real numbers

Q = set of rational numbers

N = set of natural numbers (includes 0, unless shown as N₁)

Z = set of integers

What are co-prime integers?

Two positive numbers x and y are co-prime (or relatively prime) if their highest common factor is 1, i.e. gcd(x,y) = 1

Are 3 and 6 co-prime? No, both 1 and 3 go into 3 and 6

Are 7 and 13 co-prime? Yes! The highest common factor is 1.

Since the invertible elements of Zn are all co-prime to n, we can fairly easily calculate Zn* using the gcd algorithm. Let’s look at some examples:

Z8 = {0, 1, 2, 3, 4, 5, 6, 7}
Z8* = {1, 3, 5, 7} because {2, 4, 6} all have common factors with 8 that are greater than 1

Z5 = {0, 1, 2, 3, 4}
Z5* = {1, 2, 3, 4}

Note: If n is prime, then Zn* is simply Zn \ {0}

Zn* is a Cyclic group

This means there exists some generator g, such that {1, g, g², g³, …, g^n-2} = Zp*

For example: Z5*  = {1, 3, 3², 3³} = {1, 3, 4, 2} the generator is 3.

Note: remember, when we’re raising the generator to values, we’re performing modular arithmetic- 3² is 9, but 9 mod 5 = 4, so we pop 4 into the result set.