# Intro to Cryptography

Adam Hassan / April 2021 (3117 Words, 18 Minutes)

I made this a while ago, so some details may be wrong

#### Transcript:

Intro to Cryptography

Number Systems

Try Yourself: 110001 = ? 000101 = ?

Number Systems

Try Yourself: 110001 = 49 000101 = 5

Encoding - Ascii and Hex Ascii consists of the (latin) alphabet and key special characters Each of these has a decimal mapping that can be turned into hex 65 = 0x41 = A 66 = 0x42 = B

Encoding - Bases Base32 - Consists of (A-Z), (2-7), and = for padding. Length multiples of 8 JFZW4J3UEBRXSYTFOIQHG3ZAMNXW63B7EBCXQ5DSME====== Base64 - Consists of (A-Z), (a-z), (0-9), +, /, and = for padding. Length multiples of 4 SXNuJ3QgY3liZXIgc28gY29vbD8= Base16, Base36, Base58, ,Base62, Base85, Base91, Base92 … also exist

These are all easily reversible, just like all encoding

Encoding - Big Integers Numeric way of representing bytes (each character is a byte) 0x41 = A (is one byte, or 8 bits) message: HELLO ALL ascii bytes: [72, 69, 76, 76, 79, 32, 65, 76, 76] hex bytes: [0x48, 0x45, 0x4c, 0x4c, 0x4f, 0x20, 0x41, 0x4c, 0x4c] base-16: 0x48454c4c4f20414c4c base-10: 1333159023296662031436 ⇐ This is the “BIG INTEGER” representation

Intro to Ciphers - Shift Cipher Characters in the message are shifted by n positions in the given alphabet If our alphabet is A-Z and our shift is 3 A -> D | B -> E | Z -> C If your alphabet is A-Z,a-z,0-9 and our shift is 3 A -> d | B -> e | Z -> c | z -> 2

Intro to Ciphers - Vigenère Cipher Upgrade from Shift cipher, because we can use text as a key instead of a number Plaintext: ATTACKATDAWN Key: LEMON Keystream: LEMONLEMONLE Ciphertext: LXFOPVEFRNHR

Intro to Ciphers - Vigenère Cipher Upgrade from Shift cipher, because we can use text as a key instead of a number Plaintext: ATTACKATDAWN Key: LEMON Keystream: LEMONLEMONLE Ciphertext: LXFOPVEFRNHR

Intro to Ciphers - Vigenère Cipher Upgrade from Shift cipher, because we can use text as a key instead of a number Plaintext: ATTACKATDAWN Key: LEMON Keystream: LEMONLEMONLE Ciphertext: LXFOPVEFRNHR

What’s the problem with ciphers? Very easy to crack - computers can do millions of instructions per second Only works on text - how do we encrypt an image? Keys can be stolen - would make encoding useless

Breaking ciphers - Brute Forcing If you have a 26 character alphabet, you only need to try 25 keys in a shift cipher If key of Vigenere cipher is 6 characters long, a message using a 26 character alphabet only takes 26^6 instructions to break (small for a computer)

Breaking Ciphers - Frequency Analysis Naturally, some characters in the alphabet show up more than others. We can look for a distribution of letters that matches the correct English Frequency to figure out which decoded messages are most likely to be correct

Solving Ciphers Substitution Ciphers: https://www.quipqiup.com/ Obscure Ciphers: https://dcode.fr/ Cipher Detector: https://www.boxentriq.com/code-breaking/cipher-identifier

Types of Encryption - Private key (symmetric) Same key is used for encrypting and decrypting

Can be insecure, because ␋key might be stolen

Types of Encryption - Public key (asymmetric) Different keys used for encrypting and decrypting

Has a public and private key

Asymmetric encryption Cryptography conventions: Alice: User A Bob: User B Eve (or Mallory): Eavesdropper (or Malicious)

To send to Bob: Alice encrypts data with Bob’s public key Bob decrypts data with his private key

Types of Ciphers - Block Ciphers Encrypt chunks (typically 64 bits) of plaintext at a time Simpler but slower and larger More secure (in theory)

Types of Ciphers - Stream Ciphers Encrypt each byte of plaintext at a time Faster, smaller and more complex Used most often to trade versatility and security for speed, especially in low-power environments (less RAM, CPU needs) Best used when data length is unknown or continuous Easily reversed

Has an Initialization Vector (IV) Should be different every time you start a new stream Simple example of stream cipher symmetric

Encryption Algorithms - RSA (Rivest-Shamir-Adleman) One of the original public-key encryption systems.␋Depends on the idea that large prime numbers (often 2^2048 bits) are difficult to factor

Encryption Algorithms - DES (Data Encryption System) Encrypts the plaintext with a 64-bit key Key is permuted to get a new 56-bit key This key is used to generate 16 48-bit subkeys Each subkey is a left-shift of the previous subkey Message is broken into 64-bit blocks, each of which is permuted Overall, is a symmetric-key block cipher

Encryption Algorithms - 3DES DES was insecure. In 1998, a message could be cracked in 3 days Now, it could be cracked in hours.

Is literally just the DES algorithm 3 times. Uses 3 different keys. (Effective key-length of 56*3 = 168 bits)␋␋2DES is the same as 1DES because of the Meet-in-the-Middle attack. Example Attack Cool Video about Breaking 2DES

AES (Advanced Encryption Standard) Upgrade of 3DES. If we could crack 1DES, it won’t be long until 3DES is cracked. Symmetric block cipher 128-bit block size 32-bit multiples for key length (Main reason it was better than DES) Now, you’ve probably seen 2048-bit keys So common that some instruction sets include special instructions for AES operations

AES - ECB (Electronic Code Book) Plaintext is divided into blocks of 64 bits For each block, the same key is used for encryption Only suitable for small messages. Why? If a section of plaintext repeats, the encrypted section will also repeat

Problem with ECB?

AES - CBC (Cipher Block Chaining)

Previous block is used to encrypt next block Has an IV (Initialization vector) to encrypt initial block (same length as block size)

Crypto Math - XOR ( ⊕ )
XOR is the exclusive or.

If one input is true, but not both

a b a and b a or b a xor b True True True True False True False False True True False False False False False False True False True True

Crypto Math - XOR ( ⊕ ) We can also apply this to numbers This is the easiest using multiple bits (called a bit string) 1 is True and 0 is False & represents an AND | represents an OR ^ represents an XOR 00100 & 11101 = 00100

00100 | 11101 = 11101 |

00100 ^ 11101 = 11001

TRY YOURSELF:

11001 & 00001 = ?

11001 | 00001 = ? |

11001 ^ 00001 = ?

Crypto Math - XOR ( ⊕ ) We can also apply this to numbers This is the easiest using multiple bits (called a bit string) 1 is True and 0 is False & represents an AND | represents an OR ^ represents an XOR 00100 & 11101 = 00100

00100 | 11101 = 11101 |

00100 ^ 11101 = 11001

TRY YOURSELF:

11001 & 00001 = 00001

11001 | 00001 = 11001 |

11001 ^ 00001 = 11000

Crypto Math - XOR ( ⊕ ) Properties Commutative: A ⊕ B = B ⊕ A Associative: A ⊕ (B ⊕ C) = (A ⊕ B) ⊕ C Identity: A ⊕ 0 = A Self-Inverse: A ⊕ A = 0

Crypto Math - Modular Arithmetic Often described as clock math For (12 hour) clocks, we do MOD 12

Works by getting the remainder in a division. 15 MOD 6 = 3 15 / 6 = 2 remainder 3

21 MOD 4 = 1 21 / 4 = 5 remainder 1 TRY YOURSELF: 36 MOD 13 = ?

Crypto Math - Modular Arithmetic Often described as clock math For (12 hour) clocks, we do MOD 12

Works by getting the remainder in a division. 15 MOD 6 = 3 15 / 6 = 2 remainder 3

21 MOD 4 = 1 21 / 4 = 5 remainder 1 TRY YOURSELF: 36 MOD 13 = 10

Throwback Math - GCD and LCM GCD = product of common prime factors LCM = product of highest prime factors 4200 = 2^3 * 3^1 * 5^2 * 7 180 = 2^2 * 3^2 * 5^1 GCD(4200,180) = 2^2 * 3^1 * 5^1 = 60 LCM(600,180) = 2^3 * 3^2 * 5^2 * 7 = 12600 A*B = LCM(A,B) * GCD(A,B)

Throwback Math - GCD and LCM (v2.0) Euclidean Algorithm - Basic def gcd(a, b): if a == 0 return b return gcd(b%a, a) # % is the same as MOD (from earlier)

gcd(600,180)

600 / 180 = 3 r 60 180 / 60 = 3 r 0 Because the remainder is 0, 60 is the gcd

Throwback Math - GCD and LCM (v2.0) Euclidean Algorithm - Extended def egcd(a, b): if a == 0: return (b, 0, 1) g, y, x = egcd(b%a,a) return (g, x - (b//a) * y, y)

Finds values x,y such that gcd(a,b) = ax + by

gcd(123,45) 123 / 45 = 2 r 33 45 / 33 = 1 r 12 33 / 12 = 2 r 9 12 / 9 = 1 r 3 9 / 3 = 3 r 0 Because the remainder is 0, 3 is the gcd

ecgd(123,45) = 3 = 123x + 45y Rewrite original equations and substitute until you get your answer. 123 / 45 = 2 r 33 ⇔ 123 = 45 * 2 + 33 Rewrite: 33 = 123 - 45 * 2

12 = 45 - 33 * 1
Substitute: 12 = 45 - (123 - 45 *2) * 1
12 = 45*3 - 123
Continue until you get 3 = 123(-4) + 45(11)
egcd(123,45) = 3, -4, 11

We Love Python

XOR Cryptography Many cryptosystems use XORs. This can sometimes be broken by bruteforce with if part of the plaintext is known.

RSA Math d = private key (private exponent) e = public key (public exponent) m = plaintext message c = ciphertext n = modulus Encryption

Very elegant cryptosystem compared to AES, but much slower

RSA Math d = 11 e = 5 m = B = 2 c = 4 = D n = 14 Encryption: 25(mod 14) = 32(mod 14) = 4 = D

Decryption: 411(mod 14) = 4,194,304(mod 14) = 2 = B

RSA Math - TJCTF d = 10924704402434431641693027830372982936967195199035477485425 e = 65537 m = 39614169128650010573550764077907923138941 = “tjctf{RSA_2_3asy}” c = 29031324384546867512310480993891916222287719490566042302485 n = 379557705825593928168388035830440307401877224401739990998883 Encryption: me(mod n) = 29031324384546867512310480993891916222287719490566042302485 Which equals (in bytes) b’\x04\x9f\xfd+M`\xa4\x7f>M,gn\xd5\x80R \xf4r\xd5\xdd\x1c\x0e8\x15’ Decryption: cd(mod n) = 39614169128650010573550764077907923138941 Which equals b“tjctf{RSA_2_3asy}”

Chinese Remainder Theorem (CRT) Public Key files come with (n, e) If we can get d, we can get the plaintext If we can get p,q (factor n), we can get d

RSA has a lot of weaknesses if not done correctly: phi is dependent on all primes that make up n

Breaking RSA - Small Primes Websites to get factors: http://factordb.com/ https://www.alpertron.com.ar/ECM.HTM␋

Breaking RSA - Small Primes RSA can have as many primes as you want Size of primes matter. Not of n.

Cheaper & effective to have 2 big primes.

Breaking RSA - Weak Primes - Smooth Primes Not all Primes are created equal: Smooth Primes: https://en.wikipedia.org/wiki/Smooth_number

␋Computationally cheaper to generate, as checking “smoothness” is expensive. Easy to Factor

Breaking RSA - Weak Primes - Smooth Primes Based on Pollard’s factorization method, which makes products of primes easy to factor if they are (B)smooth This is the case if p-1 | B! and q - 1 has a factor > B

Breaking RSA - Weak Primes - Smooth Primes Not all Primes are created equal: Smooth Primes: https://en.wikipedia.org/wiki/Smooth_number

␋Computationally cheaper to generate, as checking “smoothness” is expensive. Easy to Factor

Breaking RSA - Weak Primes - Similar Primes Based on Fermat’s factorization theorem, which states that any number can be represented as a difference of squares. If primes are too close to each other (or one based on the other), you can crack it. Proof: https://en.wikipedia.org/wiki/Fermat%27s_factorization_method

Breaking RSA - Small public key Not all Exponents are created equal: Making an exponent cheaper is computationally cheaper, but unsafe. Smallest possible e is 3

Breaking RSA - Small public key Coppersmith attack Used for when exponent is very small (e <= 5) Early variants of RSA proposed exponents as small as 3. We now use 2^16 + 1 = 65537 Proof: https://web.eecs.umich.edu/~cpeikert/lic13/lec04.pdf

Breaking RSA - Small private key (or large public key) Not all Exponents are created equal: Making an exponent cheaper is computationally cheaper, but unsafe. Smallest possible e is 3

Breaking RSA - Small private key (or large public key)

Dachshund / Wiener attack Used for when private exponent is very small: d < ⅓N^(¼) Proof: https://sagi.io/crypto-classics-wieners-rsa-attack/

Diffie Hellman Key Exchange Public-key protocol to exchange keys Distribute shared key without man-in-the-middle figuring out keys Use shared key for symmetric key cryptography Used in TLS handshake for communication between client and server

Diffie Hellman - Secret Key Sharing g is a generator Usually shared at the start of the handshake p is a large prime number a and b are private values, one for each of the devices These are not meant to be shared A and B are public values a & g are combined and b & g are combined. These combined values are typically exchanged between devices.

Diffie Hellman

Diffie Hellman To determine the secret key, g^(ab), the attacker must determine a or b.

Note that g^(ab) = g^a^b

Diffie Hellman - Discrete Log Problem http://ramanujan.math.trinity.edu/rdaileda/teach/s18/m3341/lectures/discrete_log.pdf

Diffie Hellman - Discrete Log Problem

Diffie Hellman - Discrete Log Problem Because the plot is so random, log_g(a) is considered hard to compute.

We can bruteforce log_g(a) by computing: g^1, g^2, g^2… until it equals a

There is no know algorithm that is more efficient…

Diffie Hellman - Discrete Log Problem Because the plot is so random, log_g(a) is considered hard to compute.

We can bruteforce log_g(a) by computing: g^1, g^2, g^2… until it equals a

There is no know algorithm that is more efficient…␋for the general case

Diffie Hellman - Bad Generator - Too small If the generator is small enough, we can solve the problem in O(sqrt(n)) time, where n is the order of the generator (g)

This is much better than brute force Baby Step Giant Step Algorithm

Diffie Hellman - Bad Generator - Nonprime If the generator is non-prime (or smooth), we can solve the problem much faster than baby step does.

Pohlig Hellman algorithm

An Aside - Sagemath Python Library to solve math things. Has built-in functions for solving discrete_log, baby step, pollard…

https://doc.sagemath.org/

https://sagecell.sagemath.org/␋

An Aside - Sagemath Also has its own interpreter Almost the same as python Can run online: sagecell

https://doc.sagemath.org/

https://sagecell.sagemath.org/␋

Elliptic Curve Cryptography (ECC) Public-Key cryptography also based on solving the discrete-log problem, but much cheaper Requires shorter key length for same strength␋␋ Secure Elliptic Curves have a lot of restrictions:

Elliptic Curve Cryptography (ECC) Curves follow the equation: y^2 = axy + by = x^3 + cx^2 + dx + e␋Coefficients determine security https://www.desmos.com/calculator/dipjuaeocw

Elliptic Curve Cryptography (ECC) https://pycryptodome.readthedocs.io/en/latest/src/public_key/ecc.html␋

Elliptic Curve Cryptography (ECC) Pick 2 points: A,B Use the points to make a line Find the intersection with the curve Flip the point on the other side. One Cycle:

Elliptic Curve Cryptography (ECC) Pick 1 points: P Find the tangent line at the point Find the intersection with the curve Flip the point on the other side. One Cycle: (More Efficient)

Elliptic Curve Cryptography (ECC) Repeat until you get to N iterations

N is your private key and NP is your public key.

Here, the public key is 3P

Elliptic Curve Cryptography (ECC) This is even harder, because you don’t know where to start. ␋␋(A + B) + C = A + (B + C)␋␋This is why we can have a smaller key for the same security

Elliptic Curve Cryptography (ECC) Same idea as modular arithmetic.

Elliptic Curve Cryptography (ECC) For Alice (A) and Bob (B) to share a secret key, they can exchange their public keys (NP and MP respectively) They both agree to start at a point P on the curve, and then use the other’s public key to reach the final destination

Alice uses Bob’s MP: MP + MP + … + MP = N x MP = NMP

Bob uses Alice’s NP: NP + NP + … + NP = M x NP = NMP

Shared Secret Key = NMP

Elliptic-Curve Diffie–Hellman

ECDH is nearly identical to regular DHKE
Alice and Bob publicly agree on:
an elliptic curve E, defined by a and b
a modulus p
a generator point G on E
Alice and Bob generate secret values A and B, then publicly share A*G and B*G
Alice and Bob can now independently calculate the shared secret A*B*G
the x-coordinate of A*B*G can be put into a Key-Derivation Function
the y-coordinate is entirely determined by the x-coordinate, so we don’t lose any security by throwing it out

Why Elliptic Curves? solving ECDLP is much harder than DLP, so a smaller key provides the same amount of security the difference is small, but significant embedded applications a server that has to perform thousands of handshakes every second Symmetric RSA/DHKE Elliptic Curve 80 1024 160 112 2048 224 128 3072 256 192 7680 384 256 15360 512 NIST recommended key sizes, in bits

Elliptic Curve Cryptography (ECC) - Small N We know the elliptic curve equation and the generator base element We also know the public key and the ciphertext

If the modulus is too small, we can

Elliptic Curve Cryptography (ECC) - Small g We know the elliptic curve equation and the generator base element We also know the public key and the ciphertext

If the generator is too small, we can use pohlig hellman (just like we did before with diffie hellman)

Elliptic Curve Cryptography (ECC) - CTF Challenge p1

Elliptic Curve Cryptography (ECC) - CTF Challenge p2 Now that we have B, we can use pohlig hellman to calculate the discrete log of all possible prime factors.

Then use the CRT to find n, which we can use to get the private key

Elliptic Curve Cryptography (ECC) - Small E If the elliptic curve order is too small, we can use a MOV attack MOV: Menezes, Okamoto, Vanstone

For small elliptic curves, we can map the points onto a different curve where discrete log calculations are easier

A New Age - Quantum Computing All of these algorithms work fine with classical computing techniques, but what about quantum computing?

Shor’s Algorithm: Can solve large factorization problems and discrete log in polynomial time, while it generally is exponential time.

The solution? Lattice based cryptography

Lattice Based Cryptography Has many different implementations. Based on the idea that large lattices (matrices) are hard to solve. One iteration is the shortest vector problem:

Shortest Independent Vector Problem (SIVP) NP-hard problem that doesn’t have a polynomial solution for quantum or classical computing.

Lattice Based Cryptography Lattice Basis Vectors We want to analyze the lattice and reduce as easily as possible. Thus, we use the LLL algorithm to change the vector the right to the one on the left (short and right angle) Lenstra–Lenstra–Lovász lattice basis reduction algorithm https://en.wikipedia.org/wiki/Lenstra%E2%80%93Lenstra%E2%80%93Lov%C3%A1sz_lattice_basis_reduction_algorithm

Lattice Based Cryptography - Solving Use LLL to find better basis vector Find shortest vector in basis Make a lattice using original vectors and the shortest vector in your reduced basis Use the lattice to solve for all unknown variables in equation Scale solutions by varying the bit-length of variables Sometimes given. Sometimes you have to estimate / bruteforce

Moving Forward:

https://cryptohack.org/ https://cryptopals.com/

https://pycryptodome.readthedocs.io/en/latest/ https://doc.sagemath.org/html/en/tutorial/ https://github.com/Adamkadaban/CTFs