Encryption and Cryptography • Jonathan Cook

Why do you trust a website when you give it your credit card information to purchase something? Why do you trust that no one in between you will be listening and be able to steal your credit card? Is it a matter of blind trust, or are there good technical reasons?

I hope you believe that there are good technical reasons – because there are!

Cryptography has long been used to deliver secret messages between people, governments, and armies. And there have always been others trying to “crack the codes” to read those secret messages – indeed, the cracking of German encryption techniques is often credited with turning the tide of World War II! See Wikipedia:Enigma_machine

Today, the Internet uses a variety of cryptographic techniques to establish security and trust for commercial and other Internet data transactions.

Goals of Cryptography

You might think that the only goal of cryptography is to keep data private – but there are other reasons too! The four main reasons for encryption techniques are:

Confidentiality (privacy) – keeping data private, so that only authorized viewers can see it
Data Integrity – ensuring that the data that was sent is the data that was received, that no unauthorized changes have been made
Authentication – being able to know for sure that who claims to have sent the data really is that person or entity
Non-repudiation – the sender of the data cannot claim that they did not send the data

Some might also add Access Control (or Authorization), but this uses Authentication, and might add Anonymity, which is a flip side to Authentication.

Simple Encryption: A Substitution Cipher

In class we spent lots of time on a simple substitution cipher. You can read more here: Wikipedia:Substitution_cipher
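As a reminder of how a substitution cipher works, here is a minimal Python sketch. The key (the alphabet rearranged in keyboard order) is just an assumption for illustration; any rearrangement of the 26 letters would do.

```python
import string

ALPHABET = string.ascii_lowercase
# Hypothetical key: every letter is replaced by the letter in the same
# position of this rearranged alphabet (keyboard order, for illustration).
KEY = "qwertyuiopasdfghjklzxcvbnm"

ENCRYPT_TABLE = str.maketrans(ALPHABET, KEY)
DECRYPT_TABLE = str.maketrans(KEY, ALPHABET)


def encrypt(plaintext: str) -> str:
    """Replace each letter using the key; spaces and punctuation pass through."""
    return plaintext.lower().translate(ENCRYPT_TABLE)


def decrypt(ciphertext: str) -> str:
    """Undo the substitution by applying the reverse mapping."""
    return ciphertext.translate(DECRYPT_TABLE)


secret = encrypt("attack at dawn")
print(secret)           # qzzqea qz rqvf
print(decrypt(secret))  # attack at dawn
```

Notice that the same letter always maps to the same replacement, which is exactly why counting letter frequencies is enough to crack this kind of cipher.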

Modern Day Encryption Algorithms

Mathematicians and computer scientists have worked very hard in recent decades to create encryption schemes that “cannot” be cracked. Modern encryption algorithms are heavily mathematical, and many of the public-key ones rely on the belief that factoring very large numbers (especially numbers that are the product of two large prime numbers) is very difficult. Yes, we said “belief”, because no one has been able to prove it! If you can prove it you will become world famous and will very likely win a major prize in mathematics or computer science! See Wikipedia:Integer_factorization and Wikipedia:CS_Unsolved_Problems
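To get a feel for the imbalance, here is a small Python sketch: multiplying two primes is a single step, while finding them again by the simplest method (trial division) takes many steps. The primes 1009 and 1013 are assumptions chosen for illustration and are far smaller than anything used in practice; real attackers also use much cleverer methods than trial division.

```python
def smallest_factor(n):
    """Return (factor, steps): the smallest factor of n found by trial division,
    and how many candidates had to be tried."""
    steps = 0
    candidate = 2
    while candidate * candidate <= n:
        steps += 1
        if n % candidate == 0:
            return candidate, steps
        candidate += 1
    return n, steps  # no factor found, so n itself is prime


p, q = 1009, 1013   # two small primes, for illustration only
n = p * q           # the easy direction: one multiplication

factor, steps = smallest_factor(n)  # the hard direction: many divisions
print(n, "=", factor, "x", n // factor, "after", steps, "trial divisions")
```

With trial division the work grows roughly with the square root of n, so every extra digit in the primes multiplies the effort, while the multiplication stays cheap.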

The neat thing is that even if everyone knows the actual algorithms used to perform the encryption, the message still cannot be recovered unless one has the keys. Before the mid 1990s governments were trying to keep the algorithms secret, but they found it to be impossible and so essentially everyone shares the same algorithms now.

There are basically two kinds of encryption algorithms today, symmetric key (or secret key) encryption, and public-key encryption.

In symmetric key encryption, the sender and receiver must share the same key, and keep it secret. It is used both for encryption and decryption. If anyone finds out the secret key, they can read the messages and create fake encrypted messages. Keeping many keys secret seems like a problem, and it is, but symmetric key encryption is still very valuable because it is much faster (computationally easier) than public-key encryption.
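The key idea, one shared key used in both directions, can be sketched in a few lines of Python. The XOR scheme below is a toy chosen for brevity, not a real algorithm such as AES, and the message is made up for illustration.

```python
import secrets


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key (repeating the key if needed).
    Applying the same function twice with the same key restores the data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


shared_key = secrets.token_bytes(16)   # both sides must know this secret

message = b"card number 1234"
ciphertext = xor_with_key(message, shared_key)    # sender encrypts
recovered = xor_with_key(ciphertext, shared_key)  # receiver decrypts

print(ciphertext.hex())
print(recovered)
```

Encryption and decryption are the very same cheap operation here, which hints at why symmetric schemes are fast, and also why anyone who learns the key can both read and forge messages.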

Public-key encryption is fascinating and wonderful, and is the backbone of how we accomplish secure Internet communication. In public-key encryption, each person or entity has a key pair: a public key and a private key. The cool thing is that messages encrypted with the public key can only be decrypted with the private key, and vice versa! Just as the name implies, the private key is kept private, or secret, and not shared with anyone, but the public key is tossed out into the public for all the world to see.
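Here is a minimal sketch of a key pair in Python, using the RSA scheme with tiny textbook primes. The primes 61 and 53 and the exponent 17 are assumptions for illustration; real keys use primes hundreds of digits long, plus padding that this sketch leaves out.

```python
# Toy RSA key pair (needs Python 3.8+ for the modular inverse via pow).
p, q = 61, 53                 # secret primes, absurdly small for illustration
n = p * q                     # 3233, part of both keys
phi = (p - 1) * (q - 1)       # 3120, only computable if you know p and q

e = 17                        # public exponent: public key is (e, n)
d = pow(e, -1, phi)           # private exponent: private key is (d, n)

message = 65                  # messages are numbers smaller than n

# Encrypt with the public key, decrypt with the private key.
ciphertext = pow(message, e, n)
recovered = pow(ciphertext, d, n)
print(ciphertext, recovered)  # 2790 65

# And vice versa: encrypt with the private key, decrypt with the public key.
reverse = pow(pow(message, d, n), e, n)
print(reverse)                # 65
```

Anyone can see e and n, but working out d requires phi, and phi requires the two secret primes.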

Using Public-key Encryption

Ok, so how do we accomplish the four goals of cryptography with public-key encryption?

Confidentiality: I want only you to read the message I send you. So, I encrypt the message with your public key, and send it to you. Since it can only be decrypted with your private key, and you keep that secret, then only you will be able to read it.
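Data Integrity: if a message comes out unjumbled and readable, that is a good sign the data is intact, since the encryption algorithms “mix up” the data and corruption will usually make the message unreadable. Real systems do not rely on that alone, though: they also attach a cryptographic hash or message authentication code to the message so that any change is detected.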
Authentication: I want to prove to you I am the sender. So, I encrypt the message with my private key, and send it to you. You try to decrypt it with my public key. If it works (a readable message results), then you know I sent it because it must have been encrypted with my private key.
Non-repudiation: If authentication worked, then no one else could have sent it because no other private key could have encrypted it.
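The authentication step is what is usually called a digital signature. Below is a hedged Python sketch: the sender encrypts a hash (a short fingerprint) of the message with the private key, and anyone can check it with the public key. The key is built from two known Mersenne primes purely for illustration and is still far too small for real use; the message text is made up.

```python
import hashlib

# Toy sender key pair (needs Python 3.8+ for the modular inverse via pow).
P, Q = 2**61 - 1, 2**89 - 1          # two known primes, illustration only
N = P * Q
E = 65537                            # public key is (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))    # private key is (D, N)


def digest(message: bytes) -> int:
    """Fingerprint of the message as a number smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the fingerprint with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Receiver: decrypt with the public key and compare fingerprints."""
    return pow(signature, E, N) == digest(message)


message = b"pay Bob 10 dollars"
signature = sign(message)

print(verify(message, signature))                  # True
print(verify(b"pay Bob 1000 dollars", signature))  # False
```

The second check fails because the message was altered after signing, so the same mechanism gives a data integrity check along with authentication.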

Ok, these are nice, but typically we want all four together – so how to do that? Easy – just do both processes! For totally secure communication, first I encrypt the message with my private key, and then I encrypt that with your public key and send it to you – voila! we have confidentiality and authentication (and integrity and non-repudiation).
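The two-step process can be sketched in Python with two toy RSA key pairs. All primes here are tiny values assumed for illustration, and the message is a small number rather than real text; the sender's modulus is deliberately the smaller one so that the signed value still fits inside the receiver's key.

```python
def make_keypair(p, q, e=17):
    """Build a toy RSA key pair from two primes (Python 3.8+)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)            # (public key, private key)


def apply_key(value, key):
    """Encrypting and decrypting are the same operation with different keys."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


my_public, my_private = make_keypair(61, 53)       # sender, n = 3233
your_public, your_private = make_keypair(89, 97)   # receiver, n = 8633

message = 1234   # must be smaller than the sender's n

# Sender: first my private key (authentication), then your public key
# (confidentiality).
signed = apply_key(message, my_private)
sealed = apply_key(signed, your_public)

# Receiver: undo the steps in the opposite order.
unsealed = apply_key(sealed, your_private)
recovered = apply_key(unsealed, my_public)

print(recovered)  # 1234
```

Only the receiver can remove the outer layer, and the inner layer only opens with the sender's public key, so both properties hold at once.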

Trusting Public Keys

The final issue is, how do I get your public key and how do I trust that it really is your public key, and not some hacker pretending to be you? This is the main problem with encryption that has been around since it first started being used: how to share keys?

Remember, our goal is to have trusted, private communication, but in order to have that we have to communicate our keys (in any encryption scheme), and we need to do so in a trusted, private way! This is a chicken and egg problem!

In the old days, people would personally deliver the keys, or have a trusted agent deliver them. Even today someone might FedEx you a secret key!

With public key encryption, the problem is slightly different but still similar: the public key is easily available (it’s public!), but trusting that it’s the right public key is still the same old issue. So what’s the answer?

The best answer we have come up with, and the one the Internet uses when you trust Amazon, is to have a few trusted entities called Certificate Authorities, or CAs, that we can ask to verify public keys. A company that wants to do secure Internet communication typically creates its own public/private key pair, registers with a CA, and verifies its identity in an off-line process (the chicken/egg problem). The CA then issues the company a certificate, which states the company's identity and contains its public key; the private key stays with the company and is never shared.

Your web browser comes preloaded with the trusted CAs and their (also trusted) public keys. These are built in and change only when the browser or operating system is updated. Then, whenever you surf over to a company’s secure (HTTPS) web site, the company gives your web browser their certificate. Now, remember how we do authentication! So how do we trust the certificate? Well, it contains the name of the CA who issued it, and then it contains some data that is encrypted with the CA’s private key (the CA’s signature). Since our browser has the CA’s public key built in, it decrypts this data, and if the decryption succeeds, then we know that the CA issued the certificate, and then we can trust the company’s public key on the certificate. Yippee! We have secure, trusted Internet communication!
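The certificate check can be sketched in Python as follows. Everything named here is hypothetical: the CA key is a toy built from two known Mersenne primes, and the certificate is a plain string rather than the real X.509 format that browsers use.

```python
import hashlib

# Toy CA key pair (needs Python 3.8+); real CA keys are vastly larger.
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N = CA_P * CA_Q
CA_E = 65537                                    # CA public key, built into the browser
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))   # CA private key, known only to the CA


def fingerprint(cert_body: str) -> int:
    """Hash of the certificate contents as a number smaller than CA_N."""
    raw = hashlib.sha256(cert_body.encode()).digest()
    return int.from_bytes(raw, "big") % CA_N


def ca_sign(cert_body: str) -> int:
    """Done once by the CA after checking the company's identity."""
    return pow(fingerprint(cert_body), CA_D, CA_N)


def browser_verify(cert_body: str, signature: int) -> bool:
    """Done by the browser using only the CA's public key."""
    return pow(signature, CA_E, CA_N) == fingerprint(cert_body)


# Hypothetical certificate: who it is for, their public key, and the issuer.
cert_body = "subject=shop.example;public_key=(17, 3233);issuer=Example CA"
signature = ca_sign(cert_body)

print(browser_verify(cert_body, signature))   # True

# A hacker who swaps in their own public key cannot produce a valid signature.
forged = cert_body.replace("3233", "8633")
print(browser_verify(forged, signature))      # False
```

The browser never needs the CA's private key, only the public key it shipped with, and that is what breaks the chicken and egg cycle.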

So when your browser pops up a message saying “Warning! This certificate cannot be verified – do you want to continue?”, what it is saying is that either 1) the certificate is from an unknown CA and you should not trust it; or 2) the certificate is from a known CA but it has expired and you should not trust it. Very often many of us ignore this message and continue, but just be sure you know what you are doing, and be careful!

Making It All Efficient

So public key encryption makes HTTPS work, but the public key encryption algorithms are fairly expensive (computationally hard), so we would prefer not to use them on lots of data. So what do we do?

In HTTPS (and TLS, Transport Layer Security), once the sender and receiver have properly identified each other with public key encryption, the first thing they do is to generate a secret key and share it using public key encryption. This secret key will then be used in symmetric key encryption for the rest of the connection, and when the connection is done it is thrown away.

So the majority of data for the HTTPS connection is encrypted efficiently using symmetric key encryption, but we don’t have to worry about long-term maintenance of secret keys because we use public key encryption to authenticate and kick off the communication.
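Here is a hedged Python sketch of that two-phase idea: a throwaway secret key is shared using (toy) public key encryption, and the bulk data is then protected with a cheap symmetric scheme. The RSA numbers and the hash-based keystream are assumptions for illustration; real TLS uses different, carefully designed algorithms for both phases.

```python
import hashlib
import secrets

# Server's toy RSA key pair (tiny textbook values, illustration only).
SERVER_E, SERVER_D, SERVER_N = 17, 2753, 3233


def symmetric_xor(data: bytes, session_key: int) -> bytes:
    """Toy symmetric cipher: XOR the data with a keystream derived from the
    session key. The same call both encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        stream += block
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Phase 1 (slow, done once): the browser picks a random session key and
# sends it encrypted with the server's public key.
session_key = secrets.randbelow(SERVER_N - 2) + 2
wrapped_key = pow(session_key, SERVER_E, SERVER_N)

# The server unwraps it with its private key.
server_session_key = pow(wrapped_key, SERVER_D, SERVER_N)

# Phase 2 (fast, used for all the data): symmetric encryption.
request = b"POST /checkout card=1234"
ciphertext = symmetric_xor(request, session_key)
received = symmetric_xor(ciphertext, server_session_key)

print(received)
```

When the connection ends, both sides simply forget the session key, so there is no long-term secret key to manage.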

Wonderful!

Post-Quantum Cryptography?!?

While Ant-Man and the Wasp had some entertaining adventures in the MCU Quantum Realm, people in the real world are working hard to make Quantum Computing a real thing. There are skeptics, though.

Quantum computing offers the tantalizing possibility that certain very hard computational problems might be solved extremely fast, because their exponential branching behavior can be “automatically” managed as superpositioned quantum states.

If large-scale quantum computing happens (and some think it will, “soon”), then a fundamental foundation of modern public-key cryptography – the presumed difficulty of factoring large numbers – will be broken, and the public-key algorithms that depend on it, such as RSA, will be useless. Symmetric key algorithms are believed to be much less affected, though they may need longer keys.
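To see why fast factoring is so dangerous, here is a small Python sketch in which an attacker who can factor the public number n rebuilds the private key from public information alone. The key is a tiny textbook RSA example assumed for illustration, so plain trial division is enough; for real key sizes that loop is hopeless today, and a large quantum computer running Shor's algorithm is what would take its place.

```python
def factor(n):
    """Find the two prime factors of n by trial division.
    Feasible only because n is tiny in this example."""
    candidate = 2
    while n % candidate != 0:
        candidate += 1
    return candidate, n // candidate


# Everything the attacker starts with is public.
public_e, public_n = 17, 3233
intercepted = pow(65, public_e, public_n)   # a ciphertext seen on the wire

# Step 1: factor n. This is the step that is believed to be hard.
p, q = factor(public_n)

# Step 2: with p and q known, the private key follows immediately
# (needs Python 3.8+ for the modular inverse via pow).
recovered_d = pow(public_e, -1, (p - 1) * (q - 1))

# Step 3: read the secret message.
stolen = pow(intercepted, recovered_d, public_n)
print(p, q, recovered_d, stolen)   # 53 61 2753 65
```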

So, mathematicians and computer scientists are working on new cryptographic foundations for the time when post-quantum cryptography is needed.