Encryption Basics
A very basic introduction, from old CS 111 notes…
Encryption and Cryptography
Why do you trust a website when you give it your credit card information to purchase something? Why do you trust that no one in between you and the website is listening and able to steal your credit card number? Is it a matter of blind trust, or are there good technical reasons?
I hope you believe that there are good technical reasons – because there are!
Cryptography has long been used to deliver secret messages between people, governments, and armies. And there have always been others trying to “crack the codes” to read those secret messages – indeed, the cracking of German encryption techniques is often credited with helping to turn the tide of World War II! See Wikipedia:Enigma_machine
Today, the Internet uses a variety of cryptographic techniques to establish security and trust for commercial and other Internet data transactions.
Goals of Cryptography
You might think that the only goal of cryptography is to keep data private – but there are other reasons too! The four main reasons for encryption techniques are:
- Confidentiality (privacy) – keeping data private, so that only authorized viewers can see it
- Data integrity – ensuring that the data that was sent is the data that was received, that no unauthorized changes have been made
- Authentication – being able to know for sure that whoever claims to have sent the data really is that person or entity
- Non-repudiation – the sender of the data cannot claim that they did not send the data
Simple Encryption: A Substitution Cipher
In class we spent lots of time on a simple substitution cipher. You can read more here: Wikipedia:Substitution_cipher
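As a reminder of the idea, here is a minimal Python sketch of a substitution cipher. The scrambled alphabet used as the key (the keyboard order QWERTY...) and the function names are just assumptions for illustration; any rearrangement of the 26 letters would work as a key.

```python
import string

# Hypothetical key: each plain letter is swapped for the letter
# in the same position of a scrambled alphabet.
PLAIN = string.ascii_uppercase
SCRAMBLED = "QWERTYUIOPASDFGHJKLZXCVBNM"

ENCRYPT_TABLE = str.maketrans(PLAIN, SCRAMBLED)
DECRYPT_TABLE = str.maketrans(SCRAMBLED, PLAIN)

def encrypt(message):
    """Substitute every letter; spaces and punctuation pass through unchanged."""
    return message.upper().translate(ENCRYPT_TABLE)

def decrypt(ciphertext):
    """Undo the substitution by using the table in the other direction."""
    return ciphertext.translate(DECRYPT_TABLE)

secret = encrypt("Attack at dawn")
print(secret)
print(decrypt(secret))
```

Notice that repeated letters in the message stay repeated in the ciphertext, which is exactly the weakness that letter-frequency counting exploits.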
Modern Day Encryption Algorithms
Mathematicians and computer scientists have worked very hard in recent decades to create encryption schemes that “cannot” be cracked. Modern encryption algorithms are heavily mathematical, and many of them (RSA is the best-known example) rely on the belief that factoring very large numbers (especially numbers that are the product of two large prime numbers) is very difficult. Yes, we said “belief”, because no one has been able to prove it! If you could prove it you would become world famous: such a proof would also settle the famous P versus NP problem, which carries a million-dollar prize. (Sadly, there is no Nobel Prize for mathematics or computer science.) See Wikipedia:Integer_factorization and Wikipedia:CS_Unsolved_Problems
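To get a feel for the lopsidedness, here is a small Python sketch: multiplying two primes is one step, but finding them again by the most naive method (trial division) takes thousands of steps even for these tiny numbers. The primes and the function name are illustrative choices; real attacks use much cleverer algorithms, and real keys use primes hundreds of digits long.

```python
def smallest_factor(n):
    """Trial division: try candidate divisors one by one up to sqrt(n)."""
    if n % 2 == 0:
        return 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 2          # skip even numbers
    return n                    # no divisor found, so n itself is prime

p, q = 10007, 10009             # two small primes, chosen only for the demo
n = p * q                       # the easy direction: a single multiplication
found = smallest_factor(n)      # the hard direction: a long search
print(n, "=", found, "x", n // found)
```

Every extra digit in the primes multiplies the work of this search by roughly ten, while the multiplication stays almost as cheap as before.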
The neat thing is that even if everyone knows the actual algorithms used to perform the encryption, the message still cannot be recovered unless one has the keys. About 20 years ago governments were trying to keep the algorithms secret, but they found it to be impossible and so essentially everyone shares the same algorithms now.
There are basically two kinds of encryption algorithms today: symmetric key (or secret key) encryption and public-key encryption.
In symmetric key encryption, the sender and receiver must share the same key, and keep it secret. It is used both for encryption and decryption. If anyone finds out the secret key, they can read the messages and create fake encrypted messages.
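Here is a toy Python sketch of the symmetric idea, using a simple XOR with a shared key. This is only an illustration of "same key in both directions" and is not a secure cipher; real systems use algorithms such as AES. The function and variable names are made up for this example.

```python
import secrets

def xor_bytes(data, key):
    """XOR every data byte with the key, repeating the key as needed.
    Running the same function again with the same key undoes it."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

shared_key = secrets.token_bytes(16)    # both sender and receiver must hold this

message = b"Meet me at noon"
ciphertext = xor_bytes(message, shared_key)      # sender encrypts
recovered = xor_bytes(ciphertext, shared_key)    # receiver decrypts with the same key

# Anyone who learns the key can also create fake messages that decrypt cleanly.
forged = xor_bytes(b"Meet me at nine", shared_key)

print(recovered)
print(xor_bytes(forged, shared_key))
```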
Public-key encryption is fascinating and wonderful, and is the backbone of how we accomplish secure Internet communication. In public-key encryption, each person or entity has a key pair: a public key and a private key. The cool thing is that messages encrypted with the public key can be decrypted with the private key, and vice versa! Just as the name implies, the private key is kept private, or secret, and not shared with anyone, but the public key is tossed out into the public for all the world to see.
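A toy version of RSA, one well-known public-key algorithm, shows the "works in both directions" property. This is a sketch under loud assumptions: the primes are absurdly small, the message is a single number, and real RSA adds padding and other safeguards. It needs Python 3.8 or later for the modular inverse.

```python
# Toy key pair built from two tiny primes (illustrative values only).
p, q = 61, 53
n = p * q                      # the modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: the inverse of e modulo phi

public_key = (e, n)            # tossed out for all the world to see
private_key = (d, n)           # kept secret

def apply_key(value, key):
    """Encrypting and decrypting are the same operation with different keys."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                   # a message encoded as a number smaller than n

# Direction 1: lock with the public key, unlock with the private key.
locked = apply_key(message, public_key)
print(apply_key(locked, private_key))

# Direction 2: lock with the private key, unlock with the public key.
stamped = apply_key(message, private_key)
print(apply_key(stamped, public_key))
```

Both prints show the original message. Recovering `d` from the public key alone would require factoring `n`, which is trivial here and believed to be infeasible at real key sizes.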
Using Public-key Encryption
Ok, so how do we accomplish the four goals of cryptography with public-key encryption?
- Confidentiality: I want only you to read the message I send you. So, I encrypt the message with your public key, and send it to you. Since it can only be decrypted with your private key, and you keep that secret, only you will be able to read it.
- Data integrity: in this simplified picture there is little to worry about – if a message comes out unjumbled and readable, then the data is almost certainly intact, because corruption will usually make the decrypted message unreadable. Real systems do not rely on readability alone; they also attach a hash or digital signature of the message so that any change can be detected reliably.
- Authentication: I want to prove to you I am the sender. So, I encrypt the message with my private key, and send it to you. You try to decrypt it with my public key. If it works (a readable message results), then you know I sent it because it must have been encrypted with my private key.
- Non-repudiation: If authentication worked, then no one else could have sent it, because no other private key could have produced it.
Ok, these are nice, but typically we want all four together – so how do we do that? Easy – just do both processes! For secure communication, first I encrypt the message with my private key, and then I encrypt that with your public key – voila! We have confidentiality and authentication (and integrity and non-repudiation).
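The "do both" recipe can be sketched with two toy RSA key pairs in Python. Everything here is an assumption for illustration: the tiny primes, the helper names, and the single-number message. One practical wrinkle shows up even in the toy: the value produced by my private key must be smaller than your modulus, so your key pair is built from slightly larger primes. Requires Python 3.8 or later.

```python
def make_keys(p, q, e):
    """Build a toy RSA key pair from two small primes (not secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)
    return (e, n), (d, n)       # (public key, private key)

def apply_key(value, key):
    exponent, modulus = key
    return pow(value, exponent, modulus)

my_public, my_private = make_keys(61, 53, 17)       # sender's pair
your_public, your_private = make_keys(89, 97, 5)    # receiver's pair, larger modulus

message = 65

# Sender: first my private key (authentication), then your public key (confidentiality).
signed = apply_key(message, my_private)
sealed = apply_key(signed, your_public)

# Receiver: undo the layers in reverse order.
opened = apply_key(sealed, your_private)    # only you can do this step
verified = apply_key(opened, my_public)     # only works if I really sent it

print(verified)
```

If the final value is the readable message, you know both that nobody else could read it in transit and that it came from the holder of my private key.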
Trusting Public Keys
The final issue is, how do I get your public key and how do I trust that it really is your public key, and not some hacker pretending to be you? This is the main problem that has followed encryption around ever since it was first used: how do we share keys?
Remember, our goal is to have trusted, private communication, but in order to have that we have to communicate our keys (in any encryption scheme), and we need to do so in a trusted, private way! This is a chicken and egg problem!
In the old days, people would personally deliver the keys, or have a trusted agent deliver them. Even today someone might FedEx you a secret key!
With public key encryption, the problem is slightly different but still similar: the public key is easily available (it’s public!), but trusting that it’s the right public key is still the same old issue. So what’s the answer?
The best answer we have come up with, and the one the Internet uses when you trust Amazon, is to have a few trusted entities called Certificate Authorities, or CAs, that we can ask to vouch for public keys. A company that wants to do secure Internet communication normally generates its own public/private key pair and keeps the private key to itself. It then registers with a CA, verifies its identity in an off-line process (the chicken/egg problem), and sends the CA its public key. The CA issues the company a certificate, which states the company’s identity and contains its public key; the CA never needs to see the company’s private key.
Your web browser comes preloaded with the trusted CAs and their (also trusted) public keys. These are built in and change only when your browser or operating system is updated. Then, whenever you surf over to a company’s secure (HTTPS) web site, the company gives your web browser their certificate. Now, remember how we do authentication! So how do we trust the certificate? Well, it contains the name of the CA who issued it, and then it contains some data that is encrypted with the CA’s private key (a digital signature). Since our browser has the CA’s public key built in, it decrypts this data, and if the result matches the certificate, then we know that the CA issued the certificate, and then we can trust the company’s public key on the certificate. Yippee! We have secure, trusted Internet communication!
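The certificate check can be sketched in Python with a toy CA. This is a simplified model, not how real certificate software is written: the certificate is a made-up byte string, the CA "signs" a SHA-256 hash of it with toy RSA, and the key is built from two well-known Mersenne primes that are far too small for real use. All names are hypothetical. Requires Python 3.8 or later.

```python
import hashlib

def make_keys(p, q, e=65537):
    """Build a toy RSA key pair from two primes (far too small to be secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)     # (public key, private key)

def digest(data, modulus):
    """Hash the data and squeeze the result into the range of the key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

# The CA's key pair. The public half is what ships inside your browser.
ca_public, ca_private = make_keys(2**61 - 1, 2**31 - 1)

def ca_issue(cert_body):
    """The CA encrypts the certificate's hash with its private key."""
    d, n = ca_private
    return pow(digest(cert_body, n), d, n)

def browser_verify(cert_body, signature):
    """The browser decrypts with the CA's public key and compares hashes."""
    e, n = ca_public
    return pow(signature, e, n) == digest(cert_body, n)

cert = b"name=shop.example;public_key=(17, 3233);issuer=Toy CA"
signature = ca_issue(cert)

genuine_ok = browser_verify(cert, signature)
forged_cert = cert.replace(b"shop.example", b"evil.example")
forged_ok = browser_verify(forged_cert, signature)

print(genuine_ok, forged_ok)
```

The genuine certificate verifies, while the altered one does not, because a forger without the CA's private key cannot produce a matching signature for the changed contents.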
So when your browser pops up a message saying “Warning! This certificate cannot be verified – do you want to continue?”, what it is usually saying is one of the following: 1) the certificate is from an unknown CA and you should not trust it; 2) the certificate is from a known CA but it has expired and you should not trust it; or 3) the name on the certificate does not match the site you are visiting. Very often many of us ignore this message and continue, but just be sure you know what you are doing, and be careful!