Basic Cryptography Explained
This is a blog post that teaches you the fundamental concepts of cryptography.
Hashing (cryptographic hashing)
A hash is a one-way function that takes an input of any length and produces a fixed-length output that is, for practical purposes, unique to that input. Because inputs of any length map to outputs of one fixed length, information is lost along the way, so the original input cannot be reconstructed from the output. Let H be a basic hash function; its inverse H⁻¹ (a function that recovers the original string from the hash) does not exist. If that still seems abstract, consider a very weak hash of two integers, the modulo operation (divide one by the other and take the remainder): 15 mod 2 = 1, but 7 mod 3 = 1 as well, so the number 1 alone cannot tell you which two integers you started with. Here's an example of an SHA2 256-bit hash:
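One way to produce such a hash yourself is sketched below, assuming Python and its standard hashlib module. The helper name H mirrors the notation in this post and is otherwise arbitrary. It shows that inputs of different lengths all give a 256-bit output, and that changing one character changes the whole digest.

```python
import hashlib


def H(text: str) -> str:
    """Return the SHA-256 digest of text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Three inputs of different lengths; two of them differ by one character.
    for text in ["abc", "abd", "a much longer input string than the others"]:
        digest = H(text)
        # Each hex character encodes 4 bits, so 64 characters are 256 bits.
        print(f"{len(digest) * 4} bits  {digest}  <- {text!r}")
```

Every line printed reports 256 bits, no matter how long the input was.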
Hash functions have many uses. One of them is preventing password leaks. Let's say we have a user account table on a server. Without hashes, the passwords would be stored in plaintext (let puser be the password of user "user"):
User | Password |
---|---|
user0 | puser0 |
user1 | puser1 |
user2 | puser2 |
user3 | puser3 |
If a malicious actor hacked into this server, they would be able to retrieve the plaintext passwords of all the users and use them for their own ends. They could even try the same passwords on other sites, because many people reuse passwords across different sites.
Now consider this user account table on a server. With hashing, it is not necessary to store the original password in the server at all:
User | Password |
---|---|
user0 | H(puser0) |
user1 | H(puser1) |
user2 | H(puser2) |
user3 | H(puser3) |
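A minimal sketch of such a table in Python, assuming SHA-256 stands in for H. The accounts dictionary and the check_login helper are hypothetical names chosen for illustration. Real systems typically use a salted, deliberately slow password-hashing function rather than a bare SHA-256, but the principle is the same.

```python
import hashlib
import hmac


def H(password: str) -> str:
    """Stand-in for the hash function H: SHA-256 as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# The server keeps only H(password), never the password itself.
accounts = {
    "user0": H("puser0"),
    "user1": H("puser1"),
}


def check_login(user: str, attempt: str) -> bool:
    """Hash the attempted password and compare it with the stored hash."""
    stored = accounts.get(user)
    if stored is None:
        return False
    # compare_digest takes the same time whether or not the strings match.
    return hmac.compare_digest(stored, H(attempt))
```

The server can still check a login, because hashing the attempted password and comparing hashes needs no plaintext copy.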
Since a (cryptographic) hash is by definition irreversible, a malicious actor who hacked into this server would obtain only the password hashes. To recover the original plaintext passwords, the only method (unless a vulnerability is discovered in the hash function) is hash cracking: hashing candidate passwords, either every possible combination or entries from a list of common passwords, until H(current_pwd) = H(puser). For a long, random password and a modern hash function, trying every combination requires an implausible amount of time and computational power, although short or common passwords can be found far sooner.
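The cracking loop described above can be sketched in a few lines of Python. SHA-256 again stands in for H, and the crack function and the tiny common_passwords list are made up for illustration.

```python
import hashlib
from typing import Iterable, Optional


def H(password: str) -> str:
    """Stand-in for the hash function H: SHA-256 as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def crack(target_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Hash each candidate until one matches the stolen hash."""
    for current_pwd in candidates:
        if H(current_pwd) == target_hash:
            return current_pwd
    return None  # no candidate matched


# Hypothetical word list; real lists contain millions of entries.
common_passwords = ["123456", "password", "qwerty", "letmein"]

if __name__ == "__main__":
    stolen = H("qwerty")  # pretend this came from the leaked table
    print(crack(stolen, common_passwords))
```

A password that is not in the candidate list is never found this way, which is why long, random passwords hold up against this attack.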
Encryption
Symmetric encryption
Symmetric encryption is when the same key (let's call it k) is used for both encryption and decryption. Let E(m, k) and D(c, k) be the encryption and decryption functions, respectively, where m is a message to be encrypted and c is an encrypted message to be decrypted. If a person encrypts a message with a certain key, only a person who knows that key can decrypt and read the message. Here are some examples:
E(m, k) = c
D(c, "ADifferentKey") = "UnreadableGibberish"
D(c, k) = m
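To make E and D concrete, here is a toy sketch in Python in which one key k does both jobs. It XORs the message with a keystream derived from the key using SHA-256. This construction is an assumption made purely for illustration and is not safe for real use; a real application would use a vetted cipher such as AES from a maintained library.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def E(m: bytes, k: bytes) -> bytes:
    """Encrypt by XORing each message byte with a keystream byte."""
    return bytes(a ^ b for a, b in zip(m, keystream(k, len(m))))


def D(c: bytes, k: bytes) -> bytes:
    """Decrypt: XORing with the same keystream undoes the encryption."""
    return E(c, k)


if __name__ == "__main__":
    k = b"TheSharedKey"
    c = E(b"attack at dawn", k)
    print(D(c, k))                 # the original message
    print(D(c, b"ADifferentKey"))  # unreadable bytes
```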
Asymmetric encryption
Asymmetric encryption works differently: instead of one shared key, it uses a keypair, which consists of a private/secret key, which should not be shared at all, and a public key, which should be shared with everyone, as others need it to take part in encryption and signing.
This is the fundamental concept of asymmetric or keypair encryption: the private key can decrypt anything that was encrypted by the public key, and (in schemes such as RSA) the public key can decrypt anything that was encrypted by the private key. Let ks be the secret/private key, kp be the public key, and E(m, k) and D(c, k) be the encryption and decryption functions, respectively. Here are some examples:
E(m, kp) = c
D(c, ks) = m
-- and --
E(m, ks) = c
D(c, kp) = m
For example, if Alice wants to send a secret message to Bob, Alice encrypts her message with Bob's public key: E(m, kp) = c. Bob, upon receiving the message, is the only person who can decrypt it with his private key: D(c, ks) = m.
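The Alice and Bob exchange can be tried out with a toy keypair. The sketch below assumes textbook RSA with no padding, built from two well-known Mersenne primes; the key is far too small for real use and the variable names simply follow the notation in this post. It requires Python 3.8 or newer for the modular inverse.

```python
from typing import Tuple

Key = Tuple[int, int]  # (exponent, modulus)

# Toy keypair for Bob. Both factors are known Mersenne primes.
p = 2**61 - 1
q = 2**89 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse of e

kp: Key = (e, n)  # public key, shared with everyone
ks: Key = (d, n)  # secret key, kept by Bob


def E(m: int, k: Key) -> int:
    """Encrypt the integer m (which must be smaller than the modulus)."""
    exponent, modulus = k
    return pow(m, exponent, modulus)


def D(c: int, k: Key) -> int:
    """Decrypt the integer c; in textbook RSA this is the same operation."""
    exponent, modulus = k
    return pow(c, exponent, modulus)


# Alice turns her message into a number and encrypts it with Bob's public key.
m = int.from_bytes(b"Hi Bob", "big")
c = E(m, kp)

# Only Bob's secret key turns c back into the message.
recovered = D(c, ks).to_bytes(6, "big")

if __name__ == "__main__":
    print(recovered)
```

Swapping the keys, as in E(m, ks) followed by D(c, kp), also round-trips in this scheme, which is the property that signing relies on.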
Digital signing
Digital signing is a method of verifying a message's integrity (making sure a message is the exact message the sender intended you to receive). The digital signature is made by encrypting the message hash with the private key. Let S be the digital signing function to sign a message m:
S(m) = E(H(m), ks)
Remember, this function is irreversible since it contains a hash function (where data is lost).
To verify a digital signature (i.e. verify that the message wasn't tampered with), you decrypt the signature with the public key to obtain the hash and then hash the message yourself. If the two hashes match, the message is valid. If they don't, the message has been tampered with. Here's a functional representation of verifying a message m with signature s (where s is S(m)):
D(s, kp) = H(m)
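Signing and verifying can be sketched together in Python. This again assumes a toy textbook-RSA keypair with no padding, and it reduces the SHA-256 hash modulo n so that it fits the small key. The verify helper is a hypothetical name for the check described above. Real signature schemes add padding and use much larger keys.

```python
import hashlib

# Toy keypair (known Mersenne primes); far too small for real use.
p = 2**61 - 1
q = 2**89 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # requires Python 3.8+

kp = (e, n)  # public key
ks = (d, n)  # secret key


def H(m: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n


def S(m: bytes, secret_key=ks) -> int:
    """Sign: encrypt the message hash with the secret key."""
    exponent, modulus = secret_key
    return pow(H(m), exponent, modulus)


def verify(m: bytes, s: int, public_key=kp) -> bool:
    """Decrypt the signature with the public key and compare with H(m)."""
    exponent, modulus = public_key
    return pow(s, exponent, modulus) == H(m)


if __name__ == "__main__":
    message = b"Pay Bob 10 coins"
    s = S(message)
    print(verify(message, s))               # untouched message
    print(verify(b"Pay Bob 99 coins", s))   # tampered message
```

Anyone holding the public key can run verify, but only the holder of the secret key can produce a signature that passes it.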
That's it! Hopefully you now understand basic cryptography!