Ciphers

For almost as long as humans have had writing, we have had ways of trying to conceal what we write. The oldest known cryptographic systems are from 1900 BCE, almost 4000 years ago.

The history of cryptography is a fascinating one, and if you're interested in it, I definitely suggest reading more. For this assignment, we will implement three cryptosystems: the Caesar cipher (used by Julius Caesar in the first century BCE), the Vigenere cipher (developed as an extension of the Caesar cipher in the 1500s), and finally an almost-modern cryptosystem.

Preliminaries

In order to work with these systems, you'll need to know how to do a few things.

For the Caesar and Vigenere ciphers, we will only use lowercase English letters and spaces for simplicity (no punctuation, numbers, capital letters, etc. will be allowed in the inputs). You will need to know how to turn a letter into a number and back again. You can use these functions to do that:

def char_to_num(c):
    return 0 if c == " " else ord(c) - ord('a') + 1

def num_to_char(n):
    return " " if n == 0 else chr(n + ord('a') - 1)

These translate each lowercase letter into its position in the English alphabet: "a" becomes 1, "b" becomes 2, and so on up to "z", which becomes 26. A space is represented by 0. You should write functions that transform a string into a list of ints and back.

Note: converting a list of characters into a string is somewhat tricky in Python. If you have a list of characters, you can use the join method of an empty string to turn it into a string as follows:

x = ['p','y','t','h','o','n']
result = ''.join(x)
print(result)   # Gives "python"

To test if your function is working correctly, check that the following translations work for you:

  • "spider" -> [19, 16, 9, 4, 5, 18]
  • "omelette" -> [15, 13, 5, 12, 5, 20, 20, 5]
  • "with spaces" -> [23, 9, 20, 8, 0, 19, 16, 1, 3, 5, 19]
  • [8, 5, 12, 12, 15, 0, 23, 15, 18, 12, 4] -> "hello world"
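One possible shape for the two conversion functions is sketched below. The names str_to_nums and nums_to_str are placeholders chosen for this example (the assignment does not prescribe them), and the sketch assumes the input contains only lowercase letters and spaces.

```python
def char_to_num(c):
    return 0 if c == " " else ord(c) - ord('a') + 1

def num_to_char(n):
    return " " if n == 0 else chr(n + ord('a') - 1)

# Hypothetical helper names; pick whatever names you like in your own file.
def str_to_nums(s):
    """Translate a string of lowercase letters and spaces into a list of ints."""
    return [char_to_num(c) for c in s]

def nums_to_str(nums):
    """Translate a list of ints (0 to 26) back into a string."""
    return ''.join(num_to_char(n) for n in nums)

print(str_to_nums("with spaces"))
print(nums_to_str([8, 5, 12, 12, 15, 0, 23, 15, 18, 12, 4]))
```

Running this should reproduce the third and fourth translations from the list above.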

You should now be ready to write the ciphers.

Caesar Cipher

This is the simplest cipher. The encryption function encrypt_caesar takes an input message, which is a string, and a key, which is an integer. It returns a list of ints which form the encrypted message.

The encryption works by shifting every letter by the same amount. For example, if the key is 3 and the message is "spider", then the encrypted message is [19 + 3, 16 + 3, 9 + 3, 4 + 3, 5 + 3, 18 + 3] = [22, 19, 12, 7, 8, 21].

The decryption function decrypt_caesar takes in a list of ints, which is an encrypted message, and a key. It subtracts the key from every element of the encrypted message, turns the result back into a string, and then returns it. If the key is the same as the one used to encrypt the message, the return value should be the original message. Otherwise, the output might be gibberish, or the function might even crash.

You should check that your Caesar cipher works by encrypting and decrypting various strings. The encryption of "hello world" with a key of 2 should be [10, 7, 14, 14, 17, 2, 25, 17, 20, 14, 6].
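As a rough sketch of how the two Caesar functions could fit together (one possible approach, not the only acceptable one), the version below adds the key without any wrap-around, which is what the description above implies:

```python
def char_to_num(c):
    return 0 if c == " " else ord(c) - ord('a') + 1

def num_to_char(n):
    return " " if n == 0 else chr(n + ord('a') - 1)

def encrypt_caesar(message, key):
    # Shift every character's number up by the key (no wrap-around).
    return [char_to_num(c) + key for c in message]

def decrypt_caesar(encrypted, key):
    # Undo the shift, then turn the numbers back into characters.
    return ''.join(num_to_char(n - key) for n in encrypted)

secret = encrypt_caesar("hello world", 2)
print(secret)                     # [10, 7, 14, 14, 17, 2, 25, 17, 20, 14, 6]
print(decrypt_caesar(secret, 2))  # hello world
```

Decrypting with a key different from the one used for encryption produces numbers outside the 0 to 26 range, which is why a wrong key gives gibberish or, if the result is negative enough, an error from chr.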

Vigenere Cipher

The Caesar cipher is simple, but it has a significant weakness: it is trivially attacked by just trying all possible shifts (there are at most 26). A much better solution is to use a different shift for each character. This is what the Vigenere cipher does. It was invented in the 1500s and remained resistant to attacks for 300 years before finally being broken.

The encryption function of the Vigenere cipher, encrypt_vigenere, takes an input message and a key. Unlike the Caesar cipher, the key is a string. We turn both the message and the key into lists of ints, then repeat the key until it is the same length as the message. This gives us a list of shift amounts.

Example of the key repeat process.

message   =  [16, 15, 20, 1, 20, 15, 5, 19]
key       =  [ 7,  3,  4]
shift_amt =  [ 7,  3,  4, 7,  3,  4, 7,  3]

We have repeated the key [7, 3, 4] until its length reached the length of the message. To accomplish this in Python, you could create a separate shift_amt list and repeatedly append elements to it. Alternatively, you could use the * operator to repeat a list multiple times, then slice it to the right length.
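The second approach (repeat with *, then slice) might look something like this. The function name repeat_key is made up for this example, and the sketch assumes the key is not empty.

```python
# Hypothetical helper: repeat_key is not a name required by the assignment.
def repeat_key(key_nums, length):
    """Repeat key_nums until it has exactly `length` elements (key must be non-empty)."""
    # One more repetition than strictly needed guarantees enough elements to slice.
    repeats = length // len(key_nums) + 1
    return (key_nums * repeats)[:length]

message = [16, 15, 20, 1, 20, 15, 5, 19]
key = [7, 3, 4]
shift_amt = repeat_key(key, len(message))
print(shift_amt)  # [7, 3, 4, 7, 3, 4, 7, 3]
```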

We then shift each value in the message by the corresponding value in the shift list. For example, the first value of the message is 16, and the first value in the shift list is 7. The first value of the encrypted message will be 16 + 7 = 23. The second value of the encrypted message is 15 + 3 = 18, and so on.

The decrypt_vigenere function takes an encrypted message (int list) and a key, and decrypts the message by reversing the procedure applied by encrypt_vigenere.
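Putting the pieces together, a minimal sketch of the two Vigenere functions could look like the following. The helper make_shifts is a name invented here, and the sketch assumes a non-empty key made of lowercase letters and spaces. The example message and key are "potatoes" and "gcd", which correspond to the number lists in the key repeat example above.

```python
def char_to_num(c):
    return 0 if c == " " else ord(c) - ord('a') + 1

def num_to_char(n):
    return " " if n == 0 else chr(n + ord('a') - 1)

# Hypothetical helper: turns the key string into a shift list of the given length.
def make_shifts(key, length):
    key_nums = [char_to_num(c) for c in key]
    repeats = length // len(key_nums) + 1
    return (key_nums * repeats)[:length]

def encrypt_vigenere(message, key):
    shifts = make_shifts(key, len(message))
    # Add each shift to the matching character of the message.
    return [char_to_num(c) + s for c, s in zip(message, shifts)]

def decrypt_vigenere(encrypted, key):
    shifts = make_shifts(key, len(encrypted))
    # Subtract each shift again, then convert back to characters.
    return ''.join(num_to_char(n - s) for n, s in zip(encrypted, shifts))

secret = encrypt_vigenere("potatoes", "gcd")
print(secret)                           # [23, 18, 24, 8, 23, 19, 12, 22]
print(decrypt_vigenere(secret, "gcd"))  # potatoes
```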

Fake Stream Cipher

A modern stream cipher is similar to a Vigenere cipher, but instead of using a key phrase, it uses a pseudo-random number generator (PRNG) to create the shift amounts. To fix the values used by the PRNG, we use a single number, which is the key, and call random.seed(key) when starting decryption or encryption. We then use random.randint(0, 255) to generate the shift amounts. Similar to the Vigenere cipher, we should generate as many elements for the shifts as there are characters in the message.
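The reason seeding matters is that the same seed always produces the same sequence of numbers, so the encrypting and decrypting sides can regenerate identical shift amounts from the key alone. A small sketch (the function name make_stream_shifts is an assumption made for this example):

```python
import random

# Hypothetical helper: generate `length` shift amounts determined by the key.
def make_stream_shifts(key, length):
    random.seed(key)  # Fix the PRNG state so the sequence is reproducible.
    return [random.randint(0, 255) for _ in range(length)]

first = make_stream_shifts(42, 5)
second = make_stream_shifts(42, 5)
print(first == second)  # True: same key, same shift amounts
```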

Instead of adding the shift amount to the message, we should XOR the values together. In Python, this is accomplished with the ^ operator. Write this into a function called encrypt_stream.

To decrypt the stream, we once again set the random seed to the key and generate a list of shift amounts. We then XOR the encrypted message with the shift amounts. (Unlike previously, where we added while encrypting and subtracted while decrypting, XOR is its own inverse operation, so we do the same thing while encrypting and decrypting.)

Write this into a function called decrypt_stream.
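A minimal sketch of the pair of functions is shown below. It assumes the message is converted with char_to_num and num_to_char, as in the earlier ciphers, and it uses a made-up helper called make_stream_shifts; neither choice is mandated by the assignment.

```python
import random

def char_to_num(c):
    return 0 if c == " " else ord(c) - ord('a') + 1

def num_to_char(n):
    return " " if n == 0 else chr(n + ord('a') - 1)

# Hypothetical helper: key-determined shift amounts, one per character.
def make_stream_shifts(key, length):
    random.seed(key)
    return [random.randint(0, 255) for _ in range(length)]

def encrypt_stream(message, key):
    shifts = make_stream_shifts(key, len(message))
    # XOR each character's number with its shift amount.
    return [char_to_num(c) ^ s for c, s in zip(message, shifts)]

def decrypt_stream(encrypted, key):
    shifts = make_stream_shifts(key, len(encrypted))
    # XOR with the same shifts undoes the encryption: (x ^ s) ^ s == x.
    return ''.join(num_to_char(n ^ s) for n, s in zip(encrypted, shifts))

secret = encrypt_stream("hello world", 1234)
print(decrypt_stream(secret, 1234))  # hello world
```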

Insight

None for this assignment (it's long enough as it is!). However, if you would like to test your stream encryption function on arbitrary Unicode text, you can use the following functions to turn any string (not just lowercase letters) into a list of numbers and back:

def str_to_intlist(input_str):
    bs = input_str.encode('utf-8')
    ns = [ int(b) for b in bs ]
    return ns

def intlist_to_str(input_list):
    # Every element must fit in a single byte (0 to 255) before decoding.
    if any(x < 0 or x > 255 for x in input_list):
        raise ValueError("Bytes in input list are invalid. Bad decryption?")
    b = bytes(input_list)
    return b.decode('utf-8')

intlist = str_to_intlist("小熊维尼") # Yields [229, 176, 143, 231, 134, 138, 231, 187, 180, 229, 176, 188]
retval = intlist_to_str(intlist) # Yields '小熊维尼'

Using this scheme, you should be able to encrypt and decrypt any UTF-8 text (in practice, pretty much any text that can be shown on a modern computer).
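For instance, a round trip on Unicode text might be sketched as follows. The function xor_with_keystream is a hypothetical variant written for this example that works directly on a list of ints; your own encrypt_stream and decrypt_stream may be organized differently.

```python
import random

def str_to_intlist(input_str):
    bs = input_str.encode('utf-8')
    return [int(b) for b in bs]

def intlist_to_str(input_list):
    return bytes(input_list).decode('utf-8')

# Hypothetical helper: XOR a list of byte values with key-determined shifts.
# Because XOR is its own inverse, the same function encrypts and decrypts.
def xor_with_keystream(nums, key):
    random.seed(key)
    return [n ^ random.randint(0, 255) for n in nums]

plain = "小熊维尼"
encrypted = xor_with_keystream(str_to_intlist(plain), 2024)
decrypted = intlist_to_str(xor_with_keystream(encrypted, 2024))
print(decrypted == plain)  # True
```

Since every shift amount is between 0 and 255 and every UTF-8 byte is too, the encrypted values always stay within the byte range.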

IMPORTANT: Do not use this stream cipher for any important communications! The scheme is only as strong as the random number generator behind it, and Python's default PRNG (the Mersenne Twister used by the random module) is not cryptographically secure, so it is not strong enough to resist attacks!

In general, cryptography is very hard to get right, and unless you're an expert or being supervised by an expert, it is much better to use libraries written by experts.

Submission

There is no text interface for this assignment: your functions will be tested with a custom main() function that I will write. Therefore, make sure that all your code is in the six encrypt and decrypt functions specified above (encrypt_caesar, decrypt_caesar, encrypt_vigenere, decrypt_vigenere, encrypt_stream, and decrypt_stream). All the functions should take two arguments: the message to be encrypted/decrypted first, and the key second.

Submit a single file named Ciphers.py on Canvas. Your file needs to compile and run. It should also have a header with the following information (this goes in your source file, not in the program output):

# File: Ciphers.py
# Student: 
# Course: Intro to Programming
# 
# Date:
# Description of Program: