Prog 4: Decode
Over Thanksgiving break you hear on the news that your computer professor has disappeared. The last action he performed before falling out of sight was to send an encrypted email to his students. That email message contained:
IQHN OH XALM EJG JYOQ NBGNRV KMBQJ ZTNZZV
Because you had been talking about it in class before break, you know that this message was created by:
Taking the original text and possibly reversing each word.
Transposing each word by some amount. This is called the Caesar cipher. For instance transposing by 1 would convert "abc" to "bcd", and would convert "xyz" to "yza". Transposing letters at the end of the alphabet "wraps around" back to the beginning of the alphabet
Use the Vigenere cipher with some keyword. You start out by trying the words in the following paragraph:
Students I hope you are learning how to use a computer They can be tricky but powerful when you figure out how to use them In addition having technical skills can advance your career in ways you would not otherwise In any case this course will be over in about five weeks
Your job is to figure out what the encrypted text says by writing a computer program to try all possible encryption combinations. For each combination you will take the resulting text and look up the words in a dictionary to see if it is likely that particular combination is the original message.
Further Explanation
In order to try and decode the message, it is helpful to fully understand how the original plain text was encrypted in the first place. The first step reverses text. For instance, input of
hard work
when reversed would look like
drah krow
Be sure to check both the original order as well as reversed order when decrypting.
Next comes the transposition step, also called a Caesar cipher. The sample input of
hard work
when transposed by 1 would look like
ibse xpsl
Note that non-alphabetic characters are not transposed. The reversed text can also be transposed, which would give:
esbi lspx
Next comes the Vigenere cipher, which uses an Vigenere table which looks like:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z A A B C D E F G H I J K L M N O P Q R S T U V W X Y Z B B C D E F G H I J K L M N O P Q R S T U V W X Y Z A C C D E F G H I J K L M N O P Q R S T U V W X Y Z A B D D E F G H I J K L M N O P Q R S T U V W X Y Z A B C E E F G H I J K L M N O P Q R S T U V W X Y Z A B C D F F G H I J K L M N O P Q R S T U V W X Y Z A B C D E G G H I J K L M N O P Q R S T U V W X Y Z A B C D E F H H I J K L M N O P Q R S T U V W X Y Z A B C D E F G I I J K L M N O P Q R S T U V W X Y Z A B C D E F G H J J K L M N O P Q R S T U V W X Y Z A B C D E F G H I K K L M N O P Q R S T U V W X Y Z A B C D E F G H I J L L M N O P Q R S T U V W X Y Z A B C D E F G H I J K M M N O P Q R S T U V W X Y Z A B C D E F G H I J K L N N O P Q R S T U V W X Y Z A B C D E F G H I J K L M O O P Q R S T U V W X Y Z A B C D E F G H I J K L M N P P Q R S T U V W X Y Z A B C D E F G H I J K L M N O Q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P R R S T U V W X Y Z A B C D E F G H I J K L M N O P Q S S T U V W X Y Z A B C D E F G H I J K L M N O P Q R T T U V W X Y Z A B C D E F G H I J K L M N O P Q R S U U V W X Y Z A B C D E F G H I J K L M N O P Q R S T V V W X Y Z A B C D E F G H I J K L M N O P Q R S T U W W X Y Z A B C D E F G H I J K L M N O P Q R S T U V X X Y Z A B C D E F G H I J K L M N O P Q R S T U V W Y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X Z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
To use this a special keyword is used. For this example lets use the keyword: HIDDEN
Encoding:
Take the plaintext message and write the keyword over it, repeating it as many times as you need it until each plaintext message has a letter over it. This would give you:
HIDDENHID <-repeated keyword
HARD WORK <-original plaintext message
Now use the vertical pairs of letters to encode the plaintext message, which is the text: HARD WORK. The first (left-most) vertical pair of letters are 'H' on the top row and 'H' on the bottom row. Use the top 'H' as the row in the Vigenere table, and the bottom 'H' as the column, giving you the letter 'O'. The next vertical pair of letters are 'I' on the top and 'A' on the bottom. Use 'I' as the row, and 'A' as the column, giving you the letter 'I'. Next comes 'D' and 'R', which gives the letter 'U' (this is highlighted in bold in the table above). Resolving these pairs in turn gives:
HIDDENHID <-repeated keyword
HARD WORK <-original plaintext message
OIUG JVZN <-ciphertext (the encoded message)
Note that non alphabetic characters (spaces, punctuation) are not looked up in the table, but are simply copied to the ciphertext. Also note that the "plaintext" message in our case may already be reversed and transposed, which for the example we've been developing above would give:
HIDDENHID <-repeated keyword
ESBI LSPX <-original message, reversed and transposed
LAEL YZXA <-ciphertext (the encoded message)
So in this case, your program would need to take the String "LAEL YZXA" and try to figure out what the original text was.
Decoding:
To decipher a message simply take the top letter (from the keyword row, e.g. 'H' in the left-most column below) and follow it across until you find the corresponding ciphertext character (e.g. 'O' in the left-most column below). The character that is the label for that colum (e.g. 'H' in this case) is the original plaintext character.
HIDDENHID <-repeated keyword
???? ???? <-original plaintext message
OIUG JVZN <-ciphertext (the encoded message)
In our case you don't know what the keyword is, but you do know it is one of the words in the paragraph given near the top of this program description. You will need to try each word in that paragraph as a keyword. As you try each word your program will generate what it thinks are the words in the plaintext message. To avoid having to read all these yourself, your program must look up these possible words in a dictionary and keep track of how many of them are found. The keyword that gives the most correct dictionary words is the one that is likely to be correct.
Your program has the added complication that you won't be able to tell right away if a keyword is correct, because the original message may have also been transposed and reversed. So you not only need to look up the words generated for each keyword, but for each of those keyword sets you must also try all possible transpositions and direction (forwards and reverse) and then look up the words in the dictionary.
You need to know the following concepts in order to write this program:
Everything from the previous programs; arrays of char; string functions; reading text from a file; binary dictionary search
Notes:
Write the program in stages, as indicated below. Each of the below stages is worth a certain amount out of the 55 points for program execution.
(15 points) Running the program to create encoded text. This should look like:
Author: Dale Reed Program 4: Decode TA: Sandy Storm, T 6:00 AM Oct 30, 2012 Choose from the following options: 1. Encode some text 2. Decode using user-entered values 3. Decode the ciphertext given with the assignment 4. Exit program Your choice: 1
Enter the text to be encoded: all generalizations are false Enter direction (Forwards or Reverse): reverse Enter transposition value (0..25): 3 Enter a keyword for Vigenere encryption: secret Keyword is:secret Keyword, plainText and ciphertext are: secretsecretsecretsecretsecret all generalizations are false gsf zjjpyugeghwyuab jlh zzqum
To break this down even further, consider writing code in this order:
Create the overall program that prints identifying information, displays the menu, and reads in the dictionary into a fixed-size array.
Add the function to look up a word in the dictionary (you won't use this for awhile, but it is handy to add it at this point.)
Add code to store each of the following items. In subsequent steps we will do something with them, but for now just store the user input.
user input of text to be encoded, stored in an array
the direction ("forward" or "reverse")
the transposition value
the Vigenere encryption word
Write a function to take a string and a transposition value (0..25) and apply that transposition to that string.
Write a function to reverse a string. Note that this step and the one above it can be done in either order.
Depending on the user input, call the above two functions, yielding a string that has some transposition to it and may also be reversed.
Take a break, eat some candy, pat yourself on the back. If you are doing this within 3 days of when the program was assigned, awesome!
Write a function for the Vigenere encoding. You should send it the original string and a keyword, and the function should return the Vigenere-encoded string.
Now call the Vigenere function on the results from the previous steps.
(20 points) Add the ability to decode ciphertext. This should look like:
Author: Dale Reed
Program 4: Decode
TA: Sandy Storm, T 6:00 AM
Choose from the following options:
1. Encode some text
2. Decode using user-entered values
3. Decode the ciphertext given with the assignment
4. Exit program
Your choice: 2
Enter the cipherText to be decoded: gsf zjjpyugeghwyuab jlh zzqum
Enter the Vigenere keywords to be tried: one secret two
ciphertext is: gsf zjjpyugeghwyuab jlh zzqum
String of possible keywords is: one secret two
New best case. Dictionary hits: 1, keyword: one, transposition: 3, direction: Forwards
vie piyfxjwdvxvnkzq iax oppjc
New best case. Dictionary hits: 3, keyword: secret, transposition: 23, direction: Reverse
all generalizations are false
Decrypted text is:
all generalizations are false
( 20 points) Add the option to automatically decode the ciphertext given with the assigment. Rather than choosing one of the words from the paragraph shown above (Students I hope you are learning ... this course will be over in about five weeks) the encryption keyword must come from a file called analysis.txt which could be large, such as this file. I recommend you read in all the words, only storing unique words. You can ignore punctuation characters and consider them a space when it comes to trying to identify what is a word.
Author: Dale Reed Program 4: Decryt TA: Billie Joe Armstrong, T 6:00 AM March 29, 2009 Choose from the following options: 1. Encode some text 2. Decode using user-entered values 3. Decode the ciphertext given with the assignment 4. Exit program Your choice: 3
ciphertext is: IQHN OH XALM EJG JYOQ NBGNRV KMBQJ ZTNZZV
String of possible keywords taken from file: analysis.txt
New best case. Dictionary hits: 1, keyword: students, transposition: 12, direction: Reverse wzjc an vdtr acd yxqc ejgavg ckvfj ptsmbi New best case. Dictionary hits: 8, keyword: computer, transposition: 9, direction: Reverse help me give you five points extra credit Decrypted text is: help me give you five points extra credit
Exiting program...
Once you have this working, apply your program to decode the following:
wbc bpje qg ecsyikh iqmnju gux :rszmqypbd bnpwv sbig yiekkjk rm vux.lg/nnxmyg13
Turn in your program into Blackboard. Do not turn in dictionary.txt nor the analysis.txt file. Only turn in your program which should be called: Decode.cpp