The Compressocrat Cipher

The Compressocrat cipher was described by SHMOO (Larry Loen) in the May-June 1983 issue of the Cryptogram magazine. An interesting feature of this cipher is that the ciphertext is usually shorter than the plaintext.

To encode a plaintext as a Compressocrat cipher, you first translate the plaintext into a string of the digits 1, 2 and 3. Then you translate the digit string into ciphertext letters using a table based on a secret key word or phrase.

Here is the table for translating a plaintext into a digit string. Note that different letters translate into digit strings of different lengths. For example, E translates into the two-digit string '31' but Z translates into the six-digit string '321113'.

Table for plaintext to digits:

E 31    I 322    P 3212    B 32112
T 12    R 323    F 3213    G 32113
A 13    S 112    C 1112    V 11111
O 22    H 113    U 1113    K 11112
N 23    L 212    M 2111    Q 11113
        D 213    W 2112    X 321111
                 Y 2113    J 321112
                           Z 321113

As an example, suppose the plaintext is "We strike at dawn". Translating into a digit string we get:

w    e  s   t  r   i   k     e  a  t  d   a  w    n
2112 31 112 12 323 322 11112 31 13 12 213 13 2112 23
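The letter-by-letter translation can be sketched in a few lines of Python. This is only an illustration: the names CODES and to_digits are made up for this sketch, and skipping spaces and punctuation is an assumption, since the example above simply ignores the spaces.

```python
# Plaintext-to-digits table, entered as a dictionary (name is illustrative).
CODES = {
    "E": "31", "T": "12", "A": "13", "O": "22", "N": "23",
    "I": "322", "R": "323", "S": "112", "H": "113", "L": "212", "D": "213",
    "P": "3212", "F": "3213", "C": "1112", "U": "1113",
    "M": "2111", "W": "2112", "Y": "2113",
    "B": "32112", "G": "32113", "V": "11111", "K": "11112", "Q": "11113",
    "X": "321111", "J": "321112", "Z": "321113",
}

def to_digits(plaintext):
    """Concatenate the digit strings of the letters.

    Assumption: anything that is not a letter A-Z is skipped.
    """
    return "".join(CODES[ch] for ch in plaintext.upper() if ch in CODES)

print(to_digits("We strike at dawn"))
# 211231112123233221111231131221313211223
```

The result is the same 39 digits as in the example, written without the spaces between letters.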

To translate the digit string into ciphertext, we choose a key, say CODE, and form a key alphabet by keeping only the first occurrence of each letter in the key and then appending, in alphabetical order, all letters that do not occur in the key. Writing this key alphabet above the 27 possible three-digit arrangements of the digits 1, 2 and 3 (each arrangement is read down its column) we get:

Table for encoding:

CODEABFGHIJKLMNPQRSTUVWXYZ-
111111111222222222333333333
111222333111222333111222333
123123123123123123123123123
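As a rough sketch, the key alphabet and the encoding table could be built in Python as follows. The function names key_alphabet and encoding_table are hypothetical, and ignoring non-letters in the key is an assumption.

```python
from itertools import product
from string import ascii_uppercase

def key_alphabet(key):
    """Key letters in order of first occurrence, then the unused letters A-Z."""
    letters = [ch for ch in key.upper() if ch in ascii_uppercase]
    # dict.fromkeys keeps only the first occurrence of each letter, in order.
    return "".join(dict.fromkeys(letters + list(ascii_uppercase)))

def encoding_table(key):
    """Map the digit triples '111', '112', ..., '332' to the key alphabet."""
    triples = ("".join(t) for t in product("123", repeat=3))
    # zip stops after 26 letters, so the 27th triple, '333', gets no letter.
    return dict(zip(triples, key_alphabet(key)))

table = encoding_table("CODE")
print(key_alphabet("CODE"))         # CODEABFGHIJKLMNPQRSTUVWXYZ
print(table["211"], table["231"])   # I P
```

Generating the triples in sorted order reproduces the columns of the table, so no hand-written list of arrangements is needed.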

To continue our example, we divide our digit string into groups of three and use the above encoding table to translate each group into a letter. If the digit string does not divide exactly into three-digit groups, we append either '1' or '11' to complete the final three-digit group.

211 231 112 123 233 221 111 231 131 221 313 211 223
I   P   O   B   R   L   C   P   F   L   U   I   N

Rewritten in five-letter groups the ciphertext is:

IPOBR LCPFL UIN

Note that the ciphertext has 13 letters while the plaintext has 14 letters.
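The padding, grouping and lookup steps can be sketched in Python. The function name digits_to_ciphertext is made up for this sketch. It takes the key alphabet as a plain 26-letter string and finds each letter by arithmetic rather than by a stored table: a three-digit group is read as a base-3 number in which the digits 1, 2, 3 stand for 0, 1, 2.

```python
def digits_to_ciphertext(digits, alphabet):
    """Turn a string of the digits 1-3 into ciphertext in five-letter groups."""
    # Pad with '1' or '11' so the length is a multiple of three.
    digits += "1" * (-len(digits) % 3)
    letters = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        # '111' -> 0, '112' -> 1, ..., '332' -> 25, '333' -> 26.
        index = int("".join(str(int(d) - 1) for d in group), 3)
        if index >= len(alphabet):
            raise ValueError(f"no ciphertext letter for group {group!r}")
        letters.append(alphabet[index])
    text = "".join(letters)
    return " ".join(text[i:i + 5] for i in range(0, len(text), 5))

digits = "211231112123233221111231131221313211223"
print(digits_to_ciphertext(digits, "CODEABFGHIJKLMNPQRSTUVWXYZ"))
# IPOBR LCPFL UIN
```

A single plaintext E (digits '31') is padded to '311' and becomes the one letter S under the same key alphabet.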

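Decoding is the reverse of encoding. We translate the ciphertext into a digit string using the encoding table, then translate the digit string into plaintext using the plaintext to digits table. Any '1' or '11' left over at the end is padding and is discarded, since neither is the digit string of a letter.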
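A minimal decoding sketch in Python follows, assuming the key is known. The names CODES, DECODE and decrypt are illustrative, and two choices are assumptions of the sketch: spaces in the ciphertext are skipped, and trailing digits that do not complete a letter are treated as padding and dropped.

```python
from string import ascii_uppercase

# Plaintext-to-digits table and its inverse (names are illustrative).
CODES = {
    "E": "31", "T": "12", "A": "13", "O": "22", "N": "23",
    "I": "322", "R": "323", "S": "112", "H": "113", "L": "212", "D": "213",
    "P": "3212", "F": "3213", "C": "1112", "U": "1113",
    "M": "2111", "W": "2112", "Y": "2113",
    "B": "32112", "G": "32113", "V": "11111", "K": "11112", "Q": "11113",
    "X": "321111", "J": "321112", "Z": "321113",
}
DECODE = {digits: letter for letter, digits in CODES.items()}

def decrypt(ciphertext, key):
    """Recover the plaintext letters (without spaces) from the ciphertext."""
    key_letters = [ch for ch in key.upper() if ch in ascii_uppercase]
    alphabet = "".join(dict.fromkeys(key_letters + list(ascii_uppercase)))
    # Ciphertext letter -> three digits, using its position in the key alphabet.
    digits = ""
    for ch in ciphertext.upper():
        if ch in alphabet:  # spaces between five-letter groups are skipped
            n = alphabet.index(ch)
            digits += "123"[n // 9] + "123"[n // 3 % 3] + "123"[n % 3]
    # Read digits left to right, emitting a letter as soon as one is complete.
    plaintext, current = [], ""
    for d in digits:
        current += d
        if current in DECODE:
            plaintext.append(DECODE[current])
            current = ""
    # Whatever remains in `current` ('1' or '11') is padding and is dropped.
    return "".join(plaintext)

print(decrypt("IPOBR LCPFL UIN", "CODE"))   # WESTRIKEATDAWN
```

Word divisions are not stored in the ciphertext, so the output is a run of letters that the reader must split into words.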

Two questions arise about this procedure. First, in the encoding table the digit group '333' has no ciphertext letter above it, only the placeholder '-'. No letter is needed, because SHMOO cleverly constructed the plaintext to digits table so that '333' never occurs in any digit string: no entry in the table contains two adjacent 3s, so a run of 3s can only straddle the boundary between two letters and is at most two digits long.

Second, how do we know that decoding is unique? Couldn't a digit string be divided in different ways to give more than one plaintext? The answer is no. No digit string in the plaintext to digits table is a prefix of any other digit string in the table, so when reading from left to right there is never a choice about where one letter ends and the next begins.
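Both properties can be checked by brute force. The Python sketch below uses made-up function names (is_prefix_free, can_produce_333). It relies on one observation: every entry has at least two digits, so any three consecutive digits lie within at most two adjacent entries, and testing all pairs is enough. Padding adds only 1s and cannot create a '333'.

```python
from itertools import product

# Plaintext-to-digits table (name is illustrative).
CODES = {
    "E": "31", "T": "12", "A": "13", "O": "22", "N": "23",
    "I": "322", "R": "323", "S": "112", "H": "113", "L": "212", "D": "213",
    "P": "3212", "F": "3213", "C": "1112", "U": "1113",
    "M": "2111", "W": "2112", "Y": "2113",
    "B": "32112", "G": "32113", "V": "11111", "K": "11112", "Q": "11113",
    "X": "321111", "J": "321112", "Z": "321113",
}
codes = list(CODES.values())

def is_prefix_free(codes):
    """True if all codes are distinct and none is a prefix of another."""
    if len(set(codes)) != len(codes):
        return False
    return not any(a != b and b.startswith(a) for a in codes for b in codes)

def can_produce_333(codes):
    """True if some pair of adjacent codes contains the group '333'."""
    return any("333" in a + b for a, b in product(codes, repeat=2))

print(is_prefix_free(codes))     # True
print(can_produce_333(codes))    # False
```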