The Chaocipher Clearing House

Progress Report #6

Moshe Rubin (mosher@mountainvistasoft.com)

Repetitions in the Chaocipher Exhibits

Greg Mellen [5] has the following to say about repetitions in Chaocipher:

Repetitions.  Repetitions are in accord with what one expects for random text.  There is a 5-letter repetition, XXACN, at an interval of 602 characters in lines 122 and 133 in Exhibit 1.  The corresponding plaintext is different, however, being respectively ESTAB and NISAH.

I wanted to double-check Mellen's results and ran Exhibit 1 through a repetition-finding software program looking for repetitions of five letters or more.  Here are the results:
Repetitions in Exhibit 1: Plaintext/Ciphertext

+------------+-------+-------+-------+-------+----------+----------+----------+---------------+
|            |     Line      |    Offset     |          |      Plaintext      |  Factors of   |
| Repetition +-------+-------+-------+-------+ Distance +----------+----------+   Distance    |
|            |  1st  |  2nd  |  1st  |  2nd  |          |   1st    |   2nd    |               |
+------------+-------+-------+-------+-------+----------+----------+----------+---------------+
| WQKRXD     |     5 |   138 |   250 |  7538 |     7288 | RLAZYD[O]| ENHISG[O]| 2*2*2*911     |
| TXUVO      |    20 |   166 |  1057 |  9125 |     8068 | KBROW[N] | LPOWE[R] | 2*2*2017      |
| PQHMN  (*) |    22 |    34 |  1159 |  1819 |      660 | OODQQ[U] | OODQQ[U] | 2*2*3*5*11    |
| KNDXD      |    22 |   108 |  1179 |  5909 |     4730 | UMPOV[E] | THEEA[R] | 2*5*11*43     |
| LQYMR      |    39 |    95 |  2145 |  5195 |     3050 | ALLGO[O] | MPOVE[R] | 2*5*5*61      |
| DLNAA      |    50 |    68 |  2717 |  3690 |      973 | SJUMP[O] | ODQQU[I] | 7*139         |
| MOWLH      |    79 |    92 |  4345 |  5057 |      712 | ALLGO[O] | TYWAL[L] | 2*2*2*89      |
| XXACN      |   122 |   133 |  6686 |  7288 |      602 | ESTAB[L] | NISAH[I] | 2*7*43        |
| ETOSX      |   202 |   240 | 11067 | 13167 |     2100 | LEWNO[R] | DHERE[T] | 2*2*3*5*5*7   |
| EISOT      |   202 |   236 | 11097 | 12929 |     1832 | TIONS[T] | DICAT[E] | 2*2*2*229     |
+------------+-------+-------+-------+-------+----------+----------+----------+---------------+

(*) Although the plaintexts are in sync, the repeats stop after five letters. 
    The corresponding ciphertexts are PQHMN[FHX] and PQHMN[MID].
The bracketed letter in each plaintext column is the letter that follows the repetition.  I included it in case it helps explain why the ciphertexts diverge at that point.
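A repetition search of this kind can be sketched in Python.  This is only a minimal sketch, not the program used to produce the table: the function name find_repetitions, the 0-based offsets, and the choice to report each repeat at its maximal length are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def find_repetitions(text, min_len=5):
    """Return (repeat, first_offset, second_offset) for every maximal
    repeated substring of at least min_len letters (0-based offsets)."""
    # Index every min_len-gram by the offsets at which it occurs.
    seen = defaultdict(list)
    for i in range(len(text) - min_len + 1):
        seen[text[i:i + min_len]].append(i)

    results = []
    for offsets in seen.values():
        for i, j in combinations(offsets, 2):
            # Skip pairs that merely continue a repeat starting one letter earlier.
            if i > 0 and text[i - 1] == text[j - 1]:
                continue
            # Extend the match to the right as far as it goes.
            n = min_len
            while j + n < len(text) and text[i + n] == text[j + n]:
                n += 1
            results.append((text[i:i + n], i, j))
    return sorted(results, key=lambda r: r[1])

print(find_repetitions("ABCDEFXABCDEFY"))
```

Run on the Exhibit 5 message 3 ciphertext quoted later in this report, a search like this should report BBNKF at offsets 105 and 136.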

As it turns out, there are numerous five-letter repetitions, and even a six-letter one.  If a causal repetition is one produced by identical plaintexts, then only PQHMN qualifies (even the six-letter repeat seems to be accidental).  The strange thing is that, although the plaintexts remain in sync, the ciphertext repetition stops after five letters.  I checked where the two offsets fall within 13-letter and 26-letter blocks:
1159=2 mod 13     1819=12 mod 13
1159=15 mod 26    1819=25 mod 26
Had the sixth plaintext letter occurred at a block break, that might have explained it.  Alas, it does not.
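The block arithmetic is easy to recheck.  The sketch below assumes that blocks are counted from offset 0, so a block break would show up as a remainder of 0; the helper name block_positions is hypothetical.

```python
def block_positions(offset, block_sizes=(13, 26)):
    """Position of an offset within blocks of each size (0 = first letter of a block).
    Assumption: blocks are counted from offset 0."""
    return {size: offset % size for size in block_sizes}

for start in (1159, 1819):
    sixth = start + 5  # offset of the letter that follows the five-letter repeat
    print(start, block_positions(start), "sixth letter:", sixth, block_positions(sixth))
```

Under that assumption neither sixth letter (offsets 1164 and 1824) has a remainder of 0 for either block size.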

No other exhibit displays causal repetitions (see below) except for Exhibit 5 message 3, which shows a tantalizing five-letter repetition:

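+------------+-----------------+
|            |     Offsets     |
| Repetition +--------+--------+
|            | First  | Second |
+------------+--------+--------+
| ZH         |      1 |     67 |
| TK         |      8 |     48 |
| TT         |     10 |    120 |
| TL         |     11 |    156 |
| NV         |     19 |     62 |
| YH         |     25 |     46 |
| SM         |     27 |    149 |
| MG         |     28 |     32 |
| XJ         |     30 |    122 |
| FL         |     43 |    112 |
| XU         |     52 |    124 |
| CB         |     85 |    142 |
| BBNKF      |    105 |    136 |
+------------+--------+--------+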

But there is another reason to conclude that all the repeats (excluding the one in Exhibit 5 message 3) are accidental.  Cryptologia carries the article "Kasiski's Test: Couldn't the Repetitions be by Accident?" by Klaus Pommerening [6].  It is an excellent article that addresses the question "what is the probability of having a repetition of length R in a message of length M from an alphabet of size A?" (he uses the same method that is used to compute Birthday Paradox probabilities).  I wrote a script to compute the probabilities for all five exhibits and got the following results:
+---------+------------------------------------------------------+---+----------------+
|         |                 Repetition Length                    |Max|                |
| Message +------------------------------------------------------+Rep|    Comment     |
| Length  | 1      2        3          4         5         6     |Len|                |
+---------+------------------------------------------------------+---+----------------+
|  13336  | 1      1     1.0000     1.0000    0.999437  0.249958 | 6 |  Exhibit 1     |
|   1263  | 1      1     1.0000                                  | 3 |  Exhibit 2     |
|    910  | 1      1     1.0000                                  | 3 |  Exhibit 3     |
|   1908  | 1      1     1.0000     0.981204                     | 4 |  Exhibit 4     |
|    162  | 1   1.0000   0.516118   0.027116  0.001043           | 5 |  Exhibit 5, #3 |
+---------+------------------------------------------------------+---+----------------+
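The birthday-style calculation can be sketched as follows.  This is not the script used for the table; it is a minimal reconstruction under stated assumptions: a message of length M is treated as M - R + 1 independent R-letter strings, each drawn uniformly from the A^R possibilities, and the function name repeat_probability is hypothetical.

```python
import math

def repeat_probability(message_len, repeat_len, alphabet_size=26):
    """Probability that at least two of the overlapping R-grams in a random
    message coincide, using the birthday-paradox product.
    Assumption: the R-grams are treated as independent and uniformly distributed."""
    n_grams = message_len - repeat_len + 1
    n_values = alphabet_size ** repeat_len
    if n_grams > n_values:
        return 1.0  # more R-grams than possible values: a repeat is certain
    # log P(no repeat) = sum over i of log(1 - i / n_values)
    log_no_repeat = sum(math.log1p(-i / n_values) for i in range(n_grams))
    return 1.0 - math.exp(log_no_repeat)

for length, label in [(13336, "Exhibit 1"), (162, "Exhibit 5, #3")]:
    print(label, [round(repeat_probability(length, r), 6) for r in range(1, 7)])
```

Under these assumptions the sketch gives approximately 0.25 for a six-letter repeat in 13336 letters and approximately 0.001 for a five-letter repeat in 162 letters, in line with the table.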

Results:
There is one thought that I had, although I don't think it's feasible.  Looking at the six-letter repetition in Exhibit 1, let us suppose that, fortuitously, the Chaocipher machine returned to the exact same settings both times.  The individual letters of the first plaintext may have affected the machine in precisely the same way as those of the second plaintext string (e.g., some letters have the same influence on the machine).  We could then infer that the following letter pairs have a similar effect on the machine:
R <-> E
L <-> N
A <-> H
Z <-> I
Y <-> S
D <-> G
There are two problems with this idea:
  1. Why wouldn't the "O" in the seventh position extend the repetition?  Neither O comes at the end of a 13- or 26-letter block.
  2. If the same ciphertext with the same machine settings can decipher to different plaintexts, then we have a polyphonic-like ambiguity problem (similar to decrypting Key Phrase ciphers).
If I were more convinced that the six-letter repetition were causal I'd feel more confident pursuing this track.  Something to bear in mind in the future.

Coincidences Between the 100 "All Good ..." Encipherments

I was curious to see if correlating the coincidences (or "hits") between each pair of the first 100 lines in Exhibit 1 would produce something of value.  These 100 lines are 55-letter blocks of the identical plaintext beginning "All Good, Quick Brown Foxes ..." where the comma and period are enciphered as Q and W, respectively.  The expected number of coincidences for two random blocks is computed as Kr (pronounced 'kappa-random') times 55 = 0.0385 * 55 = 2.12, while a causal number of coincidences is Kp ('kappa-plain') times 55 = 0.0667 * 55 = 3.67.  The following table shows the pairs of lines that had seven or more hits, followed by an abridged list of pairs with six hits:

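+-----------------+-----------+
|      Line       | Number of |
+--------+--------+   Hits    |
| First  | Second |           |
+--------+--------+-----------+
|     22 |     34 |        11 |
|      8 |     46 |         9 |
|     10 |     88 |         9 |
|      6 |     47 |         8 |
|     11 |     89 |         8 |
|     19 |     34 |         8 |
|     21 |     80 |         8 |
|     31 |    100 |         8 |
|      1 |     92 |         7 |
|      8 |     29 |         7 |
|      9 |     37 |         7 |
|     11 |     39 |         7 |
|     21 |     90 |         7 |
|     23 |     41 |         7 |
|     26 |     95 |         7 |
|     29 |     92 |         7 |
|     31 |     77 |         7 |
|     41 |     83 |         7 |
|     53 |     89 |         7 |
|     55 |     76 |         7 |
|     58 |     86 |         7 |
|     63 |     83 |         7 |
|     77 |     78 |         7 |
|     77 |     80 |         7 |
|     81 |     96 |         7 |
|     82 |     86 |         7 |
|      2 |     24 |         6 |
|      3 |     24 |         6 |
|    ... |    ... |       ... |
|     88 |     92 |         6 |
|     88 |     99 |         6 |
+--------+--------+-----------+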
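Counting hits between every pair of blocks is straightforward to sketch.  The code below is illustrative only: the names count_hits and pairwise_hits are hypothetical, the blocks are assumed to be equal-length ciphertext strings held in a list, and line numbers are reported 1-based.

```python
from itertools import combinations

def count_hits(a, b):
    """Number of positions at which two equal-length ciphertext blocks agree."""
    return sum(x == y for x, y in zip(a, b))

def pairwise_hits(blocks, threshold=7):
    """Return (first_line, second_line, hits) for every pair of blocks with at
    least `threshold` hits, sorted by hits (descending) and then by line number."""
    rows = [(i + 1, j + 1, count_hits(blocks[i], blocks[j]))
            for i, j in combinations(range(len(blocks)), 2)]
    return sorted((r for r in rows if r[2] >= threshold),
                  key=lambda r: (-r[2], r[0], r[1]))

# Toy data standing in for the 55-letter blocks of Exhibit 1.
toy_blocks = ["ABCDE", "ABXDE", "ZZZZZ"]
print(pairwise_hits(toy_blocks, threshold=3))
```

With the 100 "All Good" blocks as input, a list of 100 strings yields the (100 x 99)/2 = 4950 comparisons.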

As impressive as finding eleven hits may be, graphing the full results shows that the distribution closely follows a Poisson curve:

[Figure: Number of coincidences between "All Good" blocks]

Using a Poisson Calculator with an average rate of success of Kr * 55 = 2.12 and (100 x 99)/2 = 4950 distinct comparisons, I calculated the expected number of coincidences:

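Poisson Calculations
+--------------+-----------------------+-----------+
|  Number of   |       Frequency       |           |
| Coincidences +-----------+-----------+ (o-e)^2/e |
|              | Expected  | Observed  |           |
|              |    (e)    |    (o)    |           |
+--------------+-----------+-----------+-----------+
|            0 |       594 |       556 |      2.43 |
|            1 |      1260 |      1283 |      0.42 |
|            2 |      1345 |      1310 |      0.91 |
|            3 |       943 |       985 |      1.87 |
|            4 |       500 |       498 |      0.01 |
|            5 |       212 |       224 |      0.68 |
|            6 |        74 |        68 |      0.49 |
|            7 |        22 |        18 |      0.73 |
|            8 |         6 |         5 |      0.17 |
|            9 |       1.4 |         2 |      0.26 |
|           10 |       0.3 |         0 |       0.3 |
|           11 |      0.06 |         1 |     14.73 |
+--------------+-----------+-----------+-----------+
| Total        |      4957 |      4950 |     22.98 |
+--------------+-----------+-----------+-----------+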
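The expected frequencies and the (o-e)^2/e column can be recomputed with a few lines of Python.  This is a sketch under stated assumptions rather than the calculator used above: the rate is taken as 2.12, the number of comparisons as 4950, the observed counts are copied from the table, and the function names are hypothetical.

```python
import math

def poisson_expected(rate, trials, max_k):
    """Expected number of trials showing k events, for k = 0 .. max_k."""
    return [trials * math.exp(-rate) * rate ** k / math.factorial(k)
            for k in range(max_k + 1)]

def chi_square(observed, expected):
    """Sum of (o - e)^2 / e over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed counts for 0 .. 11 coincidences, copied from the table.
observed = [556, 1283, 1310, 985, 498, 224, 68, 18, 5, 2, 0, 1]
expected = poisson_expected(2.12, 4950, 11)

for k, (e, o) in enumerate(zip(expected, observed)):
    print(f"{k:2d}  expected {e:8.2f}  observed {o:5d}")
print("chi-square:", round(chi_square(observed, expected), 2))
```

With a rate of 2.12 the sketch gives about 1335 expected pairs for two coincidences, whereas the table lists 1345, so that entry may be worth rechecking.  The chi-square total is dominated by the single eleven-hit pair, whose expected count is far below one; sparse tail categories like this are often pooled before a chi-square test is applied.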

I still need to determine the confidence level for this chi-square total.

References in the Open Cryptographic Literature with Relevance to Chaocipher

I had a thought the other day: are there any covert references to Chaocipher in the open cryptographic literature?  My idea was that cryptographic authors may have had Chaocipher in mind while authoring a cryptographic article or text.  Did William F. Friedman refer to Chaocipher, even in an oblique way, when writing his "Advanced Military Cryptography"?  Are there any such vague references in interviews written up in Cryptologia?  So, armed with some quiet and a hot drink, I started pulling out and scouring books from my library.  Here are some interesting quotes I found.

William F. Friedman in Advanced Military Cryptography

In Advanced Military Cryptography [1], written in 1944, William F. Friedman writes the following in paragraphs 72 and 74:

72.  Substitution-cipher machines. -- a.  The substitution principle lends itself very rapidly to the construction of cipher machines for effecting it.  The cryptographs described in the preceding two sections [Ed. the Wheatstone cipher and the M-94 device], as well as the simpler varieties making use merely of two or more superimposed, concentric disks are in the nature of hand-operated substitution-cipher mechanisms that are difficult to use, cannot be employed for rapid or automatic cryptographic manipulations, and are quite markedly susceptible to errors in their operation.  For a long time these defects have been recognized and many men have striven to produce and to perfect devices more automatic in their functioning.  However, the would-be inventors have not, as a rule, realized the complexity of the problems confronting them; nor have they approached these problems with the necessary and thorough knowledge of both theoretical and practical cryptography, with its many limitations, and theoretical as well as practical cryptanalysis, with its wide possibilities for the exercise of human ingenuity.

74.  Machines affording polyalphabetic substitution. -- a.  In recent years there have been placed upon the commercial market several cipher machines of more than ordinary interest, but they cannot be described here in detail.  In some of them the number of secondary alphabets is quite limited, but the method of their employment, or rather the manner in which the mechanism operates to bring the cipher alphabets into play is so ingenious that the solution of cryptograms prepared by means of the machine is exceedingly difficult.  The point should be clearly recognized and understood: other things being equal, the manner of shifting about or varying the cipher alphabets contributes more to cryptographic security than does the number of alphabets involved, or their type.  For example, it is possible to employ 26 direct-standard alphabets in such an irregular sequence as to yield greater security than is afforded by the use of 1,000 or more mixed alphabets in a regular or an easily-ascertained method.  The importance of this point is not generally recognized by inventors.


This was written some 20 years after Friedman analyzed the Chaocipher (see [2]).  I assume the Chaocipher crossed Friedman's mind while writing these paragraphs.

Lambros D. Callimahos in "The Legendary William F. Friedman"

In Lambros D. Callimahos's fascinating article "The Legendary William F. Friedman" [3] we find reference to William F. Friedman's standard request for material from prospective cryptographic inventors, quoted by Callimahos from Friedman's technical paper "The Principles of Indirect Symmetry of Position in Secondary Alphabets and Their Application in the Solution of Polyalphabetic Substitution Ciphers":

A set of 50 test messages, each 25 letters in length and beginning at the same initial enciphering juxtaposition, was submitted by Mr. Burdisk.

This is quite similar to Friedman's request from Byrne, quoted in [2]:

In a letter, September 7, 1922, William F. Friedman, responding to a previous question of Byrne's about the type of material he needed to solve the Chaocipher, said "a series of fifty messages of approximately twenty-five words each might be sufficient ..."

Notice the request for 25 words per message for Chaocipher versus 25 letters from Mr. Burdisk, but the "50 messages / 25 elements" formula is still there.  Also interesting is that Friedman's request for "in-depth" messages is phrased as "beginning at the same initial enciphering juxtaposition".  Deavours's and Kruh's messages in Exhibit 5 fulfill this requirement: they probably begin at the same machine settings but diverge immediately.

In [3] Callimahos alludes to an interesting cryptographic machine -- could he be referring to Chaocipher?

Friedman studied many proposals for cryptographic systems, embracing both manual and machine methods, demolishing everything that came his way.  Good cryptographic ideas were hard to come by, as requirements were stiff and standards high. ... In another case, an ingenious machine fractionated a plaintext letter into two parts, subjected these fractional parts to a complex substitution, and finally recombined the parts to produce a single plaintext letter: this was a brilliant idea that did not long withstand Friedman's scrutiny.

The following quote from Chapter 21 in Byrne's "Silent Years" [4] makes reference to 'splitting' or fractionating the written word:

In a preceding chapter I have referred to Rutherford's achievement in 1919 of splitting an atom for the first time.  In the preceding year, 1918, I had discovered a method of doing something to the written word, in any language, which affected that written word so as to result in its chaotic disruption.

Could Callimahos be referring to Chaocipher, addressing the fractionating nature of the mechanism?  I know it's a long shot, but in any case it gives an interesting direction to pursue with Chaocipher.

Misunderstanding the Byrne-Friedman Relationship

A casual reader of Byrne's "Silent Years" could be excused for drawing a negative impression of William F. Friedman in his relationship with John F. Byrne.  In "Silent Years" pp. 275-276 we read the following:

In the week after receipt of this letter [i.e., from Parker Hitt], I arrived once more with my first model in Washington, where I was met on March 17, 1922, by Colonel Hitt, who immediately escorted me in person to give me a glowing introduction to both Major Moorman, and Mr. W. F. Friedman, Cryptanalyst.

Nearly five months later I wrote to Major Moorman and received the following reply:
. . .
August 26, 1922
. . .
Dear Sir:--
I have for acknowledgment of your letter of August 21st and wish to assure you that I have not forgotten the profitable hour we spent together.  I am sending a letter to Mr. Friedman with request that he communicate with you with reference to your cipher device.

Very sincerely yours,
Frank Moorman
Major, General Staff

And a few days afterwards I received by parcel post from Washington a package containing my cipher model smashed into smithereens.


The reader might conclude that Friedman or the military establishment were vindictive thugs out to impede any progress Byrne might make.  Curiously, Byrne omits a letter Friedman sent him on September 7, 1922, just twelve days after Moorman replied to Byrne:

In a letter, September 7, 1922, William F. Friedman, responding to a previous question of Byrne's about the type of material he needed to solve the Chaocipher, said "a series of fifty messages of approximately twenty-five words each might be sufficient ...". [2]

We can assume that the package Byrne received from Washington contained his cipher machine (broken in transit?) and a letter from Friedman requesting cipher material according to Friedman's standards.  Byrne does not tell the reader about Friedman's request, nor whether he sent the requested ciphertexts, nor whether Friedman succeeded in breaking the requested material.  In my opinion, Byrne's narrative unfairly omits relevant facts.  His side of the story leaves us with the feeling that he wrote more to ease his own feelings about Friedman and the military than to set the record straight.

Locating the Plaintext to Exhibit 5, Message 3?

What can we make of the five-letter, almost certainly causal repetition found in Exhibit 5 message 3?  I believe it should be examined more closely.  I wondered whether we could place message 3 in Easton's book [7] by locating a five-letter plaintext repetition at a distance of 31 characters.  A cursory examination of chapters 1 and 2 found such repetitions, but the resulting plaintexts are not complete sentences, or even word-complete (i.e., they do not begin and end with complete words).  The closest match (at offset 8457 in chapter 2) began at the beginning of a sentence but was cut off at the end:

"HOWTHERAIL" and "HOWTHELOCA" at offset 8457
PT: ATTHATEARLYAGEHELEARNEDBYIMITATIONANDEXPERIMENTATIONQSPURREDONBYHISINTERESTINTHEM
CT: JZHASQNRTKTTLZDYOWLNVDMWNYHSMGXJMGZQTHRIWTIFLXYHTKBOXUYEANJUDXNVOGFZHMJEGRDGGPUGS

PT: ECHANICSOFHUMANACTIVITYHHOWTHERAILWAYSTATIONWASMANAGEDQHOWTHELOCALFLOURMILLOPERAT
CT: XVBACBEPKWHVSBIJGOHKVKAIBBNKFHFFLSFMIINTTXJXUHWQAPTSNBTBBNKFUCBPIONQSMVEHUXTLMRRA
                            ^^^^^                          ^^^^^

I hope to continue examining this phenomenon.
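The search for candidate plaintext positions can be sketched as below.  The function names are hypothetical, and normalize makes a simplifying assumption: it drops punctuation entirely, whereas the exhibits encipher some punctuation as letters (such as Q for a comma), so a real search would need to apply the same convention to the book text first.

```python
import re

def normalize(text):
    """Upper-case the text and keep letters only.
    Assumption: punctuation is dropped rather than mapped to letters."""
    return re.sub(r"[^A-Z]", "", text.upper())

def repeats_at_distance(plain, distance=31, length=5):
    """0-based offsets i where plain[i:i+length] recurs exactly
    `distance` letters later."""
    return [i for i in range(len(plain) - distance - length + 1)
            if plain[i:i + length] == plain[i + distance:i + distance + length]]

# Second plaintext line quoted above; HOWTH starts at position 24 within it.
pt = "ECHANICSOFHUMANACTIVITYHHOWTHERAILWAYSTATIONWASMANAGEDQHOWTHELOCALFLOURMILLOPERAT"
print(repeats_at_distance(pt))
```

Each offset returned is a candidate alignment; the surrounding 162 letters can then be checked for whether they begin and end on word boundaries.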

References

[1] Friedman, William F.  Advanced Military Cryptography.  Aegean Park Press, 1976.

[2] Byrne, John, Cipher A. Deavours and Louis Kruh.  Chaocipher Enters the Computer Age When its Method is Disclosed to Cryptologia Editors.  Cryptologia, 14(3): 193-197.

[3] Callimahos, Lambros D. The Legendary William F. Friedman.  Cryptologic Spectrum, Winter 1974, Volume 4 Number 1.  Reprinted in Cryptologia, July 1991 (pp 219-236), and available as an NSA declassified document at http://www.nsa.gov/public_info/_files/cryptologic_spectrum/legendary_william_friedman.pdf.

[4] Byrne, John F. 1953.  Silent Years.  New York: Farrar, Straus & Young.

[5] Mellen, Greg.  1979.  J. F. Byrne and the Chaocipher, Work in Progress.  Cryptologia, 3(3): 136-154.

[6] Pommerening, Klaus.  Kasiski's Test: Couldn't the Repetitions be by Accident?  Cryptologia, October 2006, 30(4): 346-352.

[7] Easton, Stewart C.  Rudolf Steiner: Herald of a New Epoch.  Anthroposophic Press.  1980.


Copyright (c) 2009 Moshe Rubin
Created: 20 March 2009
Last updated: 30 September 2009
