The Chaocipher Clearing House

Progress Report #14

Moshe Rubin (mosher@mountainvistasoft.com)


No Chaocipher-Related Material Found at GCHQ

On 4 August 2009, Mike Cowan submitted a request to GCHQ (the UK Government Communications Headquarters, the center for Her Majesty's Government's signals intelligence (SIGINT) activities) for any Chaocipher-related material they might have.  As can be seen below, GCHQ has no record of such material.

Dear Press Office,

As discussed on the phone this morning, I am writing with details of a Cipher I am researching and on which I would be very obliged for any information that GCHQ can release to me from their archives.

The cipher is called Chaocipher and it was invented by John F Byrne in 1920 and publicised in his book 'Silent Years'. (Details appended.) It is a machine cipher and Byrne has not openly disclosed the machine design nor the nature of the keys. Instead he supplied lengthy plaintext and matching ciphertext, and challenged cryptanalysts to deduce how the machine works. To date this has not been achieved, at least not in the public domain.

In 1922 Byrne, an American of Irish extraction, submitted information on his cipher and a model of his machine to William F. Friedman, then a cryptanalyst in the Military Intelligence Division of the U.S. War Office in Washington. The model was returned after Friedman requested 50 enciphered messages of about 25 words each, to which Byrne apparently did not respond.

In 1937 Byrne approached the US Navy department with a booklet and a working model but no interest was evoked.

If in GCHQ archives there is any record of this cipher I would much appreciate a copy of material that can be made available.

With many thanks,

Michael J. Cowan.

. . .

Some more details of Chaocipher:

J. F. Byrne's autobiography "Silent Years: An Autobiography with Memoirs of James Joyce and Our Ireland", published in 1953, tells the story of his cryptographic invention he called "Chaocipher" in Chapter 21. He relates the history of his invention, his attempts to interest numerous organizations and concludes with 23 pages of corresponding plaintext and ciphertext enciphered using the Chaocipher system.

Mike received the following reply on 26 August 2009:

We have absolutely nothing in our archives about Chaocipher, and nothing to suggest that anybody at GCHQ (or GC&CS, as we were called from 1919 to 1946) has ever spent time looking at it.

This is not an absolute negative: all relevant surviving GCHQ files dated 15 August 1945 or earlier (as well as a small amount of subsequent material) have been released to The National Archives where they can be found in Class HW.  Details of HW can be found here:

http://www.nationalarchives.gov.uk/catalogue/displaycataloguedetails.asp?CATID=781&CATLN=2&Highlight=&FullDetails=True

I should warn you that this represents an enormous amount of material, and I would be extremely surprised if it contained anything about Chaocipher.

If the system was offered to the US Signals Corps, might it have been offered to the British Armed Forces?  GC&CS was responsible for providing advice on the security of UK governmental and military communications but such advice was not always sought.

Kudos to Mike for initiating the request!

Jeff Hill submits research paper entitled "A Feasible Mechanism for the 1937 Byrne Cryptograph"

Jeff Hill has submitted a new paper entitled "A Feasible Mechanism for the 1937 Byrne Cryptograph".  In this paper Hill claims that an electro-mechanical cryptograph can be built, using 1937 technology, that would replicate the statistical signature of Byrne's machine, as was derived from analysis of Byrne's Exhibit 1.

The importance of this paper lies in setting bounds on what Byrne may have done when implementing his 1937 Chaocipher model (in contrast with the earlier 1918 "cigar box" model).

An important consideration for the (1, 1, 2, 4) Hidden Markov Model steppings

To date, Jeff Hill's Hidden Markov Model (HMM) approach provides the best fit for Byrne's Exhibit 1.  If plaintext is enciphered with Jeff's C98A model (see his paper "Chaocipher: Analysis and Models" for the description) using a stepping vector of (1, 1, 2, 4), the resulting interval graph gives the closest fit found so far to the corresponding interval graph of Exhibit 1; no better stepping vector has yet been found.

There is, however, one possible argument against the (1, 1, 2, 4) stepping vector, regardless of which C98* model is used.  The fact that there is a step of four in the vector means that, in a large enough text, there is a high probability of encountering an interval of 8 or even 7.  An interval of 7 could occur if the key stepping sequence were, for example, 2-4-4-4-4-4-4 = 26, while an interval of 8 could occur with a key stepping sequence of, say, 1-1-4-4-4-4-4-4 = 26.  Statistically we would expect 0.33+ instances of interval 7 and 1.9898+ instances of interval 8.
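The arithmetic behind these two examples can be checked mechanically.  The Python sketch below assumes, as the examples above suggest, that an interval of k arises when k consecutive key steps sum to exactly 26.  The function name min_interval is purely illustrative and is not taken from any Chaocipher paper.  It computes the smallest number of steps, drawn with repetition from a stepping vector, that add up to 26.

```python
ALPHABET_SIZE = 26  # assumed: one full revolution of a 26-letter disk


def min_interval(steps, target=ALPHABET_SIZE):
    """Return the fewest steps (drawn with repetition from `steps`)
    whose sum is exactly `target`, or None if `target` is unreachable."""
    unreachable = float("inf")
    # best[t] = fewest steps that sum to exactly t
    best = [0] + [unreachable] * target
    for total in range(1, target + 1):
        for step in set(steps):
            if step <= total and best[total - step] + 1 < best[total]:
                best[total] = best[total - step] + 1
    return None if best[target] == unreachable else best[target]


print(min_interval((1, 1, 2, 4)))  # shortest interval with a step of four
print(min_interval((1, 2, 3)))     # shortest interval with steps of 1, 2, 3 only
```

Under that assumption, (1, 1, 2, 4) permits intervals as short as 7, whereas a vector restricted to steps of 1, 2, and 3 cannot produce an interval shorter than 9.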

Empirical results with a C98A system and the plaintext of Exhibit 1 show the following:

Interval of 7    Interval of 8    Percentage
0                0                24%
>0               0                2%
0                >0               62%
1                1                12%

In other words, roughly a quarter of the time (24% of runs) we would expect no intervals of either 7 or 8.  Therefore, the fact that Byrne's ciphertext for Exhibit 1 has no intervals of 7 or 8 might not be statistically significant.

Mike Cowan is of the opinion that Chaocipher utilized a second enciphering disk, necessary to produce the same bigram variety (and other metrics) as found in Chaocipher.  In Mike's opinion, the existence of a second disk would drop the probability of having no intervals of 7 or 8 from 25% to 5%, making the absence of such intervals in Chaocipher much more significant.  TCCH looks forward to a future paper by Mike amplifying and explaining his thoughts on this important topic.

If the Chaocipher settings used by Byrne for Exhibit 1 theoretically did not allow for intervals shorter than nine, it would follow that the stepping vector consisted solely of 1s, 2s, and 3s.  I was intrigued by this possibility and wrote a program that checked all 2,600 different stepping vectors, from length 3 to 26, consisting only of 1s, 2s, and 3s.  For each vector I:
  1. Enciphered the Exhibit 1 plaintext one hundred (100) times
  2. Calculated the chi-squared statistic for each resultant ciphertext, with a lower chi-squared value denoting a better goodness-of-fit with the observed Incidence Wave of Byrne's Exhibit 1 plaintext and ciphertext, and selected the lowest chi-squared value of the one hundred as representative of the stepping vector.
  3. Did the same for Jeff Hill's proposed (1, 1, 2, 4) stepping vector as a means of comparison.
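Step 2 of the procedure above can be sketched in Python as follows.  This is only an illustration under stated assumptions: the exact form of the chi-squared statistic and the construction of the Incidence Wave are not given in this report, so a Pearson-style statistic over two frequency lists is assumed, and the names chi_squared, best_of_trials, and simulate are hypothetical.  The numbers are toy data, not Exhibit 1 statistics.

```python
def chi_squared(candidate, reference):
    """Pearson-style chi-squared of `candidate` against `reference`.

    Both arguments are equal-length sequences of frequencies (counts or
    proportions).  Bins where the reference is zero are skipped, which is
    a simplification made for this sketch.
    """
    if len(candidate) != len(reference):
        raise ValueError("frequency lists must have the same length")
    return sum((c - r) ** 2 / r for c, r in zip(candidate, reference) if r > 0)


def best_of_trials(simulate, reference, trials=100):
    """Call `simulate()` `trials` times and keep the lowest chi-squared."""
    return min(chi_squared(simulate(), reference) for _ in range(trials))


# Toy data only: these numbers are invented for illustration.
reference = [0.5, 0.3, 0.2]
runs = iter([[0.6, 0.2, 0.2], [0.5, 0.25, 0.25], [0.4, 0.4, 0.2]])
score = best_of_trials(lambda: next(runs), reference, trials=3)
print(f"{score:.6f}")
```

In a real run, simulate would encipher the Exhibit 1 plaintext with a freshly keyed model and return the resulting wave, and reference would hold the wave observed in Byrne's own ciphertext.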
Here were my results:

Computing wave for 1124                       :   0.071240
Computing wave for 111111122333333            :   0.102385
Computing wave for 11111111112223333333333    :   0.109733
Computing wave for 1111111112333333333        :   0.109797
Computing wave for 111123333                  :   0.120008
Computing wave for 11111233333                :   0.127012
Computing wave for 111111112233333333         :   0.127217
Computing wave for 111111111233333333         :   0.128618
Computing wave for 1111111112222333333333     :   0.129264
Computing wave for 11111111111222233333333333 :   0.130621
Computing wave for 11111111111112233333333333 :   0.131672
Computing wave for 1111111112233333333        :   0.131947
Computing wave for 111111222333333            :   0.131969
Computing wave for 1111112333333              :   0.132776
Computing wave for 1111111111122223333333333  :   0.133224
Computing wave for 11111111122222333333333    :   0.135109
Computing wave for 1111111111223333333333     :   0.137105
Computing wave for 11111111122333333333       :   0.137445
Computing wave for 111111111122233333333333   :   0.137569
Computing wave for 1112333                    :   0.137651
Computing wave for 111111111122333333333      :   0.137691
Computing wave for 11111111122233333333       :   0.139171
Computing wave for 11111111111122333333333333 :   0.139541
Computing wave for 11111112333333             :   0.139901
Computing wave for 11111111112222223333333333 :   0.140468
. . .

The stepping vector (1,1,2,4) was the best match by far: its chi-squared value of 0.071240 is well below the 0.102385 of its nearest contender (111111122333333).  My conclusion is that (1,1,2,4) is the correct stepping vector, that intervals shorter than nine could theoretically occur, and that their absence from Exhibit 1 is not statistically significant.
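The report does not spell out how the 2,600 candidate vectors were counted.  One reading that reproduces the figure exactly, and that matches the vectors shown in the listing, is that each vector is written in non-decreasing order (so the order of steps is ignored) and contains at least one 1, one 2, and one 3.  The Python sketch below enumerates the candidates under that assumption; the function name candidate_vectors is illustrative.

```python
from itertools import combinations_with_replacement


def candidate_vectors(min_len=3, max_len=26, steps=(1, 2, 3)):
    """Yield non-decreasing stepping vectors that use every value in `steps`
    at least once (an assumed reading of the candidate set)."""
    required = set(steps)
    for length in range(min_len, max_len + 1):
        for vector in combinations_with_replacement(steps, length):
            if set(vector) == required:
                yield vector


vectors = list(candidate_vectors())
print(len(vectors))  # number of candidates under this reading
```

With these defaults the enumeration yields 2,600 vectors, and every vector in the listing above, other than the (1, 1, 2, 4) comparison, belongs to the set.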

The Freedom of Information Act (FOIA) reply from NSA was a "Granted in Full" disposition

Progress Report #10 on this site detailed the Freedom of Information Act (FOIA) request I submitted to NSA in March 2009 and the meager information subsequently returned (see the Historical Correspondences Related to Chaocipher web page for what NSA seems to have misplaced since 1985).  I was never informed whether my FOIA request was granted in full, or whether any material was knowingly withheld.

On 2 August 2009 I submitted another on-line query to NSA asking what the disposition of my request was: granted in full, partial disclosure, material withheld, etc.  Here is NSA's reply of 10 August 2009.

Mr. Rubin,

This responds to the Freedom of Information Act (FOIA) request you submitted via the Internet on 2 August 2009, which was received by this office on 3 August 2009.  In your request, you state that you "would like to know what the disposition of FOIA Case 58395 was:  'Granted in Full', 'Partial Denial', or any other disposition?"  For tracking purposes, your 2 August 2009 request has been assigned FOIA Case 59326.

We are not processing your submission as a FOIA request as it does not ask for specific Agency records.  However, as a courtesy, we provide you the following explanation:

FOIA Case 58395 was a previous FOIA request from you dated 21 March 2009.  In that request, you asked for records related to John F. Byrne's "Chaocipher" machine device.  We conducted a search and located only one document, a segment of a book written by Mr. Byrne.  This document was released by NSA in a previous FOIA request.  Since this was the only document NSA located and since it was released in full, your FOIA Case 58395 was closed as a "Granted in Full."

Very Respectfully,

Marianne Stupar
POC, FOIA Requester Service Center
National Security Agency

The irony of it all is that NSA had declassified large amounts of Chaocipher-related material in 1985 (see Historical Correspondences Related to Chaocipher).  It is dismaying to think that, in 2009, they're either trying to hide the material or, worse, that they've lost track of it since 1985.

Copyright (c) 2009 Moshe Rubin
Created: 24 October 2009
Last updated: 5 December 2009
