The Chaocipher Clearing House

Progress Report #11

Moshe Rubin (mosher@mountainvistasoft.com)


Table of Contents

Correspondences relating to Jeff Hill's Paper "Chaocipher: Analysis and Models"

Correspondences relating to Jeff Hill's Paper "Chaocipher: Analysis and Models"

Since uploading Jeff Hill's exciting paper "Chaocipher: Analysis and Models" to The Chaocipher Clearing House, there have been numerous correspondences and e-mails about the models mentioned in the paper.  These include e-mails between Jeff Hill, Moshe Rubin, and Mike Cowan (the current editor of the Computer Column in the American Cryptogram Association's periodical, The Cryptogram).  Here is a selection of the correspondences.

Moshe Rubin to Jeff Hill (16 April 2009)

I stayed up till 3 AM this morning reading the article -- I enjoyed every minute of it.  I do have some comments and questions which I would like to list for you.
  • You write on page 6 "Thus the possibility that each sector contained 1 to 3 teeth is considered first and when this does not yield a convincing fit to the graph of Figure 1, ...".  Solving the equations (1) p1+p2+p3=1; (2) p1 + 2p2 + 3p3 = 2 yields the equations (3) p2 = 1 - 2p3 and (4) p1 = p3.  So, given a value for p3 we can compute p1 and p2.  It would take work for me to set up code to fit p1/p2/p3 to the graph.  I take it you weren't able to find a good fit?
  • MM1, with steps of five, does not explain why Exhibit 1 does not repeat machine states in under nine steps, but MM2 does.  Playing with steps of 1, 2, and 4, we can reach 26 steps by individual steps of +4, +4, +4, +4, +4, +4, +2.  The probability of this occurring is p = (0.25)7 = 0.000061035.  In 13,615 characters we would expect 13,615p = 0.831+ occurrences.  It is plausible that a repeat could have happened within nine steps but didn't statistically.  On the other hand, how would you explain repeat machine steps in the other exhibits?
  • BTW, MM2's steps of 1, 2, and 4 seem very binary and logical, allowing multiple step length from 1 to 7 economically.
  • On page 8: "These probabilities arise when a trial has two outcomes, A and B, ...".  Do you have any ideas on what these outcomes could be based on in Chaocipher?  Comparing the plaintext letter with the ciphertext letter, i.e., some autokey condition?  Byrne alludes to a simple system that even a ten-year-old could learn quickly.  I've wondered if it's some grouping of the A-Z alphabet.  For example, A-M could be outcome 'A', while N-Z would be outcome 'B'.  If it were some autokey, that would explain the extreme sensitivity to error.
  • On page 9, your discussion of possible cryptographs is most interesting.  Regarding half-rotors, have you had an opportunity to see John Savard's attempt at a half-rotor?
  • Regarding M2 periods discussed starting from page 10: you raise the point that, within a single M2 period, the equivalent C94 cryptograph will have no more than three contact varieties.  I haven't checked yet, but can any counter cases be found where there are too many varieties within a single M2 block?  Although we don't know what the M2 periods are, a plethora of counter cases could be indicative.
  • Still regarding M2 blocks: if two adjacent M2 blocks *can* cause identical adjacent machines states, wouldn't we expect more such adjacent machine states in Exhibit 1 at M2 boundaries?
  • I like your section on Complexity Reduction.  It is a clever nuance to reduce the cryptanalytic complexity by reducing the work from N alphabets to fewer ones.
  • In table 7, N=14 shows that C98U had two (2) occurrences.  How odd!  Is this correct?
  • Figure 14 shows rows 22-34 and 43-55 juxtaposed.  Do any of the models you mention explain how, say, there are five consecutive repeat machine states which suddenly stops matching?  What keying factor would have allowed five consecutive states to be identical, but then suddenly become non-identical?

Jeff Hill to Moshe Rubin (19 April 2009)

Hi Moshe,
 
<< You write on page 6 "Thus the possibility that each sector contained 1 to 3 teeth is considered first and when this does not yield a convincing fit to the graph of Figure 1, ...".  Solving the equations (1) p1+p2+p3=1; (2) p1 + 2p2 + 3p3 = 2 yields the equations (3) p2 = 1 - 2p3 and (4) p1 = p3.  So, given a value for p3 we can compute p1 and p2.  >>
 
That brings back memories. <g>  That is the method I used for analyzing sectors having 1 to 3 and 1 to 4 teeth.
 
<< I take it you weren't able to find a good fit? >>
 
I pulled together my notes and MathCad worksheets to get a summary of my analysis and also learned that .mcd files can be converted to .htm files.   I will send Figure02.htm and image files to you later today.  It should answer several of your questions, but I don't believe it is suitable as it stands to be made available on TCCH.
 
<< MM1, with steps of five, does not explain why Exhibit 1 does not repeat machine states in under nine steps, but MM2 does.  >>
 
As you will see from probability tables given in Figure02.htm, MM1 and MM2 both predict non-zero probabilities for repeats at Steps 6 through 8 and MM1 does so for Step 5 as well.  While both models predict peaks and troughs with reasonable accuracy, neither completely predicts every probability found in the graph of Exhibit 1 data.   There are two reasons for the discrepancies:
 
(1) The statistics were collected without taking into account possible M2 boundaries, which undoubtedly leads to distortions if M2 periods exist as suggested in my paper.  This is not an oversight, merely an attempt to glean as much information as possible from Exhibit 1 without introducing assumptions that might prove to be wrong.  In 1994, of course, when the original studies were made, I knew nothing of M2 periods.
 
(2) Byrne's device may itself cause at least some of the distortion by having a keystream that is biased in some, as yet unexplained, way.
 
<< It is plausible that a repeat could have happened within nine steps but didn't statistically.  On the other hand, how would you explain repeat machine steps in the other exhibits? >>
 
I haven't given much thought to Exhibits 2 and 3, but as far as Exhibit 4 is concerned, I believe that Byrne used an M2 period 5 letters long instead of 55.  Short M2 periods increase the number of spurious "hits" that are collected when "comparisons" are made.  I have run C98U simulations which support this idea.
 
<< On page 8: "These probabilities arise when a trial has two outcomes, A and B, ...".  Do you have any ideas on what these outcomes could be based on in Chaocipher?  Comparing the plaintext letter with the ciphertext letter, i.e., some autokey condition?  >>
 
I believe that some form of iteration (output becomes input) was used by Byrne and it is probably based on the PT/CT or M1/M2 disk alphabets.  However, I have no clear idea as to how this would work.
 
<< Byrne alludes to a simple system that even a ten-year-old could learn quickly.  I've wondered if it's some grouping of the A-Z alphabet.  For example, A-M could be outcome 'A', while N-Z would be outcome 'B'.  If it were some autokey, that would explain the extreme sensitivity to error. >>
 
I believe that except for manually dialing the M1 disk into position based on a Key letter presented by the KY disk, which any ten-10-year-old could do, the rest of the work is handled internally by the device.   For a given M1 position, there could be electrical connections and stepping motors that determine the next Key.
 
<< On page 9, your discussion of possible cryptographs is most interesting.  Regarding half-rotors, have you had an opportunity to see John Savard's attempt at a half-rotor? >>
 
I have indeed seen John's device and I'm sure it is effective as an alternative to a rotor machine.   The problem is that, as it stands, it cannot replicate the statistical signature that is required.   I suspect that if you start with John's machine and reconfigure so that it replicates this signature, then you will end up with C98U.
 
<<Although we don't know what the M2 periods are, a plethora of counter cases could be indicative.  <snip> Still regarding M2 blocks: if two adjacent M2 blocks *can* cause identical adjacent machines states, wouldn't we expect more such adjacent machine states in Exhibit 1 at M2 boundaries? >>
 
This is an area where more work needs to be done to clarify the issue. I intend to explore this area and hopefully shed some light on it in a week or two.
 
<< In table 7, N=14 shows that C98U had two (2) occurrences.  How odd!  Is this correct? >>
 
Yes, this is correct.  The disk drive for C98U is simply a long, multi-sector "belt" with steps of 1, 2, and 4 randomly distributed over the belt. Byrne's drive mechanism seems to work more efficiently than this.
 
<< Do any of the models you mention explain how, say, there are five consecutive repeat machine states which suddenly stops matching?  What keying factor would have allowed five consecutive states to be identical, but then suddenly become non-identical? >>
 
It comes down to the fact that there are only a few possibilities for movement at each step.  The existence of repeated short sequences is evidence that the 1-2-4 stepping model for Byrne's device is correct, since for combinations of 4 steps (as in Rows 22-34 and 43-55) there are only 3 to the 4th power, or 81, possible stepping sequences.  In terms of C98U, this means that at various places in the drive belt there are sequences of four sectors which have exactly the same sequence of teeth.  When these sequences come into alignment at just the right place in Exhibit 1, the result is four consecutive Pt/Ct identities.  The run is ended because the fifth sectors in each sequence do not have the same number of teeth.  The downside of this is that autokey alone is unlikely to explain the repeats, since it seems likely that there must be a hidden component, or scrambler, which takes pt/ct combinations as a signal, but has its own internal cycle to further disrupt the key stream.

Moshe Rubin to Jeff Hill (20 April 2009)

Hi Jeff,

First, many thanks for your e-mail with answers to my list of questions.

The goal of this letter is to update you that, after investigating C98, I believe it
leads to contradictions.  The conclusion is that Byrne's mechanism is not C98.  This now
leaves C98U.

Here's what I found with C98.

(*) Let's select a specific block size, say, N=55.  This gives us many blocks of 55
characters each.

(*) For each block of N characters we list all occurrences of identical plaintext
letters with a distance of 1, 2, 3, 4, 5, or 6.  We note the two corresponding
ciphertext letter, c1 and c2.

(*) We can derive information not only from adjacent identical plaintext letters, but
even identical plaintext letters up to 6 places apart.  This is because the Markov
Process of 1, 2, or 4 shifts leaves a discernible pattern.

(*) If the two identical plaintext letters have a distance of 1, then c1 and c2 are
either 1, 2, or 4 places apart on the periodic cipher slide.

(*) If the two identical plaintext letters have a distance of 2, then c1 and c2 are
either 2, 3, 4, 5, 6, or 8 places apart.

(*) If the two identical plaintext letters have a distance of 3, then c1 and c2 are
either 3, 4, 5, 6, 7, 8, 9, 10, or 12 places apart.

(*) Etc. for distances of 4, 5, and 6


(*) As a rule:

--------  -----------------------------------------------------------------
Distance              Possible places apart on the cipher slide
           1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
--------  -----------------------------------------------------------------
  1        x x   x
  2          x x x x x   x
  3            x x x x x x x  x     x
  4              x x x x x x  x  x  x  x  x     x
  5                x x x x x  x  x  x  x  x  x  x  x  x     x
  6                  x x x x  x  x  x  x  x  x  x  x  x  x  x  x  x     x

In other words, an interval of <i> means the possible distances of the two ciphertext
letters are spaced:

   i, i+1, i+2, ... 4i

   where 4i-1 is not included in the sequence.

It's an interesting characteristic that at distance I, given steps of 1, 2, and 4, there
is an interval that cannot be reached using any combination of the steps.  For example,
at distance 2, using 1, 2, and 4, one cannot create a sum of steps totaling 7.  This is
true for all distances up to 6.

(*) Assuming a block size of 55 and examining line 234 of Exhibit 1 we get the
following:

Line 234:  VETHEIRLIVESTHATTHATNATIONMIGHTLIVEWITISALTOGETHERFITTI
           NWBJJNPNCWSOTTOIBBMNZTKFMBMTWPRHJZGWFARDJCDCEKLLPYDUHMX

Collating identical plaintext letters within a range of 1-6 for line 234 gives us:

       Occurrence          CT Letters
       ----------          ----------
   PT   1st  2nd  Distance  1st  2nd
   --  ---------- -------- ----------          
   E     2    5      3       W    J
   E     5   11      6       J    S
   I     6    9      3       N    C
   T    13   16      3       T    I
   T    13   17      4       T    B
   H    14   18      4       T    B
   A    15   19      4       O    M
   T    16   17      1       I    B
   T    16   20      4       I    N
   T    17   20      3       B    N
   T    17   23      6       B    K
   A    19   22      3       M    T
   T    20   23      3       N    K
   N    21   26      5       Z    B
   I    24   28      4       F    T
   I    28   33      5       T    J
   I    33   37      4       J    F
   I    33   39      6       J    R
   I    37   39      2       F    R
   T    38   43      5       A    D
   T    43   47      4       D    L
   E    46   49      3       K    P
   T    47   53      6       L    H
   I    52   55      3       U    X
   T    53   54      1       H    M

These results can be presented as a directed graph (see the attached GIF).

The important thing is to check for contradictions.  There are two general
contradictions I can think of:

   Contr-1: "x--(1)-->y" and "x--(5/6)-->y"

   Contr-2: "x--(1)-->y" and "y--(1/2/3/4/5)-->x"
            "x--(2)-->y" and "y--(1/2/3/4)-->x"
            "x--(3)-->y" and "y--(1/2/3)-->x"
            "x--(4)-->y" and "y--(1/2/3)-->x"
            "x--(5)-->y" and "y--(1/2)-->x"
            "x--(6)-->y" and "y--(1)-->x"

           

[Ed. 19 December 2009: Mike Cowan correctly pointed out that the last three lines are incorrect.  The above should read as follows:

   Contr-1: "x--(1)-->y" and "x--(5/6)-->y"


   Contr-2: "x--(1)-->y" and "y--(1/2/3/4/5)-->x"
            "x--(2)-->y" and "y--(1/2/3/4)-->x"
            "x--(3)-->y" and "y--(1/2/3)-->x"
            "x--(4)-->y" and "y--(1/2)-->x"
            "x--(5)-->y" and "y--(1)-->x"

Rerunning the tests, however, produced the identical results: all block sizes from 5 to 1010 show contradictions.
 For some block sizes there were fewer contradictions because of the corrections, but all block sizes are still ruled out.]

Contradiction #1 has two assertions that do not overlap, while Contradiction #2 looks at

the interval between two CT letters and their reversal.  In both these cases the ct
letters "x" and "y" are to be positioned in contradictory positions on the periodic
cipher slide.

In the line 234 results above, there are no contradictions, and the double occurrence of
"T--(4)-->B" is reassuring.  Block size 55, however, does have 14 contradictory blocks.
Here are a few of them (see the attached ZIP for a full printout of block sizes 5 -
110):


Line 9

       Occurrence          CT Letters
       ----------          ----------
   PT   1st  2nd  Distance  1st  2nd
   --  ---------- -------- ----------
   L     2    3       1      U    T     
   O     5    6       1      F    K     
   Q     8    9       1      S    G     
   O    16   20       4      V    Y     
   O    37   40       3      O    J     
   T    39   45       6      E    Y     
   E    44   47       3      G    S    CONTRADICTION: G--(?)-->S and S--(?)-->G
   R    49   52       3      H    T 

Line 124

       Occurrence          CT Letters
       ----------          ----------
   PT   1st  2nd  Distance  1st  2nd
   --  ---------- -------- ----------
   E     1    4       3      V    M     
   H     5    8       3      X    Q     
   H     5   10       5      X    H     
   H     8   10       2      Q    H     
   H    10   15       5      H    W     
   T    14   17       3      L    J     
   A    16   19       3      Q    S     
   A    19   25       6      S    W     
   N    20   23       3      M    Y     
   R    26   30       4      O    H     
   E    27   31       4      D    N     
   S    34   37       3      W    D     
   O    36   41       5      N    E     
   S    37   42       5      D    Z     
   F    44   45       1      D    Q     
   E    53   54       1      D    Z    CONTRADICTION: D--(1)-->Z and D--(5/6)-->Z


Line 192

       Occurrence          CT Letters
       ----------          ----------
   PT   1st  2nd  Distance  1st  2nd
   --  ---------- -------- ----------
   E     3    8       5      R    O     
   H     7    9       2      Q    Y     
   H     7   12       5      Q    Z     
   E     8   14       6      O    T     
   H     9   12       3      Y    Z     
   S    13   16       3      C    J     
   E    14   20       6      T    R     
   A    15   21       6      T    R     
   A    21   23       2      R    O     
   A    21   27       6      R    E     
   R    22   24       2      D    A     
   A    23   27       4      O    E     
   A    23   29       6      O    D     
   S    26   32       6      T    O     
   A    27   29       2      E    D     
   T    33   34       1      R    O    CONTRADICTION: R--(1)-->O and R--(5/6)-->O
   R    38   44       6      T    F     
   T    43   47       4      B    R     
   O    48   52       4      Z    U     
   E    50   54       4      X    P 

These contradictions are enough to demolish block size 55.  I expanded my script (see attached Perl script [change ".txt" suffix to ".pl" after downloading]) to run through all block sizes from 5 to 110, producing a brief output noting which ones have contradictions.  Here are the results:

   $>perl ProcessC98Blocks.pl -pt raw\exhibit-1.pt.txt
                              -ct raw\exhibit-1.ct.txt
                              -blocksize-from 5
                              -blocksize-to 110
                              -printsummary

   Session options
   ===============
           Plaintext file:         raw\exhibit-1.pt.txt
           Ciphertext file:        raw\exhibit-1.ct.txt
           Block size (from):      5
           Block size (to):        110
           Print Matches:          false
           Print Summary:          true

   Blocksize   5:
   Blocksize   6:
   Blocksize   7:
   Blocksize   8:
   Blocksize   9:
   Blocksize  10:
   Blocksize  11:
   Blocksize  12:
   Blocksize  13:
   Blocksize  14:
   Blocksize  15:
   Blocksize  16:
   Blocksize  17:
   Blocksize  18:
   Blocksize  19:
   Blocksize  20: CONTRADICTION (  2 found)
   Blocksize  21:
   Blocksize  22: CONTRADICTION (  2 found)
   Blocksize  23: CONTRADICTION (  1 found)
   Blocksize  24: CONTRADICTION (  1 found)
   Blocksize  25: CONTRADICTION (  2 found)
   Blocksize  26: CONTRADICTION (  1 found)
   Blocksize  27: CONTRADICTION (  3 found)
   Blocksize  28: CONTRADICTION (  2 found)
   Blocksize  29: CONTRADICTION (  2 found)
   Blocksize  30: CONTRADICTION (  3 found)
   Blocksize  31: CONTRADICTION (  3 found)
   Blocksize  32: CONTRADICTION (  2 found)
   Blocksize  33: CONTRADICTION (  5 found)
   Blocksize  34: CONTRADICTION (  5 found)
   Blocksize  35: CONTRADICTION (  5 found)
   Blocksize  36: CONTRADICTION (  7 found)
   Blocksize  37: CONTRADICTION (  3 found)
   Blocksize  38: CONTRADICTION (  5 found)
   Blocksize  39: CONTRADICTION (  6 found)
   Blocksize  40: CONTRADICTION (  5 found)
   Blocksize  41: CONTRADICTION (  6 found)
   Blocksize  42: CONTRADICTION (  6 found)
   Blocksize  43: CONTRADICTION (  6 found)
   Blocksize  44: CONTRADICTION (  7 found)
   Blocksize  45: CONTRADICTION (  7 found)
   Blocksize  46: CONTRADICTION (  8 found)
   Blocksize  47: CONTRADICTION (  4 found)
   Blocksize  48: CONTRADICTION (  5 found)
   Blocksize  49: CONTRADICTION (  8 found)
   Blocksize  50: CONTRADICTION (  8 found)
   Blocksize  51: CONTRADICTION ( 10 found)
   Blocksize  52: CONTRADICTION (  6 found)
   Blocksize  53: CONTRADICTION ( 11 found)
   Blocksize  54: CONTRADICTION ( 11 found)
   Blocksize  55: CONTRADICTION ( 14 found)
   Blocksize  56: CONTRADICTION ( 11 found)
   Blocksize  57: CONTRADICTION (  9 found)
   Blocksize  58: CONTRADICTION ( 11 found)
   Blocksize  59: CONTRADICTION ( 13 found)
   Blocksize  60: CONTRADICTION ( 12 found)
   Blocksize  61: CONTRADICTION ( 12 found) (1 case(s) of multiple contradictions)
   Blocksize  62: CONTRADICTION ( 11 found)
   Blocksize  63: CONTRADICTION ( 12 found) (1 case(s) of multiple contradictions)
   Blocksize  64: CONTRADICTION ( 13 found)
   Blocksize  65: CONTRADICTION ( 13 found)
   Blocksize  66: CONTRADICTION ( 12 found)
   Blocksize  67: CONTRADICTION ( 16 found) (1 case(s) of multiple contradictions)
   Blocksize  68: CONTRADICTION ( 13 found)
   Blocksize  69: CONTRADICTION ( 14 found) (1 case(s) of multiple contradictions)
   Blocksize  70: CONTRADICTION ( 20 found)
   Blocksize  71: CONTRADICTION ( 16 found) (1 case(s) of multiple contradictions)
   Blocksize  72: CONTRADICTION ( 16 found) (2 case(s) of multiple contradictions)
   Blocksize  73: CONTRADICTION ( 16 found) (1 case(s) of multiple contradictions)
   Blocksize  74: CONTRADICTION ( 14 found) (1 case(s) of multiple contradictions)
   Blocksize  75: CONTRADICTION ( 18 found)
   Blocksize  76: CONTRADICTION ( 18 found) (2 case(s) of multiple contradictions)
   Blocksize  77: CONTRADICTION ( 13 found)
   Blocksize  78: CONTRADICTION ( 15 found)
   Blocksize  79: CONTRADICTION ( 15 found)
   Blocksize  80: CONTRADICTION ( 17 found) (1 case(s) of multiple contradictions)
   Blocksize  81: CONTRADICTION ( 16 found) (1 case(s) of multiple contradictions)
   Blocksize  82: CONTRADICTION ( 20 found)
   Blocksize  83: CONTRADICTION ( 15 found)
   Blocksize  84: CONTRADICTION ( 20 found) (3 case(s) of multiple contradictions)
   Blocksize  85: CONTRADICTION ( 20 found) (4 case(s) of multiple contradictions)
   Blocksize  86: CONTRADICTION ( 27 found) (1 case(s) of multiple contradictions)
   Blocksize  87: CONTRADICTION ( 24 found) (3 case(s) of multiple contradictions)
   Blocksize  88: CONTRADICTION ( 23 found) (1 case(s) of multiple contradictions)
   Blocksize  89: CONTRADICTION ( 26 found) (4 case(s) of multiple contradictions)
   Blocksize  90: CONTRADICTION ( 26 found) (1 case(s) of multiple contradictions)
   Blocksize  91: CONTRADICTION ( 18 found)
   Blocksize  92: CONTRADICTION ( 26 found) (3 case(s) of multiple contradictions)
   Blocksize  93: CONTRADICTION ( 24 found) (1 case(s) of multiple contradictions)
   Blocksize  94: CONTRADICTION ( 23 found) (2 case(s) of multiple contradictions)
   Blocksize  95: CONTRADICTION ( 26 found) (3 case(s) of multiple contradictions)
   Blocksize  96: CONTRADICTION ( 20 found) (2 case(s) of multiple contradictions)
   Blocksize  97: CONTRADICTION ( 23 found) (1 case(s) of multiple contradictions)
   Blocksize  98: CONTRADICTION ( 27 found) (3 case(s) of multiple contradictions)
   Blocksize  99: CONTRADICTION ( 30 found) (2 case(s) of multiple contradictions)
   Blocksize 100: CONTRADICTION ( 28 found) (4 case(s) of multiple contradictions)
   Blocksize 101: CONTRADICTION ( 27 found) (4 case(s) of multiple contradictions)
   Blocksize 102: CONTRADICTION ( 29 found) (6 case(s) of multiple contradictions)
   Blocksize 103: CONTRADICTION ( 31 found) (4 case(s) of multiple contradictions)
   Blocksize 104: CONTRADICTION ( 34 found) (5 case(s) of multiple contradictions)
   Blocksize 105: CONTRADICTION ( 30 found) (4 case(s) of multiple contradictions)
   Blocksize 106: CONTRADICTION ( 30 found) (3 case(s) of multiple contradictions)
   Blocksize 107: CONTRADICTION ( 38 found) (8 case(s) of multiple contradictions)
   Blocksize 108: CONTRADICTION ( 30 found) (4 case(s) of multiple contradictions)
   Blocksize 109: CONTRADICTION ( 32 found) (6 case(s) of multiple contradictions)
   Blocksize 110: CONTRADICTION ( 37 found) (4 case(s) of multiple contradictions)

   Finished!
  
Assuming the pt/ct in Exhibit 1 are correct, C98 is an option only for block sizes 5-19
and 21.  Unless the block size is as small as these, my conclusion is that C98 as
presented in your paper is not used in Exhibit1.

This is encouraging because it allows one to concentrate now on C98A and C98U.  Have you
reached a similar conclusion re C98?

Best regards,

Moshe

Jeff Hill to Moshe Rubin (20 April 2009)

Hi Moshe,
 
Thank you for a very interesting analysis.  I will need some time to read this carefully and consider the consequences.   I'm not sure that I will have it completely evaluated before next weekend.
 
However, if I understand the main thesis of your analysis, what you seem to have invalidated is the possibility that C98 uses simple probabilities of 1/2, 1/4, and 1/4.    One reason that I included both MM1 and MM2 in my paper is that this shows that there are more ways than one to interpret the Exhibit 1 repeat frequencies.
 
As far as which models in the C98-series are still viable, I am sure, as I wrote in my paper, that C94 is not.   I am also sure that C98A is not viable based on analysis similar to that in Table 7 of my paper. The main feature that gives either C98 or C98U a chance is that both divide the text up into M2 periods. However, I doubt that C98 is a model of Byrne's device even though it comes closest to matching Langen's description.  My doubts stem from the fact that the M1 disk must also serve as the KY disk which, as your analysis confirms, exposes the Key transitions to direct analysis.
 
Best regards,

Jeff

Jeff Hill to Moshe Rubin (21 April 2009)

Hi Moshe,
 
You have proven that C98 cannot be stepping with simple probabilities of 1/2, 1/4/ and 1/4, but the question is, which do we reject, C98, M2 periods of 55 letters, or the simple stepping probabilities? 
 
After reviewing the 14 pt/ct matches that your analysis located, I believe that these can be reconciled with C98 by using a Markov model that includes steps of 3, 5, 6, and 7 steps in addition to steps of 1, 2, and 4, thereby giving up the simple probabilities.  However, I prefer to keep my options open on this question and not choose between C98 and simple probabilities until further research has clarified the issue.
 
I find after reviewing my Cryptograph simulators that the 55-letter block will have to be given up.  I wrote my paper C:A&M as a concept piece and was focused on what I consider the essential clues found in Silent Years and the Chaocipher Exhibits.   By not going into greater detail about the construction of the C98 series of cryptographs and the statistics generated by them, I overlooked the fact that the M2 period has to be much longer than 55 letters in order for C98 and C98U to replicate the probabilities derived from Exhibit 1.  My apologies for any confusion this may have caused.
 
Best regards,

Jeff

Mike Cowan's Chaocipher Strategy Paper (20 April 2009)

Mike Cowan is the editor of the American Cryptogram Association's (ACA) "Computer Column".  This column discusses computer programming approaches to cryptanalysis.  Mike is also the author of a paper published in the January 2008 issue of Cryptologia entitled "Breaking Short Playfair Ciphers with the Simulated Annealing Algorithm".  This paper describes the adaptation of simulated annealing to solve short Playfair ciphers (80-120 letters) without using a probable word.

While corresponding with Mike about using Simulated Annealing to solve the Wheatstone cryptograph, I mentioned to Mike about The Chaocipher Clearing House.  As you can see, Mike has joined the corps with gusto <g>.  Mike sent in a paper describing his view of a strategy for tackling Chaocipher research.

Hi Moshe,

I've found your website really excellent as a source of interesting and useful information on Chaocipher and it's good to have it all in one place.

In trying to get thoughts in order on how to tackle this one, I've drafted a strategy on which I'd much appreciate comments, criticisms, additions...or whatever may occur to you and other folk. This is attached. It includes one or two aspects not previously reported, that may be of interest.

For me the hunt is now on for a machine that churns out the right kind of ciphertext! As Jeff Hill has commented, once we've found that, cryptanalysis to find Byrne's keys should be possible. Look forward to hearing findings on a Wheatstone-type machine when you're ready.

Best,

Mike.

Mike's e-mail for comments is Mike Cowan's e-mail address as graphic.

Mike Cowan to Moshe Rubin (21 April 2009)

Hi Moshe,

I think I'm beginning to see why this is so difficult for us but why Friedman really wasn't interested.

I can see that with 2 cipher wheels the machine will use [676] different alphabets during enciphering. The order it uses these alphabets will be set by some kind of mechanism that advances one of the disks by a different amount after each encipherment -- for example in a cycle of 1,2,1,4 as Jeff wrote about -- and the other disk at less frequent intervals.

With so many alphabets there are less than 10 letters enciphered with any one alphabet in the 5500 cipher letters we've been given -- and we can't find out which letters belong to each alphabet because we don't know the advancement pattern.

However if we stole the machine, as the French stole Enigma, we'd be laughing. Each of the [676] cipher disk positions could be tried in little time.

In a computer driven world, you could think of having daily changes to the mixed alphabets and the advancements but I doubt this was a practical proposition for Byrne's machine in the 1920's. Friedman must have thought he had better devices available in-house, and taken the easy way out to put Byrne down gently.

All that doesn't help us! so on with the merry activity.

. . .

Best,

Mike.



Copyright (c) 2009 Moshe Rubin
Created: 22 April 2009
Updated: 12 November 2018

Return to the home page