kruger@16bits.dec.com (I've got 50nS memory. What did you say?) (03/02/88)
While I agree that plaintext has too much regularity to use as a key, it also has a tremendous amount of variety. The trick is to correctly extract the random element while somehow 'chopping up' the regularities. The basic concept of the book method is good -- you've got a large, effectively one-time pad (just keep a list of books if you send LOTS of messages!).

As a good start, I would propose a finite state machine. Given the current letter and the position in the word, do a table lookup using the next letter to determine what the next character should be. Even before this, it is probably good to store the letters in a six-bit code, as there are obviously far too many zeros in the high bits of a byte. Now, given this finite-state code, do some sort of scrambling (exchange bits at various intervals, for example) and you're well on the way to reducing the pattern of the key.

If there is still too much 'pattern' in the code, taking the low-order bits and leaving the high-order ones would probably kill most of it. And for that matter, removing the vowels in the original plaintext key would also up the randomness.

Well, on with the criticism!

dov
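The vowel-stripping, six-bit, low-order-bits conditioning described above can be sketched as follows (my own illustration of the idea, not dov's exact scheme; the function name and parameters are mine):

```python
# Sketch of the proposed key conditioning: drop vowels from the book
# text, map each remaining letter to a six-bit code (0..25), then keep
# only the low-order bits of each value as key material.

def key_bits(text, low_bits=4):
    """Return low-order bit values derived from book text."""
    vowels = set("aeiou")
    bits = []
    for ch in text.lower():
        if ch.isalpha() and ch not in vowels:
            six_bit = ord(ch) - ord('a')              # 0..25 fits in six bits
            bits.append(six_bit & ((1 << low_bits) - 1))  # keep low bits only
    return bits

print(key_bits("The quick brown fox"))
```

Whether the surviving bits are actually unbiased is exactly the question Mauldin raises later in the thread.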
jmm@thoth15.berkeley.edu (03/02/88)
Would removing all words of fewer than x characters significantly improve the quality of one-time book pads? And what would a useful value of x be? (I would suggest x=5 at a minimum, to eliminate all instances of 'this,' 'that,' etc. Would this be enough?)

James Moore
jmm@bartleby.berkeley.edu
If you have Irish, write it!
The University of California has no interest in my own opinions.
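The filter James proposes is trivial to state in code (a minimal sketch; the function name and the x=5 default are just the thread's suggestion, not a tested recommendation):

```python
# Keep only words of x or more characters before using book text
# as pad material, dropping common short words like 'this' and 'that'.

def filter_short_words(text, x=5):
    return [w for w in text.split() if len(w) >= x]

print(filter_short_words("this is that which determines constitutional meaning"))
```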
mlm@NL.CS.CMU.EDU (Michael Mauldin) (03/02/88)
In article <8803011848.AA05787@decwrl.dec.com>, kruger@16bits.dec.com writes:
> The trick is to correctly extract the random element, while somehow
> 'chopping up' the regularities.

For something more demonstrably "random", why not use your favorite redundancy reducer (read: file compressor) to destroy the pattern? Huffman-code based on bigrams, or use Lempel-Ziv compress. The better the compression routine, the less redundancy there will be in the key stream. You may still have to deal with some regularities in the resulting bits.

Your suggestion of ignoring vowels and using the low-order bits doesn't seem to work well at all (did you test it?). Here are histograms for the text of the US Constitution (50,090 characters):

Character counts (ignoring punctuation and digits with < 100 occurrences):

SP 10397 ****************************************************************
,    642 ***
.    377 **
a   2779 *****************
b    659 ****
c   1276 *******
d   1307 ********
e   5385 *********************************
f   1077 ******
g    454 **
h   2041 ************
i   2607 ****************
j    100
k     46
l   1496 *********
m    792 ****
n   2708 ****************
o   2856 *****************
p    834 *****
q     52
r   2325 **************
s   2831 *****************
t   3913 ************************
u    883 *****
v    481 **
w    385 **
x    129
y    553 ***
z     31

Low 4 bits of all letters except vowels:

0000 11248 ****************************************************************
0001  2943 ****************
0010  3037 *****************
0011  4149 ***********************
0100  5241 *****************************
0101  6287 ***********************************
0110  1584 *********
0111   862 ****
1000  2198 ************
1001  3200 ******************
1010   160
1011   128
1100  2166 ************
1101   918 *****
1110  3113 *****************
1111  2856 ****************

So you see there is still quite a lot of structure left.

Michael L. Mauldin (Fuzzy)
Department of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213-3890
ARPA: Michael.Mauldin@NL.CS.CMU.EDU
Phone: (412) 268-3065
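Mauldin's experiment is easy to reproduce on any text (a sketch of my own; the Constitution file itself is not included, and the function names are mine): histogram the low four bits of each non-vowel letter's ASCII code and print star bars scaled to the peak bucket.

```python
# Reproduce the "low 4 bits of all letters except vowels" histogram
# for an arbitrary text, scaled so the largest bucket gets `width` stars.

from collections import Counter

def low_bit_histogram(text, low_bits=4):
    """Count the low-order bits of each non-vowel letter's ASCII code."""
    vowels = set("aeiouAEIOU")
    return Counter(ord(c) & ((1 << low_bits) - 1)
                   for c in text if c.isalpha() and c not in vowels)

def print_histogram(counts, width=64):
    peak = max(counts.values())
    for value in sorted(counts):
        stars = "*" * (counts[value] * width // peak)
        print(f"{value:04b} {counts[value]:6d} {stars}")

print_histogram(low_bit_histogram("We the People of the United States"))
```

Run on a large English sample, the uneven bars make Mauldin's point: stripping vowels and keeping low-order bits leaves plenty of structure for a cryptanalyst.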