[net.crypt] One-time pads

spaf@gatech.UUCP (Gene Spafford) (02/15/84)

The following comments are derived from notes I took during a class by
George Davida when he was at Georgia Tech.  It is in response to
someone's query about difficulties with one-time pads.

The only "unbreakable" kind of encryption is a one-time pad.  That is,
there is no way to find the true key based on the encrypted text alone
-- in theory.  Any encryption done using a function can eventually be
broken, although the time involved may be so extreme as to imply that
the system is unbreakable.

One way of implementing a one-way pad is to generate a random bit
string and perform an xor with the text to be encrypted.  To decrypt
the message requires that the receiver simply xor the encrypted bit
string with the key.  Theoretically speaking, someone attempting to
break the encryption can come up with all messages of length N (the
length of the message) by generating all possible keys.  Without
information to confirm the key, it is not possible to tell which
message is the correct one (and some of the candidates may well be
direct opposites of the correct message).

A difficulty is that in any truly random bit string, there is very
possibly a run of M zeros, with M up to the size of the message.  That
is, it is expected that at some time there will be a long enough run of
zeros so as to not encrypt a major portion of the text.  In fact, it is
entirely possible that a random key could be all zeros, thus producing
an encrypted text equal to the plain text!

Therefore, one-time systems may not use a truly random generator, but
may add constraints such as "out of every 100 bits, at least 60 are
ones."  This greatly reduces the number of potential keys that someone
needs to examine when attempting to decrypt the message.  In fact, if
taken to extremes, it enables certain probabilistic attacks on the
encrypted text which may result in the text being read without access
to the real key.

The mechanism described is, of course, generalizable to other systems
which don't use the "xor" method, but apply the random bit string for
encryption in some other way.
-- 
Off the Wall of Gene Spafford
The Clouds Project, School of ICS, Georgia Tech, Atlanta GA 30332
CSNet:	Spaf @ GATech		ARPA:	Spaf.GATech @ CSNet-Relay
uucp:	...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!spaf

billp@azure.UUCP (Bill Pfeifer) (02/17/84)

Gene Spafford writes:
>>	A difficulty is that in any truly random bit string, there is very
>>	possibly a run of M zeros, with M up to the size of the message.  That
>>	is, it is expected that at some time there will be a long enough run of
>>	zeros so as to not encrypt a major portion of the text.  In fact, it is
>>	entirely possible that a random key could be all zeros, thus producing
>>	an encrypted text equal to the plain text!

That is true.  However, we must look at the probabilities.  There is a chance
that a string of zeroes leaves a portion of the text unencrypted.
There is an *equal* chance that a certain combination of random digits
produces cleartext of opposite meaning, or text from a future OSHA regulation,
or a portion of a german poem.  In other words, if you find something readable,
in whatever language, in a message that has been encrypted with a one-time-pad
of *truly* random digits, it is equally meaningful as "skpebcvus;fjwuy*z".

	Bill Pfeifer
{cbosgd,decvax,harpo,ihnss,ogcvax,pur-ee,ucbvax,zehntel} !tektronix!tekmdp!billp

mark@cbosgd.UUCP (Mark Horton) (02/20/84)

I've seen two people claim that the chances of the cleartext getting
sent out unencrypted due to a long string of zeros in the key is no
greater than some other message coming out due to a random key.

I think these people are assuming that the random number generator
used is sure to generate uniformly distributed random numbers.  With
a good psuedo random number generator, you can easily prove that this
is the case.  But with a true random number generator, based on noise
from some analog source, for example, you are NOT assured that the
numbers are uniform.  Maybe there were excessive sunspots the day
the key was generated, and the noise levels were very low, resulting
in all the analog values being sampled being below the zero cutoff,
and a zero random number value for each try.  Or perhaps one of the
components breaks, with the same result.

The obvious thing to do is to have the algorithm watch the key, and
if it sees a long string of zeros or ones, to refuse to send the
message, but to alert an operator instead.

	Mark Horton

spaf@gatech.UUCP (Gene Spafford) (02/20/84)

As has been pointed out, it is certainly probable that some run of
truly random digits will result in a readable text, but what is read
may have no relation to the actual message.  In fact, that type of
situation is more likely to occur than the case of a string of zeros
leaving the text unencrypted (consider how many strings of length N
have valid meanings, as opposed to the one "true" meaning).

However, suppose you are the person reading the encrypted message.  If
you see text which happened to be readable and described a recipe for
Aunt Pat's oatmeal cookies, you probably would ascribe to the
randomness of the encryption. (You might also decide to see if it was a
code of some sort (as opposed to cypher)).  On the other hand, if you
saw something like the formula for a new nerve gas, you'd definitely
try to have your chemists put it together and give it a try.  Your
chances might not be good that it was actually a formula for nerve gas,
but if it was, you'd have it.  (Actually, with the kind of randomness
we all know and love: 1) Aunt Pat's cookies could be more lethal; 2)
Aunt Pat's recipe could be encrypted into a valid formula for nerve gas
of which the sender never knew!)

If I send my password via a one-time code and it gets transformed into
some other string of characters, what do you lose by trying that
string?  Nothing, but there is a very small chance that what you have
is a correct password.  In fact, with a truly random sequence, and a
six bit character code with all bits significant, the probability of
finding a six character password unchanged by the encryption is on the
order of 1 in 2 ** 36.  That's not very good, but it may be enough.
However, that kind of probability is usually much more than sufficient
for most applications.

As an aside, a good story embodying some of the thought in this
discussion is "The Library" by Jorge Luis Borges in "Labyrinths."

-- 
Off the Wall of Gene Spafford
The Clouds Project, School of ICS, Georgia Tech, Atlanta GA 30332
CSNet:	Spaf @ GATech		ARPA:	Spaf.GATech @ CSNet-Relay
uucp:	...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!spaf

billp@azure.UUCP (Bill Pfeifer) (02/22/84)

Gene Spafford writes:

>	However, suppose you are the person reading the encrypted message.  If
>	you see text which happened to be readable and described a recipe for
>	Aunt Pat's oatmeal cookies, you probably would ascribe to the
>	randomness of the encryption. (You might also decide to see if it was a
>	code of some sort (as opposed to cypher)).  On the other hand, if you
>	saw something like the formula for a new nerve gas, you'd definitely
>	try to have your chemists put it together and give it a try.  Your
>	chances might not be good that it was actually a formula for nerve gas,
>	but if it was, you'd have it.

Horsepuckey!

I suppose Gene does not appreciate the magnitude of the numbers involved!
To get a feel for the numbers, consider the famous "Problem of a Printed Line",
as described by George Gamow in his book "One Two Three ... Infinity"
(Bantam Books, New York).
He describes the attempt of continuously printing one line after the other,
each line having a different combination of letters, numbers and punctuations,
until all possible combinations have been printed.  There are 26 letters, 10
numbers and 14 common punctuations, 50 symbols altogether.  George Gamow
limits the length of the line to 65 characters, that of an average printed
line.  That is 50^65 or 10^110 combinations!
Not impressed yet?
Assume that every atom in the entire universe (3*10^74 of them) represents a
separate printer, working simultaneously.  Assume further that these printers
are printing at the rate of atomic vibrations, 10^15 lines per second
(that's a quadrillion lines, 16 2/3 trillion pages, or about 5 billion boxes
of paper per second EACH), and that they have been printing uninterrupted
since the beginning of the universe (3*10^9 years, or 10^17 seconds).
By now they would have printed 3*10^106 lines, which is one thirtieth of
1 percent of the total required.

The probability of text left unencrypted by a string of zeroes in a truly
random bitstream is similar to that of finding the real nerve gas formula
in this output.  Besides, how much of a formula could you pack into a
65 characters?  Remember, each added character multiplies the output volume
by 50.
Along with the real formula, there will also be 10^xxx phony formulas, some
of which will actually yield Aunt Pat's oatmeal cookies.  There aren't enough
molecules on this planet to put together all formulas that might look like
nerve gas, even if you tried.

	Bill Pfeifer
{cbosgd,decvax,harpo,ihnss,ogcvax,pur-ee,ucbvax,zehntel} !tektronix!tekmdp!billp