[alt.sources] padrand

istvan@hhb.UUCP (Istvan Mohos) (09/13/90)

In article <2260@lectroid.sw.stratus.com> cme (Carl Ellison) writes:
>In article <2208@ux.acs.umn.edu> dhoyt@vx.acs.umn.edu writes:
>> I'll try to find a
>>difference between noise and a pad.  Could it have something to do with noise
>>being additive and the pad is an exclusive or?  Statistical quantum theory?

>It's nothing so difficult.  After either XOR or ADD with a true random
>1-time pad, every possible cyphertext value is *exactly equally* likely.
>Therefore, every possible plaintext value is *exactly equally* likely.

The C source of the padrand() routine posted here, is hereby placed in
the Public Domain.  A primitive driver (main) is enclosed for convenient
testing.  The verbal description of the algorithm immediately below, is
"Copyright 1990, Istvan Mohos, All Rights Reserved".

An often cited and axiomatic observation of cryptology is the security
of information encrypted using one-time pads as keys.  Practical
one-time pads are non-cyclic non-monotone text, of sufficient length to
mask the plaintext.  An instance of a one-time pad represents a
perfectly random selection of a single key out of a potentially infinite
number of choices.  One should be able to capture and use this "perfect
randomness" in seeking other goals than security of encryption.  The
random number generator described here defines an alternate utility of
one-time pads.

Although the text of one-time pads is non-cyclic, the byte stream is
subject to regularities of character distribution as the function of the
language.  Two methods of combating uneven distribution are implemented
in the padrand() routine.  Firstly, the *minimum* of information is
extracted from each pad byte, evaluating only whether the (ASCII) value
of the byte is even or odd.  The second method simply increments the
value of every other pad byte by one, with a "limping adder".

If the pad stream contains an equal number of even and odd bytes, the
adder flips a (statistically) equal number of even bytes to odd and odd
bytes to even, so the overall distribution remains the same.  If there
is an imbalance of even/odd bytes in the pad stream, half of the
additional (even or odd) bytes causing the imbalance is likely to be
converted to the opposite, dampening the divergence.

The resulting even/odd flags are shifted into a binary bit field; the
user can select the width of the field.

The routine opens and reads the specified pad file, "using up"
bitfield-width-bytes at each call for generating a new random number.
The value returned is between 0 and the maximum value that can fit in
the selected bit width.  When the pad is exhausted, the routine closes
the file and returns -1.  At the next call, the (new or reused) file is
opened and the process starts again.

The program is somewhat wasteful of pad text, and is intended for Unix
environments where on-line text is abundant (as evidenced by directories
/usr/dict, /usr/man, ~TeX/TeXdoc and the like) but hardware random
number generators are absent.  Accordingly, "int" is assumed to be
32-bits wide; bits 1 through 31 are used for controlling the range of
the generated numbers.  A different magnitude may be selected at every
call.  Requests for larger ranges are silently masked down to 31 bits.

As additional controls, (char *)NULL passed for the pad name closes an
open pad in preparation for switching to a new pad; negative field
widths (not subject to the range limitation) are interpreted as commands
to bump the file pointer by the absolute value of the negative field;
zero field width flips the "limping adder" to its complement.  Control
operations or I/O failures return small negative int values specific to
the operation or to the failure, and do not produce random numbers.

#include <stdio.h>
main ()
{
	char namebuf[1024];
	register char *name = namebuf;
	register int bits, rand;

	for (;; name = namebuf) {
		fprintf(stderr, "bit width: "), gets(name), bits = atoi(name);
		fprintf(stderr, "file name: "), gets(name);
		if (!strlen(name))
			name = NULL;
		do
			printf("%d\n", rand = padrand(name, bits)), fflush(stdout);
		while (rand >= 0);
	}
}

/********************************************
* generate random numbers from text
* Istvan Mohos, 1990 --- in the Public Domain
*********************************************/
#define INTWID 32
int
padrand (pad, bits)
char *pad;
register int bits;
{
	FILE *fopen();
	static FILE *fp = NULL;
	static int silkie = 0;
	register int rand = 0;
	register int chr;

	if (pad == NULL) {
		if (fp != NULL)
			(void) fclose (fp), fp = NULL;
		silkie = 0;
		return (-6);
	}
	if (fp == NULL && (fp = fopen (pad, "r")) == NULL) {
		silkie = 0;
		return (-5);
	}
	if (bits < 0) {
		if (fseek (fp, (long)abs(bits), 1) == 0)
			return (-4);
		(void) fclose (fp);
		fp = NULL;
		silkie = 0;
		return (-3);
	}
	if ((bits &= INTWID-1) == 0) {
		silkie = !silkie;
		return (-2);
	}
	for (; --bits >= 0; silkie = !silkie) {
		if ((chr = fgetc (fp)) == EOF) {
			(void) fclose (fp);
			fp = NULL;
			silkie = 0;
			return (-1);
		}
		chr += silkie;
		rand <<= 1;
		rand += chr&1;
	}
	return (rand);
}
-- 
        Istvan Mohos
        ...uunet!pyrdc!pyrnj!hhb!istvan
        1000 Wyckoff Ave. Mahwah NJ 07430 201-848-8000
======================================================

koning@koning.enet.dec.com (Paul Koning) (09/14/90)

|>...
|> The C source of the padrand() routine posted here, is hereby placed in
|> the Public Domain.  A primitive driver (main) is enclosed for convenient
|> testing.  The verbal description of the algorithm immediately below, is
|> "Copyright 1990, Istvan Mohos, All Rights Reserved".
|> ...
|> Although the text of one-time pads is non-cyclic, the byte stream is
|> subject to regularities of character distribution as the function of the
|> language.  
|> ...
|> The program is somewhat wasteful of pad text, and is intended for Unix
|> environments where on-line text is abundant (as evidenced by directories
|> /usr/dict, /usr/man, ~TeX/TeXdoc and the like) but hardware random
|> number generators are absent.  

It seems to me that you have missed the one most crucial part of the
definition of "one time pad": not only must the one-time pad be non-cyclic,
but the individual bytes must be random.

When you're talking about using text files as a source of key data, you
aren't describing a one-time pad at all.  Instead, what you have is a
"book code" or "running key cypher".  Those are easy to solve; the method
for doing so dates back to the 19th century.  (See D. Kahn, "The Codebreakers")

	paul

istvan@hhb.UUCP (Istvan Mohos) (09/18/90)

My efforts to provide a "well rounded" functionality with my earlier
posting of padrand(), nevertheless neglected a segment of readership
keen on speed and on doing things with inter-process pipes.  By way of
atonement, I'm posting a second, stand-alone implementation of padrand
(padrand.c).  The routine can be made a lot faster still, by changing
to buffered I/O and by swapping the inner for-loop for an expanded
block using a fixed "bit-width" parameter.

==============================CUT HERE===============================
/************************************************
* padrand.c --- random numbers from one-time pads
* Istvan Mohos, 1990 --- in the Public Domain
*************************************************/

#include <stdio.h>
#ifdef RAW_INT
#define OUTPUT write(1,(char*)(&rand),sizeof(int))
#else
#define OUTPUT printf("%d\n",rand)
#endif

main (argc, argv)
int argc;
char *argv[];
{
	register int bits, rand, silkie = 0;
	register char *bp, *end;
	char buf[sizeof(int)<<3];

	if (argc != 2)
		fprintf(stderr, "Usage: %s bits\n", argv[0]), exit(1);
	if ((bits = abs(atoi(argv[1]))) > (sizeof(int)<<3) || !bits)
		fprintf(stderr, "Maximum bits %d, minimum 1\n", sizeof(int)<<3),
			exit(1);
	for (; read(0, buf, bits) == bits; OUTPUT)
		for (rand = 0, bp = buf, end = bp + bits; bp < end; bp++)
			rand <<= 1, rand += (*bp + silkie)&1, silkie = !silkie;
	exit (0);
}
==============================CUT HERE===============================

The next two paragraphs are a continuation of the original description
of the one-time pad method of random number generation, and are
"Copyright 1990, Istvan Mohos, All Rights Reserved".

   Just as with encryption, a caveat may be in order: the warning that
   one-time pads not be monotone is not to be taken lightly.  The track
   record of one-time pad security may lull one into believing that the
   method is forgiving of minor breaches in the ground rules.
   Surrounded by mountainous ballasts of idle source code, the average
   programmer may even strive to be convinced that source files are
   suitable for one-time pads, to be able to bring otherwise static data
   back into play.  And yet observing that C text lines inevitably start
   with spaces or tabs, the code breaker could blow cyphertext encrypted
   with C code "chock-full-o-cribs" on a first attempt by globally
   XOR-ing with spaces or tabs, and in addition to clearing parts of the
   plaintext gain significant insights about the the key.
   
   At least with padrand, monotone pads only destroy the perfect
   distribution of random numbers in the output.  Still, it is best to
   strip pad text of redundancy.  Run the pad through compress for
   example, or (consider it as an incentive for saving space!) keep
   entire /pub or /src directories compressed.

And since this gives you FAST TRUE RANDOM numbers in SOFTWARE, don't
let me catch anyone manufacturing pseudo-random numbers again!  :-)
-- 
        Istvan Mohos
        ...uunet!pyrdc!pyrnj!hhb!istvan
        1000 Wyckoff Ave. Mahwah NJ 07430 201-848-8000
======================================================