[comp.sources.misc] v10i086: Names2, a random names generator

jrk@sys.uea.ac.uk (Richard Kennaway) (02/27/90)

Posting-number: Volume 10, Issue 86
Submitted-by: jrk@sys.uea.ac.uk (Richard Kennaway)
Archive-name: names2/part01

This is names2.c, a program which generates random names for FRP
characters or placenames.  It is a development of names.c, which I posted
to comp.sources.misc in July 1989.

The posting is in two parts.  This is part 1, containing the program;
part 2 contains the data files.

For those who saw the earlier version, this is what's new:

-  some data files are included to get you started (Sindarin, German,
Chinese, and Chaucerian English).

-  you can specify which characters in the input should be considered
"letters"; for example, you can have it recognise punctuation marks and
accents, or apply it to Greek or Cyrillic text (provided each character
is represented by a single byte).

-  the internal data structures are much smaller, at the cost of taking
longer to analyse the input and begin generating names.
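
To illustrate the second item above, here is a minimal sketch of how a
configurable "letter" set can be held in a 256-entry table, one entry per
byte value.  It is written in ANSI C for brevity and is not the code from
names2.c; the names init_letters, set_letters and count_words are made up
for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NCHARS 256   /* one table entry per possible byte value */

/* Start with the default letter set: a-z and A-Z (assumes ASCII). */
void init_letters(unsigned char letters[NCHARS])
{
    int c;
    for (c = 0; c < NCHARS; c++) letters[c] = 0;
    for (c = 'a'; c <= 'z'; c++) letters[c] = 1;
    for (c = 'A'; c <= 'Z'; c++) letters[c] = 1;
}

/* Mark every byte of s as a letter (accept = 1) or a non-letter (accept = 0). */
void set_letters(unsigned char letters[NCHARS], const char *s, int accept)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) letters[(unsigned char)*s] = (unsigned char)accept;
}

/* Count words: maximal runs of letters, separated by runs of non-letters. */
int count_words(const unsigned char letters[NCHARS], const char *text)
{
    int words = 0, in_word = 0;
    assert(text != NULL);
    for (; *text != '\0'; text++) {
        if (letters[(unsigned char)*text]) {
            if (!in_word) { words++; in_word = 1; }
        } else {
            in_word = 0;
        }
    }
    return words;
}
```

With the default set, the text "l'ami d'or" splits into four words; once
the apostrophe is added as a letter it splits into two.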

For those who didn't see the earlier version: unlike all similar programs
I've seen, names2 will generate output to match any language you like.
Feed it with text in that language, and it will generate words
statistically similar to the input text.
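
The idea can be sketched in a few lines of ANSI C.  This is a simplified
illustration, not the table layout used in names2.c: it handles only the
letters a-z, and each letter depends on the two preceding ones (the
behaviour of the -3 option) rather than three.  The names reset_model,
train and generate_word, and the small built-in random generator, are
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NSYM 27   /* index 0 is the word break, 1..26 are a..z */

static unsigned counts[NSYM][NSYM][NSYM];   /* counts[a][b][c]: c followed "ab" */

/* Map a byte to a symbol index; anything but a-z / A-Z is a word break. */
static int sym(int ch)
{
    if (ch >= 'a' && ch <= 'z') return ch - 'a' + 1;
    if (ch >= 'A' && ch <= 'Z') return ch - 'A' + 1;
    return 0;
}

/* Forget all statistics. */
void reset_model(void)
{
    memset(counts, 0, sizeof counts);
}

/* Add the trigraph counts of every word in text to the model. */
void train(const char *text)
{
    int a = 0, b = 0, c;
    assert(text != NULL);
    for (;; text++) {
        c = sym((unsigned char)*text);
        if (c != 0 || b != 0) counts[a][b][c]++;   /* skip repeated breaks */
        if (c == 0) { a = 0; b = 0; } else { a = b; b = c; }
        if (*text == '\0') break;                  /* the NUL ends the last word */
    }
}

/* Small linear congruential generator, so a run can be repeated from a seed.
   Two 15-bit draws are combined into one 30-bit value. */
static unsigned long next_random(unsigned long *state)
{
    unsigned long hi, lo;
    *state = (*state * 1103515245UL + 12345UL) & 0x7fffffffUL;
    hi = *state >> 16;
    *state = (*state * 1103515245UL + 12345UL) & 0x7fffffffUL;
    lo = *state >> 16;
    return (hi << 15) | lo;
}

/* Write one random word into out (capacity cap, including the NUL).
   Returns its length; 0 means the model is empty. */
size_t generate_word(char *out, size_t cap, unsigned long *state)
{
    int a = 0, b = 0, c;
    size_t len = 0;
    assert(out != NULL && cap > 0 && state != NULL);
    while (len + 1 < cap) {
        unsigned long total = 0, r;
        for (c = 0; c < NSYM; c++) total += counts[a][b][c];
        if (total == 0) break;                     /* context never seen */
        r = next_random(state) % total;
        for (c = 0; r >= counts[a][b][c]; c++) r -= counts[a][b][c];
        if (c == 0) break;                         /* drew the word break */
        out[len++] = (char)('a' + c - 1);
        a = b;  b = c;
    }
    out[len] = '\0';
    return len;
}
```

Trained on nothing but "the cat sat on the mat", this sketch can only
reproduce the input words, because every pair of letters there has exactly
one possible successor.  That is the small-input effect described in the
manual entry; with a few thousand bytes of input the contexts start to
branch and new words appear.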

It runs on a Macintosh (if you have MPW C) and Unix.

For further information and examples, see the manual entry (near the
beginning of the shar archive).

The program is public domain.  Share and enjoy.

--
Richard Kennaway          SYS, University of East Anglia, Norwich, U.K.
Internet:  jrk@uk.ac.uea.sys                uucp:  ...mcvax!ukc!uea-sys!jrk

#!/bin/sh
echo x - MANIFEST1
sed 's/^X//' >MANIFEST1 <<'*-*-END-of-MANIFEST1-*-*'
XMANIFEST1		This file.
Xnames2.1		The manual entry.
Xnames2.c		The source.
XNames2.make		The Macintosh makefile.  (See comment at the
X			beginning before using it.)
XMakefile.unix		The UNIX makefile.
*-*-END-of-MANIFEST1-*-*
echo x - names2.1
sed 's/^X//' >names2.1 <<'*-*-END-of-names2.1-*-*'
X.TH NAMES2 1 "January 1990"
X.UC
X.SH NAME
Xnames2 \- generate random names \- version 2
X.SH SYNOPSIS
X.B names2
X[
X.B \-3
X] [
X.BR \-w\ |\ \-s
X] [
X.B \-c
X] [
X.BI \-a xxxx
X] [
X.BI \-d xxxx
X] [
X.BI \-l nnn
X] [
X.BI \-r nnn
X] [
X.I files
X]
X.SH DESCRIPTION
X.I Names2
Xis a random name generator. It will read text from standard input or from
Xfiles given on the command line, and generate a random stream of words
Xwhose statistical characteristics are similar to those of the input. Thus
Xif you give it a database of Elvish names, it will generate Elvish-like
Xnames; if a database of Orcish names, it generates Orc-like names, etc.
X.PP
XIt does this by counting the frequency of all 1-, 2-, 3-, and 4-character
Xsequences of letters or spaces in the input. Case is ignored by default,
Xand all runs of non-letters are seen as single spaces. The first character
Xto be output, say "r", is generated according to the relative frequencies
Xwith which each character was found to follow a space in the input. The
Xsecond, say "o", is generated according to the relative frequencies with
Xwhich each character can appear following the digraph " r". The third, say
X"l" is generated according to the relative frequencies with which each
Xcharacter follows the trigraph " ro", and thereafter each character is
Xgenerated according to the frequencies with which the different possible
Xcharacters follow the preceding three.
X.PP
XBy default, the letters are the characters a-z and A-Z, with case
Xdifferences ignored. There are options letting you specify that case is
Xsignificant, or to add or remove characters from the set of "letters".
XThus you can use it to generate names in Greek, Cyrillic, Japanese
Xkana, etc. provided the input text is encoded in some way with one byte
Xper character.
X.PP
XThe larger the input, the better. You need at least a few thousand bytes
Xof input for good results. If the input is not large enough, you will
Xtend to get words from the input appearing verbatim in the output, as much
Xof the time three consecutive characters will uniquely determine the next
Xcharacter. (To see an extreme form of this, try running it on the text
X"the cat sat on the mat".) If more input of the desired form is not
Xavailable, the program can be made to use a third-order approximation
Xinstead, each character of the output depending only on the two preceding
Xcharacters.
X.PP
XThe output is wrapped to 76 chars maximum, hyphenating any word that has
Xto be broken over a line-end.
X.PP
X.I Names2
Xwill run on Unix, and on a Macintosh as an MPW shell tool with MPW C
Xversion 3.
X.SH EXAMPLES
XFor your inspiration, here are some examples demonstrating its
Xversatility.  To generate Elvish names, feed it with the Sindarin words
Xfrom a Sindarin-English dictionary.  You get something like this:
X.PP
X.na
X.in +3
X.ll -3
X.I
Xthaiglin thoromirin mallorien orth girithrass bregalad imloth
X.I
Xmenel berhael cirion saur celebdil aradrif oroth ered eryd mindon
X.I
Xmirandros aer balrond adui narfingor emyn gaurnen sernui silien
X.I
Xglorn fuin celegost bladhren gil breth argonuiliath hinguruthon...
X.PP
X.ad
XAs you can see, not all the output is directly usable, but by exercising
Xsome selection you can obtain results like:
X.PP
X.na
X.in +3
X.ll -3
X.I
XTolfalad, Lothlain, Ossarnen, Malbarahir, Minarwen, Eredil,
X.I
XSuldor Belebrethand, Berielegor, Gaurgor, Mithron, Galadhril,
X.I
XSammathremmir, Erufinrod, Fangband, Turingamar, Ninuviel, Elwinion...
X.PP
X.ad
XPretty convincing, eh?  Yet none of these words actually occurred in the
Xinput file.  Using the output as inspiration, you can even construct
XElvish text.  Here is an extract from the tale of Suldor Belebrethand's
Xjourney through the orclands of Gaurgor to the foothills of Tolfalad, in
Xan ancient and little-known Sindarin dialect (thus forestalling any
Xcriticisms from smart-alecs who actually speak Elvish :-)):
X.PP
X.na
X.in +3
X.ll -3
X.I
X...Argirien emyn druadar lhun arach sarinan-duhirion Erufinrod.
X.I
XSuldor bas caracharon arad mor gwanui alfiriel, i los hin arant
X.I
Xdruadan-calad.  Til edras, "Barachaer bas gaurondon", orguldur...
X.PP
X.ad
XIn contrast, here's some output from the names in the index of a book on
Xearly mediaeval Germany.  I told names2 to consider '"' a letter, and
Xused it to represent a dieresis on the following vowel.  On a Macintosh
Xyou could use the accented characters themselves.
X.PP
X.na
X.in +3
X.ll -3
X.I
Xpassen chutizi albrem m"olden nordgard kunich sched salzburg
X.I
Xbaldwig capendhausengar dal vith assau gisela hildestins pr"ubeck
X.I
Xsclavaringau edgau rick hed albert alcuin ruodlingen hodo boleslas
X.I
Xmisti bert hrothard bold ekkarl wettinau tegenburg zevenstedt...
X.I
X.PP
X.ad
XJust the thing for Warhammer.  I can see it now: chaos tribes have crossed
Xthe Chutizi river and threaten the territory of Count Hrothard of the
XRuodling family, ruler of the town of Zevenstedt and province of
XNordgard, who sends his faithful servant Hodo to ask for help from King
XBaldwig in Wettinau; the party encounter the dying Hodo, waylaid by
Xruffians in the pay of evil Baron Ekkarl; also in the plot are Abbess
XHildestin of Pr"ubeck, Bishop Albrem of M"olden, the village of
XCapendhausen, and the city state of Tegenburg-Assau ... the scenario
Xwrites itself once you get the names right.
X.PP
XHere is some output from a list of names of characters in "The Story of
Xthe Stone", a Chinese novel of the 18th century, in Pinyin romanization:
X.PP
X.na
X.in +3
X.ll -3
X.I
Xxian weng rong shixilua yu lian lin liao wen shiyin hun qian
X.I
Xyun xue chuan bing xiaoqing lian yin xian wan jing wen wan tian
X.I
Xsiji song xue shen xiang deng langzhe ruhai xin xifei lan hun
X.I
Xyouang xi baojinxiao ziten xifei xue erji huan zhe xingyun jie...
X.PP
X.ad
XLastly, this is generated from English text, with all punctuation marks
Xrecognised as "letters" and upper and lower case distinguished:
X.PP
X.in +3
X.ll -3
X.I
Xof a Ch'huen as just is are your prime, stupidition. -- Alfred
X.I
Xin a mismal comple it in that fortune und fall Lighs take you
X.I
Xdoubtful fleeps. I that just Dented, somed formen. Sit chese
X.I
X.PP
XWith names2, who needs James Joyce?
X.SH OPTIONS
XWhen an option takes an argument, there must be no space between the
Xoption and the argument.  Options must be written separately, e.g. -3 -c,
Xnot -3c.
X.TP
X.B \-3
XUse trigraph frequencies instead of tetragraph frequencies.  Gives better
Xresults when input data is limited.  It is interesting to experiment with
Xthis option even if you have enough data to use tetragraphs.
X.TP
X.B \-axxxx
XAdd the characters in the string xxxx to the set of "letters".
X.TP
X.B \-c
XTreat letters in different cases as different.  By default they are not
Xdistinguished, and output is in lower-case.
X.TP
X.B \-dxxxx
XRemove the characters in the string xxxx from the set of "letters".
X\-a and \-d options are processed sequentially, and the \-c option, if
Xpresent, is applied last.
X.TP
X.B \-lnnn
XGenerate nnn lines of output.  Default is 20.  No space between the \-l
Xand the nnn.  If \-l is given with no argument, the output will go on
X(nearly) forever.
X.TP
X.B \-rnnn
XUse nnn as the seed for the random number generator.  As the value of the
Xseed is printed on stderr, this enables you to reproduce the output.  By
Xdefault, the value of the seconds clock is used.
X.TP
X.B \-s
XNegation of
X.B \-w
Xoption.
XThe first character of each word will depend on the last three characters
Xgenerated (i.e. the last two characters of the preceding word, and the
Xinter-word space).
X.TP
X.B \-w
XNegation of
X.B \-s
Xoption.
XGenerate successive words independently, i.e. each word begins as if it
Xwas the beginning of the whole output, ignoring how the preceding word
Xended.  (Default.)
X.SH DIAGNOSTICS
X.I Names2
Xgives a usage message if the arguments are bad.  Exits with status 0 if all
Xwent well.  Exits with status 1 if there were bad arguments (other than
Xnon-existent files), or insufficient memory.  No names are generated.
XOtherwise, exits with status 2 if any files were not found (however, it
Xwill read all the files it could find and generate names).
X.PP
XWrites to stderr a count of the number of different "letter" characters, a
Xcount of the characters read (i.e. letters and runs of non-letters), and
Xthe seed for the random number generator.
X.PP
XIf compiled with SHOWTABLE defined, it dumps the tables to standard output
Xbefore the random names.
X.SH CHANGES SINCE PREVIOUS VERSION
XAdded -a, -c, -d, -r options.  Vastly improved memory allocation.  Ported
Xto MPW version 3.
X.SH BUGS
XThe ignoring of case only applies to the characters a-z and A-Z, not to
Xthe accented letters and ligatures in the Macintosh character set. If you
Xwant to accept all the extra characters and ignore case differences, you
Xwill need to preprocess your input to map, say, A-dieresis into
Xa-dieresis, OE to oe, etc.
X.SH FURTHER IDEAS
XThere is still some room for improvement in the efficiency of
Xrepresentation of the tables. The space required is approximately four
Xtimes the size of the input, plus eight times the square of the size of
Xthe alphabet. With the -3 option, it would be possible to compress the
Xtables by half, but this is not done - the presence of the -3 option makes
Xno difference to the amount of memory used.
X.PP
XArrange to write the tables to a file and read them in again, to avoid
Xhaving to reconstruct them every time you run the program on the same
Xinput.
X.PP
XThe enthusiastic may want to convert the program to run as a stand-alone
Xapplication on the Macintosh.
X.SH ACKNOWLEDGEMENTS
XThe distribution includes several word-lists from various sources: the
XSindarin dictionary contained in "An Introduction to Elvish", edited by
XJim Allan, published by Bran's Head Books Ltd, 91 Wimborne Avenue, Hayes,
XMiddlesex, U.K., 1978 (but I hear they've gone bust, so that address may
Xnot be any use); the personal and place names from the index of "Rule and
XConflict in an early Medieval Society", by Karl Leyser (Basil Blackwell,
XOxford, U.K., 1989); the names from the index of characters of "The Story
Xof the Stone", by Cao Xueqin (trans. David Hawkes, 3 volumes, Penguin
X1973); and some text from Chaucer.
X.PP
XThe Chinese file is rather short; the example above was produced with
Xthe -3 option.
X.SH AUTHOR
XRichard Kennaway.
X.TP
Xjrk@sys.uea.ac.uk (INTERNET), ...mcvax!uea-sys!jrk (UUCP).
X.TP
XThis program is public domain.
*-*-END-of-names2.1-*-*
echo x - names2.c
sed 's/^X//' >names2.c <<'*-*-END-of-names2.c-*-*'
X/* names2.c */
X/* Random name generator */
X
X/* Richard Kennaway */
X/* INTERNET:	jrk@uk.ac.uea.sys */
X/* UUCP:	...mcvax!uea-sys!jrk */
X
X/* Public domain! */
X/* August 1989:   First version. */
X/* January 1990:  Ported to MPW3.
X		  Removed some untidiness (lint warnings).
X		  Print randseed to stderr and take randseed as option
X		  to allow reproducibility.
X		  Better representation of tetragraph table.
X		  Ability to specify character set.  */
X
X
X#define FALSE  0
X#define TRUE   1
X
X/* Choose one... */
X#define UNIX   TRUE   /* Version for Unix */
X#define MPW    FALSE    /* Version for Apple Macintosh (MPW C) */
X
X/* If MPW is TRUE, define one of MPW2 or MPW3 as TRUE, the other as FALSE. */
X#define MPW2   FALSE   /* MPW version 2 */
X#define MPW3   FALSE    /* MPW version 3 */
X
X
X/* System declarations */
X
X#include <stdio.h>
X#if MPW
X#include <Memory.h>	/* For BlockMove(). */
X#include <QuickDraw.h>	/* For random numbers. */
X#include <OSUtils.h>	/* For GetDateTime(). */
X#endif
X
X#define EOFCHAR     (-1)
X
Xextern char *malloc();
X
X
X/* Compatibility */
X
Xtypedef char int8;
Xtypedef unsigned char uint8;
Xtypedef short int16;
Xtypedef unsigned short uint16;
Xtypedef unsigned long uint32;
Xtypedef long int32;
X
X#define MAXUINT8		((uint8) ((int8) (-1)))
X#define MAXUINT16		((uint16) ((int16) (-1)))
X#define MAXUINT32		((int32) (((uint32) (-1)) >> 1))	/* Largest int32: MaxLines is signed, so -1 would stop output at once. */
X#define NUMCHARS	256
X#define chartoint(c)	((int16)(uint8)(c))
X#define A_CHAR		chartoint('A')
X#define Z_CHAR		chartoint('Z')
X#define a_CHAR		chartoint('a')
X#define z_CHAR		chartoint('z')
X#define SPACE_CHAR	chartoint(' ')
X#define A_TO_a		(a_CHAR-A_CHAR)
X
X#if MPW2
X#define NEWLINECHAR     chartoint('\r')
X#endif
X#if UNIX || MPW3
X#define NEWLINECHAR     chartoint('\n')
X/* Note: the actual value of '\n' is different in UNIX and MPW3,
X and '\n' in MPW3 is the same as '\r' in MPW2. */
X#endif
X
X
X/* Where is the random number generator? */
X
X#if UNIX
Xtypedef char *Ptr;
X#define Boolean		int
X#define BlockMove	bcopy
Xint32 random();
X#define Random()	((int16) (random()))
X#endif
Xuint32 RandSeed;
X
X
X/* Globals. */
X
Xint Argc;
Xunsigned char **Argv;
Xint ExitStatus = 0;
XBoolean FileArgs = FALSE, Big = TRUE, SeparateWords = TRUE;
XBoolean CaseSignificant = FALSE, Letters[NUMCHARS];
Xint16 CurFile;
X
X
X/* Layout. */
X
X#define BREAK1		60
X#define BREAK2		75
Xint16 Column = 0;
Xint32 Lines = 0;
X#define DEFAULTMAXLINES		20
Xint32 MaxLines = DEFAULTMAXLINES;
X
X
X/* Tables */
X
Xint16 NumChars = 0;
X#define SPACEINDEX      0
Xint32 t2size, t3size, t4size;
X
X#define NOTCHOICE		MAXUINT16
X
Xint16 CharToIndex[NUMCHARS], IndexToChar[NUMCHARS];
X
Xint32	table0 = 0,
X	*table1 = NULL,
X	*table2 = NULL;
X
X#define BLOCKSIZE(n)		(sizeof(int32) + sizeof(int32) + (n)*sizeof(uint16))
X#define INITSIZE		10
X#define GROWNUM			5
X#define GROWDEN			4
X
Xtypedef struct DigraphBlock {
X	int32 size, maxsize;
X	uint16 data[1];
X} DigraphBlockRec, *DigraphBlockPtr;
X
XDigraphBlockPtr *quadtable = NULL;
X
X
X/* Sorting */
X
Xstatic void SortArray();
X
Xtypedef Boolean (*ComparisonProc)();
X
X
X/* Memory allocation */
X
Xchar *trymemory( bytesNeeded, mustGet )
Xint32 bytesNeeded;
XBoolean mustGet;
X{
Xchar *result;
X
X    result = (char *) malloc( bytesNeeded );
X    if ((result==NULL) && (mustGet)) {
X	fprintf( stderr, "Could not get %lu bytes - terminating.%c",
X		 bytesNeeded, NEWLINECHAR );
X	ExitStatus = 1;
X	exit( ExitStatus );
X    }
X    return( result );
X}  /* char *trymemory( bytesNeeded, mustGet ) */
X
Xvoid zero( start, numBytes )
Xchar *start;
Xint32 numBytes;
X{
X/* Your system may well have a faster way of zeroing memory. */
X/* In fact, the static arrays to which this procedure is applied */
X/* may be automatically initialised to zero already. */
X/* But portability would be impaired by assuming that. */
X
Xint32 remainder, i, num32bits;
X
X    remainder = numBytes % ((int32) sizeof(int32));
X    for (i=1; i <= remainder; i++) start[numBytes-i] = 0;
X    num32bits = numBytes / ((int32) sizeof(int32));
X    for (i=0; i<num32bits; i++) ((int32 *) start)[i] = 0;
X}  /* void zero( start, numBytes ) */
X
Xvoid getmemory()
X{
Xint32 i;
X
X    table1 = (int32 *) trymemory( NumChars * sizeof(int32), TRUE );
X    table2 = (int32 *) trymemory( t2size * sizeof(int32), TRUE );
X    quadtable = (DigraphBlockPtr *) trymemory( t2size * sizeof(DigraphBlockPtr), TRUE );
X
X    zero( (char *) table1, NumChars * sizeof(int32) );
X    zero( (char *) table2, t2size * sizeof(int32) );
X    for (i=0; i<t2size; i++) quadtable[i] = NULL;
X}  /* void getmemory() */
X
Xvoid freememory()
X{
X    if (table1 != NULL) free( (char *) table1 );
X    if (table2 != NULL) free( (char *) table2 );
X}  /* void freememory() */
X
X
X/* Preliminary setup */
X
Xvoid setchar( c, accept )
Xuint8 c;
XBoolean accept;
X{
X    Letters[c] = accept;
X    if (! CaseSignificant) {
X	if ((A_CHAR <= c) && (c <= Z_CHAR)) Letters[c + A_TO_a] = accept;
X	if ((a_CHAR <= c) && (c <= z_CHAR)) Letters[c - A_TO_a] = accept;
X    }
X}  /* void setchar( c, accept ) */
X
Xvoid setchars( s, accept )
Xuint8 *s;
XBoolean accept;
X{
Xint16 i;
Xuint8 c;
X
X    if (s==NULL) return;
X    i = 0;
X    while ((c = s[i++]) != 0) setchar( c, accept );
X}  /* void setchars( s, accept ) */
X
Xvoid maketranstable()
X{
Xint16 c;
X
X    for (c=0; c < NUMCHARS; c++) {
X	CharToIndex[(uint8)c] = SPACEINDEX;
X	IndexToChar[(uint8)c] = SPACE_CHAR;
X    }
X    NumChars = 1;
X    if (!CaseSignificant) {
X	for (c=a_CHAR; c<= z_CHAR; c++) {
X	    if (Letters[(uint8)(c - A_TO_a)] != Letters[(uint8)c]) {
X	    	Letters[(uint8)c] = TRUE;
X	    	Letters[(uint8)(c - A_TO_a)] = TRUE;
X	    }
X	}
X    }
X    for (c=0; c < NUMCHARS; c++) {
X	if (Letters[(uint8)c] && (CaseSignificant || (c < A_CHAR) || (Z_CHAR < c))) {
X	    CharToIndex[(uint8)c] = NumChars;
X	    IndexToChar[(uint8)NumChars] = c;
X	    NumChars++;
X	}
X    }
X    if (!CaseSignificant) {
X	for (c=a_CHAR; c<= z_CHAR; c++) {
X	    CharToIndex[(uint8)(c - A_TO_a)] = CharToIndex[(uint8)c];
X	}
X    }
X    IndexToChar[(uint8)SPACEINDEX] = SPACE_CHAR;
X
X    t2size = NumChars*NumChars;
X    t3size = t2size*NumChars;
X    t4size = t2size*t2size;
X}  /* void maketranstable() */
X
X
X/* Input */
X
XBoolean openfile()
X{
XFILE *temp;
X
X    temp = freopen( Argv[CurFile], "r", stdin );
X    if (temp == NULL) {
X	fprintf( stderr, "%s: could not open file \"%s\"%c",
X	    Argv[0], Argv[CurFile], NEWLINECHAR );
X	ExitStatus = 2;
X    }
X    return( temp != NULL );
X}  /* Boolean openfile() */
X
XBoolean getnextfile()
X{
X    while (((++CurFile) < Argc) && (! openfile())) { /* nothing */ }
X    return( CurFile < Argc );
X}  /* Boolean getnextfile() */
X
Xint16 getrawchar()
X{
Xint16 c;
X    c = getchar();
X    while ((c==EOFCHAR) && getnextfile()) {
X	c = getchar();
X    }
X    return(c);
X}  /* int16 getrawchar() */
X
X#define WASSPACE    0
X#define WASNONSPACE 1
X#define END         2
Xint16 Where = WASSPACE;
X
Xint16 nextchar()
X{
Xint16 c;
X
X    switch (Where) {
X	case WASSPACE:
X	    while (((c = getrawchar()) != EOFCHAR) &&
X		   (!Letters[(uint8)c])) {
X		/* nothing */
X	    }
X	    if (c==EOFCHAR) {
X		Where = END;
X		return(-1);
X	    } else {
X		Where = WASNONSPACE;
X		return(CharToIndex[(uint8)c]);
X	    }
X	case WASNONSPACE:
X	    c = getrawchar();
X	    if (c==EOFCHAR) {
X		Where = END;
X		return(SPACEINDEX);
X	    } else if (Letters[(uint8)c]) {
X		return(CharToIndex[(uint8)c]);
X	    } else {
X		Where = WASSPACE;
X		return(SPACEINDEX);
X	    }
X	case END:
X	    return(-1);
X    }
X    return(-1);	/* Never happens. */
X}  /* int16 nextchar() */
X
XDigraphBlockPtr NewBlock( size )
Xint32 size;
X{
XDigraphBlockPtr temp;
X    temp = (DigraphBlockPtr) malloc( BLOCKSIZE(size) );
X    return( temp );
X}  /* DigraphBlockPtr NewBlock( size ) */
X
XBoolean insertdigraph( t, cd )
XDigraphBlockPtr *t;
Xuint16 cd;
X{
XDigraphBlockPtr temp;
Xint32 newSize;
X
X    if (t==NULL) return( FALSE );
X    if (((*t)==NULL) || ((*t)->size >= (*t)->maxsize)) {
X	newSize = (*t)==NULL ? INITSIZE : ((*t)->size * GROWNUM)/GROWDEN;
X	temp = NewBlock( newSize );
X	if (temp==NULL) return( FALSE );
X	if ((*t)==NULL) {
X	    temp->size = 1;
X	} else {
X	    BlockMove( (Ptr) (*t), (Ptr) temp, BLOCKSIZE((*t)->size) );
X	    temp->size = (*t)->size + 1;
X	    free( (char *) (*t) );
X	}
X	temp->maxsize = newSize;
X	*t = temp;
X	(*t)->data[(*t)->size-1] = cd;
X    } else {
X    	(*t)->data[(*t)->size++] = cd;
X    }
X    return( TRUE );
X}  /* Boolean insertdigraph( t, cd ) */
X
Xint16 AA = 0, BB = 0, CC = 0;
X
Xvoid entergroup( d )
Xint16 d;
X{
Xint32 ab, cd;
X
X    ab = AA*NumChars + BB;
X    cd = CC*NUMCHARS + d;
X    if (table2[ab] < MAXUINT16) {
X	if (insertdigraph( &(quadtable[ab]), (uint16) cd )) {
X	    table0++;
X	    table1[d]++;
X	    table2[ab]++;
X	}
X    }
X    AA = BB;  BB = CC;  CC = d;
X}  /* void entergroup( d ) */
X
Xvoid buildtable()
X{
Xint16 a0, b0, c0, d;
X
X    a0 = nextchar();
X    if (a0==SPACEINDEX) a0 = nextchar();
X    b0 = nextchar();
X    c0 = nextchar();
X    if (c0 == -1) return;
X    AA = a0;  BB = b0;  CC = c0;
X    while ((d = nextchar()) != (-1)) {
X	entergroup( d );
X    }
X    if (CC==SPACEINDEX) {
X	entergroup( a0 );
X	entergroup( b0 );
X	entergroup( c0 );
X    } else {
X	entergroup( SPACEINDEX );
X	entergroup( a0 );
X	entergroup( b0 );
X	entergroup( c0 );
X    }
X}  /* void buildtable() */
X
X
X#ifdef SHOWTABLE
X
X/* Dump the tables. */
X/* Only called if SHOWTABLE is defined at compile time. */
X
Xvoid showtable()
X{
Xuint8 i, j, k;
Xint32 *t2;
Xuint8 *t4;
X
X    for (i=0; i<NumChars; i++) if (table1[i] != 0) {
X	printf( "%c\t%lu%c", IndexToChar[i], table1[i], NEWLINECHAR );
X	t2 = table2 + i*NumChars;
X	for (j=0; j<NumChars; j++) if (t2[j] != 0) {
X	    printf( "%c%c\t%lu", IndexToChar[i], IndexToChar[j], t2[j] );
X	    t4 = (uint8 *) (quadtable[i*NumChars + j]->data);
X	    for (k=0; k<t2[j]; k++) {
X		if ((k%20==0) && (k>0)) { putchar( NEWLINECHAR );  putchar( '\t' ); }
X		putchar( ' ' );
X		putchar( IndexToChar[ (uint8)(t4[k+k]) ] );
X		putchar( IndexToChar[ (uint8)(t4[k+k+1]) ] );
X	    }
X	    putchar( NEWLINECHAR );
X	}
X	putchar( NEWLINECHAR );
X    }
X    fflush( stdout );
X}  /* void showtable() */
X
X#endif
X
X
X/* Generation of output */
X
Xuint16 Rand16()
X{
X    return( (uint16) Random() );
X}  /* uint16 Rand16() */
X
Xint32 randint( max )
Xint32 max;
X{
X    if (max==0) return( 0 );
X    if (max <= MAXUINT16) return( ((int32) Rand16())%max );
X    return( ((((int32) Random()) << 16) + ((int32) Random())) % max );
X}  /* int32 randint( max ) */
X
Xuint16 randchoice32( tot, dist )
Xint32 tot;
Xint32 *dist;
X{
Xint32 i;
Xuint8 j;
X
X    if (tot==0) return(NOTCHOICE);
X    i = randint( tot );
X    for (j=0; j<NumChars; j++) {
X    	if (i < dist[j]) return(j);
X	i -= dist[j];
X    }
X    return( NOTCHOICE );	/* Should never happen. */
X}  /* uint16 randchoice32( tot, dist ) */
X
Xcleanupquads()
X{
Xint32 i;
X
X    for (i=0; i<t2size; i++) if (table2[i] > 0) {
X	SortArray( quadtable[i]->data, quadtable[i]->size );
X    }
X}  /* cleanupquads() */
X
Xuint16 randtrip( a, b )
Xuint8 a, b;
X{
Xuint16 aNb;
Xint32 t2;
Xuint8 *t4;
Xint32 r;
X
X    aNb = a*NumChars+b;
X    t2 = table2[ aNb ];
X    t4 = (uint8 *) (quadtable[ aNb ]->data);
X    r = randint( t2 );
X    return( (uint16) (t4[r+r]) );
X}  /* uint16 randtrip( a, b ) */
X
Xuint16 randquad( a, b, c )
Xuint8 a, b, c;
X{
Xuint16 aNb;
Xint32 t2;
Xuint8 *t4;
Xint32 lo, hi, i, r;
X
X    aNb = a*NumChars+b;
X    t2 = table2[ aNb ];
X    t4 = (uint8 *) (quadtable[ aNb ]->data);
X    lo = 0;
X    hi = 0;
X    for (i=0; i<t2; i++) {
X    	if (t4[i+i] <= c){
X	    hi++;
X    	    if (t4[i+i] < c) lo++;
X	} else break;
X    }
X    if (lo >= hi) {
X    	/* This should never happen. */
X    	return( NOTCHOICE );
X    }
X    r = lo + randint( hi-lo );
X    return( (uint16) (t4[r+r+1]) );
X}  /* uint16 randquad( a, b, c ) */
X
Xvoid outchar( c )
Xint16 c;
X{
X    if (Column < BREAK1) {
X	putchar(c);  Column++;
X    } else if (c==chartoint(' ')) {
X	putchar( NEWLINECHAR );
X	Column = 0;  Lines++;
X    } else if (Column >= BREAK2) {
X	putchar('-');  putchar( NEWLINECHAR );
X	Column = 0;  Lines++;
X	if (Lines < MaxLines) {
X	    putchar(c);  Column++;
X	}
X    } else {
X	putchar(c);  Column++;
X    }
X}  /* void outchar( c ) */
X
Xvoid generateword()
X{
Xuint16 a, b, c, d;
X
X    a = (uint16)SPACEINDEX;
X    b = randchoice32( (int32) (table1[a]), table2 + a*NumChars );
X    if (b==NOTCHOICE) return;
X    outchar( IndexToChar[(uint8)b] );
X    if (SeparateWords && (b==SPACEINDEX)) return;
X
X    c = randtrip( (uint16)SPACEINDEX, (uint8)b );
X    outchar( IndexToChar[(uint8)c] );
X    if (SeparateWords && (c==(uint16)SPACEINDEX)) return;
X
X    while (Lines < MaxLines) {
X    	d = Big ? randquad( (uint8)a, (uint8)b, (uint8)c )
X		: randtrip( (uint8)b, (uint8)c );
X	if (d==NOTCHOICE) {
X	    outchar( '.' );
X	    return;
X	}
X	outchar( IndexToChar[(uint8)d] );
X	if (SeparateWords && (d==(uint16)SPACEINDEX)) return;
X	a = b;  b = c;  c = d;
X    }
X}  /* void generateword() */
X
Xvoid generate()
X{
X    if (table0 > 0) while (Lines < MaxLines) generateword();
X}  /* void generate() */
X
X
X/* Argument parsing */
X
Xvoid usageerror()
X{
X    fprintf( stderr, "Usage: %s [-3] [-s|-w] [-c] [-axxxx] [-dxxxx] [-lnnn] [-rnnn] [file]%c",
X	Argv[0], NEWLINECHAR );
X    fprintf( stderr, "\t-3: 3rd-order statistics.%c",
X	NEWLINECHAR );
X    fprintf( stderr, "\t-w: Successive words are independent (default).%c",
X	NEWLINECHAR );
X    fprintf( stderr, "\t-s: (Sentences) Successive words are dependent.%c",
X	NEWLINECHAR );
X    fprintf( stderr, "\t-c: Treat case differences as significant.%c",
X	NEWLINECHAR );
X    fprintf( stderr, "\t-axxxx: Accept characters in string \"xxxx\" as 'letters'.%c",
X	NEWLINECHAR );
X    fprintf( stderr, "\t-dxxxx: Treat characters in string \"xxxx\" as spaces.%c",
X	NEWLINECHAR );
X    fprintf( stderr, "\tSuccessive -a/-d options are processed sequentially.%c",
X	NEWLINECHAR );
X    fprintf( stderr, "\t-lnnn: Generate nnn lines of output (default %d).%c",
X	DEFAULTMAXLINES, NEWLINECHAR );
X    fprintf( stderr, "\t-rnnn: Specify random generator seed.%c",
X	NEWLINECHAR );
X    ExitStatus = 1;
X    exit( ExitStatus );
X}  /* void usageerror() */
X
Xvoid processoptions()
X{
Xint16 i;
X
X/* getopt()?  What's that? :-) */
X
X    CaseSignificant = FALSE;
X    for (i=0; i<NUMCHARS; i++) Letters[(uint8)i] = FALSE;
X    setchars( (uint8 *) "abcdefghijklmnopqrstuvwxyz", TRUE );
X
X    CurFile = Argc;
X    for (i=1; i<Argc; i++) {
X	if (Argv[i][0] == '-') {
X	    switch (Argv[i][1]) {
X		case 's':
X		    SeparateWords = FALSE;
X		    break;
X		case 'w':
X		    SeparateWords = TRUE;
X		    break;
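X		case 'a':
X		    /* Skip the "-a" itself; add the rest to the letter set. */
X		    setchars( &(Argv[i][2]), TRUE );
X		    break;
X		case 'd':
X		    /* Skip the "-d" itself; remove the rest from the letter set. */
X		    setchars( &(Argv[i][2]), FALSE );
X		    break;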
X		case 'c':
X		    CaseSignificant = TRUE;
X		    break;
X		case '3':
X		    Big = FALSE;
X		    break;
X		case 'l':
X		    if (Argv[i][2]==0) {
X			MaxLines = MAXUINT32;
X		    } else if ((sscanf( &(Argv[i][2]), "%lu", &MaxLines ) != 1) ||
X			(MaxLines < 0)) {
X			usageerror();  /* exits */
X		    }
X		    break;
X		case 'r':
X		    if ((Argv[i][2] != 0) &&
X			(sscanf( &(Argv[i][2]), "%lu", &RandSeed ) != 1)) {
X			usageerror();  /* exits */
X		    }
X		    break;
X		default:
X		    usageerror();  /* exits */
X	    }
X	} else if (Argv[i][0] == 0) {
X	    FileArgs = FALSE;
X	} else {
X	    FileArgs = TRUE;
X	    CurFile = i-1;
X	    (void) getnextfile();
X	    return;
X	}
X    }
X}  /* void processoptions() */
X
X
X/* Control */
X
X#if UNIX
Xcleanup( status, ignore )
Xint status;
Xchar *ignore;
X#endif
X#if MPW
Xvoid cleanup( status )
Xint status;
X#endif
X{
X    freememory();
X}  /* cleanup( status, ignore ) */
X
Xvoid SeedRand()
X{
X#if MPW
X    qd.randSeed = (int32) RandSeed;
X#endif
X#if UNIX
X    srandom( RandSeed );
X#endif
X}  /* void SeedRand() */
X
Xmain( argc, argv )
Xint argc;
Xuint8 **argv;
X{
X    Argc = argc;  Argv = argv;
X
X#if MPW
X    InitGraf( &(qd.thePort) );  /* for random numbers */
X    GetDateTime( &RandSeed );
X#endif
X#if UNIX
X    RandSeed = time(0);
X    on_exit( cleanup, NULL );		/* Probably not necessary. */
X#endif
X#if MPW2
X    onexit( cleanup );			/* Maybe necessary? */
X#endif
X
X    processoptions();
X
X    SeedRand();
X    maketranstable();
X    getmemory();
X    fprintf( stderr, "Reading input...%c", NEWLINECHAR );
X    buildtable();
X    fprintf( stderr, "%d different letters, %ld characters input.  Randseed = %lu%c",
X	NumChars-1, table0, RandSeed, NEWLINECHAR );
X    if (table0 > 0) {
X#ifdef SHOWTABLE
X	showtable();
X#endif
X	if (Big) cleanupquads();
X#ifdef SHOWTABLE
X	showtable();
X#endif
X	generate();
X	fflush( stdout );
X    }
X    exit( ExitStatus );
X}  /* main() */
X
X
X/* Heapsort. */
X
Xstatic uint16 *TheArray;
X
Xuint16 Temp;
X
X#define SWAPITEM( i, j )	\
X    Temp = TheArray[(i)];	\
X    TheArray[(i)] = TheArray[(j)];	\
X    TheArray[(j)] = Temp	\
X
Xstatic void MakeHeap( theElement, numElements )
Xint32 theElement, numElements;
X{
Xint32 left, right;
X
X    while ((left = theElement+theElement+1L) < numElements) {  
X	right = left+1L;
X	if (TheArray[theElement] < TheArray[left]) {
X	    if ((right < numElements) &&
X		(TheArray[left] < TheArray[right])) {
X		/* M<L<R */
X		SWAPITEM( theElement, right );
X		theElement = right;
X	    } else {  /* M<L, M<R<L, R<M<L */
X		SWAPITEM( theElement, left );
X		theElement = left;
X	    }
X	} else if ((right < numElements) &&
X		   (TheArray[theElement] < TheArray[right])) {
X	    /* L<M<R */
X	    SWAPITEM( theElement, right );
X	    theElement = right;
X	} else {
X	    /* L<M, L<R<M, R<L<M */
X	    break;
X	}
X    }
X}  /* static void MakeHeap( theElement, numElements ) */
X
Xstatic void SortArray( theArray, length )
Xuint16 *theArray;
Xint32 length;
X{
Xint32 i;
X
X    TheArray = theArray;
X    
X    for (i = (length / 2L) - 1L; i >= 0L; i--) {
X	MakeHeap( i, length );
X    }
X    for (i = length-1L; i >= 1L; i--) {
X	SWAPITEM( 0L, i );
X	MakeHeap( 0L, i );
X    }
X}  /* static void SortArray( theArray, length ) */
*-*-END-of-names2.c-*-*
echo x - Names2.make
sed 's/^X//' >Names2.make <<'*-*-END-of-Names2.make-*-*'
X#	Replace each backslash by option-d, and each colon by option-f.
X#	Then use the Build command on the Build menu to build Names2.
X
X#   File:       Names2.make
X#   Target:     Names2
X#   Sources:    names2.c
X#   Created:    Thursday, February 15, 1990 20:42:06 with MPW C version 3
X
Xnames2.c.o : Names2.make names2.c
X	 C  names2.c
X
XSOURCES = names2.c
XOBJECTS = names2.c.o
X
XNames2 :: Names2.make {OBJECTS}
X	Link -w -c 'MPS ' -t MPST \
X		{OBJECTS} \
X		"{Libraries}"stubs.o \
X		"{CLibraries}"CRuntime.o \
X		"{Libraries}"Interface.o \
X		"{CLibraries}"StdCLib.o \
X		"{CLibraries}"CSANELib.o \
X		"{CLibraries}"Math.o \
X		"{CLibraries}"CInterface.o \
X		"{Libraries}"ToolLibs.o \
X		-o Names2
*-*-END-of-Names2.make-*-*
echo x - Makefile.unix
sed 's/^X//' >Makefile.unix <<'*-*-END-of-Makefile.unix-*-*'
Xnames2 : names2.c
X	cc names2.c -o names2
*-*-END-of-Makefile.unix-*-*
exit