[comp.sources.unix] v11i028: Arabic numerals to multi-lingual natural language

rs@uunet.UUCP (09/02/87)

Submitted-by: mcvax!imag!scott (Scott Deerwester)
Posting-number: Volume 11, Issue 28
Archive-name: number

[  This is great!  The comment at the start of the source is almost as good.
   Beware of the "ln" calls this archive makes...  -r$  ]
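
The README in the archive below traces by hand how 23 is spelled with
a small Esperanto grammar.  As a rough illustration only, the same
steps can be sketched in modern C.  This is not code from number.c:
every sk_ name and the built-in rule table are inventions for this
sketch, which assumes rules made only of quoted strings, spaces, '/',
'%' and un-nested conditionals, and which ignores macros, the 'L'
operand and grammar files entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One grammar rule: a base, and the rule text that follows the tab. */
struct sk_rule { unsigned long base; const char *body; };

/* The README's Esperanto example grammar, in ascending order of base. */
static const struct sk_rule sk_rules[] = {
    {0, "\"zero\""}, {1, "\"unu\""}, {2, "\"du\""}, {3, "\"tri\""},
    {4, "\"kvar\""}, {5, "\"kvin\""}, {6, "\"ses\""}, {7, "\"sep\""},
    {8, "\"ok\""}, {9, "\"na[bre]u\""},
    {10, "(/ > 1\t/)\"dek\" %"},
    {100, "(/ > 1\t/) \"cent\" %"},
};
#define SK_NRULES ((int) (sizeof sk_rules / sizeof sk_rules[0]))

static char sk_out[256];        /* the spelled number accumulates here */
static void sk_spell(unsigned long n, int level);

/* Append c, dropping leading and doubled spaces much as outchar() does. */
static void sk_put(int c)
{
    size_t len = strlen(sk_out);

    if (c == ' ' && (len == 0 || sk_out[len - 1] == ' '))
        return;
    if (len + 1 < sizeof sk_out)
        sk_out[len] = (char) c; /* buffer was zeroed, so it stays terminated */
}

/* Value of one side of a conditional: '/', '%', '#', 'B' or a literal. */
static unsigned long sk_operand(const char *s, unsigned long base, unsigned long n)
{
    switch (*s) {
    case '/': return n / base;
    case '%': return n % base;
    case '#': return n;
    case 'B': return base;
    default:  return strtoul(s, NULL, 10);
    }
}

/* Evaluate the rule text in [p, end) for number n under the given base. */
static void sk_eval(const char *p, const char *end, unsigned long base,
                    unsigned long n, int level)
{
    while (p < end) {
        int c = *p++;
        if (c == '"') {                 /* quoted string: copy it out */
            while (p < end && *p != '"')
                sk_put(*p++);
            p++;                        /* step over the closing quote */
        } else if (c == '/' || c == '%') {
            assert(base != 0);          /* a base-0 rule must not divide */
            sk_spell(c == '/' ? n / base : n % base, level + 1);
        } else if (c == ' ') {
            sk_put(' ');
        } else if (c == '(') {          /* "(L C R<tab>subrule)", un-nested */
            char lhs[20], rhs[20], cmp;
            const char *sub = memchr(p, '\t', (size_t) (end - p));
            const char *rpar = memchr(p, ')', (size_t) (end - p));
            if (!sub || !rpar || rpar < sub ||
                sscanf(p, "%19s %c %19s", lhs, &cmp, rhs) != 3)
                return;                 /* malformed: abandon this rule */
            unsigned long l = sk_operand(lhs, base, n);
            unsigned long r = sk_operand(rhs, base, n);
            if ((cmp == '>' && l > r) || (cmp == '<' && l < r) ||
                (cmp == '=' && l == r) || (cmp == '~' && l != r))
                sk_eval(sub + 1, rpar, base, n, level);
            p = rpar + 1;
        }
    }
}

/* Pick the rule with the largest base not exceeding n, then evaluate it. */
static void sk_spell(unsigned long n, int level)
{
    int i = SK_NRULES - 1;
    if (n == 0 && level > 0)    /* a zero quotient or remainder is silent */
        return;
    while (i > 0 && sk_rules[i].base > n)
        i--;
    sk_eval(sk_rules[i].body, sk_rules[i].body + strlen(sk_rules[i].body),
            sk_rules[i].base, n, level);
}

/* Spell n into a static buffer and return it. */
const char *sk_number(unsigned long n)
{
    memset(sk_out, 0, sizeof sk_out);
    sk_spell(n, 0);
    return sk_out;
}
```

Calling sk_number(23) walks the rules in the order the README traces:
the base-10 rule is selected, its conditional spells the quotient,
"dek" is copied out, and the remainder is spelled last.  The sketch
looks for the first ')' to close a conditional, so a sub-rule with a
')' inside a quoted string (as in the Vietnamese grammar) would need
the quote-aware scan that next() performs in number.c.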

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	README
#	Makefile
#	number.6
#	number.c
#	lang
export PATH; PATH=/bin:$PATH
echo shar: extracting "'README'" '(6275 characters)'
if test -f 'README'
then
	echo shar: will not over-write existing file "'README'"
else
sed 's/^X//' << \SHAR_EOF > 'README'
X	::::	Description	::::
X
XThis version of number was inspired by the standard
Xversion in /usr/games, but with a big difference:  It
Xknows how to count in a *lot* of languages, and can
Xbe taught almost anything that it doesn't already know.
XThe information on how to count in a given language
Xis given in a grammar, written in a meta-language described
Xa little later.  Grammars exist, as of this moment, for:
X
XCantonese		French		Japanese	Spanish
XChinese (Mandarin)	Gaelic (Irish)	Malinke		Susu
XEnglish			German		Pular		Vietnamese
XEsperanto		Italian		Romanian
X
XIt takes about twenty minutes with a native speaker to
Xwrite a grammar, once you understand how to do so.  Instructions
Xon how to write a grammar are included at the end of this
Xfile.  It is also highly recommended that you study some
Xof the existing grammars before attempting your own.
X
X	::::	Installation	::::
X
X- Make a directory for this package, and cd to it.
X- Run 'sh' on the file in which you saved this shell
X  archive to extract all of the files in it.
X- Edit the Makefile to indicate where you want the
X  executable and the grammar files to be kept.
X
X** Important: THIS MUST BE DONE BEFORE YOU COMPILE NUMBER!
X
X- Type 'make install'
X
X	::::	Adding languages	::::
X
X** Note:  If all you want to do is use number, you can skip the
X	  rest of this file.
X
XThe grammar files used by number contain rules that
Xnumber uses to translate numbers into words.  The
Xrules are generally of the form:
X
Xn	rules
X
Xwhere 'n' is the "base" number, separated by a single tab from
Xthe rule per se.  The rules must be in ascending order of
Xbase number.  The basic algorithm that number uses to
Xtranslate a number to words is to find the rule with
Xthe largest base that does not exceed the number, and
Xevaluate the rule for that base, given the number.  For
Xexample, one might have the rule:
X
X5	"five"
X
Xwhich, if the number being translated were 5, would cause
Xthe rule "five" to be evaluated.  In this case, the rule
Xis a simple string, which would be printed.  Rules may
Xcontain:
X
X- Strings, delimited by double quotes, which are simply printed.
X- Spaces, which are printed as well.
X- A number, which causes the number to be spelled.
X- The following special characters, with the meanings:
X  /  Spell the number divided by the base.
X  %  Spell the number modulus the base.
X- A conditional, described below.
X- A single character macro.
X- A comma, which functions as a no-op.
X
XIn addition, grammars contain macro definitions, and may
Xcontain lines of comments, which have a '#' in the first
Xcolumn.  Rules may be continued across multiple lines by
Xterminating the line with a '\'.  The next line must
Xbegin with a tab character.
X
XConditionals have the following syntax:
X
X(L C R	rule)
X
Xwith a tab separating the 'R' from the rule.  L and R
Xare either numbers, or one of the characters:
X
X/	Number divided by base.
X%	Number modulus base.
XB	Base.
X#	Number.
XL	Recursion level.
X
XC is one of '<', '>', '=' or '~', meaning less than,
Xgreater than, equal to, or not equal to, respectively.
XThe following (common) rule, taken from the grammar
Xfor Esperanto, expresses the fact that for units
Xof 10, the number divided by 10 is only added before
Xthe word for 10 if it is greater than 1:
X
X10	(/ > 1	/)"dek" %
X
XThus, if 23 were being evaluated, the conditional would
Xcheck to see if 23 / 10 is greater than 1, and, since it
Xis, evaluate the rule '/' causing 2 to be spelled.  Given
Xthe following grammar:
X
X0	"zero"
X1	"unu"
X2	"du"
X3	"tri"
X4	"kvar"
X5	"kvin"
X6	"ses"
X7	"sep"
X8	"ok"
X9	"na[bre]u"
X10	(/ > 1	/)"dek" %
X100	(/ > 1	/) "cent" %
X
Xand the number 23, the following would happen:
X
X- Select rule for base 10, number 23, since 10 is the largest
X  base that is smaller than 23.
X  - Evaluate the conditional '(/ > 1	/)'.
X    - Evaluate / -> 23 / 10 -> 2.
X    - Evaluate 2 > 1 -> TRUE.
X    - Evaluate / -> 23 / 10 -> 2.
X      - Evaluate rule for base 2, number 2.
X	- Print "du".
X  - Print "dek".
X  - Print " ".
X  - Evaluate % -> 23 % 10 -> 3.
X    - Evaluate rule for base 3, number 3.
X      - Print "tri"
X
XThe net result is that "dudek tri" is printed.
X
X	::::	Macros	::::
X
XA macro is defined by the following syntax:
X
X/	c	rule
X
Xwhere c is any letter except L and B.  (Other characters
Xare allowed as well, but don't bump into special characters...)
XIn the grammar given above, for Esperanto, we could have
Xused a macro to simplify the rules for 10 and 100, thus:
X
X/	d	(/ > 1	/)
X10	d"dek" %
X100	d "cent" %
X
XThe following is about the most complicated piece of
Xcode in a grammar, and illustrates most of the features
Xof the grammar metalanguage.
X
X10	(/ > 1	/) "m'u~'"(/ = 1	"[`]")"'o~'i" a
X/	a	(/ > 1	(% = 1	"m[^']ot")(% > 1	c))(/ = 1	c)
X/	c	(% = 5	"l[)]am")(% ~ 5	%)
X100	...
X
XIt expresses the following:
X
X- For any number greater than or equal to 10, but less
X  than 100, first spell the number divided by ten, if
X  the quotient is greater than one.
X- Print the string "m'u~'".
X- If the quotient is 1, then print the string "[`]".
X- Print the string "'o~'i".
X- Expanding the macro 'a', if the quotient is greater
X  than 1:
X  - If the number modulus the base is 1, print the string "m[^']ot".
X  - If the modulus is greater than 1, expanding macro c:
X    - If the modulus is 5, print "l[)]am".
X    - If the modulus is not 5, evaluate the modulus.
X- If the quotient is 1, expanding macro c:
X  - If the modulus is 5, print "l[)]am".
X  - If the modulus is not 5, evaluate the modulus.
X
XThe essence of all of this is that if, in Vietnamese, something
Xcomes before the word "m'u~'[`]'o~'i", which means 10, it
Xloses its tone ("[`]").  Further, the word for 1 (normally
X"m[^.]ot") changes its tone in some circumstances, and the
Xword for 5 ("n[)]am") changes its initial letter to an "l"
Xin still other situations.  Whew.
X
XMacros can be handy for making very short rules.  For example,
Xin German, 30, 40, etc. are spelled by adding the letters "zig"
Xto the word for the number divided by 10.  Thus, with the
Xfollowing macro:
X
X/	z	"zig"
X
Xthe rule for 30 can be reduced from:
X
X30	"dreizig"
X
Xto:
X
X30	3z
X
X	::::	Conclusion	::::
X
XIf you write any grammars, I'd greatly appreciate having
Xthem.  My permanent net address is:
X
X	scott@cerberus.uchicago.edu
X
XUSnail address:
X
X	Scott Deerwester
X	1100 E. 57th, GLS
X	University of Chicago
X	Chicago, Illinois 60637
SHAR_EOF
if test 6275 -ne "`wc -c < 'README'`"
then
	echo shar: error transmitting "'README'" '(should have been 6275 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'Makefile'" '(421 characters)'
if test -f 'Makefile'
then
	echo shar: will not over-write existing file "'Makefile'"
else
sed 's/^X//' << \SHAR_EOF > 'Makefile'
XBINDIR =	/users/lgi/rechdoc/scott/bin
XGRAMMARDIR =	/users/lgi/rechdoc/scott/etc/number
XDEBUG =		-O
XCFLAGS =	$(DEBUG) -DGRAMMARDIR=\"$(GRAMMARDIR)/\"
X
Xnumber:		number.o
X	cc $(DEBUG) number.o -o number
X
Xinstall:	install-number install-lang
X
Xinstall-number:	number
X	install number $(BINDIR)
X
Xinstall-lang:
X	-mkdir $(GRAMMARDIR)
X	(cd lang; tar cf - .) | (cd $(GRAMMARDIR); tar xf -)
X
Xclean:
X	rm -f *.o number *~ core lang/*~
SHAR_EOF
if test 421 -ne "`wc -c < 'Makefile'`"
then
	echo shar: error transmitting "'Makefile'" '(should have been 421 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'number.6'" '(690 characters)'
if test -f 'number.6'
then
	echo shar: will not over-write existing file "'number.6'"
else
sed 's/^X//' << \SHAR_EOF > 'number.6'
X.TH NUMBER 6 "August 11, 1987"
X.AT 3
X.SH NAME
Xnumber \- convert Arabic numerals to natural language
X.SH SYNOPSIS
X.B number
X.I "[language]"
X.SH DESCRIPTION
X.I Number
Xcopies the standard input to the standard output,
Xchanging each decimal number to a fully spelled out version.
XIt works with a lot of languages, among which are English,
XFrench, Chinese, Vietnamese...  If it is given a language
Xthat it doesn't know, it prints out the list of languages
Xthat it does know.  Some languages are listed under more
Xthan one name, such as Gaellic and Irish.
X.SH BUGS
XIt currently doesn't handle very large numbers.  It
Xcould be made to do so, and maybe it will someday.
X.SH AUTHOR
XScott Deerwester
SHAR_EOF
if test 690 -ne "`wc -c < 'number.6'`"
then
	echo shar: error transmitting "'number.6'" '(should have been 690 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'number.c'" '(14405 characters)'
if test -f 'number.c'
then
	echo shar: will not over-write existing file "'number.c'"
else
sed 's/^X//' << \SHAR_EOF > 'number.c'
X#include <stdio.h>
X#include <ctype.h>
X/*
X * number
X *
X *	Number is a program that counts in lots of languages.
X *	It was originally written during a sanity break while
X *	I was writing my PhD thesis.  That version got left
X *	on a machine somewhere in upstate New York.  This one
X *	was done while on leave in Grenoble, after realizing
X *	that I hadn't written a computer program in over two
X *	months.
X *
X *	Number is inspired, of course, by /usr/games/number, but
X *	it uses a series of grammars that define counting in
X *	different languages.  The language that is used to
X *	write the grammars is described below, in evalrule().
X *	If you write any new grammars, I'd greatly appreciate
X *	having them.  Grammars aren't very hard to write, if
X *	you know how to count in something that isn't defined
X *	here.  The longest grammar (french) only has 30 rules
X *	and 5 macros, and correctly pronounces any number less
X *	than 1,000,000,000,000.  The shortest is for cantonese, which
X *	has 14 rules.
X *
X * A note on the output of number:
X *
X *	The characters that are output conform to the TIRA
X *	character representation standard.  Essentially, strings
X *	in anything except the latin alphabet (what you're reading
X *	now) are preceded by an indication of the alphabet that
X *	they are part of.  The exceptions to this are mandarin,
X *	cantonese and japanese.  These three are written in
X *	pin-yin, roughly Wade Giles, and romaji, respectively.
X *	The only other thing special about this format is that
X *	accents and tone markings are given in [] brackets
X *	before the letter to which they are attached.
X *
X *	TIRA stands for Textual Information Retrieval and Analysis
X *	research group, and is a research group at the University
X *	of Chicago containing computer and information scientists,
X *	literary scholars and linguists.  TIRA is working on a
X *	research environment for doing textual research.  Watch
X *	this space.
X *
X * Copyright 1987, Scott Deerwester.
X *
X *	This code may be freely distributed and copied, provided
X *	that a copy of this notice accompanies all copies and
X *	that no copy is sold for profit.
X */
X
X/*
X *	Constants for array bounds.  Both of these are overkill.
X */
X
X#define	MAXRULES	100
X#define	MAXSPECIALS	50
X
X/*
X *	Structure to hold macro definitions.
X */
X
Xstruct {
X	char	c;
X	char	*rule;
X} specials [MAXSPECIALS];
X
Xint	nspecials = 0;
Xint	maxdigits;
X
X/*
X *	Definition of a grammar rule.
X */
X
Xstruct {
X	int	base;
X#ifdef	COND
X	int	cond;
X#endif	COND
X	char	*rule;
X} rule[MAXRULES];
X
Xint	nrules;
Xchar	*lang = "english";		/* You can change this if you like */
Xchar	*malloc ();
Xunsigned long	parsenumber();
Xlong	atol(), random();
X
Xint	dbgflag = 0;
X
Xmain (argc, argv)
X     char *argv[];
X{
X        int errflg = 0;
X
X	chkdbg ();
X	srandom (getpid ());
X	domaxdigits ();
X/*
X *	Someday, maybe, I'll enable this to take a number
X *	on the command line.
X */
X
X        switch (argc) {
X	case 1:	break;
X	case 2:	lang = argv[1];	break;
X	default:
X		errflg++;
X	        break;
X	}
X
X	if (errflg)
X	{
X	        fprintf (stderr, "Usage: number [language]\n");
X		exit (0);
X	}
X
X/*
X *	read_grammar finds the grammar for the language and
X *	reads it in.  It exits if it can't find the grammar.
X */
X	read_grammar (lang);
X/*
X *	Main loop.  Read in numbers.  Make sure that the input
X *	is a number, and spell it in the requested language.
X */
X	while (1)
X	{
X		char lbuf [512];
X		register i, l;
X		unsigned long u;
X		long n;
X
X		if (isatty (0))
X			printf ("> ");
X
X		if (!gets (lbuf))
X		{
X			break;
X		}
X
X		if ((l = strlen (lbuf)) > maxdigits)
X		{
X			printf ("My limit is ");
X			for (i = 0; i < maxdigits; i++)
X				putchar ('9');
X			putchar ('\n');
X			continue;
X		} else if (l == 0)
X			continue;
X
X		n = 0;
X		if (sscanf (lbuf, "%ld", &n) != 1)
X		{
X			printf ("%s is not a non-negative integer.\n", lbuf);
X			continue;
X		}
X
X		if (n < 0)
X		{
X			printf ("I don't handle negative numbers.\n");
X			continue;
X		}
X
X		sscanf (lbuf, "%lu", &u);
X		spell (u, 0);
X		outchar ('\n');
X	}
X	outchar ('\n');
X}
X
Xdomaxdigits ()
X{
X	unsigned long maxint = 0;
X	register i;
X	char str [128];
X
X	for (i = 0; i < sizeof (long) * 8; i++)
X		maxint |= (unsigned long) 1 << i;
X	sprintf (str, "%lu", maxint);
X	maxdigits = strlen (str) - 1;
X
Xdbg ("domaxdigits computes %lu as %d reliable digits.\n", maxint, maxdigits);
X}
X
X/*
X *	read_to_eol is equivalent to fgets, except that it
X *	reads the string into a temporary buffer, allocates
X *	enough space for it, and copies the string into the
X *	allocated space.  In other words, it does what fgets()
X *	would do if C had proper memory management. :-)
X */
X
Xchar *read_to_eol (fp)
X	FILE *fp;
X{
X	char	*tmpbuf, *cp;
X	char	rbuf [512];
X	register l = 0;
X
X	cp = rbuf;
X	while (1)
X	{
X		fgets (cp, sizeof (rbuf) - l, fp);
X		l = strlen (rbuf);
X		if (rbuf [l - 2] != '\\')
X			break;
X		cp = rbuf + l - 2;
X		if (getc (fp) != '\t')
X		{
X			fprintf (stderr, "read_to_eol didn't find a tab\n");
X			exit (0);
X		}
X		if (l >= sizeof (rbuf))
X		{
X			fprintf (stderr, "rule too long in read_to_eol\n");
X			exit (0);
X		}
X	}
X	
X	tmpbuf = malloc (l = strlen (rbuf));
X	rbuf [l - 1] = '\0';	/* get rid of the newline */
X
X	strcpy (tmpbuf, rbuf);
X
X	return (tmpbuf);
X}
X
X
Xstatic char filename[128];
X
X/*
X *	Cutesy error messages.  They all say the same thing.
X */
X
Xchar *errorfmt[] =
X{
X	"No se habla \"%s\".  Se habla:\n",
X	"I don't speak \"%s\".  I speak:\n",
X	"On ne parle pas \"%s\" ici.  On parle plut[^]ot:\n",
X	"Ich kann nicht \"%s\" sprechen.  Ich spreche:\n",
X	"Ng[?]o [_]m s[^]ik g[']ong \"%s\" w[`]a.  Ng[?]o s[^]ik:\n",
X	"W[?]o b[`]u hu[`]e \"%s\".  W[?]o hu[`]e:\n",
X	"\\CYR'Ya' n'ye' govor'yu' po-\"%s\".  'Ya' govor'yu':\n",
X	"Nt[`]e \"%s\" kan m[`]en.  N[`]e be:\n"
X};
X
X#define	nerrfmt	8
X
Xrand (n)
X{
X	return (random () % n);
X}
X
X/*
X *	read_grammar depends on a set of grammar files being
X *	found in GRAMMARDIR.  It expects to find a file with
X *	the name of its parameter, which it opens and reads.
X *	If it can't find one, it prints out a message saying
X *	that it doesn't speak the language, and lists the
X *	known languages by exec'ing /bin/ls.  Note that this
X *	is equivalent to exiting.  Otherwise, it simply puts each
X *	of the rules and macros into arrays.  The format of the
X *	rules in the grammar files is:
X *
X *		n \t rule
X *
X *	where "n" is the base unit of the rule, and "rule"
X *	conforms to the syntax described below in evalrule().
X *	Macros definitions are of the form:
X *
X *		/ \t c \t rule
X *
X *	where "c" is the character to be expanded.  The character
X *	must not be a reserved character.
X *
X *	Grammars may also contain comment lines, which begin with
X *	a '#'.
X */
Xread_grammar (lang)
X	char *lang;
X{
X	register i, c;
X	FILE *fp;
X
X	strcat (filename, GRAMMARDIR);
X	strcat (filename, lang);
X
X	if ((fp = fopen (filename, "r")) == NULL)
X	{
X		if ((fp = fopen (lang, "r")) == NULL)
X		{
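X			printf (errorfmt [rand (nerrfmt)], lang);
X			fflush (stdout);	/* exec would discard buffered output */
X			execl ("/bin/ls", "number-ls", GRAMMARDIR, (char *) 0);
X			exit (1);		/* only reached if the exec fails */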
X		}
X	}
X
X	for (i = 0; !feof (fp);)
X	{
X#ifdef	COND
X		rule[i].cond = 0;
X#endif	COND
X
X		if ((c = getc (fp)) == '/')
X		{
X			register j;
X
X			while ((c = getc (fp)) == '\t')
X				;
X			j = nspecials++;
X			specials[j].c = c;
X			while ((c = getc (fp)) == '\t')
X				;
X			ungetc (c, fp);
X			specials[j].rule = read_to_eol (fp);
X
Xdbg ("macro '%c': %s\n", specials[j].c, specials[j].rule);
X
X			continue;
X		} else if (c == EOF)
X		{
X			break;
X		} else if (c == '\n')
X		{
X			continue;
X		} else if (c == '#')
X		{
X			while ((c = getc (fp)) != '\n' && c != EOF)
X				;
X			continue;
X		} else if (!isdigit (c))
X		{
X			printf ("Read a '%c' in rule %d\n", c, i);
X			break;
X		} else
X			ungetc (c, fp);
X
X		if (fscanf (fp, "%d", &rule[i].base) != 1)
X			break;
X
X		if ((c = getc (fp)) != '\t')
X		{
X#ifdef COND
X			rule[i].cond = c;
X#endif COND
X			while (getc (fp) != '\t')
X				;
X		}
X
X		rule[i].rule = read_to_eol (fp);
X
Xdbg ("rule %d: %d %s\n", i, rule[i].base, rule[i].rule);
X
X		i++;
X	}
X	nrules = i;
X}
X
X/*
X *	spell is the function called to spell a number.  It
X *	is initially called with recursion level 0; at any deeper
X *	level a value of 0 is not pronounced.  This is a hack to
X *	get around the problem of when to pronounce 0.  Spell
X *	essentially just figures out what the appropriate
X *	rule is, and calls evalrule() to do the work.
X */
Xspell (n, level)
X	unsigned long n;
X{
X	register i;
X
X	if (n == 0 && level)
X		return;
X
X	for (i = nrules - 1; i > 0 && rule[i].base > n; i--)
X		;
X
X	evalrule (rule[i].rule, rule[i].base, n, level);
X}
X
X/*
X * next
X *	This is a simple function to bounce around in strings
X *	with a syntax that includes balanced parens and double
X *	quotes. There's something like this in Icon, but this
X *	program is in C, so...
X */
Xchar *next (s, c)
X	char *s, c;
X{
X	register char *e;
X
X	for (e = s; *e != c; e++)
X	{
X		if (*e == '"')
X			e = next (e + 1, '"');
X		if (*e == '(' && c != '"')
X			e = next (e + 1, ')');
X	}
X	return (e);
X}
X
X/*
X *	evalrule does the dirty work.  It takes a rule, a
X *	base, and a number, and prints the number according
X *	to the rule.  Rules may use the following characters:
X *
X *	B	the base
X *	%	n % base
X *	/	n / base
X *	,	no-op
X *	"..."	for strings
X *
X *	conditionals are of the form:
X *
X *		(L C R \t rule)
X *
X *	where L and R are either a special character or a
X *	number, and C is one of '<', '>', '=' and '~', meaning,
X *	of course, less than, greater than, equal, and not equal.
X *	Conditionals are evaluated by doconditional(), which
X *	evaluates the condition, and, if it is true, evaluates
X *	the rule.
X *
X *	To give an example of a rule, taken from the grammar
X *	for mandarin:
X *
X *	10	/ "sh[']i" %
X *
X *	means that if the largest base that does not exceed
X *	the number we're trying to say is 10, then we say the
X *	number by saying the number divided by 10, followed
X *	by the word "sh[']i", followed by the remainder of the
X *	number divided by ten.  In other words, to say 23,
X *	you say (23 / 10) = 2, then "sh[']i", then (23 % 10) = 3,
X *	or 2 "sh[']i" 3.  After evaluating the rules for 2 and
X *	3, the string "e[`]r sh[']i s[^]an" is printed.
X */
X
Xevalrule (rule, base, n, level)
X	char	*rule;
X	unsigned long	n;
X	int	base, level;
X{	
X	register j, c;
X
Xdbg ("evalrule (\"%s\", %d, %ld)\n", rule, base, n);
X
X	while (c = *rule)
X	{
X		if (isdigit (c))
X		{
X			spell (atol (rule), level + 1);
X			while (isdigit (*++rule))
X				;
X			continue;
X		} else switch (c) {
X		case ',':	break;
X		case 'B':	spell ((long) base, level + 1);	break;
X		case '%':	spell (n % base, level + 1);	break;
X		case '/':	spell (n / base, level + 1);	break;
X
X		case '"':	while ((c = *++rule) != '"')
X					outchar (c);
X				break;
X		case '(':	docondition (rule, base, n, level);
X				rule = next (rule + 1, ')');
X				break;
X		default:	for (j = 0; j < nspecials; j++)
X				{
X				    if (specials[j].c == c)
X				    {
X					evalrule (specials[j].rule, base,
X						  n, level);
X					break;
X				    }
X				}
X				if (j == nspecials)
X					outchar (c);
X		}
X		rule++;
X	}
X}
X/*
X *	docondition evaluates conditionals, which are delimited
X *	by parentheses, and which contain two parts: a very
X *	simple Boolean expression and a rule.  The Boolean
X *	expression can, at the moment, only be a simple comparison.
X *	OR's (if the conditions are exclusive) can be done by
X *	putting multiple conditions in a row, and AND's by
X *	making the rule a conditional.  docondition calls
X *	parsecond (parse conditional) to pick out the various
X *	parts of the conditional, evaluates the comparison,
X *	and calls evalrule with the rule as an argument if the
X *	comparison evaluates to true.
X *
X *	Two additional special characters that are accepted here
X *	are:
X *
X *		L	Current recursion level
X *		#	The number itself
X */
Xdocondition (rule, base, n, level)
X	char	*rule;
X	unsigned long	n;
X	int	base, level;
X{
X	char	subrule [128];
X	unsigned long	leftside, rightside;
X	int	truth;
X	char	comparator;
X
X/*
X *	This is to check for bad grammars or buggy parser.
X */
X	if (!parsecond (rule, base, n, level,
X			&leftside, &comparator, &rightside, subrule))
X	{
X		printf ("Gagged on rule \"%s\"\n", rule);
X		return;
X	}
X
X	switch (comparator) {
X	case '>':	truth = leftside > rightside;	break;
X	case '=':	truth = leftside == rightside;	break;
X	case '<':	truth = leftside < rightside;	break;
X	case '~':	truth = leftside != rightside;	break;
X	}
X
Xdbg ("docondition (%d, %d, %d %c %d) -> %s\n",
X	base, n, leftside, comparator, rightside,
X	truth ? subrule : "FAILS");
X
X	if (!truth)
X		return;
X
X	evalrule (subrule, base, n, level);
X}
X
X/*
X *	parsecond parses the rule according to the base,
X *	and assigns the parts to the variables passed
X *	as arguments.
X */
X
Xparsecond (rule, base, n, level, lp, cp, rp, subrule)
X	char	*rule, *cp, *subrule;
X	unsigned long	*lp, *rp, n;
X	int	base, level;
X{
X	char	*index(), *rindex();
X	register char *start, *end;
X	char	leftstring[20], rightstring[20];
X
X	if (sscanf (rule, "(%s %c %s", leftstring, cp, rightstring) != 3)
X	{
X
Xdbg ("parsecond failed sscanf (\"%s\", ...)\n", rule);
X
X		return (0);
X	}
X
X	*rp = parsenumber (rightstring, base, n, level);
X	*lp = parsenumber (leftstring, base, n, level);
X
X	if (!(start = index (rule, '\t')))
X	{
X
Xdbg ("parsecond couldn't find a tab in \"%s\"\n", rule);
X
X		return (0);
X	}
X
X	end = next (++start, ')');
X
X	while (start < end)
X		*subrule++ = *start++;
X		
X	*subrule = '\0';
X	return (1);
X}
X
X/*
X *	parsenumber figures out the numerical value of the
X *	string that it is passed, based on the base and the
X *	number n.
X */
X
Xunsigned long parsenumber (s, base, n, level)
X	unsigned long n;
X	char *s;
X{
X	if (isdigit (s[0]))
X		return (atoi (s));
X
X	switch (s[0]) {
X	case '/':	return (n / base);
X	case '%':	return (n % base);
X	case '#':	return (n);
X	case 'L':	return (level);
X	case 'B':	return (base);
X	default:	fprintf (stderr, "bad number string \"%s\"\n", s);
X			return (-1);
X	}
X}
X
X/*
X *	outchar is a slightly clever version of putchar.  It
X *	won't put a space at the beginning of a line, and it
X *	won't put two spaces in a row.
X */
X
Xoutchar (c)
X{
X	static	lastspace = 0,
X		bol = 1;
X
X	if ((lastspace || bol) && c == ' ')
X		return;
X
X	if (c == '\n')
X		bol = 1;
X	else
X		bol = 0;
X
X	if (c == ' ')
X		lastspace = 1;
X	else
X		lastspace = 0;
X
X	putchar (c);
X}
X
X/*
X *	Well, see, I had this bug, and I left my debugger in
X *	Chicago, and...
X */
X
Xdbg (fmt, a1, a2, a3, a4, a5, a6, a7, a8, a9)
X	char *fmt, *a1, *a2, *a3, *a4, *a5, *a6, *a7, *a8, *a9;
X{
X	int tmpdbgflag = dbgflag;
X
X	if (dbgflag > 0)
X	{
X		dbgflag = 0;
X		fprintf (stderr, fmt, a1, a2, a3, a4, a5, a6, a7, a8, a9);
X		dbgflag = tmpdbgflag;
X	}
X}
X
Xchkdbg ()
X{
X	extern char *getenv ();
X	register char *cp;
X
X	if ((dbgflag == 0) && (cp = getenv ("DEBUG")))
X		dbgflag = atoi (cp);
X}
SHAR_EOF
if test 14405 -ne "`wc -c < 'number.c'`"
then
	echo shar: error transmitting "'number.c'" '(should have been 14405 characters)'
fi
fi # end of overwriting check
if test ! -d 'lang'
then
	echo shar: creating directory "'lang'"
	mkdir 'lang'
fi
echo shar: entering directory "'lang'"
cd 'lang'
echo shar: extracting "'cantonese'" '(386 characters)'
if test -f 'cantonese'
then
	echo shar: will not over-write existing file "'cantonese'"
else
sed 's/^X//' << \SHAR_EOF > 'cantonese'
X#
X#	Cantonese is a Chinese dialect, spoken in Hong
X#	Kong, the province of Canton, and in many parts
X#	of the world in Chinese communities.  The following
X#	rules describe how it is spoken, not written.
X#
X0	"ling"
X1	"y[^]at"
X2	"y[_]i"
X3	"s[^]am"
X4	"sei"
X5	"[?]ng"
X6	"l[`]uk"
X7	"ts[^]at"
X8	"b[_]aat"
X9	"g[']ao"
X10	(/ > 1	/) "s[_]ap" %
X100	/ "b[_]aak" %
X1000	/ "tsin" %
X10000	/ "maan" %
SHAR_EOF
if test 386 -ne "`wc -c < 'cantonese'`"
then
	echo shar: error transmitting "'cantonese'" '(should have been 386 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'english'" '(372 characters)'
if test -f 'english'
then
	echo shar: will not over-write existing file "'english'"
else
sed 's/^X//' << \SHAR_EOF > 'english'
X0	"zero"
X1	"one"
X2	"two"
X3	"three"
X4	"four"
X5	"five"
X6	"six"
X7	"seven"
X8	"eight"
X9	"nine"
X/	t	"teen"
X10	"ten"
X11	"eleven"
X12	"twelve"
X13	"thir"t
X14	4t
X15	"fif"t
X16	6t
X17	7t
X18	"eigh"t
X19	9t
X20	"twenty" %
X30	"thirty" %
X40	"forty" %
X50	"fifty" %
X60	6"ty" %
X70	7"ty" %
X80	8"y" %
X90	9"ty" %
X100	/ "hundred" %
X1000	/ "thousand" %
X1000000	/ "million" %
X1000000000	/ "billion" %
SHAR_EOF
if test 372 -ne "`wc -c < 'english'`"
then
	echo shar: error transmitting "'english'" '(should have been 372 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'french'" '(1026 characters)'
if test -f 'french'
then
	echo shar: will not over-write existing file "'french'"
else
sed 's/^X//' << \SHAR_EOF > 'french'
X#
X#	French
X#
X0	"zero"
X1	"un"
X2	"deux"
X3	"trois"
X4	"quatre"
X5	"cinq"
X6	"six"
X7	"sept"
X8	"huit"
X9	"neuf"
X10	"dix"
X11	"onze"
X12	"douze"
X13	"treize"
X14	"quatorze"
X15	"quinze"
X16	"seize"
X17	"dix-sept"
X18	"dix-huit"
X19	"dix-neuf"
X20	"vingt" R
X30	"trente" R
X40	"quarante" R
X50	"cinquante" R
X60	"soixante" R
X#
X#	For 80, there is no "et" before either 1 or 11.
X#	For 60, there is an "et" before both.
X#
X80	"quatre-vingt"S %
X100	D "cent"* %
X1000	D "mille" %
X1000000	/ "million"* %
X1000000000	/ "milliard"* %
X/	*	(/ > 1	S)
X#
X#	This rule takes care of the "s" after "cent" or "quatre-vingt".
X#	The former takes an "s" when "/" is greater than one, and
X#	there is no remainder.  The latter (naturally!) only when
X#	there is no remainder.
X#
X/	S	(% = 0	"s")
X#
X#	For numbers between 20 and 99, if the remainder is one,
X#	the word "et" (and) is added between the word for the
X#	factor of ten, and the word for the remainder.
X#
X/	R	(% = 1	"et ")(% = 11	"et ")%
X#
X#	For 100 and 1000, / is only written if it is greater
X#	than one.
X#
X/	D	(/ > 1	/)
SHAR_EOF
if test 1026 -ne "`wc -c < 'french'`"
then
	echo shar: error transmitting "'french'" '(should have been 1026 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'german'" '(354 characters)'
if test -f 'german'
then
	echo shar: will not over-write existing file "'german'"
else
sed 's/^X//' << \SHAR_EOF > 'german'
X0	"null"
X1	"ein"
X2	"zwei"
X3	"drei"
X4	"vier"
X5	"f[:]unf"
X6	"sechs"
X7	"sieben"
X8	"acht"
X9	"neun"
X10	"zehn"
X11	"elf"
X12	"zw[:]olf"
X13	3,10
X14	4,10
X15	5,10
X16	"sech"10
X17	"sieb"10
X18	8,10
X19	9,10
X/	z	"zig"
X20	*"zwant"z
X30	*3z
X40	*4z
X50	*5z
X60	*"sech"z
X70	*"sieb"z
X80	*8z
X90	*9z
X100	/ "hundert" %
X1000	/ "tausend" %
X1000000	/ "million" %
X/	*	(% > 0	% "und ")
SHAR_EOF
if test 354 -ne "`wc -c < 'german'`"
then
	echo shar: error transmitting "'german'" '(should have been 354 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'irish'" '(483 characters)'
if test -f 'irish'
then
	echo shar: will not over-write existing file "'irish'"
else
sed 's/^X//' << \SHAR_EOF > 'irish'
X#
X#	Irish, or Gaelic.  Thanks to Alexis Donnelly for
X#	serving as linguistic informant.
X#
X1	"aon"
X2	"d[']o"
X3	"tr[']i"
X4	"ceathair"
X5	"c[']uig"
X6	"s[']e"
X7	"seacht"
X8	"ocht"
X9	"nao[']i"
X10	(# = 10	"deich")(# > 10	"a" b "d[']eag")
X20	"fiche" b
X30	"tr[']iocha" b
X40	"daichead" c
X50	"caogo" b
X60	"seasca" b
X70	"seachta" b
X80	"ocht[']o" b
X90	"n[']ocha" b
X100	d "c"h"[']ead" c
X1000	d "m[']ile" %
X/	a	(% > 0	" a ")
X/	b	(% = 1	"h")%
X/	c	(% = 1	"ah")%
X/	d	(/ > 1	/)
X/	h	(/ ~ 1	(/ ~ 4	"h"))
SHAR_EOF
if test 483 -ne "`wc -c < 'irish'`"
then
	echo shar: error transmitting "'irish'" '(should have been 483 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'japanese'" '(598 characters)'
if test -f 'japanese'
then
	echo shar: will not over-write existing file "'japanese'"
else
sed 's/^X//' << \SHAR_EOF > 'japanese'
X#
X#	Japanese, written in romaji.  Thanks to Malcolm
X#	Bennett for providing the grammar that I got this
X#	out of, and for debugging it.
X#
X0	"r[']ei"
X1	"ich[']i"
X2	"n[']i"
X3	"san"
X4	"y[']on"
X5	"g[']o"
X6	"rok[']u"
X7	"n[']ana"
X8	"hach[']i"
X9	"ku[']u"
X10	b"j[-]u"a
X100	c
X200	"ni"c
X300	"s[']am"c
X400	"y[']on"c
X500	"go"c
X600	"roppyak[']u" %
X700	"nan[']ahyaku" %
X800	"happyak[']u" %
X1000	m
X2000	"ni"m
X3000	"sanz[']en" %
X4000	"yon"m
X5000	"go"m
X6000	"roku"m
X7000	"nana"m
X8000	"has"m
X9000	"kyu[']"m
X10000	e"m[']an" %
X/	c	"hyak[']u" %
X/	m	"s[']en" %
X/	b	(/ > 1	/)
X/	a	(/ > 1	" ")%
X/	e	(/ = 3	"sam")(/ ~ 3	/)
SHAR_EOF
if test 598 -ne "`wc -c < 'japanese'`"
then
	echo shar: error transmitting "'japanese'" '(should have been 598 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'mandarin'" '(205 characters)'
if test -f 'mandarin'
then
	echo shar: will not over-write existing file "'mandarin'"
else
sed 's/^X//' << \SHAR_EOF > 'mandarin'
X0	"l[']ing"
X1	"y[^]i"
X2	"e[`]r"
X3	"s[^]an"
X4	"s[`]i"
X5	"w[?]u"
X6	"l[`]iu"
X7	"q[^]i"
X8	"b[^]a"
X9	"ji[?]u"
X10	(/ > 1	/) "sh[']i" %
X100	/ "b[?]ai" %
X1000	/ "qi[^]an" %
X10000	/ "w[`]an" %
X100000000	/ "y[`]i" %
SHAR_EOF
if test 205 -ne "`wc -c < 'mandarin'`"
then
	echo shar: error transmitting "'mandarin'" '(should have been 205 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'romanian'" '(420 characters)'
if test -f 'romanian'
then
	echo shar: will not over-write existing file "'romanian'"
else
sed 's/^X//' << \SHAR_EOF > 'romanian'
X#
X#	Romanian
X#
X0	"zero"
X1	"unu"
X2	"doi"
X3	"trei"
X4	"patru"
X5	"cinci"
X6	"[,]sase"
X7	"[,]sapte"
X8	"opt"
X9	"nou[)]a"
X10	(% > 0	%"spre")"zece"
X/	z	"zeci" a
X20	"dou[)]a"z
X30	3z
X40	4z
X50	5z
X60	"[,]sai"z
X70	7z
X80	8z
X90	9z
X100	(/ = 1	"o sut[)]a")(/ > 1	g "sute") %
X1000	(/ = 1	"o mie")(/ > 1	g f "mii") %
X1000000	g f "milio"s %
X/	a	(% > 0	"[,]si") %
X/	f	(/ > 19	"de")
X/	g	(/ = 2	"dou[)]a")(/ ~ 2	/)
X/	s	(/ > 1	"ane")(/ = 1	"n")
SHAR_EOF
if test 420 -ne "`wc -c < 'romanian'`"
then
	echo shar: error transmitting "'romanian'" '(should have been 420 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'vietnamese'" '(347 characters)'
if test -f 'vietnamese'
then
	echo shar: will not over-write existing file "'vietnamese'"
else
sed 's/^X//' << \SHAR_EOF > 'vietnamese'
X#
X#	Vietnamese.
X#
X0	"kh[^]ong"
X1	"m[.^]ot"
X2	"hai"
X3	"ba"
X4	"b['^]on"
X5	"n[)]am"
X6	"s[']au"
X7	"b[~]ay"
X8	"t[']am"
X9	"ch[']in"
X10	(/ > 1	/) "m'u~'"(/ = 1	"[`]")"'o~'i" a
X100	/ "tr[)]am" d
X1000	/ "ng[`]an" d
X1000000	/ "tri[^.]eu" d
X/	a	(/ > 1	(% = 1	"m[^']ot")(% > 1	c))(/ = 1	c)
X/	c	(% = 5	"l[)]am")(% ~ 5	%)
X/	d	(% < 10	(% > 0	(L = 0	"l[~]e"))) %
SHAR_EOF
if test 347 -ne "`wc -c < 'vietnamese'`"
then
	echo shar: error transmitting "'vietnamese'" '(should have been 347 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'malinke'" '(461 characters)'
if test -f 'malinke'
then
	echo shar: will not over-write existing file "'malinke'"
else
sed 's/^X//' << \SHAR_EOF > 'malinke'
X#
X#	Malinke is spoken in western Africa.  It is known as
X#	either malinke, maninka, mandingo or djula.  This
X#	grammar thanks to Ibrahima Sakho, who served as
X#	linguistic informant.
X#
X0	"foi"
X1	"k[']el[']en"
X2	"fila"
X3	"saba"
X4	"nani"
X5	"lolu"
X6	"w[:]or[:]o"
X7	"w[:]or[:]owila"
X8	"s[']egin"
X9	"k[:]on[:]ond[:]o"
X10	t
X100	h
X1000	"wa" / r
X10000	"wa" t
X100000	"wa" h
X/	r	(% > 0	"ni" %)
X/	t	(/ = 1	"tan")(/ = 2	"mugan")(/ > 2	"bi"/) r
X/	h	"k[`]em[`]e" (/ > 1	/) r
SHAR_EOF
if test 461 -ne "`wc -c < 'malinke'`"
then
	echo shar: error transmitting "'malinke'" '(should have been 461 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'spanish'" '(670 characters)'
if test -f 'spanish'
then
	echo shar: will not over-write existing file "'spanish'"
else
sed 's/^X//' << \SHAR_EOF > 'spanish'
X#
X#	Spanish.  Thanks to Hugo Ramos for serving as linguistic
X#	informant.
X#
X0	"cero"
X1	"uno"
X2	"dos"
X3	"tres"
X4	"cuatro"
X5	"cinco"
X6	"seis"
X7	"siete"
X8	"ocho"
X9	"nueve"
X10	"diez"
X11	"once"
X12	"doce"
X13	"trece"
X14	"catorce"
X15	"quince"
X/	d	"dieci"
X16	d6
X17	d7
X18	d8
X19	d9
X20	"vei"(% = 0	"nte")(% > 0	"nti"%)
X30	"treinta" r
X/	r	(% > 0	"y" %)
X/	a	"enta" r
X40	"cuar"a
X50	"cincu"a
X60	"ses"a
X70	"set"a
X80	"och"a
X90	"nov"a
X100	(/ = 2	"do")\
X	(/ = 3	"tre")\
X	(/ = 4	/)\
X	(/ = 5	"quin")\
X	(/ = 6	"sei")\
X	(/ = 7	"sete")\
X	(/ = 8	/)\
X	(/ = 9	"nove")\
X	(/ ~ 5	"c")"ien"(/ > 1	"tos")(/ = 1	(% > 0	"to")) %
X1000	(/ > 1	/) "mil" %
X1000000	(/ = 1	"un")(/ ~ 1	/) "millon"(/ > 1	"es") %
SHAR_EOF
if test 670 -ne "`wc -c < 'spanish'`"
then
	echo shar: error transmitting "'spanish'" '(should have been 670 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'pular'" '(384 characters)'
if test -f 'pular'
then
	echo shar: will not over-write existing file "'pular'"
else
sed 's/^X//' << \SHAR_EOF > 'pular'
X#
X#	Pular is spoken in western Africa.  Also called Fulani.
X#
X0	"fus"
X1	"g[:]o"
X2	"dhidhi"
X3	"tati"
X4	"nai"
X5	(# = 5	"dyowi")(# ~ 5	"dy[`]e"%)
X10	t R
X100	h R
X1000	(/ = 1	"wulur[`]e")(/ > 1	g /) R
X10000	g t R
X100000	g h R
X/	g	"guludy[`]e"
X/	t	(/ = 1	"sap[:]o")\
X	(/ = 2	"n[:]ogai")\
X	(/ > 2	"tyapannd[`]e" /)
X/	h	"t[`]em[`]e"(/ = 1	"d[`]er[`]e")(/ > 1	"dh[`]e "/)
X/	R	(% > 0	"[']e" %)
SHAR_EOF
if test 384 -ne "`wc -c < 'pular'`"
then
	echo shar: error transmitting "'pular'" '(should have been 384 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'susu'" '(342 characters)'
if test -f 'susu'
then
	echo shar: will not over-write existing file "'susu'"
else
sed 's/^X//' << \SHAR_EOF > 'susu'
X#
X#	Susu is spoken in Guinea.  Also called soso.
X#
X1	"k[']er[']en"
X2	"firin"
X3	"sakhan"
X4	"nani"
X5	"suli"
X6	"s[']enni"
X7	"solof[']er[']e"
X8	"solomasakhan"
X9	"solomanani"
X10	t R
X100	h R
X1000	"wulu" / R
X10000	"wulu" t R
X100000	"wulu" h R
X/	t	(/ = 1	"fu")\
X	(/ = 2	"mukhain")\
X	(/ > 2	"tongo" /) R
X/	h	"k[`]em[`]e" (/ > 1	/)
X/	R	(% > 0	"nun" %)
SHAR_EOF
if test 342 -ne "`wc -c < 'susu'`"
then
	echo shar: error transmitting "'susu'" '(should have been 342 characters)'
fi
fi # end of overwriting check
echo shar: linking "'chinese'" to "'mandarin'"
if test -f 'chinese'
then
	echo shar: will not over-write existing file "'chinese'"
else
	ln 'mandarin' 'chinese'
fi # end of overwriting check
echo shar: extracting "'italian'" '(531 characters)'
if test -f 'italian'
then
	echo shar: will not over-write existing file "'italian'"
else
sed 's/^X//' << \SHAR_EOF > 'italian'
X#
X#	Italian.  This grammar thanks to a week and a half
X#	in Italy, and a Berlitz phrase book!
X#
X0	"zero"
X1	"uno"
X2	"due"
X3	"tre"
X4	"quattro"
X5	"cinque"
X6	"sei"
X7	"sette"
X8	"otto"
X9	"nove"
X10	"dieci"
X11	"un"d
X12	"do"d
X13	3d
X14	4d
X15	"quin"d
X16	"se"d
X17	d"a"7
X18	d8
X19	d"a"9
X20	"vent"ir
X/	d	"dici"
X/	r	(% > 0	%)
X/	i	(% ~ 1	(% ~ 8	"i"))
X/	a	(% ~ 1	"a")
X30	"trent"ar
X40	"quarant"ar
X50	"cinquant"ar
X60	"sessant"ar
X70	"settant"ar
X80	"ottant"ar
X90	"novant"ar
X100	(/ > 1	/)"cento"%
X1000	(/ = 1	"mille")(/ > 1	/"mila")%
X1000000	/"milione"%
SHAR_EOF
if test 531 -ne "`wc -c < 'italian'`"
then
	echo shar: error transmitting "'italian'" '(should have been 531 characters)'
fi
fi # end of overwriting check
echo shar: linking "'gaellic'" to "'irish'"
if test -f 'gaellic'
then
	echo shar: will not over-write existing file "'gaellic'"
else
	ln 'irish' 'gaellic'
fi # end of overwriting check
echo shar: extracting "'esperanto'" '(158 characters)'
if test -f 'esperanto'
then
	echo shar: will not over-write existing file "'esperanto'"
else
sed 's/^X//' << \SHAR_EOF > 'esperanto'
X#
X#	Esperanto
X#
X0	"zero"
X1	"unu"
X2	"du"
X3	"tri"
X4	"kvar"
X5	"kvin"
X6	"ses"
X7	"sep"
X8	"ok"
X9	"na[bre]u"
X10	d"dek" %
X100	d "cent" %
X/	d	(/ > 1	/)
X1000	d "mil" %
SHAR_EOF
if test 158 -ne "`wc -c < 'esperanto'`"
then
	echo shar: error transmitting "'esperanto'" '(should have been 158 characters)'
fi
fi # end of overwriting check
echo shar: done with directory "'lang'"
cd ..
#	End of shell archive
exit 0

-- 

Rich $alz
Cronus Project, BBN Labs			rsalz@bbn.com
Moderator, comp.sources.unix			sources@uunet.uu.net