[comp.sources.misc] Ispell Version 2.0 Beta Part 01/04

allbery@ncoast.UUCP (05/31/87)

[Since this is a beta release, it is here.  The final version will show up in
comp.sources.unix with the other good stuff.  ++bsa]

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	UPDATE2
#	Makefile
#	Othercap.txt
#	README
#	UPDATE
#	WISHES
#	buildhash.c
#	config.X
#	fixdict.sh
# This archive created: Sat May 30 17:13:25 1987
export PATH; PATH=/bin:$PATH
echo shar: extracting "'UPDATE2'" '(6964 characters)'
if test -f 'UPDATE2'
then
	echo shar: will not over-write existing file "'UPDATE2'"
else
sed 's/^X //' << \SHAR_EOF > 'UPDATE2'
X This is the beta-test release of ispell version 2.0.  As I discussed in
X a previous comp.sources.d posting, I will collect bug fixes for this version,
X and then post a final version with dictionary to mod.sources, at which time
X I will wash my hands of the bloody thing.
X 
X Because I am short on time, I can only promise to integrate bug fixes.  If
X you send me improvements, they will very likely disappear into a black hole.
X Sorry, but it takes time to integrate every change, even the ones I can't test
X because they're for BSD.  If you plan on hacking extensively, I'd suggest
X waiting for the mod.sources posting, or you may have to repeat some work.
X 
X Send bug reports/fixes to:
X 
X 	Geoff Kuenning   geoff@ITcorp.com   {uunet,trwrb}!desint!geoff
X ---------------------------------------------------------------------------
X INSTRUCTIONS:
X 
X In response to many requests, this posting contains all sources except the
X dictionary.  Since shar won't overwrite files and some file names have
X changed, you should unshar it in an empty directory.  If you installed my
X previous posting, you may also want to remove expand[12].sed from
X /usr/public/lib, since these scripts have been renamed to isexp[1-4].sed.
X 
X Once you have unpacked, edit "Makefile" and "config.X" according to the
X comments in each.  Note that the Makefile edits "config.X" further to
X produce "config.h".  Then type "make install" and go away for a while
X (if you're brave and foolish;  otherwise do the equivalent more carefully).
X 
X If you don't already have a dictionary, please don't ask me for one.  Ask
X a neighbor.  If they don't have one, and you can't make one from
X /usr/dict/words or /usr/dict/web2 by running it through "munchlist",
X try running a bunch of text files through "makedict.sh".  (It depends on
X UNIX spell, though in a pinch you can do without if your source files are
X very good).  If all else fails, you'll just have to wait for the mod.sources
X posting.
X 
X If you do have a dictionary, and you would like to use the new CAPITALIZE
X feature, you will have to convert your dictionary.  If you have UNIX spell,
X the "fixdict.sh" script will do this for you, without violating any
X copyrights or license restrictions.  This script replaces the current
X dictionary, and writes a (short) list of questionable capitalizations
X to standard output;  these must be analyzed and, if necessary, corrected
X by hand.  The file "Othercap.txt" (included in this posting) contains
X words that are in dict.191 which will be missed by "fixdict.sh" with
X the standard UNIX spell program.
X 
X Problems fixed in this posting:
X 
X     (1) Ispell did not duplicate the permissions on the files it edited.
X 	(David Neves)
X     (2) The actual maximum number of possible corrections was 99, not 100.
X     (3) Ispell assumed a terminal width of 80 columns, rather than
X 	consulting the termcap entry.
X     (4) Long lines could wrap around on the terminal, damaging the
X 	display.
X     (5) The includes of types.h and param.h need to be interchanged on
X 	BSD systems.  (Ken Yap, Jacob Gore)
X     (6) The givehelp() routine now actually waits for a space to be typed
X 	like it claims, instead of just waiting for any character.  (Steve
X 	Kelem)
X     (7) Good.c was missing a declaration of the index() (strchr) routine.
X     (8) The excessive strlen() calls in good.c have been removed, and
X 	register declarations have been added.  (Joe Orost, Rich Salz)
X     (9) Some systems get "multiple symbol definition" messages when
X 	linking (Joe Orost).
X    (10) Expand[12].sed didn't handle new-format dictionaries.
X    (11) Some minor errors in the usage message have been corrected.
X    (12) If a space (or other non-word character) is inserted using "R",
X         ispell would treat the entire replacement string as a token
X         and try to find it in the dictionary.
X    (13) Ispell now follows the proper UNIX procedure for signal catching
X 	(i.e., it doesn't catch SIGINT if it's run in background).
X    (14) The handling of process stopping on BSD systems has been cleaned
X 	up and made to work right (Mark Davies).
X 
X Improvements added in this posting:
X 
X     (1) Ispell's handling of troff size and font requests has again been
X 	improved.  (Isaac Balbin, Steve Kelem, Joe Orost) (Everybody seems
X 	to fix the particular problem that bothers their individual world :-).
X     (2) If ispell is run on a file with an extension of ".tex", it will
X 	automatically go into TeX mode for this and subsequent files.
X 	(Steve Kelem)
X     (3) The emacs support now includes "ispell-buffer", and ispell is run
X 	from "ispell-program-name" so you can specify an explicit path.
X 	(Stewart Clamen)
X     (4) There is a TERM_MODE configuration option so you can choose between RAW
X 	and CBREAK modes.  The default has been changed to CBREAK (it used
X 	to be RAW) to preserve parity.  (Joe Orost)
X     (5) Term.c will now compile on V7 systems (Joe Orost)
X     (6) Register declarations have been added throughout.  (Joe Orost)
X     (7) Ispell now buffers stdout, improving display performance slightly.
X     (8) The backup file extension is now configurable (George Sipe).
X     (9) All config.X definitions except MAGIC can be overridden with -D
X 	switches (George Sipe).
X    (10) There is now a version.h file, so you will know what version you
X 	have (I guess Larry Wall deserves credit.  Even though he didn't
X 	harass me, guilt set in).  There is also a -v switch to print the
X 	version information.
X    (11) (This was a lot tougher than I expected).  Ispell now knows about
X 	capitalization and proper names (yay).  It recognizes four flavors
X 	of words:  lowercase, capitalized, all-capitals, and "followcase".
X 	If a word appears in the dictionary in lowercase, it is accepted
X 	in lowercase, capitalized, or all-capitals.  If it is capitalized
X 	in the dictionary, all-lowercase is disallowed.  If it is all-caps
X 	in the dictionary, it must always appear in all caps.  Finally,
X 	if the word has "weird" capitalization (like the name of my company,
X 	ITcorp or ITCorp), either that capitalization must be followed
X 	*exactly* or else the word must appear in all-caps.  More than
X 	one of these variants may occur;  "munchlist" will remove unneeded
X 	ones from a dictionary.  Finally, if you blow capitalization,
X 	ispell will offer a list of correctly-capitalized alternatives.
X 	Because it increases the size of the hash file, this feature is
X 	optional (see the CAPITALIZE option in config.X).
X    (12) A new shell script ("fixdict.sh") is provided to aid in converting
X 	your old dictionary to provide capitalization information.
X    (13) Buildhash now pads the string table to a "struct dent" boundary
X 	in the hash file, so that it will be aligned when reading in.  On
X 	many machines, this will speed startup.
X    (14) The -w option now accepts characters specified in octal with
X 	backslashes like any other UNIX program, as well as the previous
X 	decimal option, and it will also accept numeric strings of less
X 	than three digits.
X    (15) The ispell.el file now supports ispell-region and ispell-buffer.
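The capitalization rules in item (11) can be sketched in a few lines of C.
This is only an illustration, assuming a word's flavor can be judged from
its letter case alone.  The names capclass, classify(), and cap_ok() are
hypothetical and do not come from the ispell sources, which keep this
information in flags on each dictionary entry.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical names for the four flavors described in item (11). */
enum capclass { LOWERCASE, CAPITALIZED, ALLCAPS, FOLLOWCASE };

/* Classify a word by its pattern of upper- and lowercase letters. */
enum capclass classify(const char *w)
{
    int upper = 0, lower = 0;
    const char *p;

    for (p = w; *p != '\0'; p++) {
        if (isupper((unsigned char)*p))
            upper++;
        else if (islower((unsigned char)*p))
            lower++;
    }
    if (upper == 0)
        return LOWERCASE;
    if (lower == 0)
        return ALLCAPS;
    if (upper == 1 && isupper((unsigned char)w[0]))
        return CAPITALIZED;
    return FOLLOWCASE;          /* "weird" capitalization, e.g. ITcorp */
}

/*
 * Return 1 if `typed` is an acceptable capitalization of the dictionary
 * entry `dict`, else 0.  Assumes both spell the same word ignoring case.
 */
int cap_ok(const char *dict, const char *typed)
{
    enum capclass t = classify(typed);

    switch (classify(dict)) {
    case LOWERCASE:             /* lowercase, capitalized, or all-caps */
        return t != FOLLOWCASE;
    case CAPITALIZED:           /* all-lowercase is disallowed */
        return t == CAPITALIZED || t == ALLCAPS;
    case ALLCAPS:               /* must always be all caps */
        return t == ALLCAPS;
    default:                    /* FOLLOWCASE: exact match or all-caps */
        return t == ALLCAPS || strcmp(dict, typed) == 0;
    }
}
```

With these definitions, cap_ok("ITcorp", "ITCORP") accepts the word, while
cap_ok("ITcorp", "Itcorp") and cap_ok("Auckland", "auckland") reject it.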
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'Makefile'" '(2252 characters)'
if test -f 'Makefile'
then
	echo shar: will not over-write existing file "'Makefile'"
else
sed 's/^X //' << \SHAR_EOF > 'Makefile'
X # -*- Mode: Text -*-
X 
X # Look over config.X before building.
X #
X # You may want to edit BINDIR, LIBDIR, DEFHASH, DEFDICT, MAN1DIR, MAN4DIR,
X # MAN1EXT, MAN4EXT, and TERMLIB below;
X # the Makefile will update all other files to match.
X #
X # On USG systems, add -DUSG to CFLAGS.
X #
X # The ifdef NO8BIT may be used if 8 bit extended text characters
X # cause problems, or you simply don't wish to allow the feature.
X #
X # the argument syntax for buildhash to make alternate dictionary files
X # is simply:
X #
X #   buildhash <infile> <outfile>
X 
X CC = lcc -v -HL -HD -R tgetflag
X CFLAGS = -n -O -DUSG
X # BINDIR, LIBDIR, DEFHASH, DEFDICT, MAN1DIR, MAN4DIR, MAN1EXT, MAN4EXT,
X # TERMLIB
X BINDIR = /usr/sahbin
X LIBDIR = /tmp2/lib
X DEFHASH = ispell.hash
X DEFDICT = dict.191
X MAN1DIR	= /usr/man/u_man/man1
X MAN4DIR	= /usr/man/u_man/man4
X MAN1EXT	= .1l
X MAN4EXT	= .4l
X # TERMLIB = -lcurses
X TERMLIB = -ltermcap
X 
X SHELL = /bin/sh
X 
X all: buildhash ispell icombine munchlist isexpand $(DEFHASH)
X 
X $(DEFHASH): buildhash $(DEFDICT)
X 	./buildhash $(DEFDICT) $(DEFHASH)
X 
X install: all
X 	cp ispell isexpand munchlist $(BINDIR)
X 	cp $(DEFHASH) $(LIBDIR)/$(DEFHASH)
X 	cp expand1.sed expand2.sed icombine $(LIBDIR)
X 	chmod 755 $(BINDIR)/ispell $(BINDIR)/munchlist $(BINDIR)/isexpand \
X 	  $(LIBDIR)/icombine
X 	chmod 644 $(LIBDIR)/$(DEFHASH) $(LIBDIR)/expand1.sed \
X 	  $(LIBDIR)/expand2.sed
X 	cp ispell.1 $(MAN1DIR)/ispell$(MAN1EXT)
X 	cp ispell.4 $(MAN4DIR)/ispell$(MAN4EXT)
X 
X buildhash: buildhash.o hash.o
X 	$(CC) $(CFLAGS) -o buildhash buildhash.o hash.o
X 
X icombine:	icombine.c config.h ispell.h
X 	$(CC) $(CFLAGS) -o icombine icombine.c
X 
X munchlist:	munchlist.X Makefile
X 	sed -e 's@!!LIBDIR!!@$(LIBDIR)@' -e 's@!!DEFDICT!!@$(DEFDICT)@' \
X 		<munchlist.X >munchlist
X 	chmod +x munchlist
X 
X isexpand:	isexpand.X Makefile
X 	sed -e 's@!!LIBDIR!!@$(LIBDIR)@' isexpand.X >isexpand
X 	chmod +x isexpand
X 
X OBJS=ispell.o term.o good.o lookup.o hash.o tree.o xgets.o
X 
X ispell: $(OBJS)
X 	$(CC) $(CFLAGS) -o ispell $(OBJS) $(TERMLIB)
X 
X $(OBJS) buildhash.o: config.h ispell.h
X ispell.o: version.h
X 
X config.h:	config.X Makefile
X 	sed -e 's@!!LIBDIR!!@$(LIBDIR)@' -e 's@!!DEFDICT!!@$(DEFDICT)@' \
X 	    -e 's@!!DEFHASH!!@$(DEFHASH)@' <config.X >config.h
X 
X clean:
X 	rm -f *.o buildhash ispell icombine isexpand core a.out mon.out \
X 		hash.out *.stat *.cnt munchlist config.h
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'Othercap.txt'" '(106 characters)'
if test -f 'Othercap.txt'
then
	echo shar: will not over-write existing file "'Othercap.txt'"
else
sed 's/^X //' << \SHAR_EOF > 'Othercap.txt'
X Airedale
X Alcibiades
X Argo
X Argos
X Arianist
X Arianists
X Auckland
X CDR
X Ethernet
X Ethernet's
X Ethernets
X MIT's
X Sikkim
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'README'" '(6256 characters)'
if test -f 'README'
then
	echo shar: will not over-write existing file "'README'"
else
sed 's/^X //' << \SHAR_EOF > 'README'
X -*- Mode:Text -*-
X 
X Ispell consists of two programs: the actual spelling checker, "ispell",
X and the hash table builder, "buildhash".  Everything is set up so you
X can just say "make install" and have everything happen.  You might want
X to edit the makefile, and ispell.h to change the destination of the
X program and the hash table.
X 
X The dictionary comes from the ITS spell dictionary.  I got it from
X "ml:wba;dict 191", although I don't know that this is the copy currently
X in use on the 20's around MIT.
X 
X ----------------------------------------------------------------------
X 
X Addendum:
X 
X My eternal gratitude to the author of ispell -- I don't know how I
X ever lived without it.  I received his permission to post ispell to
X the net and have added a GNU EMACS interface.  Look in the file
X ispell.el for installation instructions.
X 
X As far as I know, no one informally "supports" this program.  If you
X would like to "adopt" it (collect fixes/enhancements and post a new
X version periodically), feel free to do so.
X 
X I volunteer to collect dictionary diffs and post a composite diff
X periodically.  If you add a lot of words to the main dictionary, send
X me the diffs between the modified dictionary and the posted one.
X Also, if you have access to a TOPS20 system with a more complete
X dictionary in ispell format, send me the diffs if you can.  Just
X PLEASE don't dump an entire dictionary to our site!
X 
X The dictionary posted is one I snarfed from around here -- after
X comparison with the one originally supplied, ours appears a tad more
X complete and accurate.
X 
X Walt Buehring
X Texas Instruments - Computer Science Center
X 
X ARPA:  Buehring%TI-CSL@CSNet-Relay
X UUCP:  {smu, texsun, im4u, rice} ! ti-csl ! buehring
X 
X ----------------------------------------------------------------------
X 
X The following is the only documentation I could find about the format
X of the dictionary.  It was written for the TOPS20 speller that ispell
X mimics, so I believe most of the information is applicable.  It should be
X useful if you want to add words to the main dictionary by hand.  -WB
X 
X ----------------------------------------------------------------------
X 
X 11.6  Dictionary flags
X 
X      Words  in SPELL's main dictionary (but not the other dictionaries) may
X have flags associated with  them  to  indicate  the  legality  of  suffixes
X without  the  need  to keep the full suffixed words in the dictionary.  The
X flags have "names" consisting of single  letters.    Their  meaning  is  as
X follows:
X 
X Let  #  and  @  be  "variables"  that can stand for any letter.  Upper case
X letters are constants.  "..."  stands  for  any  string  of  zero  or  more
X letters,  but note that no word may exist in the dictionary which is not at
X least 2 letters long, so, for example, FLY may not be produced  by  placing
X the  "Y"  flag  on "F".  Also, no flag is effective unless the word that it
X creates is at least 4 letters  long,  so,  for  example,  WED  may  not  be
X produced by placing the "D" flag on "WE".
X 
X "V" flag:
X         ...E --> ...IVE  as in CREATE --> CREATIVE
X         if # .ne. E, ...# --> ...#IVE  as in PREVENT --> PREVENTIVE
X 
X "N" flag:
X         ...E --> ...ION  as in CREATE --> CREATION
X         ...Y --> ...ICATION  as in MULTIPLY --> MULTIPLICATION
X         if # .ne. E or Y, ...# --> ...#EN  as in FALL --> FALLEN
X 
X "X" flag:
X         ...E --> ...IONS  as in CREATE --> CREATIONS
X         ...Y --> ...ICATIONS  as in MULTIPLY --> MULTIPLICATIONS
X         if # .ne. E or Y, ...# --> ...#ENS  as in WEAK --> WEAKENS
X 
X "H" flag:
X         ...Y --> ...IETH  as in TWENTY --> TWENTIETH
X         if # .ne. Y, ...# --> ...#TH  as in HUNDRED --> HUNDREDTH
X 
X "Y" FLAG:
X         ... --> ...LY  as in QUICK --> QUICKLY
X 
X "G" FLAG:
X         ...E --> ...ING  as in FILE --> FILING
X         if # .ne. E, ...# --> ...#ING  as in CROSS --> CROSSING
X 
X "J" FLAG:
X         ...E --> ...INGS  as in FILE --> FILINGS
X         if # .ne. E, ...# --> ...#INGS  as in CROSS --> CROSSINGS
X 
X "D" FLAG:
X         ...E --> ...ED  as in CREATE --> CREATED
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IED  as in IMPLY --> IMPLIED
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#ED  as in CROSS --> CROSSED
X                                 or CONVEY --> CONVEYED
X "T" FLAG:
X         ...E --> ...EST  as in LATE --> LATEST
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IEST  as in DIRTY --> DIRTIEST
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#EST  as in SMALL --> SMALLEST
X                                 or GRAY --> GRAYEST
X 
X "R" FLAG:
X         ...E --> ...ER  as in SKATE --> SKATER
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IER  as in MULTIPLY --> MULTIPLIER
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#ER  as in BUILD --> BUILDER
X                                 or CONVEY --> CONVEYER
X 
X "Z" FLAG:
X         ...E --> ...ERS  as in SKATE --> SKATERS
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IERS  as in MULTIPLY --> MULTIPLIERS
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#ERS  as in BUILD --> BUILDERS
X                                 or SLAY --> SLAYERS
X 
X "S" FLAG:
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IES  as in IMPLY --> IMPLIES
X         if # .eq. S, X, Z, or H,
X                 ...# --> ...#ES  as in FIX --> FIXES
X         if # .ne. S, X, Z, H, or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#S  as in BAT --> BATS
X                                 or CONVEY --> CONVEYS
X 
X "P" FLAG:
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@INESS  as in CLOUDY --> CLOUDINESS
X         if # .ne. Y, or @ = A, E, I, O, or U,
X                 ...@# --> ...@#NESS  as in LATE --> LATENESS
X                                 or GRAY --> GRAYNESS
X 
X "M" FLAG:
X         ... --> ...'S  as in DOG --> DOG'S
X 
X ----------------------------------------------------------------------
X 
X [Whew!  That's all very nice, but how about a quick reference...  -WB]
X 
X V -  ive
X N -  ion, ication, en
X X -  ions, ications, ens
X H -  th, ieth
X Y -  ly
X G -  ing
X J -  ings
X D -  ed
X T -  est
X R -  er
X Z -  ers
X S -  s, es, ies
X P -  ness, iness
X M -  's
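
To show how one of the flag rules above reads as code, here is a minimal
sketch of the "D" flag in C.  The function name apply_d_flag() is
hypothetical and is not taken from good.c.  The sketch assumes the root is
already in upper case, as in the main dictionary.

```c
#include <stdio.h>
#include <string.h>

/* True if c is one of the vowels named in the flag rules. */
static int is_vowel(char c)
{
    return c != '\0' && strchr("AEIOU", c) != NULL;
}

/*
 * Apply the "D" flag to an upper-case root, writing the result to out.
 * Returns 0 on success, or -1 if the root is shorter than 2 letters,
 * the result would be shorter than 4 letters, or out is too small.
 */
int apply_d_flag(const char *root, char *out, size_t outsz)
{
    size_t len = strlen(root);
    char last, prev;

    if (len < 2 || len + 3 > outsz)     /* result is at most len + 2 */
        return -1;
    last = root[len - 1];
    prev = root[len - 2];

    if (last == 'E')                    /* ...E --> ...ED */
        sprintf(out, "%sD", root);
    else if (last == 'Y' && !is_vowel(prev))
        sprintf(out, "%.*sIED", (int)(len - 1), root);  /* ...@Y --> ...@IED */
    else                                /* ...@# --> ...@#ED */
        sprintf(out, "%sED", root);

    if (strlen(out) < 4)                /* e.g. WE/D may not make WED */
        return -1;
    return 0;
}
```

Called on CREATE, IMPLY, and CONVEY, this produces CREATED, IMPLIED, and
CONVEYED.  The other flags that distinguish on a final E or Y (T, R, Z)
follow the same three-way shape with a different ending.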
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'UPDATE'" '(5252 characters)'
if test -f 'UPDATE'
then
	echo shar: will not over-write existing file "'UPDATE'"
else
sed 's/^X //' << \SHAR_EOF > 'UPDATE'
X 		Ispell enhancements - 3/13/87
X 
X (See three companion postings in net.sources.bugs).
X 
X Here are the enhancements to ispell that I mentioned a couple of days ago.
X Because of the number of changes, several of the context diff's are bigger
X than the original files.  In addition, many people have gotten confused
X about versions, since enhancements/fixes have been made by six different
X people, counting myself (for the list, see the end of ispell.man).  I
X have integrated all of these fixes and enhancements in one place.
X 
X For these reasons, I have decided to repost all of the sources for ispell,
X with one exception -- the dictionary.  (A couple of small files, such
X as ispell.el, are unchanged, but I decided to repost them anyway for
X completeness.  If you didn't have ispell before, you now need only the
X dictionary).
X 
X The dictionary is a special case:  if you think about it, even ordinary
X diff's will always work with "patch" on that each-line-is-unique file.
X An out-of-place insertion can be corrected by sorting the dictionary
X after patching (something that is done anyway as a side effect of the
X new "munchlist" script).  Because of this, I have decided not to repost
X the sizable dictionary.  In the process of testing this code, it occurred
X to me to run dict.191 through UNIX "spell";  the results of that are
X given in three companion postings in net.sources.bugs, which seemed
X like a more appropriate place for the diffs.  (The split into three
X postings is not because of their size;  see comments in the postings
X for my reasons).
X 
X Now, here's what I've done:
X 
X In ispell itself:
X 
X 	- The personal dictionary is now hashed, just like the main one, and
X 	  supports suffixes just like the main one.  (It's not actually
X 	  integrated with the main one, because expanding the main one
X 	  is inefficient and poses a minor but troublesome technical
X 	  problem).  A personal dictionary of 28000+ words can be read in
X 	  within a few minutes (hey, nobody's perfect -- whatcha doing
X 	  with such a big dictionary anyway? :-).
X 	- New option "-c" is used by the new munchlist script to generate
X 	  suggested root/suffix combinations.
X 	- The -d option can now specify /dev/null, if you want to use
X 	  only your personal dictionary (this also saves startup time
X 	  with -c, and is used by the "munchlist" script, which is why
X 	  I put it in).
X 	- The -p option is now more flexible about its handling of pathnames.
X 	  An absolute pathname is always interpreted literally.  A
X 	  relative pathname from WORDLIST is looked up in $HOME first,
X 	  then in the current directory.  The -p option behaves in the
X 	  reverse fashion:  current directory first, then $HOME.  This
X 	  behavior seems more intuitive to me;  I'd be interested in
X 	  opinions of others if you don't find it intuitive.
X 	- Perhaps most important, I have completely overhauled the logic
X 	  in good.c, so that it (I think) matches what the README file
X 	  says it should, no more, no less.  The code has been extensively
X 	  tested, notably by interaction with the new expansion scripts;
X 	  nevertheless because of the extent of the changes and the
X 	  nature of the logic, I'd suggest a bit of suspicion for a while.
X 	  A technique we've found useful here is to do your normal work
X 	  with ispell, and then do a final check with UNIX spell or some
X 	  other slow, inconvenient program to make sure ispell didn't
X 	  screw up.
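
The lookup order for the personal dictionary described above can be
sketched as follows.  This is an illustration only:  the function
dict_candidates(), the from_p argument, and the DICT_PATHMAX limit are
hypothetical names, not ispell's own, and the sketch only builds the list
of paths to try, without opening anything.

```c
#include <stdio.h>
#include <string.h>

#define DICT_PATHMAX 1024       /* assumed limit for this sketch */

/*
 * Fill cand[] with the paths to try, in order, and return how many
 * there are.  from_p is nonzero when the name came from the -p option
 * and zero when it came from the WORDLIST environment variable.
 */
int dict_candidates(const char *name, const char *home, int from_p,
                    char cand[2][DICT_PATHMAX])
{
    int home_slot;

    if (name[0] == '/') {       /* absolute: always taken literally */
        snprintf(cand[0], DICT_PATHMAX, "%s", name);
        return 1;
    }
    /* -p: current directory first; WORDLIST: $HOME first. */
    home_slot = from_p ? 1 : 0;
    snprintf(cand[home_slot], DICT_PATHMAX, "%s/%s", home, name);
    snprintf(cand[1 - home_slot], DICT_PATHMAX, "%s", name);
    return 2;
}
```

A caller would try each candidate in turn and use the first one that can
be opened.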
X 
X New scripts:
X 
X 	- expand.awk:  an obsolete (but correct) awk script that does
X 	  the same thing as expand[12].sed, except slower.  The awk
X 	  script is also much easier to understand than the sed scripts.
X 	  Superseded by the sed scripts, except for very short input.
X 	- expand[12].sed:  the sed pipe
X 
X 	    "sed -f expand1.sed $file | sed -f expand2.sed"
X 
X 	  where "$file" is a raw dictionary file with suffixes
X 	  (e.g., dict.191), generates a list of each root alone, plus
X 	  the root expanded with each possible suffix (e.g.,
X 	  "BOTH/R/Z" produces "BOTH", "BOTHER", and "BOTHERS").  The
X 	  output should usually be sorted with the -u switch before
X 	  further processing.  These scripts are used by 'munchlist';
X 	  they are also useful for (a) checking an ispell dictionary
X 	  with some other spell-checking program and (b) figuring
X 	  out what a particular suffix does to a certain word without
X 	  reading the README file.
X 	- munchlist.sh:  a slow, but effective, shell script that takes
X 	  lists of expanded or unexpanded words as input and reduces
X 	  them to a (usually smaller) list of roots and suffixes.  The
X 	  result is written to standard output.  I think the documentation
X 	  forgot to mention the input must be one word per line.  I
X 	  have successfully used this script to combine dict.191 with
X 	  /usr/dict/words;  it's also useful (and a lot faster) on
X 	  private dictionaries.  For private dictionaries, it will also
X 	  remove any word that has since been added to the main dictionary.
X 
X Oh yes, I almost forgot:  the original documentation didn't mention
X that ispell is a long-name program.  If your "File:" display on the
X top line actually contains the misspelled word, you have long-name problems.
X My fixes don't address long names, because I finally have a way to
X compile long-name programs, thanks to "hash8".
X 
X 	Geoff Kuenning
X 	geoff@ITcorp.COM
X 	...!trwrb!desint!geoff
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'WISHES'" '(2317 characters)'
if test -f 'WISHES'
then
	echo shar: will not over-write existing file "'WISHES'"
else
sed 's/^X //' << \SHAR_EOF > 'WISHES'
X Things remaining to be done to ispell:
X 
X 	- The "munchlist" script can actually increase the size of a
X 	  dictionary.  For example, munching dict.191 (after my bug fixes
X 	  to it) reduced the number of words by about 40, but increased
X 	  the number of characters by a small percentage.  This is
X 	  because munchlist doesn't recognize duplicate suffixes that
X 	  generate the same result, except for the three special
X 	  "s-ending" suffixes J, Z, and X.  For example, right now
X 	  munchlist will make BATHER by adding the R suffix to both
X 	  BATH and BATHE.  It would be nice if munchlist could recognize
X 	  the redundancy and reduce its output so that each word was made
X 	  in only one way.
X 	- The characters in the -w option should be written to the header
X 	  of the hash file, and to a header in the personal dictionary,
X 	  so the user doesn't have to remember to specify them every time.
X 	- Buildhash should support the -w option.
X 	- Buildhash, munchlist, icombine, and the expand scripts should use
X 	  a character other than slash for the flag separator, so that slashes
X 	  are available to the -w option.  I tend to lean towards commas.
X 	- It might be nice to support multiple personal dictionaries.  On
X 	  the other hand, it's pretty easy to combine them with "cat".
X 	- Good.c should be table-driven, so that it is easier to modify for
X 	  other languages.  Ideally, it would support prefixes as well.
X 	- A small amount of string space could be saved if buildhash would
X 	  combine strings with common suffixes (e.g., "and" could be stored
X 	  as a pointer to the tail of "bland").
X 	- (Peter Wan) Ispell should have a "server mode" for large sites, to
X 	  get rid of the time needed to read in the dictionary.  On System V,
X 	  this could be accomplished by having the first execution of ispell
X 	  read the dictionary into a shared-memory region.  Later incarnations
X 	  would then get the dictionary by just attaching to the region.
X 	  One problem would be that the dictionary gets modified during
X 	  the run, so you might still have to do a memory-to-memory copy
X 	  after the attach.  The memory cost of two copies of the dictionary
X 	  might prohibit this on many machines.  Another approach is a
X 	  message-based "good.c server", but this too would have to deal
X 	  with the possibility of modifying the dictionary.
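
The string-sharing wish above (storing "and" as a pointer to the tail of
"bland") rests on a simple test, sketched here in C.  The function name
shared_tail() is hypothetical;  nothing like it exists in buildhash yet.

```c
#include <string.h>

/*
 * If `word` is identical to the tail of `stored`, return a pointer to
 * that tail inside `stored`;  otherwise return NULL.  The terminating
 * NUL is shared too, so the returned pointer is a complete C string.
 */
const char *shared_tail(const char *stored, const char *word)
{
    size_t slen = strlen(stored);
    size_t wlen = strlen(word);

    if (wlen > slen)
        return NULL;
    if (strcmp(stored + (slen - wlen), word) != 0)
        return NULL;
    return stored + (slen - wlen);
}
```

The hard part in practice would be finding a candidate `stored` string
quickly for each new word;  this sketch only checks one pair.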
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'buildhash.c'" '(10320 characters)'
if test -f 'buildhash.c'
then
	echo shar: will not over-write existing file "'buildhash.c'"
else
sed 's/^X //' << \SHAR_EOF > 'buildhash.c'
X /* -*- Mode: Text -*- */
X 
X #define MAIN
X 
X /*
X  * buildhash.c - make a hash table for ispell
X  *
X  * Pace Willisson, 1983
X  */
X 
X #include <ctype.h>
X #include <stdio.h>
X #ifdef USG
X #include <sys/types.h>
X #endif
X #include <sys/param.h>
X #include <sys/stat.h>
X #include "config.h"
X #include "ispell.h"
X 
X #define NSTAT 100
X struct stat dstat, cstat;
X 
X int numwords, hashsize;
X 
X char *malloc();
X char *realloc ();
X 
X struct dent *hashtbl;
X 
X char *Dfile;
X char *Hfile;
X 
X char Cfile[MAXPATHLEN];
X char Sfile[MAXPATHLEN];
X 
X main (argc,argv)
X int argc;
X char **argv;
X {
X 	FILE *countf;
X 	FILE *statf;
X 	int stats[NSTAT];
X 	int i;
X 
X 	if (argc > 1) {
X 		++argv;
X 		Dfile = *argv;
X 		if (argc > 2) {
X 			++argv;
X 			Hfile = *argv;
X 		}
X 		else
X 			Hfile = DEFHASH;
X 	}
X 	else {
X 		Dfile = DEFDICT;
X 		Hfile = DEFHASH;
X 	}
X 
X 	sprintf(Cfile,"%s.cnt",Dfile);
X 	sprintf(Sfile,"%s.stat",Dfile);
X 
X 	if (stat (Dfile, &dstat) < 0) {
X 		fprintf (stderr, "No dictionary (%s)\n", Dfile);
X 		exit (1);
X 	}
X 
X 	if (stat (Cfile, &cstat) < 0 || dstat.st_mtime > cstat.st_mtime)
X 		newcount ();
X 
X 	if ((countf = fopen (Cfile, "r")) == NULL) {
X 		fprintf (stderr, "No count file\n");
X 		exit (1);
X 	}
X 	numwords = 0;
X 	fscanf (countf, "%d", &numwords);
X 	fclose (countf);
X 	if (numwords == 0) {
X 		fprintf (stderr, "Bad count file\n");
X 		exit (1);
X 	}
X 	hashsize = numwords;
X 	readdict ();
X 
X 	if ((statf = fopen (Sfile, "w")) == NULL) {
X 		fprintf (stderr, "Can't create %s\n", Sfile);
X 		exit (1);
X 	}
X 
X 	for (i = 0; i < NSTAT; i++)
X 		stats[i] = 0;
X 	for (i = 0; i < hashsize; i++) {
X 		struct dent *dp;
X 		int j;
X 		if (hashtbl[i].used == 0) {
X 			stats[0]++;
X 		} else {
X 			for (j = 1, dp = &hashtbl[i]; dp->next != NULL; j++, dp = dp->next)
X 				;
X 			if (j >= NSTAT)
X 				j = NSTAT - 1;
X 			stats[j]++;
X 		}
X 	}
X 	for (i = 0; i < NSTAT; i++)
X 		fprintf (statf, "%d: %d\n", i, stats[i]);
X 	fclose (statf);
X 
X 	filltable ();
X 
X 	output ();
X 	exit(0);
X }
X 
X output ()
X {
X 	FILE *outfile;
X 	struct hashheader hashheader;
X 	int strptr, n, i;
X 
X 	if ((outfile = fopen (Hfile, "w")) == NULL) {
X 		fprintf (stderr, "can't create %s\n",Hfile);
X 		return;
X 	}
X 	hashheader.magic = MAGIC;
X 	hashheader.stringsize = 0;
X 	hashheader.tblsize = hashsize;
X 	fwrite (&hashheader, sizeof hashheader, 1, outfile);
X 	strptr = 0;
X 	for (i = 0; i < hashsize; i++) {
X 		n = strlen (hashtbl[i].word) + 1;
X #ifdef CAPITALIZE
X 		if (hashtbl[i].followcase)
X 			n += (hashtbl[i].word[n] & 0xFF) * (n + 1) + 1;
X #endif
X 		fwrite (hashtbl[i].word, n, 1, outfile);
X 		hashtbl[i].word = (char *)strptr;
X 		strptr += n;
X 	}
X 	/* Pad file to a struct dent boundary for efficiency. */
X 	n = (strptr + sizeof hashheader) % sizeof (struct dent);
X 	if (n != 0) {
X 		n = sizeof (struct dent) - n;
X 		strptr += n;
X 		while (--n >= 0)
X 		    putc ('\0', outfile);
X 	}
X 	for (i = 0; i < hashsize; i++) {
X 		if (hashtbl[i].next != 0) {
X 			int x;
X 			x = hashtbl[i].next - hashtbl;
X 			hashtbl[i].next = (struct dent *)x;
X 		} else {
X 			hashtbl[i].next = (struct dent *)-1;
X 		}
X 	}
X 	fwrite (hashtbl, sizeof (struct dent), hashsize, outfile);
X 	hashheader.stringsize = strptr;
X 	rewind (outfile);
X 	fwrite (&hashheader, sizeof hashheader, 1, outfile);
X 	fclose (outfile);
X }
X 
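X /*
X  * Move collision entries that readdict() allocated outside the table
X  * into free slots inside it, so that output() can turn every "next"
X  * pointer into a table index.
X  */
X filltable ()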
X {
X 	struct dent *freepointer, *nextword, *dp;
X 	int i;
X 
X 	for (freepointer = hashtbl; freepointer->used; freepointer++)
X 		;
X 	for (nextword = hashtbl, i = numwords; i != 0; nextword++, i--) {
X 		if (nextword->used == 0) {
X 			continue;
X 		}
X 		if (nextword->next == NULL) {
X 			continue;
X 		}
X 		if (nextword->next >= hashtbl && nextword->next < hashtbl + hashsize) {
X 			continue;
X 		}
X 		dp = nextword;
X 		while (dp->next) {
X 			if (freepointer >= hashtbl + hashsize) {
X 				fprintf (stderr, "table overflow\n");
X 				getchar ();
X 				break;
X 			}
X 			*freepointer = *(dp->next);
X 			dp->next = freepointer;
X 			dp = freepointer;
X 
X 			while (freepointer->used)
X 				freepointer++;
X 		}
X 	}
X }
X 
X 
X readdict ()
X {
X 	struct dent d;
X 	register struct dent *dp;
X 	char lbuf[100];
X 	FILE *dictf;
X 	int i;
X 	int h;
X 	int len;
X 	register char *p;
X 
X 	if ((dictf = fopen (Dfile, "r")) == NULL) {
X 		fprintf (stderr, "Can't open dictionary\n");
X 		exit (1);
X 	}
X 
X 	hashtbl = (struct dent *) calloc (numwords, sizeof (struct dent));
X 	if (hashtbl == NULL) {
X 		fprintf (stderr, "couldn't allocate hash table\n");
X 		exit (1);
X 	}
X 
X 	i = 0;
X 	while (fgets (lbuf, sizeof lbuf, dictf) != NULL) {
X 		if ((i & 1023) == 0) {
X 			printf ("%d ", i);
X 			fflush (stdout);
X 		}
X 		i++;
X 
X 		p = &lbuf [ strlen (lbuf) - 1 ];
X 		if (*p == '\n')
X 			*p = 0;
X 
X 		if (makedent (lbuf, &d) < 0)
X 			continue;
X 
X 		len = strlen (lbuf);
X #ifdef CAPITALIZE
X 		if (d.followcase)
X 			d.word = malloc (2 * len + 4);
X 		else
X #endif
X 			d.word = malloc (len + 1);
X 		if (d.word == NULL) {
X 			fprintf (stderr, "couldn't allocate space for word %s\n", lbuf);
X 			exit (1);
X 		}
X 		strcpy (d.word, lbuf);
X #ifdef CAPITALIZE
X 		if (d.followcase) {
X 			p = d.word + len + 1;
X 			*p++ = 1;		/* Count of capitalizations */
X 			*p++ = '-';		/* Don't keep in pers dict */
X 			strcpy (p, lbuf);
X 			
X 		}
X 		for (p = d.word;  *p;  p++) {
X 			if (mylower (*p))
X 				*p = toupper (*p);
X 		}
X #endif
X 
X 		h = hash (d.word, len, hashsize);
X 
X 		dp = &hashtbl[h];
X 		if (dp->used == 0) {
X 			*dp = d;
X 		} else {
X 
X #ifdef CAPITALIZE
X 			while (dp != NULL  &&  strcmp (dp->word, d.word) != 0)
X 			    dp = dp->next;
X 			if (dp != NULL) {
X 			    if (d.followcase
X 			      ||  (dp->followcase  &&  !d.allcaps
X 				&&  !d.capitalize)) {
X 				/* Add a specific capitalization */
X 				if (dp->followcase) {
X 				    p = &dp->word[len + 1];
X 				    (*p)++;	/* Bump counter */
X 				    dp->word = realloc (dp->word,
X 				      ((*p & 0xFF) + 1) * (len + 2));
X 				    if (dp->word == NULL) {
X 					fprintf (stderr,
X 					  "couldn't allocate space for word %s\n",
X 					  lbuf);
X 					exit (1);
X 				    }
X 				    p = &dp->word[len + 1];
X 				    p += ((*p & 0xFF) - 1) * (len + 2) + 1;
X 				    *p++ = '-';
X 				    strcpy (p,
X 				      d.followcase ? &d.word[len + 3] : lbuf);
X 				}
X 				else {
X 				    /* d.followcase must be true */
X 				    /* thus, d.capitalize and d.allcaps are */
X 				    /* clear */
X 				    free (dp->word);
X 				    dp->word = d.word;
X 				    dp->followcase = 1;
X 				    dp->k_followcase = 1;
X 				    /* Code later will clear dp->allcaps. */
X 				}
X 			    }
X 			    /* Combine two capitalizations.  If d was */
X 			    /* allcaps, dp remains unchanged */
X 			    if (d.allcaps == 0) {
X 				/* dp is the entry that will be kept.  If */
X 				/* dp is followcase, the capitalize flag */
X 				/* reflects whether capitalization "may" */
X 				/* occur.  If not, it reflects whether it */
X 				/* "must" occur. */
X 				if (d.capitalize) {	/* ie lbuf was cap'd */
X 				    if (dp->followcase)
X 					dp->capitalize = 1;	/* May */
X 				    else if (dp->allcaps) /* ie not lcase */
X 					dp->capitalize = 1;	/* Must */
X 				}
X 				else {		/* lbuf was followc or all-lc */
X 				    if (!dp->followcase)
X 					dp->capitalize = 0;	/* May */
X 				}
X 				dp->k_capitalize = dp->capitalize;
X 				dp->allcaps = 0;
X 				dp->k_allcaps = 0;
X 			    }
X 			}
X 			else {
X #endif
X 			    dp = (struct dent *) malloc (sizeof (struct dent));
X 			    if (dp == NULL) {
X 				fprintf (stderr,
X 				  "couldn't allocate space for collision\n");
X 				exit (1);
X 			    }
X 			    *dp = d;
X 			    dp->next = hashtbl[h].next;
X 			    hashtbl[h].next = dp;
X 			}
X 		}
X 	}
X 	printf ("\n");
X }
X 
X /*
X  * fill in the flags in d, and put a null after the word in lbuf
X  */
X 
X makedent (lbuf, d)
X char *lbuf;
X struct dent *d;
X {
X 	char *p, *index();
X 
X 	d->next = NULL;
X 	d->used = 1;
X 	d->v_flag = 0;
X 	d->n_flag = 0;
X 	d->x_flag = 0;
X 	d->h_flag = 0;
X 	d->y_flag = 0;
X 	d->g_flag = 0;
X 	d->j_flag = 0;
X 	d->d_flag = 0;
X 	d->t_flag = 0;
X 	d->r_flag = 0;
X 	d->z_flag = 0;
X 	d->s_flag = 0;
X 	d->p_flag = 0;
X 	d->m_flag = 0;
X 	d->keep = 0;
X #ifdef CAPITALIZE
X 	d->allcaps = 0;
X 	d->capitalize = 0;
X 	d->followcase = 0;
X 	/*
X 	** Figure out the capitalization rules from the capitalization of
X 	** the sample entry.  Only one of followcase, allcaps, and capitalize
X 	** will be set.  Combinations are generated by higher-level code.
X 	*/
X 	for (p = lbuf;  *p  &&  *p != '/';  p++) {
X 		if (mylower (*p))
X 			break;
X 	}
X 	if (*p == '\0'  ||  *p == '/')
X 		d->allcaps = 1;
X 	else {
X 		for (  ;  *p  &&  *p != '/';  p++) {
X 			if (myupper (*p))
X 				break;
X 		}
X 		if (*p == '\0'  ||  *p == '/') {
X 			/*
X 			** No uppercase letters follow the lowercase ones.
X 			** If the first two letters are capitalized, it's
X 			** "followcase". If the first one is capitalized, it's
X 			** "capitalize".
X 			*/
X 			if (myupper (lbuf[0])) {
X 				if (myupper (lbuf[1]))
X 					d->followcase = 1;
X 				else
X 					d->capitalize = 1;
X 			}
X 		}
X 		else
X 			d->followcase = 1;	/* .../lower/upper */
X 	}
X 	d->k_allcaps = d->allcaps ;
X 	d->k_capitalize = d->capitalize;
X 	d->k_followcase = d->followcase;
X #endif
X 
X 	p = index (lbuf, '/');
X 	if (p != NULL)
X 		*p = 0;
X 	if (strlen (lbuf) > WORDLEN - 1) {
X 		printf ("%s: word too big\n", lbuf);
X 		return (-1);
X 	}
X 
X 	if (p == NULL)
X 		return (0);
X 
X 	p++;
X 	while (*p != '\0'  &&  *p != '\n') {
X 		if (mylower (*p))
X 			*p = toupper (*p);
X 		switch (*p) {
X 		case 'V': d->v_flag = 1; break;
X 		case 'N': d->n_flag = 1; break;
X 		case 'X': d->x_flag = 1; break;
X 		case 'H': d->h_flag = 1; break;
X 		case 'Y': d->y_flag = 1; break;
X 		case 'G': d->g_flag = 1; break;
X 		case 'J': d->j_flag = 1; break;
X 		case 'D': d->d_flag = 1; break;
X 		case 'T': d->t_flag = 1; break;
X 		case 'R': d->r_flag = 1; break;
X 		case 'Z': d->z_flag = 1; break;
X 		case 'S': d->s_flag = 1; break;
X 		case 'P': d->p_flag = 1; break;
X 		case 'M': d->m_flag = 1; break;
X 		case 0:
X  			fprintf (stderr, "no flags on word %s\n", lbuf);
X 			continue;
X 		default:
X 			fprintf (stderr, "unknown flag %c word %s\n", 
X 					*p, lbuf);
X 			break;
X 		}
X 		p++;
X 		if (*p == '/')		/* Handle old-format dictionaries too */
X 			p++;
X 	}
X 	return (0);
X }
X 
X newcount ()
X {
X 	char buf[200];
X 	char lastbuf[200];
X 	FILE *d;
X 	int i;
X 	register char *cp;
X 
X 	fprintf (stderr, "Counting words in dictionary ...\n");
X 
X 	if ((d = fopen (Dfile, "r")) == NULL) {
X 		fprintf (stderr, "Can't open dictionary\n");
X 		exit (1);
X 	}
X 
X 	for (i = 0, lastbuf[0] = '\0';  fgets (buf, sizeof buf, d);  ) {
X 		for (cp = buf;  *cp;  cp++) {
X 			if (mylower (*cp))
X 				*cp = toupper (*cp);
X 		}
X 		if (strcmp (buf, lastbuf) != 0) {
X 			if ((++i & 1023) == 0) {
X 				printf ("%d ", i);
X 				fflush (stdout);
X 			}
X 			strcpy (lastbuf, buf);
X 		}
X 	}
X 	fclose (d);
X 	printf ("\n%d words\n", i);
X 	if ((d = fopen (Cfile, "w")) == NULL) {
X 		fprintf (stderr, "can't create %s\n", Cfile);
X 		exit (1);
X 	}
X 	fprintf (d, "%d\n", i);
X 	fclose (d);
X }
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'config.X'" '(4594 characters)'
if test -f 'config.X'
then
	echo shar: will not over-write existing file "'config.X'"
else
sed 's/^X //' << \SHAR_EOF > 'config.X'
X /*
X  * This is the configuration file for ispell.  Thanks to Bob McQueer
X  * for creating it and making the necessary changes elsewhere to
X  * support it.
X  * Look through this file from top to bottom, and edit anything that
X  * needs editing.  There are also five or six variables in the
X  * Makefile that you must edit.  Note that the Makefile edits this
X  * file (config.X) to produce config.h.  If you are looking at
X  * config.h, you're in the wrong file.
X  *
X  * Don't change the funny-looking lines with !!'s in them;  see the
X  * Makefile!
X  */
X 
X /*
X ** library directory for hash table(s) / default hash table name
X ** If you intend to use multiple dictionary files, I would suggest
X ** LIBDIR be a directory which will contain nothing else, so sensible
X ** names can be constructed for the -d option without conflict.
X */
X #ifndef LIBDIR
X #define LIBDIR "!!LIBDIR!!"
X #endif
X #ifndef DEFHASH
X #define DEFHASH "!!DEFHASH!!"
X #endif
X 
X #ifdef USG
X #define index strchr
X #define rindex strrchr
X #endif
X 
X /* environment variable for user's word list */
X #ifndef PDICTVAR
X #define PDICTVAR "WORDLIST"
X #endif
X 
X /* default word list */
X #ifndef DEFPDICT
X #define DEFPDICT ".ispell_words"
X #endif
X 
X /* environment variable for include file string */
X #ifndef INCSTRVAR
X #define INCSTRVAR "INCLUDE_STRING"
X #endif
X 
X /* default include string */
X #ifndef DEFINCSTR
X #define DEFINCSTR "&Include_File&"
X #endif
X 
X /* mktemp template for temporary file - MUST contain 6 consecutive X's */
X #ifndef TEMPNAME
X #define TEMPNAME "/tmp/ispellXXXXXX"
X #endif
X 
X /* default dictionary file */
X #ifndef DEFDICT
X #define DEFDICT "!!DEFDICT!!"
X #endif
X 
X /* path to LOOK (if look(1) command is available) */
X #ifndef LOOK
X #undef LOOK
X #endif
X 
X /* path to egrep (use speeded up version if available) */
X #ifndef EGREPCMD
X #define EGREPCMD "/bin/egrep"
X #endif
X 
X /* path to wordlist for Lookup command (typically /usr/dict/{words|web2}) */
X #ifndef WORDS
X #define WORDS	"/usr/dict/words"
X #endif
X 
X /* buffer size to use for file names if not in sys/param.h */
X #ifndef MAXPATHLEN
X #define MAXPATHLEN 240
X #endif
X 
X /* word length allowed in dictionary by buildhash */
X #define WORDLEN 30
X 
X /* suppress the 8-bit character feature */
X #ifndef NO8BIT
X #define NO8BIT
X #endif
X 
X /* maximum number of include files supported by xgets;  set to 0 to disable */
X #ifndef MAXINCLUDEFILES
X #define MAXINCLUDEFILES	5
X #endif
X 
X /* Approximate number of words in the full dictionary, after munching.
X ** Err on the high side unless you are very short on memory, in which
X ** case you might want to change the tables in tree.c and also increase
X ** MAXPCT.
X **
X ** (Note:  dict.191 is a bit over 15000 words.  dict.191 munched with
X ** /usr/dict/words is a little over 28000).
X */
X #ifndef BIG_DICT
X #define BIG_DICT	29000
X #endif
X 
X /*
X ** Maximum hash table fullness percentage.  Larger numbers save space
X ** at the cost of slower lookups.
X **/
X #ifndef MAXPCT
X #define MAXPCT	70		/* Expand table when 70% full */
X #endif
X 
X /*
X ** the isXXXX macros normally only check ASCII range.  These are used
X ** instead for text characters, which we assume may be 8 bit.  The
X ** NO8BIT ifdef shuts off significance of 8 bit characters.  If you are
X ** using this, and your ctype.h already masks, you can simplify.
X */
X #ifdef NO8BIT
X #define myupper(X) isupper((X)&0x7f)
X #define mylower(X) islower((X)&0x7f)
X #define myspace(X) isspace((X)&0x7f)
X #define myalpha(X) isalpha((X)&0x7f)
X #else
X #define myupper(X) (!((X)&0x80) && isupper(X))
X #define mylower(X) (!((X)&0x80) && islower(X))
X #define myspace(X) (!((X)&0x80) && isspace(X))
X #define myalpha(X) (!((X)&0x80) && isalpha(X))
X #endif
X 
X /*
X ** the NOPARITY mask is applied to user input characters from the terminal
X ** in order to mask out the parity bit.
X */
X #ifdef NO8BIT
X #define NOPARITY 0x7f
X #else
X #define NOPARITY 0xff
X #endif
X 
X 
X /*
X ** the terminal mode for ispell, set to CBREAK or RAW
X **
X */
X #ifndef TERM_MODE
X #define TERM_MODE	CBREAK
X #endif
X 
X /*
X ** Define this if you want your columns of words to be of equal length.
X ** This will spread short word lists across the screen instead of down it.
X */
X #ifndef EQUAL_COLUMNS
X #undef EQUAL_COLUMNS
X #endif
X 
X /*
X ** This is the extension that will be added to backup files
X */
X #ifndef	BAKEXT
X #define	BAKEXT	".bak"
X #endif
X 
X /*
X ** Define this if you want the capitalization feature.  This will increase
X ** the size of the hashed dictionary on most 16-bit and some 32-bit machines.
X */
X #ifndef CAPITALIZE
X #define CAPITALIZE
X #endif
X 
X /*
X ** Define this if you want your personal dictionary sorted.  This may take
X ** a long time for very large dictionaries.  Dictionaries larger than
X ** SORTPERSONAL words will not be sorted.
X */
X #ifndef SORTPERSONAL
X #define SORTPERSONAL	1000
X #endif
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'fixdict.sh'" '(2502 characters)'
if test -f 'fixdict.sh'
then
	echo shar: will not over-write existing file "'fixdict.sh'"
else
sed 's/^X //' << \SHAR_EOF > 'fixdict.sh'
X : Use /bin/sh
X #
X #	Add capitalization information to an ispell dictionary
X #
X #	Usage:
X #
X #	fixdict dict-file
X #
X #	Requires availability of UNIX spell.  The new dictionary is
X #	rewritten in place.  A list of words that couldn't be
X #	resolved (because spell doesn't know them) is written to
X #	standard output.  This list appears in lowercase in the
X #	dictionary, and if there are any errors they must be edited
X #	by hand.
X #
X #	The final dictionary appears in expanded form and must be
X #	passed through munchlist to regenerate suffixes.
X #
X LIBDIR=/tmp2/lib
X EXPAND1=${LIBDIR}/isexp1.sed
X EXPAND2=${LIBDIR}/isexp2.sed
X EXPAND3=${LIBDIR}/isexp3.sed
X EXPAND4=${LIBDIR}/isexp4.sed
X TDIR=${TMPDIR:-/tmp}
X TMP=${TDIR}/fix$$
X 
X trap "/bin/rm -f ${TMP}*; exit 1" 1 2 15
X sed -f ${EXPAND1} $1 | sed -f ${EXPAND2} \
X   | sed -f ${EXPAND3} | sed -f ${EXPAND4} \
X   | tr '[A-Z]' '[a-z]' \
X   | spell \
X   | sort > ${TMP}a
X #
X # ${TMP}a contains all the words that spell doesn't like.
X # Now figure out which of those are because spell doesn't know them at
X # all, and leave those in ${TMP}b.
X #
X tr '[a-z]' '[A-Z]' < ${TMP}a | spell | tr '[A-Z]' '[a-z]' > ${TMP}b
X #
X # The wrongly-capitalized words are those that spell didn't object to
X # in the last step.  Produce a list of them, and capitalize the
X # first letter of each.  Save this list in ${TMP}c.
X #
X comm -23 ${TMP}a ${TMP}b \
X   | sed 's/^a/A/;s/^b/B/;s/^c/C/;s/^d/D/;s/^e/E/;s/^f/F/;s/^g/G/;s/^h/H/
X      s/^i/I/;s/^j/J/;s/^k/K/;s/^l/L/;s/^m/M/;s/^n/N/;s/^o/O/;s/^p/P/
X      s/^q/Q/;s/^r/R/;s/^s/S/;s/^t/T/;s/^u/U/;s/^v/V/;s/^w/W/;s/^x/X/
X      s/^y/Y/;s/^z/Z/' > ${TMP}c
X #
X # Find out which of those spell objects to, saving the failures in ${TMP}d.
X #
X spell ${TMP}c > ${TMP}d
X #
X # Extract the words which were correctly capitalized at the first letter,
X # combine them with an all-capitals version of the ones that weren't, and
X # put the result into ${TMP}e.
X #
X (comm -23 ${TMP}c ${TMP}d;  tr '[a-z]' '[A-Z]' < ${TMP}d) \
X   | sort -o ${TMP}e
X #
X # At this point, ${TMP}b contains the words that spell just plain doesn't
X # like, and ${TMP}e contains the words that are now capitalized correctly.
X #
X /bin/rm ${TMP}[cd]
X #
X # Put it all together, rewriting the dictionary in place.
X #
X sed -f ${EXPAND1} $1 | sed -f ${EXPAND2} \
X   | sed -f ${EXPAND3} | sed -f ${EXPAND4} \
X   | tr '[A-Z]' '[a-z]' \
X   | sort \
X   | comm -23 - ${TMP}a \
X   | sort -f -o $1 - ${TMP}b ${TMP}e
X #
X # Finally, write the list of words that have questionable capitalization
X # to the standard output.
X #
X cat ${TMP}b
X /bin/rm ${TMP}*
SHAR_EOF
chmod +x 'fixdict.sh'
fi # end of overwriting check
#	End of shell archive
exit 0
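
[The comments in makedent() in buildhash.c describe how the capitalization
class of a dictionary entry is inferred from the sample spelling.  What
follows is only a standalone sketch of that rule, not code from the archive.
The enum and the name classify_caps() are made up for illustration; the real
code sets the allcaps/capitalize/followcase bit fields in struct dent and
tests characters with the mylower()/myupper() macros from config.X, where
this sketch uses plain <ctype.h> calls.]

```c
#include <ctype.h>

/* Hypothetical names for this sketch; buildhash.c uses bit fields instead. */
enum capclass { CAP_LOWER, CAP_ALLCAPS, CAP_CAPITALIZE, CAP_FOLLOWCASE };

/*
 * Classify the capitalization of a dictionary entry such as "Geoff/M".
 * Only the part before the '/' (the word itself) is examined.
 */
enum capclass classify_caps(const char *entry)
{
    const char *p;

    /* Find the first lowercase letter, if any. */
    for (p = entry; *p && *p != '/'; p++)
        if (islower((unsigned char)*p))
            break;
    if (*p == '\0' || *p == '/')
        return CAP_ALLCAPS;             /* no lowercase letters at all */

    /* Look for an uppercase letter after that lowercase one. */
    for (; *p && *p != '/'; p++)
        if (isupper((unsigned char)*p))
            break;
    if (*p != '\0' && *p != '/')
        return CAP_FOLLOWCASE;          /* lower followed by upper */

    /* No uppercase follows the lowercase run: inspect the leading letters. */
    if (isupper((unsigned char)entry[0]))
        return isupper((unsigned char)entry[1]) ? CAP_FOLLOWCASE
                                                : CAP_CAPITALIZE;
    return CAP_LOWER;                   /* entirely lowercase */
}
```

[With this sketch, an entry like "UNIX/S" classifies as all-caps, "Geoff/M"
as capitalize, and "ITcorp" or "TeX" as followcase; an all-lowercase entry
sets none of the three flags.  As in makedent(), exactly one class comes out
per sample entry; combining several spellings of the same word is left to
the caller, which is what readdict() does.]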

-- 
Brandon S. Allbery	{decvax,cbatt,cbosgd}!cwruecmp!ncoast!allbery
Tridelta Industries	{ames,mit-eddie,talcott}!necntc!ncoast!allbery
7350 Corporate Blvd.	necntc!ncoast!allbery@harvard.HARVARD.EDU
Mentor, OH 44060	+01 216 255 1080	(also eddie.MIT.EDU)