[news.groups] Moderation improves comp.sources.misc?

gnu@hoptoad.UUCP (06/08/87)
Bill Wohler writes:
>                       ...when we want to get our sources, we move over
>   to the moderated sources groups where the subject line and the
>   initial contents of the article will really tell us what the sources
>   are.

It's clear that Bill has not been reading comp.sources.misc.  Not every
posting is this bad, but...see how far down you have to read before you
find out what this program is or does, assuming you have not heard of it:

From: allbery@ncoast.UUCP (Brandon S. Allbery)
Newsgroups: comp.sources.misc
Subject: Ispell Version 2.0 Beta Part 01/04
Approved: allbery@ncoast.UUCP

[Since this is a beta release, it is here.  The final version will show up in
comp.sources.unix with the other good stuff.  ++bsa]

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	UPDATE2
#	Makefile
#	Othercap.txt
#	README
#	UPDATE
#	WISHES
#	buildhash.c
#	config.X
#	fixdict.sh
# This archive created: Sat May 30 17:13:25 1987
export PATH; PATH=/bin:$PATH
echo shar: extracting "'UPDATE2'" '(6964 characters)'
if test -f 'UPDATE2'
then
	echo shar: will not over-write existing file "'UPDATE2'"
else
sed 's/^X //' << \SHAR_EOF > 'UPDATE2'
X This is the beta-test release of ispell version 2.0.  As I discussed in
X a previous comp.sources.d posting, I will collect bug fixes for this version,
X and then post a final version with dictionary to mod.sources, at which time
X I will wash my hands of the bloody thing.
X 
X Because I am short on time, I can only promise to integrate bug fixes.  If
X you send me improvements, they will very likely disappear into a black hole.
X Sorry, but it takes time to integrate every change, even the ones I can't test
X because they're for BSD.  If you plan on hacking extensively, I'd suggest
X waiting for the mod.sources posting, or you may have to repeat some work.
X 
X Send bug reports/fixes to:
X 
X 	Geoff Kuenning   geoff@ITcorp.com   {uunet,trwrb}!desint!geoff
X ---------------------------------------------------------------------------
X INSTRUCTIONS:
X 
X In response to many requests, this posting contains all sources except the
X dictionary.  Since shar won't overwrite files and some file names have
X changed, you should unshar it in an empty directory.  If you installed my
X previous posting, you may also want to remove expand[12].sed from
X /usr/public/lib, since these scripts have been renamed to isexp[1-4].sed.
X 
X Once you have unpacked, edit "Makefile" and "config.X" according to the
X comments in each.  Note that the Makefile edits "config.X" further to
X produce "config.h".  Then type "make install" and go away for a while
X (if you're brave and foolish;  otherwise do the equivalent more carefully).
X 
X If you don't already have a dictionary, please don't ask me for one.  Ask
X a neighbor.  If they don't have one, and you can't make one from
X /usr/dict/words or /usr/dict/web2 by running it through "munchlist",
X try running a bunch of text files through "makedict.sh".  (It depends on
X UNIX spell, though in a pinch you can do without if your source files are
X very good).  If all else fails, you'll just have to wait for the mod.sources
X posting.
X 
X If you do have a dictionary, and you would like to use the new CAPITALIZE
X feature, you will have to convert your dictionary.  If you have UNIX spell,
X the "fixdict.sh" script will do this for you, without violating any
X copyrights or license restrictions.  This script replaces the current
X dictionary, and writes a (short) list of questionable capitalizations
X to standard output;  these must be analyzed and, if necessary, corrected
X by hand.  The file "Othercap.txt" (included in this posting) contains
X words that are in dict.191 which will be missed by "fixdict.sh" with
X the standard UNIX spell program.
X 
X Problems fixed in this posting:
X 
X     (1) Ispell did not duplicate the permissions on the files it edited.
X 	(David Neves)
X     (2) The actual maximum number of possible corrections was 99, not 100.
X     (3) Ispell assumed a terminal width of 80 columns, rather than
X 	consulting the termcap entry.
X     (4) Long lines could wrap around on the terminal, damaging the
X 	display.
X     (5) The includes of types.h and param.h need to be interchanged on
X 	BSD systems.  (Ken Yap, Jacob Gore)
X     (6) The givehelp() routine now actually waits for a space to be typed
X 	like it claims, instead of just waiting for any character.  (Steve
X 	Kelem)
X     (7) Good.c was missing a declaration of the index() (strchr) routine.
X     (8) The excessive strlen() calls in good.c have been removed, and
X 	register declarations have been added.  (Joe Orost, Rich Salz)
X     (9) Some systems get "multiple symbol definition" messages when
X 	linking (Joe Orost).
X    (10) Expand[12].sed didn't handle new-format dictionaries.
X    (11) Some minor errors in the usage message have been corrected.
X    (12) If a space (or other non-word character) is inserted using "R",
X         ispell would treat the entire replacement string as a token
X         and try to find it in the dictionary.
X    (13) Ispell now follows the proper UNIX procedure for signal catching
X 	(i.e., it doesn't catch SIGINT if it's run in background).
X    (14) The handling of process stopping on BSD systems has been cleaned
X 	up and made to work right (Mark Davies).
X 
X Improvements added in this posting:
X 
X     (1) Ispell's handling of troff size and font requests has again been
X 	improved.  (Isaac Balbin, Steve Kelem, Joe Orost) (Everybody seems
X 	to fix the particular problem that bothers their individual world :-).
X     (2) If ispell is run on a file with an extension of ".tex", it will
X 	automatically go into TeX mode for this and subsequent files.
X 	(Steve Kelem)
X     (3) The emacs support now includes "ispell-buffer", and ispell is run
X 	from "ispell-program-name" so you can specify an explicit path.
X 	(Stewart Clamen)
X     (4) There is a TERM_MODE configuration option so you can choose between RAW
X 	and CBREAK modes.  The default has been changed to CBREAK (it used
X 	to be RAW) to preserve parity.  (Joe Orost)
X     (5) Term.c will now compile on V7 systems (Joe Orost)
X     (6) Register declarations have been added throughout.  (Joe Orost)
X     (7) Ispell now buffers stdout, improving display performance slightly.
X     (8) The backup file extension is now configurable (George Sipe).
X     (9) All config.X definitions except MAGIC can be overridden with -D
X 	switches (George Sipe).
X    (10) There is now a version.h file, so you will know what version you
X 	have (I guess Larry Wall deserves credit.  Even though he didn't
X 	harass me, guilt set in).  There is also a -v switch to print the
X 	version information.
X    (11) (This was a lot tougher than I expected).  Ispell now knows about
X 	capitalization and proper names (yay).  It recognizes four flavors
X 	of words:  lowercase, capitalized, all-capitals, and "followcase".
X 	If a word appears in the dictionary in lowercase, it is accepted
X 	in lowercase, capitalized, or all-capitals.  If it is capitalized
X 	in the dictionary, all-lowercase is disallowed.  If it is all-caps
X 	in the dictionary, it must always appear in all caps.  Finally,
X 	if the word has "weird" capitalization (like the name of my company,
X 	ITcorp or ITCorp), either that capitalization must be followed
X 	*exactly* or else the word must appear in all-caps.  More than
X 	one of these variants may occur;  "munchlist" will remove unneeded
X 	ones from a dictionary.  Finally, if you blow capitalization,
X 	ispell will offer a list of correctly-capitalized alternatives.
X 	Because it increases the size of the hash file, this feature is
X 	optional (see the CAPITALIZE option in config.X).
X    (12) A new shell script ("fixdict.sh") is provided to aid in converting
X 	your old dictionary to provide capitalization information.
X    (13) Buildhash now pads the string table to a "struct dent" boundary
X 	in the hash file, so that it will be aligned when reading in.  On
X 	many machines, this will speed startup.
X    (14) The -w option now accepts characters specified in octal with
X 	backslashes like any other UNIX program, as well as the previous
X 	decimal option, and it will also accept numeric strings of less
X 	than three digits.
X    (15) The ispell.el file now supports ispell-region and ispell-buffer.
SHAR_EOF
fi # end of overwriting check
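
The capitalization rules in item (11) of UPDATE2 can be sketched as a small
/bin/sh function.  This is an illustration only, not code from ispell: the
names to_upper, to_lower, to_capitalized, classify, and cap_ok are invented
here, and only ASCII letters are handled.

```sh
# Case helpers built on tr (ASCII letters only).
to_upper() { printf '%s' "$1" | tr 'a-z' 'A-Z'; }
to_lower() { printf '%s' "$1" | tr 'A-Z' 'a-z'; }

# First letter upper case, the rest lower case.
to_capitalized() {
    first=$(printf '%s' "$1" | cut -c1)
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '%s%s' "$(to_upper "$first")" "$(to_lower "$rest")"
}

# classify FORM: print lower, allcaps, capitalized, or followcase.
classify() {
    if   [ "$1" = "$(to_lower "$1")" ];       then echo lower
    elif [ "$1" = "$(to_upper "$1")" ];       then echo allcaps
    elif [ "$1" = "$(to_capitalized "$1")" ]; then echo capitalized
    else echo followcase
    fi
}

# cap_ok DICTFORM WORD: succeed if WORD's capitalization is acceptable
# for a dictionary entry spelled DICTFORM.
cap_ok() {
    dictform=$1 word=$2
    # Must be the same word once case is ignored.
    [ "$(to_upper "$dictform")" = "$(to_upper "$word")" ] || return 1
    # All-capitals is accepted for every flavor of entry.
    [ "$word" = "$(to_upper "$word")" ] && return 0
    case $(classify "$dictform") in
    lower)       [ "$word" = "$(to_lower "$word")" ] ||
                 [ "$word" = "$(to_capitalized "$word")" ] ;;
    capitalized) [ "$word" = "$(to_capitalized "$word")" ] ;;
    allcaps)     return 1 ;;
    followcase)  [ "$word" = "$dictform" ] ;;
    esac
}
```

With these definitions, "cap_ok hello Hello" and "cap_ok ITcorp ITCORP"
succeed, while "cap_ok Auckland auckland" and "cap_ok MIT Mit" fail.  The
sketch compares one dictionary form at a time; the real program can hold
several variants of the same word.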
echo shar: extracting "'Makefile'" '(2252 characters)'
if test -f 'Makefile'
then
	echo shar: will not over-write existing file "'Makefile'"
else
sed 's/^X //' << \SHAR_EOF > 'Makefile'
X # -*- Mode: Text -*-
X 
X # Look over config.X before building.
X #
X # You may want to edit BINDIR, LIBDIR, DEFHASH, DEFDICT, MAN1DIR, MAN4DIR
X # MAN1EXT, MAN4EXT, and TERMLIB below;
X # the Makefile will update all other files to match.
X #
X # On USG systems, add -DUSG to CFLAGS.
X #
X # The ifdef NO8BIT may be used if 8 bit extended text characters
X # cause problems, or you simply don't wish to allow the feature.
X #
X # the argument syntax for buildhash to make alternate dictionary files
X # is simply:
X #
X #   buildhash <infile> <outfile>
X 
X CC = lcc -v -HL -HD -R tgetflag
X CFLAGS = -n -O -DUSG
X # BINDIR, LIBDIR, DEFHASH, DEFDICT, MAN1DIR, MAN4DIR, MAN1EXT, MAN4EXT,
X # TERMLIB
X BINDIR = /usr/sahbin
X LIBDIR = /tmp2/lib
X DEFHASH = ispell.hash
X DEFDICT = dict.191
X MAN1DIR	= /usr/man/u_man/man1
X MAN4DIR	= /usr/man/u_man/man4
X MAN1EXT	= .1l
X MAN4EXT	= .4l
X # TERMLIB = -lcurses
X TERMLIB = -ltermcap
X 
X SHELL = /bin/sh
X 
X all: buildhash ispell icombine munchlist isexpand $(DEFHASH)
X 
X ispell.hash: buildhash $(DEFDICT)
X 	./buildhash $(DEFDICT) $(DEFHASH)
X 
X install: all
X 	cp ispell isexpand munchlist $(BINDIR)
X 	cp ispell.hash $(LIBDIR)/$(DEFHASH)
X 	cp expand1.sed expand2.sed icombine $(LIBDIR)
X 	chmod 755 $(BINDIR)/ispell $(BINDIR)/munchlist $(BINDIR)/isexpand \
X 	  $(LIBDIR)/icombine
X 	chmod 644 $(LIBDIR)/$(DEFHASH) $(LIBDIR)/expand1.sed \
X 	  $(LIBDIR)/expand2.sed
X 	cp ispell.1 $(MAN1DIR)/ispell$(MAN1EXT)
X 	cp ispell.4 $(MAN4DIR)/ispell$(MAN4EXT)
X 
X buildhash: buildhash.o hash.o
X 	$(CC) $(CFLAGS) -o buildhash buildhash.o hash.o
X 
X icombine:	icombine.c config.h ispell.h
X 	$(CC) $(CFLAGS) -o icombine icombine.c
X 
X munchlist:	munchlist.X Makefile
X 	sed -e 's@!!LIBDIR!!@$(LIBDIR)@' -e 's@!!DEFDICT!!@$(DEFDICT)@' \
X 		<munchlist.X >munchlist
X 	chmod +x munchlist
X 
X isexpand:	isexpand.X Makefile
X 	sed -e 's@!!LIBDIR!!@$(LIBDIR)@' isexpand.X >isexpand
X 	chmod +x isexpand
X 
X OBJS=ispell.o term.o good.o lookup.o hash.o tree.o xgets.o
X 
X ispell: $(OBJS)
X 	cc $(CFLAGS) -o ispell $(OBJS) $(TERMLIB)
X 
X $(OBJS) buildhash.o: config.h ispell.h
X ispell.o: version.h
X 
X config.h:	config.X Makefile
X 	sed -e 's@!!LIBDIR!!@$(LIBDIR)@' -e 's@!!DEFDICT!!@$(DEFDICT)@' \
X 	    -e 's@!!DEFHASH!!@$(DEFHASH)@' <config.X >config.h
X 
X clean:
X 	rm -f *.o buildhash ispell core a.out mon.out hash.out \
X 		*.stat *.cnt munchlist config.h
SHAR_EOF
fi # end of overwriting check
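
The Makefile rules for config.h, munchlist, and isexpand all use the same
trick: sed replaces !!NAME!! markers in a template with the values of make
variables.  A minimal sketch of that substitution follows.  The values and
the template line are made up for the example (the contents of config.X are
not shown in this part of the posting), and fill_template is a hypothetical
name.

```sh
# Hypothetical settings; in the Makefile these are make variables.
LIBDIR=/usr/local/lib
DEFDICT=dict.191
DEFHASH=ispell.hash

# fill_template: copy stdin to stdout, replacing the !!NAME!! markers.
# "@" is the sed delimiter so that slashes in path names need no escaping.
fill_template() {
    sed -e "s@!!LIBDIR!!@$LIBDIR@" \
        -e "s@!!DEFDICT!!@$DEFDICT@" \
        -e "s@!!DEFHASH!!@$DEFHASH@"
}

# A made-up template line standing in for config.X.
printf '%s\n' '#define LIBDIR "!!LIBDIR!!"' | fill_template
# prints: #define LIBDIR "/usr/local/lib"
```

Each expression replaces only the first marker of its kind on a line, as in
the Makefile.  A value containing "@" or "&" would need escaping before it
could be used this way.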
echo shar: extracting "'Othercap.txt'" '(106 characters)'
if test -f 'Othercap.txt'
then
	echo shar: will not over-write existing file "'Othercap.txt'"
else
sed 's/^X //' << \SHAR_EOF > 'Othercap.txt'
X Airedale
X Alcibiades
X Argo
X Argos
X Arianist
X Arianists
X Auckland
X CDR
X Ethernet
X Ethernet's
X Ethernets
X MIT's
X Sikkim
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'README'" '(6256 characters)'
if test -f 'README'
then
	echo shar: will not over-write existing file "'README'"
else
sed 's/^X //' << \SHAR_EOF > 'README'
X -*- Mode:Text -*-
X 
X Ispell consists of two programs: the actual spelling checker, "ispell",
X and the hash table builder, "buildhash".  Everything is set up so you
X can just say "make install" and have everything happen.  You might want
X to edit the makefile, and ispell.h to change the destination of the
X program and the hash table.
X 
X The dictionary comes from the ITS spell dictionary.  I got it from
X "ml:wba;dict 191", although I don't know that this is the copy currently
X in use on the 20's around MIT.
X 
X ----------------------------------------------------------------------
X 
X Addendum:
X 
X My eternal gratitude to the author of ispell -- I don't know how I
X ever lived without it.  I received his permission to post ispell to
X the net and have added a GNU EMACS interface.  Look in the file
X ispell.el for installation instructions.
X 
X As far as I know, no one informally "supports" this program.  If you
X would like to "adopt" it (collect fixes/enhancements and post a new
X version periodically), feel free to do so.
X 
X I volunteer to collect dictionary diffs and post a composite diff
X periodically.  If you add a lot of words to the main dictionary, send
X me the diffs between the modified dictionary and the posted one.
X Also, if you have access to a TOPS20 system with a more complete
X dictionary in ispell format, send me the diffs if you can.  Just
X PLEASE don't dump an entire dictionary to our site!
X 
X The dictionary posted is one I snarfed from around here -- after
X comparison with the one originally supplied, ours appears a tad more
X complete and accurate.
X 
X Walt Buehring
X Texas Instruments - Computer Science Center
X 
X ARPA:  Buehring%TI-CSL@CSNet-Relay
X UUCP:  {smu, texsun, im4u, rice} ! ti-csl ! buehring
X 
X ----------------------------------------------------------------------
X 
X The following is the only documentation I could find about the format
X of the dictionary.  It was written for the TOPS20 speller that ispell
X mimics, so I believe most of the information is applicable.  It should be
X useful if you want to add words to the main dictionary by hand.  -WB
X 
X ----------------------------------------------------------------------
X 
X 11.6  Dictionary flags
X 
X      Words  in SPELL's main dictionary (but not the other dictionaries) may
X have flags associated with  them  to  indicate  the  legality  of  suffixes
X without  the  need  to keep the full suffixed words in the dictionary.  The
X flags have "names" consisting of single  letters.    Their  meaning  is  as
X follows:
X 
X Let  #  and  @  be  "variables"  that can stand for any letter.  Upper case
X letters are constants.  "..."  stands  for  any  string  of  zero  or  more
X letters,  but note that no word may exist in the dictionary which is not at
X least 2 letters long, so, for example, FLY may not be produced  by  placing
X the  "Y"  flag  on "F".  Also, no flag is effective unless the word that it
X creates is at least 4 letters  long,  so,  for  example,  WED  may  not  be
X produced by placing the "D" flag on "WE".
X 
X "V" flag:
X         ...E --> ...IVE  as in CREATE --> CREATIVE
X         if # .ne. E, ...# --> ...#IVE  as in PREVENT --> PREVENTIVE
X 
X "N" flag:
X         ...E --> ...ION  as in CREATE --> CREATION
X         ...Y --> ...ICATION  as in MULTIPLY --> MULTIPLICATION
X         if # .ne. E or Y, ...# --> ...#EN  as in FALL --> FALLEN
X 
X "X" flag:
X         ...E --> ...IONS  as in CREATE --> CREATIONS
X         ...Y --> ...ICATIONS  as in MULTIPLY --> MULTIPLICATIONS
X         if # .ne. E or Y, ...# --> ...#ENS  as in WEAK --> WEAKENS
X 
X "H" flag:
X         ...Y --> ...IETH  as in TWENTY --> TWENTIETH
X         if # .ne. Y, ...# --> ...#TH  as in HUNDRED --> HUNDREDTH
X 
X "Y" FLAG:
X         ... --> ...LY  as in QUICK --> QUICKLY
X 
X "G" FLAG:
X         ...E --> ...ING  as in FILE --> FILING
X         if # .ne. E, ...# --> ...#ING  as in CROSS --> CROSSING
X 
X "J" FLAG:
X         ...E --> ...INGS  as in FILE --> FILINGS
X         if # .ne. E, ...# --> ...#INGS  as in CROSS --> CROSSINGS
X 
X "D" FLAG:
X         ...E --> ...ED  as in CREATE --> CREATED
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IED  as in IMPLY --> IMPLIED
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#ED  as in CROSS --> CROSSED
X                                 or CONVEY --> CONVEYED
X "T" FLAG:
X         ...E --> ...EST  as in LATE --> LATEST
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IEST  as in DIRTY --> DIRTIEST
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#EST  as in SMALL --> SMALLEST
X                                 or GRAY --> GRAYEST
X 
X "R" FLAG:
X         ...E --> ...ER  as in SKATE --> SKATER
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IER  as in MULTIPLY --> MULTIPLIER
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#ER  as in BUILD --> BUILDER
X                                 or CONVEY --> CONVEYER
X 
X "Z" FLAG:
X         ...E --> ...ERS  as in SKATE --> SKATERS
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IERS  as in MULTIPLY --> MULTIPLIERS
X         if # .ne. E or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#ERS  as in BUILD --> BUILDERS
X                                 or SLAY --> SLAYERS
X 
X "S" FLAG:
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@IES  as in IMPLY --> IMPLIES
X         if # .eq. S, X, Z, or H,
X                 ...# --> ...#ES  as in FIX --> FIXES
X         if # .ne. S, X, Z, H, or Y, or (# = Y and @ = A, E, I, O, or U)
X                 ...@# --> ...@#S  as in BAT --> BATS
X                                 or CONVEY --> CONVEYS
X 
X "P" FLAG:
X         if @ .ne. A, E, I, O, or U,
X                 ...@Y --> ...@INESS  as in CLOUDY --> CLOUDINESS
X         if # .ne. Y, or @ = A, E, I, O, or U,
X                 ...@# --> ...@#NESS  as in LATE --> LATENESS
X                                 or GRAY --> GRAYNESS
X 
X "M" FLAG:
X         ... --> ...'S  as in DOG --> DOG'S
X 
X ----------------------------------------------------------------------
X 
X [Whew!  That's all very nice, but how about a quick reference...  -WB]
X 
X V -  ive
X N -  ion, ication, en
X X -  ions, ications, ens
X H -  th, ieth
X Y -  ly
X G -  ing
X J -  ings
X D -  ed
X T -  est
X R -  er
X Z -  ers
X S -  s, es, ies
X P -  ness, iness
X M -  's
SHAR_EOF
fi # end of overwriting check
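
The suffix rules in the README's "Dictionary flags" section translate fairly
directly into shell pattern matching.  The sketch below covers only the S,
G, and D flags, as an illustration of how the rules read.  It is not taken
from ispell or its expand scripts, the function name "expand" is invented
here, and the minimum-length restrictions from the README are not enforced.

```sh
# expand ROOT FLAG: print the word that FLAG derives from ROOT.
# Roots are given in upper case, as in the README examples.
expand() {
    root=$1
    case $2 in
    S)  case $root in
        *[AEIOU]Y) printf '%sS\n'   "$root" ;;       # CONVEY -> CONVEYS
        *Y)        printf '%sIES\n' "${root%Y}" ;;   # IMPLY  -> IMPLIES
        *[SXZH])   printf '%sES\n'  "$root" ;;       # FIX    -> FIXES
        *)         printf '%sS\n'   "$root" ;;       # BAT    -> BATS
        esac ;;
    G)  case $root in
        *E)        printf '%sING\n' "${root%E}" ;;   # FILE   -> FILING
        *)         printf '%sING\n' "$root" ;;       # CROSS  -> CROSSING
        esac ;;
    D)  case $root in
        *E)        printf '%sD\n'   "$root" ;;       # CREATE -> CREATED
        *[AEIOU]Y) printf '%sED\n'  "$root" ;;       # CONVEY -> CONVEYED
        *Y)        printf '%sIED\n' "${root%Y}" ;;   # IMPLY  -> IMPLIED
        *)         printf '%sED\n'  "$root" ;;       # CROSS  -> CROSSED
        esac ;;
    *)  echo "expand: flag $2 is not covered by this sketch" >&2
        return 1 ;;
    esac
}
```

The order of the patterns matters: the vowel-plus-Y case is tested before
the plain Y case, so that CONVEY keeps its Y while IMPLY changes it to I.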
echo shar: extracting "'UPDATE'" '(5252 characters)'
if test -f 'UPDATE'
then
	echo shar: will not over-write existing file "'UPDATE'"
else
sed 's/^X //' << \SHAR_EOF > 'UPDATE'
X 		Ispell enhancements - 3/13/87
X 
X (See three companion postings in net.sources.bugs).
X 
X Here are the enhancements to ispell that I mentioned a couple of days ago.
X Because of the number of changes, several of the context diff's are bigger
X than the original files.  In addition, many people have gotten confused
X about versions, since enhancements/fixes have been made by six different
X people, counting myself (for the list, see the end of ispell.man).  I
X have integrated all of these fixes and enhancements in one place.
X 
X For these reasons, I have decided to repost all of the sources for ispell,
X with one exception -- the dictionary.  (A couple of small files, such
X as ispell.el, are unchanged, but I decided to repost them anyway for
X completeness.  If you didn't have ispell before, you now need only the
X dictionary).
X 
X The dictionary is a special case:  if you think about it, even ordinary
X diff's will always work with "patch" on that each-line-is-unique file.
X An out-of-place insertion can be corrected by sorting the dictionary
X after patching (something that is done anyway as a side effect of the
X new "munchlist" script).  Because of this, I have decided not to repost
X the sizable dictionary.  In the process of testing this code, it occurred
X to me to run dict.191 through UNIX "spell";  the results of that are
X given in three companion postings in net.sources.bugs, which seemed
X like a more appropriate place for the diffs.  (The postings are not
X divided because of their size;  see comments in the postings for my
X reasons).
X 
X Now, here's what I've done:
X 
X In ispell itself:
X 
X 	- The personal dictionary is now hashed, just like the main one, and
X 	  supports suffixes just like the main one.  (It's not actually
X 	  integrated with the main one, because expanding the main one
X 	  is inefficient and poses a minor but troublesome technical
X 	  problem).  A personal dictionary of 28000+ words can be read in
X 	  within a few minutes (hey, nobody's perfect -- whatcha doing
X 	  with such a big dictionary anyway? :-).
X 	- New option "-c" is used by the new munchlist script to generate
X 	  suggested root/suffix combinations.
X 	- The -d option can now specify /dev/null, if you want to use
X 	  only your personal dictionary (this also saves startup time
X 	  with -c, and is used by the "munchlist" script, which is why
X 	  I put it in).
X 	- The -p option is now more flexible about its handling of pathnames.
X 	  An absolute pathname is always interpreted literally.  A
X 	  relative pathname from WORDLIST is looked up in $HOME first,
X 	  then in the current directory.  The -p option behaves in the
X 	  reverse fashion:  current directory first, then $HOME.  This
X 	  behavior seems more intuitive to me;  I'd be interested in
X 	  opinions of others if you don't find it intuitive.
X 	- Perhaps most important, I have completely overhauled the logic
X 	  in good.c, so that it (I think) matches what the README file
X 	  says it should, no more, no less.  The code has been extensively
X 	  tested, notably by interaction with the new expansion scripts;
X 	  nevertheless because of the extent of the changes and the
X 	  nature of the logic, I'd suggest a bit of suspicion for a while.
X 	  A technique we've found useful here is to do your normal work
X 	  with ispell, and then do a final check with UNIX spell or some
X 	  other slow, inconvenient program to make sure ispell didn't
X 	  screw up.
X 
X New scripts:
X 
X 	- expand.awk:  an obsolete (but correct) awk script that does
X 	  the same thing as expand[12].sed, except slower.  The awk
X 	  script is also much easier to understand than the sed scripts.
X 	  Superseded by the sed scripts, except for very short input.
X 	- expand[12].sed:  the sed pipe
X 
X 	    "sed -f expand1.sed $file | sed -f expand2.sed"
X 
X 	  where "$file" is a raw dictionary file with suffixes
X 	  (e.g., dict.191), generates a list of each root alone, plus
X 	  the root expanded with each possible suffix (e.g.,
X 	  "BOTH/R/Z" produces "BOTH", "BOTHER", and "BOTHERS").  The
X 	  output should usually be sorted with the -u switch before
X 	  further processing.  These scripts are used by 'munchlist';
X 	  they are also useful for (a) checking an ispell dictionary
X 	  with some other spell-checking program and (b) figuring
X 	  out what a particular suffix does to a certain word without
X 	  reading the README file.
X 	- munchlist.sh:  a slow, but effective, shell script that takes
X 	  lists of expanded or unexpanded words as input and reduces
X 	  them to a (usually smaller) list of roots and suffixes.  The
X 	  result is written to standard output.  I think the documentation
X 	  forgot to mention the input must be one word per line.  I
X 	  have successfully used this script to combine dict.191 with
X 	  /usr/dict/words;  it's also useful (and a lot faster) on
X 	  private dictionaries.  For private dictionaries, it will also
X 	  remove any word that has since been added to the main dictionary.
X 
X Oh yes, I almost forgot:  the original documentation didn't mention
X that ispell is a long-name program.  If your "File:" display on the
X top line actually contains the misspelled word, you have long-name problems.
X My fixes don't address long names, because I finally have a way to
X compile long-name programs, thanks to "hash8".
X 
X 	Geoff Kuenning
X 	geoff@ITcorp.COM
X 	...!trwrb!desint!geoff
SHAR_EOF
fi # end of overwriting check
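
The personal-dictionary lookup order described in UPDATE (absolute names
taken literally, WORDLIST searched in $HOME first, -p searched in the
current directory first) can be sketched as follows.  The function name
find_dict and the "home-first"/"cwd-first" labels are invented for the
example.  UPDATE does not say what happens when the file exists in neither
place, so the sketch simply fails in that case.

```sh
# find_dict NAME ORDER: print the path that would be used for NAME.
#   ORDER is "home-first" (a name from WORDLIST) or "cwd-first" (-p).
find_dict() {
    case $1 in
    /*) printf '%s\n' "$1"; return 0 ;;    # absolute: taken literally
    esac
    case $2 in
    home-first) set -- "$1" "$HOME" . ;;
    cwd-first)  set -- "$1" . "$HOME" ;;
    *)          return 2 ;;                # unknown search order
    esac
    # $2 and $3 now hold the two directories in search order.
    for dir in "$2" "$3"; do
        if [ -f "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
            return 0
        fi
    done
    return 1                               # found in neither place
}
```

For instance, with a file named words in both places, "find_dict words
home-first" prints the copy under $HOME and "find_dict words cwd-first"
prints ./words.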
echo shar: extracting "'WISHES'" '(2317 characters)'
if test -f 'WISHES'
then
	echo shar: will not over-write existing file "'WISHES'"
else
sed 's/^X //' << \SHAR_EOF > 'WISHES'
X Things remaining to be done to ispell:
X 
X 	- The "munchlist" script can actually increase the size of a
X 	  dictionary.  For example, munching dict.191 (after my bug fixes
X 	  to it) reduced the number of words by about 40, but increased
X 	  the number of characters by a small percentage.  This is
X 	  because munchlist doesn't recognize duplicate suffixes that
X 	  generate the same result, except for the three special
X 	  "s-ending" suffixes J, Z, and X.  For example, right now
X 	  munchlist will make BATHER by adding the R suffix to both
X 	  BATH and BATHE.  It would be nice if munchlist could recognize
X 	  the redundancy and reduce its output so that each word was made
X 	  in only one way.
X 	- The characters in the -w option should be written to the header
X 	  of the hash file, and to a header in the personal dictionary,
X 	  so the user doesn't have to remember to specify them every time.
X 	- Buildhash should support the -w option.
X 	- Buildhash, munchlist, icombine, and the expand scripts should use
X 	  a character other than slash for the flag separator, so that slashes
X 	  are available to the -w option.  I tend to lean towards commas.
X 	- It might be nice to support multiple personal dictionaries.  On
X 	  the other hand, it's pretty easy to combine them with "cat".
X 	- Good.c should be table-driven, so that it is easier to modify for
X 	  other languages.  Ideally, it would support prefixes as well.
X 	- A small amount of string space could be saved if buildhash would
X 	  combine strings with common suffixes (e.g., "and" could be stored
X 	  as a pointer to the tail of "bland").
X 	- (Peter Wan) Ispell should have a "server mode" for large sites, to
X 	  get rid of the time needed to read in the dictionary.  On System V,
X 	  this could be accomplished by having the first execution of ispell
X 	  read the dictionary into a shared-memory region.  Later incarnations
X 	  would then get the dictionary by just attaching to the region.
X 	  One problem would be that the dictionary gets modified during
X 	  the run, so you might still have to do a memory-to-memory copy
X 	  after the attach.  The size of having two copies of the dictionary
X 	  might prohibit this on many machines.  Another approach is a
X 	  message-based "good.c server", but this too would have to deal
X 	  with the possibility of modifying the dictionary.
SHAR_EOF
fi # end of overwriting check
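
The string-sharing wish in WISHES ("and" stored as a pointer to the tail of
"bland") comes down to a tail test plus an offset.  A minimal sketch, with
the invented name tail_offset; buildhash does nothing like this today, and
a real version would work on the string table rather than on shell strings.

```sh
# tail_offset SHORT LONG: if SHORT is a tail of LONG, print the offset
# within LONG at which SHORT begins; otherwise fail.
tail_offset() {
    case $2 in
    *"$1") echo $(( ${#2} - ${#1} )) ;;
    *)     return 1 ;;
    esac
}

tail_offset and bland    # prints: 2
```

Storing only "bland" plus an offset of 2 for "and" would save the four
bytes (three letters and the terminating NUL) of the second string.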
echo shar: extracting "'buildhash.c'" '(10320 characters)'
if test -f 'buildhash.c'
then
	echo shar: will not over-write existing file "'buildhash.c'"
else
sed 's/^X //' << \SHAR_EOF > 'buildhash.c'
X /* -*- Mode: Text -*- */
X 
X #define MAIN
X 
X /*
X  * buildhash.c - make a hash table for ispell
X  *
X  * Pace Willisson, 1983
X  */
X 
X #include <ctype.h>
X #include <stdio.h>
X #ifdef USG
X #include <sys/types.h>
X #endif
X #include <sys/param.h>
X #include <sys/stat.h>
X #include "config.h"
X #include "ispell.h"
X 
X #define NSTAT 100
X struct stat dstat, cstat;
X 
X int numwords, hashsize;
X 
X char *malloc();
X char *realloc ();
X 
X struct dent *hashtbl;
X 
X char *Dfile;
X char *Hfile;
X 
X char Cfile[MAXPATHLEN];
X char Sfile[MAXPATHLEN];
X 
X main (argc,argv)
X int argc;
X char **argv;
X {
X 	FILE *countf;
X 	FILE *statf;
X 	int stats[NSTAT];
X 	int i;
X 
X 	if (argc > 1) {
X 		++argv;
X 		Dfile = *argv;
X 		if (argc > 2) {
X 			++argv;
X 			Hfile = *argv;
X 		}
X 		else
X 			Hfile = DEFHASH;
X 	}
X 	else {
X 		Dfile = DEFDICT;
X 		Hfile = DEFHASH;
X 	}
X 
X 	sprintf(Cfile,"%s.cnt",Dfile);
X 	sprintf(Sfile,"%s.stat",Dfile);
X 
X 	if (stat (Dfile, &dstat) < 0) {
X 		fprintf (stderr, "No dictionary (%s)\n", Dfile);
X 		exit (1);
X 	}
X 
X 	if (stat (Cfile, &cstat) < 0 || dstat.st_mtime > cstat.st_mtime)
X 		newcount ();
X 
X 	if ((countf = fopen (Cfile, "r")) == NULL) {
X 		fprintf (stderr, "No count file\n");
X 		exit (1);
X 	}
X 	numwords = 0;
X 	fscanf (countf, "%d", &numwords);
X 	fclose (countf);
X 	if (numwords == 0) {
X 		fprintf (stderr, "Bad count file\n");
X 		exit (1);
X 	}
X 	hashsize = numwords;
X 	readdict ();
X 
X 	if ((statf = fopen (Sfile, "w")) == NULL) {
X 		fprintf (stderr, "Can't create %s\n", Sfile);
X 		exit (1);
X 	}
X 
X 	for (i = 0; i < NSTAT; i++)
X 		stats[i] = 0;
X 	for (i = 0; i < hashsize; i++) {
X 		struct dent *dp;
X 		int j;
X 		if (hashtbl[i].used == 0) {
X 			stats[0]++;
X 		} else {
X 			for (j = 1, dp = &hashtbl[i]; dp->next != NULL; j++, dp = dp->next)
X 				;
X 			if (j >= NSTAT)
X 				j = NSTAT - 1;
X 			stats[j]++;
X 		}
X 	}
X 	for (i = 0; i < NSTAT; i++)
X 		fprintf (statf, "%d: %d\n", i, stats[i]);
X 	fclose (statf);
X 
X 	filltable ();
X 
X 	output ();
X 	exit(0);
X }
X 
X output ()
X {
X 	FILE *outfile;
X 	struct hashheader hashheader;
X 	int strptr, n, i;
X 
X 	if ((outfile = fopen (Hfile, "w")) == NULL) {
X 		fprintf (stderr, "can't create %s\n",Hfile);
X 		return;
X 	}
X 	hashheader.magic = MAGIC;
X 	hashheader.stringsize = 0;
X 	hashheader.tblsize = hashsize;
X 	fwrite (&hashheader, sizeof hashheader, 1, outfile);
X 	strptr = 0;
X 	for (i = 0; i < hashsize; i++) {
X 		n = strlen (hashtbl[i].word) + 1;
X #ifdef CAPITALIZE
X 		if (hashtbl[i].followcase)
X 			n += (hashtbl[i].word[n] & 0xFF) * (n + 1) + 1;
X #endif
X 		fwrite (hashtbl[i].word, n, 1, outfile);
X 		hashtbl[i].word = (char *)strptr;
X 		strptr += n;
X 	}
X 	/* Pad file to a struct dent boundary for efficiency. */
X 	n = (strptr + sizeof hashheader) % sizeof (struct dent);
X 	if (n != 0) {
X 		n = sizeof (struct dent) - n;
X 		strptr += n;
X 		while (--n >= 0)
X 		    putc ('\0', outfile);
X 	}
X 	for (i = 0; i < hashsize; i++) {
X 		if (hashtbl[i].next != 0) {
X 			int x;
X 			x = hashtbl[i].next - hashtbl;
X 			hashtbl[i].next = (struct dent *)x;
X 		} else {
X 			hashtbl[i].next = (struct dent *)-1;
X 		}
X 	}
X 	fwrite (hashtbl, sizeof (struct dent), hashsize, outfile);
X 	hashheader.stringsize = strptr;
X 	rewind (outfile);
X 	fwrite (&hashheader, sizeof hashheader, 1, outfile);
X 	fclose (outfile);
X }
X 
X filltable ()
X {
X 	struct dent *freepointer, *nextword, *dp;
X 	int i;
X 
X 	for (freepointer = hashtbl; freepointer->used; freepointer++)
X 		;
X 	for (nextword = hashtbl, i = numwords; i != 0; nextword++, i--) {
X 		if (nextword->used == 0) {
X 			continue;
X 		}
X 		if (nextword->next == NULL) {
X 			continue;
X 		}
X 		if (nextword->next >= hashtbl && nextword->next < hashtbl + hashsize) {
X 			continue;
X 		}
X 		dp = nextword;
X 		while (dp->next) {
X 			if (freepointer >= hashtbl + hashsize) {
X 				fprintf (stderr, "table overflow\n");
X 				getchar ();
X 				break;
X 			}
X 			*freepointer = *(dp->next);
X 			dp->next = freepointer;
X 			dp = freepointer;
X 
X 			while (freepointer->used)
X 				freepointer++;
X 		}
X 	}
X }
X 
X 
X readdict ()
X {
X 	struct dent d;
X 	register struct dent *dp;
X 	char lbuf[100];
X 	FILE *dictf;
X 	int i;
X 	int h;
X 	int len;
X 	register char *p;
X 
X 	if ((dictf = fopen (Dfile, "r")) == NULL) {
X 		fprintf (stderr, "Can't open dictionary\n");
X 		exit (1);
X 	}
X 
X 	hashtbl = (struct dent *) calloc (numwords, sizeof (struct dent));
X 	if (hashtbl == NULL) {
X 		fprintf (stderr, "couldn't allocate hash table\n");
X 		exit (1);
X 	}
X 
X 	i = 0;
X 	while (fgets (lbuf, sizeof lbuf, dictf) != NULL) {
X 		if ((i & 1023) == 0) {
X 			printf ("%d ", i);
X 			fflush (stdout);
X 		}
X 		i++;
X 
X 		p = &lbuf [ strlen (lbuf) - 1 ];
X 		if (*p == '\n')
X 			*p = 0;
X 
X 		if (makedent (lbuf, &d) < 0)
X 			continue;
X 
X 		len = strlen (lbuf);
X #ifdef CAPITALIZE
X 		if (d.followcase)
X 			d.word = malloc (2 * len + 4);
X 		else
X #endif
X 			d.word = malloc (len + 1);
X 		if (d.word == NULL) {
X 			fprintf (stderr, "couldn't allocate space for word %s\n", lbuf);
X 			exit (1);
X 		}
X 		strcpy (d.word, lbuf);
X #ifdef CAPITALIZE
X 		if (d.followcase) {
X 			p = d.word + len + 1;
X 			*p++ = 1;		/* Count of capitalizations */
X 			*p++ = '-';		/* Don't keep in pers dict */
X 			strcpy (p, lbuf);
X 			
X 		}
X 		for (p = d.word;  *p;  p++) {
X 			if (mylower (*p))
X 				*p = toupper (*p);
X 		}
X #endif
X 
X 		h = hash (d.word, len, hashsize);
X 
X 		dp = &hashtbl[h];
X 		if (dp->used == 0) {
X 			*dp = d;
X 		} else {
X 
X #ifdef CAPITALIZE
X 			while (dp != NULL  &&  strcmp (dp->word, d.word) != 0)
X 			    dp = dp->next;
X 			if (dp != NULL) {
X 			    if (d.followcase
X 			      ||  (dp->followcase  &&  !d.allcaps
X 				&&  !d.capitalize)) {
X 				/* Add a specific capitalization */
X 				if (dp->followcase) {
X 				    p = &dp->word[len + 1];
X 				    (*p)++;	/* Bump counter */
X 				    dp->word = realloc (dp->word,
X 				      ((*p & 0xFF) + 1) * (len + 2));
X 				    if (dp->word == NULL) {
X 					fprintf (stderr,
X 					  "couldn't allocate space for word %s\n",
X 					  lbuf);
X 					exit (1);
X 				    }
X 				    p = &dp->word[len + 1];
X 				    p += ((*p & 0xFF) - 1) * (len + 2) + 1;
X 				    *p++ = '-';
X 				    strcpy (p,
X 				      d.followcase ? &d.word[len + 3] : lbuf);
X 				}
X 				else {
X 				    /* d.followcase must be true */
X 				    /* thus, d.capitalize and d.allcaps are */
X 				    /* clear */
X 				    free (dp->word);
X 				    dp->word = d.word;
X 				    dp->followcase = 1;
X 				    dp->k_followcase = 1;
X 				    /* Code later will clear dp->allcaps. */
X 				}
X 			    }
X 			    /* Combine two capitalizations.  If d was */
X 			    /* allcaps, dp remains unchanged */
X 			    if (d.allcaps == 0) {
X 				/* dp is the entry that will be kept.  If */
X 				/* dp is followcase, the capitalize flag */
X 				/* reflects whether capitalization "may" */
X 				/* occur.  If not, it reflects whether it */
X 				/* "must" occur. */
X 				if (d.capitalize) {	/* ie lbuf was cap'd */
X 				    if (dp->followcase)
X 					dp->capitalize = 1;	/* May */
X 				    else if (dp->allcaps) /* ie not lcase */
X 					dp->capitalize = 1;	/* Must */
X 				}
X 				else {		/* lbuf was followc or all-lc */
X 				    if (!dp->followcase)
X 					dp->capitalize = 0;	/* May */
X 				}
X 				dp->k_capitalize = dp->capitalize;
X 				dp->allcaps = 0;
X 				dp->k_allcaps = 0;
X 			    }
X 			}
X 			else {
X #endif
X 			    dp = (struct dent *) malloc (sizeof (struct dent));
X 			    if (dp == NULL) {
X 				fprintf (stderr,
X 				  "couldn't allocate space for collision\n");
X 				exit (1);
X 			    }
X 			    *dp = d;
X 			    dp->next = hashtbl[h].next;
X 			    hashtbl[h].next = dp;
X 			}
X 		}
X 	}
X 	printf ("\n");
X }
X 
X /*
X  * fill in the flags in d, and put a null after the word in s
X  */
X 
X makedent (lbuf, d)
X char *lbuf;
X struct dent *d;
X {
X 	char *p, *index();
X 
X 	d->next = NULL;
X 	d->used = 1;
X 	d->v_flag = 0;
X 	d->n_flag = 0;
X 	d->x_flag = 0;
X 	d->h_flag = 0;
X 	d->y_flag = 0;
X 	d->g_flag = 0;
X 	d->j_flag = 0;
X 	d->d_flag = 0;
X 	d->t_flag = 0;
X 	d->r_flag = 0;
X 	d->z_flag = 0;
X 	d->s_flag = 0;
X 	d->p_flag = 0;
X 	d->m_flag = 0;
X 	d->keep = 0;
X #ifdef CAPITALIZE
X 	d->allcaps = 0;
X 	d->capitalize = 0;
X 	d->followcase = 0;
X 	/*
X 	** Figure out the capitalization rules from the capitalization of
X 	** the sample entry.  Only one of followcase, allcaps, and capitalize
X 	** will be set.  Combinations are generated by higher-level code.
X 	*/
X 	for (p = lbuf;  *p  &&  *p != '/';  p++) {
X 		if (mylower (*p))
X 			break;
X 	}
X 	if (*p == '\0'  ||  *p == '/')
X 		d->allcaps = 1;
X 	else {
X 		for (  ;  *p  &&  *p != '/';  p++) {
X 			if (myupper (*p))
X 				break;
X 		}
X 		if (*p == '\0'  ||  *p == '/') {
X 			/*
X 			** No uppercase letters follow the lowercase ones.
X 			** If the first two letters are capitalized, it's
X 			** "followcase". If the first one is capitalized, it's
X 			** "capitalize".
X 			*/
X 			if (myupper (lbuf[0])) {
X 				if (myupper (lbuf[1]))
X 					d->followcase = 1;
X 				else
X 					d->capitalize = 1;
X 			}
X 		}
X 		else
X 			d->followcase = 1;	/* .../lower/upper */
X 	}
X 	d->k_allcaps = d->allcaps ;
X 	d->k_capitalize = d->capitalize;
X 	d->k_followcase = d->followcase;
X #endif
X 
X 	p = index (lbuf, '/');
X 	if (p != NULL)
X 		*p = 0;
X 	if (strlen (lbuf) > WORDLEN - 1) {
X 		printf ("%s: word too big\n", lbuf);
X 		return (-1);
X 	}
X 
X 	if (p == NULL)
X 		return (0);
X 
X 	p++;
X 	while (*p != '\0'  &&  *p != '\n') {
X 		if (mylower (*p))
X 			*p = toupper (*p);
X 		switch (*p) {
X 		case 'V': d->v_flag = 1; break;
X 		case 'N': d->n_flag = 1; break;
X 		case 'X': d->x_flag = 1; break;
X 		case 'H': d->h_flag = 1; break;
X 		case 'Y': d->y_flag = 1; break;
X 		case 'G': d->g_flag = 1; break;
X 		case 'J': d->j_flag = 1; break;
X 		case 'D': d->d_flag = 1; break;
X 		case 'T': d->t_flag = 1; break;
X 		case 'R': d->r_flag = 1; break;
X 		case 'Z': d->z_flag = 1; break;
X 		case 'S': d->s_flag = 1; break;
X 		case 'P': d->p_flag = 1; break;
X 		case 'M': d->m_flag = 1; break;
X 		case 0:
X  			fprintf (stderr, "no flags on word %s\n", lbuf);
X 			continue;
X 		default:
X 			fprintf (stderr, "unknown flag %c word %s\n", 
X 					*p, lbuf);
X 			break;
X 		}
X 		p++;
X 		if (*p == '/')		/* Handle old-format dictionaries too */
X 			p++;
X 	}
X 	return (0);
X }
X 
X newcount ()
X {
X 	char buf[200];
X 	char lastbuf[200];
X 	FILE *d;
X 	int i;
X 	register char *cp;
X 
X 	fprintf (stderr, "Counting words in dictionary ...\n");
X 
X 	if ((d = fopen (Dfile, "r")) == NULL) {
X 		fprintf (stderr, "Can't open dictionary\n");
X 		exit (1);
X 	}
X 
X 	for (i = 0, lastbuf[0] = '\0';  fgets (buf, sizeof buf, d);  ) {
X 		for (cp = buf;  *cp;  cp++) {
X 			if (mylower (*cp))
X 				*cp = toupper (*cp);
X 		}
X 		if (strcmp (buf, lastbuf) != 0) {
X 			if ((++i & 1023) == 0) {
X 				printf ("%d ", i);
X 				fflush (stdout);
X 			}
X 			strcpy (lastbuf, buf);
X 		}
X 	}
X 	fclose (d);
X 	printf ("\n%d words\n", i);
X 	if ((d = fopen (Cfile, "w")) == NULL) {
X 		fprintf (stderr, "can't create %s\n", Cfile);
X 		exit (1);
X 	}
X 	fprintf (d, "%d\n", i);
X 	fclose (d);
X }
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'config.X'" '(4594 characters)'
if test -f 'config.X'
then
	echo shar: will not over-write existing file "'config.X'"
else
sed 's/^X //' << \SHAR_EOF > 'config.X'
X /*
X  * This is the configuration file for ispell.  Thanks to Bob McQueer
X  * for creating it and making the necessary changes elsewhere to
X  * support it.
X  * Look through this file from top to bottom, and edit anything that
X  * needs editing.  There are also five or six variables in the
X  * Makefile that you must edit.  Note that the Makefile edits this
X  * file (config.X) to produce config.h.  If you are looking at
X  * config.h, you're in the wrong file.
X  *
X  * Don't change the funny-looking lines with !!'s in them;  see the
X  * Makefile!
X  */
X 
X /*
X ** library directory for hash table(s) / default hash table name
X ** If you intend to use multiple dictionary files, I would suggest
X ** LIBDIR be a directory which will contain nothing else, so sensible
X ** names can be constructed for the -d option without conflict.
X */
X #ifndef LIBDIR
X #define LIBDIR "!!LIBDIR!!"
X #endif
X #ifndef DEFHASH
X #define DEFHASH "!!DEFHASH!!"
X #endif
X 
X #ifdef USG
X #define index strchr
X #define rindex strrchr
X #endif
X 
X /* environment variable for user's word list */
X #ifndef PDICTVAR
X #define PDICTVAR "WORDLIST"
X #endif
X 
X /* default word list */
X #ifndef DEFPDICT
X #define DEFPDICT ".ispell_words"
X #endif
X 
X /* environment variable for include file string */
X #ifndef INCSTRVAR
X #define INCSTRVAR "INCLUDE_STRING"
X #endif
X 
X /* default include string */
X #ifndef DEFINCSTR
X #define DEFINCSTR "&Include_File&"
X #endif
X 
X /* mktemp template for temporary file - MUST contain 6 consecutive X's */
X #ifndef TEMPNAME
X #define TEMPNAME "/tmp/ispellXXXXXX"
X #endif
X 
X /* default dictionary file */
X #ifndef DEFDICT
X #define DEFDICT "!!DEFDICT!!"
X #endif
X 
X /* path to LOOK (if look(1) command is available) */
X #ifndef LOOK
X #undef LOOK
X #endif
X 
X /* path to egrep (use speeded up version if available) */
X #ifndef EGREPCMD
X #define EGREPCMD "/bin/egrep"
X #endif
X 
X /* path to wordlist for Lookup command (typically /usr/dict/{words|web2}) */
X #ifndef WORDS
X #define WORDS	"/usr/dict/words"
X #endif
X 
X /* buffer size to use for file names if not in sys/param.h */
X #ifndef MAXPATHLEN
X #define MAXPATHLEN 240
X #endif
X 
X /* word length allowed in dictionary by buildhash */
X #define WORDLEN 30
X 
X /* suppress the 8-bit character feature */
X #ifndef NO8BIT
X #define NO8BIT
X #endif
X 
X /* maximum number of include files supported by xgets;  set to 0 to disable */
X #ifndef MAXINCLUDEFILES
X #define MAXINCLUDEFILES	5
X #endif
X 
X /* Approximate number of words in the full dictionary, after munching.
X ** Err on the high side unless you are very short on memory, in which
X ** case you might want to change the tables in tree.c and also increase
X ** MAXPCT.
X **
X ** (Note:  dict.191 is a bit over 15000 words.  dict.191 munched with
X ** /usr/dict/words is a little over 28000).
X */
X #ifndef BIG_DICT
X #define BIG_DICT	29000
X #endif
X 
X /*
X ** Maximum hash table fullness percentage.  Larger numbers trade space
X ** for time.
X **/
X #ifndef MAXPCT
X #define MAXPCT	70		/* Expand table when 70% full */
X #endif
X 
X /*
X ** the isXXXX macros normally only check ASCII range.  These are used
X ** instead for text characters, which we assume may be 8 bit.  The
X ** NO8BIT ifdef shuts off significance of 8 bit characters.  If you are
X ** using this, and your ctype.h already masks, you can simplify.
X */
X #ifdef NO8BIT
X #define myupper(X) isupper((X)&0x7f)
X #define mylower(X) islower((X)&0x7f)
X #define myspace(X) isspace((X)&0x7f)
X #define myalpha(X) isalpha((X)&0x7f)
X #else
X #define myupper(X) (!((X)&0x80) && isupper(X))
X #define mylower(X) (!((X)&0x80) && islower(X))
X #define myspace(X) (!((X)&0x80) && isspace(X))
X #define myalpha(X) (!((X)&0x80) && isalpha(X))
X #endif
X 
X /*
X ** the NOPARITY mask is applied to user input characters from the terminal
X ** in order to mask out the parity bit.
X */
X #ifdef NO8BIT
X #define NOPARITY 0x7f
X #else
X #define NOPARITY 0xff
X #endif
X 
X 
X /*
X ** the terminal mode for ispell, set to CBREAK or RAW
X **
X */
X #ifndef TERM_MODE
X #define TERM_MODE	CBREAK
X #endif
X 
X /*
X ** Define this if you want your columns of words to be of equal length.
X ** This will spread short word lists across the screen instead of down it.
X */
X #ifndef EQUAL_COLUMNS
X #undef EQUAL_COLUMNS
X #endif
X 
X /*
X ** This is the extension that will be added to backup files
X */
X #ifndef	BAKEXT
X #define	BAKEXT	".bak"
X #endif
X 
X /*
X ** Define this if you want the capitalization feature.  This will increase
X ** the size of the hashed dictionary on most 16-bit and some 32-bit machines.
X */
X #ifndef CAPITALIZE
X #define CAPITALIZE
X #endif
X 
X /*
X ** Define this if you want your personal dictionary sorted.  This may take
X ** a long time for very large dictionaries.  Dictionaries larger than
X ** SORTPERSONAL words will not be sorted.
X */
X #ifndef SORTPERSONAL
X #define SORTPERSONAL	1000
X #endif
SHAR_EOF
fi # end of overwriting check
echo shar: extracting "'fixdict.sh'" '(2502 characters)'
if test -f 'fixdict.sh'
then
	echo shar: will not over-write existing file "'fixdict.sh'"
else
sed 's/^X //' << \SHAR_EOF > 'fixdict.sh'
X : Use /bin/sh
X #
X #	Add capitalization information to an ispell dictionary
X #
X #	Usage:
X #
X #	fixdict dict-file
X #
X #	Requires availability of UNIX spell.  The new dictionary is
X #	rewritten in place.  A list of words that couldn't be
X #	resolved (because spell doesn't know them) is written to
X #	standard output.  This list appears in lowercase in the
X #	dictionary, and if there are any errors they must be edited
X #	by hand.
X #
X #	The final dictionary appears in expanded form and must be
X #	passed through munchlist to regenerate suffixes.
X #
X LIBDIR=/tmp2/lib
X EXPAND1=${LIBDIR}/isexp1.sed
X EXPAND2=${LIBDIR}/isexp2.sed
X EXPAND3=${LIBDIR}/isexp3.sed
X EXPAND4=${LIBDIR}/isexp4.sed
X TDIR=${TMPDIR:-/tmp}
X TMP=${TDIR}/fix$$
X 
X trap "/bin/rm -f ${TMP}*; exit 1" 1 2 15
X sed -f ${EXPAND1} $1 | sed -f ${EXPAND2} \
X   | sed -f ${EXPAND3} | sed -f ${EXPAND4} \
X   | tr '[A-Z]' '[a-z]' \
X   | spell \
X   | sort > ${TMP}a
X #
X # ${TMP}a contains all the words that spell doesn't like.
X # Now figure out which of those are because spell doesn't know them at
X # all, and leave those in ${TMP}b.
X #
X tr '[a-z]' '[A-Z]' < ${TMP}a | spell | tr '[A-Z]' '[a-z]' > ${TMP}b
X #
X # The wrongly-capitalized words are those that spell didn't object to
X # in the last step.  Produce a list of them, and capitalize the
X # first letter of each.  Save this list in ${TMP}c.
X #
X comm -23 ${TMP}a ${TMP}b \
X   | sed 's/^a/A/;s/^b/B/;s/^c/C/;s/^d/D/;s/^e/E/;s/^f/F/;s/^g/G/;s/^h/H/
X      s/^i/I/;s/^j/J/;s/^k/K/;s/^l/L/;s/^m/M/;s/^n/N/;s/^o/O/;s/^p/P/
X      s/^q/Q/;s/^r/R/;s/^s/S/;s/^t/T/;s/^u/U/;s/^v/V/;s/^w/W/;s/^x/X/
X      s/^y/Y/;s/^z/Z/' > ${TMP}c
X #
X # Find out which of those spell objects to, saving the failures in ${TMP}d.
X #
X spell ${TMP}c > ${TMP}d
X #
X # Extract the words which were correctly capitalized at the first letter,
X # combine them with an all-capitals version of the ones that weren't, and
X # put the result into ${TMP}e.
X #
X (comm -23 ${TMP}c ${TMP}d;  tr '[a-z]' '[A-Z]' < ${TMP}d) \
X   | sort -o ${TMP}e
X #
X # At this point, ${TMP}b contains the words that spell just plain doesn't
X # like, and ${TMP}e contains the words that are now capitalized correctly.
X #
X /bin/rm ${TMP}[cd]
X #
X # Put it all together, rewriting the dictionary in place.
X #
X sed -f ${EXPAND1} $1 | sed -f ${EXPAND2} \
X   | sed -f ${EXPAND3} | sed -f ${EXPAND4} \
X   | tr '[A-Z]' '[a-z]' \
X   | sort \
X   | comm -23 - ${TMP}a \
X   | sort -f -o $1 - ${TMP}b ${TMP}e
X #
X # Finally, write the list of words that have questionable capitalization
X # to the standard output.
X #
X cat ${TMP}b
X /bin/rm ${TMP}*
SHAR_EOF
chmod +x 'fixdict.sh'
fi # end of overwriting check
#	End of shell archive
exit 0
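
For anyone trying to follow the capitalization rule buried in makedent()
above, here is a minimal standalone sketch of the same decision.  It is
not part of ispell: the classify() helper and the capkind labels are
hypothetical names made up for illustration, and it assumes plain ASCII
input (it calls isupper/islower directly instead of the myupper/mylower
macros from config.X).  The real code records the result as bit flags in
struct dent rather than returning a value.

```c
#include <ctype.h>

/* Hypothetical labels; ispell itself keeps these as flags in struct dent. */
enum capkind { CAP_NONE, CAP_ALLCAPS, CAP_CAPITALIZE, CAP_FOLLOWCASE };

/*
 * Classify a dictionary entry from the capitalization of the sample
 * word, following the rule in makedent().  Scanning stops at the end
 * of the string or at the '/' that introduces the flags.
 */
enum capkind classify(const char *word)
{
    const unsigned char *p = (const unsigned char *)word;

    /* Skip ahead to the first lowercase letter, if there is one. */
    while (*p && *p != '/' && !islower(*p))
        p++;
    if (*p == '\0' || *p == '/')
        return CAP_ALLCAPS;        /* no lowercase letters at all */

    /* Look for an uppercase letter after that lowercase one. */
    while (*p && *p != '/' && !isupper(*p))
        p++;
    if (*p != '\0' && *p != '/')
        return CAP_FOLLOWCASE;     /* lower then upper, e.g. "TeX" */

    /* Only lowercase follows, so the leading letters decide. */
    if (isupper((unsigned char)word[0])) {
        if (isupper((unsigned char)word[1]))
            return CAP_FOLLOWCASE; /* first two letters capitalized */
        return CAP_CAPITALIZE;     /* only the first letter capitalized */
    }
    return CAP_NONE;               /* ordinary lowercase entry */
}
```

Under these assumptions, an entry such as "USA/M" comes out as
CAP_ALLCAPS, "Boston" as CAP_CAPITALIZE, "TeX" as CAP_FOLLOWCASE, and
"cat/S" as CAP_NONE.  As the comment in makedent() says, only one of the
three flags is set at this stage; readdict() is what merges them when the
same word shows up with more than one capitalization.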

-- 
Brandon S. Allbery	{decvax,cbatt,cbosgd}!cwruecmp!ncoast!allbery
Tridelta Industries	{ames,mit-eddie,talcott}!necntc!ncoast!allbery
7350 Corporate Blvd.	necntc!ncoast!allbery@harvard.HARVARD.EDU
Mentor, OH 44060	+01 216 255 1080	(also eddie.MIT.EDU)
-- 
Copyright 1987 John Gilmore; you may redistribute only if your recipients may.
(This is an effort to bend Stargate to work with Usenet, not against it.)
{sun,ptsfa,lll-crg,ihnp4,ucbvax}!hoptoad!gnu	       gnu@ingres.berkeley.edu