[net.sources] spellfix - interactive spelling checker and fixer

RSanders@DENVER.ARPA (04/21/85)

Tired of scribbling lists of misspelled words on random scraps of paper
after using spell?  (If you never use spell, skip this message.)
Spellfix is a reasonably convenient hack.  Be sure to check out the BUGS
section of the man page - someday, I'll work out a sed/awk syntax for
troff requests that will fix that bug.  Cut at the dotted lines.
This works for us on VAX/4.2 BSD (ran under 4.1 also).

-- Rex
.........................................................................
.TH SPELLFIX 1 LOCAL "USGS Pacific Marine Geology"
.SH NAME
spellfix - interactively fix spelling errors
.SH SYNOPSIS
.B spellfix
file
.SH DESCRIPTION
.PP
.I spellfix
is an interactive spelling checker and fixer.
Using
.I spellfix
is a 4-step process:
.TP
1.
Type:
.sp
.B
          spellfix
.I file
.sp
where
.I file
is the name of your manuscript.
.TP
2.
Edit the misspelled words to leave only the words you want flagged
in your manuscript.
.TP
3.
Edit the temporary copy of your original file.
The words you left in step 2 will be flagged with lines like:
.br
.sp
.ec +
        .\"  ###  spell:  foobar  %%%
.ec
.br
.sp
where
.I foobar
is a misspelled word.
You should correct the misspellings on the text lines that follow.
The flag lines will be removed automatically before the next step.
.TP
4.
Answer
.IR spellfix 's
question:
.sp
          overwrite
.I file
?
.sp
with
.B yes
or
.BR no .
If your answer is yes,
.I spellfix
will replace your original file with the file you created in step 3.
.SH NOTES
In step 3, the search string for
.IR vi (1)
is set to "###";
you can conveniently search for the next spelling error with the
"n" request.
.PP
.I spellfix
was inspired by
.IR error (1).
.PP
The peculiar look of the flag lines is an attempt to make them
.IR troff (1)
comments.
.SH SEE ALSO
spell(1), vi(1), troff(1), error(1), myspell(1), myspellfix(1).
.SH BUGS
.PP
.I spellfix
can insert more error lines than needed.
For example, if
.I spell
rejects 'teache',
lines with 'teacher' will be marked as erroneous.
.SH AUTHOR
Rex Sanders, U.S. Geological Survey, Pacific Marine Geology
..............................................................................
#! /bin/sh
# spellfix - interactive spelling checker and fixer
#   Rex Sanders, USGS Pacific Marine Geology
t1=/tmp/$$.1
t2=/tmp/$$.2
t3=/tmp/$$.3

if [ $# -ne 1 ]
then
  echo "Usage:  spellfix filename" >&2
  exit 1
fi
trap 'rm -f $t1 $t2 $t3; exit 2' 1 2 15

echo "Checking spelling errors in $1 ..."
spell $1 > $t1
if [ ! -s $t1 ]
then
  rm -f $t1
  exit 0
fi

echo "Editing `wc -l <$t1` misspelled words - leave only truly misspelled words.
"
sleep 3
vi $t1
echo "Editing a temporary copy of $1 - fix spelling errors
"
cat $t1 | while read word
do

  echo "/${word}/i\
.\\\" ### spell: ${word} %%%" >> $t3

done

if (sed -f $t3 < $1 > $t2 2> /dev/null)
then
  sleep 3
  vi +/###/ $t2
  sed -e '/^\.\" ### spell: .* %%%$/d' $t2 > $t3
  cp -i $t3 $1
else
  echo "spellfix: error processing misspelled words - file $1 not affected." >&2
  rm -f $t1 $t2 $t3
  exit 1
fi

rm -f $t1 $t2 $t3

laura@utzoo.UUCP (Laura Creighton) (04/23/85)

Hello there. I have my own interactive spelling fixer. It doesn't
munge up your files or dump you into vi. It uses sed. It doesn't have
a manual page (what do you expect for a 1/2 hour hack that I never
thought anybody else would want?). What you do is to run spell on your
magnum opus to get a list of errors. Put them in spell.errs (well, if
you don't like that name you can define it to be something else in your
environment). Then run ispell. It will prompt you for every error.
``y'' is -- yes, that's an error, but I don't know how to spell it;
``n'' is -- no, ``Creighton'' *isn't* a spelling error, you stupid program,
and anything else is a correct spelling. You can keep error lists around
until you find a dictionary. You can do shell escapes. It catches signals
properly. 

There is one known problem. If you misspell the word ``the'' as ``teh''
(to pick a random example you will *never* see in any news articles
posted by me :-) ) it will munge all words that have ``teh'' in them.
So you had better run spell one last time when you are done.
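
A minimal sketch of that problem, using the same form of sed command
that ispell writes. The sample sentence is invented for illustration;
"gatehouse" happens to contain "teh".

```sh
#!/bin/sh
# Hypothetical sample text with one real error and one innocent word.
text='teh gatehouse is open'

# Same shape as the commands ispell appends to its sed script.
fixed=$(printf '%s\n' "$text" | sed 's@teh@the@g')

printf '%s\n' "$fixed"
```

The real error is fixed, but "gatehouse" comes out as "gatheouse",
which is why a final run of spell is worth the time.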

Flash Bulletin! Geoff Collyer thinks it is spiffo! He is going to write me a
manual page on the spot. Thanks Geoff...

Manual page written. Oh -- the program uses ``overwrite(1)'' which is
found on page 154 of Kernighan and Pike. If you don't have this book,
you should probably buy it -- it is good stuff. If you can't figure out
how to write overwrite from context, then you *definitely* should buy it -
by the time you finish reading the first 4 chapters you won't have this
problem.
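
For readers without the book at hand, here is a rough stand-in worked
out only from the way ispell calls it (overwrite file command args...).
This is not the book's code. The name overwrite_sketch is made up, and
mktemp is assumed to be available.

```sh
#!/bin/sh
# overwrite_sketch FILE CMD [ARGS...]
# Hypothetical stand-in: run CMD, collect its output in a temporary
# file, and copy that over FILE only if CMD succeeded.
overwrite_sketch() {
	file=$1; shift
	tmp=$(mktemp) || return 1
	if "$@" > "$tmp"
	then
		cat "$tmp" > "$file"	# rewrite in place, keeping owner and modes
		status=0
	else
		echo "overwrite_sketch: $1 failed, $file unchanged" >&2
		status=1
	fi
	rm -f "$tmp"
	return $status
}

# Usage, mirroring the call made by ispell:
demo=$(mktemp)
echo 'teh cat' > "$demo"
overwrite_sketch "$demo" sed 's@teh@the@g' "$demo"
result=$(cat "$demo")
rm -f "$demo"
echo "$result"
```

The point of the temporary file is that the command may read the very
file it is about to replace, so the output cannot be redirected
straight onto it.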


Laura Creighton
utzoo!laura (but not for long... yippee!)

-------------here is ispell-----------------------
#! /bin/sh
# ispell - an interactive spelling checker

case $#
in
	0)	echo "usage: $0 filenames" >&2; exit 1 ;;
esac

: ${SPELLERRS=spell.errs}	# the list of mistakes
: ${SPELLFILE=/tmp/ispell$$}	# the temporary file of unfixed mistakes
: ${SPELLFIX=/tmp/ispellf$$}	# the temporary file of sed commands

trap "rm -f $SPELLFILE $SPELLFIX; exit 1" 1 2 15	# clean up files

for mistake in `cat $SPELLERRS`
do
	echo -n "$mistake? "
	read correction
	case "$correction" in
	q|Q)	break ;;
	y|Y|"")	echo $mistake >> $SPELLFILE ;;
	n|N)	;;
	!*)	trap "echo !; continue" 1 2 15	# child gets signals,
			# parent ignores them
			# this means that you can type
			# interrupts without blowing away the
			# whole session.
		command=`echo "$correction" | sed 's@^.@@'`
		$command
		trap "rm -f $SPELLFILE $SPELLFIX; exit 1" 1 2 15
			# catch signals again
		echo !
		echo $mistake >> $SPELLFILE ;; 
			# don't lose it, you can always delete it later
	*)	echo s@$mistake@$correction@g >> $SPELLFIX ;;
	esac
done

if test -s $SPELLFIX
then
	for i
	do
		overwrite $i sed -f $SPELLFIX $i
	done
fi

if test -s $SPELLFILE
then
	mv $SPELLFILE $SPELLERRS
else
	rm -f $SPELLFILE $SPELLERRS
fi
rm -f $SPELLFIX

----------and this is the brand new manual page!!!--------------
.TH ISPELL 1 local
.DA 23 April 1985
.SH NAME
ispell \- interactive spelling repair
.SH SYNOPSIS
.B ispell
file...
.SH DESCRIPTION
.I Ispell
corrects spelling errors (as verified by the user) in
.IR file s.
Errors are assumed to be found in
.I spell.errs
(overridden by the environment variable SPELLERRS),
which
.I ispell
maintains.
.I Ispell
prompts the user for each misspelling and asks how to correct it.
.PP
If the user types
.BR n ,
the misspelling is discarded.
If the user types
.B y
or a newline,
the misspelling is retained but not corrected.
A line beginning with
.B !
is taken to be a shell escape and the misspelling is retained.
Any other response is assumed to be the corrected form of
the misspelling and will be corrected after the last reply has been typed.
If the user types
.BR q ,
.I ispell
discards the remaining misspellings
(probably because you know the remaining ones are correct)
and applies any corrections accumulated so far.
.SH FILES
spell.errs	running list of potential mistakes
.br
/tmp/ispell$$	new spell.errs being created
.br
/tmp/ispellf$$	sed commands to correct the mistakes
.SH SEE ALSO
spell(1), overwrite(1)
.SH HISTORY
Written by Laura Creighton in a fit of desperation when proof-reading
a long manuscript; manual page by Geoff Collyer so she wouldn't have
to post the source without this page.
.SH BUGS
Shell escapes probably shouldn't assume that you don't know
the correction and should prompt for the mistake again.
.PP
Correcting a short word may break longer words in which the short misspelling
appears.

laura@utzoo.UUCP (Laura Creighton) (04/23/85)

ps -- sorry about that -- those ``@''s in the sed should be something else,
'cause you might want to type an ``@'' some day. I use control-Vs. But
inews strips control characters which would leave you with a broken script.
So I replaced them with ``@''s.
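
A small sketch of why the delimiter matters. The variable names follow
the ispell script; the sample values and the choice of control-A as
the delimiter are assumptions made for illustration.

```sh
#!/bin/sh
mistake='emial'
correction='e@mail'		# a correction that contains the delimiter

d=$(printf '\001')		# control-A, unlikely to be typed in a reply
bad="s@${mistake}@${correction}@g"
good="s${d}${mistake}${d}${correction}${d}g"

# With "@" as delimiter the command is malformed and sed rejects it.
if printf 'emial me\n' | sed "$bad" >/dev/null 2>&1
then bad_status=ok
else bad_status=failed
fi

out=$(printf 'emial me\n' | sed "$good")
echo "@ delimiter: $bad_status"
echo "control-A delimiter: $out"
```

Building the delimiter with printf keeps the script itself free of
control characters, so it survives being posted.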

Laura Creighton
utzoo!laura (until Thursday...)

holte@brueer.UUCP (Robert Holte) (04/29/85)

There is a small bug in the "spellfix" shell script, as distributed.
The sed commands (stored in $t3) are not being created in
a legal format.

The fix is to replace the following lines (32-33) in the original
  echo "/${word}/i\
.\\\" ### spell: ${word} %%%" >> $t3

with

  echo "/${word}/i\\" >> $t3
  echo ".\\\\\" ### spell: ${word} %%%" >> $t3
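
For reference, a sketch of the two-line form sed expects for an insert
command: the address and "i\" on one line, the text on the next. It
uses printf rather than echo to sidestep differences in how echo
treats backslashes, and a plain "###" flag line without the troff
comment prefix. Both choices, and the use of mktemp, are assumptions
of this sketch rather than part of the original script.

```sh
#!/bin/sh
word='teache'
script=$(mktemp)

# Two physical lines per word: "/word/i\" and then the flag text.
printf '/%s/i\\\n' "$word" > "$script"
printf '### spell: %s %%%%%%\n' "$word" >> "$script"

flagged=$(printf 'first line\na teache here\n' | sed -f "$script")
rm -f "$script"
printf '%s\n' "$flagged"
```

The flag line appears immediately above the line that contains the
word, and the other line passes through untouched.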


The author observes in the BUGS section of the man page that
	"spellfix" can insert more error lines than needed.
	For example, if "spell" rejects 'teache',
	lines with 'teacher' will be marked as erroneous.

This can be extremely inconvenient, and can largely be avoided by
using more selective sed patterns.  I suggest using all
of the following (instead of the bug fix just given):

  echo "/[^a-zA-Z]${word}[^a-zA-Z]/i\\" >> $t3
  echo ".\\\\\" ### spell: ${word} %%%" >> $t3

  echo "/[^a-zA-Z]${word}\$/i\\" >> $t3
  echo ".\\\\\" ### spell: ${word} %%%" >> $t3

  echo "/^${word}[^a-zA-Z]/i\\" >> $t3
  echo ".\\\\\" ### spell: ${word} %%%" >> $t3

  echo "/^${word}\$/i\\" >> $t3
  echo ".\\\\\" ### spell: ${word} %%%" >> $t3

Collectively, the four sed patterns created in this way pick out
all occurrences of "word" which are neither preceded nor
followed by a letter of the alphabet.  This is still not
perfect, but it is very much better than the original.
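
The effect of the four patterns can be checked with grep, which
accepts the same basic regular expressions. A minimal sketch with
invented sample lines:

```sh
#!/bin/sh
word='teache'
sample=$(printf '%s\n' 'my teacher said' 'a teache said' 'teache' \
	'"teache"' 'reteache')

# Original pattern: any line containing the string.
loose=$(printf '%s\n' "$sample" | grep -c "$word")

# The four patterns: a non-letter or a line edge on both sides.
strict=$(printf '%s\n' "$sample" | grep -c \
	-e "[^a-zA-Z]${word}[^a-zA-Z]" \
	-e "[^a-zA-Z]${word}\$" \
	-e "^${word}[^a-zA-Z]" \
	-e "^${word}\$")

echo "loose: $loose lines, strict: $strict lines"
```

The loose pattern picks up all five lines; the strict set drops
"teacher" and "reteache" and keeps the other three.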

 -- Rob Holte     (...!ukc!reading!brueer!holte)

RSanders@USGS2-MULTICS.ARPA (04/30/85)

(I'm posting this here because I don't have access to net.sources.bugs -
if it exists)

Rob Holte (brueer!holte) posted some changes to spellfix that I must
comment on - specifically, the extra "sed" lines to parse misspelled
words better.  Unfortunately, I implemented and discarded his method
(and others even hairier) because of the following common troff/nroff
constructs:

  \*(lqThjs is a quote\*(rq

  \fBThjs is bold text\fR

With Rob's sed lines, these lines would not be flagged.  Rather than
*not flag* misspelled words, I decided to flag *too many* properly
spelled words.  If you never run spellfix on nroff/troff input files,
Rob's changes will work fine.  I alluded to this problem in the header
to the posting message for spellfix.
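
A quick way to see the problem, using the two lines above and the four
stricter patterns. This is a sketch for illustration; the function
name matches_strict is made up.

```sh
#!/bin/sh
word='Thjs'
line1='\*(lqThjs is a quote\*(rq'
line2='\fBThjs is bold text\fR'

# True if the line matches any of the four "whole word" patterns.
matches_strict() {
	printf '%s\n' "$1" | grep \
		-e "[^a-zA-Z]${word}[^a-zA-Z]" \
		-e "[^a-zA-Z]${word}\$" \
		-e "^${word}[^a-zA-Z]" \
		-e "^${word}\$" >/dev/null
}

loose=0
strict=0
for l in "$line1" "$line2"
do
	if printf '%s\n' "$l" | grep "$word" >/dev/null
	then loose=$((loose + 1))
	fi
	if matches_strict "$l"
	then strict=$((strict + 1))
	fi
done
echo "loose pattern flags $loose of 2 lines, strict patterns flag $strict"
```

In both lines the misspelling is glued to a letter that belongs to the
troff escape ("q" in \*(lq, "B" in \fB), so the strict patterns see it
as part of a longer word.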

-- Rex

P.S.  I still haven't seen a copy of "correct" - if someone could mail
or re-post...

silvert@dalcs.UUCP (Bill Silvert) (05/03/85)

Here is an enhanced version of spellfix with the following features:
menu driven rather than requiring editing of spell.errors in vi;
supports user dictionary $HOME/dict/words and menu allows user to
	add words to this dictionary;
option for automatic correction of words.
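
The user-dictionary feature comes down to subtracting one sorted word
list from another with comm. A minimal sketch, with made-up word lists
standing in for the output of spell and for $HOME/dict/words, and with
mktemp assumed to be available:

```sh
#!/bin/sh
errs=$(mktemp)
dict=$(mktemp)

# Stand-ins for the sorted spell output and the user's dictionary.
printf '%s\n' Creighton teh troff | sort > "$errs"
printf '%s\n' troff Creighton | sort -u > "$dict"

# comm -23 keeps only the lines unique to the first file.
remaining=$(comm -23 "$errs" "$dict")
rm -f "$errs" "$dict"
echo "$remaining"
```

Both inputs must be sorted the same way, which is why the script sorts
the dictionary before each comparison.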

This was created by modifying Rex Sanders' script, and he gets most
of the credit for it.  My changes are inspired mainly by the CP/M
spelling checker "The Word Plus", which is a superb utility with
these and more features.

It has been suggested that spellfix check for complete words.
That is too messy for my taste -- my spelling isn't that bad!
Also, a recent item in net.sources.bugs points out that you can
get into trouble with troff commands.

Another problem which I haven't a fix for is case differences.
I have chosen to ignore all words with embedded numerals (this
includes 2nd) and anything which is all in capitals (since I
write a lot of stuff with embedded Fortran code).
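
The filter is just two greps in a row. A sketch with an invented word
list in place of real spell output; the locale is pinned to C here so
that "[a-z]" means lower-case ASCII only, which is an addition of this
sketch.

```sh
#!/bin/sh
LC_ALL=C
export LC_ALL

# Keep words with at least one lower-case letter and no digits.
kept=$(printf '%s\n' FORTRAN 2nd DO10I teh Thjs |
	grep "[a-z]" | grep -v "[0-9]")
printf '%s\n' "$kept"
```

"FORTRAN" and "DO10I" are dropped for having no lower-case letters,
"2nd" for its digit; "teh" and "Thjs" go on to be checked.
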
---------------------------cut here-------------------------------
#! /bin/sh
# run through /bin/sh to create script and manual entry
# the delimiter is quoted so that $$, $1 and `...` reach the script unexpanded
cat > spellfix << 'xxSHELLxx'
#! /bin/sh
# <@(#)spellfix	Ver. 1.6, 85/04/29 12:03:48> - interactive spelling checker and fixer
#   Rex Sanders, USGS Pacific Marine Geology
#   Modifications by Bill Silvert, MEL
t1=/tmp/spf$$.1
t2=/tmp/spf$$.2
t3=/tmp/spf$$.3
prog=`basename $0`
udict=$HOME/dict
uwords=$udict/words

case $# in
1)	trap 'rm -f $t1 $t2 $t3; exit' 0 1 2 15 ;;
*)	echo "Usage: $prog filename" >&2
	exit 1 ;;
esac

echo "Looking for spelling errors in $1 ..."
# ignore upper-case 'words' and alphanumerics
spell $1 | grep "[a-z]" | grep -v "[0-9]" | sort > $t2
if test -s $uwords
then	sort -u $uwords -o $uwords # clean up user's dictionary
	comm -23 $t2 $uwords > $t1
else	mv $t2 $t1
fi
test -s $t1 || exit 0

test -d $udict || mkdir $udict

echo "Found `wc -l <$t1` misspellings"
echo "Responses:	A=add to user dictionary, B=bypass"
echo "		C=correct, M=mark for correction"

for word in `cat $t1`
do
	grep $word $1
	while :
	do
		echo -n "${word}: (A/B/C/M?) "
		read response
		case $response in
		A|a)	echo $word >> $uwords
			break ;;
		B|b)	break ;;
		C|c)	echo -n "Correct spelling: "
			read response
			echo "s/${word}/${response}/" >> $t3
			break ;;
		M|m)	echo "/${word}/i\\" >> $t3
			echo "### spell: ${word} %%%" >> $t3
			break ;;
		*)	;;
		esac
	done
done

test -s $t3 || exit 0

if (sed -f $t3 < $1 > $t2 2> /dev/null)
then
  echo "Here is a temporary copy of $1 to edit: use n to find next error"
  sleep 3
  vi +/###/ $t2
  sed -e '/^### spell: .* %%%$/d' $t2 > $t3
  cp -i $t3 $1
else
  echo "spellfix: error marking misspelled words - file $1 unchanged." >&2
fi
xxSHELLxx
chmod +x spellfix
cat >spellfix.1 <<xxMANxx
.TH SPELLFIX 1 "Local Utility"
.SH NAME
spellfix - interactively fix spelling errors
.SH SYNOPSIS
.B spellfix
file
.SH DESCRIPTION
.PP
.I spellfix
is an interactive spelling checker and fixer.
It calls
.I spell
and also uses a local word list of the user.
Using
.I spellfix
is a 4-step process:
.TP
1.
Type:
.sp
.B
          spellfix
.I file
.sp
where
.I file
is the name of your manuscript.
.I spellfix
only handles one file at a time.
.TP
2.
.I spellfix
will list all apparent misspellings in your manuscript,
along with all lines in which each occurs.
Respond to each with A if you want it added to your local
dictionary file, ~/dict/words; B to bypass the word;
C to correct it; and M to mark it in the file for later correction.
If you respond with C,
.I spellfix
will prompt you for the corrected spelling.
Make sure that you want this correction made in all lines shown!
.TP
3.
Edit the temporary copy of your original file.
The words you marked in step 2 will be flagged with lines like:
.br
.sp
.ec +
        ###  spell:  foobar  %%%
.ec
.br
.sp
where
.I foobar
is a misspelled word.
You should correct the misspellings on the text lines that follow.
The flag lines will be removed automatically before the next step.
.TP
4.
Answer
.IR spellfix 's
question:
.sp
          overwrite
.I file
?
.sp
with
.B yes
or
.BR no .
If your answer is yes,
.I spellfix
will replace your original file with the file you created in step 3.
.SH NOTES
In step 3, the search string for
.IR vi (1)
is set to "###";
you can conveniently search for the next spelling error with the
"n" request.
.PP
.I spellfix
was inspired by
.IR error (1).
.SH FILES
~/dict/words	user dictionary
.SH SEE ALSO
sed(1), spell(1), vi(1), troff(1), error(1).
.SH BUGS
.PP
.I spellfix
can change more lines than needed.
For example, if
.I spell
rejects 'teache',
lines with 'teacher' will be marked as erroneous.
.SH AUTHOR
Rex Sanders, U.S. Geological Survey, Pacific Marine Geology.
Modifications by William Silvert, Marine Ecology Laboratory.
xxMANxx
-- 
Bill Silvert
Marine Ecology Lab.
Dartmouth, NS
dalcs!biomel!bill