rik@ucla-cs.UUCP (05/06/85)
Here's a program I use all the time to correct spelling errors in a text file. It differs from "spellfix", posted recently, in that it makes the changes to the file directly. It is a Bourne shell script, but I heard that a similar program (written in C and therefore probably more efficient) was posted a couple of years ago. If somebody has a copy of that, please post it so that we can compare.

Rik Verstraete.

ARPA: rik@UCLA-CS.ARPA
UUCP: ...!{cepu,ihnp4,trwspp,ucbvax}!ucla-cs!rik
------------------------------------------------------------------------
#!/bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #!/bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	correct.1
#	correct
# This archive created: Mon May 6 09:48:33 1985
export PATH; PATH=/bin:$PATH
echo cshar: extracting "'correct.1'" '(4126 characters)'
if test -f 'correct.1'
then
	echo cshar: over-writing existing file "'correct.1'"
fi
sed 's/^X//' << \SHAR_EOF > 'correct.1'
X.TH CORRECT 1P 12/21/83
X.SH NAME
Xcorrect \- program to find and correct spelling errors
X.SH SYNOPSIS
X.B correct
X[
X.B \-\fIn\fP
X]
X[
X.B \-\fId\fP
X.B \fIdictionary\fP
X] file
X.SH DESCRIPTION
X.I correct
Xis a program that
Xfinds the spelling errors in a file,
Xedits this file to correct the mistakes,
Xand allows the user to maintain a dictionary file.
XUsing
X.I correct
Xhas been preferable to running
X\fIspell\fR(1)
Xand then editing the file.
X.PP
XFirst,
Xthe program runs \fIspell\fR(1).
XThe list of errors produced is
Xchecked against the entries in a user-defined dictionary file,
Xand the ones found in the dictionary are filtered out.
XThe default dictionary is $HOME/dictionary
X(which need not exist;
X.I correct
Xwill create it if necessary).
XA different dictionary file can be
Xspecified by using the
X.I \-d
Xflag.
X.PP
XIf the
X.I \-n
Xoption is not used (see below),
Xthe program then presents each error to the user,
Xoffering five options:
X.TP
X.I a[dd]
Xadd the word to the user dictionary file.
X.TP
X.I s[ubstitute]
Xsubstitute the word throughout the entire file.
X.TP
X.I c[heck]
Xcheck the word in a line-by-line context
X(with option to edit at each place).
X.TP
X.I l[ook]
Xlook up the word in the system dictionary
X(via the \fIlook\fR(1) command).
X.TP
X<return>
Xno action.
X.PP
XWhen all the errors have been processed,
X.I correct
Xgives the user a final chance to abort the
Xcorrections by prompting for yes/no.
XA reply `yes' means that the original file will be changed
Xto reflect the corrections of the spelling errors.
XFinally, the dictionary is sorted
Xand all temporary workspace is removed.
XNote that
X.I correct
Xalways sorts the dictionary file,
Xeven if no errors were found.
X.sp
XWith the
X.I \-n
Xflag,
Xno editing is done;
X.I correct
Xjust checks the user lexicon
Xand sends the resulting list to the standard output
X(looks like \fIspell\fR(1) except that
Xthe list does not contain entries from the dictionary, and
Xis sorted in \s-2ASCII\s+2 rather than alphabetical order).
X.sp
XThe
X.I \-d
Xflag is used to specify a different dictionary.
XThe flag should be followed by a file name,
Xwhich will be taken as the dictionary,
Xinstead of $HOME/dictionary.
XNote that the
X.I \-n
Xflag, the
X.I \-d
Xflag (with dictionary file name),
Xand the file itself,
Xcan be specified in any order.
X.LP
X.I Restriction:
XThis program only works on
X.I one
Xfile;
Xit can not edit more than one file at a time.
XAlso,
X``.so'' \fItroff\fR(1) commands should not be used
Xin the file to be corrected.
X.SH FILES
X$HOME/dictionary: default dictionary file.
X.SH SEE\ ALSO
Xspell(1),
Xlook(1),
Xsort(1),
Xgrep(1),
Xex(1),
Xsed(1).
X.SH DIAGNOSTICS
X.I correct
Xwill complain if
Xmore than one or no file to be corrected is specified;
Xif the
X.I \-d
Xflag is used without a dictionary file;
Xor if the file to be corrected does not exist.
X.SH BUGS
XA serious bug has to do with
Xthe form of search.
XThe
X.I \-w
Xoption of
X\fIgrep\fR(1)
Xand the /\\<...\\>/ search pattern in
X\fIex\fR(1)
Xare used to find and edit only
Xthose matching letter patterns
Xthat are indeed words.
XA line such as:
X.sp
X.ce
X\\fBGradautes\\fR
X.sp
X(\\fx is the troff command to change to font x)
Xwill cause ``Gradautes'' to come out
Xas an error via
X.I spell .
XHowever,
Xin the editing phase,
X``Gradautes'' will not be picked up as a ``word''
X(because of the fB in front)
Xand will not get changed.
X.PP
XChoosing to check ``line-by-line''
Xleads to problems in the editing stage.
XSpecifically,
Xif the line has $'s, *'s, etc.
X(characters with special meaning to
X.I ex )
Xthen the editing may not work as expected.
XThis problem has been alleviated somewhat
Xby using the ``nomagic'' option of
X\fIex\fR(1),
Xbut it still remains a problem...
X.PP
XAnother minor problem with check line-by-line occurs
Xwhen two identical errors appear
Xin one line.
XRight now,
Xboth (or more) errors will be ``corrected'' in the same way;
Xone cannot change the two words in different ways.
X.PP
XApparently,
X.I correct
Xdoesn't work on all dial-up terminals.
XThat is,
Xthe program runs,
Xbut no corrections are made
X(something to do with
X\fIex\fR(1)
Xand minimum baud rates).
X.SH AUTHORS
XTovah Hollander (``tovah@ucla-cs.arpa'')
Xand Rik Verstraete (``rik@ucla-cs.arpa'').
SHAR_EOF
if test 4126 -ne "`wc -c 'correct.1'`"
then
	echo cshar: error transmitting "'correct.1'" '(should have been 4126 characters)'
fi
echo cshar: extracting "'correct'" '(5123 characters)'
if test -f 'correct'
then
	echo cshar: over-writing existing file "'correct'"
fi
sed 's/^X//' << \SHAR_EOF > 'correct'
X:
X: correct: a program to find and correct spelling errors
X:
X: SYNOPSIS: correct [-n] [-d dictionary] file
X: AUTHOR: Tovah Hollander and Rik Verstraete.
X: DATE: Tue Jan 10 14:08:13 PST 1984
X:
X:
Xtrap "/bin/rm -f /tmp/*$$ > /dev/null ; exit" 1 2 15
X: initialize variables
Xsetf=0
Xlexicon=$HOME/dictionary
Xcorrectyn=1
X: all arguments of the command
Xwhile test -n "$1"
X   do case $1 in
X      -n) correctyn=0 ;;
X      -d) shift
X          if test $1
X             then lexicon=$1
X             else echo 'correct: must specify dictionary file with -d flag'
X                  exit
X          fi ;;
X      *)  if test $setf -eq 1
X             then echo 'correct: can work only on one file'
X                  exit
X          fi
X          file=$1
X          setf=1 ;;
X      esac
X      shift
X   done
X: is a file name specified?
Xif test $setf -eq 0
X   then echo 'correct: must specify a file name'
X        exit
Xfi
X: does file exist?
Xif test ! -f $file
X   then echo 'correct: cannot open '$file
X        exit
Xfi
X: create dictionary if necessary
Xif test ! -f $lexicon
X   then echo '' > $lexicon
Xfi
X: sort dictionary
Xsort -u $lexicon -o $lexicon
X: find all errors not in the dictionary
Xspell $file | sort -u -o /tmp/spell$$
Xcomm -23 /tmp/spell$$ $lexicon > /tmp/errors$$
X: process errors one by one, if any, and if -n flag is not set
Xif test $correctyn -eq 0
X   then cat /tmp/errors$$
Xelif test -s /tmp/errors$$
X   then
X   set `cat /tmp/errors$$`
X   echo '
XChoose for each word:
X a(dd),s(ubstitute),c(heck),l(ook),h(elp),or <return>'
X:
X: process all errors one by one
X:
X   for i
X   do
X      word=$i
X      valid=0
X      until test $valid -eq 1
X         do echo ''
X            echo -n '"'$word'": '
X            read action
X            case $action in
X            a*) echo $word >> $lexicon
X                valid=1 ;;
X            s*) echo -n ' Substitute: '
X                read newword
X                echo "g/\<"$word"\>/s//"$newword"/g" >> /tmp/ex-script$$
X                valid=1 ;;
X            c*) grep -w $word $file > /tmp/line$$
X                mode=1
X                until test `grep -cw $word /tmp/line$$` -eq 0
X                   do echo ''
X                      echo ' Context is:'
X                      line=`head -1 /tmp/line$$`
X                      echo $line
X                      echo -n ' Edit (y/n)? '
X                      read yorn
X                      case $yorn in
X                      y*) if test $mode -eq 1
X                             then mode=0
X                                  echo -n ' Substitute: '
X                                  read newword
X                                  echo "/^$line$/s/\<"$word"\>/"$newword"/g" >> /tmp/ex-script$$
X                             else echo -n ' Substitute "'$newword'" (y/n)? '
X                                  read ans
X                                  case $ans in
X                                  y*) echo "/^$line$/s/\<"$word"\>/"$newword"/g" >> /tmp/ex-script$$
X                                      ;;
X                                  *)  echo -n ' Substitute: '
X                                      read newword
X                                      echo "/^$line$/s/\<"$word"\>/"$newword"/g" >> /tmp/ex-script$$
X                                  esac
X                          fi ;;
X                      n*) echo -n " Add to dictionary? "
X                          read yorn
X                          case $yorn in
X                          y*) echo $word >> $lexicon
X                          esac
X                      esac
X                      sed -e 1d /tmp/line$$ > /tmp/junk$$
X                      mv /tmp/junk$$ /tmp/line$$
X                   done
X                valid=1 ;;
X            l*) echo -n ' Enter search string: '
X                read string
X                look $string > /tmp/look$$
X                if test ! -s /tmp/look$$
X                   then echo ' No words found.'
X                   else echo ' Words found in system dictionary: '
X                        cat /tmp/look$$
X                fi ;;
X            h*) echo '
XOPTIONS (choose for each incorrect word):
X a(dd) - add the word to the lexicon
X s(ubstitute) - make a global substitution for the word
X c(heck) - check the word in each context
X l(ook) - check system dictionary for possible corrections
X h(elp) - print this list
X <return> - no action' ;;
X            "") valid=1 ;;
X            *)  echo '"'$action'" is invalid option.
XChoose for each word:
X a(dd),s(ubstitute),c(heck),l(ook),h(elp),or <return>
X' ;;
X            esac
X         done
X   done
X: make corrections, if any
X   if test -f /tmp/ex-script$$
X      then echo ''
X           echo -n 'Do you want to make all the corrections now? '
X           read yorn
X           case $yorn in
X           y*) echo w $file >> /tmp/ex-script$$
X               echo q >> /tmp/ex-script$$
X               ex $file < /tmp/ex-script$$ > /dev/null ;;
X           *)  echo "corrections aborted..."
X           esac
X           rm /tmp/ex-script$$
X   fi
X: clean up and sort lexicon
X   spell $lexicon | sort -u -o $lexicon
Xfi
X/bin/rm -f /tmp/*$$ > /dev/null
SHAR_EOF
if test 5123 -ne "`wc -c 'correct'`"
then
	echo cshar: error transmitting "'correct'" '(should have been 5123 characters)'
fi
chmod +x 'correct'
# End of shell archive
exit 0
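
For anyone who wants to try the dictionary-filtering step on its own, without running spell(1) or touching a real file, here is a minimal sketch. It mirrors the sort -u / comm -23 pair used in correct, but the function name filter_errors, the sample words, and the temporary file names are made up for illustration and are not part of the archive. LC_ALL=C is set as an assumption so that sort and comm agree on ordering.

```shell
#!/bin/sh
# Sketch of the filtering step in correct: print the flagged words that
# are NOT already in the user's dictionary.
# filter_errors is a hypothetical helper name, not part of correct.
# Usage: filter_errors flagged-words-file dictionary-file
filter_errors() (
    # Run in a subshell so the locale change does not leak to the caller.
    LC_ALL=C
    export LC_ALL
    tmp=${TMPDIR:-/tmp}/filter$$
    sort -u "$1" > "$tmp.flagged"     # stand-in for: spell file | sort -u
    sort -u "$2" > "$tmp.dict"        # correct sorts the dictionary first too
    # -23 suppresses lines only in the dictionary and lines in both files,
    # leaving the words that appear only in the flagged list.
    comm -23 "$tmp.flagged" "$tmp.dict"
    rm -f "$tmp.flagged" "$tmp.dict"
)

# Small demonstration with made-up word lists.
dir=${TMPDIR:-/tmp}/demo$$
mkdir "$dir"
printf '%s\n' Gradautes troff teh > "$dir/flagged"
printf '%s\n' troff UCLA > "$dir/dictionary"
filter_errors "$dir/flagged" "$dir/dictionary"
rm -rf "$dir"
```

With these sample lists the demonstration prints Gradautes and teh, one per line: troff is dropped because it is already in the dictionary, and UCLA is ignored because it was never flagged.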