[comp.sources.unix] v10i29: Bill Tuthill's "hum" text concordance package, Part03/03

rs@uunet.UU.NET (Rich Salz) (06/27/87)
Submitted by: John Gilmore <hoptoad!gnu>
Mod.Sources: Volume 10, Number 29
Archive-name: hum/Part03

: To unbundle, sh this file
mkdir man
echo man/INDEX.man
cat >man/INDEX.man <<'@@@ Fin de man/INDEX.man'
.TH INDEX HUM "Version 3.7"
.ps 12
.vs 14
.ta 15n
.nf
The following programs are available for Humanities users:
.sp
\fBaccent\fP	user-controlled accent module
\fBcedilla\fP 	convert plus mark to cedilla
\fBcfreq\fP 	character (or digraph) frequency count
\fBdict\fP 	split file into dictionary sections
\fBexclude\fP 	exclusion module for concordance
\fBformat\fP 	format and count keywords in concordance
\fBfreq\fP 	word frequency count
\fBkwal\fP 	key word and line concordance
\fBkwic\fP 	key word in context concordance
\fBlemma\fP 	user-controlled lemmatization module *
\fBlno\fP 	line number (also double lines, hemistichs, strophes)
\fBmaxwd\fP 	locate, measure and print longest word (or line)
\fBpair\fP 	set two files side by side (or merge lines)
\fBpause\fP 	stop terminal output to change type ball
\fBrevconc\fP 	reverse concordance module
\fBsfind\fP 	find sentence (or record) matching a pattern
\fBskel\fP 	prompt user with database skeleton *
\fBtogrk\fP 	convert Greek transcription for typesetter *
\fBtolpr\fP 	filter output for lineprinter
\fBtosel\fP 	convert English for Selectric terminal *
\fBtprep\fP 	prepare text for concordance (trim, pad, or unpad)
\fBtroffmt\fP 	format concordance for typesetter *
\fBumlaut\fP 	convert plus mark to umlaut
\fBwdlen\fP 	tabulate word lengths and print histogram
\fBwheel\fP 	roll through text a word cluster at a time
\fBxref\fP	cross reference words and linenumbers
.sp
	* not widely distributed, but available on request
.sp 3
To get the manual pages for any of these programs, type:
.sp
	% \fBhuman\fP  \fIprogramname\fP
.fi
.ps
.vs
@@@ Fin de man/INDEX.man
echo man/Makefile
cat >man/Makefile <<'@@@ Fin de man/Makefile'
MAN = ../man/

all:	$(MAN)index $(MAN)accent $(MAN)cfreq $(MAN)dict $(MAN)exclude \
	$(MAN)format $(MAN)freq $(MAN)kwal $(MAN)kwic $(MAN)lno $(MAN)maxwd \
	$(MAN)pair $(MAN)pause $(MAN)revconc $(MAN)skel $(MAN)sfind \
	$(MAN)tolpr $(MAN)tosel $(MAN)tprep $(MAN)wdlen $(MAN)wheel $(MAN)xref

$(MAN)index: INDEX.man
	nroff -man INDEX.man > $(MAN)index
$(MAN)accent: accent.man
	nroff -man accent.man > $(MAN)accent
	ln $(MAN)accent $(MAN)cedilla
	ln $(MAN)accent $(MAN)umlaut
$(MAN)cfreq: cfreq.man
	nroff -man cfreq.man > $(MAN)cfreq
$(MAN)dict: dict.man
	nroff -man dict.man > $(MAN)dict
$(MAN)exclude: exclude.man
	nroff -man exclude.man > $(MAN)exclude
$(MAN)format: format.man
	nroff -man format.man > $(MAN)format
$(MAN)freq: freq.man
	nroff -man freq.man > $(MAN)freq
$(MAN)kwal: kwal.man
	nroff -man kwal.man > $(MAN)kwal
$(MAN)kwic: kwic.man
	nroff -man kwic.man > $(MAN)kwic
$(MAN)lno: lno.man
	nroff -man lno.man > $(MAN)lno
$(MAN)maxwd: maxwd.man
	nroff -man maxwd.man > $(MAN)maxwd
$(MAN)pair: pair.man
	nroff -man pair.man > $(MAN)pair
$(MAN)pause: pause.man
	nroff -man pause.man > $(MAN)pause
$(MAN)revconc: revconc.man
	nroff -man revconc.man > $(MAN)revconc
$(MAN)sfind: sfind.man
	nroff -man sfind.man > $(MAN)sfind
$(MAN)skel: skel.man
	nroff -man skel.man > $(MAN)skel
$(MAN)tolpr: tolpr.man
	nroff -man tolpr.man > $(MAN)tolpr
$(MAN)tosel: tosel.man
	nroff -man tosel.man > $(MAN)tosel
$(MAN)tprep: tprep.man
	nroff -man tprep.man > $(MAN)tprep
$(MAN)wdlen: wdlen.man
	nroff -man wdlen.man > $(MAN)wdlen
$(MAN)wheel: wheel.man
	nroff -man wheel.man > $(MAN)wheel
$(MAN)xref: xref.man
	nroff -man xref.man > $(MAN)xref
@@@ Fin de man/Makefile
echo man/README
cat >man/README <<'@@@ Fin de man/README'
	This directory contains nroff/troff text for use with the -man
macro package (version 7 Unix only).  To get printable manual pages
in the directory "../man", where users can read them by using the
"human" command, just use the "make" command in this directory.
This is very similar to the "make" for the C source code, except
that this "Makefile" calls "nroff -man".
@@@ Fin de man/README
echo man/accent.man
cat >man/accent.man <<'@@@ Fin de man/accent.man'
.TH ACCENT HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
.nf
accent \- user-controlled accent module
cedilla \- convert plus mark to cedilla
umlaut \- convert plus mark to umlaut
.fi
.SH SYNOPSIS
.nf
\fBaccent\fP  [ \fB\-a\fP  accfile ]  [ filename ... ]
\fBcedilla\fP  [ filename ... ]
\fBumlaut\fP  [ filename ... ]
.fi
.SH DESCRIPTION
\fIAccent\fP reads accent mark definitions from ``accfile'',
or some other file specified after the \-a option.
This file should have one or more lines
of a character, some space, and another character.
\fIAccent\fP converts all characters on the left
into a backspace and the corresponding character on the right.
When using \fIaccent\fP, create a punctuation file
(to be specified after the \-d option of \fIkwic\fP)
containing all your left-hand accent marks on its second line.
\fIKwic\fP will consider these as zero-width characters.
.PP
For convenience, two links to \fIaccent\fP are provided:
\fIcedilla\fP and \fIumlaut\fP.
\fICedilla\fP converts the plus mark to a backspace and comma,
which looks like a cedilla on the lineprinter;
this convention is used even on the phototypesetter.
\fIUmlaut\fP converts the plus mark to a backspace and double quote;
this passes for an umlaut on unsophisticated output devices.
The plus mark should follow the character that is to be accented;
for example, type "Provenc+al" or "Mu+llerin" in your text.
The use of a plus character to represent both accent marks
implies that you cannot have cedillas and umlauts in the same text,
unless you use the \fIaccent\fP program.
.PP
Be sure to use the + option of \fIkwic\fP or \fIkwal\fP,
or have a second line of zero-width characters in the punctuation file,
in order to create the proper character alignment.
\fIAccent, cedilla,\fP or \fIumlaut\fP should be called
after \fIsort\fP, but before \fIformat\fP.
If you are using these programs,
invoke the \-d flag of \fIsort\fP to prevent accent marks
from influencing dictionary order.
.SH "SEE ALSO"
kwal(hum), kwic(hum)
.SH AUTHOR
Bill Tuthill
.SH BUGS
\fIAccent\fP should probably be able to convert a character
into an arbitrary-length string.
@@@ Fin de man/accent.man
echo man/cfreq.man
cat >man/cfreq.man <<'@@@ Fin de man/cfreq.man'
.TH CFREQ HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
cfreq \- character (or digraph) frequency count
.SH SYNOPSIS
.nf
\fBcfreq\fP  [ \fB\-p \-a \-d \-m \-\fP ]  filename ...
\-p: list all printable characters (blank \- '~')
\-a: list all ascii characters (null \- delete)
\-d: count digraphs rather than single characters
\-m: disable mapping of digraphs to lower case
\- : read standard input instead of files
.fi
.SH DESCRIPTION
.I Cfreq
reads through a list of files, counting the number of
occurrences of each ascii character.
The counts are kept in an internal table.
When reading is finished, the frequencies
are listed, with the character (in ascii order) on the left,
and its frequency on the right.
A count of the total number of characters
(including newlines) appears at the bottom of the listing.
The output can be formatted into multiple columns
with the Unix utility \fIpr,\fP if desired.
.PP
Ordinarily, only alphabetic characters are listed.
The \-p option, however, gives all printable characters, 
including letters, digits, and punctuation marks.
The \-a option gives all 128 ascii characters, including 
control characters, letters, digits, and punctuation marks.
.PP
When given the \-d flag, \fIcfreq\fP counts digraph frequencies.
Spaces, tabs and newlines are considered valid characters,
as are punctuation marks, digits, and control characters.
When reading is finished, the digraphs are listed on the left,
with the frequency counts on the right.
If the \-m flag is also invoked,
\fIcfreq\fP will not map alphabetic characters to lower case,
so you will end up with capitals among the digraphs.
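.PP
As a rough illustration of the default single-character count,
the listing can be approximated with standard utilities.
The function name \fIcharfreq\fP is hypothetical and is not part of this package;
the sketch assumes a \fIfold\fP that accepts \-w1,
lists only the letters A\-Z and a\-z,
and omits the total that \fIcfreq\fP prints at the bottom.
.nf
```shell
# Hypothetical helper, not part of the hum package: approximate the
# default (alphabetic-only) character count with standard tools.
charfreq() {
    # fold -w1 puts each character on its own line; grep keeps letters only.
    # awk swaps the columns so the character is on the left, the count on the right.
    fold -w1 "$@" | grep '[A-Za-z]' | sort | uniq -c | awk '{ print $2, $1 }'
}
```
.fi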
.SH "SEE ALSO"
freq(hum), pr(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
@@@ Fin de man/cfreq.man
echo man/dict.man
cat >man/dict.man <<'@@@ Fin de man/dict.man'
.TH DICT HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
dict \- split file into dictionary sections
.SH SYNOPSIS
.nf
\fBdict\fP  [ \- ]  filename  [ outfileroot ]
\-: read standard input rather than file
.fi
.SH DESCRIPTION
\fIDict\fP will divide a text into multiple files,
according to the first letter on each line.
It can be used to split up a large concordance
into smaller, more manageable dictionary sections.
This program is akin to the Unix utility \fIsplit\fP,
which divides files into 1000-line portions.
\fIDict\fP reads from the file given in the first argument
(or standard input if the first argument is `\-'),
and writes onto a set of output files,
all beginning with the root given in the second argument.
If no second argument is given, the root defaults to "X".
For every file that is created,
a character is added to the root,
to indicate what letter that file contains.
.PP
Theoretically, it is possible to write 128 different files,
one for each ascii character.
This means that each number goes into its own file,
and that an upper case "A" and a lower case "a"
will end up in different files.
In the case of the \fIkwic\fP program, all keywords are already
mapped to lower case, so there should be 26 or fewer files.
Here is an example of a concordance program using \fIkwic\fP:
.nf
 % kwic text* | sort | dict \- /tmp/OUT
 % edit /tmp/OUT*
 % format /tmp/OUT* | lpr
 % rm /tmp/OUT*
.fi
In the above example, \fIdict\fP makes small files out of one large file,
so that you can edit the concordance until you are happy with it.
The best and most useful concordances are always hand-edited.
.SH "SEE ALSO"
format(hum), kwal(hum), kwic(hum), sort(1), split(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
@@@ Fin de man/dict.man
echo man/exclude.man
cat >man/exclude.man <<'@@@ Fin de man/exclude.man'
.TH EXCLUDE HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
exclude \- exclusion module for concordance
.SH SYNOPSIS
\fBexclude\fP  [ \fB\-i\fP  ignorefile ]  [ \fB\-o\fP  onlyfile ]  [ filename ... ]
.nf
\-i: ignorefile contains words to be ignored, one per line
\-o: onlyfile has only words to be printed, one per line
.fi
.SH DESCRIPTION
.I Exclude
is a filter that functions as an exclusion routine
for deleting unnecessary words from a concordance.
When invoked without any arguments,
it reads standard input, and writes to standard output,
filtering out all lines beginning with words listed in ``exclfile''.
If any filenames of text files are given,
\fIexclude\fP will read from them rather than from standard input.
.PP
Ordinarily, words to be ignored are read from ``exclfile'',
but another ignore file can be specified after the \-i option.
(There is a list of common English words in /usr/lib/eign.)
If you wish to preserve only a small set of words,
and want all other words ignored, you can list these
important words in a separate file, and name it after the \-o option;
only words listed in that file will be sent through the filter.
Words listed in the exclude file must be on a line of their own,
with no blanks anywhere on the line.
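.PP
The filtering idea can be sketched with \fIawk\fP.
This is only an illustration under stated assumptions, not the program's own code:
the name \fImini_exclude\fP is hypothetical,
and the keyword is assumed to be the first blank-delimited field of each concordance line.
.nf
```shell
# Hypothetical sketch, not the hum implementation: drop every line whose
# first field appears in an ignore file that holds one word per line.
mini_exclude() {
    # $1 names the ignore file; concordance lines come from standard input.
    # Caveat: with an empty ignore file the NR == FNR test also matches
    # the text, so nothing is printed.
    awk 'NR == FNR { skip[$1] = 1; next } !($1 in skip)' "$1" -
}
```
.fi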
.PP
\fIExclude\fP should be used after \fIkwic\fP or \fIkwal\fP,
but before \fIsort\fP, because eliminating unnecessary words before
sorting will save large amounts of otherwise redundant machine time.
Here is a sample command line using the exclusion routine:
.nf
 % kwic  textfile | exclude | sort | format
.fi
Of course, it is necessary to have words to be excluded in a file 
called ``exclfile'', residing in the same directory as ``textfile''.
Eliminating prepositions and articles from a concordance
can often shorten it by as much as one-third to one-half.
.SH "SEE ALSO"
format(hum), kwal(hum), kwic(hum), sort(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
There cannot be more than 500 lines in the exclude file.
@@@ Fin de man/exclude.man
echo man/format.man
cat >man/format.man <<'@@@ Fin de man/format.man'
.TH FORMAT HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
format \- format and count keywords in concordance
.SH SYNOPSIS
.nf
\fBformat\fP  [ \fB\-mck\fP ]  [ filename ... ]  [ \- ]
\-m: keywords not mapped from lower to upper case
\-c: suppress counting of keywords (will speed it up)
\-k: suppress printing of separate keyword
\- : read standard input instead of files
.fi
.SH DESCRIPTION
.I Format
is generally the last program used in making a concordance.
Once the concordance has been compiled and sorted,
using \fIkwic\fP or \fIkwal\fP and \fIsort,\fP
the keywords can be formatted into capitalized headings
followed by a frequency count.
\fIFormat\fP depends on sorted input to make its frequency counts.
.PP
If for some reason you do not want an upper case keyword heading,
you can preserve the lower case keywords by using the \-m option.
Keyword counting can also be suppressed by using the \-c option;
this will speed up the \fIformat\fP program somewhat.
To completely suppress printing of a separate keyword,
use the \-k option; this will produce only
the identification field and the context.
.PP
Here is a typical program sequence for a concordance,
suitable for sending to the lineprinter:
.nf
 % kwic \-c100 filename(s) | sort | format | lpr
.fi
The \-c100 argument to \fIkwic\fP creates a long context
suitable for the lineprinter.
.SH FILES
\fIFormat\fP creates a temporary file, /tmp/Fmt?????,
where it stores all the contexts of a single keyword,
while counting the frequency of that keyword.
This tempfile is removed in case of interrupt.
.SH "SEE ALSO"
kwal(hum), kwic(hum), sort(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
The \-k option silently overrides the \-m and \-c options.
@@@ Fin de man/format.man
echo man/freq.man
cat >man/freq.man <<'@@@ Fin de man/freq.man'
.TH FREQ HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
freq \- word frequency count
.SH SYNOPSIS
.nf
\fBfreq\fP  [ \fB\-n  \-m  \-d\fPpfile  \- ]  filename ...
\-n: list words in numerical order of frequency
\-m: disable mapping of letters to lower case
\-d: define punctuation set according to \fIpfile\fP
\- : read standard input instead of files
.fi
.SH DESCRIPTION
\fIFreq\fP reads through a list of files,
counting the number of occurrences of each word.
The frequencies and words are kept
in a binary tree structure in core memory,
so that the program will be as efficient as possible.
When reading is finished, the frequencies are listed on the left,
with the words, in alphabetical order, on the right.
The total number of words, and the number of different words,
is tabulated and given at the end of the wordlist.
The output can be formatted into columns with \fIpr,\fP if desired.
.PP
The \-n option will cause the words to be listed by
numerical order of frequency, with the most common words first.
The \-m flag will leave capital letters as they are.
The \-d option allows the user to define his own punctuation set.
If this option is called, \fIfreq\fP will replace
the default punctuation set ,.;:-?!"()[]{}
with the last line of the specified file.
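.PP
For comparison, a listing similar to that of the \-n option
can be approximated with standard utilities.
The function name \fIwordfreq\fP is hypothetical;
as a simplification, the sketch treats every non-letter as a word separator
rather than using the punctuation set described above,
and it does not print the totals.
.nf
```shell
# Hypothetical sketch, not the hum implementation: word counts with the
# most common words first, count on the left and word on the right.
wordfreq() {
    cat "$@" |
    tr 'A-Z' 'a-z' |
    awk '{
        gsub(/[^a-z]+/, " ")                # every run of non-letters becomes a blank
        for (i = 1; i <= NF; i++) print $i  # one word per line
    }' |
    sort | uniq -c | sort -rn
}
```
.fi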
.SH "SEE ALSO"
cfreq(hum), dissolve(hum)
.SH AUTHOR
Bill Tuthill
.SH BUGS
\fIFreq\fP will run out of core memory at about 64K bytes of storage.
In that case it is necessary to use \fIprep\fP, \fIsort\fP and \fIuniq\fP,
which is a much slower process, but which can handle large amounts of data.
@@@ Fin de man/freq.man
echo man/kwal.man
cat >man/kwal.man <<'@@@ Fin de man/kwal.man'
.TH KWAL HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
kwal \- key word and line concordance
.SH SYNOPSIS
.nf
\fBkwal\fP  [ \fB\-k\fIn\fP \-m \-w\fIS\fP \-f\fIn\fP \-s\fIn\fP \-r \-l\fIn\fP \-x \-d\fIF\fR + \- ]  filename ...
\-kn: keyword is n characters long (defaults to 15)
\-m : keywords not mapped from upper to lower case
\-wS: write string S onto id field (use quotes around blanks)
\-fn: filename (up to n characters) written onto id field
\-sn: skip n characters of lefthand id field in text and write as id
\-r : reset linenumber to 1 at beginning of every file
\-ln: line numbering begins with line n (instead of 1)
\-x : line numbering is suppressed entirely
\-dF: define punctuation set according to file F
\(pl : the + character indicates cedilla or umlaut
\(mi : read text from standard input (terminal or pipe)
.fi
.SH DESCRIPTION
\fIKwal\fP is a text concordance program,
generally for use with poetry.
Normally, it prints a left-hand keyword (adjusted for backspaces),
a 6 digit linenumber, and the line of context.
The following characters are considered to be
punctuation marks:  ,.;:-"?!()[]{}  but all other
non-alphabetic characters can be part of a word.
These punctuation characters can be changed.
.PP
By default, only the first 15 characters
of the keyword are printed, followed by a vertical bar;
longer keywords are truncated.
If you want more or less than 15 characters in the keyword,
use the \-k option to lengthen or shorten it.
To find the longest word in your text,
try the \fImaxwd\fP program, and set \-k accordingly.
You can also use \fImaxwd \-l\fP to determine
the length of your longest context line.
Keywords are mapped to lower case to ease the logistics of sorting,
unless the \-m option is specified.
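.PP
The shape of a keyword-and-line entry can be sketched with \fIawk\fP.
This is an illustration only: the name \fImini_kwal\fP is hypothetical,
the exact spacing between fields is an assumption,
and square brackets are left out of the punctuation set to keep the pattern simple.
.nf
```shell
# Hypothetical sketch, not the hum implementation: print one entry per
# word, as a 15-character lower-case keyword, a bar, a 6-digit line
# number, and the whole line as context.
mini_kwal() {
    awk '{
        line = $0
        words = tolower($0)
        gsub(/[,.;:"?!(){}-]/, " ", words)   # punctuation separates words
        n = split(words, w, " ")
        for (i = 1; i <= n; i++)
            print sprintf("%-15.15s|%6d %s", w[i], NR, line)
    }' "$@"
}
```
.fi
The output is meant to be piped through \fIsort\fP, as in the full program sequence.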
.PP
The \-w argument allows you to write an id field
(such as the name of an author or work) after the keyword.
If you want to include any blanks,
enclose the entire string in quotes: \-w"Poetic Edda".
The \-f argument allows you to write the current filename,
up to a number of characters you specify.
If the filename is shorter, it will be blank-padded,
and if it is longer, it will be truncated.
.PP
If you are concording a series of short poems,
each starting with line 1, type them into separate files,
and use the \-r option to reset the linenumber to 1
at the beginning of each new file.
If you resume concording in the middle of your text,
you can set the line number with the \-l option.
If your text is already numbered or identified,
with a system that is not entirely arithmetic,
such as by hemistich or by double lines,
you can print your custom id field by using the \-s option.
This will skip over n characters of your lefthand id field
embedded in the text, and print it as an id field,
after the (\-f) filename, but before the (\-l) linenumber.
When you also want to suppress linenumbering, use the \-x option.
.PP
If you are working with a foreign language,
and need to use normal punctuation marks as diacritical marks,
you can change the default punctuation set with the \-d option.
Just type the punctuation marks you want into a file,
on a single line with no embedded spaces,
and specify the filename after the \-d in your command line.
If you have cedillas or umlauts, you can represent them
as a `+' character after the accented letter.
Use the `+' option of \fIkwal\fP, and filter your output through
either the \fIcedilla\fP or \fIumlaut\fP program.
.PP
After generating the concordance,
it should be alphabetized using the Unix \fIsort\fP program.
Keywords should be grouped and counted with the \fIformat\fP program,
and the final results can be sent to the lineprinter.
Here is a typical program sequence for generating a concordance:
.nf
 % kwal poem* | sort | format | lpr
.fi
Usually, it is better to send the results of \fIformat\fP
to a file, where they can be examined and edited,
before sending the file to the lineprinter.
.SH "SEE ALSO"
format(hum), kwic(hum), maxwd(hum), maxln(hum), sort(1)
.SH LIMITATIONS
Lines of text cannot be longer than 512 characters.
Linenumbers cannot exceed 999999 without skewing the output format.
Most lineprinters will not print entries longer than 132 characters.
.SH AUTHOR
Bill Tuthill
.SH BUGS
@@@ Fin de man/kwal.man
echo man/kwic.man
cat >man/kwic.man <<'@@@ Fin de man/kwic.man'
.TH KWIC HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
kwic \- key word in context concordance
.SH SYNOPSIS
.nf
\fBkwic\fP  [ \fB\-k\fIn\fP \-m \-w\fIS\fP \-f\fIn\fP \-r \-l\fIn\fP \-p\fIn\fP \-i\fIc\fP \-c\fIn\fP \-d\fIF\fR + \- ]  filename ...
\-kn: keyword is n characters long (defaults to 15)
\-m : keywords not mapped from upper to lower case
\-wS: write string S onto id field (use quotes around blanks)
\-fn: filename (up to n characters) written onto id field
\-r : reset linenumber to 1 at beginning of every file
\-ln: line numbering begins with line n (instead of 1)
\-pn: page numbering begins with page n (instead of 1)
\-ic: page incrementer is character c (defaults to =)
\-cn: context is n characters long (defaults to 50)
\-dF: define punctuation set according to file F
\(pl : the + character indicates cedilla or umlaut
\(mi : read text from standard input (terminal or pipe)
.fi
.SH DESCRIPTION
\fIKwic\fP is a text concordance program,
generally for use with prose,
although it is often used for poetry.
Normally, it prints a left-hand keyword,
a 6 digit linenumber or 6 place pagenumber
(depending on how you want to label your text),
and a context of 50 characters, centered around the keyword.
Words are separated at their natural boundaries,
and adjustment is made for backspaces.
Newline characters are printed as "/",
and tabs are printed as a single blank.
If you want to have a space after the newline "/",
use the pad option of \fItprep\fP to insert a space
at the beginning of each line in your text.
The following characters are considered to be
punctuation marks:  ,.;:-"?!()[]{}  but all other
non-alphabetic characters can be part of a word.
These punctuation characters can be changed.
.PP
By default, only the first 15 characters
of the keyword are printed, followed by a vertical bar;
longer keywords are truncated.
If you want more or less than 15 characters in the keyword,
use the \-k option to lengthen or shorten it.
To find the longest word in your text,
use the \fImaxwd\fP program, and set \-k accordingly.
Keywords are mapped to lower case to ease the logistics of sorting,
unless the \-m option is specified.
.PP
The \-w argument allows you to write an id field
(such as the name of an author or work) after the keyword.
If you want to include any blanks,
enclose the entire string in quotes: \-w"Prose Edda".
The \-f argument allows you to write the current filename,
up to a number of characters you specify.
If the filename is shorter, it will be blank-padded,
and if it is longer, it will be truncated.
.PP
If the program encounters the character "=",
which, by default, indicates pagination,
it will count pages as well as line numbers.
Line numbers will print as: ``\ 12469'',
while page numbers will print as: ``178,12''.
If you are concording a series of short poems,
each starting with line 1, type them into separate files,
and use the \-r option to reset the linenumber to 1
at the beginning of each new file.
If you resume concording in the middle of your text,
you can set the line number with the \-l option,
or the page number with the \-p option.  
If you want to indicate pagination,
make sure that you begin your text with ``=1'',
on a line of its own, to indicate the first page.
When a new chapter starts at the top of the page,
be sure to set \-p to the previous page.
The page indicator can be changed with the \-i option;
\-i% will change it to a percent sign, for instance.
.PP
If you are sending output to the lineprinter,
the context width can be increased with the \-c argument;
\-c110, for instance, will give you about 55 characters
on either side of the keyword in context.
Note that the lineprinter can print only 132 characters per line,
so add up your field widths carefully.
.PP
If you are working with a foreign language,
and need to use normal punctuation marks as diacritical marks,
you can change the default punctuation set with the \-d option.
Just type the punctuation marks you want into a file,
on a single line with no embedded spaces,
and specify the filename after the \-d in your command line.
If you have cedillas or umlauts, you can represent them
as a `+' character after the accented letter.
Use the `+' option of \fIkwic\fP, and filter your output through
either the \fIcedilla\fP or \fIumlaut\fP program.
.PP
After generating the concordance,
it should be alphabetized using the Unix \fIsort\fP program.
Keywords should be grouped and counted with the \fIformat\fP program,
and the final results can be sent to the lineprinter.
Here is a typical program sequence for generating a concordance:
.nf
 % kwic \-c110 chapter* | sort | format | lpr
.fi
Usually, it is better to send the results of \fIformat\fP
to a file, where they can be examined and edited,
before sending the file to the lineprinter.
.SH FILES
A temporary file, /tmp/KwicXXXXX,
is created if \fIkwic\fP has to work with standard input,
because seeking can only be done with files.
.SH "SEE ALSO"
format(hum), kwal(hum), maxwd(hum), tprep(hum), sort(1)
.SH LIMITATIONS
Words cannot be longer than 512 characters,
nor can the first half of the context.
Linenumbers cannot exceed 999999 and pagenumbers 
cannot exceed 999,99 without skewing the output format.
Most lineprinters will not print entries longer than 132 characters,
and the CAT/4 typesetter cannot handle lines longer than 7.54 inches.
.SH AUTHOR
Bill Tuthill
.SH BUGS
If there are lots of backspaces in the text,
the context width is somewhat shortened.
Using a wheel-like data structure might be more efficient
than using disk seeks and reads to output the contexts.
@@@ Fin de man/kwic.man
echo man/lno.man
cat >man/lno.man <<'@@@ Fin de man/lno.man'
.TH LNO HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
lno \- line number, double line number, hemistich number 
.SH SYNOPSIS
.nf
\fBlno\fP  [ \fB+n  \-d  \-h  \-s\fIn\fR  \- ]  filename ...
+n : the beginning line number is n, not 1
\-d : double line number text with long lines
\-h : hemistich number text with split lines
\-sn: number and letter strophes of n lines
\-  : read standard input instead of files
.fi
.SH DESCRIPTION
\fILno\fP line numbers a text, starting at 1 (one) and going up.
If you are beginning in the middle of a text,
the initial line number can be specified after the + option.
.PP
With the \-d option, \fIlno\fP numbers a text with long (Germanic) lines,
which are generally labelled in editions as double lines.
It starts at 1 (one), or at the specified line number,
and goes up in increments of two at the end of each line.
.PP
With the \-h option, \fIlno\fP numbers a text with hemistichs, or half lines.
It starts at 1 (one), unless another beginning number is specified. 
The first line is labelled 1a, the second 1b,
the third 2a, the fourth 2b, and so forth.
.PP
The \-s option can be used to specify the number of lines in a strophe.
For example, \-s4 will produce 1a, 1b, 1c, 1d, 2a, and so on.
The \-h option is identical to the \-s2 argument.
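.PP
The strophe arithmetic can be sketched with \fIawk\fP:
the number is the line index divided by the strophe length,
and the letter is the remainder.
The name \fIstrophe_number\fP is hypothetical,
the layout of number, letter, blank, and text is an assumption,
and the sketch always starts at 1 and handles strophes of at most 26 lines.
.nf
```shell
# Hypothetical sketch, not the hum implementation: label lines 1a, 1b, ...
# for strophes of n lines, where n is the first argument.
strophe_number() {
    awk -v n="$1" '{
        num = int((NR - 1) / n) + 1
        letter = substr("abcdefghijklmnopqrstuvwxyz", (NR - 1) % n + 1, 1)
        print num letter, $0
    }'
}
```
.fi
Called with an argument of 2, the sketch gives the hemistich labels 1a, 1b, 2a, 2b.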
.PP
With the \-d option, if you specify an even beginning number,
all the following double line numbers will be even.
With the \-h option, all line pairs have a number postfixed
with "a" and then "b", so if you want to begin with a "b",
put an empty line in your text, to be labelled "a".
.SH "SEE ALSO"
num(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
@@@ Fin de man/lno.man
echo man/maxwd.man
cat >man/maxwd.man <<'@@@ Fin de man/maxwd.man'
.TH MAXWD HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
maxwd \- locate, measure and print longest word (or line)
.SH SYNOPSIS
.nf
\fBmaxwd\fP  [ \fB\-l  \-d\fIF\fR  \- ]  filename ...
\-l: look for longest line instead of longest word
\-dF: define punctuation set according to file F
\- : read standard input instead of files
.fi
.SH DESCRIPTION
\fIMaxwd\fP reads through a set of files,
or standard input if specified,
looking for the longest word.
After reading is finished, the filename and linenumber
of the longest word are printed,
with the length of that word.
On the next line, the longest word is printed verbatim.
.PP
With the \-l option, \fImaxwd\fP will look for the longest line,
and print its filename, linenumber, and length.
Similarly, the next line will contain the longest line, verbatim.
.PP
If several files are concatenated
and sent through a pipe to \fImaxwd\fP,
the filename will appear as "Stdin" and line numbering will
continue to increment across file boundaries.
.PP
\fIMaxwd\fP should be used before concording a text
with \fIkwic\fP or \fIkwal\fP,
in order to determine what keyword length you should specify.
If you are working with foreign languages, the \-d option
can be used to split words at the proper place;
the punctuation file is compatible with many other related programs.
.SH "SEE ALSO"
kwal(hum), kwic(hum)
.SH AUTHOR
Bill Tuthill
.SH BUGS
\fIMaxwd\fP will truncate words longer than 512 characters, and
\fImaxwd \-l\fP will truncate lines longer than 1024 characters.
@@@ Fin de man/maxwd.man
echo man/pair.man
cat >man/pair.man <<'@@@ Fin de man/pair.man'
.TH PAIR HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
pair \- set two files side by side (or merge lines)
.SH SYNOPSIS
.nf
\fBpair\fP  [ \fB\-m\fP ]  file1  [ \- ]  file2  [ +\fIlen1\fP  [ +\fIlen2\fP ] ]
\-m: merge (intercalate) files line by line
\- : read standard input instead of files
len1 and len2 denote screen width of file1 and file2
.fi
.SH DESCRIPTION
\fIPair\fP is a program for looking at two parallel texts,
in order to compare and contrast them.
By default, \fIpair\fP sets them side by side,
but with the \-m option, it shuffles them together.
This utility is useful for examining manuscript variations.
It will accept standard input rather than a file,
if a dash is used in place of the filename.
Output can be redirected if desired.
.PP
By default, \fIpair\fP prints two 40-character wide columns of text,
which gives equal space to each text, and fills up the screen.
The third and fourth arguments can be used to change
the column width for the first and second files, respectively.
For example, if your first file is composed of numbers
but your second file contains text with occasional long lines,
specify something like:
.nf
 %  pair  file1  file2  +10  +70
.fi
If you have long lines and would rather have
lines from each text on separate lines, use the \-m option.
.PP
\fIPair\fP can be used for comparing textual variants.
It is especially useful for making two texts parallel
before analyzing the variants with \fIdiff\fP or \fIdiff3\fP.
\fIDiff\fP compares two files, while \fIdiff3\fP compares three at a time.
The results from these programs will be more usable
if the texts are parallel before they are analyzed.
.SH "SEE ALSO"
diff(1), diff3(1), pr(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
When output is redirected and input is being taken from the terminal,
it is impossible to tell what is coming from the input file.
@@@ Fin de man/pair.man
echo man/pause.man
cat >man/pause.man <<'@@@ Fin de man/pause.man'
.TH PAUSE HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
pause \- stop terminal output to change type ball
.SH SYNOPSIS
\fBpause\fP  [ filename ... ]
.SH DESCRIPTION
\fIPause\fP will stop terminal output when it encounters a control-p
embedded in the text it is reading, and resume output
when a control-d is typed on the terminal keyboard.
Other than that, \fIpause\fP acts much like the Unix utility \fIcat\fP.
It is intended for use on a Selectric terminal with an IBM ball,
or on a DTC or IPSI terminal with a Diablo printwheel.
.PP
If your text proceeds in one language,
and then changes to another for a quote,
just put a ctrl-p in your text between sections.
The terminal will pause until you change the printing device,
and when you are ready to continue,
you can type ctrl-d on the terminal.
.SH "SEE ALSO"
tosel(hum), cat(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
Control characters embedded in the text can affect the lineprinter,
\fInroff\fP and \fItroff\fP, and many other programs.
So do not be indiscriminate with your use of control-p.
@@@ Fin de man/pause.man
echo man/revconc.man
cat >man/revconc.man <<'@@@ Fin de man/revconc.man'
.TH REVCONC HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
revconc \- reverse concordance module
.SH SYNOPSIS
\fBrevconc\fP  [ filename ... ]
.SH DESCRIPTION
\fIRevconc\fP reverses the first word on each line,
which in a concordance is, conveniently, the keyword.
This program is intended to be a module to create a reverse concordance.
Words will be alphabetized from the end to the beginning,
rather than from the beginning to the end, as is usual.
The results can be used to examine word endings and inflections.
.PP
It should be used in a pipeline that also includes
\fIkwic\fP or \fIkwal\fP, \fIsort\fP, and \fIformat\fP.
Here is a suggested command sequence:
.nf
 % kwic filename(s) | revconc | sort | revconc | format
.fi
It must be used twice,
or else the word will appear backwards
in the final version.
The first invocation of \fIrevconc\fP reverses
the keyword, so that \fIsort\fP operates from the back to the front,
while the second invocation restores normal order to the word.
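.PP
The double use of \fIrevconc\fP can be tried out with standard tools.
What follows is only a sketch, not the distributed program:
the function name \fIrevfirst\fP is hypothetical,
and it assumes that the keyword is everything before the first blank on a line.
.nf
```shell
# revfirst: hypothetical stand-in for revconc, not the distributed program.
# Reverses the text before the first blank on each line; the rest of
# the line is passed through unchanged.
revfirst() {
    awk '{
        n = index($0, " ")          # position of first blank, 0 if none
        word = (n > 0) ? substr($0, 1, n - 1) : $0
        rest = (n > 0) ? substr($0, n) : ""
        rev = ""
        for (i = length(word); i > 0; i--)
            rev = rev substr(word, i, 1)
        print rev rest
    }' "$@"
}
```
.fi
With this function defined, a pipeline such as
``revfirst | sort | revfirst'' should order lines by word ending
and then restore each keyword to its normal spelling.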
.PP
Many published concordances contain a Reverse List of Graphic Forms;
\fIrevconc\fP can be used for this purpose, but the Unix utility \fIrev\fP
would probably be faster.
Here is a suggested command sequence
for making a Reverse List of Graphic Forms:
.nf
 % prep filename(s) | rev | sort \-u | rev
.fi
The results can be put into columns with the Unix utility \fIpr\fP.
.SH "SEE ALSO"
format(hum), kwal(hum), kwic(hum), pr(1), rev(1), sort(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
It is not possible to make a reverse concordance using context, 
rather than line number, as the secondary sort field.
@@@ Fin de man/revconc.man
echo man/sfind.man
cat >man/sfind.man <<'@@@ Fin de man/sfind.man'
.TH SFIND HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
sfind \- find sentence matching a pattern
.SH SYNOPSIS
.nf
\fBsfind\fP  [ \fB\-s\fIc\fP \-l\fIn\fP \-p\fIn\fP \-i\fIc\fP \-r\fR ]  'pattern'  [ \- ]  filename ...
\-sc: record separator set to c (or empty line with no c)
\-ln: line number is set to n (instead of 1)
\-pn: page number is set to n (default off)
\-ic: page incrementing character is c (not =)
\-r : reset linenumber to 1 with each new file
\-  : read standard input instead of files
.fi
.SH DESCRIPTION
\fISfind\fP is a rewrite of the Unix utility \fIgrep\fP,
oriented towards sentences rather than towards lines.
It is useful for finding words and syntactic patterns
in their full linguistic context.
If the pattern is longer than one word,
or if it contains magic shell characters,
it must be enclosed in quotes.
You can specify multiple filenames,
and \fIsfind\fP will search through them in order.
If there is a match, it will print the current filename,
the line number where the sentence begins,
the page number if relevant, and the pattern, all on a single line.
This information will be followed by the sentence
exactly as it appears in the text.
.PP
The pattern wildcard character `_' (underscore) matches
any single character; it is similar to the `.' (period) in \fIgrep\fP,
or the `?' (question mark) in the shell.
The wildcard character `*' (asterisk) matches
any number of characters in your text until the pattern continues;
it is exactly like the `*' wildcard in the shell.
It is also similar, but not identical, to the `*' in \fIgrep\fP,
which matches zero or more repetitions of the previous character.
To find an actual underscore or asterisk,
precede these metacharacters with a backslash.
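.PP
The correspondence between these wildcards and those of \fIgrep\fP
can be sketched as a small shell function.
The name \fIsfind2grep\fP is hypothetical and not part of this package;
the sketch ignores backslash escapes and the other \fIgrep\fP metacharacters,
and \fIgrep\fP itself still matches lines rather than sentences.
.nf
```shell
# sfind2grep: hypothetical helper, not part of the hum package.
# Rewrites an sfind-style pattern as a grep basic regular expression:
#   a literal period becomes [.]
#   the * wildcard becomes .*
#   the _ wildcard becomes .
# Backslash-escaped wildcards are NOT handled in this sketch.
sfind2grep() {
    printf '%s' "$1" | sed -e 's/[.]/[.]/g' -e 's/[*]/.*/g' -e 's/_/./g'
}
```
.fi
For instance, the result of ``sfind2grep 'th_ *ing' '' could be passed
to \fIgrep\fP as its pattern argument.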
.PP
If you begin searching in the middle of a text,
you can set the beginning line number (or page number)
with the \-l (or \-p) option.
For compatibility with the page incrementing feature of \fIkwic\fP,
\fIsfind\fP will count pages if it encounters `=' (equals) in the text.
The incrementing character can be changed with the \-i option.
If you want to reset the linenumber to 1
at the beginning of each new file, use the \-r option.
.PP
The \-s option is for use with databases
where records are separated by a record separator.
This character can be specified after the \-s,
and the program will operate a record at a time,
rather than a sentence at a time.
If the record separator is a magic shell character,
it will have to be quoted or escaped with a backslash.
A \-s alone indicates that records are separated by a blank line,
as are records in \fIrefer\fP bibliographies.
The \-s option resembles the \-F option of \fIawk\fP,
except that it sets the record separator rather than the field separator.
.SH "SEE ALSO"
kwic(hum), awk(1), grep(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
There is no equivalent in \fIsfind\fP to the
[...] and [^...] metacharacters of \fIgrep\fP.
These would be extremely helpful.
@@@ Fin de man/sfind.man
echo man/skel.man
cat >man/skel.man <<'@@@ Fin de man/skel.man'
.TH SKEL HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
skel \- prompt user with database skeleton
.SH SYNOPSIS
\fBskel\fP  outfile
.SH DESCRIPTION
\fISkel\fP reads from a ``promptfile'' containing a skeleton outline
of subjects in a database, prompts the user for data,
and writes the outline and the data to the ``outfile''.
The promptfile must be named exactly ``promptfile''
and reside in the working directory.
Outfiles cannot be overwritten, to protect vital information.
.PP
It is possible to escape to a system editor,
from which you can easily correct mistakes,
by giving a ``tilde escape'' on the data line.
The tilde must be the first character on the input line.
In the distributed program, ~v will escape to \fIvi\fP,
and ~e will escape to \fIex\fP;
both editors are part of 2bsd and 4bsd
(Berkeley Software Distribution).
If you don't have these editors,
simply change the code and recompile the program
so it will work with \fIed\fP, or your own favorite editor.
.SH FILES
promptfile \- file containing database skeleton
.SH "SEE ALSO"
ex(1), vi(1), sfind(hum)
.SH AUTHOR
Bill Tuthill
.SH BUGS
@@@ Fin de man/skel.man
echo man/tolpr.man
cat >man/tolpr.man <<'@@@ Fin de man/tolpr.man'
.TH TOLPR HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
tolpr \- shift output for the lineprinter
.SH SYNOPSIS
\fBtolpr\fP  [ \fB\-2\fP ]  [ \fB\-h\fP  "Header" ]  [ \fB\-s\fP ]  [ filename ... ]
.SH DESCRIPTION
\fITolpr\fP adds a tab at the beginning of every line,
which moves your text away from the holes and used-up ribbon.
If the first line in a file is non-blank,
\fItolpr\fP also prints page numbers,
inserts three-line header and footer margins,
and saves widows for the top of the next page.
If the first line in a file is blank,
\fItolpr\fP only shifts output to the right,
on the assumption that the file is already paginated.
Consequently, it can be used equally well
with \fInroff\fP, \fIpr\fP, and concordances.
.PP
The \-2 option will cause output to be double spaced;
\-3 will cause triple spacing, and so forth.
This is a substitute for the .ls 2 of \fInroff/troff,\fP
or the .nr VS 24 of the \-ms macros.
The \-h option is used to print a header at the top of each page;
it only works if pagination is in effect.
The \-s flag suppresses the shifting to the right.
.SH "SEE ALSO"
nroff(1), pr(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
Sometimes the first line in a file is blank,
but the file is not pre-paginated;
if this occurs, delete the blank line.
@@@ Fin de man/tolpr.man
echo man/tosel.man
cat >man/tosel.man <<'@@@ Fin de man/tosel.man'
.TH TOSEL HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
tosel \- convert English for Selectric terminal
.SH SYNOPSIS
\fBtosel\fP  [ filename ... ]
.SH DESCRIPTION
\fITosel\fP is intended for use with the Anderson-Jacobson
terminal in the Humanities Computing Service.
It will convert Unix files to character strings
that print out properly when a regular 
typewriter ball is used in the AJ-841,
instead of the EBCDIC ball normally used in the machine.
.PP
The \fItosel\fP program works much like the Unix utility \fIcat\fP.
That is to say, it can be used to print out one or more files,
or as a filter in a series of programs communicating by pipes.
It can be used before or after \fIpause\fP,
since it does nothing to Control-P.
.SH "SEE ALSO"
pause(hum), cat(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
The following characters, since they do not exist on a
standard typewriter ball, produce garbage output:
.nf
 <  >  |  \e  ^
.fi
The circumflex character produces a blank space.  These 
five characters are not rendered accurately:
.nf
 [  {  ]  }  `
.fi
They produce, in order, these five similar characters:
.nf
 (  (  )  )  '
.fi
Of course, various IBM balls will differ,
and will cause further program bugs.
The \fItosel\fP program was written for
the IBM "Pica 72" 10-pitch ball, but will probably work 
perfectly for any ball that has the characters `!' and `1',
and 1/2 and 1/4.
@@@ Fin de man/tosel.man
echo man/tprep.man
cat >man/tprep.man <<'@@@ Fin de man/tprep.man'
.TH TPREP HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
tprep \- prepare text for concordance (trim, pad, or unpad)
.SH SYNOPSIS
.nf
\fBtprep\fP  [ \fB\-y  \-tpu\fP ]  filename ...
\-y: say yes and suppress interactive prompting
\-t: trim lines, removing trailing blanks and tabs
\-p: pad, inserting blank at beginning of each line
\-u: unpad, deleting blank at beginning of each line
.fi
.SH DESCRIPTION
\fITprep\fP is a semi-interactive text editor
with specific application to preparing text for concordances.
It is much faster than \fIsed\fP,
and will work on far larger files than \fIex\fP or \fIed\fP.
It provides limited facilities:
trimming of trailing blanks or tabs, and padding and unpadding.
.PP
When typing in a text, it is practically impossible to avoid
accidental spaces at the end of lines.
These spurious blanks throw off the results of character counting,
and are unsightly in a \fIkwic\fP-style concordance.
Also, before compiling a \fIkwic\fP concordance,
you may want to pad each line with a blank,
so that the slash indicating newline
is not followed too closely by the next word.
After finishing the concordance, the padding can be removed,
using the unpad option.
.PP
If you do not specify any options in the command line,
you are prompted to make sure you want to rewrite your files.
Then you are asked whether you want to use trim, pad or unpad.
You can answer either with the full word,
or with the first letter of these three words.
\fITprep\fP also tells what files it is rewriting,
and reports on the scope of the changes involved for each file.
.SH FILES
\fITprep\fP makes changes to a file, and writes the results to
/tmp/Prep?????; this file is then copied back on top of
the original file.
.SH "SEE ALSO"
kwic(hum), ex(1), sed(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
Once rewriting has begun, it cannot be stopped:
interrupts are disabled,
since vital information could otherwise be lost forever.
Interrupts should probably halt the process after the next overwrite.
@@@ Fin de man/tprep.man
echo man/troffmt.man
cat >man/troffmt.man <<'@@@ Fin de man/troffmt.man'
.TH TROFFMT HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
troffmt \- format concordance for typesetter
.SH SYNOPSIS
.nf
\fBtroffmt\fP  [ \fB\-ckm\fP ]  [ filename ... ]  [ \- ]
\-c: suppress counting of keyword frequency
\-k: entirely suppress printing of keyword
\-m: do not supply concordance macros automatically
\- : read standard input instead of files
.fi
.SH DESCRIPTION
\fITroffmt\fP is a preprocessor for \fItroff\fP that replaces \fIformat\fP
when using the phototypesetter instead of the lineprinter.
It builds its own macros, so it does not require the \-ms package.
.PP
Keyword counting can be suppressed by using the \-c option;
this will speed up the program somewhat.
To completely suppress printing of a separate keyword, use the \-k option.
.PP
Here is a typical program sequence for a concordance,
suitable for sending to the typesetter:
.nf
 % kwic \-f5 \-c80 filename(s) | sort | troffmt | troff \-Q
.fi
The \-c80 argument to \fIkwic\fP creates a context suitable for the typesetter.
Anything larger may result in lines too long for the typesetter.
If there is no \-f or \-w option, \-c85 would be safe;
with long \-f or \-w options, adjust \-c accordingly.
.SH FILES
\fITroffmt\fP depends on /usr/lib/me/chars.me,
or /usr/lib/mx/tmac.xacc, for accent mark definitions.
.SH "SEE ALSO"
format(hum), kwal(hum), kwic(hum), sort(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
On systems without either \-me or \-mx, accent marks are undefined.
The \-k option silently overrides the \-c option.
The \-m flag does not have the same meaning
as the \-m flag in \fIformat.\fP
@@@ Fin de man/troffmt.man
echo man/wdlen.man
cat >man/wdlen.man <<'@@@ Fin de man/wdlen.man'
.TH WDLEN HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
wdlen \- tabulate word lengths and print histogram
.SH SYNOPSIS
.nf
\fBwdlen\fP  [ \fB\-l  \-d\fIPfile\fB  \-\fR ]  filename ...
\-l: print long histogram suitable for lineprinter
\-d: define punctuation set according to Pfile
\- : read standard input instead of files
.fi
.SH DESCRIPTION
\fIWdlen\fP reads through a text,
tabulating the frequencies of various word lengths.
Then it prints out these frequencies,
along with a horizontal bar graph of word length.
Word length distribution is one of many stylistic traits
that can be analyzed in a linguistic corpus.
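.PP
The tabulation can be approximated with \fIawk\fP.
This is a minimal sketch under stated assumptions, not the distributed program:
the function name \fIlengths\fP is hypothetical,
words are taken to be whatever is separated by blanks
(so attached punctuation counts toward the length),
and each bar is drawn with one dash per occurrence, without any scaling.
.nf
```shell
# lengths: hypothetical stand-in for wdlen, not the distributed program.
# Prints each word length, its frequency, and an unscaled bar of dashes.
lengths() {
    awk '{
        for (i = 1; i <= NF; i++) {
            len = length($i)
            freq[len]++
            if (len > max) max = len
        }
    }
    END {
        for (len = 1; len <= max; len++) {
            bar = ""
            for (k = 0; k < freq[len]; k++) bar = bar "-"
            print len, freq[len] + 0, bar
        }
    }' "$@"
}
```
.fi
It can be given filenames, or used at the end of a pipeline
to read standard input.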
.PP
If there are a great many words in your text,
the dashes in the bar graph do not have a one-to-one
correspondence with the frequency count,
but are calculated so that the longest bar fills up the screen.
The length of the bar can be extended with the \-l option.
.PP
If you are working with foreign languages, the \-d option
can be used to split words at the proper place;
the ``Pfile'' is compatible with many other related programs.
.SH "SEE ALSO"
cfreq(hum), freq(hum), maxwd(hum)
.SH AUTHOR
Bill Tuthill
.SH BUGS
Words longer than 20 characters are not considered.
@@@ Fin de man/wdlen.man
echo man/wheel.man
cat >man/wheel.man <<'@@@ Fin de man/wheel.man'
.TH WHEEL HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
wheel \- roll through text a word cluster at a time
.SH SYNOPSIS
.nf
\fBwheel\fP  [ \fB+n  \-m  \-d\fIF\fR  \- ]  filename ...
+n: print clusters of n words (default 2)
\-m: do not map upper case to lower case
\-d: define punctuation set according to file F
\- : read standard input instead of files
.fi
.SH DESCRIPTION
To analyze syntactic clusters, you can roll
\fIwheel\fP through your text, several words at a time.
The second word of the initial cluster will become
the first word of the following cluster, and so forth.
By default, each output line contains a two-word cluster,
but with the + option, you can specify any cluster size up to 20.
The \-m option prevents mapping of words to lower case,
and the \-d option can be used to specify non-standard punctuation.
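.PP
The rolling of clusters can be illustrated with \fIawk\fP.
This is only a sketch, not the distributed program:
the function name \fIclusters\fP is hypothetical,
the cluster size is given as a plain first argument rather than with +,
clusters are assumed to continue across line breaks,
and neither case mapping nor punctuation handling is attempted.
.nf
```shell
# clusters: hypothetical stand-in for wheel, not the distributed program.
# Usage: clusters n [ filename ... ]
# Prints every run of n consecutive blank-separated words, one per line.
clusters() {
    n=$1; shift
    awk -v n="$n" '{
        for (i = 1; i <= NF; i++) {
            count++
            word[count] = $i
            if (count >= n) {
                line = word[count - n + 1]
                for (j = count - n + 2; j <= count; j++)
                    line = line " " word[j]
                print line
                delete word[count - n + 1]   # oldest word is no longer needed
            }
        }
    }' "$@"
}
```
.fi
Its output can be passed to \fIsort\fP and \fIuniq\fP
in the same way as the output of \fIwheel\fP.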
.PP
After all the word clusters in your text have been extracted,
they can be sorted and counted to find repeated patterns.
Here is an example of a command line to accomplish this:
.nf
 % wheel +3 text | sort | uniq \-c
.fi
Of course, \fIsort\fP can be applied to any field desired;
``sort +2'' refers to the third word on each line.
It would be good to analyze syntactic clusters
of two, three, four, and possibly more words apiece.
British scholars use the cumbersome term ``collocation''
to mean word cluster.
.SH "SEE ALSO"
dissolve(hum), freq(hum), sort(1), uniq(1)
.SH AUTHOR
Bill Tuthill
.SH BUGS
@@@ Fin de man/wheel.man
echo man/xref.man
cat >man/xref.man <<'@@@ Fin de man/xref.man'
.TH XREF HUM (rev3.7)
.ds ]W UC Berkeley
.SH NAME
xref \- cross reference generator
.SH SYNOPSIS
.nf
\fBxref\fP  [ \fB\-r \-l\fIn\fP \-p\fIn\fP \-i\fIc\fP \-w\fIn\fP \-d\fIF\|\fR \- ]  filename ...
\-r : reset linenumber to 1 at beginning of every file
\-ln: line numbering begins with line n (instead of 1)
\-pn: page numbering begins with page n (instead of 1)
\-ic: page incrementer is character c (defaults to =)
\-wn: width of output page is n (defaults to 80)
\-d : define punctuation set according to file F
\-  : read text from standard input (terminal or pipe)
.fi
.SH DESCRIPTION
\fIXref\fP is a cross reference generator
that lists all distinct words in a text,
with the line number (or page number) where they appear.
Its output constitutes a simple word index,
without the labels or context quoting provided by \fIkwic\fP or \fIkwal\fP.
If you want your concordance to give merely the location 
of certain common words, without any context,
you may want to use selected output of \fIxref\fP.
.PP
If you are cross referencing a number of short texts,
you can reset the linenumber to 1 with the \-r option.
Line number and page number can be set with the \-l and \-p options.
The default pagination character is the equals sign;
if you have another page indicator, it can be set with the \-i option.
In case your text has equals signs that do not indicate a new page,
you could use the \-i option without a character afterwards,
and page labelling will not occur.
.PP
\fIXref\fP will also read a user-definable punctuation set
from the file specified after the \-d option.
It can also read from standard input.
Most importantly, the output width can be set with the \-w option.
For example, to send a cross reference index to the lineprinter,
a \-w130 is recommended.
The default page width is 80,
which is appropriate for a CRT terminal or for regular paper.
.SH FILES
A text is broken into words labelled by line number or page number,
and then sent to a tempfile, /tmp/RefXXXXX,
where the results are sorted before final formatting.
This file is removed in case of interrupt.
.SH "SEE ALSO"
kwal(hum), kwic(hum)
.SH AUTHOR
Bill Tuthill
.SH BUGS
In the tempfile, words are separated
from line numbers (or page numbers) by a control-b,
so if you have this character anywhere in your text,
you will get strange results.
@@@ Fin de man/xref.man
exit 0