ifw@frc.UUCP (Ian West) (05/01/88)
I would like to obtain text analysing tools similar to UNIX STYLE and
DICTION but operating under MSDOS in ibm-pc. Any assistance appreciated.
If necessary prepared to spend some dollars
--
Ian F. West,                             tel + 64 4 861 029
Scientist (Mathematical Statistics),     telex MAFFCC NZ 30049
Fisheries Research Centre, P.O.Box 297,  usenet ...vuwcomp!frc!ifw
Wellington,                              usenet(domainised) ifw@greta.maf.govt.nz
NEW ZEALAND.
(dis)organisation :Fisheries Group, New Zealand Ministry of Agriculture & Fisheries. (MAFFISH)
bd@hpsemc.HP.COM (bob desinger) (05/10/88)
Ian West (ifw@frc.UUCP) writes:
> I would like to obtain text analysing tools similar to UNIX STYLE and
> DICTION but operating under MSDOS in ibm-pc. Any assistance appreciated.
> If necessary prepared to spend some dollars

I have some public-domain code written by Gary Perlman that graphically
displays your text. The graphical display works on any ASCII terminal
(it's a series of underscores and punctuation; see the man page below).
It highlights the complicated passages of your document so you can
rewrite them. The tools don't compute Flesch Indexes or anything, but
then again the numbers that `style' spits out are often meaningless to
mere mortals anyway.

I can send you the sources---they are free---or post them to
comp.sources.misc or somewhere. Spend your money on a good C compiler
for your PC. These sources are written for Unix, but if your run-time
library is pretty good it may not be much effort to port them.

bob desinger    uunet!hpda!hpsemc!bd    bd%hpsemc@hplabs.HP.COM

P.S. Gary, are you on the net these days?

P.P.S. Here are the man pages for `punc' and `headings'. A third
program, `abstract,' ties together the output of these two programs
but it doesn't have a man page.


PUNC(1)                                                          PUNC(1)

NAME
     punc - graphically display sentences using their punctuation

SYNOPSIS
     punc [-lmpsw] [-c criterion] [-] [files]

DESCRIPTION
     punc prints graphical representations of sentences. This
     graphical representation has the properties that the
     representation is long when the sentence is long, and the
     representation looks complex when the sentence is complex. The
     program works by displaying text, one sentence per line, with
     embedded punctuation retained, and underscores substituted for
     words. For example, the previous two sentences of this man entry
     look like:

          ________________,__________.
          ______,____,____,_____.

OPTIONS
     -c length
          Print only those sentences with "punc" lengths greater than
          the criterion.

     -l   Print the line numbers of the text where the sentences
          begin.
     -m   Map words to different classes represented by characters.
          Upper case words are shown as the ^ character.

               &    conjunctions (and, but, ...)
               |    disjunctions (or)
               #    numbers (first, one, ...)
               ~    negations (not, never, ...)
               "    pronouns (he, she, ...)
               w    who, what, where, when, why, ...
               t    a, the, that, those, ...

          This set of words is incomplete.

     -p   Print the sentences after the graphical representation.

     -s   Print the sentence numbers before the graphical
          representation.

     -w   Print the length of words instead of underscores for words.
          Words longer than 10 characters are printed as *, and ten
          character words are printed as 0.

SEE ALSO
     headings(1) for a high-level representation of a paper.

AUTHORS
     Tom Erickson and Gary Perlman

BUGS
     The way the program identifies the end of a sentence is too
     simple and it can be fooled badly. Sentences must end at the end
     of lines. Nroff macros are not handled intelligently by the
     program; deroff does a better but not perfect job and should be
     used as a preprocessor.


HEADINGS(1)                                                  HEADINGS(1)

NAME
     headings - show headings from nroff source file

SYNOPSIS
     headings
     Usage: headings [-cflns] [-h header] [-p para] [-P mark]
                     [-m min] [-M max] [-] [file]

DESCRIPTION
     Headings is used to create tables of contents and outlines of
     papers based on the nroff headings macros in the text.
     Subheadings are indented below their superheadings to show the
     structure of the paper.

OPTIONS
     -n   Number sections according to their order and depth.

     -l   Line numbers from the input are printed along with the
          headings.

     -c   Characters read from the input are printed along with the
          headings.

     -m N The minimum section level is taken to be N. Sections of
          level less than N will be shown, but not indented nor
          numbered.

     -M N The maximum section level is taken to be N. Sections of
          level greater than N will not appear in the output.

     -h XX
          The "next" heading macro is taken to be XX, where XX is a
          one or two letter macro name.
          It is assumed that the level of each successive heading
          macro is one greater than the previous (see the macros
          example below). If no macros are specified, the -me
          standard of ".sh" is assumed. If only one macro is
          supplied, a numerical argument indicating the section level
          is assumed to follow the call to the macro. This is used as
          an index of how much to indent the headings. This option
          must be the last in a sequence started with a flag. If
          different section macros are used for different levels, a
          new flag argument must be added.

EXAMPLES
     The -mcsl macros at the UCSD Cognitive Science Lab are based on
     the APA standard headings: hh (high), mh (main), lh (left), and
     ph (paragraph). The call to headings for these macros would look
     like:

          headings -h hh -h mh -h lh -h ph file...

     To get a standard memorandum macro (-mm) table of contents (this
     includes numbering and headings for level 1 and 2 only), use:

          headings -n -M 2 -h H file...

SEE ALSO
     wwb(1), org(1)

AUTHOR
     Gary Perlman
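
The single-macro case in HEADINGS(1) above (a numeric level argument
following the macro, as in the -mm example with -n -M 2 -h H) can be
sketched the same way. This is a rough Python illustration under
stated assumptions, not the real program: the function name outline,
its parameters, the four-space indent, and the "1.2 Title" numbering
format are all invented for the example, and only lines of the form
`.H level "title"` are recognised.

```python
def outline(lines, macro="H", max_level=None, number=False, indent="    "):
    """Return indented headings found in nroff-style source lines.

    Hypothetical helper: it expects '.<macro> <level> <title>' and
    ignores everything else.
    """
    prefix = "." + macro
    counters = []  # counters[i] is the running section count at depth i + 1
    result = []
    for line in lines:
        fields = line.split(None, 2)
        if len(fields) < 3 or fields[0] != prefix or not fields[1].isdigit():
            continue
        level = int(fields[1])
        if level < 1:
            continue
        title = fields[2].strip().strip('"')
        # Entering a section resets the counters of all deeper levels.
        del counters[level:]
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1
        # Like -M: headings deeper than max_level are counted but not shown.
        if max_level is not None and level > max_level:
            continue
        label = ".".join(str(c) for c in counters) + " " if number else ""
        result.append(indent * (level - 1) + label + title)
    return result


if __name__ == "__main__":
    source = [
        '.H 1 "Introduction"',
        "Some body text.",
        '.H 2 "Background"',
        '.H 3 "Details"',
        '.H 2 "Scope"',
        '.H 1 "Method"',
    ]
    for heading in outline(source, max_level=2, number=True):
        print(heading)
```

Calling outline(source, max_level=2, number=True) plays the role of
`headings -n -M 2 -h H file` in the example above; the several-macros
form (-h hh -h mh ...) is left out to keep the sketch short.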