[comp.text.tex] word count on latex document

ascott@gara.une.oz.au (Tony Scott STPG) (01/30/91)

 Hi there,
	I need to count the number of words in a document that is
built up of latex commands. Is there something available that will
count words without taking into account words beginning with a
backslash or those within a group such as
\begin{picture}...\end{picture} etc.

Thanks for your help.
Tony Scott

rda@cogsci.ed.ac.uk (Robert Dale) (01/31/91)

ascott@gara.une.oz.au (Tony Scott STPG) writes:

>I need to count the number of words in a document that is
>built up of latex commands. Is there something available that will
>count words without taking into account words beginning with a
>backslash or those within a group such as
>\begin{picture}...\end{picture} etc.

I've often wished for such a thing, but it has always seemed to me
that this really ought to have been part of [La]TeX's functionality:
after all, only [La]TeX knows how many words were *really* output --
remember you have to distinguish between \begin{picture} and
\underline{some words}, and to be able to handle arbitrary text
substitution via macros.

I'd also be interested to know of any solutions -- perhaps doing the
word count on the output of dvi2ps might be easier (only 0.5 :-)?

R

-- 
Robert Dale        Phone: +44 31 650 4416       | University of Edinburgh
UUCP:   ...!uunet!mcvax!ukc!its63b!cogsci!rda   | Centre for Cognitive Science
ARPA:   rda%cogsci.ed.ac.uk@nsfnet-relay.ac.uk  | 2 Buccleuch Place
JANET:  rda@uk.ac.ed.cogsci or R.Dale@uk.ac.ed  | Edinburgh EH8 9LW Scotland

rodgers@clausius.mmwb.ucsf.edu (02/02/91)

In <3635@scott.ed.ac.uk> rda@cogsci.ed.ac.uk (Robert Dale) writes:
>ascott@gara.une.oz.au (Tony Scott STPG) writes:
>>I need to count the number of words in a document that is
>>built up of latex commands. Is there something available that will
>I've often wished for such a thing, ...
>I'd also be interested to know of any solutions -- perhaps doing the
>word count on the output of dvi2ps might be easier (only 0.5 :-)?

How about using the public-domain program detex (at the Stanford archive,
I believe), followed by wc?
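
With the real tools this would be something like `detex paper.tex | wc -w`
(paper.tex is a made-up file name). For readers without detex, here is a
rough Python sketch of the same idea: strip the markup, then count what is
left. It is not detex and makes no claim to match its output; the function
name count_words and the SKIP_ENVS list are assumptions for illustration.

```python
import re

# Environments whose contents should not count as prose (an assumed list).
SKIP_ENVS = ("picture", "equation", "verbatim")


def count_words(latex_source: str) -> int:
    """Very rough word count for LaTeX source; not a replacement for detex."""
    text = latex_source
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", text)
    # Drop whole environments such as \begin{picture}...\end{picture}.
    for env in SKIP_ENVS:
        pattern = r"\\begin\{" + env + r"\}.*?\\end\{" + env + r"\}"
        text = re.sub(pattern, " ", text, flags=re.DOTALL)
    # Drop any remaining \begin{...} and \end{...} markers with their names.
    text = re.sub(r"\\(?:begin|end)\{[^}]*\}", " ", text)
    # Drop control sequences (\emph, \\, \%), keeping braced argument text.
    text = re.sub(r"\\(?:[A-Za-z]+\*?|[^A-Za-z])?", " ", text)
    # Braces and other markup characters are not words.
    text = re.sub(r"[{}\[\]$&~^_]", " ", text)
    return len(text.split())


if __name__ == "__main__":
    sample = r"Some \underline{important words} here."
    print(count_words(sample))
```

Note the limits: arguments of commands like \label or \cite are counted as
words, and user-defined macros are not expanded at all.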

Cheerio, Rick Rodgers
R. P. C. Rodgers, M.D.         (415)476-2957 (work) 664-0560 (home)
UCSF Laurel Heights Campus     UUCP: ...ucbvax.berkeley.edu!cca.ucsf.edu!rodgers
3333 California St., Suite 102 Internet: rodgers@maxwell.mmwb.ucsf.edu
San Francisco CA 94118 USA     BITNET: rodgers@ucsfcca

spqr@ecs.soton.ac.uk (Sebastian Rahtz) (02/02/91)

In article <3635@scott.ed.ac.uk> rda@cogsci.ed.ac.uk (Robert Dale) writes:

   I've often wished for such a thing, but it has always seemed to me
   that this really ought to have been part of [La]TeX's functionality:
   after all, only [La]TeX knows how many words were *really* output --

LaTeX has no idea how many words were output. What is a `word'? LaTeX
outputs a description of a page composed of a series of glyphs. LaTeX
is not a word-processor!

The existing detex and delatex programs do a good enough job for
spelling checkers, so I assume word counts would work OK.

sebastian
--
Sebastian Rahtz                        S.Rahtz@uk.ac.soton.ecs (JANET)
Computer Science                       S.Rahtz@ecs.soton.ac.uk (Bitnet)
Southampton S09 5NH, UK                S.Rahtz@sot-ecs.uucp    (uucp)

rda@cogsci.ed.ac.uk (Robert Dale) (02/03/91)

spqr@ecs.soton.ac.uk (Sebastian Rahtz) writes:

>In article <3635@scott.ed.ac.uk> rda@cogsci.ed.ac.uk (Robert Dale) writes:
>   I've often wished for such a thing, but it has always seemed to me
>   that this really ought to have been part of [La]TeX's functionality:
>   after all, only [La]TeX knows how many words were *really* output --
>LaTeX has no idea how many words were output. What is a `word'? LaTeX
>outputs a description of a page composed of a series of glyphs. LaTeX
>is not a word-processor!
>
>The existing detex and delatex programs do a good enough job for
>spelling checkers, so I assume word counts would work OK.

The version of detex we run is, of course, useful, but doesn't address
the problems that I mentioned in my response: the [La]TeX source file is
the wrong place to do a word count, since the number of "words" that
appears in the source file need bear no relation to the number of
"words" that will appear in the text.  Running a word count on the
output of a dvivdu would be better.
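
To make the mismatch concrete, here is a toy Python sketch. It assumes a
made-up macro \ccs defined with an argument-less \newcommand, and it uses
hypothetical helpers rough_count and expand_simple_macros; real [La]TeX
macro expansion is far more general than this.

```python
import re


def rough_count(text: str) -> int:
    """Count words after deleting control sequences and braces."""
    text = re.sub(r"\\[A-Za-z]+", " ", text)
    text = re.sub(r"[{}]", " ", text)
    return len(text.split())


def expand_simple_macros(source: str) -> str:
    r"""Expand argument-less \newcommand{\name}{body} definitions (toy model)."""
    definition = re.compile(r"\\newcommand\{\\([A-Za-z]+)\}\{([^{}]*)\}")
    macros = dict(definition.findall(source))
    # Remove the definitions themselves; they produce no output.
    body = definition.sub("", source)
    for name, replacement in macros.items():
        # A function replacement avoids backslash-escape handling in re.sub.
        body = re.sub(r"\\" + name + r"(?![A-Za-z])",
                      lambda _m, r=replacement: r, body)
    return body


source = r"""
\newcommand{\ccs}{Centre for Cognitive Science}
The \ccs{} is in Edinburgh; the \ccs{} has an address.
"""

print("source count:  ", rough_count(source))
print("expanded count:", rough_count(expand_simple_macros(source)))
```

The source-level count sees the macro body once, in the definition, and
nothing at either point of use; the expanded text contains it twice. So
the two counts differ, and the gap grows with every use of the macro.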

R