[comp.lang.postscript] Postscript to ASCII filter?

tneff@bfmny0.BFM.COM (Tom Neff) (01/30/91)

In article <9101301059.AA29311@lilac.berkeley.edu> lwv27@CAS.BITNET writes:
>I know this is going to sound strange.  But I would like to be able
>to turn a PostScript document into a plain text file.

I suspect you could hack Ghostscript to do this by redefining "show".
But read on...
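
To make the "show" idea concrete without touching an interpreter, here is
a rough Python sketch that scans PostScript source and collects every
string literal that is immediately followed by the show operator.  This
is NOT the Ghostscript hack itself, just a crude stand-in.  The names
read_string and extract_shown_strings are made up for illustration, and
the sketch assumes the file calls "show" literally.  Many generators wrap
show in short prologue procedures or use variants such as ashow and
widthshow, and this scanner will miss all of those.

```python
import re

# Single-character escapes defined for PostScript string literals.
_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
            "\\": "\\", "(": "(", ")": ")"}
_OCTAL = re.compile(r"[0-7]{1,3}")
# A name or number token; a leading "/" marks a literal name such as /show.
_TOKEN = re.compile(r"/?[^\s()<>\[\]{}/%]+")


def read_string(ps, i):
    """Parse the string literal starting at ps[i] == '('.

    Returns (text, index just past the closing parenthesis).
    Balanced parentheses inside the literal are kept as text.
    """
    depth, out = 1, []
    i += 1
    while i < len(ps):
        c = ps[i]
        if c == "\\":
            m = _OCTAL.match(ps, i + 1)
            if m:                                  # \ddd octal character code
                out.append(chr(int(m.group(), 8) & 0xFF))
                i = m.end()
            else:
                nxt = ps[i + 1:i + 2]
                if nxt != "\n":                    # backslash-newline: continuation
                    out.append(_ESCAPES.get(nxt, nxt))
                i += 2
            continue
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth == 0:
                return "".join(out), i + 1
        out.append(c)
        i += 1
    raise ValueError("unterminated string literal")


def extract_shown_strings(ps):
    """Return the strings that directly precede a literal `show`."""
    shown, last, i = [], None, 0
    while i < len(ps):
        c = ps[i]
        if c == "%":                               # comment runs to end of line
            nl = ps.find("\n", i)
            i = len(ps) if nl < 0 else nl
        elif c == "(":
            last, i = read_string(ps, i)
        elif c.isspace():
            i += 1
        else:
            m = _TOKEN.match(ps, i)
            if m and m.group() == "show" and last is not None:
                shown.append(last)
            last = None                            # any other token breaks the pair
            i = m.end() if m else i + 1
    return shown


sample = r"""%!PS
72 700 moveto (Post) show
95 700 moveto (Script \(tm\)) show
(unused) pop
"""
print(extract_shown_strings(sample))   # ['Post', 'Script (tm)']
```

Note that the output is a list of fragments in file order, which is
exactly the "pieces of words" problem described in the quoted text.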

>                                                        The reason is
>that I would like to run it thru a spelling checker - and I have yet
>to find one which can interpret most PostScript mangled texts.  For instance,
>most words in such documents do not even appear as words, but as pieces
>of words positioned on the page...

The problem here is what you do AFTER you've spell checked the text
file!  You need to rebuild the PostScript afterwards if what you
wanted was a spell-checked, printer-ready document.  So, if you have
something that lets you take text and reformat it as PostScript, why not
do your spell checking in the text itself BEFORE creating any PS?

Because, after all -- PostScript positions the text very precisely where
it's supposed to go to make a nicely formatted document.  ESPECIALLY in
the documents Larry mentions above where words themselves are split!  If
you plow through that kind of PostScript changing c/thub/thumb/ and
c/pinochio/Pinocchio/, you will mess up the layout.

Despite these logistical problems, a text filter would be nice to have,
bearing in mind that the order in which text is painted on the page is
entirely arbitrary and need not match reading order.
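
A rough Python sketch of how such a filter might cope with arbitrary
paint order: given text fragments with their page positions, group them
into lines by baseline, sort each line left to right, and glue together
fragments that touch.  Everything here is an assumption for illustration:
the Fragment record, the fragments_to_text name, and both thresholds are
invented, and the sketch presumes some interpreter hook has already
reported each fragment's position and advance width.  It also ignores
rotated text, multiple columns, and font size changes.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    x: float      # left edge, in points
    y: float      # baseline, in points (PostScript y grows upward)
    width: float  # advance width of the fragment, in points
    text: str


def fragments_to_text(frags, line_tol=2.0, space_gap=1.5):
    """Rebuild reading order from fragments painted in any order.

    line_tol:  baselines within this many points count as one line.
    space_gap: a horizontal gap wider than this becomes a space.
    Both values are guesses, not derived from any font metrics.
    """
    lines = []  # each entry is [baseline_y, [fragments on that line]]
    for f in sorted(frags, key=lambda f: -f.y):        # top of page first
        if lines and abs(lines[-1][0] - f.y) <= line_tol:
            lines[-1][1].append(f)
        else:
            lines.append([f.y, [f]])

    out = []
    for _, row in lines:
        row.sort(key=lambda f: f.x)                    # left to right
        parts = [row[0].text]
        for prev, cur in zip(row, row[1:]):
            gap = cur.x - (prev.x + prev.width)
            parts.append((" " if gap > space_gap else "") + cur.text)
        out.append("".join(parts))
    return "\n".join(out)


# Fragments deliberately listed out of reading order.
page = [
    Fragment(120, 700, 8, "mb"),
    Fragment(72, 686, 30, "second"),
    Fragment(72, 700, 25, "rule"),
    Fragment(100, 700, 20, "thu"),
]
print(fragments_to_text(page))
# rule thumb
# second
```

The result is good enough to feed a spelling checker, but it carries no
layout information, so it cannot be turned back into the original page.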

-- 
'The Nazis have no sense of humor, so why   -|  Tom Neff
should they want television?' -- Phil Dick  |-  tneff@bfmny0.BFM.COM

lwv27@CAS.BITNET (01/30/91)

I know this is going to sound strange.  But I would like to be able
to turn a PostScript document into a plain text file.  The reason is
that I would like to run it thru a spelling checker - and I have yet
to find one which can interpret most PostScript mangled texts.  For instance,
most words in such documents do not even appear as words, but as pieces
of words positioned on the page...

any ideas?
--
Larry W. Virden                 UUCP: osu-cis!chemabs!lwv27
Same Mbox: BITNET: lwv27@cas    INET: lwv27%cas.BITNET@CUNYVM.CUNY.Edu
Personal: 674 Falls Place,   Reynoldsburg,OH 43068-1614
America Online: lvirden