nagar@netcom.COM ( Nagar) (05/26/91)
I am looking for a PostScript-to-text converter. Is there such a program available through FTP from simtel20 or some other site?

I would appreciate an e-mail.

Thanks
mathew@jane.Jpl.Nasa.Gov (Mathew Yeates) (05/27/91)
In article <1991May26.063129.26177@netcom.COM> nagar@netcom.COM ( Nagar) writes:
>I am looking for a postscript to
>text converter, is there such a
>program available through
>ftp from simtel20 or some other
>site?
>
>I would appreciate a e_mail..
>
>thanks

I too am interested, and dubious that such a thing exists.
gtoal@tardis.computer-science.edinburgh.ac.uk (05/27/91)
In article <1991May26.181915.14910@elroy.jpl.nasa.gov> mathew@jane.Jpl.Nasa.Gov (Mathew Yeates) writes:
>In article <1991May26.063129.26177@netcom.COM> nagar@netcom.COM ( Nagar) writes:
>>I am looking for a postscript to
>>text converter, is there such a
>>program available through
>>ftp from simtel20 or some other
>>site?
>>
>>I would appreciate a e_mail..
>>
>>thanks
>
>I too am interested, and dubious that such a thing exists.

This is going to sound silly, but the best way of getting what you want is to print out your PostScript and scan it back in!

If you haven't got a scanner, get a copy of Ghostscript and output to some bitmap form which can be read in by one of the PD OCR packages -- cut out the middle man :-)

If you're a real hacker, get the Ghostscript sources and hack them to output any text to a data structure instead of the bitmap, and do an x-y sort on your data structure. Modulo superscripts and subscripts, you might have a chance of reconstructing lines.

Graham

PS Don't mail me asking where to find Ghostscript or OCR software - I don't know...
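The x-y sort suggested above can be sketched roughly as follows. This is only an illustration, not code from any Ghostscript hack: it assumes the text has already been captured as (x, y, string) fragments in PostScript coordinates (y increasing up the page), and the names `fragments_to_lines` and `line_tolerance` are made up for this example. The tolerance is a crude way of keeping slightly raised or lowered fragments, such as superscripts and subscripts, on the same output line.

```python
from typing import List, Tuple

# One captured piece of text: (x, y, string), in PostScript points.
# y grows upward, so larger y means higher on the page.
Fragment = Tuple[float, float, str]


def fragments_to_lines(fragments: List[Fragment],
                       line_tolerance: float = 3.0) -> List[str]:
    """Group fragments into text lines using an x-y sort.

    Fragments whose y is within `line_tolerance` points of the first
    fragment of the current line are treated as part of that line.
    """
    # Sort top to bottom first (descending y), then left to right.
    ordered = sorted(fragments, key=lambda f: (-f[1], f[0]))

    lines = []  # each entry: (y of first fragment, [(x, text), ...])
    for x, y, text in ordered:
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append((y, [(x, text)]))

    # Within each line, order by x and join the pieces with spaces.
    return [" ".join(t for _, t in sorted(parts, key=lambda p: p[0]))
            for _, parts in lines]


if __name__ == "__main__":
    sample = [(110, 700, "world"), (72, 686, "second"),
              (72, 700, "Hello"), (120, 688, "line")]
    for line in fragments_to_lines(sample):
        print(line)
```

Joining with a single space throws away the horizontal spacing; a fuller version would use the x values to decide how much white space to emit.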
clewis@ferret.ocunix.on.ca (Chris Lewis) (05/28/91)
In article <9105262212.AA29690@ucbvax.Berkeley.EDU> gtoal@tardis.computer-science.edinburgh.ac.uk writes:
>In article <1991May26.181915.14910@elroy.jpl.nasa.gov> mathew@jane.Jpl.Nasa.Gov (Mathew Yeates) writes:
>>In article <1991May26.063129.26177@netcom.COM> nagar@netcom.COM ( Nagar) writes:
>>>I am looking for a postscript to
>>>text converter, is there such a
>>>program available through
>>>ftp from simtel20 or some other
>>>site?

>This is going to sound silly, but the best way of getting what you
>want is to print out your postscript and scan it back in!

Only if you have a scan-to-text converter rather than simply a raster reader.

>If you're a real hacker, get the Ghostscript sources and hack them
>to output any text to a data structure instead of the bitmap, and
>do an x-y sort on your data structure. Modulo superscripts and
>subscripts, you might have a chance of reconstructing lines.

You can do this without Ghostscript. I've taken the output of various text processors and reconstructed an ASCII version using perl (this is also doable in awk). You need to search for the (x,y) coordinate settings, translate these into row and column positions, and then "drop" the strings enclosed in parentheses at that position.

The hard cases are PostScript that contains reverse line motion (which requires you to buffer a whole page) and point sizes that vary a lot.

Of course, this approach won't handle graphics and other stuff, but as long as your scanner is reasonably accurate in snagging only the (x,y) and text display commands, it'll work well enough.

If you're familiar with awk or perl, you can usually whomp one of these things up in about an hour. Sorry I didn't save the one I did for someone else on the net.
--
Chris Lewis, Phone: (613) 832-0541, Domain: clewis@ferret.ocunix.on.ca
UUCP: ...!cunews!latour!ecicrl!clewis; Ferret Mailing List:
ferret-request@eci386; Psroff (not Adobe Transcript) enquiries:
psroff-request@eci386 or Canada 416-832-0541. Psroff 3.0 in c.s.u soon!
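For readers who want to try the coordinate-scanning approach described above, here is a rough sketch in Python rather than perl or awk. It is not the script mentioned in the post, which was not saved. It assumes the file positions text with a literal `x y moveto (string) show` sequence; real text processors often hide these operators behind their own abbreviations, so the pattern would need adjusting per generator. The page height (792 points), character cell width (7.2 points) and line height (12 points) are guesses for fixed-pitch output on a US Letter page, and the function name `ps_to_text` is invented for the example.

```python
import re

# Matches:  <x> <y> moveto (<string>) show
# The string part allows backslash-escaped characters such as \( and \).
MOVE_SHOW = re.compile(
    r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s+moveto\s*"
    r"\(((?:[^()\\]|\\.)*)\)\s*show"
)


def ps_to_text(ps: str, page_height: float = 792.0,
               char_width: float = 7.2, line_height: float = 12.0) -> str:
    """Drop each shown string onto a row/column grid for one page."""
    grid = {}  # row number -> list of characters on that row
    for m in MOVE_SHOW.finditer(ps):
        x, y = float(m.group(1)), float(m.group(2))
        # Simplified unescaping: strips the backslash only, so octal
        # escapes and \n are NOT handled properly.
        text = re.sub(r"\\(.)", r"\1", m.group(3))
        # PostScript y grows upward; text rows grow downward.
        row = int(round((page_height - y) / line_height))
        col = max(0, int(round(x / char_width)))
        line = grid.setdefault(row, [])
        end = col + len(text)
        if len(line) < end:
            line.extend(" " * (end - len(line)))
        line[col:end] = text
    if not grid:
        return ""
    # The whole page is buffered, so reverse line motion is harmless.
    return "\n".join("".join(grid.get(r, [])).rstrip()
                     for r in range(min(grid), max(grid) + 1))


if __name__ == "__main__":
    page = ("72 708 moveto (world) show\n"
            "72 720 moveto (Hello) show\n"
            "144 720 moveto (there) show\n")
    print(ps_to_text(page))
```

Because every string is placed into a page-sized buffer before anything is printed, out-of-order positioning does not matter. Widely varying point sizes will still misalign columns, since a single cell width is assumed for the whole page.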