dhw@iti.org (David H. West) (05/25/91)
The Subject says it all. Fully general conversion isn't necessary, luckily, since I'm only interested in the text content of the input. I thought I saw one or more of these go by a month or two ago, but I didn't need it then, and I can't remember the name(s), and a scan through the comp.sources.misc index doesn't yield any likely names.

-David West    dhw@iti.org
raymond@math.berkeley.edu (Raymond Chen) (05/25/91)
The only way to do it (short of writing your own Postscript interpreter) is to customize the parser for the Postscript file itself. Here's one I wrote that handles groff Postscript output.

#!/usr/unsupported/perl

# Skip the leading glop
while (<> !~ /^%%Page: 1 2/) { ; }

@stack = ();
$y = 0;

main: while(<>){
    chop;
    while (s/\\$//) { $_ .= <>; chop; }
    next if /^%/;
    s/\\\(/\\050/g;
    s/\\\)/\\051/g;
    while ($_) {
        s/^\s*//;#nuke leading whitespace
        if (s/^([\d.-]+)//) { # a number
            push(@stack, $1);
        } elsif (s/^\/[@_a-zA-Z-]+//) { # a literal
            push(@stack, "");
        } elsif (s/^\(([^)]*)\)//) { # a string
            push(@stack, $1);
        } elsif (s/^(\w+)//) { # a command
            $c = $1;
            if ($c eq "C") { print &spaceout($stack[2]); }
            elsif ($c eq "E") { print "~" if $stack[1] > 0; print &fixup($stack[0]); }
            elsif ($c eq "F") { print &fixup($stack[1]); }
            elsif ($c eq "F2") { ; }
            elsif ($c eq "G") { print &spaceout($stack[1]); }
            elsif ($c eq "H") { print &fixup($stack[2]); }
            elsif ($c eq "Q") { &moveshow; }
            elsif ($c eq "R") { shift(@stack); &moveshow; }
            elsif ($c eq "S") { shift(@stack); &spaceout($stack[0]); &moveshow; }
            elsif ($c eq "T") { shift(@stack); shift(@stack); &moveshow; }
            elsif ($c eq "BP") { }
            elsif ($c eq "EP") { print "\n", "-" x 40, "\n"; }
            elsif ($c eq "SF") { }
            elsif ($c eq "end") { last main; }
            else { print STDERR "\7", join(":", @stack), " <$c>?\n"; }
            @stack = ();
        } elsif (s/^<(.*)>//) { # a hex string
            $c = ""; $d = $1;
            while ($d =~ s/(..)//) { $c .= sprintf("%c", hex($1)); }
            push(@stack, $c);
        } else { print STDERR "\7How to parse $_?\n"; }
    }
}

sub moveshow {
    if ($y != $stack[2]) { $y = $stack[2]; print "\n"; }
    else { print "~"; }
    print &fixup($stack[0]);
}

sub spaceout {
    @t = split(//, $_[0]);
    $_[0] = &fixup(join(" ", @t));
}

sub fixup {
    $_[0] =~ s/\.\.\.\.[.]*/\.\.\.\./;
    $_[0] =~ s/\\(\d\d\d)/sprintf("%c",oct($1))/eg;
    $_[0] =~ s/\214/fi/g;
    $_[0] =~ s/\215/fl/g;
    $_[0];
}
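
For anyone adapting this to other output, the string handling in the script above can be tried on its own. The following is only a minimal sketch, not part of the posted script: the names decode_ps_string and decode_ps_hex are made up for illustration, and the ligature codes 214 and 215 (octal) are simply the ones the script above maps to "fi" and "fl", so they are an assumption about the font encoding in use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decode the body of a PostScript (...) string.
# Expands \ddd octal escapes, then rewrites the two ligature codes
# that the posted script treats as "fi" and "fl" (assumed encoding).
sub decode_ps_string {
    my ($s) = @_;
    $s =~ s/\\([0-7]{3})/chr(oct($1))/ge;   # \050 -> "(", \051 -> ")"
    $s =~ s/\x8C/fi/g;                      # octal 214
    $s =~ s/\x8D/fl/g;                      # octal 215
    return $s;
}

# Hypothetical helper: decode the body of a PostScript <...> hex string.
sub decode_ps_hex {
    my ($hex) = @_;
    $hex =~ s/\s+//g;                       # whitespace inside <...> is ignored
    $hex .= "0" if length($hex) % 2;        # a missing final digit counts as 0
    return join "", map { chr(hex($_)) } $hex =~ /(..)/g;
}

print decode_ps_string('con\214guration'), "\n";
print decode_ps_string('\050x\051'), "\n";
print decode_ps_hex('48656c6c6f'), "\n";
```

Unlike the posted script, this sketch pads an odd-length hex string with a trailing 0 rather than dropping the last digit; whether that matters depends on the input at hand.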