[comp.lang.postscript] Pageview problem

brown@vidiot.UUCP (Vidiot) (03/30/91)

In article <JJC.91Mar28210438@jclark.UUCP> jjc@jclark.UUCP (James Clark) writes:
<
<I have had several people report this problem to me.  Apparently
<pageview gets confused by lines after the first line that begin with
<`%!'.  Note that such lines are allowed by the Document Structuring
<Conventions, so this is a bug in pageview.  If you compile groff
<(actually just ps/ps.c) with -DBROKEN_SPOOLER, it will, amongst other
<things, strip out any such lines.  In fact the Makefile explicitly
<says to use -DBROKEN_SPOOLER if you're going to be using pageview.
<(although it gave an incorrect reason for this in versions earlier
<than 1.01).

See my previous posting for cures for this problem; what James points
out here is one of them.

But, I have to agree with the logic at Sun regarding how pageview works.
As pointed out in my previous posting, pageview considers the next instance
of %! to mean that another PostScript document has been included and to
ignore all comments until the next %%Trailer, which better be part of the
included document.  Then it will look at the comments again.  The idea
here, as I gather it, is that the included document is part of the page
that it is trying to display.  The included document should only really
be one page and part of the page it was included with.
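
A rough sketch of the heuristic as described above may help show why a
stray %! hides pages.  It is reconstructed only from this description and
is not pageview's actual code; the function name and the sample documents
are hypothetical.

```python
def pages_seen_by_heuristic(lines):
    """Return the %%Page comments noticed by the described heuristic.

    Assumption: any line after the first that starts with '%!' switches
    comment processing off, and the next %%Trailer switches it back on.
    """
    ignoring = False
    pages = []
    for index, line in enumerate(lines):
        if index > 0 and line.startswith("%!"):
            ignoring = True          # treated as the start of an included file
        elif line.startswith("%%Trailer"):
            ignoring = False         # treated as the end of the included file
        elif not ignoring and line.startswith("%%Page:"):
            pages.append(line.rstrip())
    return pages


# Hypothetical output with a second '%!' line but no included document.
stray_magic = [
    "%!PS-Adobe-3.0",
    "%!",
    "%%Page: 1 1",
    "showpage",
    "%%Page: 2 2",
    "showpage",
    "%%Trailer",
]

# Hypothetical document that really does include a one-page file.
with_include = [
    "%!PS-Adobe-3.0",
    "%%Page: 1 1",
    "%!PS-Adobe-2.0 EPSF-2.0",
    "%%Page: 1 1",
    "%%Trailer",
    "showpage",
    "%%Page: 2 2",
    "showpage",
    "%%Trailer",
]
```

In the first sample every %%Page comment sits between the stray %! and
the %%Trailer, so the heuristic finds no page boundaries at all; in the
second it finds only the two outer pages.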

Well, groff kind-of messes up that logic.  By adding the %! line again,
pageview thought that another PostScript file was included and rightfully
displayed everything until the %%Trailer came along.  In this case, there
wasn't an included document.  I agree with the person responsible for the
pageview code.  If the offending %! is edited from the groff document, the
document is displayed correctly, one page at a time.  With the
-DBROKEN_SPOOLER flag turned on, it too displays correctly.
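
Editing out the offending line by hand can be approximated with a small
filter.  This is only a sketch of the idea of dropping every line after
the first that begins with %!; it is not the code in groff's ps/ps.c, and
the function name is made up for illustration.

```python
def strip_extra_magic_lines(lines):
    """Drop every line after the first one that starts with '%!'."""
    kept = []
    for index, line in enumerate(lines):
        if index > 0 and line.startswith("%!"):
            continue  # a later '%!' line: the kind that confuses pageview
        kept.append(line)
    return kept


# Hypothetical input with one extra '%!' line.
sample = ["%!PS-Adobe-3.0", "%%Pages: 1", "%!", "showpage"]
cleaned = strip_extra_magic_lines(sample)
```

The first line is always kept, since that is the only place where %! is
significant.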

I feel that groff should not put in the extra %! line at all.  It is the
only thing that fools pageview, no matter what the state of -DBROKEN_SPOOLER.
Either PostScript output will display correctly, except as noted.

I suspect that James will disagree with me, but it is what I believe.
-- 
      harvard\     att!nicmad\          spool.cs.wisc.edu!astroatc!vidiot!brown
Vidiot  ucbvax!uwvax..........!astroatc!vidiot!brown
      rutgers/  decvax!nicmad/ INTERNET:vidiot!brown%astroatc@spool.cs.wisc.edu

glenn@heaven.woodside.ca.us (Glenn Reid) (03/31/91)

Vidiot writes
> But, I have to agree with the logic at Sun regarding how pageview works.
> As pointed out in my previous posting, pageview considers the next instance
> of %! to mean that another PostScript document has been included and to
> ignore all comments until the next %%Trailer, which better be part of the
> included document.  Then it will look at the comments again.  This idea
> here, as I gather it, is that the included document is part of the page
> that it is trying to display.  The included document should only really
> be one page and part of the page it was included with.
> 
> Well, groff kind-of messes up that logic.  By adding the %! line again,
> pageview thought that another PostScript file was included and rightfully
> displayed everything until the %%Trailer came along.  In this case, there
> wasn't an included document.  I agree with the person responsible for the
> pageview code.  If the offending %! is edited from the groff document, the
> document is displayed correctly, one page at a time.  With the
> -DBROKEN_SPOOLER flag turned on, it too displays correctly.
> 
> I feel that groff should not put in the extra %! line at all.  It is the
> only thing that fools pageview, no matter what the state of -DBROKEN_SPOOLER.
> Either PostScript output will display correctly, except as noted.

There is a specific convention for dealing with embedded PS files, and
looking for %! is not it.  After all, %! means very little, in that it is
not even claiming to be a conforming document.  In addition, the %%Trailer
comment is optional, and code may follow the trailer in any case.  It is
certainly a bad idea to do what pageview is described as doing.

The comments %%BeginDocument and %%EndDocument were created for this
purpose; a spooler could safely skip matching pairs of these comments,
and the comments between them, when scanning a document.  Looking for
%! and %%Trailer is simply not reliable, and in fact would almost never
work, it seems to me.
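
As an illustration of skipping matching pairs, a scanner could keep a
nesting depth and only honor %%Page comments at depth zero.  This is a
minimal sketch, not any real spooler's code; the function name and the
sample document are hypothetical, and only the comment prefixes are
checked.

```python
def find_page_comments(lines):
    """Return (line_number, text) for %%Page comments outside embedded files.

    Everything between a %%BeginDocument comment and its matching
    %%EndDocument comment is skipped, including nested pairs.
    """
    depth = 0
    pages = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("%%BeginDocument"):
            depth += 1
        elif line.startswith("%%EndDocument"):
            depth = max(depth - 1, 0)   # tolerate an unmatched %%EndDocument
        elif depth == 0 and line.startswith("%%Page:"):
            pages.append((number, line.rstrip()))
    return pages


# Hypothetical two-page document with one properly bracketed included file.
document = [
    "%!PS-Adobe-3.0",
    "%%Pages: 2",
    "%%Page: 1 1",
    "%%BeginDocument: fig.eps",
    "%!PS-Adobe-2.0 EPSF-2.0",
    "%%Page: 1 1",
    "showpage",
    "%%Trailer",
    "%%EndDocument",
    "showpage",
    "%%Page: 2 2",
    "showpage",
    "%%Trailer",
]
outer_pages = find_page_comments(document)
```

Neither the inner %! nor the inner %%Trailer matters to this scan; only
the bracketing comments decide what is skipped.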

The BROKEN_SPOOLER constant is useful, although perhaps it should be
BROKEN_PREVIEWER.

Remember, %! just means that the file is PostScript, not that it represents
a document or anything else.  It can appear an arbitrary number of times
within a document.  It is only significant as the very first two characters
in a file.

--
 Glenn Reid				RightBrain Software
 glenn@heaven.woodside.ca.us		NeXT/PostScript developers
 ..{adobe,next}!heaven!glenn		415-851-1785 (fax 851-1470)

brown@vidiot.UUCP (Vidiot) (03/31/91)

In article <464@heaven.woodside.ca.us> glenn@heaven.woodside.ca.us (Glenn Reid) writes:
[...]
<Remember, %! just means that the file is PostScript, not that it represents
<a document or anything else.  It can appear an arbitrary number of times
<within a document.  It is only significant as the very first two characters
<in a file.

I stand corrected.  Thanks for setting me straight.  Now, what is the new
pageview going to do?
-- 
      harvard\     att!nicmad\          spool.cs.wisc.edu!astroatc!vidiot!brown
Vidiot  ucbvax!uwvax..........!astroatc!vidiot!brown
      rutgers/  decvax!nicmad/ INTERNET:vidiot!brown%astroatc@spool.cs.wisc.edu

naughton@wind.Eng.Sun.COM (Patrick Naughton) (04/07/91)

In article <464@heaven.woodside.ca.us>, glenn@heaven.woodside.ca.us (Glenn Reid) writes:
|> 
|> There is a specific convention for dealing with embedded PS files, and
|> looking for %! is not it.  After all, %! means very little, in that it is
|> not even claiming to be a conforming document.  In addition, the %%Trailer
|> comment is optional, and code may follow the trailer in any case.  It is
|> certainly a bad idea to do what pageview is described as doing.
|> 
|> The comments %%BeginDocument and %%EndDocument were created for this
|> purpose; a spooler could safely skip matching pairs of these comments,
|> and the comments between them, when scanning a document.  Looking for
|> %! and %%Trailer is simply not reliable, and in fact would almost never
|> work, it seems to me.

I was motivated to look for %! when I found a large number of
non-conforming files constructed of many (possibly conforming) files
concatenated together; as well as properly formatted EPSF documents
inserted into a document *without* the %%Begin/EndDocument pair
surrounding them.  I needed to ignore the %%Page (which was also
unnecessarily included) in the EPSF document and the only means at my
disposal was to ignore all comments between %! and %%Trailer (the
comments which were in the "EPSF" documents...)  Since then pageview is
more strict about the first line matching the "%!PS-Adobe-#.# EPSF-#.#"
format.
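
A stricter first-line test of that kind might look like the sketch below.
The regular expression is an assumption that reads each "#" in the format
as one or more digits; it is not taken from pageview's source, and the
names are hypothetical.

```python
import re

# Assumed reading of "%!PS-Adobe-#.# EPSF-#.#": each '#' is a run of digits.
EPSF_HEADER = re.compile(r"^%!PS-Adobe-\d+\.\d+ EPSF-\d+\.\d+\s*$")


def looks_like_epsf_header(line):
    """Return True if the line matches the full EPSF header format."""
    return EPSF_HEADER.match(line) is not None
```

Under this test a bare %! line, or a plain "%!PS-Adobe-3.0" header with
no EPSF part, would not be taken as the start of an included EPSF file.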

Basically, the root of all the problems that ANY previewer faces in trying
to parse arbitrarily mangled "PostScript" files is that the PostScript
language was under-specified in the first place.  Using *comments* to
describe such important features of the language as *scope* and *flow
control* is totally ludicrous! (especially in a non-serialized,
non-laserwriter-like, real world scenario)

The failure to completely define the language has given developers
enough rope to produce volumes of VERY BAD "PostScript".  Writing a
previewer which only previews documents which conform to the latest
revision of the "commenting conventions" is almost useless...  Writing
one which ignores the comments and relies on showpage being called to
find pagebreaks is also useless for people who want to read the 300th
page of a document without wading through the whole thing.  Writers of
previewers are forced to write heuristics and hacks to interpret the
myriad of garbage that many people call PostScript simply because it
produces a printed document on an Apple LaserWriter!

|> 
|> The BROKEN_SPOOLER constant is useful, although perhaps it should be
|> BROKEN_PREVIEWER.

useful comment there glenn, thanks for the input...

|> 
|> Remember, %! just means that the file is PostScript, not that it represents
|> a document or anything else.  It can appear an arbitrary number of times
|> within a document.  It is only significant as the very first two characters
|> in a file.

-- 
    ______________________________________________________________________
    Patrick J. Naughton				   email: naughton@sun.com
    Sun Laboratories				   voice: (415) 336 - 1080

naughton@wind.Eng.Sun.COM (Patrick Naughton) (04/07/91)

In article <1565@vidiot.UUCP>, brown@vidiot.UUCP (Vidiot) writes:
|> I stand corrected.  Thanks for setting me straight.  Now, what is the new
|> pageview going to do?

It will parse the %%BeginDocument/%%EndDocument comments if they are present.

-- 
    ______________________________________________________________________
    Patrick J. Naughton				   email: naughton@sun.com
    Sun Laboratories				   voice: (415) 336 - 1080