[comp.lang.postscript] ^D embedded in PS files

richk@pogo.WV.TEK.COM (Richard G. Knowles) (05/04/89)

In article <22489@ccicpg.UUCP> evans@ccicpg.UUCP ( Scott Evans) writes:
>In article <98@snll-arpagw.UUCP>, paolucci@snll-arpagw.UUCP (Sam Paolucci) writes:
>> Thus it appears to me that in a PC world where access to the serial
>> device is easy and transparent, the only option is for the application
>> program to append a ^D to the end of the PostScript output.  This
>> guarantees that the file will be handled properly whether it is
>> redirected to the serial device, or later copied to it.
>> 
>Can we help it if the PC is a "brain-dead" kind of machine and has no
>other way of telling when it gets to an "end of file"????
>
>This is purely a kludge to get around a limitation on the PC.  It is
>especially apparent when you see a ^D not only at the end of a Postscript
>file but also at the beginning!!!
>
>The postscript printer that is hooked up to our Unix system does not like the
>^D at the beginning of the file and so we have to filter all the files that
>come from a PC before they can be printed.  

Whoa there.  Even though I hate to defend the PC, your argument for having
to filter the ^D's out of PC produced PS files is misdirected.  The printer
doesn't care one whit about whether a job has a ^D at the start or not.
What is "brain-dead" is your UNIX spooler which is relying on the ^D echo to
decide when a job has finished.  A ^D at the start of a job makes the
spooler think the job has finished even before all of it has been sent.  This
is one dependency that Adobe's Transcript package has with its lpr filters
(and one which I have removed to be able to effectively use both HP-GL and
PS in the same printer through the same spooler).

BTW, a ^D at the beginning of the file is there precisely because it makes
up for those prior jobs that don't end with a ^D and would leave me with a
printer whose state quite likely is not compatible with my job (like not
leaving much VM or having redefined some operators that I use).

-------- Whatever I say is my fault and no one elses! -----------

Richard G. Knowles                        richk@pogo.WV.TEK.COM
Graphics Printing and Imaging                (503) 685-3860
Tektronix, Inc; D/S 63-356
Wilsonville, Or 97070			or just yell "Hey, Rich!"

batcheldern@hannah.dec.com (Ned Batchelder, PostScript Eng.) (05/05/89)

In article <7207@pogo.WV.TEK.COM>, richk@pogo.WV.TEK.COM (Richard G.
Knowles) states that the Unix spooler is "brain-dead" because it cares
about ^D.

How should a spooler detect the end of the job?  The entire reason the ^D
is echoed is for a spooler to be able to detect the end of the job.  ^D's
are the province of the spooler ENTIRELY.  THEY DO NOT BELONG IN FILES.
THEY ARE NOT PART OF THE POSTSCRIPT LANGUAGE.  PERIOD.  To anyone who
believes ^D is part of the PostScript language:  Show me where in the
red book it says that it is.

If no files contained ^D, and all spoolers used them, as was intended,
everything would be fine.  Richard states that the reason he likes ^D at
the beginning of a file is so that if the last file didn't end with one,
then everything is ok.  The correct solution to the problem is for your
printing software to use whatever signal is appropriate for end-of-file (it
may not be ^D) at the end of every job. Then everything will really be fine.

--Ned Batchelder, Digital Equipment Corp, BatchelderN@Hannah.DEC.com

wcs@cbnewsh.ATT.COM (Bill Stewart 201-949-0705 ho95c.att.com!wcs) (05/16/89)

In article <8905051356.AA01321@decwrl.dec.com> batcheldern@hannah.dec.com (Ned Batchelder, PostScript Eng.) writes:
]In article <7207@pogo.WV.TEK.COM>, richk@pogo.WV.TEK.COM (Richard G.Knowles)
]states that the Unix spooler is "brain-dead" because it cares about ^D.
]
]How should a spooler detect the end of the job?  [...]
]^D's are the province of the spooler ENTIRELY.  THEY DO NOT BELONG IN FILES.
]THEY ARE NOT PART OF THE POSTSCRIPT LANGUAGE.  PERIOD.  To anyone who

But *why* should a spooler care about the end of a job?  A typical
postscript program sets up some initial stuff, does some pages each ending
in showpage, and does some cleanup.  The main things a spooler does that are
job-based are preloading fonts and reversing pages, which do require support
from the language to do (if you count EPSF as part of the language, and if
not, you shouldn't try reversing pages).

So if the printer gets stuff starting with EPSF introductory comments, it
should do whatever printer-setups it needs, and process stuff until it gets
to an EPSF terminating comment.  If it gets stuff starting with anything
else, it should just keep reading data and doing a page at a time until
something interesting comes along to reset it.  I agree that ^D  is not the
ideal character to put in a file, but it's real ASCII, and should be
tolerated (albeit treated as an undefined word if nobody's used it.)

-- 
# Bill Stewart, AT&T Bell Labs 2G218 Holmdel NJ 201-949-0705 ho95c.att.com!wcs
	# also found at 201-271-4712 tarpon.att.com!wcs 
# But the treaty says we have to give Panama back to the Panamanians!
#    Don't worry - we'll think of something.  Corruption?  Yeah, that's it!

richk@pogo.WV.TEK.COM (Richard G. Knowles) (05/18/89)

In article <645@cbnewsh.ATT.COM> wcs@cbnewsh.ATT.COM (Bill Stewart 201-949-0705 ho95c.att.com!wcs) writes:
>
> But *why* should a spooler care about the end of a job?

The primary reason a spooler needs to know the actual end of a job (ie, when
the interpreter has finished parsing/executing all input tokens and is waiting
for more) is for proper resource accounting.  A secondary reason is to be
able to give the user some assurance that his job will not be affected by
someone else's (ever had someone invoke the image/colorimage command but not
supply enough data?).

I have no idea how many installations actually *use* accounting info, but
since the spoolers generally support the concept you need to be able to
detect when the job is complete so the amount of consumables can be
calculated and an appropriate charge be made to the user.  There is more
than one way to do that, such as:

  1)  adding a special message to the end of a job;
        cons: -- must wrap the job in a save/restore so as to protect the
                 message from redefinitions made by the job
                 *or* append an END-OF-JOB character/message (^D) to the job
                 followed by your special message;
              -- exitserver requests could not be allowed;
              -- user could duplicate the message and thereby avoid
                 charges;
              -- How long do you wait?;
              -- Unreliable if communication channel has any chance of
                 overrunning (quite possible on busy CPU's);

  2)  sending status requests and watching for the "waiting" or "idle" state;
        pros: -- probably should be doing status requests anyway to be able
                 to detect error conditions;
        cons: -- "waiting" is not a sufficient indicator (since waiting can
                 occur in the middle of jobs if your comm channel is slow).
                 Might have to append an END-OF-JOB (^D) to force the idle
                 state, but embedded ^D's could also confuse the issue.

I'm sure there are additional ways, but using a ^D echo is much simpler.
The spooler could be coded in such a way that ^D's are valid in jobs and
still use it itself by just detecting any embedded ^D's.  Once detected,
pass it on but hold back any data that follows until the printer echoes it
back or the status goes idle, then continue on with the rest of the data.
This way, the spooler knows that only one ^D is outstanding at any one time
and can make a fairly safe assumption about the echo it gets back from the
^D it appended to the job.
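
The hold-back scheme just described can be sketched roughly as follows.
This is only an illustration, not the code of any actual spooler: the
name send_job and the two callbacks write and wait_for_echo are invented
here, and wait_for_echo is assumed to block until the printer echoes a
^D (or reports idle).

```python
EOT = b"\x04"  # ^D


def send_job(job, write, wait_for_echo):
    """Send a job so that at most one ^D is outstanding at any time.

    write(chunk) and wait_for_echo() are hypothetical callbacks supplied
    by the caller. Returns the number of ^D's sent, counting the final
    one that terminates the job.
    """
    segments = job.split(EOT)
    # If the job already ends in ^D, the split leaves an empty tail;
    # drop it so that we do not send a second, redundant ^D.
    if segments and segments[-1] == b"":
        segments.pop()
    sent = 0
    for segment in segments:
        write(segment + EOT)   # pass the data on, ^D included
        wait_for_echo()        # hold back the rest until the echo arrives
        sent += 1
    return sent
```

With a job such as b"\x04abc\x04def" this writes three chunks (the lone
leading ^D, then "abc^D", then "def^D") and waits for an echo after each
one, so the echo for the last chunk can be taken as the end of the job.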

-------- Whatever I say is my fault and no one elses! -----------

Richard G. Knowles                        richk@pogo.WV.TEK.COM
Graphics Printing and Imaging                (503) 685-3860
Tektronix, Inc; D/S 63-356
Wilsonville, Or 97070			or just yell "Hey, Rich!"

batcheldern@hannah.dec.com (Ned Batchelder, PostScript Eng.) (05/22/89)

In article <645@cbnewsh.ATT.COM>, wcs@cbnewsh.ATT.COM (Bill Stewart) writes:
> But *why* should a spooler care about the end of a job?

In article <7279@pogo.WV.TEK.COM>, richk@pogo.WV.TEK.COM (Richard G.
Knowles), replies that accounting is a good reason.

Another good reason is that a queueing system should be able to tell when
an entry has safely left the queue.  For a printing system, that is only
when the job has successfully printed, and that can only be determined by
waiting for the ^D echo back from the printer.  

On VMS systems, you can PRINT/DELETE a file.  Users would be extremely
upset if the symbiont simply assumed that because it had sent all the data,
that the job had printed, and then deleted the file.  There are many ways
for a printer not to complete a job.

Another reason is for accurate notification of job completion.  Even if we
could assume that shipping the data to the printer guaranteed that the job
would complete, when would it complete?  By having an explicit handshake
between the spooler and the printer, the spooler knows that, in fact, the
job is completed, and if the user gets up from his desk and walks over to
the printer, he will be able to pick the job up and walk away.

Ned Batchelder, Digital Equipment Corp, BatchelderN@Hannah.DEC.com

greid@adobe.com (Glenn Reid) (05/23/89)

In article <8905221304.AA14261@decwrl.dec.com> batcheldern@hannah.dec.com (Ned Batchelder, PostScript Eng.) writes:
>Another reason is for accurate notification of job completion.  Even if we
>could assume that shipping the data to the printer guaranteed that the job
>would complete, when would it complete?  By having an explicit handshake
>between the spooler and the printer, the spooler knows that, in fact, the
>job is completed, and if the user gets up from his desk and walks over to
>the printer, he will be able to pick the job up and walk away.

This is a worthy goal, but it isn't always true, unfortunately.  The
end-of-file (it isn't always a ^D) is echoed by the printer when an EOF
is read from the input stream. Usually, this corresponds to the end of
the job, since the interpreter has finished executing all the tokens it
has received, and wants more.  However, there are several circumstances
under which a printer can consume the EOF, say "thank you" (by echoing
another EOF), but not finish processing the job for quite some time.
One of these is a job that reads data from the end of file itself, in
which case it is not the scanner that hits EOF, it is, say, the
"readhexstring" operator.  If the program then turns around to process
the data, it may be quite some time before it finishes, in an extreme
case.  It can also generate ERRORS even after the data connection has
been broken with the host, which can be difficult to deal with.  Another
instance (I think) is when the EOF follows the last token directly,
without intervening white space, as in:

	showpage^D

instead of

	showpage
	^D

The EOF then delimits the token, and is seen by the scanner while
recognizing the name.  I think the EOF is echoed then, by the scanner,
rather than after the "showpage" name has been executed, but I'm not
sure of it.  In any case, it is an implementation detail.

To echo what Ned Batchelder said, envision a scenario where you have a
2k communications buffer (very common), and you are sending a 1k file.
The whole file will fit into the buffer (including the EOF), and
theoretically you could break the connection then or send another file
on its heels.  But you may, instead, want to wait for the EOF to be
echoed back to indicate that the job is done, or to detect an error and
potentially retransmit the job instead of transmitting the next one.

One way to think of it is that it is not a printer at all, but another
computer connected through some communications protocol.  The protocol
usually has (and should have) some basic handshaking to determine when
a file has been completely received, and that's what the echoing of EOF
does.  The "spooler" is not a traditional spooler, because it is not
dealing with a dumb printer.  It is really a communications program, a
printer monitor, and lots of other things.

Glenn Reid
Adobe Systems

roy@phri.UUCP (Roy Smith) (05/23/89)

batcheldern@hannah.dec.com (Ned Batchelder, PostScript Eng.) writes:
> On VMS systems, you can PRINT/DELETE a file. [...]  By having an explicit
> handshake between the spooler and the printer, the spooler knows that, in
> fact, the job is completed, and if the user gets up from his desk and
> walks over to the printer, he will be able to pick the job up and walk
> away.

	Ned deserves lots of praise for his wonderful n-up printing work a
few years ago, but I think he missed the point by a mile on this one.  By
having an explicit handshake between the printer and the spooler process,
all the process knows is that the printer *says* the job is complete.
That's better than just shoveling the data at the printer and hoping it can
cope, but it sure isn't a firm promise that you can pick up the output and
walk away with it.  Somebody might have loaded the printer with the wrong
forms.  The printer might have jammed and not detected it.  It might have
run out of toner (our most common form of "silent failures" on print jobs).
Maybe the printer had a software failure and is lying about the job status.
Or, maybe somebody simply got to the printer first and took your output by
mistake (or even on purpose).

	The point is, I think PRINT/DELETE is a *big* misfeature to put in
a printer spooler.  Maybe it should be some hard-to-get-at option for
internal use, but not as an easily misused command line option.  For once, I
can't dump on VMS for this one; the Unix spoolers have similar options,
with exactly the same problem.  In fact, it's probably even worse on Unix.
It's a lot easier to type "-r" by mistake than "/DELETE".  I know.  I've
done it.  More than once.
-- 
Roy Smith, System Administrator
Public Health Research Institute
{allegra,philabs,cmcl2,rutgers,hombre}!phri!roy -or- roy@phri.nyu.edu
"The connector is the network"

gore@eecs.nwu.edu (Jacob Gore) (05/23/89)

/ comp.lang.postscript / greid@adobe.com (Glenn Reid) / May 22, 1989 /
[Lots of reasons why not to expect a ^D to mean "printer is done with the job"]

Any suggestions?  Our spooler does need to know when the job terminates (if
nothing else, then to request the new page count in order to do
accounting).  Does it help to have the spooler strip ^D's out of output
sent to the printer?
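
For reference, the stripping being asked about amounts to a one-line
filter. A minimal sketch, assuming jobs are plain 7-bit text so that the
byte 0x04 never occurs as legitimate data; strip_eot is a name made up
for this illustration:

```python
EOT = b"\x04"  # ^D


def strip_eot(data):
    """Return the job with every ^D removed, leading, trailing or embedded."""
    return data.replace(EOT, b"")
```

The spooler would then append its own single end-of-job character after
the filtered data.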

Jacob Gore				Gore@EECS.NWU.Edu
Northwestern Univ., EECS Dept.		{oddjob,chinet,att}!nucsrl!gore

CET1@phoenix.cambridge.ac.UK (Chris Thompson) (05/24/89)

Glenn Reid writes:
> The end-of-file (it isn't always a ^D) is echoed by the printer when
> an EOF is read from the input stream.
and
> I think the EOF is echoed then, by the scanner, rather than after the
> "showpage" name has been executed, but I'm not sure of it. In any
> case, it is an implementation detail.
and much else in the same vein.

I suppose it is foolish of me to argue with someone from Adobe about
their own software, but I think Glenn is just plain wrong about this.
The ^D (EOT) character is sent *from* the printer as part of the
server loop, in the finish-off-the-current-job code, before it starts
to think about where to get the next job from. Reaching the EOT in
the input buffer just causes the scanner (or anyone else reading
"currentfile") to see end-of-file (and you can't get past it, at least,
not without trickery). Nothing is sent from the printer at this stage.

As Glenn points out, if this wasn't the case then the ^D's wouldn't
properly delimit the error messages produced by different jobs, and
in fact they do. (I suppose I should add: on a LaserWriter, rev 0 or
2; but surely this can't be a matter of different implementations on
different printers?)

As others have pointed out, it would be unwise to assume that the
^D from the printer meant "I have printed that job perfectly" rather
than "I have told you all I am ever going to tell you about that job",
but at least it does mean the latter.
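
The behaviour described above can be pictured with a toy model. This is
a sketch of the description in this post only, not of Adobe's server
loop; server_loop and run_job are hypothetical names, and run_job stands
in for the interpreter, returning whatever the job writes back to the
host.

```python
EOT = b"\x04"  # ^D


def server_loop(input_bytes, run_job):
    """Return everything the toy 'printer' sends back for input_bytes."""
    out = bytearray()
    jobs = input_bytes.split(EOT)
    # Whatever follows the last ^D is an unfinished job: the loop is still
    # waiting for its end-of-file, so nothing is sent back for it.
    for job in jobs[:-1]:
        out += run_job(job)  # reaching EOT only ends the job's input
        out += EOT           # sent by the server loop as the job finishes
    return bytes(out)
```

Because the ^D goes out only after a job's own output, each job's
messages end up delimited by exactly one ^D in the returned stream.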

Chris Thompson
JANET: cet1@uk.ac.cam.phx
ARPA:  cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk

greid@adobe.com (Glenn Reid) (05/26/89)

In article <A05B438E08C6CA70@UK.AC.CAM.PHX> CET1@phoenix.cambridge.ac.UK (Chris Thompson) writes:
>Glenn Reid writes:
>> The end-of-file (it isn't always a ^D) is echoed by the printer when
>> an EOF is read from the input stream.
>and
>> I think the EOF is echoed then, by the scanner, rather than after the
>> "showpage" name has been executed, but I'm not sure of it. In any
>> case, it is an implementation detail.
>and much else in the same vein.
>
>I suppose it is foolish of me to argue with someone from Adobe about
>their own software, but I think Glenn is just plain wrong about this.

I was mistaken.  You're right.  ["I'm left, she's gone..."]

>The ^D (EOT) character is sent *from* the printer as part of the
>server loop, in the finish-off-the-current-job code, before it starts
>to think about where to get the next job from. Reaching the EOT in
>the input buffer just causes the scanner (or anyone else reading
>"currentfile") to see end-of-file (and you can't get past it, at least,
>not without trickery). Nothing is sent from the printer at this stage.

This is correct; it is handled by the server loop when the output file
stream is closed.  Closing a file causes the appropriate EOF to be
written.

>As Glenn points out, if this wasn't the case then the ^D's wouldn't
>properly delimit the error messages produced by different jobs, and
>in fact they do. (I suppose I should add: on a LaserWriter, rev 0 or
>2; but surely this can't be a matter of different implementations on
>different printers?)

It's the same on all the printers that support serial communications.
My error. Sorry about that.

>As others have pointed out, it would be unwise to assume that the
>^D from the printer meant "I have printed that job perfectly" rather
>than "I have told you all I am ever going to tell you about that job",
>but at least it does mean the latter.

Right.  Whether the job was printed to the user's satisfaction is a
different issue than whether the job was printed to the printer's
satisfaction.  For example, many printers are happy to substitute
Courier for unknown fonts.  A message is sent back to the host to
indicate this substitution, but a job printed in a default font is
usually not to a user's satisfaction, and should be reprinted with the
correct fonts downloaded or perhaps the whole print job should be
rejected by the spooler as unprintable, depending on the user's wishes.

It is complicated, isn't it?  Maybe we should rehash the simple issues
of what should be done, rather than the implementation details.  Here's
my opinion about it:

	1.  Page composition software composes page, selecting
	    fonts (by name), placing images and text at arbitrary
	    locations in user space.

	2.  Page composition software produces device-independent
	    PostScript language file representing imaging
	    instructions required to paint the above-mentioned
	    page, and puts %%Comments in place to indicate the
	    proper media and font requirements of the print file.

	3.  This file is handed to a print spooler, which attempts
	    to resolve the printer-dependent issues like media
	    selection and perhaps downloading of fonts, depending
	    on the document requirements.  A more refined print
	    file is produced, where the %%Comments are resolved
	    for the particular intended printer.  More %%Comments
	    are put surrounding the printer-dependent code, so
	    that the process can be repeated by another spooler
	    if necessary.

	4.  The spooler opens a communication channel with the
	    chosen printer.

	5.  The spooler transmits the file to the printer, monitoring
	    status, watching for %%[ Error ]%% messages, and other
	    appropriate things.

	6.  The spooler sends the appropriate EOF indication
	    at the end of the job, based on (4).

	7.  The spooler waits for confirmation of the job, in the
	    form of an echoed EOF, from the printer.  Any errors
	    are logged or sent back to the original submitter.
	    Job accounting may take place (counting pages, looking
	    at wristwatch, etc.)
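
The host side of steps 5 through 7 might look something like the sketch
below. It assumes the end-of-file indication is ^D and that messages
from the printer arrive in the "%%[ ... ]%%" form; read_job_result and
its return layout are invented for illustration.

```python
import re

EOT = b"\x04"  # ^D, assumed here to be the channel's end-of-file
MESSAGE = re.compile(rb"%%\[ (.*?) \]%%")


def read_job_result(returned):
    """Classify what came back from the printer for one job.

    Returns (finished, problems, notices). finished is True once the
    echoed end-of-file has been seen; only data before it is examined.
    """
    finished = EOT in returned
    body = returned.split(EOT, 1)[0]
    problems, notices = [], []
    for match in MESSAGE.finditer(body):
        text = match.group(1).decode("ascii", "replace")
        if text.startswith(("Error", "PrinterError", "Flushing")):
            problems.append(text)   # to be logged or returned to the user
        else:
            notices.append(text)    # e.g. status replies
    return finished, problems, notices
```

A spooler would keep calling something like this on the accumulated
input until finished is True, and only then do its accounting.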

No doubt this will stir further controversy somehow, but it does seem
to be an important issue.

Glenn Reid
Adobe Systems

dbrooks@osf.OSF.ORG (David Brooks) (05/27/89)

As the one who wrote the Prime PostScript despooler, my problem (once
I had finished) was precisely with the non-standard nature of the
implementation. 

I interpreted Appendix D of the Red Book, and the implementations on
machines that came after the LaserWriter (QMS1200/2400, QMS800, Data
Products LZRsomething) to mean:

"This isn't actually the PostScript standard, but we're establishing
 it as a de facto expectation."

This I took to mean things like the serverloop execution, ^D, ^C, ^T,
most of statusdict, and so on.  I kind of came to rely on these,
although I was nervous because I knew Adobe didn't regard them as
standard (and I didn't want to go playing with printer description
files).  Everything went well until...

Bong! the DEC LN03 standard password wasn't 6 digits, but a string
whose initial value was LN03R.  And, since the software read the
password out of a (protected) environment file, and assumed it was
only 6 characters long, and we now had to write (LN03R) (7 characters)
to get the semantics right... you fill in the rest.

We found a disgusting hack to work around it.  You don't want to know.

I'm not at Prime any more, and I did all this without benefit of prior
art.  In particular, what I'm about to describe may be familiar to
users of other despoolers, or not.  I worried particularly about races
and timeouts.  That's not easy to do on a timeshared computer without
multi-event wait, and some of the code is gross.  Also, I don't like
"unattended" software that just goes quiet or dies without telling
anyone how, especially if it's because of external misbehaving
hardware.  It does behave sensibly if it finds a ^D embedded in the
file!  In case anyone is interested, here's a summary of the less
obvious features:


1. Establishing contact with the printer didn't assume a printer that
was alive, or even one that was quiescent.  First I hit it with ^T and
looked for a properly formatted reply among potential junk that could
be streaming back from yesterday's job.  Then ^D (wait for ^D), and if
no ^D returned, ^C, ^D again (wait for ^D).  This covers a printer
that's quiescent, or busy, or in interactive mode.

2. A hack like Apple used: check the printer for a flag and, if it's
not there, download some standard dictionaries in separate jobs.  This
is repeated before each user job, in case the printer was turned off
and on again.

3. Print the header page with "large" characters printed in dark gray
(so they don't wash out) and 84-point -- except that if the line would
get too long the point size is shrunk to fit (the printer itself does
these calculations).

4. Read pagecount if possible (and parseable), and start the job,
offering host-based text emulation for non-PostScript files.  Field
all returned messages.

5. If the returned message is %%[ Flushing..., drop the job (yes, it's
possible for a program to fake that, but why bother?).

6. Do the ^D negotiation. (Send one, wait for one).  As I said, we
stripped out embedded ^Ds, but if we hadn't it would, I think, still
be OK, except you might attach messages from this job onto the next job.

7. Print in text any uplinked data, as a separate job.  That way the
user sees error messages and anything else he cared to "print", on
sheets after the job's proper output.  Any messages that are clearly
meant for the operator/administrator get put in a log file (who watches
system consoles these days?).  Get pagecount again.

8. During jobs, execute a timeout.  How do you distinguish between a
user who submits "{}loop" and one who submits <8Mb of data imaged>
showpage?  We judged that a printer that hasn't ^D-ed about 20 minutes
after the last character sent is almost certainly wedged, so we ^C
it after 30 minutes and do end-of-job.  If still no ^D is sent back
(after 15 seconds) we declare the printer seriously dead.

9. Exception to the above: If you get a %%[ PrinterError during the
timeout, the clock must be halted until the printer is well again, as
determined by a healthy response to ^T.  The problem is caused by
paper out on last sheet of job, probably.  Care is needed not to get
stuck if the printer decides to croak at this point.

10. General robustness was built in.  For example, I've seen a QMS1200
return a %%[ status line with no trailing newline, so we won't hang
waiting for a complete line.
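
The contact sequence in item 1 can be sketched as below. This is not the
Prime despooler's code: sync_printer and its send and receive callbacks
are hypothetical, receive is assumed to return whatever arrived before
some caller-chosen timeout (possibly nothing), and treating a missing ^T
reply as "dead" is an assumption of this sketch.

```python
CTRL_C, CTRL_D, CTRL_T = b"\x03", b"\x04", b"\x14"


def sync_printer(send, receive):
    """Try to bring the printer to a known point between jobs.

    send(data) writes bytes to the printer; receive() returns the bytes
    that arrived before a timeout. Returns True if a ^D came back.
    """
    send(CTRL_T)
    if b"%%[" not in receive():  # no status reply among the junk
        return False
    send(CTRL_D)
    if CTRL_D in receive():      # quiescent printer: echoes straight away
        return True
    send(CTRL_C)                 # busy or interactive: interrupt it
    send(CTRL_D)
    return CTRL_D in receive()
```

A real despooler would also need to discard stale output and retry, but
the order of control characters is the point here.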


As I said, the problem with coding like this is relying on the
nonstandard semantics, and for the most part Adobe has kept things
standard (even if not always sensible...want to talk about suppressing
bare CR characters?).  I do rely on: ^C, ^D, ^T semantics, and the
text returned by ^T; exitserver semantics including password; and for
convenience sake the statusdict entries pagecount, printername,
product, revision, jobname and username (it won't actually break if
statusdict is absent).

So, Adobe, the point of all this is to ask: are you ever going to
change these expectations and make Prime unhappy (since I don't think
they have anyone there who understands all this stuff)?
-- 
David Brooks			dbrooks@osf.org
Open Software Foundation	uunet!osf.org!dbrooks
11 Cambridge Center		Personal views, not necessarily those
Cambridge, MA 02142, USA	of OSF, its sponsors or members.