[comp.lang.postscript] How do you count the number of pages in a postscript document?

seth@cunixc.cc.columbia.edu (Seth Strumph) (10/13/89)

Newsgroups: comp.lang.prolog
Subject: how do you count the number of pages in a postscript file?
Reply-To: seth@cunixc.cc.columbia.edu (Seth Strumph)
Organization: Columbia University

does anyone have (or know of) a program that will count the number of
printed pages that an arbitrary postscript file (including those
created on a macintosh) will generate?

if you can help, please reply by email.

seth@cunixc.cc.columbia.edu
...!rutgers!columbia!cunixc!seth

sakkinen@tukki.jyu.fi (Markku Sakkinen) (10/16/89)

In article <1955@cunixc.cc.columbia.edu> seth@cunixc.cc.columbia.edu (Seth Strumph) writes:
>
>does anyone have (or know of) a program that will count the number of
>printed pages that an arbitrary postscript file (including those
                       ^^^^^^^^^
>created on a macintosh) will generate?

Theoretically impossible!
Since PostScript is a "Turing-equivalent" language (i.e. a full-powered
programming language), even the termination of an _arbitrary_ PS programme
(file) is undecidable.
The only way to find out is a rather complete emulator or interpreter.
And you still have problems with possibly nonterminating programmes.

However, PostScript code generated by text processors and similar
application software should be very well-behaving.
Indeed, if it conforms to the structuring conventions recommended
by Adobe, such things as the number of pages are given in special
comments at the beginning or end of the file.

Markku Sakkinen
Department of Computer Science
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland

geof@apolling (Geof Cooper) (10/18/89)

In article <1955@cunixc.cc.columbia.edu> seth@cunixc.cc.columbia.edu (Seth Strumph) writes:
>does anyone have (or know of) a program that will count the number of
>printed pages that an arbitrary postscript file (including those
>created on a macintosh) will generate?

I'll bite.  The code below purports to do this.  It may cause some
poorly written programs to loop infinitely, if they assume that
showpage or copypage is an operator in a recursive call in a bound
procedure.  Otherwise it will probably work fine.

Many PostScript printers maintain a count of the number of pages ever
printed on them in NOVRAM.  It strikes me that checking this count is
more effective than what I did.

- Geof

-----------------------CUT HERE---------------------------
%!
%
% Exitserver module that redefines showpage and copypage to
% be procedures that count the number of pages processed since
% it was loaded.
%
% The code makes use of the known "bug" in existing implementations
% of PostScript whereby contents of strings are not restored by the
% "restore" operator.
%
%
% Minimally Tested.  Released to Public Domain with no guarantees.
% - Geof Cooper, QMS/IMAGEN
%

0 serverdict /exitserver get exec

userdict begin
    /_cnt 4 string def
    /_storeCount {
        dup              255 and //_cnt 0 3 2 roll put
        dup -8  bitshift 255 and //_cnt 1 3 2 roll put
        dup -16 bitshift 255 and //_cnt 2 3 2 roll put
            -24 bitshift 255 and //_cnt 3 3 2 roll put
    } bind def
    /_getCount {
        //_cnt 3 get 24 bitshift
        //_cnt 2 get 16 bitshift or
        //_cnt 1 get  8 bitshift or
        //_cnt 0 get             or
    } bind def

    /_rshowpage /showpage load def
    /_rcopypage /copypage load def

end

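% Redefine showpage in userdict so that every call bumps the counter.
% "def" is used here instead of "store": store would try to overwrite
% the existing entry in systemdict, which is read-only on most
% implementations and would raise invalidaccess.
/showpage
{
    _getCount #copies add _storeCount
    _rshowpage
}
bind def

% Same wrapper for copypage.
/copypage
{
    _getCount #copies add _storeCount
    _rcopypage
}
bind def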

cet1@cl.cam.ac.uk (C.E. Thompson) (10/20/89)

In article <1517@tukki.jyu.fi>, sakkinen@tukki.jyu.fi (Markku Sakkinen)
writes:
> However, PostScript code generated by text processors and similar
> application software should be very well-behaving.
> Indeed, if it conforms to the structuring conventions recommended
> by Adobe, such things as the number of pages are given in special
> comments at the beginning or end of the file.

Specifically, you can look for the "%%Pages:" structure comment. (This
is what we do locally, and output it as "Pages expected" in an operator
message.) Note, however, that it is not mandatory under the structuring
conventions to have this structure comment. (It is only mandatory that
if it occurs it is correct... or in other words if it is wrong the
document is non-conforming... bad luck!)
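
A minimal host-side sketch of that "%%Pages:" lookup, in Python. The
function name declared_page_count is made up for illustration. It
assumes the structuring-convention form in which the header may say
"%%Pages: (atend)" and defer the real value to the trailer. It reports
only what the file claims, and returns None when the comment is absent.

```python
import re

# Matches "%%Pages: 12", "%%Pages: 12 1" or "%%Pages: (atend)".
# "%%Page: 3 3" (per-page comment, no "s") deliberately does not match.
PAGES_RE = re.compile(r"^%%Pages:\s*(\(atend\)|\d+)")


def declared_page_count(ps_text):
    """Return the page count declared by a %%Pages: comment, or None."""
    deferred = False   # header said "(atend)": real value is in the trailer
    last_seen = None
    for line in ps_text.splitlines():
        m = PAGES_RE.match(line)
        if not m:
            continue
        if m.group(1) == "(atend)":
            deferred = True
            continue
        value = int(m.group(1))
        if not deferred:
            return value       # plain header value: the first one wins
        last_seen = value      # deferred: keep the last one seen (trailer)
    return last_seen
```

An included document can carry its own "%%Pages:" line, so the result
is a claim to be checked, not a measurement.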

Certainly any approach based on a naive search for "showpage" will get
you nowhere, in practice as well as in theory.

However, the situation for knowing how many pages a document *did*, in
fact, print is rather better, and doesn't need the patch that Geof
Cooper posted. The "pagecount" operator in "statusdict" (see page 296
of the Red Book) gives you the number of pages ever printed by the
LaserWriter (or since the last time the EEROM was replaced, anyway).
You can take the difference of its value at the start and end of a job,
and print the difference in an operator message, or on a cover page.
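
The start/end difference could be computed on the host roughly as
follows. This is a sketch under assumptions: how the query job reaches
the printer and how its reply comes back is left out entirely, the
names PAGECOUNT_QUERY, parse_pagecount and pages_used are invented
here, and the reply is assumed to carry the count on a line of its own.

```python
# PostScript sent as its own small job before and after the user's job.
# "pagecount" is looked up in statusdict; "==" writes the value to the
# printer's standard output, which is assumed to reach the host.
PAGECOUNT_QUERY = "%!\nstatusdict begin pagecount == flush end\n"


def parse_pagecount(reply):
    """Pull the integer out of the printer's reply to PAGECOUNT_QUERY."""
    for line in reply.splitlines():
        line = line.strip()
        if line.isdigit():
            return int(line)
    raise ValueError("no page count in printer reply")


def pages_used(reply_before, reply_after):
    """Pages printed between the two readings."""
    return parse_pagecount(reply_after) - parse_pagecount(reply_before)
```

For example, readings of 1042 before and 1049 after give 7 pages.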

The determined user will be able to cheat even on this unless you
execute both examinations of the page count in nice, clean, start of
job, PostScript environments. This requires keeping the initial value
external to the LaserWriter. (Or perhaps Geof can work out for us how
to keep it in a protected, exitserver-installed, piece of workspace
that survives between jobs!)

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk

robert@shangri-la.gatech.edu (Robert Viduya) (10/21/89)

> 
> However, the situation for knowing how many pages a document *did*, in
> fact, print is rather better, and doesn't need the patch that Geoff
> Cooper posted. The "pagecount" operator in "statusdict" (see page 296
> of the Red Book) gives you the number of pages ever printed by the
> LaserWriter (or since the last time the EEROM was replaced, anyway).
> You can take the difference of its value at the start and end of a job,
> and print the difference in an operator message, or on a cover page.
> 
> The determined user will be able to cheat even on this unless you
> execute both examinations of the page count in nice, clean, start of
> job, PostScript environments. This requires keeping the initial value
> external to the LaserWriter. (Or perhaps Geoff can work out for us how
> to keep it in a protected, exitserver-installed, piece of workspace
> that survives between jobs!)
> 

I've managed to do this for our publicly accessible PostScript
printers so that we can do accurate page accounting for them.
Specifically, the first time a printer is accessed after power-up, a
small job gets sent down that does an "exitserver" and then redefines
the serverloop.  The new serverloop sends a message to the host after
each job with the number of pages that job used.

The main problem with this is that the serverloop differs from printer
to printer and PostScript ROM level to PostScript ROM level.  We have
QMS PS2000's, Apple LaserWriter II NTX's and one LaserWriter Plus.
The serverloop mods for the QMS's and the NTX's were different but
very similar.  We retired the LaserWriter Plus because the serverdict
dictionary wasn't big enough for us to add just one more item (it was
showing its age anyway).

Working out how to change the serverloop requires dumping the
dictionaries in the printer and browsing around.  If you can code
PostScript and can handle doing that sort of thing, it isn't all that
difficult, just a bit time-consuming.

			robert

--
Robert Viduya					   robert@shangri-la.gatech.edu
Office of Computing Services
Georgia Institute of Technology					 (404) 894-6296
Atlanta, Georgia	30332-0275

ron@clarity.Princeton.EDU (Ronald Beekelaar) (10/21/89)

In article <2677@hydra.gatech.EDU> robert@shangri-la.gatech.edu (Robert Viduya) writes:

.   > 
.   > The determined user will be able to cheat even on this unless you
.   > execute both examinations of the page count in nice, clean, start of
.   > job, PostScript environments. This requires keeping the initial value
.   > external to the LaserWriter. (Or perhaps Geoff can work out for us how
.   > to keep it in a protected, exitserver-installed, piece of workspace
.   > that survives between jobs!)
.   > 
.
.   I've managed to do this for our publicly accessible PostScript
.   printers so that we can do accurate page accounting for them.
.   Specifically, the first time a printer is accessed after power-up, a
.   small job gets sent down that does an "exitserver" and then redefines
.   the serverloop.  The new serverloop sends a message to the host after
.   each job with the number of pages that job used.
.
.			   robert
.
As long as you are able to send a "small job" to the printer to change
exitserver, anybody who wants to cheat can also do this. Or change it
temporarily, so his pages won't be recorded!

Or is this page accounting not meant for charging the users? Obviously in
that case you don't have to be afraid of cheating.

ron

robert@shangri-la.gatech.edu (Robert Viduya) (10/21/89)

> As long as you are able to send a "small job" to the printer to change exit-
> server, anybody who wants to cheat, can also do this. Or change it
> temporarily, so his pages won't be recorded!

Well, first off, I don't change exitserver; I change the serverloop. 
There's a big difference.  Secondly, users can't do an exitserver to
change the serverloop if they don't know the exitserver password.
Thirdly, since it's also my code that's driving the printers on the
host side, I can guarantee that it's my modify-serverloop job that
gets to the printer first and not anyone else's.  Fourthly, a successful
exitserver command sends back a message that looks like:

	%%[ exitserver: permanent state may be changed ]%%

which gives us a convenient red flag for catching any misbehaving
users who have managed to figure out the exitserver password.
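
Watching for that red flag on the host side might look like the sketch
below (Python). The function name flag_exitserver_jobs, the job ids and
the idea of a per-job capture of printer output are all assumptions
made for illustration. Only the message text comes from the printer.

```python
# Message a printer sends back after a successful exitserver.
EXITSERVER_FLAG = "%%[ exitserver: permanent state may be changed ]%%"


def flag_exitserver_jobs(job_outputs, trusted_jobs=()):
    """Return ids of jobs whose printer output contains the message.

    job_outputs maps a job id to the text the printer sent back for it.
    trusted_jobs lists ids that are allowed to use exitserver, such as
    the accounting setup job itself.
    """
    trusted = set(trusted_jobs)
    return [
        job_id
        for job_id, output in job_outputs.items()
        if job_id not in trusted and EXITSERVER_FLAG in output
    ]
```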

		robert

--
Robert Viduya					   robert@shangri-la.gatech.edu
Office of Computing Services
Georgia Institute of Technology					 (404) 894-6296
Atlanta, Georgia	30332-0275

CXT105@PSUVM.BITNET (Christopher Tate) (10/22/89)

>
> However, the situation for knowing how many pages a document *did*, in
> fact, print is rather better, and doesn't need the patch that Geoff
> Cooper posted. The "pagecount" operator in "statusdict" (see page 296
> of the Red Book) gives you the number of pages ever printed by the
> LaserWriter (or since the last time the EEROM was replaced, anyway).
> You can take the difference of its value at the start and end of a job,
> and print the difference in an operator message, or on a cover page.
>

As I recall, the original question was to be able to tell how many pages the
document will print *before* actually printing it.  This would be quite useful
to me here, for example, where I am charged 20 cents per page for any laser
printing done on a public printer.  While word processors can tell you how
many pages will print, PostScript downloads won't, and so it becomes necessary
to try to guesstimate how much money it's going to take....

(BTW, the per-page charging is done on the fly, through a VendaCard (tm)
machine attached to the paper-feed mechanism of the printers.  If you don't
put the card in the machine, the printer thinks it's out of paper.)

-------
Christopher Tate                   |
cxt105@psuvm.psu.edu               |  You can lead a horse to water,
 ..!psuvax1!psuvm.bitnet!cxt105    |    but a vest has no sleeves.
cxt105@psuvm.bitnet                |

dkelly@npiatl.UUCP (Dwight Kelly) (10/23/89)

ron@clarity.Princeton.EDU (Ronald Beekelaar) writes:

>In article <2677@hydra.gatech.EDU> robert@shangri-la.gatech.edu (Robert Viduya) writes:

>As long as you are able to send a "small job" to the printer to change exit-
>server, anybody who wants to cheat, can also do this. Or change it
>temporarily, so his pages won't be recorded!

You can always change the exitserver password to prevent others from
redefining your serverloop!

Dwight Kelly
Network Publications, Inc.

jbw@unix.cis.pitt.edu (Jingbai Wang) (10/24/89)

In article <1517@tukki.jyu.fi> sakkinen@jytko.jyu.fi (Markku Sakkinen) SAKKINEN@FINJYU.bitnet (alternative) writes:
>In article <1955@cunixc.cc.columbia.edu> seth@cunixc.cc.columbia.edu (Seth Strumph) writes:
>>
>>does anyone have (or know of) a program that will count the number of
>>printed pages that an arbitrary postscript file (including those
>                       ^^^^^^^^^
>>created on a macintosh) will generate?
>
>Theoretically impossible!
>Since PostScript is a "Turing-equivalent" language (i.e. a full-powered
>programming language), even the termination of an _arbitrary_ PS programme
>(file) is undecidable.
>The only way to find out is a rather complete emulator or interpreter.
>And you still have problems with possibly nonterminating programmes.
>
>However, PostScript code generated by text processors and similar
>application software should be very well-behaving.
>Indeed, if it conforms to the structuring conventions recommended
>by Adobe, such things as the number of pages are given in special
>comments at the beginning or end of the file.

I think the Q-A is starting to make sense. The question was not that
clear, and hence the answers have been rather random. I even saw a piece
of PS code that was claimed to be able to count the number of pages. In
that event, you need only print the PS file, and then there is no doubt
how many pages it contains.

I wrote a program called PSScribe (Post-Scribe) that enables you to
count the number of pages in a Scribe-generated PostScript file, pull out
certain pages, divide it into even and odd pages, divide it into a few files,
or merge a few files together. However, it relies on the fact that
Scribe's PostScript output follows the Adobe conventions best (as well as
Adobe Illustrator does). Even so, it can make mistakes, because you can
include a PostScript file that has all the %%Page and other comments
even though showpage was nulled in the inclusion shell.
I enhanced a version of the TeX dvi2ps that also generates reasonably good
Adobe PostScript, but I would still hesitate to use PSScribe on it, because
the bitmap font definitions do not all stay in the prologue, nor within
individual pages. They are like block-scoped global variables in C code:
it is rather difficult to figure out which fonts or characters are already
defined before a certain page.


JB Wang
jbw@pittvms.bitnet
jbw@cisunx.UUCP

seth@cunixc.cc.columbia.edu (Seth Strumph) (10/24/89)

as the poster of the original question, let me give a clarification
of what i was interested in.

we're trying to set up a postscript printer on which we will enforce a
weekly quota of pages.  this printer (ideally) will be accessible from
both macintoshes and unix boxes (using transcript and cap).  we would
like to be able to reject any jobs that would put the user over their
page limit, so the best possible situation would be for us to be able
to count the number of pages *before* the file gets sent to the
laserwriter.  
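
The admission check itself is simple once a page count is available. A
sketch in Python, with hypothetical names throughout (admit_job and its
parameters): it assumes the count comes from somewhere else, such as a
"%%Pages:" comment, and that a job with no known count is refused
rather than let through. That last choice is a policy assumption, not
something the quota idea requires.

```python
def admit_job(pages_in_job, pages_used_this_week, weekly_quota):
    """Decide whether a job may be sent to the printer.

    pages_in_job is the count determined before printing, or None if
    no count could be determined for the file.
    """
    if pages_in_job is None:
        return False   # assumed policy: unknown size means reject
    return pages_used_this_week + pages_in_job <= weekly_quota
```

With a quota of 50 and 45 pages already used, a 5-page job is admitted
and a 6-page job is rejected.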

If anyone can help, please reply by email

Seth Strumph
seth@cunixc.cc.columbia.edu