forrest@ux1.lbl.gov (Jon Forrest) (09/06/88)
I'm a newcomer to Postscript and have been spending lots of time recently reading the standard Postscript material. One thing that has struck me is how hard it is to get used to a stack based language after being a Fortran and C programmer for so long. If I had been a Forth or HP calculator programmer I probably wouldn't have this problem.

Anyway, an idea occurred to me that would help people like me read (and eventually write) Postscript. I thought I'd run it by this newsgroup to see what you all think.

What about a "compiler" that converts standard postscript into what I'll call infix/function form? In other words, the output would look something like Fortran or C and would consist of infix arithmetic and logical operators, and function calls. The control operators would also look like C/Pascal control structures. A companion to such a compiler would be another program that would take Postscript written in this form and output standard Postscript. Note that neither program would change the semantics of a Postscript program in any way.

There may be technical reasons why such an approach wouldn't work that I haven't thought of, given my inexperience. Or, it may be the case, as a friend of mine whose judgement I respect claimed, that so few people actually read and write Postscript that actually writing such programs wouldn't be worthwhile.

I'd appreciate hearing any comments about this idea.

Jon Forrest
FORREST@LBL.GOV
ucbvax!lbl-csam!ux1!forrest
FORREST@LBL (bitnet)
bkc@sun.soe (Brad Clements) (09/06/88)
From article <940@helios.ee.lbl.gov>, by forrest@ux1.lbl.gov (Jon Forrest):
> I'm a newcomer to Postscript and have been spending lots of time
>
> Anyway, an idea occurred to me that would help people like me
> read (and eventually write) Postscript. I thought I'd run it by
> this newsgroup to see what you all think.
>
> What about a "compiler" that converts standard postscript into
> what I'll call infix/function form? In other words, the output would

Personally, I don't own an HP calculator. I do most of my work in C. But I have done an extensive amount of PostScript coding. I like the odd-ballness of PostScript's 'reverse-polish' notation and I use the stack quite extensively, especially in ways that neither C nor Fortran could ever do, or be made to appear to do.

I like it the way it is, and I think that I would get very confused by such an idea. Of course, this is just my personal opinion. I'm sure there are very good reasons for carrying through on this idea.

Brad Clements
cosell@bbn.com (Bernie Cosell) (09/06/88)
In article <940@helios.ee.lbl.gov> forrest@ux1.lbl.gov (Jon Forrest) writes:
}I'm a newcomer to Postscript and have been spending lots of time
}recently reading the standard Postscript material. One thing that
}has struck me is how hard it is to get used to a stack based
}language after being a Fortran and C programmer for so long.
}If I had been a Forth or HP calculator programmer I probably
}wouldn't have this problem.

I don't think this is true. Postfix languages are (IMHO) inherently hard to understand. Problem is twofold:

(a) when it does things, you have NO way to know "why" (for's and if's, for example) and so it is hard to understand what the code is trying to do: so you have to read on, see what is supposedly happening, and then back up to reread what happened. Which makes it more complicated than its rather simple repertoire of control structures would lead you to think; and

(b) it is VERY hard to keep an intuitive grasp of what is happening on the stack.

This last is mostly a coding style question: many programmers are (these days) shying away from the cavalier use of side-effects because they make code harder to understand and more fragile. Problem is that stack machines are ALL side effects: you call a function and you now have to REMEMBER how many stack things it eats, and how many it puts back, or else you're lost. Again, it goes to making relatively simple operations _seem_ more complicated than they have to be and so harder to understand.

(and please, no "lessons" about RPN languages... I have a drawerful of HP calculators and I've worked in APL and Forth. I _can_ deal with postfix languages when I have to... I just don't LIKE them and I don't think they lead to _clear_ solutions to problems.)

}What about a "compiler" that converts standard postscript into
}what I'll call infix/function form?
I think this'd be nice, but a serious problem is that stack machines are inherently more powerful than "conventional" languages, and the simple parts of postscript (converting "a b div" -> "a / b") aren't generally the places where you run into trouble. I think it'd be nice if you could do it, but I fear it'd be hard (especially to expect it to eat a bit of random postscript and do anything helpful with it... almost as bad a job to try to do as the old "flow charters" from the old days that tried (unsuccessfully) to make sense out of unrestricted assembly code). If it is all full of dup's and rolling the stack and doing large-scale tweaking of the stack, just to fall into a single (magic) procedure call at the end (with by-then well hidden args!), I think you'll be in trouble.

In my own postscript code, I've adopted various "defensive" techniques that don't help me much with reading other folks' code, but at least make *my* code pretty plain:

a) when I enter a procedure, I "copy" the args into a bunch of local vbls. (generally with a string of "/varname exch def"s)

b) I tend *NOT* to do long, involved stack manipulations. Generally, I'll compute something and stuff it into a local, usually with a reasonable name, and then instead of seeing a "<whateverprocedure>" out in the middle of nowhere leaving the reader to guess what's on the stack, I'll have done:

      <compute> /reasonabletemp exch def
      ...
      /reasonabletemp /temp /.... <callprocedure>

   A bit of extra def'ing and stack'ing, but it makes the code MUCHO easier to understand. When I'm DONE with the code and I know that it works, I can "encrypt" it to make everything be all stack-like.

c) I avoid long "open coding" of {stuff} in if's and for's.
It is bad enough getting an intuition for what is going on to have

      cur last < drawtitle if

but having the boolean separated from the "if" by 25 lines of code that you don't know if you should be scanning or not is too much for me (ditto for the BOOL part: if it is more than some expression that is easily apprehended, I'll stick it in a temp).

Do you wizards have other tricks and stuff to make code more understandable and *readable*, or is the emphasis always on getting things faster and more compact?

   __
  /  )                        Bernie Cosell
 /--<  _  __  __   o _        BBN Sys & Tech, Cambridge, MA 02238
/___/_(<_/ (_/) )_(_(<_       cosell@bbn.com
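
The conventions described in this post can be sketched in a few lines. This is an illustration only, not Cosell's own code: the procedure drawheading, its arguments, and the variables cur, last and needstitle are hypothetical, and the comparison is written with the standard lt operator. Arguments are copied into named locals held in a private dictionary, and the condition is given a name immediately before the if that uses it.

```postscript
%!PS
% Hypothetical procedure:  (text) x y  drawheading  -
/drawheading {
  3 dict begin              % private dictionary for the three locals
    /y exch def             % the last argument pushed is on top
    /x exch def
    /text exch def
    x y moveto
    text show
  end
} def

/Times-Roman findfont 12 scalefont setfont

/cur 3 def                  % hypothetical counters
/last 10 def

% Name the condition, then test it right next to the if.
/needstitle cur last lt def
needstitle { (Title) 72 720 drawheading } if
showpage
```

The extra defs cost some interpreter time, but the reader no longer has to reconstruct the stack to see what drawheading receives or what the if is testing.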
mccanne@h2opolo.ee.lbl.gov (Steve McCanne) (09/06/88)
In article <940@helios.ee.lbl.gov> forrest@ux1.lbl.gov (Jon Forrest) writes:
[stuff about difficulty of reading PostScript]
>
>Anyway, an idea occurred to me that would help people like me
>read (and eventually write) Postscript. I thought I'd run it by
>this newsgroup to see what you all think.
>
>What about a "compiler" that converts standard postscript into
>what I'll call infix/function form? In other words, the output would

Van Jacobson thought this was a good idea as well. He developed a language called "PreScript" that is a lot like C, and compiles into PostScript. I helped with the implementation; it should be out for beta testing very soon, if not already. Look for it in comp.sources.misc.

Steven McCanne
mccanne@helios.ee.lbl.gov
richard@gryphon.CTS.COM (Richard Sexton) (09/06/88)
While I will admit to thinking about writing something that makes Postscript ``readable'', I came to the conclusion that if I spent the amount of time it would take to do this just hacking postscript, I wouldn't need it.

:-) / 2

--
``Beam THAT between your pointy ears''
richard@gryphon.CTS.COM {backbone}!gryphon!richard
sears@sun.uucp (Daniel Sears) (09/07/88)
In the Adobe PostScript archives (ps-file-server@adobe.com) you will find a PostScript beautifier. Here is the README from that distribution:

README

The "psformat" program parses PostScript files and rearranges them based on reasonable syntactic and indentation rules. Its primary purpose in life is to take miserably formatted (or compacted) PostScript files and make them tolerable.

USAGE:

      psformat < infile > outfile

"psformat" reads from stdin and writes to stdout.

"psformat" basically leaves line breaks alone, except where they affect the indentation of { }. This means, among other things, that output with lots of newlines (or newlines in funny places, as appear in DEC Document output, for example) does not get fixed. It is too hard to figure out where newlines should go heuristically.

The "psformat" program also pulls out any comments that start with %%! %%+ or %%A-Z and writes them to the stderr channel. This is the rudimentary beginnings of a program to check the comments and report which of them are bogus, malformed, out of order, or whatever. Unfortunately, just recognizing them syntactically is, of course, merely the tip of the iceberg, and the issues of parsing them have not yet been addressed.

In order to save the comments in a file, you can use a shell hack something like this:

      (psformat < infile > outfile) >& comments.out

Since the only thing that will come out of the (subshell) is the stderr output, this will trap the comments into the file called comments.out. In general, this isn't worth it. It is nice to see the comments go by on stderr so you can monitor the progress of the program.

--
Daniel Sears              Sun Microsystems, Inc.
Technical Publications    MS 5-42
(415) 336-7435            2550 Garcia Avenue
sears@sun.com             Mountain View, CA 94043
cplai@daisy.UUCP (Chung-Pang Lai) (09/07/88)
In article <940@helios.ee.lbl.gov> forrest@ux1.lbl.gov (Jon Forrest) writes:
]
]What about a "compiler" that converts standard postscript into
]what I'll call infix/function form?
]
This is an idea, but I think it is hard to maintain both the source and the
translated copy of the same postscript program.
I am also new to postscript. I notice that the hardest thing about reading a
postscript program is telling which argument belongs to which operator.
Once you are familiar with the operator set (some 200+ built-in operators),
you can read your postscript code easily. After that you have to get familiar
with all the new procedures or operators that are developed locally in your
company to understand what argument goes to what operator again.
e.g. Try reading the following very common piece of postscript code.
/Times-Roman findfont 6 scalefont setfont (A) show showpage
You have to look up the reference manual to know that findfont takes one
argument from the stack and leaves one. scalefont takes two from the
stack and leaves one. And setfont takes one and leaves none. And show
takes one and leaves none, showpage takes none and leaves none. I learnt
these five after a while, but I still have to learn the remaining 200+ operators.
And I have to look up the reference manual from time to time to check if
I passed the arguments correctly.
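
One low-cost way to avoid repeated trips to the manual is to write the stack contents as a comment beside each token. The sketch below annotates the same line in that style. The moveto is an addition that is not in the original line: show needs a current point, and the coordinates 72 720 are arbitrary.

```postscript
%!PS
% Each comment shows the operand stack after the token has run.
/Times-Roman    % /Times-Roman
findfont        % font           (takes 1, leaves 1)
6               % font 6
scalefont       % font'          (takes 2, leaves 1)
setfont         % -              (takes 1, leaves 0)
72 720 moveto   % -              (added: sets the current point)
(A)             % (A)
show            % -              (takes 1, leaves 0)
showpage        % -              (takes 0, leaves 0)
```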
When I took the postscript class at Adobe Systems, I discussed this issue with
the instructor. I came up with a suggestion, but the discussion led to
no conclusion.
My suggestion was:
Two OPTIONAL noise characters can be used by the programmer to delimit the
scope of the procedure invocations. I understand {} are reserved for
procedure declaration; I steal these two characters just for the purpose of
illustrating my idea below. Two other unique symbols should be defined
by the postscript language to avoid conflicts.
Assuming the postscript interpreter treats { and } as noise, i.e. reads and
ignores them, then I can rewrite the above line as follows:
{{{/Times-Roman findfont} 6 scalefont} setfont} {(A) show} {showpage}
This shows clearly that /Times-Roman is used by findfont.
The result from findfont and 6 are used by scalefont.
The result from scalefont is used by setfont.
(A) is used by show.
Nothing is used by showpage.
These delimiters should be optional and can be used like comments to help
the programmer read the code. Usage as below may be more practical:
{{/Times-Roman findfont} 6 scalefont} setfont
(A) show
showpage
Do you think it is more readable?
I'll bet some LISP haters hate my idea :-)
It is just optional, don't blame me for it.
The price:
A lot of noise has to be transmitted through the cable to the printer.
Take up two special symbols.
All good candidates are used for other purposes already, e.g. {} [] <> ().
The gain:
An optional way to make code easier to maintain. It will not hurt existing
drivers, because postscript produced by software is not meant to be read
by humans and hence the delimiters are not needed there.
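
Something close to this suggestion can be tried without any change to the language, by defining two ordinary names that do nothing. The names |: and :| below are an arbitrary, hypothetical choice, and the moveto is added only because show needs a current point. Unlike true noise characters, these names are still scanned, looked up and executed, so they cost interpreter time as well as transmission time.

```postscript
%!PS
/|: { } def       % hypothetical "open" marker: does nothing
/:| { } def       % hypothetical "close" marker: does nothing

|: |: /Times-Roman findfont :| 6 scalefont :| setfont
72 720 moveto     % added here: show needs a current point
|: (A) show :|
showpage
```

Because the markers are plain names, nothing checks that they are balanced; they document intent in the same way a comment would.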
--
.signature under construction ...
{cbosgd,fortune,hplabs,seismo!ihnp4,ucbvax!hpda}!nsc!daisy!cplai C.P. Lai
Daisy Systems Corp, 700B Middlefield Road, Mtn View CA 94039. (415)960-6961
barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) (09/07/88)
In article <940@helios.ee.lbl.gov> forrest@ux1.lbl.gov (Jon Forrest) writes:
|What about a "compiler" that converts standard postscript into
|what I'll call infix/function form?

I am trying to visualize a language that would let you call a procedure that returns three values on a stack, then uses two of them, does some other functions, and then uses the third. When this is nested, the problem gets worse. Then add variable binding, and delayed parsing of names.

The values on the stack have no names, and the positions change with every call. The function of the names could change dynamically.

A graphic language could be used, but how could a text language do this? It makes my head hurt. :-)

I suppose you could convert a PostScript program to pop all values off the stack, and assign names to each value. Then all operations would be done with variables.

But that program would be very inefficient.

I believe you should master the language. Having a C front end for PostScript is like having a Basic or Pascal front end for C.

Yes, you can do it, but you lose so much.

PostScript formatters, and programs to automatically document the stack contents, seem much more useful.

--
Bruce G. Barnett <barnett@ge-crd.ARPA> <barnett@steinmetz.UUCP>
uunet!steinmetz!barnett
mccanne@h2opolo.ee.lbl.gov (Steve McCanne) (09/07/88)
In article <5361@vdsvax.steinmetz.ge.com> barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) writes:
>In article <940@helios.ee.lbl.gov> forrest@ux1.lbl.gov (Jon Forrest) writes:
>
>I am trying to visualize a language that would let you call
>a procedure that returns three values on a stack, then uses two of
>them, does some other functions, and then uses the third.
>When this is nested, the problem gets worse. Then add variable binding, and
>delayed parsing of names.
>

Van's solution to multiple valued functions (in his PreScript language) was to get at individual return values via the dot operator. So, a function can be declared: "foo(x) returns (a, b, c)", and its values accessed via foo(x).a, etc. Also, "bar(foo(x))" is valid if bar takes three arguments. Or, you can say "a=foo(x)", where "a" is an array.

>The values on the stack have no names, and the positions change

They don't have names, but they do have relative offsets from the top of the stack. These values can be accessed via roll, exch, etc.--this turns out to be a good approach since stack manipulations are generally faster than the name to address resolution for named variables.

>with every call. The function of the names could change dynamically.
>

This is a problem. PreScript simply disallows assignment to a function name; this makes perfect sense from a C standpoint. You can, however, assign a function's name to a variable.

>I suppose you could convert a PostScript program to
>pop all values off the stack, and assign names to each value.
>Then all operations would be done with variables.
>
>But that program would be very inefficient.

Yes, this is indeed inefficient. Keeping track of where things are on the stack, on the other hand, works pretty well.

> Bruce G. Barnett <barnett@ge-crd.ARPA> <barnett@steinmetz.UUCP>
> uunet!steinmetz!barnett

Steven McCanne
mccanne@helios.ee.lbl.gov
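
As a rough picture of what addressing values by their offset from the top of the stack looks like in plain PostScript, here is a small sketch. It is not PreScript output; the procedures threevalues, usetwo and useone are hypothetical stand-ins for the "returns three, uses two, then uses the third" case discussed above.

```postscript
%!PS
% Hypothetical procedures, for illustration only.
/threevalues { 10 20 30 } def     % -     threevalues   a b c
/usetwo      { add } def          % b c   usetwo        sum
/useone      { mul } def          % n a   useone        product

threevalues       % 10 20 30      a b c
usetwo            % 10 50         a (b+c)
exch              % 50 10         bring a to the top
useone            % 500
/result exch def  % -

% index copies a value by its offset from the top (0 = top);
% roll rotates the top n values.
1 2 3             % 1 2 3
2 index           % 1 2 3 1       copy of the value two below the top
pop               % 1 2 3
3 1 roll          % 3 1 2         top value rotated down two places
pop pop pop       % -
```

No names are created for a, b or c; the only bookkeeping is knowing, at each token, how far each value sits from the top.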
orr@cs.glasgow.ac.uk (Fraser Orr) (09/07/88)
In article <943@helios.ee.lbl.gov> mccanne@helios.ee.lbl.gov (Steve McCanne) writes:
>In article <940@helios.ee.lbl.gov> forrest@ux1.lbl.gov (Jon Forrest) writes:
>>What about a "compiler" that converts standard postscript into
>>what I'll call infix/function form? In other words, the output would
>
>Van Jacobson thought this was a good idea as well. He developed a language
>called "PreScript" that is a lot like C, and compiles into PostScript.
>I helped with the implementation; it should be out for beta testing very
>soon, if not already. Look for it in comp.sources.misc.
>

I too have written such a program, and used it quite extensively. (I'm using NeWS rather than postscript, but the problems are essentially the same.) I have found it more than a little beneficial. Someone said that they had started doing this but had come to the conclusion that they would be better just hacking postscript. In my experience this is not true. Even if you just hack a yacc parser to give you infix and prefix notation for arithmetic operators, control structs and function calls, the difference in readability and writability is quite amazing. I wrote such a parser in a couple of hours (and saved weeks of work thereby.) If you add things like auto creation of local vars, named params, multiple assignment etc etc ad infinitum then postscript would become a positively appealing language to program in.

(By the way, for those of you who are interested in the merits and demerits of the compiler approach, there is a discussion going on about this very subject in comp.lang.forth)

I look forward with anticipation to the posting of PreScript.

Regards,
==Fraser Orr ( Dept C.S., Univ. Glasgow, Glasgow, G12 8QQ, UK)
UseNet: {uk}!cs.glasgow.ac.uk!orr
JANET: orr@uk.ac.glasgow.cs
ARPANet(preferred xAtlantic): orr%cs.glasgow.ac.uk@nss.cs.ucl.ac.uk
tynor@pyr.gatech.EDU (Steve Tynor) (09/08/88)
I think your suggestion to add parenthesization 'comments' to postscript is a good one. Of course I'm an unabashed Lisp hack...

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
No problem is so formidable that you can't just walk away from it.

Steve Tynor
Georgia Institute of Technology
...{akgua, allegra, amd, harpo, hplabs, ihnp4, masscomp, ut-ngp, rlgvax, sb1, uf-cgrl, unmvax, ut-sally} !gatech!gitpyr!tynor
orr@cs.glasgow.ac.uk (Fraser Orr) (09/08/88)
In article <5361@vdsvax.steinmetz.ge.com> barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) writes:
>I am trying to visualize a language that would let you call
>a procedure that returns three values on a stack, then uses two of
>them, does some other functions, and then uses the third.
>When this is nested, the problem gets worse. Then add variable binding, and
>delayed parsing of names.
>

      (a,b,c) = ReturnsThreeValues ()
      UsesTwoValues (b,c)
      OtherFunction1 (AnotherVariableName)
      OtherFunction2 ()
      UsesOtherValue (a)

>The values on the stack have no names, and the positions change
>with every call. The function of the names could change dynamically.

Ugh!

>I suppose you could convert a PostScript program to
>pop all values off the stack, and assign names to each value.
>Then all operations would be done with variables.

Yes, that's what I do.

>But that program would be very inefficient.

Herein lies the crux of the matter. Firstly, I think it is an exaggeration to say that an operation like name binding will make your program very inefficient. Name lookup is very fast, name binding just as fast. Overall the loss would be very little (this is of course my opinion based on a little practical experience; it would be interesting to see any figures comparing the efficiency of a "typical" PS program using stack manipulations and using names.)

Look at what you gain though. Programs that you can understand, programs that you can debug etc. PS programs are often run only a few times, so the amount of time spent writing the program is a much more significant proportion of the time taken to generate the output than is normal with any other programming language. Hence any improvement that can be made in the programmability of the system is a considerable *overall* efficiency gain. Using names instead of stack machinations is a considerable improvement in programmability, thus a significant overall efficiency gain.
I believe this argument is valid for general programming languages, but is much more so for languages like postscript.

>I believe you should master the language. Having a C front end
>for PostScript is like having a Basic or Pascal front end for C.
>
>Yes, you can do it, but you lose so much.

The programming language is not to be considered a hurdle to communicating your intent to the computer, but a means by which that communication can be made. The programming language should intrude on your thoughts and design as little as possible. In my experience of reading postscript programs this is not the case.

Having a C front end to postscript is not like having a Basic front end to C, but more like having a C front end to 68000 assembler (this isn't meant to be a criticism of PostScript, I'm just trying to say that it is at a much lower level than any human should ever have to deal with.)

As for loss, I don't see that you lose much more than a few seconds per page, run time wise. Time you can easily regain in reduced programming and maintenance time.

> Bruce G. Barnett <barnett@ge-crd.ARPA> <barnett@steinmetz.UUCP>

Regards,
==Fraser Orr ( Dept C.S., Univ. Glasgow, Glasgow, G12 8QQ, UK)
UseNet: {uk}!cs.glasgow.ac.uk!orr
JANET: orr@uk.ac.glasgow.cs
ARPANet(preferred xAtlantic): orr%cs.glasgow.ac.uk@nss.cs.ucl.ac.uk
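
For readers wondering what the named style comes to in raw PostScript, the multiple assignment in the example above can be written by hand roughly as follows. This is a sketch, not the output of Orr's translator: the procedure bodies and the names sum, result and answer are invented for illustration, and the scratch dictionary size is arbitrary.

```postscript
%!PS
% Hypothetical stand-ins for the procedures named in the post.
/ReturnsThreeValues { 2 3 4 } def                     % -    ->  a b c
/UsesTwoValues      { add /sum exch def } def         % b c  ->  -
/UsesOtherValue     { sum mul /result exch def } def  % a    ->  -

10 dict begin                % scratch dictionary holding the names
  ReturnsThreeValues         % 2 3 4
  /c exch def                % values come off in reverse order
  /b exch def
  /a exch def
  b c UsesTwoValues          % sum = 7
  a UsesOtherValue           % result = 14
  result                     % leave the result on the stack
end
/answer exch def             % keep it after the dictionary is gone
```

Each name costs one def and at least one lookup, which is the overhead the efficiency argument in this thread is about.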
greid@ondine.COM (Glenn Reid) (09/11/88)
You people tend to forget that the PostScript language is interpreted.

It is well and good to use tools to convert to and from PostScript, but it is not quite as "transparent" as we all might think. I like to think of a big grandfather clock, with the pendulum swinging. Each time the pendulum swings, the PostScript interpreter gets to do one operation. The "granularity" of the clock is nowhere near the speed of a microprocessor instruction set, and any comparison with assembly languages doesn't make sense.

The difference between:

      0 0 moveto

and

      0 0 /arg2 exch def /arg1 exch def arg1 arg2 moveto

can sort of be measured in "ticks" of the interpreter's clock. It's not quite this simple, since simply pushing a literal is faster than executing a real PostScript operator, but it is a rough rule of thumb. It will take about three times as long to execute the second of these in a tight loop, and about five times as long if it is transmitted and scanned each time. My rule of thumb is that if you have roughly the same number of tokens in your stack approach as you do with your 'exch def' approach, the 'exch def' is likely to be much more readable and better. Otherwise, I usually go with the stack approach.

One other thing of note is that if you have too much stack manipulation going on, it may well be symptomatic of a problem in the original program design.

Also, most procedures don't do any stack manipulation at all; they simply use their arguments directly from the stack. In this situation, it is especially wasteful (and confusing, I think) to declare intermediate variables.

Compare:

      % sample procedure call:
          (Text) 100 100 12 /Times-Roman SETTEXT

      % approach 1:
          /SETTEXT { %def
              findfont exch scalefont setfont moveto show
          } def

      % approach 2:
          /SETTEXT { %def
              /arg5 exch def
              /arg4 exch def
              /arg3 exch def
              /arg2 exch def
              /arg1 exch def
              arg5 findfont arg4 scalefont setfont
              arg2 arg3 moveto arg1 show
          } def

Which of these is easier for you to understand?
Anyway, I think the discussion is a good one, but let's not forget that PostScript is an interpreted language. And I don't think it is terribly hard to use and understand, if it is written well.

Glenn Reid
Adobe Systems
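
Anyone who wants figures for a particular interpreter can time the two moveto variants directly with the standard usertime operator. The harness below is a sketch: the helper timeit, the names stackms and namedms, and the repetition count are all hypothetical choices, and the numbers it prints will differ from one interpreter and printer to the next. It times execution in a loop only, not transmission or scanning.

```postscript
%!PS
/reps 20000 def               % arbitrary repetition count

% proc  timeit  milliseconds
/timeit {
  usertime exch               % t0 proc
  reps exch repeat            % t0
  usertime exch sub           % t1-t0
} def

% Stack version: operands are used straight from the stack.
/stackms { 0 0 moveto } timeit def

% Named version: operands are stored in names first.
/namedms {
  0 0 /arg2 exch def /arg1 exch def arg1 arg2 moveto
} timeit def

(stack version, ms: ) print stackms =
(named version, ms: ) print namedms =
```

Short runs are dominated by timer resolution, so reps may need raising until both figures are comfortably above zero.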
cochran@cadsun.DAB.GE.COM (Craig Cochran) (09/12/88)
In article <1619@crete.cs.glasgow.ac.uk> orr%cs.glasgow.ac.uk@nss.ucl.ac.uk (Fraser Orr) writes:
>In article <5361@vdsvax.steinmetz.ge.com> barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) writes:
>>I am trying to visualize a language that would let you call
>>a procedure that returns three values on a stack, then uses two of
>>them, does some other functions, and then uses the third.
>>When this is nested, the problem gets worse. Then add variable binding, and
>>delayed parsing of names.
>>
>
>(a,b,c) = ReturnsThreeValues ()
>UsesTwoValues (b,c)
>OtherFunction1 (AnotherVariableName)
>OtherFunction2 ()
>UsesOtherValue (a)
>

I think the point Bruce is trying to make here is that high-level languages such as C remove you from the concept of maintaining and operating on a stack. Sure, you can write functions like the ones shown above, but what have you gained? You still need to maintain a knowledge of what's on the stack at any given time. To really implement a high-level front end, you would need to manage the stack for the user, which is what any high-level language does.

So why not write a high-level language that manages the stack for the user and gives the appealing fluency of C? I'll tell you why. PostScript is an interpretive language. That is the chief reason it was developed in postfix notation. Anyone who has used an HP calculator can tell you that postfix is more efficient. Were it a compiled language, then high-level would be the way to go. However, interpretive languages should not be high-level, because of the added burden of maintaining a stack - look how inefficient BASIC is!

The high-level language that could do all this for PostScript would most certainly crawl, since functions would have to maintain variables rather than use the stack - a loss in memory and CPU cycles.

I think that PostScript does an excellent job at what it was designed to do.
It's an interpretive page description language, not a high-level compiled language (which has the luxury of compiling to machine code where stack management is not as costly). It also affords much freedom in programming, allowing graceful stack gymnastics to make programs more efficient.

Just my point of view...

-Craig

--
Craig S. Cochran <cochran@ge-dab.GE.COM>    General Electric Company
UUCP: ...!mcnc!ge-rtp!ge-dab!cochran        1800 Volusia Ave, Rm 4112
Phone: (904) 239-3124                       Daytona Beach, FL 32015
mccanne@hot.ee.lbl.gov (Steve McCanne) (09/13/88)
In article <1351@ge-dab.GE.COM> cochran@ge-dab.GE.COM (Craig Cochran) writes:
...
>The high-level language that could do all this for PostScript would
>most certainly crawl, since functions would have to maintain variables
>rather than use the stack - a loss in memory and CPU cycles.

There has been a lot of talk about a high-level language (to PostScript compiler) being forced to use variable names rather than stack locations for storage management. However, there's nothing wrong with the conventional approach of allocating stack space for local variables (and function arguments). The PreScript compiler, in fact, does just this.

Here's the function example Glenn Reid gave:

>% sample procedure call:
>    (Text) 100 100 12 /Times-Roman SETTEXT
>
>% approach 1:
>    /SETTEXT { %def
>        findfont exch scalefont setfont moveto show
>    } def
>% approach 2:
>    /SETTEXT { %def
>        /arg5 exch def
>        /arg4 exch def
>        /arg3 exch def
>        /arg2 exch def
>        /arg1 exch def
>        arg5 findfont arg4 scalefont setfont
>        arg2 arg3 moveto arg1 show
>    } def
>

A compiler that kept track of stack locations could produce the code in approach 1.

>Which of these is easier for you to understand?

It doesn't matter which *result* is easier to read; it matters which *source* file is easier to read. (Who would compare assembler output for readability in register compilers?) Specifically, the PreScript source would look (almost) like:

      function SETTEXT(string s; int x, y, scale_factor; name f)
      {
          setfont(scalefont(findfont(f), scale_factor));
          moveto(x, y);
          show(s);
      }

Steve McCanne
mccanne@helios.ee.lbl.gov
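
The bookkeeping such a compiler would have to do can be followed by hand on approach 1. In the sketch below the comments track where each parameter sits after every token, using the parameter names from the PreScript source above (scale_factor is shortened to scale). How the PreScript compiler actually records these positions is not described in the thread, so the comments are an assumption about the general idea and not a description of its implementation.

```postscript
%!PS
% (s) x y scale_factor /f  SETTEXT  -
/SETTEXT {
                  % s x y scale f      arguments as pushed by the caller
  findfont        % s x y scale font   f is on top, so no shuffle is needed
  exch            % s x y font scale   scalefont wants the font below the scale
  scalefont       % s x y font'
  setfont         % s x y
  moveto          % s                  x and y are already in order
  show            % -
} def

(Text) 100 100 12 /Times-Roman SETTEXT
showpage
```

Only one exch is needed because the caller happens to push the arguments in nearly the order the operators consume them; a less convenient order would call for index or roll at the corresponding points.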