bradlee@cg-atla.UUCP (Rob Bradlee) (07/26/89)
Has anyone seen or created their own grammar to describe PostScript? How about using yacc and lex to parse PostScript (including comments)? I'm having a go at it, and would love to hear anyone else's ideas or efforts. Thanks in advance.

--
Rob Bradlee  w:(508)-658-5600 X5153  h:(617)-944-5595
AGFA Compugraphic Division.  ...!{ima,ulowell,ism780c}!cg-atla!bradlee
200 Ballardvale St.
Wilmington, Mass. 01887  The Nordic Way: Ski till it hurts!
batcheldern@level.dec.com (Ned Batchelder) (07/27/89)
In article <7456@cg-atla.UUCP>, bradlee@cg-atla.UUCP (Rob Bradlee) writes:
> Has anyone seen or created their own grammar to describe PostScript?
> How about using yacc and lex to parse PostScript (including comments)?
> I'm having a go at it, and would love to hear anyone else's ideas or
> efforts. Thanks in advance.

You could easily use lex to tokenize PostScript, but I don't think a grammar makes much sense. PostScript is purely token-oriented; after tokens, there really isn't much else for structure. For example, this is a common construct:

    foo bar gt { this do } { that do } ifelse

but in fact, there is no rule that it has to be done this way. I could have said:

    { this do } { that do } foo bar gt 3 1 roll ifelse

or,

    /baz load /quux load lic % (Long Involved Computation)
    foo bar gt 3 1 roll { ifelse } stopped pop

or any number of other bizarre things. So long as there is a boolean and two executables on the stack when the ifelse is executed, it's legal. So to determine if you had valid PostScript "syntax", you would have to interpret the tokens, not just clump them together into a grammar.

And by the way, even at the token level, PostScript can get kind of tricky, since the PostScript program can take over the reading of the input from the interpreter. Check out any large image file:

    set_up_the_image foo bar baz image
    12d480c7b61a2c3e9f8d7c6a2f1e39d8c723b1987a58745
    % lots more lines of hex stuff...
    120d7a23498762d34a98f7e2c34b6978e62f3d498a76234
    showpage

Even to know that the hex should not be tokenized would require complex interpretation of the PostScript.

Ned Batchelder, Digital Equipment Corp., BatchelderN@Hannah.DEC.com
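
To make the ifelse point above concrete, here is a rough sketch in Python rather than lex. It is not a real PostScript scanner or interpreter: the function names (tokenize, parse, run) and the bindings for foo and bar are made up for illustration, and it handles only comments, braces, integers, names, and the operators gt, roll, and ifelse. Strings, hex strings, and everything else are left out. Unknown names such as "this" and "do" are simply logged in the order they would be executed.

```python
import re

# Comments, braces, and (optionally /-prefixed) names; a deliberately tiny subset.
TOKEN_RE = re.compile(r"%[^\n]*|[{}]|/?[^\s{}%/]+")

def tokenize(text):
    """Split text into tokens, dropping comments (the easy, lex-like part)."""
    return [t for t in TOKEN_RE.findall(text) if not t.startswith("%")]

def parse(tokens):
    """Group { ... } into nested lists; integer tokens become ints."""
    stack = [[]]
    for tok in tokens:
        if tok == "{":
            stack.append([])
        elif tok == "}":
            proc = stack.pop()
            stack[-1].append(proc)
        else:
            try:
                stack[-1].append(int(tok))
            except ValueError:
                stack[-1].append(tok)
    return stack[0]

def run(program, names):
    """Execute a parsed program. `names` maps names to ints (hypothetical bindings).

    Returns the unknown names executed, standing in for their side effects.
    """
    ops = []   # operand stack
    log = []

    def execute(items):
        for item in items:
            if isinstance(item, (int, list)):
                ops.append(item)          # numbers and procedures are pushed, not run
            elif item in names:
                ops.append(names[item])
            elif item == "gt":
                b, a = ops.pop(), ops.pop()
                ops.append(a > b)
            elif item == "roll":
                j, n = ops.pop(), ops.pop()
                if n:
                    j %= n
                    top = ops[-n:]
                    ops[-n:] = top[-j:] + top[:-j]
            elif item == "ifelse":
                no, yes, cond = ops.pop(), ops.pop(), ops.pop()
                if not (isinstance(cond, bool) and isinstance(yes, list)
                        and isinstance(no, list)):
                    raise TypeError("typecheck in ifelse")
                execute(yes if cond else no)
            else:
                log.append(item)

    execute(program)
    return log

names = {"foo": 5, "bar": 3}  # hypothetical values, chosen so foo > bar
for src in ("foo bar gt { this do } { that do } ifelse",
            "{ this do } { that do } foo bar gt 3 1 roll ifelse"):
    print(run(parse(tokenize(src)), names))
```

Both orderings print ['this', 'do'], even though their token sequences look nothing alike. A sequence that leaves the wrong things on the stack, such as "foo bar { this do } { that do } ifelse", only fails once ifelse actually runs. The same tokenizer would also split the hex lines of the image example into name-like tokens, because nothing at the token level says that image is about to read them as data.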