bpendlet@esunix.UUCP (02/19/87)
[]
The trouble with most pretty printers is that they try to get too
fancy. There is a fairly simple way to write a "programmable" pretty
printer. The trick is to use the keywords and punctuation symbols of
the language to trigger a set of actions in the pretty printer.
First you write a token scanner for the language you want to pretty
print. The scanner should return the actual string that was scanned
and a classification code every time it is called. You need the
string because you will need to write it out again, and you might
want to do something like convert it to upper or lower case. You need
the code to control an N-way branch. I would use a CASE statement,
but you might prefer a vector of pointers to procedures, or even a
computed goto, what the heck, any old N-way branch will do.
Next you decide on a set of operations that you want the pretty
printer to be able to perform. Things like convert the string to
upper case, advance the output file to the next line, indent to the
current indenting level, increment the current indenting level,
decrement the current indenting level, push the current indenting
level on a stack, restore the indenting level from a stack, and write
the current token to the output file. As you play around with your
pretty printer you will think of several other operations.
So once you've picked the operations write a procedure that
implements each operation. Now for each entry in your N-way branch,
that is, for each keyword, punctuation symbol, and whatever in the
language, write a sequence of calls to your operation procedures. For
instance the entry for "IF" might be something like.
NextLine;
Indent;
WriteToken;
IncrementIndent( 2 );
The entry for "END" ( I'm using Modula-2 for these examples ) might be
NextLine;
DecrementIndent( 2 );
Indent;
WriteToken;
The pretty printer is simple, easy to change, "programmable" ( change
an entry, recompile, and link ), and can be written over the average
weekend. Not to mention not needing to write a parser for the input
language, not needing to read, parse, ... a configuration file, and
it runs pretty fast.
I've used this technique and I've used pretty printers written by
other people that were based on this technique, it works better than
you might think.
Bob Pendleton
P.S.
I don't know the origin of this technique. I learned it from an old
programmer at UNIVAC in 1979 or '80, he said he learned it from someone else,
several years earlier.
--
---
Bob Pendleton
Evans & Sutherland Computer Corporation
UUCP Address: {decvax,ucbvax,ihnp4,allegra}!decwrl!esunix!bpendlet
Alternate: {ihnp4,seismo}!utah-cs!utah-gr!uplherc!esunix!bpendlet
hacking code, hacking code.
sometimes happy, sometimes bored,
almost lost in a pile of spode.
tell the people "I'm only hacking code."
---
I am solely responsible for what I say.
---msb@sq.UUCP (02/24/87)
Along about now it should be pointed out, again, that there's no way for any prettyprinter to distinguish in general between: if (cond1) if (cond1) normalaction1 normalaction else if (cond2) else normalaction2 if (cond2) else erroraction1 normalaction3 else erroraction2 Mark Brader
mouse@mcgill-vision.UUCP (02/28/87)
In article <366@esunix.UUCP>, bpendlet@esunix.UUCP (Bob Pendleton) writes: > The trouble with most pretty printers is that they try to get too > fancy. There is a fairly simple way to write a "programmable" pretty > printer. The trick is to use the keywords and punctuation symbols of > the language to trigger a set of actions in the pretty printer. > [basically, scan tokens and do things as you get them] > For instance the entry for "IF" might be something like. > NextLine; > Indent; > WriteToken; > IncrementIndent( 2 ); Except that this technique loses comments and formatting in the input. Doing the right thing with comments can be difficult. Doing the right thing when breaking expressions over multiple lines can be very difficult and in some cases impossible (eg, "len1+len2 + extra+1" versus "len1 + len2+extra + 1" - how do you tell which + signs should have spaces around them? It depends on how they group conceptually.). der Mouse USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!musocs!mcgill-vision!mouse think!mosart!mcgill-vision!mouse Europe: mcvax!decvax!utcsri!musocs!mcgill-vision!mouse ARPAnet: think!mosart!mcgill-vision!mouse@harvard.harvard.edu