bpendlet@esunix.UUCP (02/19/87)
The trouble with most pretty printers is that they try to get too
fancy. There is a fairly simple way to write a "programmable" pretty
printer. The trick is to use the keywords and punctuation symbols of
the language to trigger a set of actions in the pretty printer.

First you write a token scanner for the language you want to pretty
print. The scanner should return the actual string that was scanned
and a classification code every time it is called. You need the string
because you will need to write it out again, and you might want to do
something like convert it to upper or lower case. You need the code to
control an N-way branch. I would use a CASE statement, but you might
prefer a vector of pointers to procedures, or even a computed goto.
What the heck, any old N-way branch will do.

Next you decide on a set of operations that you want the pretty
printer to be able to perform. Things like: convert the string to
upper case, advance the output file to the next line, indent to the
current indenting level, increment the current indenting level,
decrement the current indenting level, push the current indenting
level on a stack, restore the indenting level from a stack, and write
the current token to the output file. As you play around with your
pretty printer you will think of several other operations.

Once you've picked the operations, write a procedure that implements
each one. Now for each entry in your N-way branch, that is, for each
keyword, punctuation symbol, and whatever else is in the language,
write a sequence of calls to your operation procedures. For instance,
the entry for "IF" might be something like:

    NextLine;
    Indent;
    WriteToken;
    IncrementIndent( 2 );

The entry for "END" (I'm using Modula-2 for these examples) might be:

    NextLine;
    DecrementIndent( 2 );
    Indent;
    WriteToken;

The pretty printer is simple, easy to change, "programmable" (change
an entry, recompile, and link), and can be written over the average
weekend.
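
The scheme above can be sketched in a few dozen lines. What follows is only an illustration, written in Python rather than Modula-2 so that it can be run as-is. The token classes, the method names, the entries for "THEN" and ";", and the `write_tight` operation are assumptions made for this sketch; only the "IF" and "END" entries follow the sequences given in the post. The N-way branch here is a table of operation names looked up by classification code.

```python
import re

KEYWORDS = {"IF", "THEN", "END"}
# Identifiers, numbers, ":=", or any other single non-blank character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|:=|\S")

def scan(text):
    """Yield (string, code) pairs: the scanned text and its classification."""
    for match in TOKEN.finditer(text):
        s = match.group()
        if s.upper() in KEYWORDS:
            yield s, s.upper()
        elif s == ";":
            yield s, ";"
        else:
            yield s, "OTHER"

class PrettyPrinter:
    def __init__(self):
        self.lines = []     # completed output lines
        self.current = ""   # line under construction
        self.level = 0      # current indenting level, in columns
        self.token = ""     # string of the token being processed

    def upper_case(self):
        self.token = self.token.upper()

    def next_line(self):
        # Emit the pending line unless it holds nothing but indentation.
        if self.current.strip():
            self.lines.append(self.current.rstrip())
        self.current = ""

    def indent(self):
        self.current = " " * self.level

    def write_token(self):
        # Separate tokens with one space, except at the start of a line.
        if self.current.strip():
            self.current += " "
        self.current += self.token

    def write_tight(self):
        # Hypothetical extra operation: write with no leading space.
        self.current += self.token

    def increment_indent(self, n=2):
        self.level += n

    def decrement_indent(self, n=2):
        self.level = max(0, self.level - n)

# One entry per classification code: a sequence of operation names.
ACTIONS = {
    "IF":    ["upper_case", "next_line", "indent", "write_token",
              "increment_indent"],
    "THEN":  ["upper_case", "write_token", "next_line", "indent"],
    "END":   ["upper_case", "next_line", "decrement_indent", "indent",
              "write_token"],
    ";":     ["write_tight", "next_line", "indent"],
    "OTHER": ["write_token"],
}

def pretty(text):
    printer = PrettyPrinter()
    for string, code in scan(text):
        printer.token = string
        for name in ACTIONS[code]:
            getattr(printer, name)()
    printer.next_line()   # flush the last line
    return "\n".join(printer.lines)

print(pretty("if x < 0 then x := 0; y := 1 end;"))
```

Changing the layout means editing an entry in `ACTIONS`, which is the "programmable" part of the idea.
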
You also don't need to write a parser for the input language or to
read, parse, ... a configuration file, and it runs pretty fast.

I've used this technique, and I've used pretty printers written by
other people that were based on it. It works better than you might
think.

                Bob Pendleton

P.S. I don't know the origin of this technique. I learned it from an
old programmer at UNIVAC in 1979 or '80; he said he learned it from
someone else several years earlier.

--
---
Bob Pendleton
Evans & Sutherland Computer Corporation

UUCP Address:  {decvax,ucbvax,ihnp4,allegra}!decwrl!esunix!bpendlet
Alternate:     {ihnp4,seismo}!utah-cs!utah-gr!uplherc!esunix!bpendlet

hacking code, hacking code.
sometimes happy, sometimes bored,
almost lost in a pile of spode.
tell the people "I'm only hacking code."
---
I am solely responsible for what I say.
---
msb@sq.UUCP (02/24/87)
Along about now it should be pointed out, again, that there's no way
for any prettyprinter to distinguish in general between:

    if (cond1)                  if (cond1)
        normalaction1               normalaction
    else if (cond2)             else
        normalaction2               if (cond2)
    else                                erroraction1
        normalaction3               else
                                        erroraction2

Mark Brader
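
A small check of why the two layouts cannot be told apart, written in Python as a sketch only. The scanner regex is an assumption, and the action names are deliberately made identical in both versions (`action1` to `action3`, hypothetical names) so that only the layout differs. Once whitespace is discarded, both versions give the same token sequence, so the choice between them has to come from the programmer's intent.

```python
import re

def tokens(source):
    """Split source into identifiers and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\S", source)

# Written as a three-way chain of alternatives.
CHAIN = """
if (cond1)
    action1
else if (cond2)
    action2
else
    action3
"""

# Written as one action versus a nested two-way choice.
NESTED = """
if (cond1)
    action1
else
    if (cond2)
        action2
    else
        action3
"""

# Layout is the only difference, and the scanner throws layout away.
print(tokens(CHAIN) == tokens(NESTED))
```
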
mouse@mcgill-vision.UUCP (02/28/87)
In article <366@esunix.UUCP>, bpendlet@esunix.UUCP (Bob Pendleton) writes:
> The trouble with most pretty printers is that they try to get too
> fancy. There is a fairly simple way to write a "programmable" pretty
> printer. The trick is to use the keywords and punctuation symbols of
> the language to trigger a set of actions in the pretty printer.
> [basically, scan tokens and do things as you get them]
> For instance the entry for "IF" might be something like.
> NextLine;
> Indent;
> WriteToken;
> IncrementIndent( 2 );

Except that this technique loses comments and formatting in the input.
Doing the right thing with comments can be difficult. Doing the right
thing when breaking expressions over multiple lines can be very
difficult and in some cases impossible (e.g., "len1+len2 + extra+1"
versus "len1 + len2+extra + 1": how do you tell which + signs should
have spaces around them? It depends on how they group conceptually).

                                        der Mouse

USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!musocs!mcgill-vision!mouse
     think!mosart!mcgill-vision!mouse
Europe: mcvax!decvax!utcsri!musocs!mcgill-vision!mouse
ARPAnet: think!mosart!mcgill-vision!mouse@harvard.harvard.edu
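
The spacing problem can be made concrete with a short Python sketch. The `scan` and `respace` helpers are hypothetical, and the uniform rule of one space between tokens is an assumption. Both spellings of the expression scan to the same tokens, so a token-driven printer emits the same line for each and the author's grouping is gone.

```python
import re

def scan(expr):
    """Return the tokens of expr; whitespace is discarded."""
    return re.findall(r"\w+|\S", expr)

def respace(expr):
    """Rewrite expr with a single space between every pair of tokens."""
    return " ".join(scan(expr))

grouped_a = "len1+len2 + extra+1"     # reads as (len1+len2) + (extra+1)
grouped_b = "len1 + len2+extra + 1"   # reads as len1 + (len2+extra) + 1

print(respace(grouped_a))
print(respace(grouped_b))   # same output as the line above
```
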