[net.lang] compiling nested conditionals

debray@sbcs.UUCP (Saumya Debray) (06/15/85)

This is a query about the efficient compilation of nested conditionals.

In general, code generated naively for nested conditionals will contain
chains of go-to's.  Of course, these chains can subsequently be collapsed
during peephole optimization, but this requires traversing the goto
chains, which takes O(n) time (assuming that an arbitrary instruction can
be accessed in O(1) time).

In a Prolog compiler I'm writing, the problem is aggravated by the fact
that I don't have adequate random-access structures available.  I'm forced
to used linked lists, which have O(n) access time, so that collapsing the
goto-chains becomes O(n^2), which is unacceptable.

My solution has been to generate the label of the ultimate branch target
and pass it down during syntax-directed translation, so that goto-chains
are never generated.  I use this to compile Prolog disjunctions (';') and
if_then_else's, and it turns out I can generate optimal branching code
in O(1) time (no linear overhead for traversing goto-chains).  My question
is: has this - or any similar method - been used in any other language
compiler that anyone knows of?  References will be greatly appreciated.
A brief description of my algorithm follows (I assume the compilation of
AND-OR sentences, as in Prolog):

   The idea is the following: execution branches need come together only
   if the goals are of the form (c1 OR c2) AND c3.  In this case, when we
   see the AND, we generate a label, which is the label of the place where
   the execution paths should meet.  This is then passed into the routines
   that process the disjunction, as a parameter "meet(Label)".

   The decision on when to actually emit the "meet" label is decided by
   passing around a parameter, Depth.  This can take a value of 0 or 1.
   A depth value of 0 indicates an "outer disjunction", i.e. a goal of
   the form ( (c1 OR c2) AND c3 ).  A depth value of 1 indicates an "inner
   disjunction", e.g. the inner OR in the case ((c1 OR (c2 OR c3)) AND c4).
   This information is used to determine when to emit the label
   corresponding to the "meet" label: this is emitted if and only if
   (i) a meet exists, i.e. is nonvariable, and (ii) the depth is 0.  If
   these conditions are met, the meet label is generated and the depth set
   to 1 so that duplicate labels are not produced.  When compiling a goal
   ((c1 OR c2) AND c3), a new meet is generated and passed into the
   disjunction (c1 OR c2), together with a depth of 0, while the meet and
   depth passed into c3 are those passed down from above by the calling
   routine.
-- 
Saumya Debray
SUNY at Stony Brook

	uucp: {allegra, hocsd, philabs, ogcvax} !sbcs!debray
	arpa: debray%suny-sb.csnet@csnet-relay.arpa
	CSNet: debray@sbcs.csnet

johnl@ima.UUCP (06/18/85)

> My solution has been to generate the label of the ultimate branch target
> and pass it down during syntax-directed translation, so that goto-chains
> are never generated.

Both the Ritchie and PCC C compilers and their f77 and Pascal 
cousins use this approach for compliling complicated conditional 
expressions.  As noted, it works very well.  It's particularly 
useful for things like Ratfor output, since Ratfor tends to translate this:

    if (expression) stuff

into this:

    if(.not.(expression)) goto 12345
    stuff
12345 continue

Even if the conditional expression is pretty complicated, the compilers
unscramble them well enough that there is no evidence in the runtime code
that the scrambling happened.

John Levine, ima!johnl