[comp.lang.c] Translating Pascal ==> C: nested procedures

rang@cpsin3.cps.msu.edu (Anton Rang) (06/04/89)

I'm curious.  This is not intended to start a flame war, by the way
(which is why it's not X-posted to comp.lang.pascal :-).  I've heard
of several "Pascal to C" translators.  How do they handle nested
procedures?  For instance, suppose I have:

		procedure insert_in_symtab(what : node);
		  function conflict(n1, n2 : node) : boolean;
		  ...
		  function check_if_full : boolean;
		  ...
		...

		procedure insert_in_strtab(what : string);
		  function check_if_full : boolean;
		    { this code uses "what" }
		  ...
		...

Do they just rename the nested procedures to unique names and make
them "static"?  If so, how do they handle accesses to variables which
are declared in enclosing blocks?
  (If you don't think this is interesting to C people, please E-mail
responses to me.)

  Thanks,

	Anton

+---------------------------+------------------------+
| Anton Rang (grad student) | "VMS Forever!"         |
| Michigan State University | rang@cpswh.cps.msu.edu |
+---------------------------+------------------------+

chris@mimsy.UUCP (Chris Torek) (06/05/89)

In article <3276@cps3xx.UUCP> rang@cpsin3.cps.msu.edu (Anton Rang) writes:
>... I've heard of several "Pascal to C" translators.  How do they handle
>nested procedures?

Chances are that they do not.

>For instance, suppose I have:
>
>		procedure insert_in_symtab(what : node);

[Underscores are not legal in Pascal; perhaps you mean

		procedure insertinsymtab(what : node);

:-) (actually, it should be all uppercase as well, but that goes *too* far)]

>		  function conflict(n1, n2 : node) : boolean;
>		  ...
>		  function check_if_full : boolean;
>		  ...
>		...
>
>		procedure insert_in_strtab(what : string);
>		  function check_if_full : boolean;
>		    { this code uses "what" }
>		  ...
>		...
>
>Do they just rename the nested procedures to unique names and make
>them "static"?  If so, how do they handle accesses to variables which
>are declared in enclosing blocks?

The most efficient way to handle this is generally to expand the argument
lists to intermediate routines as necessary.  For instance, one could
change

	procedure addstr(what : string);
	    function isfull : boolean;
	    begin isfull := false ... end;
	begin ... if isfull then ... end

to

	int addstr_isfull(string_t *what) {	/* pANS syntax */
		int ret;
		ret = 0; ...
		return ret;
	}

	void addstr(string_t what) {
		...
		if (addstr_isfull(&what)) ...
	}

This gets a bit unwieldy if variables must be passed through several
intermediate procedures or functions:

	procedure foo;
	var a, b, c, d, e, f : int;
	    procedure bar;
		procedure baz;
		    procedure raz;
		    begin ... a := b; c := d; e := f ... end;
		begin ... end; { without using a, b, c, d, e, f }
	    begin ... end;
	begin ... end

which becomes something like

	void foo_bar_baz_raz(a, b, c, d, e, f)
		int *a, *b, *c, *d, *e, *f;	/* K&R 1 syntax */
	{
		...
		*a = *b;
		*c = *d;
		*e = *f;
		...
	}

	void foo_bar_baz(a, b, c, d, e, f)
		int *a, *b, *c, *d, *e, *f;
	{
		...	/* a, b, c, d, e, f unused except to pass to raz() */
	}

	void foo_bar(a, b, c, d, e, f)
		/* and so on */

In practise, however, such sequences are rare.  (Just how rare I cannot
say, but those who have done the analysis prefer static links over
displays, and this is the sort of case where static links might be
slower.)
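
For comparison, here is a rough sketch of the static-link alternative for
the foo/bar/baz/raz case: each lifted routine takes a single pointer to
its enclosing routine's frame rather than six separate pointers.  The
frame struct names and the initial values assigned in foo are invented
for illustration; no particular translator is claimed to emit this.

```c
/* Hypothetical frame records.  Only foo has variables of its own;
 * bar and baz just carry the link to the enclosing frame. */
struct foo_frame { int a, b, c, d, e, f; };
struct bar_frame { struct foo_frame *up; };
struct baz_frame { struct bar_frame *up; };

/* raz is nested in baz, so its static link is baz's frame. */
static void foo_bar_baz_raz(struct baz_frame *link)
{
	struct foo_frame *fp = link->up->up;	/* two hops up to foo */

	fp->a = fp->b;
	fp->c = fp->d;
	fp->e = fp->f;
}

static void foo_bar_baz(struct bar_frame *link)
{
	struct baz_frame frame;

	frame.up = link;
	foo_bar_baz_raz(&frame);
}

static void foo_bar(struct foo_frame *link)
{
	struct bar_frame frame;

	frame.up = link;
	foo_bar_baz(&frame);
}

int foo(void)
{
	struct foo_frame frame;

	/* made-up values so the effect of raz is visible */
	frame.a = 0; frame.b = 1;
	frame.c = 0; frame.d = 2;
	frame.e = 0; frame.f = 3;
	foo_bar(&frame);
	return frame.a + frame.c + frame.e;
}
```

With the values above, foo() returns 6.  The intermediate routines pass
one pointer each, but raz pays two extra indirections to reach foo's
variables, which is the trade-off mentioned above.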

If the Pascal code uses procedure pointers, translation gets harder.
The most straightforward approach is to pass static links about.  (At
this point you are no longer doing a `translation'; you are compiling,
using C as an assembler.)
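
A minimal sketch of what "passing static links about" might look like
for procedural values, assuming the addstr/isfull example from earlier.
The names addstr_frame, closure and call_closure, and the capacity
variable, are hypothetical and exist only to make the example complete.

```c
#include <string.h>

/* Hypothetical frame record: the variables of addstr that a nested
 * routine needs to see. */
struct addstr_frame {
	const char *what;	/* parameter of addstr */
	int capacity;		/* invented local, for the example only */
};

/* A Pascal procedural value becomes a code pointer plus a static link. */
struct closure {
	int (*code)(void *static_link);
	void *static_link;
};

/* Nested function isfull, lifted to file scope.  It reaches the
 * enclosing routine's variables through the static link. */
static int addstr_isfull(void *static_link)
{
	struct addstr_frame *up = static_link;

	return (int)strlen(up->what) >= up->capacity;
}

/* A routine that receives a procedural parameter calls it this way,
 * without knowing which frame the link refers to. */
static int call_closure(struct closure c)
{
	return c.code(c.static_link);
}

int addstr(const char *what, int capacity)
{
	struct addstr_frame frame;
	struct closure isfull;

	frame.what = what;
	frame.capacity = capacity;
	isfull.code = addstr_isfull;
	isfull.static_link = &frame;

	return call_closure(isfull);
}
```

Here addstr("abc", 8) yields 0 and addstr("abcdefgh", 8) yields 1.  The
closure is only valid while addstr's frame is live, which matches the
Pascal rule that a procedural parameter cannot outlive its enclosing
activation.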
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

djones@megatest.UUCP (Dave Jones) (06/06/89)

> In article <3276@cps3xx.UUCP> rang@cpsin3.cps.msu.edu (Anton Rang) writes:
>>... I've heard of several "Pascal to C" translators.  How do they handle
>>nested procedures?
> 

I've never seen one of these Pascal to C translators, but I can offer a
suggestion as to how to go about it.  In fact, it's the only reasonable way to
do it that I can think of right off, without doing some pretty fancy
calculating in the compiler ( -- Oops. "Translator").

For a procedure whose locals are referenced by a nested procedure, rather than
passing single parameters to it, pass a reference to a struct which contains
all the parameters. Likewise, declare all its locals in a struct. Maintain two
"display-pointers" for each lexical level, one pointing to the current
parameters at that level, one to the current locals at that level. (If after
finishing this article, you wonder why two pointers are needed for each level,
rather than just one pointing to a record containing both parameters and locals,
think about the "forward" directive and a "one pass" compiler.)

The C code won't be very readable to us mere humans, but if you keep the Pascal
source, and only use the C as an intermediate, that's okay. If you want to
convert to C as source, well...  I never liked nested scope anyway. Use the
generated C as a hint to convert the routines so that they pass pointers to
structures, rather than using scope.

Example...

program foo(input, output);
  var global: integer;

  procedure one(var x: integer; y: integer);
     var z: integer;

     procedure two;
     begin 
       global := 5;
       x := 3;
       z := 4;
     end;

  begin
    x := 2;
    two;
  end;

begin
  global := 1;
  one(global, global);
end.

There follows a C equivalent. (Hope I don't mess this up. I just winged it
to give you the idea. I also omitted initialization code for the files input
and output, etc.) Some of the code which handles the display could be
optimized out, but we want it here for instruction.

#define MAX_LEVEL 16 /* or whatever */
struct {
  void* params;
  void* locals;
} _display[MAX_LEVEL];

main(argc, argv)
  char** argv;
{
/*  _init(argc, argv); */
  foo();
}

/* Structure definitions for parameter and local
 * variable "activations".
 */

typedef struct { int global; } Var_struct_foo;
typedef struct { int *x; int y; } Parm_struct_one;
typedef struct { int z; } Var_struct_one;

foo_one_two()
{
  ((Var_struct_foo*)(_display[1].locals))->global = 5;
  *(((Parm_struct_one*)(_display[2].params))->x) = 3;
  ((Var_struct_one*)(_display[2].locals))->z = 4;
}

foo_one(params)
  register Parm_struct_one *params;
{
  void* save_params = _display[2].params;
  void* save_locals = _display[2].locals;
  Var_struct_one Var_one;

  _display[2].locals = (void*)(&Var_one);
  _display[2].params = (void*)(params);

  *(params->x) = 2;
  foo_one_two();

end:
  _display[2].locals = save_locals;
  _display[2].params = save_params;
}

foo()
{
  Var_struct_foo Var_foo;
  _display[1].locals = (void*) &Var_foo;

  Var_foo.global = 1;

  { Parm_struct_one Parm_one;
    Parm_one.x = &Var_foo.global;
    Parm_one.y = Var_foo.global;
    foo_one(&Parm_one);
  }

}
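
For the hand conversion suggested earlier (passing pointers to
structures rather than relying on scope), the same program might end up
looking something like the sketch below.  The struct name one_vars and
the wrapper run_foo are made up for the example, and returning the final
value of global is only there so the result can be checked.

```c
/* Hypothetical record holding what procedure "one" shares with "two". */
struct one_vars {
	int *x;		/* var parameter: points at the caller's variable */
	int y;		/* value parameter */
	int z;		/* local of "one" */
};

static int global;	/* the program-level Pascal variable */

/* "two" no longer relies on scope; it gets its context explicitly. */
static void two(struct one_vars *up)
{
	global = 5;
	*up->x = 3;
	up->z = 4;
}

static void one(int *x, int y)
{
	struct one_vars v;

	v.x = x;
	v.y = y;
	*v.x = 2;
	two(&v);
}

/* Stands in for the body of program foo. */
int run_foo(void)
{
	global = 1;
	one(&global, global);
	return global;
}
```

Since x aliases global in this call, the assignment through x in two
happens after global := 5, and run_foo() returns 3.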