stt@inmet.inmet.com (12/27/89)
With regard to instantiations inside a procedure, with an actual subprogram parameter being a local procedure as well...

The Ada Reference Manual very properly avoids descriptions of implementation approach, since there are many ways to implement any feature, with tradeoffs of compile-time complexity versus run-time speed, space versus time, etc. Nevertheless, I can describe the strategy employed by various Ada compilers with which I am familiar:

The Intermetrics compiler passes a "static link" when calling a nested subprogram. The static link is passed in a particular register (depends on the target, obviously), and points to the stack frame for the lexically enclosing subprogram. The static link may be used by the called procedure to gain access to the enclosing subprogram's local variables and parameters, as well as its static link pointing to the next enclosing subprogram (if any). I am sure other compilers use this method as well. A possible optimization is to determine whether the called subprogram makes any use of the static link (only possible if it is not a separately compiled subunit), and suppress passing the link if appropriate.

The Verdix compiler typically uses a "local display" for handling nested subprograms. Each subprogram maintains a table of pointers to lexically enclosing stack frames. The table is at a known offset within the stack frame, and the nested subprogram builds its own by copying its caller's local display and augmenting it with a pointer to its own stack frame. This moves the run-time cost to the start of a subprogram with nesting, rather than at the call point and at the point of up-level references. I think some versions of the Verdix compiler also support static links.

Another dimension of difference between compilers has to do with the implementation of generics. For the Intermetrics compiler, generics are implemented strictly as a macro expansion.
The body of the generic is expanded at the point of the instantiation, with the actuals substituted for the uses of the formals. Certain other compilers will share the code between generic instantiations, though generally only under restricted circumstances where the instantiations are similar "enough." Both the newer versions of the Verdix compiler and the DEC VAX compiler provide some support for "generic sharing." The Rational compiler shares generics universally, as far as I know, taking advantage of their descriptor-oriented/object-oriented hardware architecture.

When generics are shared, there is generally an instantiation descriptor created as part of the instantiation, which can involve some amount of overhead both at its creation and at each use of the actual generic parameters (since they are fetched from the instantiation descriptor rather than being "inlined" in the macro expansion).

Another source of overhead associated with nested subprograms involves the access-before-elaboration check. Generally, each subprogram spec will have an associated elaboration bit which will be cleared when the spec is encountered, and set when the body is encountered. This bit will be checked at each call to determine whether the body is being referenced before it has been elaborated. If there is no separate spec for the subprogram then the elaboration check can be eliminated (though some compilers still do it for simplicity/uniformity's sake). Even if there is a separate spec, if there is no "interesting" code between the spec and the body, the check can still be eliminated. Your nested subprogram didn't have a separate spec so there should be no need for any elaboration-check overhead. The global subprogram would only need an elaboration-check if the spec were separately compiled.

Furthermore, there is an elaboration check associated with a generic instantiation. This may or may not be present, again depending on separate compilation and compiler optimization details.

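To make the elaboration bit concrete, here is a rough sketch in C of the sort of logic a compiler might generate around a call. It is not taken from any of the compilers mentioned above. The names helper_elaborated, helper_body, elaborate_helper_body, call_helper_checked and call_helper_unchecked are all made up for illustration, and the program_error flag merely stands in for raising Program_Error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; a real compiler would emit this inline. */

/* One elaboration bit per subprogram spec: cleared when the spec is seen. */
static bool helper_elaborated = false;

/* Stand-in for the compiled subprogram body. */
int helper_body(int x)
{
    return x + 1;
}

/* Elaborating the body sets the bit. */
void elaborate_helper_body(void)
{
    helper_elaborated = true;
}

/*
 * What a checked call might expand to. Ada would raise Program_Error on
 * access before elaboration; this sketch reports it through a flag instead.
 */
int call_helper_checked(int x, bool *program_error)
{
    assert(program_error != NULL);
    if (!helper_elaborated) {
        *program_error = true;
        return 0;
    }
    *program_error = false;
    return helper_body(x);
}

/* With the check suppressed or optimized away, the call is direct. */
int call_helper_unchecked(int x)
{
    return helper_body(x);
}
```

Calling call_helper_checked before elaborate_helper_body has run sets the flag and skips the body; after elaboration the same call goes through. The per-call cost is the test and branch on the bit, which is exactly what disappears in call_helper_unchecked.
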
Anyway, the possibilities go on and on. If you are really concerned about performance, you may have to request an assembly listing of the result and do the instruction cycle counts by hand. pragma Suppress(elaboration_check) may be useful (there are some who feel it ought to be the default!-)).

I hope this helps (at least it should make you feel sorry for Ada compiler implementors)...

S. Tucker Taft
Intermetrics, Inc.
Cambridge, MA 02138