[comp.lang.ada] User requirements on generics

emery@ARIES.MITRE.ORG (David Emery) (12/16/88)

I wrote up the following in response to some discussions with various
compiler implementors on what I thought was important in implementing
generics.  These are listed in order.  Maybe this will spark some
debate on user requirements for generics and other parts of the
language.  I think there are a lot of optimizations that users need
which are not high on the average compiler vendor's To-Do list, and
perhaps we can present a suggested reordering...
				dave emery
				emery@mitre.org
---  
Here is a view on what a GOOD compiler should do to support Ada
Generics.  This is based on our recent experience in evaluating Ada
compilers, and our experience using Ada's generics.   

0.  Don't Break
We've broken a lot of compilers with generics of moderate complexity.
Generics is one area where the ACVC's could be strengthened, if our
experience is any guide.  In one case (at the IEEE Programming Contest
last year), an internal compiler error on a subunit of a generic
required a complete overhaul of our design.  (And, it didn't help us win
the contest, either...)

1.  Relaxed Compilation Order
A good compiler must support the compilation of an instantiation
before the body of the corresponding generic.  It is legal, according
to the LRM, to require the compilation of the body before the
instantiation, but there are some programs which cannot be compiled
with this restriction.  More importantly, this has a severe effect on
software development, since all users of a generic must wait for the
implementor to be finished before they can compile their
instantiation.  

If we permit compiling an instantiation before the body of the
generic, then when is all of the work to do the instantiation done?
Verdix has a model where "whoever gets there last does all the work."
So, if I compile generic G, instantiators I1 and I2, and then body B, 
the compiler does all the work when compiling B.  I see messages like
"compiling instantiation I1.  compiling instantiation I2." when compiling
B.  
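
To make the ordering concrete, here is a rough sketch of the four
compilation units in the order described above.  The unit names G, I1
and I2 come from the text; the formal ELEM, the operations PUT and
COUNT, and the choice of INTEGER and CHARACTER as actuals are invented
for illustration.  Each unit would normally go in its own file, and a
compiler that requires the body before the instantiation would reject
units 2 and 3 at this point.

```ada
-- Unit 1: generic declaration G (spec only; no body exists yet).
generic
   type ELEM is private;          -- hypothetical formal
package G is
   procedure PUT (X : in ELEM);   -- hypothetical operations
   function  COUNT return NATURAL;
end G;

-- Unit 2: instantiator I1, compiled before the body of G.
with G;
package I1 is new G (ELEM => INTEGER);

-- Unit 3: instantiator I2, also compiled before the body of G.
with G;
package I2 is new G (ELEM => CHARACTER);

-- Unit 4: body B.  Under a "whoever gets there last does all the
-- work" model, compiling this unit is what expands I1 and I2.
package body G is
   TOTAL : NATURAL := 0;

   procedure PUT (X : in ELEM) is
   begin
      TOTAL := TOTAL + 1;         -- X itself is not stored in this sketch
   end PUT;

   function COUNT return NATURAL is
   begin
      return TOTAL;
   end COUNT;
end G;
```

Recompiling unit 4 later is the case that makes every existing
instantiation out of date at once.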

This is usually OK, but occasionally I have to recompile the body
of a generic used in many different places, and then I have to wait
for the compiler to finish all the re-instantiations.  An alternative
model would permit the linker/binder to do all the work of
instantiating generics.  If I1 were my main program, then I would not
have to "pay" for the re-instantiation in I2, if I1 does not reference
I2.  Instead, the linker would decide that I1 needs re-instantiation,
and ignore the (out-of-date) instantiation in I2.  

Maybe the ultimate model is an option that permits me to control when
generics are expanded in the compilation process.  The basic decisions
are "as soon as possible" or "as late as possible", with lots of
variations in between.

2.  Shared Code. 
Shared code is another requirement for a good compiler.  As we build up
more libraries of generic components, they will be used more often in
many places.  One example from our own work concerns our Diana Query
Language (DQL).  There are several hundred basic queries in our DQL.
Each query requires at least one traversal of a Diana tree; many
require several traversals, starting at different places in the tree
and going either depth-first or breadth-first.  The generic
tree-traversal package is itself implemented using either a generic
stack package or a generic queue package.  Without code sharing, the
several hundred basic query subprograms will generate many hundreds of
tree-traversal, stack and queue packages.  With code sharing, the
resulting code is much smaller, but we do pay some runtime overhead.

Greg Burns from Verdix has convinced me that it is much more difficult
to do code sharing on arbitrary private types than I had supposed at
first look.  However, here is a list of what I consider to be the
minimal requirements for shared generics, based on the generic formal
parameters.  
	1.  code sharing among generics with scalar formal parameters
		(e.g.  integers, enumeration types, etc)
	2.  code sharing among generics with formal subprograms
	3.  code sharing among generics with private formal parameters,
		where the actual parameter is a scalar type
		(e.g.  instantiate a stack package where the private
			formal ELEM has INTEGER as an actual.)
	4.  code sharing among generics with private formal parameters,
		where the actual parameter is an access type.
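
As an illustration only, the four cases in the list might look like
this in source form.  STACK_PACKAGE, ELEM and INTEGER come from the
text; every other name (NEXT_WRAPPED, FOLD3, DAY, INT_PTR and so on) is
made up for the sketch.  Nothing here says whether a given compiler
actually shares code for these instantiations; it only shows the kinds
of instantiation the list is talking about.

```ada
with TEXT_IO;
procedure SHARING_CANDIDATES is

   -- Types and objects first (Ada 83 wants these ahead of any body).
   type DAY     is (MON, TUE, WED);
   type INT_PTR is access INTEGER;
   N : INTEGER;
   P : INT_PTR;

   -- Case 1: scalar (here discrete) formal type.
   generic
      type INDEX is (<>);
   function NEXT_WRAPPED (I : INDEX) return INDEX;

   -- Case 2: formal subprogram.
   generic
      with function COMBINE (L, R : INTEGER) return INTEGER;
   function FOLD3 (A, B, C : INTEGER) return INTEGER;

   -- Cases 3 and 4: private formal ELEM; the actual decides the case.
   generic
      type ELEM is private;
   package STACK_PACKAGE is
      procedure PUSH (X : in ELEM);
      procedure POP  (X : out ELEM);
   end STACK_PACKAGE;

   function NEXT_WRAPPED (I : INDEX) return INDEX is
   begin
      if I = INDEX'LAST then
         return INDEX'FIRST;      -- wrap around at the top of the range
      end if;
      return INDEX'SUCC (I);
   end NEXT_WRAPPED;

   function FOLD3 (A, B, C : INTEGER) return INTEGER is
   begin
      return COMBINE (COMBINE (A, B), C);
   end FOLD3;

   package body STACK_PACKAGE is
      DATA : array (1 .. 16) of ELEM;   -- fixed size, no overflow check
      TOP  : NATURAL := 0;

      procedure PUSH (X : in ELEM) is
      begin
         TOP := TOP + 1;
         DATA (TOP) := X;
      end PUSH;

      procedure POP (X : out ELEM) is
      begin
         X := DATA (TOP);
         TOP := TOP - 1;
      end POP;
   end STACK_PACKAGE;

   function NEXT_DAY  is new NEXT_WRAPPED (INDEX => DAY);        -- case 1
   function NEXT_CHAR is new NEXT_WRAPPED (INDEX => CHARACTER);  -- case 1
   function SUM3      is new FOLD3 (COMBINE => "+");             -- case 2
   function PROD3     is new FOLD3 (COMBINE => "*");             -- case 2
   package  INT_STACK is new STACK_PACKAGE (ELEM => INTEGER);    -- case 3
   package  PTR_STACK is new STACK_PACKAGE (ELEM => INT_PTR);    -- case 4

begin
   INT_STACK.PUSH (SUM3 (1, 2, 3));
   INT_STACK.POP (N);
   PTR_STACK.PUSH (new INTEGER'(N));
   PTR_STACK.POP (P);
   TEXT_IO.PUT_LINE (DAY'IMAGE (NEXT_DAY (WED)) & INTEGER'IMAGE (P.all));
end SHARING_CANDIDATES;
```

Without sharing, each of the six instantiations gets its own copy of
the corresponding body; with sharing, each pair is a candidate for a
single copy.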

There may well be certain classes of instantiations, based on the
actual type provided for a generic formal private type.  For instance,
there may be one code segment for scalar actual types, a second for
access types (but I hope that would be treated as a scalar), a third
for constrained objects without initializations (such as fixed-length
arrays), a fourth for constrained objects with initializations (such as
a record with a task object), and a fifth for unconstrained/varying
length objects, such as variant records and unconstrained arrays.

Furthermore, the user needs to control the conditions for code sharing.
Depending on the program and computer architecture, the user may decide
that he wants some instantiations to share code, but not others.  

There should be a pragma that controls code sharing.  I propose the
following (this is a URG issue):
	pragma SHARE_CODE (name : string; bool : boolean);

When "bool" is true, code is shared among conforming instantiations.
The meaning of "name" is either the name of a generic (in which case
the pragma appears within the same scope as the generic declaration, or
immediately following the generic if it is a compilation unit), or the
name of an instantiation.

If "name" is a generic, then this establishes the default for all
instantiations of that generic.  If "name" is an instantiation, then
the default is overridden.  Note that if the default for a generic is
no sharing, there must be more than one occurrence of SHARE_CODE on
instantiations for the pragma to have any effect. 

The documentation for the compiler should specify the default for
generics if no pragmas are issued.  Also, the compiler documentation
should explain the effect of pragma OPTIMIZE on generics (if any).
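
A sketch of how the proposed pragma might be written, reading "name" as
a string literal as in the signature above.  SHARE_CODE is only the
proposal made here, not something defined by the LRM, so a compiler
that does not implement it should simply ignore it, typically with a
warning.  The instance names INT_STACK, CHAR_STACK and FAST_STACK are
invented for the example.

```ada
procedure SHARE_DEMO is

   generic
      type ELEM is private;
   package STACK_PACKAGE is
      procedure PUSH (X : in ELEM);
      procedure POP  (X : out ELEM);
   end STACK_PACKAGE;

   -- Proposed pragma (not in the LRM): default for every
   -- instantiation of STACK_PACKAGE is to share code.
   pragma SHARE_CODE ("STACK_PACKAGE", TRUE);

   package body STACK_PACKAGE is
      DATA : array (1 .. 16) of ELEM;   -- fixed size, no overflow check
      TOP  : NATURAL := 0;

      procedure PUSH (X : in ELEM) is
      begin
         TOP := TOP + 1;
         DATA (TOP) := X;
      end PUSH;

      procedure POP (X : out ELEM) is
      begin
         X := DATA (TOP);
         TOP := TOP - 1;
      end POP;
   end STACK_PACKAGE;

   -- These two follow the default and would share one body.
   package INT_STACK  is new STACK_PACKAGE (ELEM => INTEGER);
   package CHAR_STACK is new STACK_PACKAGE (ELEM => CHARACTER);

   -- This one overrides the default and would get its own copy.
   package FAST_STACK is new STACK_PACKAGE (ELEM => INTEGER);
   pragma SHARE_CODE ("FAST_STACK", FALSE);

begin
   null;
end SHARE_DEMO;
```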
		
3.  Debugging
(Assertion:  Good compilers have debuggers...)  The debugger must
provide facilities for handling generics.  It should permit debugging
both the generic and specific instantiations.  For instance, given a
generic STACK_PACKAGE, it should be possible to set a breakpoint in the
generic, that is triggered by any instantiation of STACK_PACKAGE.  It
should also be possible to set a breakpoint in a specific instantiation
of STACK_PACKAGE.  

Note that code sharing can affect the debugger's ability to do this.
If code sharing for STACK_PACKAGE is true, then the debugger may not be
able to support setting a breakpoint inside a specific instance.

stachour@umn-cs.CS.UMN.EDU (Paul Stachour) (12/18/88)

In article <8812161453.AA02985@aries> emery@ARIES.MITRE.ORG (David Emery) writes:
>  ... 
>Note that code sharing can affect the debugger's ability to do this.
>If code sharing for STACK_PACKAGE is true, then the debugger may not be
>able to support setting a breakpoint inside a specific instance.  

Dave, I don't see why this must be true.   From the debugger's viewpoint,
one sets a breakpoint in the shared-code for the generic.  When the 
breakpoint is hit, the debugger checks the context of the breakpoint
with respect to who did the calling from what line, etc., and thus can
determine from the symbol-table info which instance of the generic that
it is actually in.  The breakpoint is "taken" if that is the instance
in which the breakpoint was set, otherwise the breakpoint is "ignored".
  ...Paul

eisen@bozon.SRC.Honeywell.COM (Greg Eisenhauer) (12/18/88)

In article <10479@umn-cs.CS.UMN.EDU> stachour@umn-cs.CS.UMN.EDU (Paul Stachour) writes:

   In article <8812161453.AA02985@aries> emery@ARIES.MITRE.ORG (David Emery) writes:
   >  ... 
   >Note that code sharing can affect the debugger's ability to do this.
   >If code sharing for STACK_PACKAGE is true, then the debugger may not be
   >able to support setting a breakpoint inside a specific instance.  

   Dave, I don't see why this must be true.   From the debugger's viewpoint,
   one sets a breakpoint in the shared-code for the generic.  When the 
   breakpoint is hit, the debugger checks the context of the breakpoint
   with respect to who did the calling from what line, etc., and thus can
   determine from the symbol-table info which instance of the generic that
   it is actually in.  The breakpoint is "taken" if that is the instance
   in which the breakpoint was set, otherwise the breakpoint is "ignored".
     ...Paul

The "symbol-table information" that you propose to use would have to be much
more complex than the information that is generated and used in many systems.
A simple virtual-address to source-line mapping, as is produced by the Verdix
Ada system, is not sufficient.  Consider that you may have calls to several
generic instantiations in a single source line.  It will be difficult for the
debugger to figure out which call is actually being performed.  It can perhaps 
be done, but if you think about a case like a lengthy expression where all of
the operators are generic instantiations with shared bodies, things get hairy
fast.

An easier approach might be to create a stub subprogram for each instantiation
that did nothing but call the shared code.  That way the debugger would have
something unique on the call stack for each instantiation.  Unfortunately, you
also have the expense of 2 procedure calls where you got by with one before.
Might be too high a price to pay just to be able to set breakpoints in
separate shared instantiations.  Another option would be to have the compiler
generate an extra parameter to generic routines that identified the
instantiation.  Verdix does this for other reasons, so it should be feasible
for the debugger to use it to disambiguate the call.
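
To picture the two options, here is a hand-written model of what the
generated code might look like.  It is not how Verdix or any other
compiler actually lays things out.  All the names (INSTANCE_ID,
SHARED_PUSH, BREAK_ON and the stub names) are invented, and the test
against BREAK_ON stands in for the check a debugger would make when the
shared breakpoint is hit.

```ada
with TEXT_IO;
procedure SHARED_BODY_MODEL is

   -- Hypothetical tag identifying which instantiation is executing.
   type INSTANCE_ID is (INT_STACK, CHAR_STACK);

   -- The instance in which the user asked for a breakpoint.
   BREAK_ON : constant INSTANCE_ID := CHAR_STACK;

   -- The single shared body.  WHO is the extra parameter the
   -- compiler would add (the second option).
   procedure SHARED_PUSH (WHO : in INSTANCE_ID) is
   begin
      -- Debugger's filter: take the breakpoint only for the chosen
      -- instance, ignore it for every other one.
      if WHO = BREAK_ON then
         TEXT_IO.PUT_LINE ("breakpoint taken in " & INSTANCE_ID'IMAGE (WHO));
      end if;
      -- The real push would go here.
   end SHARED_PUSH;

   -- One stub per instantiation (the first option).  Each stub is a
   -- distinct frame on the call stack and costs one extra call.
   procedure INT_STACK_PUSH is
   begin
      SHARED_PUSH (INT_STACK);
   end INT_STACK_PUSH;

   procedure CHAR_STACK_PUSH is
   begin
      SHARED_PUSH (CHAR_STACK);
   end CHAR_STACK_PUSH;

begin
   INT_STACK_PUSH;    -- breakpoint ignored
   CHAR_STACK_PUSH;   -- prints: breakpoint taken in CHAR_STACK
end SHARED_BODY_MODEL;
```

With the extra parameter alone the stubs are not needed; they are shown
here only so both options appear side by side.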

Greg Eisenhauer, Honeywell SRC		     Socrates, Pythagoras,
Phone: (612) 782-7318			     Yin and bloody Yang,
MAIL:  3660 Technology Drive, Mpls, MN       Hatha Yoga, Ommm,
ARPA:  eisen@src.honeywell.com		     Bennet, Gurdjieff, Jesus
UUCP:  {umn-cs,ems,bthpyd}!srcsip!eisen		-- Peter Murphy