[net.lang.ada] Production Quality Ada Compiler Criteria

hogan@AEROSPACE.ARPA (04/15/86)

I am writing a standard for procuring an Ada compiler and would
like to check some numbers against the Ada community.  
I would appreciate hearing most from Ada implementors, but also from 
knowledgeable systems programmer types who have used Ada to build
some moderately sized (~5000 lines) Ada programs.  I got the numbers
from examining what a few of the validated Ada compilers provide and
using my own judgement, but I need your input so we don't publish a 
standard that no compiler can or will ever meet.

The following requirements are intended as minimal requirements that
any Ada compiler must meet to be judged as production quality.  
(note the (M) means mandatory and the (A) means the requirement may be
mandatory depending on the user's application).  It may be that a very
good compiler will meet most of these  requirements and only fail on a
few.  We have not yet attached a weighting factor to the requirements.

2.1.1    (M)  A compiler shall compile the ACVC and ACEC test suites at 
              an average rate of at least 250 Ada source statements per minute 
              (elapsed time), for each 1 MIPS of rated processing speed of the 
              specified host computer. 

2.1.2    (M)  A compiler shall compile the ACVC and ACEC test suites at 
              an average rate of 100 Ada source statements per minute (elapsed 
              time), for each 1 MIPS of rated processing speed of the 
              specified host computer, while meeting the object code 
              requirements.
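
As a rough illustration of what the two rates imply, the sketch below
works out the longest elapsed compile time that would still satisfy
2.1.1 and 2.1.2.  The 2 MIPS host and the 5,000-statement program are
hypothetical figures chosen for the example, not numbers from the
draft standard.

```ada
with Text_IO;

procedure Compile_Rate is
   --  Hypothetical figures chosen for illustration; not from the post.
   Host_MIPS    : constant := 2;      --  rated speed of the host
   Program_Size : constant := 5_000;  --  Ada source statements

   --  Statements per minute per MIPS, from 2.1.1 and 2.1.2.
   Rate_2_1_1 : constant := 250;
   Rate_2_1_2 : constant := 100;

   --  Longest elapsed compile time, in minutes, that still meets
   --  each rate on the assumed host.
   Limit_2_1_1 : constant Integer :=
     Program_Size / (Rate_2_1_1 * Host_MIPS);
   Limit_2_1_2 : constant Integer :=
     Program_Size / (Rate_2_1_2 * Host_MIPS);
begin
   Text_IO.Put_Line
     ("2.1.1 limit:" & Integer'Image (Limit_2_1_1) & " minutes");
   Text_IO.Put_Line
     ("2.1.2 limit:" & Integer'Image (Limit_2_1_2) & " minutes");
end Compile_Rate;
```

Under these assumptions the program must compile in at most 10 minutes
to meet 2.1.1, and in at most 25 minutes when the object code
requirements also apply (2.1.2).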

2.2.1    (M)  The compiler shall produce an object code program that requires 
              no more than 15% additional target computer memory space than an 
              equivalent program written in assembly language.

2.2.2    (M)  The compiler shall produce an object code program that requires 
              no more than 5% additional execution time than an equivalent 
              program written in assembly language.


                                   Capacity

The following are minimal capacity values, i.e. a compiler can allow more
than 1000 compilation units in a program but it must allow at least that
number.

Item                                            Allowed size

Compilation units                               1000
Source lines / compilation unit                 10,000
Characters / source line                        80
Identifiers / compilation unit                  source lines = 10,000
Library units in a context clause               compilation units = 1000
Levels of nesting in a program unit             16
Declarations in a compilation unit              number of identifiers = 10,000
Formal parameters in an entry or subprogram     64
Frames an exception may be propagated through   256
Exceptions & handlers in a program unit/frame   256
Number of bits in any object                    MAX_INT
Number of characters in a STRING object         50,000
Enumeration literals in a type                  256
Dimensions in an array                          16
Elements in an array (all dimensions)           MAX_INT
Discriminants in a record                       64

				Ada-Related

The following requirements are specifically related to implementation
 of certain Ada constructs.  I would appreciate comments about their
appropriateness and content.

5.1.1    (M)  The compiler shall eliminate statements or subprograms that will 
              never be executed (dead code) because their execution depends on 
              a condition known to be false at compilation time.

5.2.1    (M)  In addition to the basic 60-character set, a compiler shall 
              allow any of the 26 lower case letters in identifiers and both 
              the 26 lower case letters and 13 other special characters 
              specified in LRM paragraph 2.1 in character strings and comments 
              to the extent that the underlying host computer supports them.

5.2.2    (A)  A compiler that provides the predefined package TEXT_IO shall 
              permit input and output data to contain any of the 95 graphic 
              characters or 5 form effectors of the ISO seven-bit character 
              set (ISO Standard 646) to the extent supported by the target 
              computer.

5.3.1    (M)  The compiler shall provide predefined types for all the integer 
              and floating-point types provided by the target computer.

5.3.2    (M)  The attribute 'MACHINE_OVERFLOWS shall be TRUE for all 
              floating-point and fixed-point types.

5.3.3    (M)  The compiler shall implement enumeration types using an 
              underlying type that requires the least amount of memory to 
              represent that type.

5.3.4    (M)  The range of enumeration code values allowed in an enumeration 
              representation clause shall be MIN_INT to MAX_INT.

5.3.5    (A)  The compiler shall support length clauses, enumeration 
              representation clauses, record representation clauses, 
              and address clauses.

5.3.7    (M)  The attributes T'SIZE for discrete types and T'SMALL for 
              fixed-point types shall be implemented.

5.3.8    (M)  The components of record types and array types named in a 
              pragma PACK shall be stored in contiguous memory bits.

5.5.1    (A)  The compiler shall provide at least 8 priority levels for 
              specifying tasking priorities via the pragma PRIORITY.

5.5.2    (A)  The pragma SHARED or an equivalent capability shall be provided.

5.5.3    (M)  A mechanism for termination of tasks that depend on library 
              packages shall be provided.

5.5.4    (M)  The compiler shall provide a capability for handling target 
              computer hardware or operating system interrupts as calls to Ada 
              task entries.

5.5.5    (A)  The resources to create, interrupt, terminate, fail and abort a 
              task shall be no more than those required to call and return 
              from a subprogram.

5.5.6    (M)  The ordering of select alternatives in a selective wait 
              statement shall not impact overall execution speed of the 
              program.

5.5.7    (M)  The compiler shall dispatch the execution of ready tasks in a 
              manner that will give each task an equal share of the processing 
              resources.

5.5.8    (M)  Tasks that are blocked, completed, terminated, or not activated 
              shall not impact the performance of the remaining code.

5.5.9    (A)  The value of DURATION'DELTA shall not be greater than 1 
              millisecond.

5.5.10   (M)  The basic clock period, TICK, as defined in package SYSTEM shall 
              be the smallest time increment supported by the target computer 
              hardware.

5.6.1    (M)  An exception shall not impact execution speed unless it 
              is raised.

5.6.2    (M)  The compiler shall provide the pragma SUPPRESS or an equivalent 
              capability to permit suppression of the execution of the 
              pre-defined run-time checks for a designated compilation unit.

5.6.3    (M)  The compiler shall issue a warning message for violation of a 
              constraint exception which is always raised at run-time.

5.7.1    (A)  The compiler shall maximize code sharing between multiple 
              instantiations of generic units.

5.8.1    (M)  The compiler shall provide the pragma INTERFACE to allow 
              importing programs written in the assembly language of the 
              target computer.

5.8.2    (A)  The compiler shall provide the pragma INTERFACE, or an 
              equivalent mechanism, to allow incorporation of object modules 
              compiled from other languages.

5.9.1    (M)  The generic library subprograms UNCHECKED_DEALLOCATION and 
              UNCHECKED_CONVERSION shall be implemented with no restrictions 
              except that the target type of UNCHECKED_CONVERSION may exclude 
              unconstrained types.

5.10.1   (A)  An implementation shall provide support for format directed 
              input/output for each target computer that supports text 
              input/output.

---------
Responses via electronic or conventional mail are OK.  I would like to
have them as soon as possible.

Michael Hogan
Aerospace Corp.  M1/106
POB 92957
Los Angeles, CA 90009
(213) 615-4346

young@ICSC.UCI.EDU (Michal Young) (04/16/86)

> 5.6.3    (M)  The compiler shall issue a warning message for violation of a
>		constraint exception which is always raised at run-time.

A compiler meeting this requirement must solve the halting problem.  ;-)   
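
To make the point concrete, here is a minimal Ada sketch contrasting a
violation a compiler can see statically with one that depends on an
arbitrary computation.  The names Small and Arbitrary_Computation are
hypothetical, invented for this example.

```ada
with Text_IO;

procedure Constraint_Demo is
   subtype Small is Integer range 1 .. 10;

   --  Hypothetical stand-in for an arbitrary computation.  This one
   --  is trivial, but in general a compiler cannot decide what such
   --  a function returns, or whether it returns at all.
   function Arbitrary_Computation return Integer is
   begin
      return 11;
   end Arbitrary_Computation;

   X : Small;
begin
   begin
      --  Static case: the literal 11 is visibly outside 1 .. 10, so
      --  a compiler can warn that Constraint_Error will be raised.
      X := 11;
      Text_IO.Put_Line (Integer'Image (X));  --  not expected to run
   exception
      when Constraint_Error =>
         Text_IO.Put_Line ("static case: Constraint_Error");
   end;

   begin
      --  General case: a warning here requires knowing the result
      --  of the call before the program runs.
      X := Arbitrary_Computation;
      Text_IO.Put_Line (Integer'Image (X));  --  not expected to run
   exception
      when Constraint_Error =>
         Text_IO.Put_Line ("dynamic case: Constraint_Error");
   end;
end Constraint_Demo;
```

A requirement limited to cases like the first, where the violation
follows from static expressions, would be implementable; one covering
every case like the second would not.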

--Michal Young
  Arpa/CSnet:  young@uci
  U.C. Irvine

macrakis@harvard.HARVARD.EDU (Stavros Macrakis) (04/16/86)

Compiler requirements beyond simple validation are very important in
procuring Ada compilers.  I agree in general with Hogan's list, but in
some places he requires too little, and in others too much.

Herewith, my comments.

2.1.1,2 ACVC tests are unusual and unrepresentative of real code: They
    are very short: startup time becomes disproportionate.  They have
    hardly any computation: runtime figures will be tiny and not
    terribly useful.  They bang on all parts of the language evenly,
    not in proportion to actual usage: real programs' efficiency
    depends on a very few constructs: inline code, array indexing,
    record selection, loops, subprograms.  You should choose better
    benchmarks.

2.2.1,2 How do you make the object code criteria meaningful?
    Problems: (1) what is an equivalent program; (2) what does this
    program do; (3) what sort of assembly coding are we talking about.

 (1) You must be sure that the assembler program is equally secure
    (Range checks, deref checks, etc.) and has the same runtime
    conventions.  For instance, the Intermetrics Ada compiler
    guarantees that every assignment statement will store to memory.
    Although it does excellent register allocation, it refuses to keep
    a variable's value only in a register between statements, to
    improve debuggability, error-tolerance, etc.  The compiler cannot
    understand algorithms well enough to suppress apparently necessary
    checks (e.g. sentinel termination).

 (2) Certain specialized applications may be amenable to assembler
    tricks beyond current compiler technology.

 (3) Quality of hand-written assembly code varies widely.

    Therefore, you should avoid comparing tuned assembler code with
    ordinary Ada code.  Of course, there are many applications where
    current compiler technology does better than typical hand code.


Capacity

-- Capacity is very important.  Your numbers for number of CU's and
    size of CU's seem reasonable; for very large projects, perhaps,
    the number of CU's (remember is-separate's) could be larger.

-- You should discuss program library size as well as number of CU's
    per program.  You should discuss program library functionality,
    capacity, and speed.  Support for sharing of program sublibraries
    is very important.

-- Source lines are probably not a good measure of capacity.  I would
    substitute something equally easy to count but more closely
    related to the content of the code, e.g. number of lexical tokens
    excluding comments, or number of semicolons.

-- No limit should be imposed on the frames an exception propagates
    through.  If an infinite recursion causes stack overflow, the
    resultant Storage_error should be trappable.  Good implementations
    allow as many levels of exception propagation as of subroutine
    calling (limited, I hope, only by stack size).

-- String is defined as array(Positive range <>).  You can just
    specify that Integer'last > 50,000.  Strings with smaller indexing
    types can always be defined by the user.

-- 256 enumeration literals is too few.  Many systems have > 256
    system calls, error types, syntactic productions, part types...

5.3.3 Hard packing is often undesirable on machines with expensive
    byte extraction.  This is what rep specs, Pack, and Optimize are for.

5.3.8 Contiguous bits is too tight: should a record composed of a
    boolean and a byte be stored with the byte non-aligned to be
    contiguous to the bit?  Or do you allow the bit to be stored in a
    full byte for quicker access?
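
    The record in question can be sketched as follows.  The type names
    Byte, Loose, and Tight are hypothetical, and the sizes printed
    depend on the implementation, so none are claimed here.

```ada
with Text_IO;

procedure Pack_Demo is
   type Byte is range 0 .. 255;

   --  Default layout: the implementation is free to give Flag a
   --  full storage unit so that Value stays aligned.
   type Loose is record
      Flag  : Boolean;
      Value : Byte;
   end record;

   --  Packed layout: asks the implementation to minimize gaps, which
   --  may leave Value straddling a storage-unit boundary.
   type Tight is record
      Flag  : Boolean;
      Value : Byte;
   end record;
   pragma Pack (Tight);
begin
   --  'Size reports the number of bits needed for objects of the type.
   Text_IO.Put_Line ("Loose'Size =" & Integer'Image (Loose'Size));
   Text_IO.Put_Line ("Tight'Size =" & Integer'Image (Tight'Size));
end Pack_Demo;
```

    Comparing the two sizes on a given compiler shows which trade-off
    it makes between space and access speed.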

5.5.5 This is not reasonable.  A task creation, e.g., will take more
    resources than a subprogram call simply because there is more to
    initialize, there is synchronization overhead, and a new stack
    needs to be allocated.  I agree that task operations should not
    take <much> more than subprograms: `no more than' is too strong.

5.5.6 This criterion would be satisfied by having the compiler
    pre-sort the select alternatives alphabetically.  This is not what
    you have in mind, but it is unclear how to write a reasonable
    `fairness' spec.  Indeed, it is probably a bad idea.  A
    non-erroneous Ada program will work correctly and efficiently
    regardless of the implementation's select strategy.  (A debugging/
    validation tool that made one select alternative higher or lower
    priority than others could be useful.)  Relying on the fairness
    of the implementation is dangerous.

5.5.7 Not necessarily an ideal scheduling strategy.  You are specifying
    some sort of time-slicing: reasonable in many applications, not all.

    A greedy task merely needs to split into two to double its time
    slice, a common pathology in Unix.  To be complete, you need to
    mention priority in this section.

5.5.9 Add: or the finest time interval available in the hardware/OS,
    whichever is larger.

5.5.10 Is this needed?  Doesn't the ref man (9.6, 13.7.1) define it well
    enough?

5.6.3  This is too strong (halting problem).

5.7.1 This is too strong if taken literally, and too vague otherwise.
    The compiler should not maximize (sensu stricto) code sharing if
    runtime is an issue and the two instantiations are different enough.
    This is another reason for Pragma Optimize.

5.8.1 Add: The documentation should fully specify the machine language
    interface for subprograms with arguments and results of any Ada type.

5.8.2 Agreed, in general, but you should not expect that every language
    system on your machine will be able to be made to interface easily.
    I would tighten this requirement to something like `shall provide
    (in order of importance) the pragma Interface for (1) the standard
    systems programming language of the machine/OS (e.g. C for Unix,
    PL/I for Multics, ...)  (2) the standard applications programming
    language(s) of the machine/OS (e.g. Fortran, Cobol) and (3) Fortran
    (as the language most used for large portable libraries).'  Even
    then, there may be some difficulties because of different runtime
    structures and different initialization, etc., requirements.  You
    may have to pay many dollars for this capability.

5.10.1 What do you have in mind?  Fortran formats?  Why not say `an
    implementation shall provide appropriate libraries to allow
    convenient reading and writing of Fortran format text files'?



    -s

	Stavros Macrakis
	Harvard and Intermetrics

Note: I consult for Intermetrics' Ada group.