hogan@AEROSPACE.ARPA (04/15/86)
I am writing a standard for procuring an Ada compiler and would like to check some numbers against the Ada community. I would most appreciate hearing from Ada implementors, but also from knowledgeable systems programmer types who have used Ada to build some moderately sized (~5000 lines) Ada programs. I got the numbers from examining what a few of the validated Ada compilers provide and using my own judgement, but I need your input so we don't publish a standard that no compiler can or will ever meet.

The following requirements are intended as minimal requirements that any Ada compiler must meet to be judged as production quality. (Note: (M) means mandatory, and (A) means the requirement may be mandatory depending on the user's application.) It may be that a very good compiler will meet most of these requirements and only fail on a few. We have not yet attached a weighting factor to the requirements.

2.1.1 (M) A compiler shall compile the ACVC and ACEC test suites at an average rate of at least 250 Ada source statements per minute (elapsed time), for each 1 MIPS of rated processing speed of the specified host computer.
2.1.2 (M) A compiler shall compile the ACVC and ACEC test suites at an average rate of at least 100 Ada source statements per minute (elapsed time), for each 1 MIPS of rated processing speed of the specified host computer, while meeting the object code requirements.
2.2.1 (M) The compiler shall produce an object code program that requires no more than 15% additional target computer memory space than an equivalent program written in assembly language.
2.2.2 (M) The compiler shall produce an object code program that requires no more than 5% additional execution time than an equivalent program written in assembly language.

Capacity

The following are minimal capacity values, i.e. a compiler can allow more than 1000 compilation units in a program but it must allow at least that number.
Item                                            Allowed size
----------------------------------------------  -----------------------------
Compilation units                               1000
Source lines / compilation unit                 10,000
Characters / source line                        80
Identifiers / compilation unit                  source lines = 10,000
Library units in a context clause               compilation units = 1000
Levels of nesting in a program unit             16
Declarations in a compilation unit              number of identifiers = 10,000
Formal parameters in an entry or subprogram     64
Frames an exception may be propagated through   256
Exceptions & handlers in a program unit/frame   256
Number of bits in any object                    MAX_INT
Number of characters in a STRING object         50,000
Enumeration literals in a type                  256
Dimensions in an array                          16
Elements in an array (all dimensions)           MAX_INT
Discriminants in a record                       64

Ada-Related

The following requirements are specifically related to implementation of certain Ada constructs. I would appreciate comments about their appropriateness and content.

5.1.1 (M) The compiler shall eliminate statements or subprograms that will never be executed (dead code) because their execution depends on a condition known to be false at compilation time.
5.2.1 (M) In addition to the basic 60-character set, a compiler shall allow any of the 26 lower case letters in identifiers, and both the 26 lower case letters and the 13 other special characters specified in LRM paragraph 2.1 in character strings and comments, to the extent that the underlying host computer supports them.
5.2.2 (A) A compiler that provides the predefined package TEXT_IO shall permit input and output data to contain any of the 95 graphic characters or 5 format effectors of the ISO seven-bit character set (ISO Standard 646) to the extent supported by the target.
5.3.1 (M) The compiler shall provide predefined types for all the integer and floating-point types provided by the target computer.
5.3.2 (M) The attribute 'MACHINE_OVERFLOWS shall be TRUE for all floating-point and fixed-point types.
5.3.3 (M) The compiler shall implement enumeration types using an underlying type that requires the least amount of memory to represent that type.
5.3.4 (M) The range of enumeration code values allowed in an enumeration representation clause shall be MIN_INT to MAX_INT.
5.3.5 (A) The compiler shall support length clauses, enumeration representation clauses, record representation clauses, and address clauses.
5.3.7 (M) The attributes T'SIZE for discrete types and T'SMALL for fixed-point types shall be implemented.
5.3.8 (M) The components of record types and array types named in a pragma PACK shall be stored in contiguous memory bits.
5.5.1 (A) The compiler shall provide at least 8 priority levels for specifying tasking priorities via the pragma PRIORITY.
5.5.2 (A) The pragma SHARED or an equivalent capability shall be provided.
5.5.3 (M) A mechanism for termination of tasks that depend on library packages shall be provided.
5.5.4 (M) The compiler shall provide a capability for handling target computer hardware or operating system interrupts as calls to Ada task entries.
5.5.5 (A) The resources to create, interrupt, terminate, fail and abort a task shall be no more than those required to call and return from a subprogram.
5.5.6 (M) The ordering of select alternatives in a selective wait statement shall not impact overall execution speed of the program.
5.5.7 (M) The compiler shall dispatch the execution of ready tasks in a manner that will give each task an equal share of the processing resources.
5.5.8 (M) Tasks that are blocked, completed, terminated, or not activated shall not impact the performance of the remaining code.
5.5.9 (A) The value of DURATION'DELTA shall not be greater than 1 millisecond.
5.5.10 (M) The basic clock period, TICK, as defined in package SYSTEM shall be the smallest time increment supported by the target computer hardware.
5.6.1 (M) An exception shall not impact execution speed unless it is raised.
5.6.2 (M) The compiler shall provide the pragma SUPPRESS or an equivalent capability to permit suppression of the execution of the pre-defined run-time checks for a designated compilation unit.
5.6.3 (M) The compiler shall issue a warning message for violation of a constraint exception which is always raised at run-time.
5.7.1 (A) The compiler shall maximize code sharing between multiple instantiations of generic units.
5.8.1 (M) The compiler shall provide the pragma INTERFACE to allow importing programs written in the assembly language of the target computer.
5.8.2 (A) The compiler shall provide the pragma INTERFACE, or an equivalent mechanism, to allow incorporation of object modules compiled from other languages.
5.9.1 (M) The generic library subprograms UNCHECKED_DEALLOCATION and UNCHECKED_CONVERSION shall be implemented with no restrictions except that the target type of UNCHECKED_CONVERSION may exclude unconstrained types.
5.10.1 (A) An implementation shall provide support for format directed input/output for each target computer that supports text input/output.

---------

Responses via electronic or conventional mail are OK. I would like to have them as soon as possible.

Michael Hogan
Aerospace Corp. M1/106
POB 92957
Los Angeles, CA 90009
(213) 615-4346
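
Requirements 2.1.1, 2.1.2, 2.2.1 and 2.2.2 above are all simple ratios, so it may help to see how they scale. The following is only a sketch under stated assumptions: every function and constant name is hypothetical, 2.1.2 is read as a minimum rate (like 2.1.1), and "additional" space or time is read as a fraction of the assembly-language figure. None of this comes from the draft standard itself.

```python
# Illustrative arithmetic for requirements 2.1.1, 2.1.2, 2.2.1 and 2.2.2.
# All names below are hypothetical; they are not part of the draft standard.

RATE_PER_MIPS = 250.0              # 2.1.1: statements/minute per rated MIPS
RATE_PER_MIPS_OBJECT_CODE = 100.0  # 2.1.2: same, while meeting 2.2.1 and 2.2.2
MAX_SPACE_OVERHEAD = 0.15          # 2.2.1: at most 15% more memory than assembly
MAX_TIME_OVERHEAD = 0.05           # 2.2.2: at most 5% more time than assembly


def required_rate(host_mips, meeting_object_code=False):
    """Minimum statements per elapsed minute for a host of the given MIPS rating."""
    per_mips = RATE_PER_MIPS_OBJECT_CODE if meeting_object_code else RATE_PER_MIPS
    return per_mips * host_mips


def meets_compile_rate(statements, elapsed_minutes, host_mips,
                       meeting_object_code=False):
    """True if the measured average rate reaches the required rate."""
    measured = statements / elapsed_minutes
    return measured >= required_rate(host_mips, meeting_object_code)


def overhead(ada_cost, assembly_cost):
    """Extra cost of the Ada program as a fraction of the assembly program's cost."""
    return (ada_cost - assembly_cost) / assembly_cost


def meets_object_code(ada_bytes, asm_bytes, ada_seconds, asm_seconds):
    """True if both the space (2.2.1) and time (2.2.2) limits are respected."""
    return (overhead(ada_bytes, asm_bytes) <= MAX_SPACE_OVERHEAD
            and overhead(ada_seconds, asm_seconds) <= MAX_TIME_OVERHEAD)
```

Under this reading, a host rated at 2 MIPS would have to average 500 statements per minute for 2.1.1, so `meets_compile_rate(6000, 10, 2.0)` holds while the same measurement on a 3-MIPS host does not.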
young@ICSC.UCI.EDU (Michal Young) (04/16/86)
> 5.6.3 (M) The compiler shall issue a warning message for violation of a
> constraint exception which is always raised at run-time.

A compiler meeting this requirement must solve the halting problem. ;-)

--Michal Young
Arpa/CSnet: young@uci
U.C. Irvine
macrakis@harvard.HARVARD.EDU (Stavros Macrakis) (04/16/86)
Compiler requirements beyond simple validation are very important in procuring Ada compilers. I agree in general with Hogan's list, but in some places he requires too little, and in others too much. Herewith, my comments.

2.1.1,2 ACVC tests are unusual and unrepresentative of real code:
   They are very short: startup time becomes disproportionate.
   They have hardly any computation: runtime figures will be tiny and not terribly useful.
   They bang on all parts of the language evenly, not in proportion to actual usage: real programs' efficiency depends on a very few constructs: inline code, array indexing, record selection, loops, subprograms.
You should choose better benchmarks.

2.2.1,2 How do you make the object code criteria meaningful? Problems: (1) what is an equivalent program; (2) what does this program do; (3) what sort of assembly coding are we talking about.
   (1) You must be sure that the assembler program is equally secure (range checks, deref checks, etc.) and has the same runtime conventions. For instance, the Intermetrics Ada compiler guarantees that every assignment statement will store to memory. Although it does excellent register allocation, it refuses to keep a variable's value only in a register between statements, to improve debuggability, error-tolerance, etc. The compiler cannot understand algorithms well enough to suppress apparently necessary checks (e.g. sentinel termination).
   (2) Certain specialized applications may be amenable to assembler tricks beyond current compiler technology.
   (3) Quality of hand-written assembly code varies widely. Therefore, you should avoid comparing tuned assembler code with ordinary Ada code. Of course, there are many applications where current compiler technology does better than typical hand code.

Capacity

-- Capacity is very important. Your numbers for number of CU's and size of CU's seem reasonable; for very large projects, perhaps, the number of CU's (remember is-separate's) could be larger.
-- You should discuss program library size as well as number of CU's per program. You should discuss program library functionality, capacity, and speed. Support for sharing of program sublibraries is very important.
-- Source lines are probably not a good measure of capacity. I would substitute something equally easy to count but more closely related to the content of the code, e.g. number of lexical tokens excluding comments, or number of semicolons.
-- No limit should be imposed on the frames an exception propagates through. If an infinite recursion causes stack overflow, the resultant Storage_error should be trappable. Good implementations allow as many levels of exception propagation as of subroutine calling (limited, I hope, only by stack size).
-- String is defined as array(Positive range <>). You can just specify that Integer'last > 50,000. Strings with smaller indexing types can always be defined by the user.
-- 256 enumeration literals is too few. Many systems have > 256 system calls, error types, syntactic productions, part types...

5.3.3 Hard packing is often undesirable on machines with expensive byte extraction. This is what rep specs, Pack, and Optimize are for.

5.3.8 Contiguous bits is too tight: should a record composed of a boolean and a byte be stored with the byte non-aligned to be contiguous to the bit? Or do you allow the bit to be stored in a full byte for quicker access?

5.5.5 This is not reasonable. A task creation, e.g., will take more resources than a subprogram call simply because there is more to initialize, there is synchronization overhead, and a new stack needs to be allocated. I agree that task operations should not take <much> more than subprograms: `no more than' is too strong.

5.5.6 This criterion would be satisfied by having the compiler pre-sort the select alternatives alphabetically. This is not what you have in mind, but it is unclear how to write a reasonable `fairness' spec. Indeed, it is probably a bad idea.
A non-erroneous Ada program will work correctly and efficiently regardless of the implementation's select strategy. (A debugging/validation tool that made one select alternative higher or lower priority than others could be useful.) Relying on the fairness of the implementation is dangerous.

5.5.7 Not necessarily an ideal scheduling strategy. You are specifying some sort of time-slicing: reasonable in many applications, not all. A greedy task merely needs to split into two to double its time slice, a common pathology in Unix. To be complete, you need to mention priority in this section.

5.5.9 Add: or the finest time interval available in the hardware/OS, whichever is larger.

5.5.10 Is this needed? Doesn't the ref man (9.6, 13.7.1) define it well enough?

5.6.3 This is too strong (halting problem).

5.7.1 This is too strong if taken literally, and too vague otherwise. The compiler should not maximize (sensu stricto) code sharing if runtime is an issue and the two instantiations are different enough. This is another reason for Pragma Optimize.

5.8.1 Add: The documentation should fully specify the machine language interface for subprograms with arguments and results of any Ada type.

5.8.2 Agreed, in general, but you should not expect that every language system on your machine will be able to be made to interface easily. I would tighten this requirement to something like `shall provide (in order of importance) the pragma Interface for (1) the standard systems programming language of the machine/OS (e.g. C for Unix, PL/I for Multics, ...) (2) the standard applications programming language(s) of the machine/OS (e.g. Fortran, Cobol) and (3) Fortran (as the language most used for large portable libraries).' Even then, there may be some difficulties because of different runtime structures and different initialization, etc., requirements. You may have to pay many dollars for this capability.

5.10.1 What do you have in mind? Fortran formats?
Why not say `an implementation shall provide appropriate libraries to allow convenient reading and writing of Fortran format text files'?

-s

Stavros Macrakis
Harvard and Intermetrics

Note: I consult for Intermetrics' Ada group.
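
The suggestion in the Capacity comments above, to measure program size by semicolons rather than source lines, is easy to try out. What follows is a rough sketch, not a real Ada lexer: the function name and the sample text are hypothetical, and the handling of character literals is a heuristic that only recognizes the three-character form such as `';'`. It skips `--` comments and anything inside string literals.

```python
def count_semicolons(source):
    """Count semicolons in Ada source, ignoring comments and literals.

    Rough heuristic only: Ada string literals cannot span lines, so the
    in-string state is reset at the start of every line.
    """
    count = 0
    for line in source.splitlines():
        in_string = False
        i = 0
        while i < len(line):
            ch = line[i]
            if in_string:
                if ch == '"':
                    # A doubled "" closes and immediately reopens the string.
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "'" and i + 2 < len(line) and line[i + 2] == "'":
                i += 2  # skip a character literal such as ';'
            elif line.startswith("--", i):
                break   # the rest of the line is a comment
            elif ch == ";":
                count += 1
            i += 1
    return count


# Hypothetical sample: six source lines, but only four semicolons that
# terminate declarations or statements.
SAMPLE = '''
procedure Demo is          -- a comment; with a semicolon
   Sep : constant Character := ';';
   Msg : constant String := "a;b -- not a comment";
begin
   null;
end Demo;
'''

print(count_semicolons(SAMPLE))
```

A measure like this is insensitive to layout and commenting style, which is presumably the point of the suggestion; an attribute tick directly before a parenthesized character literal would confuse the heuristic above, so a real tool would want a proper tokenizer.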