clifton_r@verifone.com (10/26/90)
I am posting this article as a first attempt to fill a glaring gap in
the documentation available for the C language.

As we all know, the movement to standardize C has led to the ANSI
Standard for C, X3.159-1989. This has settled on standards for many
previously divergent aspects of C. However, it also enhanced the
language substantially. While this has moved the language forward, and
has provided many badly needed facilities, it failed to resolve
questions about definition or standards for older dialects of C.

By far the most common dialect of C has been the "UNIX Version 7" or
"V7" C dialect, known to most of us who were using "void" and "enum"
and "unsigned char" before ANSI defined them. Many compilers have been
written for this dialect, and this effort arose out of an attempt to DQ
one. However, as far as I have been able to tell, there has been no
published standard or reference on V7 C, and what it consists of. The
following is an attempt to draft one.

I would welcome any feedback; I recognize that this document is far
from perfect as it stands, and would appreciate feedback on errors or
suggestions on improvements. My e-mail address is given below.

Those with no interest in C, or no interest in the Version 7 dialect of
C, can skip on to the next article now. Those interested, read on!


                UNIX V7 C Language Specification

                Revision A.1.  24 October 1990

Clifton W. Royston III
VeriFone, Inc.
HNL - Software Tools
100 Kahelu Avenue
Mililani, HI 96789
Tel: +1 808 623 2911
FAX: +1 808 623 3201
E-mail: CLIFTON_R (within VeriFone)
   (or) clifton_r@zon.verifone.com


0. INTRODUCTION

0.1 Overview

Because there is no comprehensive specification available for UNIX
"Version 7" C (V7 C), this document will specify the language in terms
of its extensions and differences from the "K & R" C language as
specified in D.M. Ritchie's _The C Programming Language -- Reference
Manual_ [1]. The same document was also published as Appendix A to the
1978 edition of B.W. Kernighan and D.M. Ritchie's _The C Programming
Language_ [2]. This document incorporates the language changes
published by Ritchie as "Recent Changes to C" [3]. Finally, it relies
heavily on Samuel Harbison & Guy Steele's _C: A Reference Manual_ [4]
to identify which practises are or were common usage.

Thomas Plum's comments on older dialects of C, such as V7 C, are
appropriate to cite here:

    "There is only one Standard for C compilers; it is ANSI
    X3.159-1989. Nothing else is a standard, especially not the
    Appendix A of the 1979 Kernighan and Ritchie book. Vendors should
    not specify or require "conformance" to such non-standards....
    [If testing against a V7 specification or test suite]... any
    "errors" or "remarks" generated in this fashion should just be
    considered as items to be attended to and discussed, not as any
    indication of "non-conforming" features." [5]

0.2. Extensions to the "K & R" C language.

The bulk of this document is organized to correspond to the section
numbers and section names given in Ritchie [1]. The modifications
specified incorporate the less well-known document "Recent Changes to
C" [3], which added enums and additional structure operations to the
language.

If Kernighan & Ritchie is used as a reference, the 1978 edition [2]
must be used. Appendix A of Kernighan & Ritchie in the 1988 edition [6]
is based on the ANSI C specification, and does not correspond to any C
implementation prior to ANSI standard C.

To summarize the usual V7 C extensions, this implementation of the C
language supports:

  o the "enum" declaration for enumerated types [3];
  o structure assignment [3];
  o structure passing to functions [3];
  o structure returning from functions [3];
  o union assignment, passing to functions, return from functions;
  o the "void" data type (but not the "void *" of ANSI C);
  o the "signed char" type;
  o the "unsigned char", "unsigned short", and "unsigned long" types;
  o conditional assignment of structures and unions via "a?b:c";
  o and calling of function pointers without an explicit dereference.

It also supports the following C features which are considered archaic
or traditional, and are not supported in the most recent C compilers:

  o "Old-fashioned" initializers, such as "int i 3;";
  o "Old-fashioned" assignment operators, such as "=+", "=-", etc.;
  o Use of non-octal digits in octal constants, such as 088
    (translated as 72 decimal).

1. Introduction

Section 1. (Introduction) of Ritchie does not apply.

This document is aimed at a broad class of C compilers, for a variety
of CPUs and for UNIX and non-UNIX operating systems. It attempts to
describe the functionality of, and the language implemented by, the
majority of C compilers released with UNIX Version 7. That is, it
intends to describe the C dialect generally classed as "UNIX Version 7"
C or V7 C. In some areas, a single behavior or feature set will be
specified; in other areas, a range of behaviors will be described, any
one of which may be considered acceptable or normal in a given
implementation.

2. Lexical conventions

Section 2. applies in full.

2.1. Comments

Section 2.1. applies with the following modification:

Where an ambiguous sequence of characters containing a '/*' occurs, and
the '/' could be part of the preceding token or part of the '/*'
comment indicator, the '/' should be taken as part of the comment
indicator. In practise, the only such sequence is the string =/*; this
string should be interpreted as '=' '/*', not as '=/' '*'.

        a =/* This is a correct comment */b+3;
        j=/*iptr;    /* The code on the left is wrong */

2.2 Identifiers

Section 2.2. applies with the following modifications:

Identifiers beginning with underscores should be avoided, as some names
which start with underscores are reserved for system libraries, and
certain other names beginning with underscores may be automatically
generated in later stages of the programming system.
However, it is not considered an error for the programmer to define an
identifier beginning with an underscore, and no compiler error or
warning should be emitted.

In V7 C compilers, the number of significant characters in an
identifier is implementation dependent; 31 characters is common. The
Ritchie limit of 8 characters cannot be relied on, and the programmer
must not assume that two names which differ only after the 8th
character will be considered identical. The compiler documentation
should specify the number of significant characters in internal
identifiers and in external identifiers.

2.3 Keywords

Section 2.3. applies with the following modifications:

The words "signed", "void", "enum", and possibly "asm" are reserved as
keywords. The words "entry" and "fortran" are not reserved.

2.4. Constants

Section 2.4. applies in full.

2.4.1. Integer constants

Section 2.4.1. applies in full, with the following clarifications:

It is important to note that integer constants have no sign prefix, and
hence that values such as "-5" are constant expressions, not constants.
The type-conversion rules may sometimes cause unexpected results when
using such constants within a program. In particular, in an
implementation where the "int" type is 16-bit two's-complement, the
value "-32768" is a constant expression with type "long", even though
the value of the expression can be represented by an "int".

If the value of a decimal constant is greater than the largest value
representable as "long", or an octal or hex constant is greater than
the largest value representable as "unsigned long", the result is
undefined. It is preferable for the compiler to generate a warning in
this case. However, on most V7 C compilers, no warning will be issued,
and a different value will be assigned in place of the constant; the
value substituted will be implementation dependent, but sometimes may
be equal to the low-order portion of the value, taken as a "long" or
"unsigned long".

2.4.2. Explicit long constants

Section 2.4.2. applies in full, with the following clarifications:

It is important to note that long constants have no sign prefix, and
hence that values such as "-5L" are constant expressions, not
constants. The type-conversion rules may sometimes cause unexpected
results when using such constants within a program.

If the value of a decimal long constant is greater than the largest
value representable as "long", or an octal or hex constant is greater
than the largest value representable as "unsigned long", the result is
undefined. It is preferable for the compiler to generate a warning in
this case. However, on most V7 C compilers, no warning will be issued,
and a different value will be assigned in place of the constant; the
value substituted will be implementation dependent, but sometimes may
be equal to the low-order portion of the value, taken as a "long" or
"unsigned long".

2.4.3 Character constants

Section 2.4.3. applies in full, with the following clarifications:

The characters '\a' and '\x' are equal to the characters 'a' and 'x',
respectively. That is, "a" and "x" have no special interpretation
following a backslash. (This is in accordance with Ritchie; the
characters have been given a new interpretation in the subsequent ANSI
standard.)

2.4.4 Floating constants

Section 2.4.4. applies with the following addition:

Floating point formats are compiler-, hardware-, and
operating-system-dependent. Implementations may vary a great deal as to
the level of compile-time arithmetic and validation which they are able
to perform on floating-point constants and floating-point constant
expressions. The compiler documentation should describe the format used
for floating point constants and any range-checking or arithmetic
performed at compile-time.

2.5. Strings

Section 2.5. applies with the following modification and addition:

The result of modifying the contents of a string constant is undefined;
it may behave "as expected", or result in behavior which is
non-portable or implementation dependent.

String constants with the identical value are not guaranteed either to
be distinct or to share storage. The Ritchie specification guaranteed
them to be distinct; however, not all V7 C compilers have followed this
practise. Compiler documentation should specify the compiler's practise
in this regard.

2.6 Hardware characteristics

[Compiler-, hardware-, and operating-system-specific information]

3. Syntax notation

Section 3. (describing type-faces used within the Reference Manual)
does not apply to this document.

4. What's in a name (types)

Section 4. applies with the following additions:

The type "void" is added. Functions declared as returning "void" do not
return a value; returning a value from a "void" function is invalid.
Expressions may be cast to type "void" to explicitly discard their
value. Declaring an object as type "void" is invalid, as is taking the
value of a "void" function or a "void" expression.

The "signed" attribute may be used to modify an integral type
declaration. Since this is the default attribute for "int"s, "short"s,
and "long"s, this is primarily useful for "char" declarations.

The "char" type may be either signed or unsigned by default. A "char"
type declaration may be explicitly declared as either "signed" or
"unsigned". The "unsigned" attribute may also be applied in combination
with a "short" or "long" size declaration.

A simple type declaration may include a storage class, fundamental
type, sign attribute (for "char" or "int"), and size attribute (for
"int" only). These elements may appear in any order within the type
declaration, but only one of each may occur.

5. Objects and lvalues

Section 5. applies in full.

6. Conversions

Section 6. applies in full.
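
As a hedged illustration of the Section 4 additions (the "void" type,
the cast to "void", and explicit "signed" or "unsigned" on "char"), here
is a minimal sketch. It is written with ANSI-style function headers so
that a current compiler will accept it; a V7 C compiler would need
old-style parameter declarations and might not supply <limits.h>. The
names bump, touch, widen_signed, widen_unsigned and check_char_types are
hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper whose result the caller may not need. */
static int bump(int *counter)
{
    *counter += 1;
    return *counter;
}

/* A "void" function returns no value; a bare "return;" is allowed. */
void touch(int *counter)
{
    (void) bump(counter);   /* cast to void: value explicitly discarded */
    return;
}

/* Explicitly signed char: sign-extended when widened to int. */
int widen_signed(void)
{
    signed char sc = -1;

    return sc;
}

/* Explicitly unsigned char: not sign-extended when widened to int. */
int widen_unsigned(void)
{
    unsigned char uc = UCHAR_MAX;   /* all bits set */

    return uc;
}

/* Self-check of the behavior described above. */
void check_char_types(void)
{
    int n = 0;

    touch(&n);
    assert(n == 1);
    assert(widen_signed() == -1);
    assert(widen_unsigned() == UCHAR_MAX);
}
```

Calling check_char_types() from a test driver exercises all three
points. A plain "char" is deliberately not tested, since its signedness
is left to the implementation.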

6.1 Characters and integers

Section 6.1 applies with the following additions:

The "char" type may be either signed or unsigned by default. A "char"
type declaration may be explicitly declared as either "signed" or
"unsigned". Unsigned "char" types are not sign-extended when they are
converted to type "int"; signed "char" types are sign-extended when
they are converted to type "int". A "char" appearing in an expression
is widened to the "int" type as part of the usual arithmetic
conversions.

Many V7 C compilers apply the "unsigned preserving" convention for
conversions involving an explicitly unsigned type and a wider signed
type; in this case, an unsigned type will always be widened to another
unsigned type. For instance, an "unsigned char" appearing in an
expression with an "int" will be widened to "unsigned int", and the
"int" will therefore also be converted to "unsigned int". An "unsigned
int" appearing in an expression with a "long" will be widened to
"unsigned long", and the "long" will therefore also be converted to
"unsigned long".

Other V7 C compilers apply the "value preserving" convention for such
conversions. In this case, an unsigned type will always be widened to
the next largest type which can hold its full range of values,
generally a signed type. For instance, an "unsigned char" appearing in
an expression with an "int" will be widened to "int"; an "unsigned int"
appearing in an expression with a "long" will be widened to "long".

The compiler documentation should specify which convention is used for
widening unsigned types.

6.2 Float and double

Section 6.2 applies in full, with the following clarifications:

As all floating point arithmetic operations are done in
double-precision, the "sizeof" operator will give a different result
when applied to a floating point expression than when applied to a
floating point variable (or, with some compilers, to a floating point
constant expression).
For example, following the definition "float f; int i;", the expression
"sizeof(f)" should be equal to "sizeof(float)", whereas "sizeof(f+i)"
should be equal to "sizeof(double)".

6.3 Floating and integral types

Section 6.3 applies in full.

6.4 Pointers and integers

Section 6.4 applies in full.

6.5 Unsigned

Section 6.5 applies in full.

6.6 Arithmetic conversions

Section 6.6 applies in full.

7. Expressions

Section 7. applies in full.

7.1. Primary expressions

Section 7.1. applies in full, with the following modification:

The primary expression preceding the parameter list of a function call
may be either of type "function returning ..." or "pointer to function
returning ...". If its type is "pointer to function returning ...", the
pointer will be implicitly dereferenced before executing the call. That
is, if f is declared as "int (*f)();" (pointer to function returning an
integer), the calls "f()" and "(*f)()" are equivalent.

7.2 Unary operators

Section 7.2. applies with the following additions:

Any expression may be cast to type "void". This discards its value.

Note that "++" and "--" are defined by this section to be unary
operators, and are of lower precedence than the primary operators "->"
and "[]". This means that expressions of the form "a++[1]" or
"b++->next" are technically invalid; they must be parenthesized as
"(a++)[1]" or "(b++)->next", respectively. However, some V7 C compilers
may accept and correctly process expressions of this form.

7.3 Multiplicative operators

Section 7.3 applies in full.

7.4 Additive operators

Section 7.4 applies in full.

7.5 Shift operators

Section 7.5 applies in full.

7.6 Relational operators

Section 7.6 applies in full.

7.7 Equality operators

Section 7.7 applies in full.

7.8 Bitwise AND operator

Section 7.8 applies in full.

7.9 Bitwise exclusive OR operator

Section 7.9 applies in full.

7.10 Bitwise inclusive OR operator

Section 7.10 applies in full.

7.11 Logical AND operator

Section 7.11 applies in full.

7.12 Logical OR operator

Section 7.12 applies in full.

7.13 Conditional operator

Section 7.13 applies in full, with the following addition:

The second and third expressions of the conditional may be structures
or unions of the same type, in which case the type of the expression is
the common structure or union type. Not all V7 C compilers accept
conditional assignment of structures or unions.

7.14 Assignment operators

Section 7.14 applies in full, with the following additions:

The simple assignment operator = may be used to assign a structure to
another structure. No compound assignment operators may be applied to a
structure. The method of structure assignment is not specified;
however, after an assignment, it is guaranteed that the content of each
element of the left operand is equal to the content of the
corresponding element of the right operand. A union may also be
assigned to another union in the same fashion, with the same
stipulation as to the effect of the assignment.

Each of the compound assignment operators may be written in its
"archaic" (backwards) form. These operator forms are:

        lvalue =+ expression    equivalent to   lvalue += expression
        lvalue =- expression    equivalent to   lvalue -= expression
        lvalue =* expression    equivalent to   lvalue *= expression
        lvalue =/ expression    equivalent to   lvalue /= expression
        lvalue =% expression    equivalent to   lvalue %= expression
        lvalue =>> expression   equivalent to   lvalue >>= expression
        lvalue =<< expression   equivalent to   lvalue <<= expression
        lvalue =& expression    equivalent to   lvalue &= expression
        lvalue =^ expression    equivalent to   lvalue ^= expression
        lvalue =| expression    equivalent to   lvalue |= expression

However, it is strongly recommended that these forms should not be
used, as they are not allowed in ANSI C and are syntactically
ambiguous. For example, "a=-3;" must be interpreted as subtracting 3
from the variable "a", whereas it was more likely intended to set "a"
to -3.

7.15 Comma operator

Section 7.15 applies in full.
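
The Section 7 additions (calling through a function pointer without an
explicit dereference, structure assignment, and a conditional whose
operands are structures) can be sketched as follows. This is only an
illustration under stated assumptions: it uses ANSI-style function
headers so that a current compiler accepts it, and the names point,
make_point, pick, sum, call_both_ways and check_section7 are
hypothetical.

```c
#include <assert.h>

struct point {
    int x;
    int y;
};

/* Structure returned by value from a function. */
struct point make_point(int x, int y)
{
    struct point p;

    p.x = x;
    p.y = y;
    return p;
}

/* Conditional operator whose second and third operands are structures
   of the same type; the result has that structure type. */
struct point pick(int use_first, struct point a, struct point b)
{
    struct point chosen;

    chosen = use_first ? a : b;     /* simple structure assignment */
    return chosen;
}

/* Hypothetical function called through a pointer below. */
int sum(struct point p)
{
    return p.x + p.y;
}

/* Both call forms go through the same function pointer. */
int call_both_ways(struct point p)
{
    int (*fp)(struct point) = sum;

    return fp(p) + (*fp)(p);        /* implicit and explicit dereference */
}

/* Self-check of the behavior described above. */
void check_section7(void)
{
    struct point a, b, c;

    a = make_point(1, 2);
    b = make_point(10, 20);

    c = pick(0, a, b);
    assert(c.x == 10 && c.y == 20);
    c = pick(1, a, b);
    assert(c.x == 1 && c.y == 2);
    assert(call_both_ways(a) == 6);
}
```

The archaic "=op" forms cannot be demonstrated this way, because an
ANSI compiler reads "a=-3;" as assigning -3 to "a".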

8. Declarations

Section 8. applies in full, with the following clarification:

The declaration specifiers within a declaration list may appear in any
order; thus "int register short unsigned" is a valid declaration
specifier.

NOTE: The syntax given specifically permits the declarator-list to be
omitted. Therefore "int ;" is a perfectly valid declaration, which
declares no variables. A compiler may issue a warning for a declaration
of this form.

8.1. Storage class specifiers

Section 8.1. applies in full.

8.2 Type specifiers

Section 8.2. applies in full, with the following additions:

"void" and "signed" must be added to the list of possible
type-specifiers. enum-specifier must be added to the list of possible
type-specifiers. (See Section 8.9 for the syntax of enum-specifiers.)

The word "signed" may be thought of as an additional adjective which
may be applied to an integral type specifier; moreover, "unsigned" may
be used in combination with "char", "short", and "long". This means the
following additional combinations are acceptable:

        unsigned char
        unsigned short  (or) unsigned short int
        unsigned long   (or) unsigned long int
        signed char
        signed short    (or) signed short int
        signed          (or) signed int
        signed long     (or) signed long int

The keywords "signed", "unsigned", and "short" may not be applied to
"float" or "double". The type "long double" is not supported.

A V7 C compiler should always support the "void" type and "enum" types.

8.3 Declarators

Section 8.3 applies in full.

8.4 Meaning of declarators

Section 8.4 applies in full.

8.5 Structure and union declarations

Section 8.5 applies in full.

For V7 C compilers, it is considered standard for each structure or
union type to have its own name space. That is, the same component name
may appear within two or more structure types, with reference to
completely different components.
In Ritchie, component names for all structures were drawn from a single
name space, and components of different structures could have the same
name only if they had the identical type and identical offset (relative
position) within the structure. A V7 C compiler should not enforce this
restriction.

8.6 Initialization

Section 8.6 applies in full, with the following addition:

The syntax given allows initializers to be specified for formal
parameters to a function. For example, the syntax allows:

        int f(a) int a=1; { return a; }

A compiler may issue a warning for a declaration of this form.

8.7 Type names

Section 8.7 applies in full.

8.8 Typedef

Section 8.8 applies, with the following additions:

Some V7 C implementations allow the programmer to use a combination of
typedef names and other type specifiers within a declaration. Example:

        typedef long int bigint;
        unsigned bigint x;

However, this is not recommended; it is preferred for the compiler to
generate an error or warning message if a typedef name is used in
combination with any other type specifier.

8.9. Enumeration type

The following section is not in Kernighan & Ritchie [2]; it is a direct
quotation of the information contained in "Recent Changes to C" [3].

        enum-specifier:
                "enum" "{" enum-list "}"
                "enum" identifier "{" enum-list "}"
                "enum" identifier

        enum-list:
                enumerator
                enum-list "," enumerator

        enumerator:
                identifier
                identifier "=" constant-expression

The role of the identifier in the enum-specifier is entirely analogous
to that of the structure tag in a struct-specifier; it names a
particular enumeration. For example,

        enum color { chartreuse, burgundy, claret, winedark };
        ...
        enum color *cp, col;

makes color the enumeration-tag of a type describing various colors,
and then declares "cp" as a pointer to an object of that type and "col"
as an object of that type.

The identifiers in the enum-list are declared as integral constants and
may appear wherever constants are required.
If no enumerators with "=" appear, then the values of the constants
begin at 0 and increase by 1 as the declaration is read from left to
right. An enumerator with "=" gives the associated identifier the value
indicated; subsequent identifiers continue the progression from the
assigned value.

Enumeration tags and constants must all be distinct, and unlike
structure tags and members, are drawn from the same set as ordinary
identifiers.

Objects of a given enumeration type are regarded as having a type
distinct from objects of all other types.

[New material added:]

If the progression from one enumerator to another leads to a value
greater than the maximum signed integer value, the value assigned to
the subsequent enumerators will be implementation-dependent. In
general, it is likely to be treated either as a negative integer or as
a large unsigned integer. (This is likely to happen if an enumerator is
defined with "=" to a very large value such as the maximum integer, and
is followed by subsequent enumerator identifiers.) The compiler may or
may not generate a warning for this case.

9. Statements

Section 9. applies in full.

9.1 Expression statement

Section 9.1 applies in full.

9.2 Compound statement, or block

Section 9.2 applies in full.

9.3 Conditional statement

Section 9.3 applies in full.

9.4 While statement

Section 9.4 applies in full.

9.5 Do statement

Section 9.5 applies in full.

9.6 For statement

Section 9.6 applies in full.

9.7 Switch statement

Section 9.7 applies in full, with the following clarification:

If the constant expression for a case value evaluates to a value
greater than the maximum integer (for instance a constant expression of
type "long" or "unsigned"), it will be truncated to an integer value.
The compiler may or may not generate a warning in this case.

9.8 Break statement

Section 9.8 applies in full.

9.9 Continue statement

Section 9.9 applies in full.
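
To tie the enumeration rules of Section 8.9 to the switch statement of
Section 9.7, the following sketch shows enumerators with and without
"=" and their use as case labels. It is an illustration only, written
with ANSI-style function headers so that a current compiler accepts it;
the names shade, rank and check_enum are hypothetical.

```c
#include <assert.h>

/* Values: red = 0, green = 1, blue = 10, violet = 11.
   "blue" restarts the progression; "violet" continues from it. */
enum shade { red, green, blue = 10, violet };

/* Enumeration constants may appear wherever constants are required,
   for instance as case labels. */
int rank(enum shade s)
{
    switch (s) {
    case red:
        return 1;
    case green:
        return 2;
    case blue:
        return 3;
    case violet:
        return 4;
    default:
        return 0;
    }
}

/* Self-check of the numbering rules described in Section 8.9. */
void check_enum(void)
{
    enum shade col;
    enum shade *cp;

    col = violet;
    cp = &col;
    assert(red == 0 && green == 1);
    assert(blue == 10 && violet == 11);
    assert(rank(*cp) == 4);
}
```

The sketch stays well inside the range of "int", so it does not touch
the implementation-dependent overflow case noted in the new material of
Section 8.9.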

9.10 Return statement

Section 9.10 applies with the following modifications:

The value returned should be assignment-compatible with the type of the
function in which it appears. It is incorrect for a function to return
an expression which is not assignment-compatible with the type of the
function, and the compiler should generate a warning or error message
in this case. A function may return a structure or union, if the
function was defined as returning a structure or union of that type.

Statements of the form "return;" are the only form of return statement
allowed in a function which is declared as returning "void". It is
incorrect to return any expression from such a function, and the
compiler should generate a warning or error message in this case.

9.11 Goto statement

Section 9.11 applies in full.

9.12 Labeled statement

Section 9.12 applies in full.

9.13 Null statement

Section 9.13 applies in full.

10. External definitions

Section 10. applies in full.

10.1 External function definitions

Section 10.1 applies, with the following modifications:

A structure or union may appear as the type-specifier for a function,
and may be declared as a formal parameter to a function.

10.2 External data definitions

Section 10.2 applies in full.

11. Scope rules

Section 11. applies in full.

11.1 Lexical scope

Section 11.1 applies in full, with the following clarifications:

The name space for formal parameters to functions is the same as that
for typedef names and other global identifiers. It is invalid to use a
keyword for a formal parameter name, and the compiler should generate a
warning or error message in this case. However, it is legal for a
formal parameter to be an identifier which is already in use as a
globally scoped identifier, such as a typedef name, variable, or
function. In this case, the parameter will suspend the declaration of
the global identifier within the lexical scope of the function.
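
The last point of Section 11.1, a formal parameter suspending a global
declaration of the same name, can be sketched as below. The example
uses a global variable rather than a typedef name to keep it simple,
and ANSI-style function headers so that a current compiler accepts it;
the names scale, apply, apply_default and check_scope are hypothetical.

```c
#include <assert.h>

int scale = 100;                /* globally scoped variable */

/* The parameter "scale" suspends the global "scale" inside this
   function, so the multiplication uses the argument. */
int apply(int value, int scale)
{
    return value * scale;
}

/* No parameter is named "scale" here, so the global one is visible. */
int apply_default(int value)
{
    return value * scale;
}

/* Self-check of the scope behavior described above. */
void check_scope(void)
{
    assert(apply(2, 3) == 6);
    assert(apply_default(2) == 200);
    assert(scale == 100);       /* the global was never modified */
}
```

Outside apply() the global declaration is in force again, which is why
apply_default() sees the value 100.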

11.2 Scope of externals

Section 11.2 applies in full, with the following additions:

The restriction specified, that a multi-module program must contain one
and only one external definition of an identifier without the keyword
"extern", generally must be enforced by the linker, not the compiler.
Because of this, many V7 C compilers are unable to apply this
restriction. The compiler documentation should describe whether this
restriction is enforced.

There are two different approaches taken to the scope of an "extern"
definition within an inner block. Most V7 C compilers will take the
approach defined by Section 11.1: "because all references to the same
external identifier refer to the same object... their scope is
increased to the whole file in which they appear." That is, the scope
of an "extern" identifier extends from the line on which it appears to
the end of the source file. A few V7 C compilers may take the approach
later codified by ANSI, in which "extern" declarations follow lexical
scope; that is, an "extern" definition within an inner block is visible
only until the end of that block. The compiler documentation should
describe which scope rule is followed in this case.

12. Compiler control lines

Section 12. applies in full, with the following addition:

The "#" character indicating a compiler control, or preprocessor, line
may be required to be in the first column. However, it is preferable
for the compiler or preprocessor to recognize a compiler control line
whenever the "#" is the first non-blank character of an input line.

The "#" character may also be required to immediately precede the
preprocessor command, with no intervening whitespace. However, it is
preferable for the compiler or preprocessor to recognize a preprocessor
command whenever the command is the first token following the "#" on an
input line.

Comments or whitespace on an input line following a preprocessor
command should be ignored.
In most V7 C implementations, any tokens which are not used as the
arguments to the preprocessor command are ignored. However, the
compiler or preprocessor may issue a warning when there are extraneous
arguments present. Example:

        "#undef XYZ notneeded" may cause a warning.
        "#undef XYZ /* not needed */" should always compile without
        warning.

12.1 Token replacement

Section 12.1 applies in full, with the following addition:

If the #define command is used to define an identifier which has been
previously defined, the result is compiler dependent. If the two
definitions are identical, token-for-token ("benign redefinition"), the
new definition may simply be ignored. If the definitions are different,
it is preferable for the compiler to generate a warning message; some
V7 C compilers will accept the redefinition silently, or will generate
an error message.

The #undef command takes only one parameter; after that parameter, any
trailing parameters on the input line may be ignored. This is the most
common behavior for V7 C compilers; however, the compiler may issue a
warning for the trailing parameters.

Note: The "#" and "##" operators for string processing within a
preprocessor line are additions defined by the ANSI C specification,
and are not supported in V7 C.

12.2 File inclusion

Section 12.2 applies in full.

12.3 Conditional compilation

Section 12.3 applies in full, with the following additions:

Any trailing parameters on an #else or #endif line may be ignored. Any
trailing parameters after the first parameter to an #ifdef or #ifndef
line may be ignored. This is the most common behavior for V7 C
compilers; however, the compiler may issue a warning for trailing
parameters.

12.4 Line control

Section 12.4 applies in full, with the following clarification:

A #line command containing an identifier or string constant to set the
file name should appear before any #line command with only a line
number. If this does not occur (i.e. if a #line command with only a
line number occurs first), then the value of the file name which will
be used in error diagnostics or returned by the __FILE__ macro is
implementation dependent. A compiler may issue a warning in this case.

12.5 Implicit macros

The following section is not in Ritchie [1]; it is an addition to this
specification.

V7 C compilers normally provide the built-in __FILE__ and __LINE__
macros for use by the programmer. The __FILE__ macro is predefined to
be the string name of the source file being compiled, and the __LINE__
macro is predefined to be the current line number of the source file
being compiled; the value of the macro is taken dynamically at each
point where it is evaluated. These macros cannot be undefined (via
#undef) or redefined (via #define).

If the #line directive is used within the program being compiled, the
__FILE__ and __LINE__ macros will take their values from the values
given in the #line statement, not from the actual source file name or
line number.

Some V7 C implementations may define additional macros containing
information about the environment. For instance, a compiler running
under UNIX may predefine the macro variable "unix". The presence of
these macros is implementation-dependent.

13. Implicit declarations

Section 13. applies in full, with the following additions:

As a particular case of the implicit definition rules, if no
declaration is given for a formal parameter to a function, it
implicitly defaults to an "int". (Since it is a parameter, the
storage-class specifier is not applicable.) This is common behavior for
V7 C compilers.

14. Types revisited

Section 14. applies in full.

14.1 Structures and unions

Section 14.1 applies, with the following modifications:

The simple assignment operator = may be used to assign a structure to
another structure. No compound assignment operators may be applied to a
structure.
The method of structure assignment is not specified; however, after an
assignment, it is guaranteed that the content of each element of the
left operand is equal to the content of the corresponding element of
the right operand.

A structure may appear as the type-specifier for a function, may be
declared as a formal parameter to a function, and may be passed as an
actual parameter to a function. A function may return a structure, if
the function was defined as returning a structure of that type.

A union may appear in all those contexts listed for a structure. That
is, it may be assigned to a union of the same type, specified as the
type for a function, declared as a formal parameter to a function,
passed as an actual parameter to a function, and returned from a
function whose type is a union of the same type.

14.2 Functions

Section 14.2 applies, with the following modifications:

A function pointer followed by a parenthesized parameter list is
interpreted as a dereference of the function pointer, followed by a
call of the function; thus following the declaration "int (*funcp)();"
the two statements "(*funcp)();" and "funcp();" are equivalent.

14.3 Arrays, pointers, and subscripting

Section 14.3 applies in full, with the following clarifications:

It is important to note that even though an identifier of array type
will be converted to a pointer to the first member of an array when it
appears within an expression, or as an actual or formal parameter to a
function, a declaration of an array is NOT equivalent to a declaration
of a compatible pointer. In particular, declaring an external
identifier as an array in one module and as a pointer in a different
module is incorrect and will usually lead to serious run-time errors,
due to the different levels of indirection associated with the name.

14.4 Explicit pointer conversions

Section 14.4 applies in full.

15. Constant expressions

Section 15. applies in full.

16. Portability considerations

Section 16. applies in full.

17. Anachronisms

Section 17. applies in full, with the following additions:

The normal behavior for V7 C compilers is to accept both of the
obsolete constructions listed, namely "=op" for assignment operators
and "int x 1" for initializers. The compiler may generate a warning
message. Compiler documentation should specify whether the obsolete
forms are accepted.

18. Syntax Summary

The comment of Section 18. most definitely applies: the syntax given is
not adequate for use in writing a compiler, but can be used, for
example, to help check the correctness of an expression.

18.1 Expressions

18.1 applies in full.

18.2 Declarations

18.2 applies with the following modifications:

Add enum-specifier to the type-specifier list. Add "signed" and "void"
to the type-specifier list.

        enum-specifier:
                "enum" "{" enum-list "}"
                "enum" identifier "{" enum-list "}"
                "enum" identifier

        enum-list:
                enumerator
                enum-list "," enumerator

        enumerator:
                identifier
                identifier "=" constant-expression

18.3 Statements

18.3 applies in full.

18.4 External definitions

18.4 applies in full.

18.5 Preprocessor

18.5 applies in full.


REFERENCES

[1] Dennis M. Ritchie, _The C Programming Language -- Reference
    Manual_, pp. 247-276 in _UNIX Programmer's Manual, Vol. 2_, 7th
    edition, Bell Telephone Laboratories, Inc., Holt, Rinehart and
    Winston, New York, NY [1979,1983]

[2] Brian W. Kernighan & Dennis M. Ritchie, _The C Programming
    Language_, 1st ed., Prentice Hall, Englewood Cliffs, NJ [1978]

[3] Dennis M. Ritchie, _Recent Changes to C; November 15, 1978_, p. 277
    in _UNIX Programmer's Manual, Vol. 2_, 7th edition, Bell Telephone
    Laboratories, Inc., Holt, Rinehart and Winston, New York, NY
    [1979,1983]

[4] Samuel P. Harbison & Guy L. Steele Jr., _C: A Reference Manual_,
    2nd ed., Prentice Hall, Englewood Cliffs, NJ [1987]

[5] _The Plum Hall Validation Suite_, Plum Hall, Inc., Cardiff, NJ
    [1990]

[6] Brian W. Kernighan & Dennis M. Ritchie, _The C Programming
    Language_, 2nd ed., Prentice Hall, Englewood Cliffs, NJ [1988]
henry@zoo.toronto.edu (Henry Spencer) (10/28/90)
In article <2442.272704b8@verifone.com> clifton_r@verifone.com writes:
> I am posting this article as a first attempt to fill a glaring gap in
>the documentation available for the C language.

Uh, what glaring gap? Implementors are aiming at ANSI C, which is well documented. People who have to use a wide variety of earlier compilers tend to use H&S as the basic reference. Attempting to specify a single standard for pre-ANSI C is pointless: the users can't rely on it because the pre-ANSI compilers differ, and the implementors won't care because ANSI compatibility is their major concern now and they're not interested in conforming to a pseudo-standard that nobody else conforms to.

Incidentally, I think you are grossly underestimating the labor involved in producing a high-quality standard. You would be much better off to start with ANSI C and specify deletions and modifications to it.

> To summarize the usual V7 C extensions, this implementation of the C
>language supports:
> ...
> o the "void" data type (but not the "void *" of ANSI C);
> o the "signed char" type;
> o the "unsigned char", "unsigned short", and "unsigned long" types;
> o and calling of function pointers without an explicit dereference.

Here we already see the "standard" falling apart. None of these things were in V7 C, although some (not all!) implementors added them later. `signed char' is particularly odd, since as far as I know `signed' was an X3J11 invention and there were *no* pre-ANSI compilers featuring it.

Please don't try to fob off the peculiar specs of your own pet compiler as a "standard". Your time would be better spent fixing your compiler to conform to the standard we already have.

--
The type syntax for C is essentially   | Henry Spencer at U of Toronto Zoology
unparsable. --Rob Pike                 | henry@zoo.toronto.edu   utzoo!henry
gwyn@smoke.brl.mil (Doug Gwyn) (10/29/90)
In article <1990Oct27.230447.5456@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>`signed char' is particularly odd, since as far as I know `signed' was
>an X3J11 invention and there were *no* pre-ANSI compilers featuring it.

Actually, there was existing practice here (Whitesmiths).

I agree with your comments about the impracticality of attempting to define a "UNIX Version 7 C standard". Having maintained versions of BOTH 7th Edition UNIX C compilers, I can add that even the genuine article came in two not entirely equivalent flavors.

X3J11 took the UNIX C reference manual (essentially an update to K&R 1st Edition Appendix A) as the language base document for the eventual C standard. Changes and additions made during this process were the result of trying to accommodate important real-world concerns that any such standard should have addressed. Anyone who thinks that he can do better working on his own must be woefully ignorant of the issues involved.

A large number of the world's most experienced experts in the use and implementation of the C programming language have finally produced the first genuine, officially sanctioned standard for C; use it and be happy.