[comp.lang.c] Tentative specification for UNIX Version 7 C

clifton_r@verifone.com (10/26/90)

   I am posting this article as a first attempt to fill a glaring gap in
the documentation available for the C language.
   As we all know, the movement to standardize C has led to the ANSI 
Standard for C, X3.159-1989.  This has settled on standards for many 
previously divergent aspects of C.  However, it also enhanced the language
substantially.  While this has moved the language forward, and has provided
many badly needed facilities, it failed to resolve questions about
definition or standards for older dialects of C. 
   By far the most common dialect of C has been the "UNIX Version 7" or 
"V7" C dialect, known to most of us who were using "void" and "enum" and 
"unsigned char" before ANSI defined them.  Many compilers have been written 
for this dialect, and this effort rose out of an attempt to DQ one.
   However, as far as I have been able to tell, there has been no published 
standard or reference on V7 C, and what it consists of.  The following is 
an attempt to draft one.  I recognize that this document is far from 
perfect as it stands, and would welcome corrections of errors and 
suggestions for improvement.  My e-mail address is given below.
   Those with no interest in C, or no interest in the Version 7 dialect
of C, can skip on to the next article now.  Those interested, read on!

                      UNIX V7 C Language Specification
                       Revision A.1.  24 October 1990

                           Clifton W. Royston III
                               VeriFone, Inc.
                            HNL - Software Tools
                             100 Kahelu Avenue
                             Mililani, HI 96789

                            Tel: +1 808 623 2911
                            FAX: +1 808 623 3201

                    E-mail: CLIFTON_R (within VeriFone)
                      (or) clifton_r@zon.verifone.com

0. INTRODUCTION

0.1 Overview

     Because there is no comprehensive specification available for UNIX 
"Version 7" C (V7 C), this document will specify the language in terms of 
its extensions and differences from the "K & R" C language as specified in 
D.M. Ritchie's _The C Programming Language -- Reference Manual_ [1].  The 
same document was also published as Appendix A to the 1978 edition of B.W. 
Kernighan and D.M. Ritchie's _The C Programming Language_ [2].  This 
document incorporates the language changes published by Ritchie as "Recent 
Changes to C" [3].  Finally, it relies heavily on Samuel Harbison & Guy 
Steele's _C: A Reference Manual_ [4] to identify which practises are or were 
common usage.

     Thomas Plum's comments on older dialects of C, such as V7 C, are 
appropriate to cite here:

     "There is only one Standard for C compilers; it is ANSI X3.159-1989.  
     Nothing else is a standard, especially not the Appendix A of the 1979 
     Kernighan and Ritchie book.  Vendors should not specify or require 
     "conformance" to such non-standards....  [If testing against a V7 
     specification or test suite]... any "errors" or "remarks" generated in 
     this fashion should just be considered as items to be attended to and 
     discussed, not as any indication of "non-conforming" features." [5]

0.2. Extensions to the "K & R" C language. 

     The bulk of this document is organized to correspond to the section 
numbers and section names given in Ritchie [1].  The modifications specified 
incorporate the less well-known document "Recent Changes to C" [3], which 
added enums and additional structure operations to the language.  If 
Kernighan & Ritchie is used as a reference, the 1978 edition [2] must be 
used.  Appendix A of Kernighan & Ritchie in the 1988 edition [6] is based on 
the ANSI C specification, and does not correspond to any C implementation 
prior to ANSI standard C.

     To summarize the usual V7 C extensions, this implementation of the C 
language supports:
  o  the "enum" declaration for enumerated types [3];
  o  structure assignment [3];
  o  structure passing to functions [3];
  o  structure returning from functions [3];
  o  union assignment, passing to functions, return from functions;
  o  the "void" data type (but not the "void *" of ANSI C);
  o  the "signed char" type;
  o  the "unsigned char", "unsigned short", and "unsigned long" types;
  o  calling of function pointers without an explicit dereference;
  o  and conditional assignment of structures and unions via "a?b:c".
     
     It also supports the following C features which are considered archaic 
or traditional, and are not supported in the most recent C compilers:
  o  "Old-fashioned" initializers, such as "int i 3;";
  o  "Old-fashioned" assignment operators, such as "=+", "=-", etc;
  o  Use of non-octal digits in octal constants, such as 088 (translated as 
     72 decimal.)

1.  Introduction

     Section 1. (Introduction) of Ritchie does not apply.

     This document is aimed at a broad class of C compilers, for a variety 
of CPUs and for UNIX and non-UNIX operating systems.  It attempts to 
describe the functionality of, and the language implemented by, the majority 
of C compilers released with UNIX Version 7.  That is, it intends to 
describe the C dialect generally classed as "UNIX Version 7" C or V7 C.  In 
some areas, a single behavior or feature set will be specified; in other 
areas, a range of behaviors will be described, any one of which may be 
considered acceptable or normal in a given implementation.

2.  Lexical conventions

     Section 2. applies in full.

2.1. Comments

     Section 2.1. applies with the following modification:

     Where an ambiguous sequence of characters containing a '/*' occurs, and 
the '/' could be part of the preceding token or part of the '/*' comment 
indicator, the '/' should be taken as part of the comment indicator.  In 
practise, the only such sequence is the string =/*; this string should be 
interpreted as '=' '/*', not as '=/' '*'.

     "a =/* This is a correct comment */b+3;"

     "j=/*iptr;     /* The code on the left is wrong */"
      
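     The rule can be checked with a short sketch.  The function name is an 
illustrative assumption, and the ANSI-style prototype is used only so that 
the code builds with a current compiler; a V7 compiler would want an 
old-style parameter declaration.

```c
/* The "=" below is followed directly by a comment opener, so the
   statement is an ordinary assignment of b + 3 to a. */
int assign_after_comment(int b)
{
    int a;

    a =/* This is a correct comment */b + 3;
    return a;
}
```

     Calling assign_after_comment(4) yields 7 under this rule.  Reading the 
same characters as the archaic "=/" operator followed by "*" would instead 
leave the rest of the line as an unterminated expression.
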
2.2 Identifiers

     Section 2.2. applies with the following modifications:

     Identifiers beginning with underscores should be avoided, as some names 
which start with underscores are reserved for system libraries, and certain 
other names beginning with underscores may be automatically generated in 
later stages of the programming system.  However, it is not considered an 
error for the programmer to define an identifier beginning with an 
underscore, and no compiler error or warning should be emitted.

     In V7 C compilers, the number of significant characters in an 
identifier is implementation dependent; 31 characters is common.  The 
Ritchie limit of 8 characters can not be relied on, and the programmer must 
not assume that two names differing after the 8th character will be 
considered identical.  The compiler documentation should specify the number 
of significant characters in internal identifiers and in external 
identifiers.
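
     A minimal sketch of the portability hazard follows.  The two variable 
names are hypothetical; they agree in their first 8 characters ("counter_") 
and differ only after that point.

```c
/* Distinct objects wherever more than 8 characters are significant;
   a clash wherever only the first 8 characters are examined. */
int counter_alpha = 1;
int counter_beta = 2;

/* Returns 1 when the compiler and linker kept the two names apart. */
int names_are_distinct(void)
{
    return &counter_alpha != &counter_beta
        && counter_alpha == 1
        && counter_beta == 2;
}
```

     Because both names are external, the limit that matters here is the one 
for external identifiers, which may be shorter than the internal limit.
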

2.3 Keywords

     Section 2.3. applies with the following modifications:

     The words "signed", "void", "enum", and possibly "asm" are reserved as 
keywords.  The words "entry" and "fortran" are not reserved.

2.4. Constants

     Section 2.4. applies in full.

2.4.1. Integer constants

     Section 2.4.1. applies in full, with the following clarifications:

     It is important to note that integer constants have no sign prefix, and 
hence that values such as "-5" are constant expressions, not constants.  The 
type-conversion rules may sometimes cause unexpected results when using such 
constants within a program.  In particular, in an implementation where the 
"int" type is 16-bit two's-complement, the value "-32768" is a constant 
expression with type "long", even though the value of the expression can be 
represented by an "int".
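
     The 16-bit case cannot be observed directly on a host where "int" is 32 
bits, but the same effect appears one size up.  The sketch below is an 
analogy, not the V7 case itself: it assumes a 32-bit "int" and a compiler 
that gives an out-of-range decimal constant the next wider signed type.  The 
function names are illustrative.

```c
#include <limits.h>

/* Returns 1 when "-2147483648" is wider than "int", 0 when it is not,
   and -1 when "int" is not 32 bits and the probe does not apply. */
int negated_constant_is_wider(void)
{
#if INT_MAX == 2147483647
    /* 2147483648 does not fit in "int", so the constant takes a wider
       type before the unary minus is applied. */
    return sizeof(-2147483648) > sizeof(int);
#else
    return -1;
#endif
}

/* <limits.h> spells INT_MIN as an expression of type "int",
   typically (-INT_MAX - 1), to sidestep the problem. */
int int_min_has_int_type(void)
{
    return sizeof(INT_MIN) == sizeof(int);
}
```

     A result of 0 from the first function would suggest that the wider type 
chosen for the constant is no larger than "int" on that compiler.
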

     If the value of a decimal constant is greater than the largest value 
representable as "long", or an octal or hex constant is greater than the 
largest value representable as "unsigned long", the result is undefined.  It 
is preferable for the compiler to generate a warning in this case.  However, 
on most V7 C compilers, no warning will be caused, and a different value 
will be assigned in place of the constant; the value substituted will be 
implementation dependent, but sometimes may be equal to the low-order 
portion of the value, taken as a "long" or "unsigned long".  

2.4.2. Explicit long constants

     Section 2.4.2. applies in full, with the following clarifications:

     It is important to note that long constants have no sign prefix, and 
hence that values such as "-5L" are constant expressions, not constants.  
The type-conversion rules may sometimes cause unexpected results 
when using such constants within a program.

     If the value of a decimal long constant is greater than the largest 
value representable as "long", or an octal or hex constant is greater than 
the largest value representable as "unsigned long", the result is undefined.  
It is preferable for the compiler to generate a warning in this case.  
However, on most V7 C compilers, no warning will be caused, and a different 
value will be assigned in place of the constant; the value substituted will 
be implementation dependent, but sometimes may be equal to the low-order 
portion of the value, taken as a "long" or "unsigned long".  

2.4.3 Character constants

     Section 2.4.3. applies in full, with the following clarifications:  

     The characters '\a' and '\x' are equal to the characters 'a' and 'x', 
respectively.  That is, "a" and "x" have no special interpretation following 
a backslash.  (This is in accordance with Ritchie; the characters have been 
given a new interpretation in the subsequent ANSI standard.)

2.4.4 Floating constants

     Section 2.4.4. applies with the following addition:

     Floating point formats depend on the compiler, hardware, and operating 
system.  Implementations may vary a great deal as to the level of 
compile-time arithmetic and validation which they are able to perform on 
floating-point constants and floating-point constant expressions.  The 
compiler documentation should describe the format used for floating point 
constants and any range-checking or arithmetic performed at compile-time.

2.5. Strings

     Section 2.5. applies with the following modification and addition:

     The result of modifying the contents of a string constant is undefined; 
it may behave "as expected", or result in behavior which is non-portable or 
implementation dependent.

     String constants with the identical value are not guaranteed either to 
be distinct or to share storage.  The Ritchie specification guaranteed them 
to be distinct; however, not all V7 C compilers have followed this practise.  
Compiler documentation should specify the compiler's practise in this 
regard.
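
     Where the documentation is silent, a small probe can report what a 
given compiler does.  This is only a sketch; the function name is an 
assumption, and the answer may change with optimization settings.

```c
/* Returns 1 if two identical string constants share storage with
   this compiler and these options, 0 if they are distinct. */
int literals_share_storage(void)
{
    const char *first = "same text";
    const char *second = "same text";

    return first == second;
}
```

     Portable code should not depend on either answer, and should not write 
through either pointer.
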

2.6 Hardware characteristics

     [Compiler, hardware, and operating system -specific information]

3. Syntax notation

     Section 3. (describing type-faces used within the Reference Manual) 
does not apply to this document. 

4. What's in a name (types)

     Section 4. applies with the following additions:

     The type "void" is added.  Functions declared as returning "void" do 
not return a value; returning a value from a "void" function is invalid.  
Expressions may be cast to type "void" to explicitly discard their value.  
Declaring an object as type "void" is invalid, as is taking the value of a 
"void" function or a "void" expression.

     The "signed" attribute may be used to modify an integral type 
declaration.  Since this is the default attribute for "int"s, "short"s, and 
"long"s, this is primarily useful for "char" declarations.

     The "char" type may be either signed or unsigned by default.  A "char" 
type declaration may be explicitly declared as either "signed" or 
"unsigned". 

     The "unsigned" attribute may also be applied in combination with a 
"short" or "long" size declaration.

     A simple type declaration may include a storage class, fundamental 
type, sign attribute (for "char" or "int"), and size attribute (for "int" 
only.)  These elements may appear in any order within the type declaration, 
but only one of each may occur.  

5. Objects and lvalues

     Section 5. applies in full.

6. Conversions

     Section 6. applies in full.

6.1 Characters and integers

     Section 6.1 applies with the following additions:

     The "char" type may be either signed or unsigned by default.  A "char" 
type declaration may be explicitly declared as either "signed" or 
"unsigned". 

     Unsigned "char" types are not sign-extended when they are converted to 
type "int"; signed "char" types are sign-extended when they are converted to 
type int.  A "char" appearing in an expression is widened to the "int" type 
as part of the usual arithmetic conversions.  
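
     The widening rules can be sketched as follows.  The function names are 
illustrative assumptions; the last function merely reports the default for 
plain "char", which differs between implementations.

```c
/* A signed char holding -1 is sign-extended when widened to int. */
int widen_signed(void)
{
    signed char sc = -1;
    int wide = sc;

    return wide;
}

/* An unsigned char holding 255 is widened without sign extension. */
int widen_unsigned(void)
{
    unsigned char uc = 255;
    int wide = uc;

    return wide;
}

/* Returns 1 if plain "char" is signed by default here, 0 otherwise. */
int plain_char_is_signed(void)
{
    char c = (char)-1;

    return c < 0;
}
```
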

     Many V7 C compilers apply the "unsigned preserving" convention for 
conversions involving an explicitly unsigned type and a wider signed type; 
in this case, an unsigned type will always be widened to another unsigned 
type.  For instance, an "unsigned char" appearing in an expression with an 
"int" will be widened to "unsigned int", and the "int" will therefore also 
be converted to "unsigned int".  An "unsigned int" appearing in an 
expression with a "long" will be widened to "unsigned long", and the "long" 
will therefore also be converted to "unsigned long".  

     Other V7 C compilers apply the "value preserving" convention for such 
conversions.   In this case, an unsigned type will always be widened to the 
next largest type which can hold its full range of values, generally a 
signed type.  For instance, an "unsigned char" appearing in an expression 
with an "int" will be widened to "int"; an "unsigned int" appearing in an 
expression with a "long" will be widened to "long".  

     The compiler documentation should specify which convention is used for 
widening unsigned types. 
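
     When the documentation does not say, the convention can be probed.  The 
sketch below assumes that "int" is wider than "char"; the function name is 
hypothetical.

```c
/* Returns 1 under the "value preserving" convention and 0 under the
   "unsigned preserving" convention. */
int uses_value_preserving(void)
{
    unsigned char small = 1;

    /* Value preserving: small widens to int, so 1 - 2 is -1.
       Unsigned preserving: small widens to unsigned int, so the
       subtraction wraps around to a large positive value. */
    return (small - 2) < 0;
}
```

     ANSI C later settled on the value preserving convention, so a 
conforming ANSI compiler is expected to return 1 here.
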

6.2 Float and double 

     Section 6.2 applies in full, with the following clarifications:

     As all floating point arithmetic operations are done in double-
precision, the "sizeof" operator will give a different result when applied 
to a floating point expression than when applied to a floating point 
variable (or, with some compilers, to a floating point constant expression.) 

     For example, following the definition "float f; int i;", the expression 
"sizeof(f)" should be equal to "sizeof(float)", whereas "sizeof(f+i)" should 
be equal to "sizeof(double)". 
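
     The following probe, offered as a sketch with illustrative names, 
reports which rule a compiler applies.  It is included because ANSI C 
relaxed this requirement: under ANSI rules "f+i" has type "float", so an 
ANSI compiler is expected to return 0 from the first function.

```c
/* Returns 1 if "float + int" has type "double" (the rule described
   above), 0 if the expression keeps type "float". */
int float_expr_is_double(void)
{
    float f = 1.5;
    int i = 2;

    return sizeof(f + i) == sizeof(double)
        && sizeof(float) != sizeof(double);
}

/* The variable itself always has the size of "float". */
int float_var_is_float(void)
{
    float f = 1.5;

    return sizeof(f) == sizeof(float);
}
```
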
          

6.3 Floating and integral types

     Section 6.3 applies in full.

6.4 Pointers and integers

     Section 6.4 applies in full.

6.5 Unsigned

     Section 6.5 applies in full.

6.6 Arithmetic conversions

     Section 6.6 applies in full.

7. Expressions

     Section 7. applies in full.

7.1. Primary expressions

     Section 7.1. applies in full, with the following modification:

     The primary expression preceding the parameter list of a function call 
may be either of type "function returning ..." or "pointer to function 
returning ...".  If its type is "pointer to function returning ...", the 
pointer will be implicitly dereferenced before executing the call.  

     That is, if f is declared as: "int (*f)();" (pointer to function 
returning an integer), the calls "f()" and "(*f)()" are equivalent.
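
     A minimal sketch of the equivalence, using hypothetical function names 
and ANSI-style prototypes so that it builds with a current compiler:

```c
int triple(int n)
{
    return 3 * n;
}

/* Calls the same function through a pointer in both spellings and
   returns the common result, or -1 if the two calls disagree. */
int call_both_ways(int n)
{
    int (*f)(int) = triple;
    int direct = f(n);        /* pointer dereferenced implicitly */
    int starred = (*f)(n);    /* pointer dereferenced explicitly */

    return direct == starred ? direct : -1;
}
```
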

7.2 Unary operators

     Section 7.2. applies with the following additions:     

     Any expression may be cast to type "void".  This discards its value.

     Note that "++" and "--" are defined by this section to be unary 
operators, and are of lower precedence than the primary operators "->" and 
"[]".  This means that expressions of the form "a++[1]" or "b++->next" are 
technically invalid; they must be parenthesized as: "(a++)[1]" or 
"(b++)->next", respectively.  However, some V7 C compilers may accept and 
correctly process expressions of this form.
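
     The parenthesized forms can be sketched as below.  The structure and 
function names are assumptions made for illustration.

```c
struct node {
    int value;
    struct node *next;
};

/* Reads the element one past the current position, then advances
   the caller's cursor by one element. */
int peek_next_and_advance(int **cursor)
{
    int *a = *cursor;
    int seen = (a++)[1];

    *cursor = a;
    return seen;
}

/* Returns the "next" link of the current node, then advances the
   caller's cursor to the following array element. */
struct node *link_and_advance(struct node **cursor)
{
    struct node *b = *cursor;
    struct node *link = (b++)->next;

    *cursor = b;
    return link;
}
```

     Writing the parentheses costs nothing and keeps the code acceptable to 
compilers that follow the precedence rules strictly.
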

7.3 Multiplicative operators

     Section 7.3 applies in full.

7.4 Additive operators

     Section 7.4 applies in full.

7.5 Shift operators

     Section 7.5 applies in full.

7.6 Relational operators

     Section 7.6 applies in full.

7.7 Equality operators

     Section 7.7 applies in full.

7.8 Bitwise AND operator

     Section 7.8 applies in full.

7.9 Bitwise exclusive OR operator

     Section 7.9 applies in full.

7.10 Bitwise inclusive OR operator

     Section 7.10 applies in full.

7.11 Logical AND operator

     Section 7.11 applies in full.

7.12 Logical OR operator

     Section 7.12 applies in full.

7.13 Conditional operator

     Section 7.13 applies in full, with the following addition:  

     The second and third expressions of the conditional may be structures 
or unions of the same type, in which case the type of the expression is the 
common structure or union type.  Not all V7 C compilers accept conditional 
assignment of structures or unions.
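
     As a sketch, with a hypothetical structure type and function name:

```c
struct pair {
    int first;
    int second;
};

/* Selects one of two structures of the same type with "?:" and
   assigns the selected structure as a whole. */
struct pair choose(int use_left, struct pair left, struct pair right)
{
    struct pair chosen;

    chosen = use_left ? left : right;
    return chosen;
}
```

     Where a compiler rejects this form, an "if" statement with two separate 
structure assignments has the same effect.
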

7.14 Assignment operators

     Section 7.14 applies in full, with the following additions:

     The simple assignment operator = may be used to assign a structure to 
another structure.  No compound assignment operators may be applied to a 
structure.  The method of structure assignment is not specified; however, 
after an assignment, it is guaranteed that the content of each element of 
the left operand is equal to the content of the corresponding element of the 
right operand.  A union may also be assigned to another union in the same 
fashion, with the same stipulation as to the effect of the assignment.
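
     The guarantee can be illustrated with a short sketch; the type and 
function names are assumptions.

```c
struct record {
    int id;
    char tag[4];
    long total;
};

union cell {
    int whole;
    char bytes[sizeof(int)];
};

/* Copies a structure with "=" and reports whether the members of
   the copy match the original. */
int struct_copy_matches(void)
{
    struct record src, dst;

    src.id = 7;
    src.tag[0] = 'a'; src.tag[1] = 'b'; src.tag[2] = 'c'; src.tag[3] = '\0';
    src.total = 123456L;

    dst = src;      /* simple assignment only; "dst += src" is invalid */
    return dst.id == 7
        && dst.tag[0] == 'a' && dst.tag[1] == 'b'
        && dst.tag[2] == 'c' && dst.tag[3] == '\0'
        && dst.total == 123456L;
}

/* The same check for a union. */
int union_copy_matches(void)
{
    union cell src, dst;

    src.whole = 42;
    dst = src;
    return dst.whole == 42;
}
```
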

     Each of the compound assignment operators may be written in its 
"archaic" (backwards) form.  These operator forms are:

     lvalue =+ expression     equivalent to   lvalue += expression 
     lvalue =- expression     equivalent to   lvalue -= expression 
     lvalue =* expression     equivalent to   lvalue *= expression 
     lvalue =/ expression     equivalent to   lvalue /= expression 
     lvalue =% expression     equivalent to   lvalue %= expression 
     lvalue =>> expression    equivalent to   lvalue >>= expression
     lvalue =<< expression    equivalent to   lvalue <<= expression
     lvalue =& expression     equivalent to   lvalue &= expression 
     lvalue =^ expression     equivalent to   lvalue ^= expression 
     lvalue =| expression     equivalent to   lvalue |= expression 

     However, it is strongly recommended that these forms should not be 
used, as they are not allowed in ANSI C and are syntactically ambiguous.
For example "a=-3;" must be interpreted as subtracting 3 from the variable 
"a", whereas it was more likely intended to set "a" to -3.

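     The ambiguity can be probed with the sketch below (illustrative names).  
A compiler that accepts the archaic operators is expected to return 7 from 
the first function, while an ANSI compiler returns -3.

```c
/* Returns -3 if "=-" is read as "=" followed by unary minus, or 7
   if it is read as the archaic subtract-and-assign operator. */
int ambiguous_minus(void)
{
    int a = 10;

    a=-3;
    return a;
}

/* Spacing and the modern operator leave no doubt about the intent. */
int unambiguous_forms(void)
{
    int a = 10;
    int b = 10;

    a = -3;     /* assigns negative three */
    b -= 3;     /* subtracts three */
    return a == -3 && b == 7;
}
```
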
7.15 Comma operator

     Section 7.15 applies in full.

8. Declarations

     Section 8. applies in full, with the following clarification:

     The declaration specifiers within a declaration list may appear in any 
order; thus "int register short unsigned" is a valid declaration specifier.

     NOTE: The syntax given specifically permits the declarator-list to be 
omitted.  Therefore "int ;" is a perfectly valid declaration, which declares 
no variables.  A compiler may issue a warning for a declaration of this 
form.

8.1. Storage class specifiers

     Section 8.1. applies in full.

8.2 Type specifiers

     Section 8.2. applies in full, with the following additions:

     "void" and "signed" must be added to the list of possible type-
specifiers.

     enum-specifier must be added to the list of possible type-specifiers.
(See Section 8.9 for the syntax of enum-specifiers.)

     The word "signed" may be thought of as an additional adjective which 
may be applied to an integral type specifier; moreover, "unsigned" may be 
used in combination with "char", "short", and "long".  This means the 
following additional combinations are acceptable:

     unsigned char
     unsigned short      (or)      unsigned short int 
     unsigned long       (or)      unsigned long int
     signed char    
     signed short        (or)      signed short int
     signed              (or)      signed int
     signed long         (or)      signed long int

     The keywords "signed", "unsigned", and "short" may not be applied to 
"float" or "double."  The type "long double" is not supported.

     A V7 C compiler should always support the "void" type and "enum" types.

8.3 Declarators

     Section 8.3 applies in full.

8.4 Meaning of declarators

     Section 8.4 applies in full.

8.5 Structure and union declarations

     Section 8.5 applies in full.

     For V7 C compilers, it is considered standard for each structure or 
union type to have its own name space.  That is, the same component name may 
appear within two or more structure types, with reference to completely 
different components.  In Ritchie, component names for all structures were 
drawn from a single name space, and components of different structures could 
have the same name only if they had the identical type and identical offset 
(relative position) within the structure.  A V7 C compiler should not 
enforce this restriction.
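
     A sketch of what separate name spaces permit, using hypothetical 
structure types:

```c
struct employee {
    char name[16];
    int id;         /* an "int" member that is not at offset zero */
};

struct part {
    long id;        /* same member name, different type and offset */
    int count;
};

/* Uses both "id" members in a single expression. */
long sum_ids(void)
{
    struct employee e;
    struct part p;

    e.id = 5;
    p.id = 100000L;
    return e.id + p.id;
}
```

     Under the single name space of Ritchie, the second declaration of "id" 
would have been rejected because its type and offset differ from the first.
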

8.6 Initialization

     Section 8.6 applies in full, with the following addition:

     The syntax given allows initializers to be specified for formal 
parameters to a function.  For example, the syntax allows: 
     "int f(a) int a=1; { return a; }"

     A compiler may issue a warning for a declaration of this form.

8.7 Type names

     Section 8.7 applies in full.

8.8 Typedef

     Section 8.8 applies, with the following additions:

     Some V7 C implementations allow the programmer to use a combination of 
typedef names and other type specifiers within a declaration.  Example:

     "typedef long int bigint;  unsigned bigint x;"

     However, this is not recommended; it is preferred for the compiler to 
generate an error or warning message if a typedef name is used in 
combination with any other type specifier.

8.9. Enumeration type

     The following section is not in Kernighan & Ritchie [2]; it is a 
direct quotation of the information contained in "Recent Changes to C" [3].

     enum-specifier:
          "enum" "{" enum-list "}"
          "enum" identifier "{" enum-list "}"
          "enum" identifier
          
     enum-list:
          enumerator
          enum-list "," enumerator

     enumerator:
          identifier
          identifier "=" constant-expression

     The role of the identifier in the enum-specifier is entirely analogous 
to that of the structure tag in a struct-specifier; it names a particular 
enumeration.  For example,

     "enum color { chartreuse, burgundy, claret, winedark };"
     ...
     "enum color *cp, col;"

makes color the enumeration-tag of a type describing various colors, and 
then declares "cp" as a pointer to an object of that type and "col" as an 
object of that type.

     The identifiers in the enum-list are declared as integral constants and 
may appear wherever constants are required.  If no enumerators with "=" 
appear, then the values of the constants begin at 0 and increase by 1 as the 
declaration is read from left to right.  An enumerator with "=" gives the 
associated identifier the value indicated; subsequent identifiers continue 
the progression from the assigned value.

     Enumeration tags and constants must all be distinct, and unlike 
structure tags and members, are drawn from the same set as ordinary 
identifiers.  

     Objects of a given enumeration type are regarded as having a type 
distinct from objects of all other types.

     [New material added:]

     If the progression from one enumerator to another leads to a value 
greater than the maximum signed integer value, the value assigned to the 
subsequent values will be implementation-dependent.  In general, it is 
likely to be treated either as a negative integer or as a large unsigned 
integer.  (This is likely to happen if an enumerator is defined with "=" to 
a very large value such as the maximum integer, and is followed by 
subsequent enumerator identifiers.)  The compiler may or may not generate a 
warning for this case.  
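
     The numbering rules for enumerators can be sketched as follows.  The 
enumerations and function names are invented for illustration and do not 
come from "Recent Changes to C".

```c
/* No "=" appears, so the values start at 0 and increase by 1. */
enum fruit { apple, banana, cherry };

/* Each "=" restarts the progression from the value given. */
enum level { low = 10, medium, high = 20, extreme };

int fruit_values_ok(void)
{
    enum fruit f = cherry;
    enum fruit *fp = &f;

    return apple == 0 && banana == 1 && *fp == 2;
}

int level_values_ok(void)
{
    return low == 10 && medium == 11 && high == 20 && extreme == 21;
}
```
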

9. Statements

     Section 9. applies in full.

9.1 Expression statement

     Section 9.1 applies in full.

9.2 Compound statement, or block 

     Section 9.2 applies in full.

9.3 Conditional statement

     Section 9.3 applies in full.

9.4 While statement

     Section 9.4 applies in full.

9.5 Do statement

     Section 9.5 applies in full.

9.6 For statement

     Section 9.6 applies in full.

9.7 Switch statement

     Section 9.7 applies in full, with the following clarification:

     If the constant expression for a case value evaluates to a value 
greater than the maximum integer (for instance a constant expression of type 
"long" or "unsigned"), it will be truncated to an integer value.  The 
compiler may or may not generate a warning in this case.

9.8 Break statement

     Section 9.8 applies in full.

9.9 Continue statement

     Section 9.9 applies in full.

9.10 Return statement

     Section 9.10 applies with the following modifications:

     The value returned should be assignment-compatible with the type of the 
function in which it appears.  It is incorrect for a function to return an 
expression which is not assignment-compatible with the type of the function, 
and the compiler should generate a warning or error message in this case. 

     A function may return a structure or union, if the function was defined 
as returning a structure or union of that type.

     Statements of the form "return;" are the only form of return statement 
allowed in a function which is declared as returning "void".  It is 
incorrect to return any expression from such a function, and the compiler 
should generate a warning or error message in this case. 

9.11 Goto statement 

     Section 9.11 applies in full.

9.12 Labeled statement 

     Section 9.12 applies in full.

9.13 Null statement 

     Section 9.13 applies in full.

10. External definitions

     Section 10. applies in full.

10.1 External function definitions

     Section 10.1 applies, with the following modifications:

     A structure or union may appear as the type-specifier for a function, 
and may be declared as a formal parameter to a function.
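
     A minimal sketch of a function that both accepts and returns a 
structure.  The names are hypothetical, and ANSI-style prototypes are used 
so that the code builds with a current compiler.

```c
struct point {
    int x;
    int y;
};

/* Receives a structure by value and returns a new one by value. */
struct point mirror(struct point p)
{
    struct point result;

    result.x = -p.x;
    result.y = -p.y;
    p.x = 0;        /* changes only the function's own copy */
    return result;
}

/* Returns 1 if the caller's structure is untouched by the call and
   the returned structure holds the mirrored values. */
int caller_copy_unchanged(void)
{
    struct point origin, flipped;

    origin.x = 3;
    origin.y = -4;
    flipped = mirror(origin);
    return origin.x == 3 && origin.y == -4
        && flipped.x == -3 && flipped.y == 4;
}
```
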

10.2 External data definitions

     Section 10.2 applies in full.

11. Scope rules

     Section 11. applies in full.

11.1 Lexical scope

     Section 11.1 applies in full, with the following clarifications:

     The name space for formal parameters to functions is the same as that 
for typedef names and other global identifiers.  It is invalid to use a 
keyword for a formal parameter name, and the compiler should generate a 
warning or error message in this case.  However, it is legal for a formal 
parameter to be an identifier which is already in use as a globally scoped 
identifier, such as a typedef name, variable, or function.  In this case, 
the parameter will suspend the declaration of the global identifier within 
the lexical scope of the function.
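
     The hiding of global identifiers by formal parameters can be sketched 
as below; all names are illustrative.

```c
int count = 5;          /* global variable */

int scale(int n)        /* global function */
{
    return 10 * n;
}

/* The parameters reuse the global names "count" and "scale"; within
   this function the global declarations are hidden. */
int add_params(int count, int scale)
{
    return count + scale;
}

/* Outside add_params the global "count" is visible again. */
int global_count(void)
{
    return count;
}
```
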

11.2 Scope of externals

     Section 11.2 applies in full, with the following additions:

     The restriction specified, that a multi-module program must contain one 
and only one external definition of an identifier without the keyword 
"extern", generally must be enforced by the linker, not the compiler.  
Because of this, many V7 C compilers are unable to apply this restriction.  
The compiler documentation should describe whether this restriction is 
enforced.

     There are two different approaches taken to the scope of an "extern" 
definition within an inner block.  Most V7 C compilers will take the 
approach defined by Section 11.1: "because all references to the same 
external identifier refer to the same object... their scope is increased to 
the whole file in which they appear."  That is, the scope of an "extern" 
identifier extends from the line on which it appears to the end of the 
source file.  A few V7 C compilers may take the approach later codified by 
ANSI, in which "extern" declarations follow lexical scope; that is, an 
"extern" definition within an inner block is visible only until the end of 
that block.  The compiler documentation should describe which scope rule is 
followed in this case.

12. Compiler control lines

     Section 12. applies in full, with the following addition:

     The "#" character indicating a compiler control, or preprocessor, line 
may be required to be in the first column.  However, it is preferable for 
the compiler or preprocessor to recognize a compiler control line whenever 
the "#" is the first non-blank character of an input line. 

     The "#" character may also be required to immediately precede the 
preprocessor command, with no intervening whitespace.  However, it is 
preferable for the compiler or preprocessor to recognize a preprocessor 
command whenever the command is the first token following the "#" on an 
input line.

     Comments or whitespace on an input line following a preprocessor 
command should be ignored.  In most V7 C implementations, any tokens which are 
not used as the arguments to the preprocessor command are ignored.  However, 
the compiler or preprocessor may issue a warning when there are extraneous 
arguments present.  Example:

     "#undef XYZ  notneeded"  may cause a warning.
     "#undef XYZ /* not needed */" should always compile without warning. 

12.1 Token replacement

     Section 12.1 applies in full, with the following addition:

     If the #define command is used to define an identifier which has been 
previously defined, the result is compiler dependent.  If the two 
definitions are identical, token-for-token ("benign redefinition"), the new 
definition may simply be ignored.  If the definitions are different, it is 
preferable for the compiler to generate a warning message; some V7 C 
compilers will accept the redefinition silently, or will generate an error 
message. 
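
     A sketch of the two cases follows.  The macro and function names are 
assumptions; the second macro is undefined before it is given a new value, 
which avoids relying on compiler-dependent behavior.

```c
#define LIMIT 10
#define LIMIT 10        /* identical token-for-token: benign */

#define DEPTH 4
#undef DEPTH            /* remove the old definition first ... */
#define DEPTH 8         /* ... so the new value is unambiguous */

int limit_value(void) { return LIMIT; }
int depth_value(void) { return DEPTH; }
```
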

     The #undef command takes only one parameter; after that parameter, any 
trailing parameters on the input line may be ignored.  This is the most 
common behavior for V7 C compilers; however, the compiler may issue a 
warning for the trailing parameters.

     Note: The "#" and "##" operators for string processing within a 
preprocessor line are additions defined by the ANSI C specification, and are 
not supported in V7 C.

12.2 File inclusion

     Section 12.2 applies in full.

12.3 Conditional compilation

     Section 12.3 applies in full, with the following additions:

     Any trailing parameters on an #else or #endif line may be ignored.  Any 
trailing parameters after the first parameter to an #ifdef or #ifndef line 
may be ignored.  This is the most common behavior for V7 C compilers; 
however, the compiler may issue a warning for trailing parameters.

12.4 Line control

     Section 12.4 applies in full, with the following clarification:

     A #line command containing an identifier or string constant to set the 
file name should appear before any #line command with only a line number.  

     If this does not occur (i.e. if a #line command with only a line number 
occurs first), then the value of the file name which will be used in error 
diagnostics or returned by the __FILE__ macro is implementation dependent.  
A compiler may issue a warning in this case.

12.5 Implicit macros

     The following section is not in Ritchie [1]; it is an addition to this 
specification.

     V7 C compilers normally provide the built-in __FILE__ and __LINE__ 
macros for use by the programmer.  The __FILE__ macro is predefined to be 
the string name of the source file being compiled, and the __LINE__ macro is 
predefined to be the current line number of the source file being compiled; 
the value of each macro is taken dynamically at each point where it is 
evaluated.  These macros cannot be undefined (via #undef) or redefined (via 
#define).

     If the #line directive is used within the program being compiled, the 
__FILE__ and __LINE__ macros will take their values from the values given in 
the #line directive, not from the actual source file name or line number.
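
     A minimal sketch of this behavior follows.  The file name "demo.c", 
the line number 100, and the function names are assumptions made for 
illustration; the string-constant form of #line is used.

```c
#line 100 "demo.c"
int first_line()  { return __LINE__; }   /* this line is numbered 100 */
int second_line() { return __LINE__; }   /* this line is numbered 101 */
char *file_name() { return __FILE__; }   /* the string "demo.c" */
```

     Each use of __LINE__ picks up the number in force at the point where 
it appears, so the two functions return different values even though they 
expand the same macro.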

     Some V7 C implementations may define additional macros containing 
information about the environment.  For instance, a compiler running under 
UNIX may predefine the macro variable "unix".  The presence of these macros 
is implementation-dependent.

13. Implicit declarations

     Section 13. applies in full, with the following additions:

     As a particular case of the implicit declaration rules, if no 
declaration is given for a formal parameter to a function, it implicitly 
defaults to an "int".  (Since it is a parameter, the storage-class specifier 
is not applicable.)  This is common behavior for V7 C compilers.

14. Types revisited

     Section 14. applies in full.

14.1 Structures and unions

     Section 14.1 applies, with the following modifications:

     The simple assignment operator = may be used to assign a structure to 
another structure.  No compound assignment operators may be applied to a 
structure.  The method of structure assignment is not specified; however, 
after an assignment, it is guaranteed that the content of each element of 
the left operand is equal to the content of the corresponding element of the 
right operand.
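
     For illustration, a short sketch of structure assignment (the 
structure "point" and the function name are hypothetical):

```c
struct point { int x; int y; };

/* Copies a structure with plain "=" and returns the sum of the copied
   members. */
int copy_and_sum()
{
    struct point a, b;

    a.x = 3;
    a.y = 4;
    b = a;               /* whole-structure assignment */
    a.x = 0;             /* a later change to a does not affect b */
    return b.x + b.y;
}
```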

     A structure may appear as the type-specifier for a function, may be 
declared as a formal parameter to a function, and may be passed as an actual 
parameter to a function.  A function may return a structure, if the function 
was defined as returning a structure of that type.

     A union may appear in all those contexts listed for a structure.  That 
is, it may be assigned to a union of the same type, specified as the type 
for a function, declared as a formal parameter to a function, passed as an 
actual parameter to a function, and returned from a function whose type is a 
union of the same type.
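
     The following sketch shows a structure and a union being passed to and 
returned from functions.  All names are hypothetical.  The parameter lists 
are written in ANSI prototype notation only so that the example compiles 
under current compilers; a V7 compiler would expect the old-style form, 
e.g. "struct pair swap_pair(p) struct pair p; { ... }".

```c
struct pair { int lo; int hi; };
union word  { int i; char c; };

/* Takes a structure by value and returns a new structure. */
struct pair swap_pair(struct pair p)
{
    struct pair r;

    r.lo = p.hi;
    r.hi = p.lo;
    return r;
}

/* Returns a union by value. */
union word make_word(int value)
{
    union word w;

    w.i = value;
    return w;
}

/* Takes a union by value. */
int word_value(union word w) { return w.i; }

int demo_pair()
{
    struct pair p, q;

    p.lo = 1;
    p.hi = 2;
    q = swap_pair(p);            /* q.lo is 2, q.hi is 1 */
    return q.lo * 10 + q.hi;
}

int demo_word() { return word_value(make_word(1234)); }
```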

14.2 Functions

     Section 14.2 applies, with the following modifications:

     A function pointer followed by a parenthesized parameter list is 
interpreted as a dereference of the function pointer, followed by a call of 
the function; thus following the declaration "int (*funcp)();" the two 
statements "(*funcp)();" and "funcp();" are equivalent.
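
     A small sketch of the two call forms, using the declaration given 
above (the functions "answer", "call_explicit" and "call_implicit" are 
hypothetical names):

```c
int answer() { return 42; }

/* Calls through the pointer with an explicit dereference. */
int call_explicit()
{
    int (*funcp)();

    funcp = answer;
    return (*funcp)();
}

/* Calls through the pointer without the dereference. */
int call_implicit()
{
    int (*funcp)();

    funcp = answer;
    return funcp();
}
```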

14.3 Arrays, pointers, and subscripting

     Section 14.3 applies in full, with the following clarifications:

     It is important to note that even though an identifier of array type 
will be converted to a pointer to the first member of the array when it 
appears within an expression, or as an actual or formal parameter to a 
function, a declaration of an array is NOT equivalent to a declaration of a 
compatible pointer.  

     In particular, declaring an external identifier as an array in one 
module and as a pointer in a different module is incorrect and will usually 
lead to serious run-time errors, due to the different levels of indirection 
associated with the name.  
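
     The difference can be seen within a single module.  In the sketch 
below (all names are hypothetical), the array name denotes the storage 
itself, while the pointer is a separate object that holds an address.  A 
module that declared "extern char *table;" against the array definition 
would therefore treat the first bytes of the array's contents as if they 
were an address.

```c
char table[16];           /* an array: 16 bytes of storage */
char *tablep = table;     /* a pointer: a separate object holding an address */

/* The array's size is that of its storage, not of a pointer. */
int array_size() { return sizeof table; }

/* The address of the array is the address of its first member. */
int array_is_own_address() { return (char *) &table == table; }

/* The address of the pointer is not the address it points to. */
int pointer_is_own_address() { return (char *) &tablep == tablep; }
```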

14.4 Explicit pointer conversions

     Section 14.4 applies in full.

15. Constant expressions

     Section 15. applies in full.

16. Portability considerations

     Section 16. applies in full.

17. Anachronisms

     Section 17. applies in full, with the following additions:     

     The normal behavior for V7 C compilers is to accept both of the 
obsolete constructions listed, namely: "=op" for assignment operators and 
"int x 1" for initializers.  The compiler may generate a warning message.  
Compiler documentation should specify whether the obsolete forms are 
accepted.
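
     One practical consequence is that a compiler which accepts the "=op" 
forms may read "x=-1" as the obsolete spelling of "x -= 1".  A sketch of 
the defensive spacing that avoids the question (the function names are 
hypothetical):

```c
/* Assigns the value -1; the space after "=" rules out the old "=-" form. */
int assign_minus_one()
{
    int x;

    x = 10;
    x = -1;
    return x;
}

/* Subtracts 1, using the current spelling of the operator. */
int subtract_one()
{
    int x;

    x = 10;
    x -= 1;
    return x;
}
```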

18. Syntax Summary

     The comment of Section 18. most definitely applies: the syntax given is 
not adequate for use in writing a compiler, but can be used, for example, to 
help check the correctness of an expression.

18.1 Expressions

     18.1 applies in full.

18.2 Declarations

     18.2 applies with the following modifications:

     Add enum-specifier to the type-specifier list.

     Add "signed" and "void" to the type-specifier list.

     enum-specifier:
          "enum" "{" enum-list "}"
          "enum" identifier "{" enum-list "}"
          "enum" identifier
          
     enum-list:
          enumerator
          enum-list "," enumerator

     enumerator:
          identifier
          identifier "=" constant-expression
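
     As an example of a declaration that follows this grammar (the 
enumeration "color" and the function name are hypothetical), using both 
forms of enumerator and the "enum identifier" form of the specifier:

```c
enum color { RED, GREEN = 5, BLUE };   /* enum identifier { enum-list } */

int blue_value()
{
    enum color c;                      /* enum identifier */

    c = BLUE;                          /* one more than GREEN */
    return (int) c;
}
```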

18.3 Statements

     18.3 applies in full.

18.4 External definitions

     18.4 applies in full.

18.5 Preprocessor

     18.5 applies in full.

REFERENCES

[1]  Dennis M. Ritchie, _The C Programming Language -- Reference Manual_, 
     pp. 247-276 in _UNIX Programmer's Manual, Vol. 2_, 7th edition, Bell 
     Telephone Laboratories, Inc., Holt, Rinehart and Winston, New York, NY 
     [1979,1983] 

[2]  Brian W. Kernighan & Dennis M. Ritchie, _The C Programming Language_, 
     1st ed., Prentice Hall, Englewood Cliffs, NJ [1978]

[3]  Dennis M. Ritchie, _Recent Changes to C; November 15, 1978_, p. 277
     in _UNIX Programmer's Manual, Vol. 2_, 7th edition, Bell Telephone 
     Laboratories, Inc., Holt, Rinehart and Winston, New York, NY 
     [1979,1983] 

[4]  Samuel P. Harbison & Guy L. Steele Jr, _C: A Reference Manual_, 2nd 
     ed., Prentice Hall, Englewood Cliffs, NJ [1987]

[5]  _The Plum Hall Validation Suite_, Plum Hall, Inc., Cardiff, NJ [1990]

[6]  Brian W. Kernighan & Dennis M. Ritchie, _The C Programming Language_, 
     2nd ed., Prentice Hall, Englewood Cliffs, NJ [1988]

henry@zoo.toronto.edu (Henry Spencer) (10/28/90)

In article <2442.272704b8@verifone.com> clifton_r@verifone.com writes:
>   I am posting this article as a first attempt to fill a glaring gap in
>the documentation available for the C language.

Uh, what glaring gap?  Implementors are aiming at ANSI C, which is well
documented.  People who have to use a wide variety of earlier compilers
tend to use H&S as the basic reference.  Attempting to specify a single
standard for pre-ANSI C is pointless:  the users can't rely on it because
the pre-ANSI compilers differ, and the implementors won't care because
ANSI compatibility is their major concern now and they're not interested
in conforming to a pseudo-standard that nobody else conforms to.

Incidentally, I think you are grossly underestimating the labor involved
in producing a high-quality standard.  You would be much better off to
start with ANSI C and specify deletions and modifications to it.

>     To summarize the usual V7 C extensions, this implementation of the C 
>language supports:
>  ...
>  o  the "void" data type (but not the "void *" of ANSI C);
>  o  the "signed char" type;
>  o  the "unsigned char", "unsigned short", and "unsigned long" types;
>  o  and calling of function pointers without an explicit dereference.

Here we already see the "standard" falling apart.  None of these things
were in V7 C, although some (not all!) implementors added them later.
`signed char' is particularly odd, since as far as I know `signed' was
an X3J11 invention and there were *no* pre-ANSI compilers featuring it.

Please don't try to fob off the peculiar specs of your own pet compiler
as a "standard".  Your time would be better spent fixing your compiler to
conform to the standard we already have.
-- 
The type syntax for C is essentially   | Henry Spencer at U of Toronto Zoology
unparsable.             --Rob Pike     |  henry@zoo.toronto.edu   utzoo!henry

gwyn@smoke.brl.mil (Doug Gwyn) (10/29/90)

In article <1990Oct27.230447.5456@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>`signed char' is particularly odd, since as far as I know `signed' was
>an X3J11 invention and there were *no* pre-ANSI compilers featuring it.

Actually, there was existing practice here (Whitesmiths).

I agree with your comments about the impracticality of attempting to
define a "UNIX Version 7 C standard".  Having maintained versions of
BOTH 7th Edition UNIX C compilers, I can add that even the genuine
article came in two not entirely equivalent flavors.  X3J11 took the
UNIX C reference manual (essentially an update to K&R 1st Edition
Appendix A) as the language base document for the eventual C standard.
Changes and additions made during this process were the result of
trying to accommodate important real-world concerns that any such
standard should have addressed.  Anyone who thinks that he can do
better working on his own must be woefully ignorant of the issues
involved.  A large number of the world's most experienced experts in
the use and implementation of the C programming language have finally
produced the first genuine, officially sanctioned standard for C; use
it and be happy.

ccplumb@spurge.uwaterloo.ca (Colin Plumb) (10/30/90)

In article <14270@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
> A large number of the world's most experienced experts in
> the use and implementation of the C programming language have finally
> produced the first genuine, officially sanctioned standard for C; use
> it and be happy.

The computer is your friend.  The computer wants you to be happy.  If
you are not happy, you may be used as reactor shielding.
-- 
	-Colin

gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)

In article <1990Oct29.190452.24098@watdragon.waterloo.edu> ccplumb@spurge.uwaterloo.ca (Colin Plumb) writes:
>In article <14270@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>> A large number of the world's most experienced experts in
>> the use and implementation of the C programming language have finally
>> produced the first genuine, officially sanctioned standard for C; use
>> it and be happy.
>The computer is your friend.  The computer wants you to be happy.  If
>you are not happy, you may be used as reactor shielding.

I don't know about reactor shielding, but this does give me the opportunity
to add something important I left out of my previous posting:

There were three extensive public reviews of the draft proposed C standard,
leading to numerous improvements in the final version.  I doubt that any
attempt at such a standard without a similar amount of public review could
possibly balance the large variety of conflicting demands on the language.