[comp.lang.c] Comments on ANSI public Oct 86 Public review draft.

am@cl-jenny.UUCP (02/25/87)

Comments on the ANSI draft standard X3.159-198x issued for public review
by the X3J11 committee for the programming language 'C' (dated 1-Oct-86).

Alan Mycroft, Computer Laboratory, Cambridge University, UK.  13-Feb-87.

This document was developed with the help of Cambridge University and
Acorn Computers Ltd.  This document collects comments from disparate
sources and there may be duplications or misunderstandings of the draft.
If any of the comments below are incorrect due to the draft having been
misunderstood, then that in itself is an indication that the draft needs
to be clarified.

N.B. Certain comments are omitted from this NEWS message to reduce its size.

1. section 1.6, section 4.1.4:
   section 1.6 line 45 requires that no header define more macros
   than are documented for it.  Section 4.1.4 suggests that NULL may
   be defined in several headers.  Worse still section A.4 ("least
   contents of <limits.h>") suggests that headers may contain extra
   macros/types.  Clarification is required.  Further:
   may an implementation be conforming if all 13 standard headers
   contain the same text?  I observe also that section 1.6 in fact
   bans (e.g.) <stdio.h> containing "int getw(FILE *)" as such
   may generate a linkage clash.  Suggestion: define EXACTLY
   what each header may and may not contain.

2. omitted.

3*. MAJOR POINT:
    section 3.2.2.1 almost totally duplicates section 3.2.1.1:
    The contexts in which "char (etc.) is widened to int" correspond
    almost exactly to those where an "array or function is converted
    to a pointer".  In particular I find the phrase "in contexts where an
    l-value may occur" particularly misleading as an l-value may occur
    anywhere an r-value may.  As far as I can discover, the only contexts
    in which JUST ONE of these coercions applies (widening of char) are
    the operands of && || and possibly ','.
         Observe that it does not matter to the semantics whether
    char is widened to int in MOST contexts - certainly not in && context
    or '='-rhs context.  Note that (given char c1,c2,c3)
      c1 = c2+c3 is required to be treated as c1 = (char)((int)c2 + (int)c3)
    Similarly it would SEEM that c1 = c2 is to be treated as
      c1 = (char)(int)c2.  Clearly the compiler will elide the repeated
    cast.  Analogously it does not matter whether (c1 && c2) is treated as
    ((int)c1 && (int)c2) and a decent compiler will just test the
    char directly if it is in memory.  Similarly, given that ','
    does not give an l-value as result, it does not matter whether
    (c1,c2) is treated as (char)((void)c1, c2) or
      ((void)(int)c1, (int)c2) EXCEPT for SIZEOF context (see point 33*).
    Therefore requiring all value-requiring contexts to be treated AS IF
    both the "widening of char" and "conversion of array and function"
    coercions occur seems to greatly simplify this clumsy
    part of the language.
        I have long regarded both these coercions
    as part of the "usual unary conversions".  Would ANSI consider
    whether to adopt this idea, or at least explain why these two analogous
    contexts should differ.  As a way of clarifying 3.2.1.1 and 3.2.2.1,
    let me propose:
    "C has two contexts, 'rvalue' and 'object':  The rvalue contexts
     are the immediate subexpressions of expressions whose operators are
       casts, [], (), binary &, binary and unary *, 
       binary and unary -, ~, !, /, %, <<, >>, <, >, <=, >=, ==, !=,
       ^, |, ||, &&, binary ',' and conditional ?:;
     together with the right operand of all assignment operators,
     the left operand of '.' and '->'.
     All expressions which are the largest expression part of declarators,
     commands or initialisers are also in rvalue context
     (except for the special treatment of strings as array initialisers).
     All other (expression) contexts are object contexts, these are:
        Left operand of assignment operators.  (Modifiable lvalue context).
        Operand of unary &.                    (Addressable lvalue context).
        Operand of sizeof.                     (sizeof context).
     In rvalue context all expressions of type enum, char, short and
     bitfields are widened to int and those of function or array type to
     pointer to function or pointer to first element respectively.
     In all other contexts no coercions occur, save that the above
     coercions also occur for the value extraction of the left hand
     side on op= operators (e1 op= e2 is seen as e1 = e1 op e2 save that e1
     is evaluated only once)."
   See related point 93*.

4. sections 2.2.2, 2.1.1.2 and 3.8
   (a) Enough systems use or provide files represented as in the FORTRAN 77
       standard that I consider it essential to provide a defined or preferred
       mapping to and from C's char type.  On many systems writing '\f'
       to a file must be translated to a '1' carriage control character so
       that editors, printers and the like can work.
   (omitted)
   Failing sufficient latitude to choose the effect of '\f' '\v' and '\r'
   many compilers will probably tend to treat them as not being
   "in the source character set".  This will not aid portable programs.

6. section 2.1.2.3:
   a) Sequence point: I do not understand the ramifications in
      (f(),g())+(h(),i());
   b) line 39: but not including any exceptions?  Consider (void)(x/0).
   c) page 8, line 41.  "Expressions may be evaluated in greater precision
      than required, provided no precision is lost".  I would think
      that it should not be gained either.  Consider IEEE implementation of
      {  float x,y,z;  z = x*y;   if (z == x*y) ... }.
      I believe that it is unreasonable for the condition to fail, BUT
      the current draft seems to permit it.

7. section 2.2.1:
   I would like the standard to forbid all characters in strings and
   character constants except those for which 'isprint()' is true.
   Programs with "   " containing tabs are a menace as are other non-blank
   white-space.  If newline is banned from strings, why not all non-space
   white-space?  Bells are even worse.

8. section 2.2.3:
   (a) "The signal function is guaranteed to be reentrant".
       The signal() function NEEDS to be non-reentrant if its result is used
       in that it updates some form of shared structure.  Consider the program
         void (*f)(), (*g)();
         f = signal(SIGINT, g);
       Then any SIGINT event during this call to signal() which itself
       calls signal(SIGINT,...) can disturb f from getting the old handler.
    (b) The standard should state simply that, inside a handler, a strictly
        conforming program may only call signal().  (Even exit() may fail
       due to the fact that a SIGINT may have interrupted a logically
       atomic macro expansion of 'putchar' so that fclose fails or writes
       junk.)  For similar reasons longjmp() is forbidden too in a
       strictly conforming program.

18. section 3.2.2.3:
   I fail to see the rationale that ANY integer expression with value
   0 is a null pointer constant.  I was once bitten by a compiler
   failing to notice that
     enum foo { a,b,c} x;
     extern needsptr(enum foo *x);
     needsptr(a);   /* Is 'valid' because a (= int 0) is null-pointer */
   So what is a null pointer currently?
   Hence (2-1-1) is, (1 ? 0:0) is, (f()? 0:0) is not, (1 ? 0:f()) is UNCLEAR.
   Alternative suggestion.  The only null pointer constants are
   0, (type *)0.  I cannot help but think that allowing (2-1-1) to be
   a null pointer constant is merely a way of permitting a (poor) compiler
   to reduce an expression without remembering whence it came.

23. section 3.3.3.2
    Allowing & to be applied to arrays is natural as suggested in the
    rationale, and is needed to support a macro for offsetof().
    On the other hand it needs to be pointed out that
        { extern int f(int); if (f == &f) ... } is valid
    but { extern int v[100]; if (v == &v) ... } is INVALID if I understand
     the draft.  '&f' gets type 'int (*)(int)', as does 'f' when coerced.
    But v is coerced to '&v[0]' (so that v+1 works) and so gets type
    'int *' in expression context.
     On the other hand 3.3.3.2 requires that '&v' has
     type 'int (*)[100]'.  This makes sense, but is a real trap for the
     unwary, especially since (&v+1) is guaranteed to point beyond v[99]
     whereas v+1 points to v[1].
    
26. section 3.3.6, 3.3.8 et al.
    "two pointers that have the same type may be subtracted/compared/etc"
    Is "same type" the notion of "type equivalence"?
    Is it unreasonable to do
        f(const char *x, char *y) { return x==y; }?
    Following on, an extreme view of type equivalence as defined in the draft
    would require that
        f(char *const x, char *y) { return x==y; }
    should be faulted - the only disagreement with this concerns the phrase
    "const and volatile ... are meaningful only for an objects which are
    l-values" which is in section 3.5.2.4.  Of course here x and y ARE
    l-values (but not in l-value context).

31. section 3.3.15 lines 21-25.
    (a) Because (void *)0 is a null pointer constant, these contradictorily
        require that (assuming:   int x, int *y)
          x ? y : (void *)0
        has both type (int *) and type (void *).
    (b) again the program
        f(int c, char *x) { char y[10]; g(c ? x : y); }
        is illegal as y does not have scalar type.  See comment 3*.
    (c) Further to point (a) I have reservations the fact that the
        rules giving void * as result if possible allow curious
        constructions: E.g.
           extern f(short *);
           g(int *x, void *y) { f(x); }            illegal - pointer mismatch
           h(int *x, void *y) { f(p() ? x : y); }  legal!!!
    (d) The "constraints" do not actually constrain anything.
        Are these intended to be the only cases a conforming compiler may
        support?

32. section 3.3.16.1
    x = x; appears to be forbidden, since by my understanding "x overlaps with
    x".

33*. section 3.3.17
    "The result (of a comma expression) has its (the right operand's) type
    and value".  "The result is not an l-value".
    Since all other binary value-like operators such as + - & && ?:
    (as discussed above) seem to require that arrays are converted
    to pointers to their first element, is it not natural to
    require a comma operand to do this also?  The phrasing
    about "usual conversions" suggests that conversions
    of arrays to pointers MAY be done but widening of chars to ints
    MIGHT NOT be done.
    Since the result is not an l-value the only way this makes any difference
    is sizeof context.
    Question:  assume char c, v[40] and sizeof(char *) = sizeof(int) = 4.
    Is   sizeof (0, c) == 1 or 4?
    Is   sizeof (0, v) == 40 or 4?
    My best guess as to the current intention of ANSI is that the
    answers are 1 and 4, i.e. char is not converted, but arrays are converted.
    Inconsistent.  See comment 3* above as a definition which provides a
    simple possible clarification.

34. section 3.4
    Constant expressions seem to be a little ambiguous, especially
    the different classes.  
    (a) According to my reading, a conforming compiler is required to allow
        "int a = (int)3.4;" but to fault "int a[(int)3.4]".
    (b) char c[sizeof 3.4] is not allowed as floating constants may not occur.
    (c) The specification seems to suggest that arbitrary (load-time)
        address arithmetic is to be supported for static initialisation,
        e.g. "extern int b,c; int a = &b-&c/4;"
    (d) There is no guidance about the effect that operators such as
        &&, || and ?: which do not need to evaluate all of their
        operands.  E.g. is (1 ? 0 : 1/0) a constant expression? Is it
        valid at all?  What about (1 ? 0 : f()) - presumably not.
        On the other hand many compilers do reductions (such as
        replacing the above 2 expressions with the value 0) in ALL
        contexts.  Is this conforming?

35. section 3.5 constraint:
    "A declaration shall declare at least ...".  Something along
    the lines of "... at its outermost level" is required - otherwise
      struct { struct a { int b; }; };
      static struct a { int b; };
    seem to be allowed, but the rationale wants to ban them.
    (Each defines 'struct a' inside a struct, or with static, while
    declaring nothing else.)

38. section 3.5.2.2, line 9:
    (a) line 9: "when the size of an object of the specified type is not
        needed".  When is this?  Some compilers may need the size of
        'struct foo' to calculate the stride of 'extern struct foo a[]'
        (i) on the definition of a[] or
        (ii) on its first use?
    (b) line 11: I have had a user claim that the words "declaration of the
        tag in the same scope MAY give the complete definition" allows one
        to make a declaration of the form
          extern f(struct magic *x);     or
          f(struct magic *x) { return g(x); }
        being the only place where tag 'magic' occurs and 'not need to
        define struct magic because the size is not required'.
        Surely all tags referenced need to be defined somewhere in the
        current file (after preprocessing)?  Consider an implementation in
        which (struct { char x; } *) and (struct { int y; } *) have different sizes.
    (c) Consider "struct foo const b;".  This would appear to be valid
        provided foo is defined elsewhere.
        What about "struct baz const;" is this
        (i) an illegal empty declaration required to be faulted or
        (ii) a "vacuous structure specifier superceding any prior declaration"
        Ditto "volatile struct hohum;". 

41. section 3.5.3.2
    Presumably 'typedef int a[]' is now illegal?
    Should the constraint that there are no arrays of functions appear here?
    The exact positions where [] may occur must be listed.  E.g.
      Is 'static int a[]' a valid tentative definition?
      Is 'int a[]' a valid tentative definition?
    The answers to the last two questions are particularly important in that
    one pass compilers (with data-segment backpatching)
    on machines requiring registers for addressing static store
    wish to allocate space for a tentative definition in the
    module data-segment so that all statics can be addressed w.r.t. a
    single register (consider 370 addressing or 32016 module data).
    Clearly allowing such tentative definitions breaks this for dubious
    benefit.

42. section 3.5.3.3 semantics:
    (a) "the contained identifier has type 'type-specifier function
        returning T'".  The identifier has to have a type which includes
        its argument types to disallow
           extern int f(double), (*g)(int);
           void init() { g = f;}
    (b) Is the intent that 'typedef void a; extern f(a)'
        treats 'f' as having no arguments?  Is 'typedef void a' legal?
    (c) "the storage class specifier register is ignored".
        Does this mean that (i) calling sequences may not legally
        pass register arguments in different places from auto arguments
        or (ii) the assignment
            extern (*f)(int a), (*g)(register int a);
            void assign() { f = g; }

43. section 3.5.5
    (a) line 29: "but the type (of a redeclared typedef name) may not be
        omitted in the inner declaration".  What does "type" mean here?
        Consider "typedef int a; f() { const a; ... }
        Does it mean "all type-specifiers"?
    (b) line 30: "same ordered set of type specifiers".  What is an ordered
        set?  Does this mean that "const int *a; int const *b" give a and b
        different types?  I hope not!
    (c) One presumes that "signed char *s; unsigned char *t" are
        different types.  So that f() { s = t; } is illegal.
        However, on implementations where char = signed char is
          signed char *s; char *t; f() { s = t;}      legal?
        Or are the 3 types never equal and thus always need casts?
    (d) Again is
            extern (*f)(int a), (*g)(register int a);
            void assign() { f = g; }                        legal?
    (e) Similarly
            extern (*f)(int *a), (*g)(const int *a);
            void assign() { f = g; }                        legal?
        I hope not - it would upset many optimisers.

44. section 3.5.6
    (a) Consider "struct { int a:10, :0, b:10; } foo = { 1,2 };"   
        Does foo.b get initialised to 0 or 2?  I.e. do padding bit
        fields count as members for initialisation?
    (b) line 41-46.  Auto scalar initialisers may be brace enclosed,
        but struct/union ones may not.  Do these special syntax rules
        really make sense?
    (c) line 47.  May the braces which enclose a union initialiser be omitted?
    (d) page 62, line 1: "The rest of this section ... aggregate type"
        Insert "non-union" before "aggregate".

45. section 3.5 and 3.7.1
    (a) 3.5, line 6: there is a misery here for compilers in that at top level
        one cannot police this syntactic requirement without arbitrary
        lookahead since section 3.7.1 allows declaration specifiers to
        be omitted.  Request: please mark the omission of declaration
        specifiers as "deprecated" now that void provides a proper
        way to do it.

47. section 3.7.1
    (a) "If the declarator includes a parameter type list, the declaration
        of each parameter shall include an identifier".
        A special case needs to be made to avoid forbidding
        int f(void) {...}.
    (b) "An identifier declared as a typedef name shall not be redeclared
        as a formal parameter".  Clearly this is needed for K&R style
        definitions.  However, it corrupts the regular syntax for prototype-
        style C.  What is wrong with:
           typedef int a;    f(double a) { ... }

48. Do the scopes of struct a and struct b differ in the following
      extern struct a { int a; } f(struct b { int b;});   ?

49. section 3.8
    Please explain how the pre-processor tokenises '0.1x', is it
    {0.1}{x} or {0.1x}?  Ditto 1.2e4e5.  Ditto 1000_000.  Ditto 1__DATE__.

50. section 3.8.1
    (a) This suggests that '#if defined(__STDC__)' evaluates to 0 since
        "it has not been the subject of a #define directive".  Surely not.
    (b) Consider '#define f(x,y) x
                  #if f(1,
                        2)
                  ...'
        Of the rules that "pre-processor directives are single lines" and
        "macro arguments may span several lines" which wins?
    (c) Personally I would rather that character constant evaluation
        should be forbidden entirely at pre-processor time than that
        the pre-processor should disagree with the compiler.
    (d) Is '#define void int' valid?
        If not, then how about
        '#ifndef __STDC__
         #define void int
         #endif'

51. section 3.8.2
    Escape sequences are not interpreted in strings representing file names
    according to the example #include "\new\types".  Surely ONE RULE
    'double '\'s in all strings' is simpler?  This current interpretation
    means that some file names are not accessible - e.g. '\' or '\"'
    according to which of #include "\" or #include "\"" are legal (they
    both cannot be).

52. section 3.8.3.5
    This long example seems to be specifying the pre-processor by example.
    The complication of what was intended as a SIMPLE macro processor
    increases with every draft.

53. I would like to be able to use 'C' to write packages which start
    up with messages of the form:
    "Package X compiled by 'acme-C on ibm/370' on 29-feb-88
     Running on 'acme-C wombat-500' on 1-Mar-88"
    These greatly enhance version control.
    One way of achieving this is to add a library function
      char *libver()     (or similar)
    to a header - the intent being that this produces the 'acme-C wombat-500'
    string at run time.  Associated with libver() would be a
    pre-defined pre-processor string identifier __CENV__
    (compile time environment) which, if the
    compiler were written in C, would obtain this string by calling
    libver().  Thus the above message could be written by
      printf("Package X compiled by '" __CENV__ "' on " __DATE__
             "\nRunning on %s on %s\n", libver(), asctime());
    No method is proposed for pre-processor testing of the content of
    this string.
    
54. Now that the pre-processor is no longer allowed to pre-define
    variables of the form 'sun' 'sel' etc., it is natural to pre-define
    at least one macro indicating the machine name beginning with '_'.
    Suppose I choose to pre-define _machine_68000 on a 68K compiler.
    I need a guarantee that no-one else will gratuitously define
    _machine_68000 (for example as an internal macro within a
    header file.).  Therefore would the committee consider reserving
    one or more macro prefixes which MAY NOT BE DEFINED within any header
    file but which MAY BE (no requirement) pre-defined
    "in a manner intimating of the target machine".
    Suggestion: no HEADER in a standard conforming implementation may
    define any macro name of the forms __MC_*__ and __OS_*__.
    One or more of such macros MAY be defined before execution in the
    manner of __STDC__.

55. section 3.9.3
    Please add "hence so is implicit definition as functions of names
    followed by '('".

56. section 4.1.2 and 3.1.2
    (a) "All external identifiers and macro names beginning with a leading
        underscore are reserved".  'struct' tags need to be so to,
        as <stdio.h> may wish to do
          typedef struct _FILE FILE;
    (b) Similarly, may an implementation reserve an 'identifier' beginning
        with an underscore for a new (internal) keyword in such a way
        that that keyword may no longer be used as a keyword?
        I.e. may an implementation refuse to accept
           f() { int _magic = 1; return _magic; }
    (c) "Headers may be included ... in any given scope".
        This suggests that
          void f() {
          #include <time.h>
          ... }
          void g() {
          #include <time.h>
          ... }
        is legal.  However, either the functions like time() will only
        be defined in one scope (if the usual '#ifdef <magic>' is used)
        or they will be defined to have two different types in the
        two scopes (since 'struct tm' will be defined in each scope and
        these are DIFFERENT types).  This would be faulted by a
        well-checking compiler.

57. section 4.1.5
    (a) "Any invocation of a ... macro ... evaluates each of its arguments
        only once".  The exceptions getc()/putc() are omitted here, as is
        the NDEBUG version of assert().
    (b) "... protected from interleaving".  It needs also to be
        protected from re-association of associative operators too.
        Consider sqrt(1.0+1e6)

58. section 4.2.1.1
    "if NDEBUG is defined then '#define assert(ignore)' is performed".
    Objection.  This ensures that assert(...}!!!) is valid.
    Two solutions:
    1. #define assert(x) ((void)(x))
    2. #define assert(x) ((void)(sizeof (x)))
    Both these enable the compiler to check that the expression is valid;
    the former evaluates x, the latter does not.

59. section 4.2.1.1 et al.
    The standard should clarify that a function-like macro from the standard
    headers may ONLY generate a C expression - definitions
    of (standard) assert by e.g.
      #define assert(x) { if (x) ... }
    cannot be used as a function - consider
      if (magic) assert(x==y); else assert(x==z);
    would be syntactically illegal ('};' may not precede 'else').
    Similarly the proposed definition of assert(ignore)
    fails because the program (magic ? assert(x==y) : (void)0)
    becomes syntactically illegal.
    Without such a restriction it becomes unclear when (= in which contexts)
    a strictly conforming program may use ANY library function.

64. section 4.8.1.2
    Is va_arg(ap, type) an l-value?  Is it modifiable?

66. section 4.9.5.3
    (a) "... append mode ... writes ... to ... end-of-file ... regardless
        of calls to fseek".
        Suppose I do
        { FILE *f = fopen("junk", "ab+"); long i, j, k;
          fwrite("abcd", 4, 1, f);
          i = ftell(f);                     4, I suppose
          fseek(f, 0, SEEK_SET);
          /* POSSIBLY insert here fread(junk,4,1,f); fseek(f,0,SEEK_SET); */
          j = ftell(f);                     0, I suppose
          fwrite("efgh", 4, 1, f);
          k = ftell(f);                     8? 0? Dunno
        }
        Do append-mode files have two file positions indicators - one for
        reading and one for writing?
    (b) "When opened a stream is fully buffered if ... interactive device"
        This means that the system has to have some way of determining
        if a FILE is interactive.  UNIX programmers often stoop to
           if (isatty(fileno(fp))) ...
        which is pretty unportable.  On the other hand there is a real
        requirement to determine in many C programs whether input is
        coming from an interactive file (to issue prompts for example).
        Clearly the committee cannot bless isatty().  Can it offer
          extern int fisatty(FILE *);    ?
          extern int devtype(FILE *); with macros such as
          DEV_TTY, DEV_UNKNOWN, DEV_DISC?
      
67. section 4.9.5.6
    (a) May setvbuf() be used after fopen() and ungetc() but before
        read and write?  Many implementations of ungetc() do an implicit
        setvbuf() if called before I/O.
    (b) "buffered ... flushed ... when input is requested"
        Input from where?  Any file?  Any terminal?  This terminal?

69. section 4.9.6.1
    (a) Padding is with zeros if the width integer starts with a zero.
        This requires %08.2f to print as e.g. '000-3.14'.
    (b) what does a negative field width do if the '-' flag is set?
        consider printf("%-*d", -20, 345).
    (c) Flag character discusses the 'first' character of a conversion.
        There may not be one - e.g. printf("% 0d", 0);

70. section 4.9.6.4
    If C is becoming a more reliable language it is time to stop
    using routines like sprintf.  (For example the sample definition of
    asctime() in section 4.12.3.1 overwrites its static char[26] array
    if any of its argument components are out of range).
    For reliability, and consistency with strcpy/strncpy please
    may we have  "extern int snprintf(char *, size_t, const char *, ...);"

71. section 4.9.7
    (a) Now that (getc)(f) provides the 'proper' way of doing what fgetc(f)
        was intended for, may fgetc() be marked as "deprecated"?
    (b) Ditto fputc().
    (c) gets() should be marked as deprecated due to its unusability in
        reliable code.

72. section 4.9.8.1 and 4.10.5.1
    Why are the 'size' and 'nmemb' arguments in opposite order in
    fread() and bsearch()?
    Section 4.10.3.1 calloc() agrees with bsearch(), although it does not
    matter.

73. section 4.9.9.4
    Would 'long UNSIGNED int ftell()' not give a greater range?

74. section 4.10.
    <stdlib.h> defines ERANGE but not EDOM.  Why?

77. section 4.11.4.4
    How may one determine from setlocale() ("as appropriate to a program's
    locale") whether strcmp() or memcmp() is appropriate?

78. section 4.12.2.3
    (a) What is the role of isdst in mktime()?  Set? Used?
    (b) Can mktime() affect the static "broken down time structure"
        referenced in 4.12.3?
    (c) How badly out of range may the components of mktime() be?
        INT_MAX?  Can I ask for day -1000 of month -235 of year 2000?
    (d) Is this the only function which accepts out of range values in
        a struct tm object?

79. section 4.12.3.4
    What if timer == (time_t)-1?

80. section A.5
    "A value is given to an object of an enumeration type than by
    assigning a variable or constant".  enum foo f(), x = f();
    is surely OK.

81. section A.6.2 (page 178)
    Insert "(promoted)" in "The (promoted) type of the next argument ...
    disagrees with ... the va_arg macro".

82. section A.6.3.5
    "The direction of truncation ...".  Add "or rounding".

83. General:  The real reason that the traditional getc() and putc() 
    macros evaluate their first argument more than once is that C provides
    no way to include a declaration within an expression.
    If C had a 'let' construct (many functional languages) or 'valof'
    (BCPL - C's ancestor) then such a macro could start
    #define getc(fp) (let FILE *_fp = (fp); ( <usual macro with _fp for fp>)).
    Did the committee ever consider such?
    Suggested syntax:
       expression ::= let <declaration> <expression>.

84. General:  Producing a union value from a component value in C is clumsy
    in the extreme.  E.g.
      union foo { t1 *a; t2 *b; };
      extern f(union foo);
      g(t1 *x) { < what we want is f(x)>; }
    Of course using f(x) is illegal, and the proper way to do this is:
      g(t1 *x) { union foo y; y.a = x; f(y); }
    But this is clumsy, so many programmers do
      g(t1 *x) { f(*(union foo *)&x); }
    which is nasty, but works.
    Did the committee consider syntax for the simple mathematical notion of
    "inject component into union"?

85. (synopsis of net mail).
    The program
      struct { int a[2]; } f(void);
      int *g(void) { return f().a; }
    appears to be valid according to the syntax and constraints.
    However, f().a has (int [2]) type and so is converted by
    return to (int *) type.  This requires taking the address of
    the 'structure return value'.  This may be in two registers here!
    May a conforming processor fault this?  Can the standard forbid it?    

86. volatile bit fields:
        storing a volatile bit field on many architectures requires storing
        the byte/word containing it.  However, this may overwrite an
        adjacent bitfield which was asynchronously updated.
    (a) Is this behaviour acceptable?  Do processors need to support
        volatile bit fields?
    (b) The same situation possibly arises w.r.t. volatile char fields
        on word oriented machines.  There one solution is to spread
        the volatile char fields so that only one is stored per word.
        Is it acceptable that sizeof(struct { char a,b; }) !=
                              sizeof(struct { volatile char a,b; })?
        Is such a hack required?
    (c) Does the draft (or ought it to) place any constraint on the size of
        storage unit which is fetched from a volatile bit field
        with respect to memory mapped I/O?

87. section 3.8.4:
    __TIME__ and __DATE__ should be required to be obtained
    by a single call to asctime().   Otherwise the 'midnight problem'
    occurs in that
      printf("System foo compiled at " __TIME__ " on " __DATE__ "\n");
    may print a message which involves a timestamp 24 hours after
    compilation if midnight occurs between the substitution of
    __TIME__ and __DATE__.

88. Floating constants.
    Consider
    f() {
      static float t1 = 0.123456789123456789f;
      static float t2 = (float)0.123456789123456789;
      auto float t3 = 0.123456789123456789f;
      auto float t4 = (float)0.123456789123456789;
      auto double d = 0.123456789123456789;
      auto float t5 = (float)d;
      ...
    }
    Which of t1 to t5 are required to compare equal?
    Especially consider floating point hardware which truncates (like
    the suggested use of IEEE hardware for C in the rationale).
    Would a high quality compiler give:
    (a) t1 the closest approximation to 0.123456789123456789
        which would fit in float?
    (b) t5 the truncated (toward zero) float below 0.123456789123456789
    If so then t2, which could be written using implicit conversion
    as
      static float t2 = 0.123456789123456789;
    would either (i) be unequal to t1
              or (ii) be unequal to t5.
    Both options seem fairly unintuitive.
    Similarly, are t2 (compile time) and t4 (run time) required to match?
    Do ANSI wish to constrain (or even advise) implementers of high
    quality compilers whether
      (i)  (float)0.123456789123456789 == 0.123456789123456789f   or
      (ii) (float)0.123456789123456789 == (float)d
    since not both can obtain.
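    The question can be put in runnable form.  The asserts below check
    only the pairs any self-consistent implementation should make equal
    (identical constant tokens, or the same conversion applied to the
    same double value); whether t1 == t2 is exactly the open question:

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* The five initializations from point 88 as a probe program. */
    int main(void)
    {
        static float t1 = 0.123456789123456789f;
        static float t2 = (float)0.123456789123456789;
        float t3 = 0.123456789123456789f;
        float t4 = (float)0.123456789123456789;
        double d  = 0.123456789123456789;
        float t5 = (float)d;

        assert(t1 == t3);       /* identical float constants            */
        assert(t2 == t4);       /* identical conversions of one double  */
        assert(t4 == t5);       /* same conversion, constant vs variable */
        printf("t1 == t2: %s\n", t1 == t2 ? "yes" : "no"); /* undecided */
        return 0;
    }
    ```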

89. section 3.3.16.1 and 3.3.16.2
    Consider
      static struct { int a:1,b:1; } r;
      static int x;
      r.a = r.b = 1;
      x |= ((x |= 1) & 1) << 1;
    Given that r.a and r.b share the same word in r, are they considered to overlap?
    If not then one presumes on an appropriate machine that the low order
    two bits of the storage corresponding to r are set.
    On the other hand 'x |= ...' may be considered a piece of code
    equivalent to r.a=r.b=1.  However, it appears that 'x' is fetched twice,
    and these fetches may occur in the 'wrong' order, so that x becomes 2
    as a result of the assignment.  Perhaps r.a and r.b overlap,
    and as such the effect of the above may be to set r.a=1, r.b=0.
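    Whatever the committee decides about overlap, the multiple assignment
    itself is sequenced, so a check like the following must pass.  (This
    sketch uses unsigned fields rather than the int:1 of the example,
    since a signed 1-bit field might represent only 0 and -1.)

    ```c
    #include <assert.h>

    /* Point 89's bitfield case: both fields must end up set. */
    static struct { unsigned a:1, b:1; } r;

    int main(void)
    {
        r.a = r.b = 1;
        assert(r.a == 1 && r.b == 1);
        return 0;
    }
    ```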

90. section 4.9
    "reads and writes must be separated by calls to fflush/fseek"
    What happens if I do
       fflush(f); ungetc(ch, f); putc(ch2, f);
    I.e. does ungetc() count as a 'read' routine?
       Similarly, is fflush(f); putc(ch2, f); ungetc(ch, f) an error?
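    For reference, the uncontroversial part of the rule in runnable form:
    on a stream opened for update, a write followed by a read needs an
    intervening fflush() or fseek(), and a read followed by a write needs
    an intervening fseek().  (Note the argument order: stream last.)

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Point 90: legal write->read and read->write transitions. */
    int main(void)
    {
        FILE *f = tmpfile();     /* opened in update mode */
        int ch;

        assert(f != NULL);
        putc('A', f);
        fseek(f, 0L, SEEK_SET);  /* write -> read transition     */
        ch = getc(f);
        assert(ch == 'A');
        ungetc(ch, f);           /* the questioned pushback      */
        fseek(f, 0L, SEEK_END);  /* read -> write transition     */
        putc('B', f);
        fclose(f);
        return 0;
    }
    ```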

92. The rules about header inclusion do not seem to have considered the
    following program:
      #define NULL 1                       (really)
      #include <stddef.h>
    I would expect this to give an error, as whatever definition of
    NULL <stddef.h> gives must textually differ from "1".
    On the other hand some (textual) <stddef.h>'s may contain:
      #ifndef NULL
      #  define NULL (void *)0
      #endif
    which would not give an error.  I would like the standard to 
    ban such implementations by saying that no name in any
    header may be #defined by the user in a file which includes
    that header.
    BETTER STILL: require headers to be the first things to be included
    in a source file.  All system headers MUST be included before
    any user files and before any definitions or declarations.

93* Section 3.3.2.2
    (Please also add an index reference of 'ellipsis' to this section.)
    "The default argument promotions" include the integral promotions
    (presumably also conversion of arrays (e.g. strings) to pointers
     see point 3*) and widening of float to double.
    Recently I have built a compiler in which calls to printf/scanf
    are argument type-checked when flow analysis indicates a given
    necessary format association to a printf call.
      This brings to light the following point: 
      Technically the definition
        f(int *x) { printf("x = %p -> %d", x, *x); }
      is invalid as %p expects a (void *) argument and I have provided
      an (int *) one, which may be of a different length.
      The 'proper' code according to the current draft is to do
        f(int *x) { printf("x = %p -> %d", (void *)x, *x); }
      but how often will this occur in REAL code, given the impoverished
      attitude to checking (and the view that 'all machines are like a vax')
      in C in general?  Worse, it might provoke
      users into sillies like
        f(int *x) { printf("x = %lx -> %d", (long)x, *x); }.
    Suggestion: that the default argument conversions include widening
    of all pointers to (void *).  This means that the above natural code
    works without problem, at the cost of requiring slightly more care
    when using va_arg.  However, users already need to be aware
    that va_arg(ap, char) and va_arg(ap, float) are illegal and need
    to be written respectively as (char)va_arg(ap,int) and
    (float)va_arg(ap, double).  This suggestion would require that
    va_arg(ap, char *) in printf %s would have to be coded as
    (char *)va_arg(ap, void *).
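    Both halves of the point can be shown together: the (void *) cast
    that %p strictly requires, and the va_arg narrowing rule for promoted
    char arguments (first_char is a hypothetical helper, not a library
    function):

    ```c
    #include <stdarg.h>
    #include <stdio.h>

    /* char arguments arrive promoted to int, so a variadic callee must
     * fetch int and narrow -- va_arg(ap, char) is illegal. */
    static int first_char(const char *fmt, ...)
    {
        va_list ap;
        int c;
        va_start(ap, fmt);
        c = (char)va_arg(ap, int);   /* not va_arg(ap, char) */
        va_end(ap);
        return c;
    }

    int main(void)
    {
        int v = 42, *p = &v;
        /* The 'proper' %p call: the pointer cast to (void *). */
        printf("x = %p -> %d\n", (void *)p, *p);
        printf("first char: %c\n", first_char("%c", 'q'));
        return 0;
    }
    ```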

95. section 4.9.5.7 and 4.9.9.2
    It should be clarified that feof() does not return true until
    after getc() has been called when the file is at end-of-file.
    feof() is NOT a test for end-of-file, but a test that a PREVIOUS
    input operation failed because of an attempt to read beyond the file.
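    The distinction is easy to demonstrate: positioned at end-of-file,
    feof() is still false; only after a getc() has actually failed does
    it become true.

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Point 95: feof() reports a PAST read failure. */
    int main(void)
    {
        FILE *f = tmpfile();
        assert(f != NULL);
        putc('A', f);
        fseek(f, 0L, SEEK_SET);

        assert(getc(f) == 'A');  /* now positioned AT end-of-file   */
        assert(!feof(f));        /* ...but the indicator is not set */
        assert(getc(f) == EOF);  /* a read must fail first          */
        assert(feof(f));         /* only now does feof() say so     */
        fclose(f);
        return 0;
    }
    ```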

96. section 4.1.4
    "NULL expands to an implementation defined null pointer constant".
    I would like to require that NULL expands to ((void *)0)
    for the following reason:  The standard goes to some lengths
    to require type errors to produce diagnostics in code like
        int *p; sqrt(p);
    Allowing NULL to expand to 0 or 0L allows a compiler to fail to
    detect errors in similar code such as 'sqrt(NULL)' - this works
    only on those processors where NULL is 0 or 0L.
    This is against the spirit of the type-checking imposed EVERYWHERE
    else in the language, e.g. the ANSI prototype forms and calling
    compatibility.

97. section 3.3.16.1 and 3.6.6.4
    "If an object is assigned to an object which overlaps ..."
    Consider the program:
      typedef struct { char c16[16]; } ch16;
      union { ch16 a;
              struct { int b1; ch16 b2; } b; } x;
      static ch16 deref(ch16 *z)
      {
        return *z;
      }
      main() { x.b.b2 = x.a; }
    This copies a ch16-typed field of x to an overlapping one at function
    return.
    I would anticipate that "return ... the value of the expression
    is returned to the caller" means that this copy is NOT considered
    overlapping, but clarification would help.

98. section 3.8.xxx
    What value should __DATE__ give if the date is unobtainable?
    Consider a C compiler written in C.  It enquires of the date
    with time().  But the user has not set the time and so time()
    returns (time_t)-1.  Does __DATE__ have to always produce a
    string "mmm dd yyyy" or could it produce "<**unset**>" (also
    11 chars) or what?

Schauble@mit-multics.arpa (Paul Schauble) (03/09/87)

This is in response to the comments recently posted by Alan Mycroft. Text
bounded by ====== is his.


===============
   c) page 8, line 41.  "Expressions may be evaluated in greater precision
      than required, provided no precision is lost".  I would think
      that it should not be gained either.  Consider IEEE implementation of
      {  float x,y,z;  z = x*y;   if (z == x*y) ... }.
      I believe that it is unreasonable for the condition to fail, BUT
      the current draft seems to permit it.
===============

Perhaps it seems unreasonable, but there are current machines where
arranging otherwise is awkward and expensive.  I have to answer this
question about Honeywell hardware every few months.

Honeywell uses a slightly unusual register structure that gives a float
or double contained in a register an extra 8 bits of precision. This is
lost when the value is stored into memory. The most likely code sequence
generated for this example is

       FLD     X          Get X into register
       FMP     Y          Multiply by Y, giving result in register.
                          Result is *double precision plus extra 8 bits*. 
                          That's right, single*single yields double.
       FST     Z          Store results, truncating to single precision
                          without extra bits.
       FCMP    Z          First of IF statement. Compares truncated value
                          In memory to extra precision value in register.
                          Result is probably not equal.

The only way around this is to have the compiler never generate a
register reuse on a floating point value. This seems a serious
inefficiency.

It seems to me much more in accord with reality to say that == is not
defined for float and double.  At least, like it or not, the definition
is machine dependent and compiler dependent.  It may seem unreasonable
that (x*y == x*y) should fail, but it can and does on many machines.

    Paul
    Schauble  at boTap co at 1

faustus@ucbcad.UUCP (03/09/87)

In article <4804@brl-adm.ARPA>, Schauble@mit-multics.arpa (Paul Schauble) writes:
> Honeywell uses a slightly unusual register structure that give a float
> or double contained in a register an extra 8 bits precision. This is
> lost when the value is stored into memory....
> 
>        FLD     X          Get X into register
>        FMP     Y          Multiply by Y, giving result in register.
>                           Result is *double precision plus extra 8 bits*. 
>                           That's right, single*single yields double.
>        FST     Z          Store results, truncating to single precision
>                           without extra bits.
>        FCMP    Z          First of IF statement. Compares truncated value
>                           In memory to extra precision value in register.
>                           Result is probably not equal.

I think the problem isn't with the arithmetic operations being
performed, but with the test for equality.  I think the hardware should
be intelligent enough to say that a single in memory is equal to a
double+ in a register, if the MSB's of the register are the same as the
bits of the single.  (Assuming that the FST just truncates.) Otherwise
equality in floating point has no useful meaning.  Are problems like
this common enough that it would be better to just say that it's
undefined, as opposed to complaining to the makers of this sort of
hardware?

	Wayne

braner@batcomputer.tn.cornell.edu (braner) (03/10/87)

[]

In many numerical programs you do not want to repeat some calculation
if some floating-point variable has not changed.  Therefore,

	double x,y;
	x = y;
	...
	if (x==y) ...

SHOULD be available (and work correctly) even in implementations where
(x*y == y*x) fails.  And it should be no problem to implement, since
x and y are two copies of the same number in the same internal format.
(This is very different from the common mistake of expecting infinite
precision when using FP variables in loop termination criteria...)
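    The copy-then-compare pattern can be written so that it is safe even
    on excess-precision hardware.  In this sketch the volatile qualifier
    is my addition: it forces both the copy and the comparison to read
    the stored 64-bit value from memory, keeping wide registers out of
    the picture.

    ```c
    #include <assert.h>

    /* Two copies of the same double in the same internal format
     * must compare equal. */
    int main(void)
    {
        volatile double y = 0.1 * 3.0;  /* any computed value */
        double x = y;                   /* straight copy      */
        assert(x == y);                 /* must hold          */
        return 0;
    }
    ```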

- Moshe Braner

stuart@bms-at.UUCP (Stuart D. Gathman) (03/19/87)

In article <4804@brl-adm.ARPA>, Schauble@mit-multics.arpa (Paul Schauble) writes:

> It seems to me much more in accord with reality to say that == is not
> defined for float and double.  At least, like it or not, the definition
> is machine dependent and compiler dependent.  It may seem unreasonable
> that (x*y == x*y) should fail, but it can and does on many machines.

A straight == comparison for float is useless on almost *any* machine.
A better approach would be to define == to be a higher level construct, i.e. equal
within a certain "fuzz" factor.  Unfortunately, there are several definitions
of floating fuzz.  This is where operator redefinition would be handy;
everyone could pick his own floating equality operator.
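    One possible fuzzy equality, written out.  The name feq and the
    relative-tolerance definition are this sketch's choices, not anything
    from the draft; as the posting says, several definitions of fuzz are
    in use.

    ```c
    #include <assert.h>
    #include <math.h>

    /* Equal within a relative fuzz of the larger magnitude. */
    static int feq(double a, double b, double fuzz)
    {
        double m = fabs(a) > fabs(b) ? fabs(a) : fabs(b);
        return fabs(a - b) <= fuzz * m;
    }

    int main(void)
    {
        assert(feq(0.1 + 0.2, 0.3, 1e-9));  /* 'equal' within fuzz */
        assert(!feq(1.0, 2.0, 1e-9));
        return 0;
    }
    ```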
-- 
Stuart D. Gathman	<..!seismo!dgis!bms-at!stuart>

marv@ism780c.UUCP (03/21/87)

In the good old days when I did numerical processing on an IBM-709, I
considered two floating point numbers to be equal when the first N significant
bits of the mantissa were the same (N is of course problem dependent).  This
was easy to do in assembly language.  All that is needed is an unnormalized
subtract operator.

I too wish for a higher level way to specify the "fuzz" for equality.

      Marv Rubinstein  -- Interactive Systems

mouse@mcgill-vision.UUCP (03/24/87)

In article <365@bms-at.UUCP>, stuart@bms-at.UUCP (Stuart D. Gathman) writes:
> A straight == comparison for float is useless on almost *any*
> machine.  A better approach would be to define == to be a higher
> level construct, i.e. equal within a certain "fuzz" factor.

As long as you provide some way to test for real equality.  I mean the
sort that I'd use in

if (...) x = y; else x = z;
......
if (x == y) ....

> This is where operator redefinition would be handy; everyone could
> pick his own floating equality operator.

Wonderful.  Just what the world needs.  "Now let me see, this is Fred's
program, so == means within 1e-10....but this part is some of Alice's
code, so == really means within one part in 1000000...."  It's bad
enough when people start writing their own functions to replace the
supplied library functions.

Still, I suppose it's as good as any other way of settling the matter.
It *is* an unpleasant situation.

					der Mouse

Smart mailers: mouse@mcgill-vision.uucp
USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!musocs!mcgill-vision!mouse
     think!mosart!mcgill-vision!mouse
ARPAnet: think!mosart!mcgill-vision!mouse@harvard.harvard.edu

gwyn@brl-smoke.UUCP (03/27/87)

It occurs to me that we should not allow == with floating-point
operands.  I'm serious!

dik@mcvax.UUCP (03/28/87)

In article <5703@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
 > It occurs to me that we should not allow == with floating-point
 > operands.  I'm serious!

It has been thought of before, but then everybody would write
	if ( a <= b && b <= a)
and complain there is no equality operator.
The only solution is education.
(Ever worked on a machine where the hardware comparison is not
commutative?  The Cyber 205.)
-- 
dik t. winter, cwi, amsterdam, nederland
INTERNET   : dik@cwi.nl
BITNET/EARN: dik@mcvax

nelson@ohlone.UUCP (03/29/87)

I don't have the Oct. draft, so this may be covered, but...

What is the definition about 'goto' where the target is an enclosed
block?  In particular:
    ...
    a = 3;
    goto l;
    ...
    {
        int i = a;
      l: ;
        printf("%d\n", i);
    }
is the initializing assignment to i done or not?  Or is this illegal (as
it should be!)?
-----------------------
Bron Nelson     {ihnp4, lll-lcc}!ohlone!nelson
Not the opinions of Cray Research

drw@cullvax.UUCP (03/30/87)

nelson@ohlone.UUCP (Bron Nelson) writes:
> What is the definition about 'goto' where the target is an enclosed
> block?  In particular:

> is the initializing assignment to i done or not?  Or is this illegal (as
> it should be!)?

Initializations are not done, and it *is* legal.  (The underlying idea
is that automatics of inner blocks are actually allocated in the stack
frame of the containing procedure.)

Dale
-- 
Dale Worley		Cullinet Software
UUCP: ...!seismo!harvard!mit-eddie!cullvax!drw
ARPA: cullvax!drw@eddie.mit.edu
Un*x (a generic name for a class of OS's) != Unix (AT&T's brand of such)

gwyn@brl-smoke.UUCP (03/30/87)

In article <5703@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
> It occurs to me that we should not allow == with floating-point
> operands.  I'm serious!

So far the only valid use for floating == that anyone has suggested to me
is for branching to streamlined handling of cases where one variable is
precisely 0.0 (or perhaps 1.0).

It occurs to me that an example of why C floating == is not very useful
might be in order (apart from the numerical analysis problems):

	#include <stdio.h>

	int SameRatio( a, b, c, d )	/* return non-0 iff a/b == c/d */
		register double	a, b, c, d;
		{
		register double	ab = a / b;	/* `register' honored */
		register double cd = c / d;	/* `register' ignored */

		return ab == cd;
		}

	int main()
		{
		static double	a = 2.0, b = 3.0, c = 2.0, d = 3.0;

		(void)printf( "%s\n", SameRatio( a, b, c, d ) ? "ok" : "oops" );
		return 0;
		}

Now suppose that the machine has 5 available FP registers (other than the
scratch accumulators), that they include some guard bits, and that no real
optimization is done.  Since `cd' will be stored in memory, for the ab == cd
comparison it will be loaded into a scratch register with its guard bits set
to zero, unlike `ab', and very likely the comparison will fail.  "Oops."

msb@sq.UUCP (03/31/87)

> What is the definition about 'goto' where the target is an enclosed
> block?  In particular:
>     ... a = 3;  goto l;
>     ... { int i = a; l: ; printf("%d\n", i); }
> is the initializing assignment to i done or not?

This is covered in Section 3.1.2.4 of the draft, which reads in part:

#   A new instance of an object declared with "automatic storage duration"
#   is created on each normal entry into the block in which it is declared
#   or on a jump to a label in the block or in an enclosed block.  If an
#   initialization is specified, it is performed on each normal entry,
#   but not if the block is entered by a jump to a label.

In other words, your code fragment is legal (they couldn't really make
it illegal, that would break too much existing code), and prints garbage.

I'm glad you made me look at that.  There are two problems with the above.

One is that the words "from outside the block" should be added just after
the first "jump".  They don't mean that a simple if-goto loop within a
block causes new instances of variables in the block to be created on
each iteration!  The second problem is that there is no back-reference
to 3.1.2.4 in section 3.5.6, which is about initialization, nor any index
entry for 3.1.2.4 under initialization.

Mark Brader, utzoo!sq!msb			C unions never strike!