tim@unc.UUCP (Tim Maroney) (04/10/84)
[ This is a proposal for an extension to C; if you start to froth at the mouth and fall over backwards when you hear such things, please don't read. ]

C software suffers from the lack of tagged unions in two main ways. The primary problem is interactive debugging. Automatic debuggers cannot determine which member of a union is currently in use, so the programmer must enter the member name by hand, which slows the debugging process down. The slowdown is very significant in applications with a large number of unions that must be examined in debugging, such as tree input code. (Debugging is probably the least fruitful programmer activity, since a large fraction of its time tends to be spent waiting on compilers; anything that slows it down further is to be shunned.)

The second problem is that equality comparison on untagged unions cannot be added to the language, since the compiler has no way of knowing which member to compare, although structure equality comparison is likely to be added soon.

A tagged union facility more general than that of Pascal could be cleanly added to the language. The syntax would be similar, but the tag could be an arbitrary expression, possibly involving external variables and function calls (both of which must be declared before they appear in a tagged union declaration). It could also be a simple typed identifier, as in Pascal, of course. This allows both the "naive" approach to tags, which tends to waste space in favor of simplicity, and more complex and space-efficient memory-map or typed-page approaches.

The actual syntax would probably be something like:

    <tagged union decl> ::= union <optname> tag <union tag expr> <tagged body>
    <tagged body>       ::= "{" { <tagged member> }+ "}"
    <tagged member>     ::= <constant expr> ":" <member decl>
    <union tag expr>    ::= <C expr> | "[" <decl> "]"

where <member decl> is the same as current structure and union member declarations, and <decl> is an identifier declaration as usual, but without a storage class specifier.
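
For comparison, the "naive" approach to tags can already be written by hand in the present language, roughly as follows. This is only a sketch under stated assumptions: the names node, node_tag, node_equal and the make_* helpers are invented for illustration and are not part of the proposal. It shows the tag stored next to the union, and how the tag is what makes an equality comparison well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag values; the names are illustrative only. */
enum node_tag { NODE_INT, NODE_REAL, NODE_NAME };

/* The "naive" approach: the tag is stored alongside the union,
   so every object pays for it in space. */
struct node {
    enum node_tag tag;
    union {
        long   ival;
        double rval;
        char   name[16];
    } u;
};

/* Constructors that keep the tag and the live member in step. */
struct node make_int(long v)
{
    struct node n;
    n.tag = NODE_INT;
    n.u.ival = v;
    return n;
}

struct node make_real(double v)
{
    struct node n;
    n.tag = NODE_REAL;
    n.u.rval = v;
    return n;
}

struct node make_name(const char *s)
{
    struct node n;
    n.tag = NODE_NAME;
    strncpy(n.u.name, s, sizeof n.u.name - 1);
    n.u.name[sizeof n.u.name - 1] = '\0';  /* always terminated */
    return n;
}

/* Equality can be defined only because the tag says which member
   is in use; with an untagged union there is nothing to switch on. */
int node_equal(const struct node *a, const struct node *b)
{
    assert(a != NULL && b != NULL);
    if (a->tag != b->tag)
        return 0;
    switch (a->tag) {
    case NODE_INT:  return a->u.ival == b->u.ival;
    case NODE_REAL: return a->u.rval == b->u.rval;
    case NODE_NAME: return strcmp(a->u.name, b->u.name) == 0;
    }
    return 0;  /* unknown tag: treat as unequal */
}
```

Nothing in the language ties the tag field to the union here, so a debugger still has to be told which member to print, and the switch has to be rewritten for every such type. That bookkeeping is what a built-in tagged union would take over.
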
If the new keyword "tag" did not appear, the union would be the same as an untagged union in the present language, thus preserving backwards compatibility.

A new operator, typeof(expr), could be useful in a number of contexts. The operator would conceptually expand into an abstract type corresponding to the type of the expr operand. The abstract type could then be used wherever such a thing can usually be used, such as in sizeof expressions and type casts. It would also be useful with tagged unions, as a run-time operator, provided that abstract type equality comparison is added to the language.

One cute thing that could be done with typeof is the creation of better storage allocation macros. As it is now, you have to do something like:

    type *x;
    x = (type *)malloc(sizeof(type));

and if you've ever made the mistake of using the size of the pointer type instead, you know how frustrating and nearly impossible to find that mistake can be. (This is not an uncommon error when the pointer type is a typedef name.) In any case, this is pretty long-winded, and one of the major advantages of C is its brevity. However, with typeof, you could say:

    # define new(ptr) ptr = (typeof(ptr))malloc(sizeof(typeof(*ptr)))
    new(x);

and the "new" macro would work with any pointer type.

With abstract type comparison, macros could be overloaded as well, expanding differently depending on the type of their arguments. (In fact, of course, they would expand uniformly and one alternative would be chosen by the semantic phase of the compiler, but the programmer needn't know that.)

Unions were a late addition to C, so it is not surprising that there are deficiencies. However, these would be easy to rectify in a backwards compatible way, without loss of run-time or compile-time efficiency.

--
Tim Maroney, The Censored Hacker
mcnc!unc!tim (USENET), tim.unc@csnet-relay (ARPA)
All opinions expressed herein are completely my own, so don't go assuming that anyone else at UNC feels the same way.