[comp.lang.c] ANSI C idea: structure literals

friedl@vsi.UUCP (Stephen J. Friedl) (02/28/88)

Netlanders,

     I've had an idea for C for a long time and would like to see
what you people think of it and whether it is worthy of submission
to the standardization committee.  I ran this by Doug Gwyn and he
seems to think it's plausible.  Your comments -- pro or con -- would
really be appreciated.  If no strong objections are raised I'll
submit it to the committee.

     This new item might be called "structure literals" (thanks
to Doug for the name).  Basically, let's say that I want to have
a structure foo:

        struct foo {
                char    *name;
                int     age;
        };

     Initializing a table of these is obvious, but what about
statically initializing a table of *pointers* to structures?  I
need to have two tables, the first of which is probably hidden
from the rest of the world.

        struct foo      _table[] = {
                { "Bill",  25 },
                { "Jim",   10 },
                { "Bob",   40 }
        };

        struct foo      *table[] = {
                &_table[0],
                &_table[1],
                &_table[2],
        };

     This is really a bummer to maintain without special kinds of
preprocessors and it strikes me as something that the compiler
should be able to handle in a straightforward manner.  I propose
something like:

        struct foo      *table[] = {
                & { "Bill",  25 },
                & { "Jim",   10 },
                & { "Bob",   40 },
        };

     Where the compiler sees the & before the {initializer}, puts
the {struct} somewhere and throws the pointer into the table.
Note: The & { } notation is just something I thought up off the
top of my head -- alternate ideas are welcome.

     I hope I am not name-dropping or putting Doug on the spot,
but he had some really good ideas on this:

        "I would suggest generalizing the notion to include
        structure literals as the source for assignment state-
        ments, too, which would probably be best done by adding
        'struct-literal' to the a new section 'Structure
        literals' inserted after 3.1.4 ('String literals') that
        describes the syntax and semantics of struct-literal
        (much like the syntax of the { ... } part of a struct de-
        clarator, except contain only constants, string literals,
        and struct literals)."
	[end comments from Doug]

     I guess the general mechanism would allow:

        main()
        {
        struct foo   x, *xp;

                x  = { "Bill", 12 };
                xp = & { "Bill", 12 };
        }

     This undoubtedly brings up a whole host of questions: can we
pass structure-literals as arguments to functions?  Can we cast
them?  Return them from functions?  There are probably lots more.

     I don't have a copy of the current draft but will be picking
one up on Monday and seeing where these all fit in.  I would
really like it if you wizards could comment on this idea: would
it be helpful?  Can it be specified in the language cleanly?  Can
a compiler handle it cleanly?

     Thanks much,
     Steve

-- 
Life : Stephen J. Friedl @ V-Systems, Inc/Santa Ana, CA    *Hi Mom*
CSNet: friedl%vsi.uucp@kent.edu  ARPA: friedl%vsi.uucp@uunet.uu.net
uucp : {kentvax, uunet, attmail, ihnp4!amdcad!uport}!vsi!friedl
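
For comparison, the pointer table proposed above can be written without
the hidden second array by using compound literals, the form that C99
later standardized.  This is only a sketch and not part of the 1988
proposal: it assumes a C99-or-later compiler, and `table_len` and
`oldest` are hypothetical helpers added for illustration.

```c
#include <stddef.h>

struct foo {
        char    *name;
        int     age;
};

/* At file scope each compound literal has static storage duration,
 * so its address is a valid constant initializer. */
struct foo      *table[] = {
        &(struct foo){ "Bill",  25 },
        &(struct foo){ "Jim",   10 },
        &(struct foo){ "Bob",   40 },
};

/* Hypothetical helper: number of entries in the table. */
size_t table_len(void)
{
        return sizeof table / sizeof table[0];
}

/* Hypothetical helper: the entry with the largest age. */
const struct foo *oldest(void)
{
        const struct foo *best = table[0];
        size_t i;

        for (i = 1; i < table_len(); i++)
                if (table[i]->age > best->age)
                        best = table[i];
        return best;
}
```

Adding or removing an entry touches one line only, which is the
maintenance property the two-table version lacks.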

flaps@dgp.toronto.edu (Alan J Rosenthal) (02/28/88)

In article <56@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>        struct foo      *table[] = {
>                & { "Bill",  25 },
>                & { "Jim",   10 },
>                & { "Bob",   40 },
>        };
>
>     Where the compiler sees the & before the {initializer}, it puts
>the {struct} somewhere and throws the pointer into the table.

This is good.  It is an analogue of what happens with strings.  It
would be good to extend this to arrays and other things.  For example,
why not be able to write "&3" to get a pointer to int which points to a
3?  (This is currently achievable by the non-portable (int *)"\0\0\0\003".)

The basic advantage of this is it reduces the need for temporary
variables, which are generally a Bad Thing.  This is not as bad an
addition as some of the ANSI things because there is semi-precedent
with strings.

Unfortunately, this doesn't seem to allow creating, say, a short in
this manner, because you can't specify shorts in expressions.  Any
attempt to fix this would probably be a very bad addition (such as
differentiating "&(short)3" from "&3" even though "(short)3" and "3"
usually are identical).  (Of course, they're already different when
sizeof is used on them, but this is already ugly.)

ajr
-- 
If you had eternal life, would you be able to say all the integers?
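
As a point of comparison, the compound literals of C99 cover the "&3"
case and sidestep the short problem by naming the type explicitly.  A
minimal sketch, assuming a C99-or-later compiler; the names `ip`, `sp`
and `short_literal_size` are made up for illustration.

```c
#include <stddef.h>

/* Pointers to unnamed objects.  The type name in parentheses fixes the
 * type of each object, so an int and a short can be told apart. */
int   *ip = &(int){ 3 };
short *sp = &(short){ 3 };

/* The unnamed short object really has type short. */
size_t short_literal_size(void)
{
        return sizeof((short){ 3 });
}
```

Both objects are writable here; at file scope they live for the whole
program, much like a string literal does.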

wesommer@athena.mit.edu (William Sommerfeld) (02/29/88)

In article <56@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>Netlanders,
>
>     I've had an idea for C for a long time ...
>...   This new item might be called "structure literals" (thanks
>to Doug for the name).  

Richard Stallman's GCC implements a generalization of this as a language
extension.  This is how he documents it (this is from the file
"internals.texinfo" from the GNU CC distribution; it is covered by the
GNU copyleft); I'm not sure I like the syntax, but it's better than nothing.


   * Constructor expressions are allowed.  A constructor looks like a cast
     containing an initializer.  Its value is an object of the type
     specified in the cast, containing the elements specified in the
     initializer.  The type must be a structure, union or array type.
     As explained above, GNU C does not require the elements of the
     initializer to be constant.
     
     Assume that `struct foo' and `structure' are declared
     as shown:
     
          struct foo {int a; char b[2];} structure;
     
     Here is an example of constructing a `struct foo' with a
     constructor:
     
          structure = ((struct foo) {x + y, 'a', 0});
     
     This is equivalent to writing the following:
     
          {
            struct foo temp = {x + y, 'a', 0};
            structure = temp;
          }
     
     You can also construct an array.  A constructed array is not an lvalue
     and therefore cannot be coerced into a pointer to its first element.
     As a consequence, the only valid way to use a constructed array is to
     subscript it.  Here is an example of constructing an array of three
     elements and then choosing one of them:
     
          output = ((int[]) { 2, x, 28 }) [input];
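
The two documented examples can be wrapped in functions to try them
out.  This sketch assumes a compiler that accepts the same syntax as
C99 compound literals; `make_foo` and `choose` are hypothetical wrapper
names.  One difference from the text quoted above: in C99 a constructed
array is an lvalue and does convert to a pointer to its first element.

```c
struct foo {int a; char b[2];};

/* Assignment from a constructed struct; the elements of the
 * initializer need not be constant. */
struct foo make_foo(int x, int y)
{
        struct foo structure;

        structure = ((struct foo) {x + y, 'a', 0});
        return structure;
}

/* Construct an array of three elements and choose one of them. */
int choose(int x, int input)
{
        return ((int[]) { 2, x, 28 }) [input];
}
```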

franka@mmintl.UUCP (Frank Adams) (03/02/88)

In article <1988Feb28.130526.4147@jarvis.csri.toronto.edu> flaps@dgp.toronto.edu (Alan J Rosenthal) writes:
>Unfortunately, this doesn't seem to allow creating, say, a short in
>this manner, because you can't specify shorts in expressions.  Any
>attempt to fix this would probably be a very bad addition.

There is an obviously correct way to permit specification of short
constants: append an s.  Exactly analogous to appending an l for long
constants.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

karl@haddock.ISC.COM (Karl Heuer) (03/02/88)

In article <1988Feb28.130526.4147@jarvis.csri.toronto.edu> flaps@dgp.toronto.edu (Alan J Rosenthal) writes:
>In article <56@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>>        struct foo      *table[] = {
>>                & { "Bill",  25 },
>>                & { "Jim",   10 },
>>                & { "Bob",   40 },
>>        };
>
>This is good.  It is an analogue of what happens with strings.  It would be
>good to extend this to arrays and other things.  For example, why not be able
>to write "&3" to get a pointer to int which points to a 3?

(I'll use backquotes `` in running text to avoid colliding with C notation.)

I like the concept of unnamed remote objects, but I'm uneasy about allowing
`&` to apply to a non-lvalue.  I would prefer a more explicit syntax: let's
postulate a keyword `remote` that looks like a function call with two
arguments, the first of which is a type name, and the second an initializer;
the result is an unnamed lvalue of the specified type and value.  Thus,
instead of `&3` we write `&remote(int, 3)`.  The struct example would use
`&remote(struct foo, {"Bill", 25})`.  The existing notation for string
literals can now be described in terms of remotes: `"foo"` is synonymous with
`&remote(char[4], {'f', 'o', 'o', '\0'})`.  (All remotes, like strings in ANSI
C, should be non-writable and poolable at the implementor's discretion.)

This syntax is less kludgy than the other proposals I've seen, and by making
the type explicit we avoid the ambiguities.  Unfortunately it's a bit verbose.
It could be changed from an alphabetic operator to punctuation, but I suspect
the need is sufficiently rare that the longer name is acceptable.

>Of course, [`3` and `(short)3` are] already different when sizeof is used ...

Only on a broken compiler.  In C, `sizeof((short)3)` returns sizeof(int).
(Because `(short)3` is not an lvalue.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
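
The `remote` idea can be approximated with a macro over C99 compound
literals.  This is a rough sketch under stated assumptions, not the
proposal itself: `REMOTE` is a hypothetical macro, it needs a
C99-or-later compiler, and unlike the proposal it requires braces even
around a scalar initializer.  The `const` qualifier stands in for the
"non-writable and poolable" property.

```c
struct foo {
        char    *name;
        int     age;
};

/* Hypothetical macro: REMOTE(type, initializer) yields an unnamed,
 * read-only lvalue of the given type.  The initializer is passed via
 * __VA_ARGS__ because its braces do not protect the commas inside. */
#define REMOTE(type, ...)  ((const type)__VA_ARGS__)

const int        *three   = &REMOTE(int, {3});
const struct foo *bill    = &REMOTE(struct foo, {"Bill", 25});

/* A string described as a remote char[4]; the array converts to a
 * pointer to its first element, so no `&` is written here. */
const char       *foo_str = REMOTE(char[4], {'f', 'o', 'o', '\0'});
```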

tada@athena.mit.edu (Michael Zehr) (03/02/88)

In article <1988Feb28.130526.4147@jarvis.csri.toronto.edu> flaps@dgp.toronto.edu (Alan J Rosenthal) writes:
>This is good.  It is an analogue of what happens with strings.  It would be
>good to extend this to arrays and other things.  For example, why not be able
>to write "&3" to get a pointer to int which points to a 3?

Actually, VAX C 2.2 allows this, although it's completely non-portable,
of course.


-------
michael j zehr
"My opinions are my own ... as is my spelling."

ian@puivax.UUCP (Ian Wilson) (03/03/88)

In article <2804@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <1988Feb28.130526.4147@jarvis.csri.toronto.edu> flaps@dgp.toronto.edu (Alan J Rosenthal) writes:
>>In article <56@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>>>        struct foo      *table[] = {
>>>                & { "Bill",  25 },
>>
>
>(I'll use backquotes `` in running text to avoid colliding with C notation.)
>
>I like the concept of unnamed remote objects, but I'm uneasy about allowing
>`&` to apply to a non-lvalue.  I would prefer a more explicit syntax: let's
>postulate a keyword `remote` that looks like a function call with two
>arguments, the first of which is a type name, and the second an initializer;
>the result is an unnamed lvalue of the specified type and value.  Thus,
>Only on a broken compiler.  In C, `sizeof((short)3)` returns sizeof(int).
>(Because `(short)3` is not an lvalue.)

Another place where anonymous objects would be handy is for functions
embedded in data structures. For example:

	struct {
		char *name;
		int (*handler)();
	} table[] =
	{
		{"fred", &remote( int (*)(), { return 6; })},
		-- etc ---
	};

These functions have similar properties to the struct example
originally quoted: their names are never needed, they have to
be constructed somewhere distant from where they are required,
their names must be chosen so as not to conflict with anything
else, and so on.

Presumably, in order to be able to specify formal parameters to
these anonymous functions the `remote' syntax would have to be
extended:

	(*(&remote( (*)(char * x), {printf("%s\n", x);} )));

Perhaps it would be better to acknowledge what `remote' is trying
to be and call it `lambda' .... (:-))

	ian wilson
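
For reference, this is roughly what the same table looks like with
existing constructs, where every handler must be given a name.  A
sketch only: `fred_handler`, `jim_handler`, the "jim" entry and
`dispatch` are hypothetical names invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Each handler needs a file-scope name, even though only the table
 * ever refers to it. */
static int fred_handler(void) { return 6; }
static int jim_handler(void)  { return 7; }

static struct {
        char *name;
        int (*handler)(void);
} table[] =
{
        {"fred", fred_handler},
        {"jim",  jim_handler},
};

/* Hypothetical lookup: run the handler registered under `name`,
 * or return -1 if there is none. */
int dispatch(const char *name)
{
        size_t i;

        for (i = 0; i < sizeof table / sizeof table[0]; i++)
                if (strcmp(table[i].name, name) == 0)
                        return table[i].handler();
        return -1;
}
```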

henry@utzoo.uucp (Henry Spencer) (03/04/88)

The only specific problem I see with this scheme is that there is no
specific indication of the type of the structure literals.  This is unwise;
there are situations where it's not trivial to guess from context.

There is also a more general problem.  To approximate the probable response
of X3J11:  "Need not convincingly shown; existing constructs can be used to
same effect; lack of implementation experience in C."
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/04/88)

In article <1988Mar3.183939.945@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>There is also a more general problem.  To approximate the probable response
>of X3J11:  "Need not convincingly shown; existing constructs can be used to
>same effect; lack of implementation experience in C."

I think a good case could be made for some such facility.  Whether an
argument sufficient to convince X3J11 to change the proposed standard at
this late date could be found is the real question.  If the idea is sent
in as a public comment, it needs to have a convincing argument attached.

decot@hpisod2.HP.COM (Dave Decot) (03/05/88)

> There is also a more general problem.  To approximate the probable response
> of X3J11:  "Need not convincingly shown; existing constructs can be used to
> same effect; lack of implementation experience in C."

Aggregate constants are needed for data abstraction.

The problem of deciding what their type is solved for now (and possibly
always) by stating in the standard that such syntax has no inherent type
and must be cast or assigned to the desired type.

Dave Decot
hpda!decot

henry@utzoo.uucp (Henry Spencer) (03/06/88)

> >... "Need not convincingly shown; existing constructs can be used to
> >same effect; lack of implementation experience in C."
> 
> I think a good case could be made for some such facility...

Um, Doug, please explain how said good case would refute any, let alone
all, of the three objections I cited.  Note that the first objection says
"need", not "wish".  For heaven's sake, have we forgotten that X3J11's
supposed mission was to *standardize* C, not redesign it?!?
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

henry@utzoo.uucp (Henry Spencer) (03/06/88)

> Aggregate constants are needed for data abstraction.

I guess I'm simple-minded; you'll have to explain in more detail.  I use
data abstraction routinely and have never *needed* aggregate constants.
I also find it difficult to envision a situation in which it would be
impossible to write

	static const struct thingie xxx = { ... };
	...
	foo = xxx;

instead of

	foo = { ... };

It is agreed that the latter form is more convenient.  But we were talking
about *needs*, in the context of an existing language, not about a wishlist
for a new language.

> The problem of deciding what their type is solved for now (and possibly
> always) by stating in the standard that such syntax has no inherent type
> and must be cast or assigned to the desired type.

[expletive deleted]  Speaking as a user and an implementor, this is an
abortion if there ever was one.  If one *must* add aggregate constants to
the language -- preferably as an experimental variant and not as part of
the effort to STANDARDIZE THE CURRENT LANGUAGE, DAMMIT! -- then the right
way to do it is probably the GNU compiler's approach, which avoids this
hideous botch entirely.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry
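
The named-constant workaround described above, written out so that it
compiles.  The members of `struct thingie` are not given in the post,
so the ones below are assumptions borrowed from the `struct foo`
example, and `fresh` is a hypothetical function name.

```c
struct thingie {
        char    *name;          /* assumed member */
        int     age;            /* assumed member */
};

/* Return a thingie set from a named constant, since an aggregate
 * cannot be written directly on the right of `=`. */
struct thingie fresh(void)
{
        static const struct thingie xxx = { "Bill", 12 };
        struct thingie foo;

        foo = xxx;
        return foo;
}
```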

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/07/88)

In article <1988Mar5.213746.12022@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>> >... "Need not convincingly shown; existing constructs can be used to
>> >same effect; lack of implementation experience in C."
>> I think a good case could be made for some such facility...
>Um, Doug, please explain how said good case would refute any, let alone
>all, of the three objections I cited.  Note that the first objection says
>"need", not "wish".  For heaven's sake, have we forgotten that X3J11's
>supposed mission was to *standardize* C, not redesign it?!?

I meant that I think it could be reasonably argued that there IS a need
for such a facility.  In its absence, we've been resorting to kludges
using "awk" scripts and such to help build initializer tables as separate
files which are then #included at compile time.  This requires introducing
a lot of unique struct names that are of no value to the programmer other
than part of the workaround.  It is embarrassing to admit that this could
have been solved by adjusting the language design, but wasn't.

If you haven't run into this problem, consider yourself fortunate.

It isn't absolutely necessary to have had previous implementation
experience with a proposed feature, if it is clear to everyone how
it would be implemented and what the consequences would be.

Certainly, just because a feature is a "nice idea" is not sufficient
reason to toss it into the standard, but when an idea solves an existing
real programming problem in a clean way it is worth considering.  Part
of the charter of X3J11, apart from standardizing the common part of
existing practice, is to find fixes for problems in existing practice.
(If existing practice were free of problems, there would be no need for
a standard!)

At this point my feeling is that X3J11 will not want to add any more
"features" to the language, since they're trying to get the final
standard published as soon as possible.  However, if an extremely strong
case can be made for a new proposal, I'm sure it will still receive
serious consideration.

decot@hpisod2.HP.COM (Dave Decot) (03/07/88)

I write:
> > Aggregate constants are needed for data abstraction.

Henry Spencer responds:
> I guess I'm simple-minded; you'll have to explain in more detail.  I use
> data abstraction routinely and have never *needed* aggregate constants.

In this sense of "need", we obviously don't "need" strong typing,
prototypes, or "const" specifiers, either, since we have managed to
limp along without them for quite a long time.

See below for my explanation of a situation greatly simplified by the
ability to treat structures and arrays as if they were actually
manipulable objects instead of "special" things deserving special
attention.

I fail to see that the committee has drawn and observed "need" as
a clear boundary for what should be standardized.

> I also find it difficult to envision a situation in which it would be
> impossible to write
> 
> 	static const struct thingie xxx = { ... };
> 	...
> 	foo = xxx;
> 
> instead of
> 
> 	foo = { ... };

It isn't "impossible", obviously, but it makes it impossible for me to
provide a library involving such constants that would permit
applications to use values of that type as initializers without knowing
whether it was a structure type.

The support of aggregate constants would provide the ability to change
the representation of small abstract 'magic cookie' types from integer
types to more complicated types as the system evolves.  For instance, we
could have declared signal's second argument action_t, and later changed
it from an integer to a structure without having to introduce sigvec()
(or sigaction()) to support more complicated arguments.

For another instance, suppose I would like to be able to make clock_t or
dev_t a structure type (since the longest portable integral type is
"long"), and I would like to provide some constants of those types as
part of an interface to application library routines that use them.  At
the moment, these constants would only work correctly in initializers.

> > The problem of deciding what their type is solved for now (and possibly
> > always) by stating in the standard that such syntax has no inherent type
> > and must be cast or assigned to the desired type.
> 
> [expletive deleted]  Speaking as a user and an implementor, this is an
> abortion if there ever was one.

Thank you for the feedback.  But I wonder if you would be so kind as to
elaborate on this comment, unless it would dangerously elevate your
blood pressure.  :-)  I'm afraid it is not clear to me that all usages
of structure constants are self-explanatory.

I intended the above restriction to be a restriction on portable
applications to make it EASIER for implementations to determine what to
do with inline aggregate constants.  It also turns out that casts would
seldom be necessary under the given restriction.  I'm also not really
intending that the standard should preclude implementations that manage
to figure it out.

It would be fine if I didn't have to cast the constants to the
right type and the compiler could figure out that "func({13, 0, 0})"
was passing a constant of type "struct wow {unsigned short int x;
char *foo; char bar;}" without help, but I think code taking advantage
of that relaxation would be harder to read and I'd add the (struct wow)
cast anyway.

> preferably as an experimental variant and not as part of
> the effort to STANDARDIZE THE CURRENT LANGUAGE, DAMMIT!

Whatever you like.  I just thought it would be neat if it would work
the same way in all implementations that decide to provide this
natural extension.

>-- then the right
> way to do it is probably the GNU compiler's approach, which avoids this
> hideous botch entirely.

I wonder if you would mind summarizing that approach for those of us who
don't have access to that compiler's source code.

At any rate, your comment seems to imply that this is existing practice,
so I am having trouble seeing why this is a topic that should not be
standardized.

Thanks.

Dave Decot
hpda!decot
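
The library-constant case can be illustrated with C99 compound
literals.  This is a sketch under assumptions: `my_dev_t`,
`MY_DEV_NONE`, `is_none` and `demo` are hypothetical names, and a
C99-or-later compiler is assumed.  Using the constant to initialize an
object with static storage duration is not portable this way, so the
example stays inside a function.

```c
/* Hypothetical abstract type; applications should not depend on
 * whether it is an integer or a structure. */
typedef struct { long hi; unsigned long lo; } my_dev_t;

/* A constant of that type.  As a bare brace list it would only be
 * usable in initializers; as a compound literal it is an expression. */
#define MY_DEV_NONE  ((my_dev_t){ -1L, 0UL })

int is_none(my_dev_t d)
{
        return d.hi == -1L && d.lo == 0UL;
}

int demo(void)
{
        my_dev_t a = MY_DEV_NONE;       /* initializer (automatic object) */
        my_dev_t b;

        b = MY_DEV_NONE;                /* assignment */
        return is_none(a) && is_none(b)
            && is_none(MY_DEV_NONE);    /* function argument */
}
```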

flaps@dgp.toronto.edu (Alan J Rosenthal) (03/07/88)

I, flaps@dgp.toronto.edu (Alan J Rosenthal), wrote:
>>... because you can't specify shorts in expressions.  Any
>>attempt to fix this would probably be a very bad addition.

In article <2743@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>There is an obviously correct way to permit specification of short
>constants: append an s.  Exactly analogous to appending an l for long
>constants.

Not very analogous to appending an L for long constants.  Quite
analogous to appending an F for floating constants, as in the (current)
ansi draft.  It requires doing away with default promotion rules, as
they have for float -> double.  Your method of specifying shorts would
generate ints, just as (short)3 now does.

ajr
-- 
If you had eternal life, would you be able to say all the integers?

karl@haddock.ISC.COM (Karl Heuer) (03/09/88)

In article <139@puivax.UUCP> ian@puivax.UUCP (Ian Wilson) writes:
>Another place where anonymous objects would be handy is for functions
>embedded in data structures. For example: ...
>	(*(&remote( (*)(char * x), {printf("%s\n", x);} )));

Actually, to be consistent with my proposal the first operand of `remote`
should be the type of the object, not of the pointer.  Thus
	remote( void (char *x), {printf("%s\n", x);} )
would be equivalent to mentioning the (non-existent) name of the remote
function, hence this gives you a function pointer without using "&".  (As has
been pointed out to me by e-mail, I made this mistake in my char[] example.)
Using it as above would allow you to stuff it into a function table; or you
could invoke it via
	remote( void (char *x), {printf("%s\n", x);} )("hello, world");

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

flaps@dgp.toronto.edu (Alan J Rosenthal) (03/09/88)

decot@hpisod2.HP.COM (Dave Decot) wrote:
>The problem of deciding what their type is solved for now (and possibly
>always) by stating in the standard that such syntax has no inherent type
>and must be cast or assigned to the desired type.

henry@utzoo.uucp (Henry Spencer) responded:
>[expletive deleted]  Speaking as a user and an implementor, this is an
>abortion if there ever was one.

decot@hpisod2.HP.COM (Dave Decot) responded:
>Thank you for the feedback.  But I wonder if you would be so kind as to
>elaborate on this comment ...
>I'm afraid it is not clear to me that all usages of structure constants
>are self-explanatory.

They're not, but the point is that this is a totally new meaning for a
cast.  Usually, a cast is a unary operator.  Here, it's part of the
description, like a declaration.  To be specific: "(int)f" is an
expression that gets an int-ish value for the floating expression f.
It doesn't change f.  In your proposal, (struct a){...} and (struct b){...}
imply different values for {...}.  This is very inconsistent, even
worse than the inconsistencies between casting pointers and casting
numerical types.  Also, your (struct ..) cast will produce an lvalue,
which is also a big inconsistency.

I think it's clear that a new syntax, such as Karl Heuer's, is required.
I would prefer a symbol rather than a keyword, but that's just me.
(But: how about any of backquote, dollar, at (@), or right square bracket?
Perhaps unary comma?  Perhaps if the `&' is immediately followed by an
opening brace?)

ajr
-- 
If you had eternal life, would you be able to say all the integers?

decot@hpisod2.HP.COM (Dave Decot) (03/11/88)

> decot@hpisod2.HP.COM (Dave Decot) wrote:
> >The problem of deciding what their type is solved for now (and possibly
> >always) by stating in the standard that such syntax has no inherent type
> >and must be cast or assigned to the desired type.
> 
> They're not, but the point is that this is a totally new meaning for a
> cast.  Usually, a cast is a unary operator.  Here, it's part of the
> description, like a declaration.

No, I intend that as far as the language is concerned, {...} is an expression
that has a value, but there are only two operations defined on such values:
assigning it to an lvalue using '=', or casting it to the appropriate
aggregate type.

This concept is somewhat the cousin of (void *), which generates pointers to
values that are not usable directly; such pointers must be cast to some
other pointer type before the object to which they point can be used.

> To be specific: "(int)f" is an
> expression that gets an int-ish value for the floating expression f.
> It doesn't change f.

Depends on what you mean by "f".  This cast changes the bit pattern
used for representing the value, surely.   Pointer casts change the
type of the value to which the expression points, which may give them
a completely different abstract interpretation.

> In your proposal, (struct a){...} and (struct b){...} imply different
> values for {...}.

What do you mean by "different values"?  Of course they are different,
they're of different types!  If you mean "bit patterns", this is also
true of your cast example above.

Anyway, this same syntax represents the same type of typeless value
that could be said is used for initializers; the value of these
depends on the type of the variable which it initializes.

> This is very inconsistent, even worse than the inconsistencies
> between casting pointers and casting numerical types.

It's more similar to casting numerical types than casting pointers,
and I think it should be, in order to make these things look like
constants.

> Also, your (struct ..) cast will produce an lvalue,
> which is also a big inconsistency.

That is a misinterpretation, and I'm not sure how you arrived at it.
Casts only produce rvalues; I assumed this would be obvious.

Dave Decot
hpda!decot

henry@utzoo.uucp (Henry Spencer) (03/12/88)

> I fail to see that the committee has drawn and observed "need" as
> a clear boundary for what should be standardized.

They have tried.  Sometimes they've let their enthusiasm run away with
them; these are reprehensible lapses that should not be considered an
excuse to others to do likewise!

> > [expletive deleted]  Speaking as a user and an implementor, this is an
> > abortion if there ever was one.
> 
> Thank you for the feedback.  But I wonder if you would be so kind as to
> elaborate on this comment...

*Why* introduce a new notion of something that doesn't have a type (actually
it does have a type, some sort of curious mix of the types of the things
inside it) when it is easy to invent a syntax (or borrow the one from the
GNU compiler) that specifies the type?!?

> > ... probably the GNU compiler's approach, which avoids this
> > hideous botch entirely.
> 
> I wonder if you would mind summarizing that approach for those of us who
> don't have access to that compiler's source code.

As I recall it -- I have not studied the GNU compiler closely yet -- the
technique used is a sort of "cast with an initializer".

> At any rate, your comment seems to imply that this is existing practice,
> so I am having trouble seeing why this is a topic that should not be
> standardized.

"Existing practice" means that it has been out there for a while, that
people other than its implementors have used it at some length, and that
it has been used for more than just toy programs.  That does not happen
overnight, the GNU compiler is very new, and the draft standard is
(theoretically) in virtually its final state.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

peter@sugar.UUCP (Peter da Silva) (03/12/88)

In article ... henry@utzoo.uucp (Henry Spencer) writes:
> 	foo = { ... };

> It is agreed that the latter form is more convenient.  But we were talking
> about *needs*, in the context of an existing language, not about a wishlist
> for a new language.

I implemented just this construct in a copy of the Small-C compiler I was
playing around with about 6 years ago. I had just picked up a copy of the
BCPL book and wanted to play with the concepts.

I also implemented this:

	foo = { for(i = 0; i < 10; i++) if(...) break; i; };

in analogy to the BCPL:

	foo = $( ... resultis i; $)

Back to the subject.. the problem of what type an aggregate constant is
is a lot easier in Small-C. It's only got 4 types. But if you need prior art
to consider this, well here's two examples (half-smiley).

> way to do it is probably the GNU compiler's approach, which avoids this
> hideous botch entirely.

What's the GNU compiler's approach?
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

karl@haddock.ISC.COM (Karl Heuer) (03/14/88)

In article <8803091705.AA09183@explorer.dgp.toronto.edu> flaps@dgp.toronto.edu (Alan J Rosenthal) writes:
>I think it's clear that a new syntax, such as Karl Heuer's, is required.
>I would prefer a symbol rather than a keyword, but that's just me.

I'd prefer a less verbose syntax, too, but I can't think of anything that
isn't arbitrary and doesn't overload an existing syntax.

>(But: how about any of backquote, dollar, at (@), or right square bracket?
>Perhaps unary comma?  Perhaps if the `&' is immediately followed by an
>opening brace?)

Void the first three; I don't think we should add to the character set for
something this simple.  Unbalanced brackets would be a big headache (and
possibly an ambiguity; I wouldn't want to bet that this construct can never
appear inside brackets).  With unary comma, or ampersand-brace, where are you
going to put the type information?  And in the latter case, if you want the
aggregate itself rather than a pointer would you have to write `*&{...}`?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

franka@mmintl.UUCP (Frank Adams) (03/16/88)

>They're not, but the point is that this is a totally new meaning for a
>cast.  Usually, a cast is a unary operator.  Here, it's part of the
>description, like a declaration.  ...  In your proposal, (struct a){...}
>and (struct b){...} imply different values for {...}. ...  Also, your
>(struct ..) cast will produce an lvalue, which is also a big inconsistency.

This is the wrong interpretation of this syntax.  If such a syntax is
adopted, the correct interpretation runs as follows:

	An expression of the form {...} is of an anonymous struct type,
	whose components are unnamed, and have the types of the
	expressions given.  Casting a struct to another struct results
	in an element by element cast of the components of the first
	struct to the components of the second.  Likewise for casting a
	struct to an array.

	Taking the address of a constant results in a literal copy of
	the constant being allocated, and the result is a constant
	pointer to that literal.
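
For comparison, C99 later standardized a cast-like spelling, the
compound literal `(type){...}`.  It is not a cast of an anonymous struct
as described above (the type name is part of the literal itself, and the
result is an lvalue), but it does let the pointer table from the start of
this thread be written without a hidden helper array.  A minimal sketch,
assuming a C99 or later compiler; `struct foo` and `table` follow the
original proposal, while `table_len` is a made-up helper:

```c
#include <stddef.h>

struct foo {
    char *name;
    int   age;
};

/* At file scope a C99 compound literal has static storage duration,
 * so its address is a constant usable in a static initializer. */
struct foo *table[] = {
    &(struct foo){ "Bill", 25 },
    &(struct foo){ "Jim",  10 },
    &(struct foo){ "Bob",  40 },
};

/* Number of entries in the table (hypothetical helper). */
size_t table_len(void)
{
    return sizeof table / sizeof table[0];
}
```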

I don't believe that K&R anywhere specify the semantics of casting structs;
perhaps X3J11 does.  I am sure that there is a fair amount of (non-portable)
code out there which assumes that such casts are interpreted as "take as".
Just how much, I don't know.

The main problem with this proposal is the parsing problem.  When do {}'s
enclose a struct literal, and when a compound statement?  The parser
must be able to tell when it sees the first {.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

henry@utzoo.uucp (Henry Spencer) (03/20/88)

>I don't believe that K&R anywhere specify the semantics of casting structs;
>perhaps X3J11 does.  I am sure that there is a fair amount of (non-portable)
>code out there which assumes that such casts are interpreted as "take as".

I doubt it, given that most existing compilers refuse to cast structs at all.
X3J11 says you can't cast structs (except to void).  K&R doesn't mention the
issue at all, since it dates back to a time when there were no struct values
at all.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

john@viper.Lynx.MN.Org (John Stanley) (03/24/88)

In article <2768@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
 >>They're not, but the point is that this is a totally new meaning for a
 >>cast.  Usually, a cast is a unary operator.  Here, it's part of the
 >>description, like a declaration.  ...  In your proposal, (struct a){...}
 >>and (struct b){...} imply different values for {...}. ...  Also, your
 >>(struct ..) cast will produce an lvalue, which is also a big inconsistency.
 >
 >This is the wrong interpretation of this syntax.  If such a syntax is
 >adopted, the correct interpretation runs as follows:

  While I agree that the original poster had the wrong idea (it's
pretty obvious that a constant, struct or not, can't possibly be an
lvalue), I somewhat resent the attitude that your interpretation is 
"The one and only" correct one.  I happen to disagree with part of your
interpretation.  Does that mean I'm wrong, or just that there's still
discussion necessary?

 >	An expression of the form {...} is of an anonymous struct type,
 >	whose components are unnamed, and have the types of the
 >	expressions given.  Casting a struct to another struct results
 >	in an element by element cast of the components of the first
 >	struct to the components of the second.  Likewise for casting a
 >	struct to an array.

  I think you're on the right track, and I can see some definite
usefulness in the definition as you've defined it, but it looks like
we might run into some problems in implementing it.  What if the
struct(s) have a different number of elements?  What happens if you
cast a struct containing a long into a character array?  This is a
start, but it needs more work.

 >	Taking the address of a constant results in a literal copy of
 >	the constant being allocated, and the result is a constant
 >	pointer to that literal.

  I can't see any usefulness to this part of your proposal.  It
introduces the necessity for dynamic creation of data with the
mechanism completely hidden from the programmer.  If the programmer
wants a copy, let the programmer do it.  Let the programmer treat a
constant struct as a constant, not something that's going to move
every time he/she calls a function containing one.
  A constant struct should be allocated as part of the constant data
page same as any string.  Currently, there's no difference between:
   char aba[10] = "abcdefghi";
      and
   char aba[10] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', '\0'};
and I can't see any reason we should change this aspect of the
language...
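
The equivalence claimed here is easy to check.  A minimal sketch; the
names `from_string`, `from_braces` and `same_contents` are invented for
illustration:

```c
#include <string.h>

/* Both arrays have 10 elements: nine letters plus the terminating NUL. */
static char from_string[10] = "abcdefghi";
static char from_braces[10] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', '\0'};

/* Returns 1 if the two arrays hold identical bytes, 0 otherwise. */
int same_contents(void)
{
    return sizeof from_string == sizeof from_braces
        && memcmp(from_string, from_braces, sizeof from_string) == 0;
}
```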

 >The main problem with this proposal is the parsing problem.  When do {}'s
 >enclose a struct literal, and when a compound statement?  The parser
 >must be able to tell when it sees the first {.

  The parser doesn't need to know anything.  It's not the parser's job
to know anything about compound statements or structs...  The compiler,
on the other hand, should be able to differentiate from context (same
way it would when compiling a variable definition line like the char
array one given above).

 >Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
 >Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

--- 
John Stanley (john@viper.UUCP)
Software Consultant - DynaSoft Systems
UUCP: ...{amdahl,ihnp4,rutgers}!meccts!viper!john

franka@mmintl.UUCP (Frank Adams) (03/29/88)

In article <744@viper.Lynx.MN.Org> john@viper.Lynx.MN.Org (John Stanley) writes:
>In article <2768@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
> >[Quoting Henry Spencer, if memory serves.  Henry, just to set the record
> > straight, was commenting on someone else's proposal, not mine.]
> >>They're not, but the point is that this is a totally new meaning for a
> >>cast.  Usually, a cast is a unary operator.  Here, it's part of the
> >>description, like a declaration.  ...  Also, your (struct ..) cast will
> >>produce an lvalue, which is also a big inconsistency.
> >
> >This is the wrong interpretation of this syntax.  If such a syntax is
> >adopted, the correct interpretation runs as follows:
>
>  While I agree that the original poster had the wrong idea, I somewhat
>resent the attitude that your interpretation is "The one and only"
>correct one.

The attitude you resent is a figment of your imagination.  I was, of course,
just expressing my opinion.  You are free to agree or disagree.  I've even
been wrong once or twice in my life (:-).

> >Casting a struct to another struct results in an element by element cast
> >of the components of the first struct to the components of the second.
> >Likewise for casting a struct to an array.
>
>What if the struct(s) have a different number of elements?

To match the K&R initialization syntax as closely as possible, when casting
to a struct with more elements, set the excess to zeros.  When casting to a
struct with fewer elements, one could either ignore the extra ones, or
report an error from the compiler.  I would lean toward the latter.
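
The zero-fill rule being borrowed is the one existing static
initializers already follow: members with no initializer are set to
zero.  A small sketch of that existing behaviour (`struct three`,
`partial` and `excess_is_zero` are hypothetical names):

```c
struct three {
    int   a;
    int   b;
    char *c;
};

/* Only the first member is given; the remaining members are zeroed. */
static struct three partial = { 1 };

/* Returns 1 when the members left out of the initializer are zero. */
int excess_is_zero(void)
{
    return partial.a == 1 && partial.b == 0 && partial.c == 0;
}
```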

This does mean that extra braces in initializations become non-optional.
That is, one can no longer write:

int x[2][2] = {1, 2, 3, 4};

but must instead use

int x[2][2] = {{1, 2}, {3, 4}};

This is a non-trivial change.  It may be possible to adjust the definition
so that this does not happen; I'll have to think about it.
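
For reference, existing C treats the two spellings identically, which is
the behaviour the proposal as stated would give up.  A minimal sketch
(the array and function names are illustrative only):

```c
#include <string.h>

/* Inner braces elided: the initializers fill the rows in order. */
static int flat[2][2]   = {1, 2, 3, 4};
/* Fully braced: one inner list per row. */
static int nested[2][2] = {{1, 2}, {3, 4}};

/* Returns 1 if both arrays were initialized to the same contents. */
int same_layout(void)
{
    return memcmp(flat, nested, sizeof flat) == 0;
}
```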

>What happens if you cast a struct containing a long into a character array?

The long gets converted to a character.  C remains an industrial-strength
language.
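
Spelled out by hand in existing C, the element-by-element conversion
would look roughly like this.  It is only a sketch of the proposed
semantics, not an existing language feature; `struct holder` and
`to_chars` are hypothetical, and `unsigned char` is used so that the
narrowing conversion is well defined:

```c
struct holder {
    long big;
    int  small;
};

/* Convert each member to unsigned char, one element at a time.
 * Conversion to an unsigned type wraps modulo UCHAR_MAX + 1. */
void to_chars(const struct holder *src, unsigned char dst[2])
{
    dst[0] = (unsigned char)src->big;
    dst[1] = (unsigned char)src->small;
}
```

With 8-bit characters, a `long` holding 321 ends up as 65: the high-order
bits are silently discarded.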

> >	Taking the address of a constant results in a literal copy of
> >	the constant being allocated, and the result is a constant
> >	pointer to that literal.
>
>  I can't see any usefulness to this part of your proposal.  It
>introduces the necessity for dynamic creation of data with the
>mechanism completely hidden from the programmer.

You misunderstand.  I want the *compiler* to allocate the literal copy, not
the *program*.  The same way it does now for statements like:

char * p = "This is a string.";

>Currently, there's no difference between:
>   char aba[10] = "abcdefghi";
>      and
>   char aba[10] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', '\0'};
>and I can't see any reason we should change this aspect of the language...

Right.  Under this proposal, there is also no difference between:
   char *abc = "abcdefghi";
      and
   char *abc = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', '\0'};

Whereas the latter is currently illegal.

> >The main problem with this proposal is the parsing problem.  When do {}'s
> >enclose a struct literal, and when a compound statement?  The parser
> >must be able to tell when it sees the first {.
>
>It's not the parser's job to know anything about compound statements or
>structs...  The compiler, on the other hand, should be able to
>differentiate from context (same way it would when compiling a variable
>definition line like the char array one given above).

Guess again.  "The part of the compiler which differentiates things from
context" is a fairly good definition of the parser.  You are perhaps
confusing it with the scanner, which divides the source code into tokens.
(This terminology is not completely standard, but very nearly so.)

Roughly, the problem is that one must look arbitrarily far ahead in the
source code to disambiguate the two constructs in some cases.  Modern
parsing techniques depend on being able to make that decision after
looking only a short, bounded distance ahead.

Compare, for example, the following statements:

   {1, 2, 3, 4, 5, 6, 7;}

vs

   {1, 2, 3, 4, 5, 6, 7};

Now, neither of these statements actually does anything, but both are legal
C statements (the first already, the second with the proposed enhancement).
The first is a compound statement, containing no declarations and a single
enclosed statement.  That statement is an expression, involving 7 constants,
and 6 "," operators.  The second is an expression statement; the expression
is an anonymous structure constant, with 7 components.  Yet, we cannot tell
them apart until we get practically to the end.
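
The lookahead problem can be made concrete with a toy classifier.  This
is a sketch only: it works on raw characters rather than tokens, ignores
strings, comments and empty braces, and the names `classify_brace`,
`BRACE_STATEMENT`, `BRACE_LITERAL` and `BRACE_UNKNOWN` are invented for
illustration.  It reports how many characters had to be read before the
two cases could be told apart:

```c
#include <stddef.h>

enum brace_kind { BRACE_STATEMENT, BRACE_LITERAL, BRACE_UNKNOWN };

/* src must start with '{'.  Scan forward until either a ';' is seen
 * inside the braces (so this must be a compound statement) or the
 * matching '}' is reached without one (so this is a literal).
 * *examined receives the number of characters read to decide. */
enum brace_kind classify_brace(const char *src, size_t *examined)
{
    size_t i;
    int depth = 0;

    for (i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        if (c == '{') {
            depth++;
        } else if (c == ';' && depth >= 1) {
            *examined = i + 1;
            return BRACE_STATEMENT;
        } else if (c == '}') {
            depth--;
            if (depth == 0) {
                *examined = i + 1;
                return BRACE_LITERAL;
            }
        }
    }
    *examined = i;          /* ran off the end without deciding */
    return BRACE_UNKNOWN;
}
```

For the two inputs above, the classifier has to read 21 characters in
each case, that is, up to the final `;` or `}`, and a longer list would
push the decision point correspondingly further out.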
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108