[comp.std.c++] Randomly ordered fields !?!?

rfg@NCD.COM (Ron Guilmette) (08/04/90)

In article <56268@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>In article <56165@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>>Proposed:
>>
>>We need a way to "pack" objects over inheritance boundaries, and possibly
>>between labeled fields of a declaration.  I do not propose here exactly
>>what the "right" way to do this is.
>
>I should clarify that I was not proposing that every compiler actually has
>to do the packing, nor that there need to be a standard way to do that 
>packing.  What I was proposing is that we need a way to turn off the standard
>C++ field ordering restrictions, such that a compiler can choose to lay out
>those fields in any order it feels is appropriate.  Perhaps the easiest way 
>to accomplish this is just throw away the traditional restrictions on field
>ordering.

There already exists a defined way within the language to "turn off the
standard C++ field ordering restrictions".

Section 9.2 (page 173, middle of the page) in E&S says:

	"The order of allocation of nonstatic data members {which are}
	separated by an access specifier is implementation dependent (11.1)".

Therefore, the members of the following type may legally be laid out in
*any* arbitrary order by a "standard conforming" compiler.

	struct S {
		int f1:22;
	public:
		int f2:10;
	public:
		int f3:3;
	public:
		int f4:5;
	public:
		char f5;
	public:
		short f6;
	};

I personally find the rule cited above intensely offensive because it gives
the compiler licence to thwart the will of the programmer (who is supposed
to be in control after all).  I don't like that one little bit!

(I tried to point out this problem here once before, but either nobody was
listening or else nobody cared).

I am *not* in the camp that believes that the compiler should be given even
more leeway to mess around with my intentions, and to change my intended
meaning.  Quite the opposite.

I'm a big boy now.  I can establish a good and proper ordering for the data
members within my structs and classes all by myself, and without any help
from the compiler, thank you.  In fact I would *prefer* to do it myself,
because I believe that it is far more likely that I will understand where
things ought to go (or must go) than the compiler will.

Consider the following:

	class C {
		unsigned private_member_1:22;
	public:
		unsigned public_member_2:10;
	private:
		unsigned private_member_3:7;
	public:
		unsigned public_member_4:25;

		C ();
		// ... other member functions
	};

Now I wrote the fields in the order shown so that they would fit perfectly
into two words *only* as long as they are actually allocated in the order
shown.  Unfortunately, the rule cited above allows "standard conforming"
compilers to thwart my intentions and to reorder the fields in any damn
order they please.

The implication is that I simply cannot make use of access specifiers (and
the benefits they can provide) for any class or struct where field ordering
is critical (either for reasons of space economy or because I am dealing
with some hardware-defined layout, such as commonly arises in cases of
memory-mapped I/O devices with multiple control/data registers).

So what I'm trying to say is that we DO NEED to have some way of telling
compilers "pack the last field of the parent class into the same word
(if possible) as the first field of the derived class" but what we DON'T
NEED is the current rule that allows the compiler to rearrange fields
any way it pleases.

These issues are mostly separate, until you see that the achievement of
"good" packing between a base class and a derived class may depend upon
the ability of the programmer to force a particular (packable) field
to be allocated *last* within a struct or class.

Right now, if you want to (or need to) use access-specifiers within the
base class, it is IMPOSSIBLE to select (or to force) the proper allocation.



-- 
// Ron Guilmette
// C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

jimad@microsoft.UUCP (Jim ADCOCK) (08/17/90)

In article <1070@lupine.NCD.COM> rfg@ncd.com writes:
>There already exists a defined way within the language to "turn off the
>standard C++ field ordering restrictions".
>
>Section 9.2 (page 173, middle of the page) in E&S says:
>
>	"The order of allocation of nonstatic data members {which are}
>	separated by an access specifier is implementation dependent (11.1)".

Yes.  So given that access specifiers and/or inheritance can bust people's
expectations on packing order, why maintain restrictions on packing order
at all?  Or why not require special actions, such as adding an `extern "C"'
caveat, to be used by people who have the unusual need of matching machine
registers?  What percentage of C++ programmers need to diddle machine
registers, and how often?

>I personally find the rule cited above intensely offensive because it gives
>the compiler licence to thwart the will of the programmer (who is supposed
>to be in control after all).  I don't like that one little bit!

Actually, the will of this programmer is that C++ compilers generate the
fastest and most compact code possible from my class declarations.  It 
is not C++ compilers that are thwarting this will, but rather people who
want to continue using C++ as a C assembly language.  I want to be able
to use C++ as a fast, efficient object-oriented language.  This requires
compilers that have great optimization, and great freedom to optimize.

There continues this division between C/C++ people who want a compiler
to "do what I say" versus "do your best optimizations."  I think this
issue should be addressed in the marketplace -- not in the language.
If you want a C++ compiler that acts just like C that acts just like
assembler, then surely the marketplace will offer such a compiler from
some vendor.  If you want to use C++ to play with machine registers,
then portability is not an issue to you anyway.  -- Or alternatively, 
probably just about any vendor's compiler with all optimizations turned 
off will fulfill your requirements.

Again, why should everyone's C++ compiler have to generate slower,
more memory intensive code in order to meet the needs of people who
feel they want to maintain intimate control over their compiler?
Why can't they just buy a compiler that gives them that control?

>So what I'm trying to say is that we DO NEED to have some way of telling
>compilers "pack the last field of the parent class into the same word
>(if possible) as the first field of the derived class" but what we DON'T
>NEED is the current rule that allows the compiler to rearrange fields
>any way it pleases.

I disagree.  A common C++ hack I've seen in a number of libraries is
an attempt to inherit partial bit fields, so that a base class can
use a couple of bits for flags, and subsequent derived classes can steal
a couple of bits for their own use.  You can't do this with bit-fields right
now, and have to use macro/enum hacks instead.

rfg@NCD.COM (Ron Guilmette) (08/18/90)

In article <56638@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>In article <1070@lupine.NCD.COM> rfg@ncd.com writes:
>>There already exists a defined way within the language to "turn off the
>>standard C++ field ordering restrictions".
>>
>>Section 9.2 (page 173, middle of the page) in E&S says:
>>
>>	"The order of allocation of nonstatic data members {which are}
>>	separated by an access specifier is implementation dependent (11.1)".
>
>Yes.  So given that access specifiers and/or inheritance can bust people's
>expectations on packing order, why maintain restrictions on packing order
>at all?

Perhaps you missed my point.  I was suggesting that the rule quoted above
be REVOKED and DELETED from the final ANSI standard.

>...  What percentage of C++ programmers need to diddle machine
>registers, and how often?

What percentage of C++ programmers *need* to use default parameters?  What
percentage of C++ programmers *need* to use multiple inheritance?

In the case of each of these language features a good case can be made that
0% of C++ programmers *need* these features.  Should we therefore chuck them
out of the language?  I think not.

The question really should be "How much harder would doing some reasonably
common programming task, X, be if feature (or constraint) F were omitted
from the language?" and also "How difficult is it for a typical implementor
to implement feature (or constraint) F?"

In the case of a constraint which would require compilers to maintain the
programmer-specified ordering of data fields, the implementation is trivially
easy.  (In fact, it is actually *harder* to do anything else!)  Also, the
usefulness of allowing the programmer to precisely specify his desired
field ordering is (in my opinion) quite large.  As I noted in a prior
posting, programmers may often need to do this when dealing with data
formats dictated by outside constraints (e.g. existing disk formats, existing
datacomm protocols, memory mapped I/O devices, and (last but not least)
layouts mandated by external routines (e.g. from libc.a) written in other
languages such as C).

>>I personally find the rule cited above intensely offensive because it gives
>>the compiler licence to thwart the will of the programmer (who is supposed
>>to be in control after all).  I don't like that one little bit!
>
>Actually, the will of this programmer is that C++ compilers generate the
>fastest and most compact code possible from my class declarations.

Are you suggesting that it may be possible to generate more efficient
executable code if arbitrary field re-ordering (by the compiler) is allowed?
If so, are you prepared to give realistic (not contrived) examples of such
improvements?

If (on the other hand) you are only suggesting that a compiler could possibly
pack fields in some better order than I can, let me assure you that you are
wrong.  If you would care to organize a test, I'll be glad to pit my brain
against your compiler in a "structure packing" test any day of the week.
As a matter of fact, I surmise that the intelligence necessary to do good
structure packing is not even beyond the reach of your common garden slug.
This fact makes me all the more confused about your insistence that you
want compilers to do this for you.  Are you low-rating yourself? :-)

>... There continues to be a division between those who want a compiler
>to "do what I say" versus "do your best optimizations."  I think this
>issue should be addressed in the marketplace -- not in the language.
>If you want a C++ compiler that acts just like C that acts just like
>assembler, then surely the marketplace will offer such a compiler from
>some vendor.  If you want to use C++ to play with machine registers,
>then portability is not an issue to you anyway.

You have made a rash generalization which is untrue.  I may be concerned with 
data formats (on disk or in a communications channel) and I may also be
concerned with portability.  Furthermore, I may even be diddling machine
registers and I may *still* be concerned with portability!!!  Obviously,
you have never seen any C code which drives some specialized peripheral
chip (e.g. an ethernet controller) which gets ported from one CPU to
another as that peripheral chip gets used in different kinds of systems.
I have seen this though.

Also, independent of the portability issue, there is also the issue of
"vendor independence".  Even if my code stays forever on this one system,
if I find a new compiler that I think generates better code, and I buy it
and install it and recompile my old code with it, I'd kinda like that old
code to continue working.  I don't want to see it start to core dump just
because compiler writers were given the freedom to rearrange my field
orderings willy-nilly and because they all choose to do it differently.

Right now, if I have the following C code:

	struct s {
		unsigned	field1:24;
		unsigned	field2:8;
	};

and this is in a program running on a 68000, I can get C compilers from at
least a dozen vendors that will lay this out *exactly* the same way.  Thus,
I have vendor independence.  That (and portability) are worth money to me.

Now I find out that I've lost this with C++, and I've gotten virtually
nothing in return. :-(

>... Or alternatively, 
>probably just about any vendor's compiler with all optimizations turned 
>off will fulfill your requirements.

Not if it is not *required* by the standard!!!

>Again, why should everyone's C++ compiler have to generate slower,
>more memory intensive code in order to meet the needs of people who
>feel they want to maintain intimate control over their compiler?
>Why can't they just buy a compiler that gives them that control?

Again, you have presented no evidence whatsoever that obeying the programmer's
wishes with regard to field ordering will result in slower (or "more memory
intensive") code in any real programs.  Until you do, this argument should
be viewed as simply a red herring.  I believe that this argument is bogus,
and that it has no basis in fact.  Programmers need not suffer any performance
penalty whatsoever in order to have the compiler obey their wishes.

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

marc@dumbcat.sf.ca.us (Marco S Hyman) (08/19/90)

jimad> = jimad@microsoft.UUCP (Jim ADCOCK)
	 article <56638@microsoft.UUCP>

jimad> If you want to use C++ to play with machine registers, then
jimad> portability is not an issue to you anyway.  -- Or alternatively,
jimad> probably just about any vendor's compiler with all optimizations
jimad> turned off will fulfill your requirements.

Let's make a distinction between code that is portable to different machines
and code that is portable between different compilers on the same machine.
Needing to play with machine registers will certainly limit machine
portability.  But I certainly want to compile my machine dependent code with
*any* C++ compiler and have the code work.

This means that any special optimizations to make the code faster or tighter
must not be the compiler's default.  Once I've got things working then I'll
profile the code, check out better algorithms, and look at the optimizations
available with the compiler.

// marc
-- 
// marc@dumbcat.sf.ca.us
// {ames,decwrl,sun}!pacbell!dumbcat!marc

amanda@iesd.auc.dk (Per Abrahamsen) (08/19/90)

>>>>> On 18 Aug 90 07:46:50 GMT, rfg@NCD.COM (Ron Guilmette) said:

rfg> As a matter of fact, I surmise that the intelligence necessary to
rfg> do good structure packing is not even beyond the reach of your
rfg> common garden slug.  This fact makes me all the more confused
rfg> about your insistence that you want compilers to do this for you.
rfg> Are you low-rating yourself? :-)

If it is that easy, then even a computer should be able to do it.  Why
should the programmer waste his time doing work which the compiler
could do just as easily?  Maybe he prefers to spend his time arranging
the fields in an order which makes it easy for humans to understand?

jimad@microsoft.UUCP (Jim ADCOCK) (08/21/90)

In article <195@dumbcat.sf.ca.us> marc@dumbcat.sf.ca.us (Marco S Hyman) writes:
>jimad> = jimad@microsoft.UUCP (Jim ADCOCK)
>	 article <56638@microsoft.UUCP>
>
>jimad> If you want to use C++ to play with machine registers, then
>jimad> portability is not an issue to you anyway.  -- Or alternatively,
>jimad> probably just about any vendor's compiler with all optimizations
>jimad> turned off will fulfill your requirements.
>
>Let's make a distinction between code that is portable to different machines
>and code that is portable between different compilers on the same machine.
>Needing to play with machine registers will certainly limit machine
>portability.  But I certainly want to compile my machine dependent code with
>*any* C++ compiler and have the code work.

You've never had anything close to that capability in the past, so the changes
I am suggesting aren't taking anything away from you.  Nor is there
anything in the standardization effort I can think of that could even
possibly give you these capabilities.  What might emerge -- eventually --
is de facto standards where vendors of compilers on a particular machine
agree on how things should be done on that machine.  But I claim the
marketplace is the right place to make such decisions.  Let's not hamstring
compilers from the start.

>This means that any special optimizations to make the code faster or tighter
>must not be the compiler's default.  Once I've got things working then I'll
>profile the code, check out better algorithms, and look at the optimizations
>available with the compiler.

Some special optimizations to make code faster or tighter are the default
even in today's C compilers.  Even if you turn off all optimizations, 
typical compilers still perform some particularly 'safe' optimizations.

Historically, optimizations have primarily been applied to callee code --
the body of functions.  In OOP the problem shifts towards caller optimizations
-- the way a function gets called.  So C optimization problems don't carry
over very well to C++ optimization problems.  Also, inline functions and
templates generate pretty 'bad' code -- overly general code that needs
to be customized by an optimizer to meet the particular needs of the 
calling environment.

jimad@microsoft.UUCP (Jim ADCOCK) (08/21/90)

In article <1229@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>In article <56638@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>>In article <1070@lupine.NCD.COM> rfg@ncd.com writes:
>>>There already exists a defined way within the language to "turn off the
>>>standard C++ field ordering restrictions".
>>>
>>>Section 9.2 (page 173, middle of the page) in E&S says:
>>>
>>>	"The order of allocation of nonstatic data members {which are}
>>>	separated by an access specifier is implementation dependent (11.1)".
>>
>>Yes.  So given that access specifiers and/or inheritance can bust people's
>>expectations on packing order, why maintain restrictions on packing order
>>at all?
>
>Perhaps you missed my point.  I was suggesting that the rule quoted above
>be REVOKED and DELETED from the final ANSI standard.

Gee, you mean we agree?

>Are you suggesting that it may be possible to generate more efficient
>executable code if arbitrary field re-ordering (by the compiler) is allowed?

Yes.

>If so, are you prepared to give realistic (not contrived) examples of such
>improvements?

Yes and no.  Clearly, I am not free to talk about MS future plans.
Hopefully, even a few obvious examples should demonstrate the plausibility of
changes in packing order leading to faster code.

1) Fields used frequently together could be moved together, speeding code
on machines with bus prefetch.  "Fields frequently used together" could
be automatically determined via compiler profiling, and might actually 
be different for different usages of the same class in different programs.

2) Vtable pointers could be moved close to the "this" address, allowing
machine code with short offsets to be generated in these important cases.
Again, one class used differently in different programs might lead to
different layout choices.

3) char "booleans" or shorts could be packed into a base class, leading
to smaller total memory usage, fewer page turns, and thus faster execution
in practice.

>If (on the other hand) you are only suggesting that a compiler could possibly
>pack fields in some better order than I can, let me assure you that you are
>wrong.  If you would care to organize a test, I'll be glad to pit my brain
>against your compiler in a "structure packing" test any day of the week.
>As a matter of fact, I surmise that the intelligence necessary to do good
>structure packing is not even beyond the reach of your common garden slug.
>This fact makes me all the more confused about your insistence that you
>want compilers to do this for you.  Are you low-rating yourself? :-)

Not at all.  I don't want to have to act like an assembler for the 
compiler.  Let the compiler do the low stuff, and I'll do the high
stuff.  In the face of inheritance, only compilers can do packing-in
optimizations without violating encapsulation.  Joe ought to be able
to write an efficient base class, and Sue ought to be able to derive
from that base class, and the derivation should make good use of the
available space in the base class.  Even if you could explicitly state
how you want compilers to pack-in fields, the resulting code would
not be portable, because differing machines are going to have differing
alignment constraints, leading to differing "holes" in the original
base class.

>
>Right now, if I have the following C code:
>
>	struct s {
>		unsigned	field1:24;
>		unsigned	field2:8;
>	};
>
>and this is in a program running on a 68000, I can get C compilers from at
>least a dozen vendors that will lay this out *exactly* the same way.  Thus,
>I have vendor independence.  That (and portability) are worth money to me.
>
Agreed. That's why I suggested that one approach might be that "struct"
means the same layout as C, as long as you don't derive.  Then all your
C code continues to work.  One can think up reasons why one might want
to use inheritance in matching machine registers, but that seems pushing it,
to my mind.
>
>Not if it is not *required* by the standard!!!

I wasn't proposing any new requirements on compilers.  I was proposing
relaxing requirements on compilers that were intended as a weak 
restriction towards making C++ classes work like C structs.

---

Ron continues on, arguing strongly for the advantages of manual clustering
over automatic clustering.  I disagree.  I think performance in other
OOPLs and OODBs shows that manual clustering doesn't work, because patterns
of usage do not remain constant.  Thus clustering needs to be done 
automatically, adapting to changing patterns of usage.

roland@ai.mit.edu (Roland McGrath) (08/21/90)

In article <1229@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:

   The question really should be "How much harder would doing some reasonably
   common programming task, X, be if feature (or constraint) F were omitted
   from the language?" and also "How difficult is it for a typical implementor
   to implement feature (or constraint) F?"

I disagree.  I don't see why difficulty of implementation should be an issue in
the language standard.  It's not a list of implementation guidelines; it's a
standard for how the language behaves.  You seem to presume that the people
contributing to the standard will be at least as clever as any implementor of
the language.  I find this a completely ridiculous presumption.  Just because
you can't think of a way to do something, you want to disallow others from
attempting it.  I can't think of a way to do it either, but I would fight for
the right of others to make the attempt.

   In the case of a constraint which would require compilers to maintain the
   programmer-specified ordering of data fields, the implementation is trivially
   easy.  (In fact, it is actually *harder* to do anything else!)

So obviously lazy implementors will do what you want, and clever/hardworking
implementors will do something more.

   Also, the usefulness of allowing the programmer to precisely specify his
   desired field ordering is (in my opinion) quite large.

Although I think I consider this less useful than you do, I am in favor of
retaining a way to force a given field order.  Using:

extern "C"
{
  struct foo
    {
      char elt0;
      int elt1;
    };
}

does this quite well.

   As I noted in a prior posting, programmers may often need to do this when
   dealing with data formats dictated by outside constraints (e.g. existing
   disk formats, existing datacomm protocols, memory mapped I/O devices, and
   (last but not least) layouts mandated by external routines (e.g. from
   libc.a) written in other languages such as C).

Indeed true.  This is why I advocate giving them a way to do so, shown above.

   >>I personally find the rule cited above intensely offensive because it gives
   >>the compiler licence to thwart the will of the programmer (who is supposed
   >>to be in control after all).  I don't like that one little bit!
   >
   >Actually, the will of this programmer is that C++ compilers generate the
   >fastest and most compact code possible from my class declarations.

   Are you suggesting that it may be possible to generate more efficient
   executable code if arbitrary field re-ordering (by the compiler) is allowed?
   If so, are you prepared to give realistic (not contrived) examples of such
   improvements?

I'm not so prepared.  Are you prepared to prove it impossible?

   If (on the other hand) you are only suggesting that a compiler could possibly
   pack fields in some better order than I can, let me assure you that you are
   wrong.  If you would care to organize a test, I'll be glad to pit my brain
   against your compiler in a "structure packing" test any day of the week.

Given that all present compilers are bound by the present rules, this idea is
somewhat ridiculous.  I expect the standard to outlive present or near-future
implementations, however.  Otherwise, what's the point of having a standard?

   As a matter of fact, I surmise that the intelligence necessary to do good
   structure packing is not even beyond the reach of your common garden slug.
   This fact makes me all the more confused about your insistence that you
   want compilers to do this for you.  Are you low-rating yourself? :-)

Where do you find these garden slugs with absolute prescience of all possible
future computer architectures and C++ implementations?

   Right now, if I have the following C code:

	   struct s {
		   unsigned	field1:24;
		   unsigned	field2:8;
	   };

   and this is in a program running on a 68000, I can get C compilers from at
   least a dozen vendors that will lay this out *exactly* the same way.  Thus,
   I have vendor independence.  That (and portability) are worth money to me.

This is an excellent example.  Since we are speaking of C++, and the example is
C code, you can simply put it within `extern "C"' and have all the guarantees
you now have with the C code.

   Now I find out that I've lost this with C++, and I've gotten virtually
   nothing in return. :-(

You have not lost this functionality.  You simply must use a slightly different
construct.  Since you are using a significantly different language, you should
consider yourself lucky that the necessary change is so minor.

By my standards, the possibility of many new optimizations as yet unconceived
of is not "virtually nothing".

   >... Or alternatively, 
   >probably just about any vendor's compiler with all optimizations turned 
   >off will fulfill your requirements.

   Not if it is not *required* by the standard!!!

Quite true.  An excellent argument for why field reordering should not be
allowed within `extern "C"'.

   >Again, why should everyone's C++ compiler have to generate slower,
   >more memory intensive code in order to meet the needs of people who
   >feel they want to maintain intimate control over their compiler?
   >Why can't they just buy a compiler that gives them that control?

   Again, you have presented no evidence whatsoever that obeying the
   programmer's wishes with regard to field ordering will result in slower (or
   "more memory intensive") code in any real programs.  Until you do, this
   argument should be viewed as simply a red herring.

You again presume foreknowledge of all possible future implementations.  I find
such arrogance surprising from an otherwise reasonable person.

   I believe that this argument is bogus, and that it has no basis in fact.

I consider it a fact that some C++ compiler implementor may in the future come
up with a way to improve efficiency through field reordering.  Until such a way
is proven impossible, this remains a fact.

   Programmers need not suffer any performance penalty whatsoever in order to
   have the compiler obey their wishes.

This is true if and only if your definition of "performance" includes only
run-time program speed.  While this type of performance is of great concern to
me, of equal concern is programmer performance.  Programmer performance
degrades greatly when the programmer is forced by constraints of the language
to do work that the compiler could be doing.
--
	Roland McGrath
	Free Software Foundation, Inc.
roland@ai.mit.edu, uunet!ai.mit.edu!roland

stt@inmet.inmet.com (08/22/90)

Re: Allowing compilers to reorder fields "at will".

Having used a language without strict field ordering
rules, I can testify that it is a nightmare.  If you
want the compiler to reorder your fields, then you
should have to indicate it explicitly, perhaps via
a "pragma" or equivalent.  Otherwise, you lose
interoperability between successive versions of
the same compiler, let alone interoperability between
different compilers.

S. Tucker Taft
Intermetrics, Inc.
Cambridge, MA  02138

rfg@NCD.COM (Ron Guilmette) (08/24/90)

In article <56744@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>In article <1229@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>
>>Are you suggesting that it may be possible to generate more efficient
>>executable code if arbitrary field re-ordering (by the compiler) is allowed?
>
>Yes.
>
>>If so, are you prepared to give realistic (not contrived) examples of such
>improvements?
>
>Yes and no.  Clearly, I am not free to talk about MS future plans.
>Hopefully, even a few obvious examples should demonstrate the plausability...

OK. I'm willing to admit that it is just barely *plausible* that one of these
fine days I may wake up and find out that everyone is taking their personal
fusion-powered helicopters to work, that the Jetsons are my next-door
neighbors, and that some whiz-bang compiler will be smart enough to reorder
fields for me such that I end up with faster code.  (Note however that,
to the best of my knowledge anyway, no such compiler for *any* language
that provides "record types" now exists.  Still, that's beside the point.
If men can walk on the moon, who knows.)

>Agreed. That's why I suggested that one approach might be that "struct"
>means the same layout as C, as long as you don't derive.  Then all your
>C code continues to work.  One can think up reasons why one might want
>to use inheritence in matching machine registers, but that seems pushing it,
>to my mind.

Now that I've admitted that future compilers should not be *unduly* constrained
by my own antiquated way of thinking about them, I'd like to say that it
still seems that users should have an (ANSI standard) way of getting layouts
exactly the way they want them (for all the reasons which have already been
mentioned) when necessary.  If the default is that the compiler has control
of the layout (rather than the programmer) I don't even mind that, so long
as the programmer can get control when necessary (via the proper syntactic
incantations).

I'm not keen on the idea that the use of the word "struct" (as opposed to
"class") should be taken as the indicator of the programmer's desires in
this regard.  Somebody suggested that nesting the class/struct declaration
within an `extern "C" {}' section might be a nice way to indicate that good
old-fashioned programmer-controlled layout is being requested.  That sounds
reasonable to me.  Furthermore, it seems to be consistent with the
existing uses for `extern "C" {}' and it also allows for the possibility
of getting an entire set of related classes laid out the way you want/need
them.  Note however that just using `extern "C"' as a prefix, as in:

	extern "C" struct s { unsigned i:7; unsigned j:9; };

will not work well because there could be some confusion in cases like:

	extern "C" struct s {...} foobar (int i);

In that case people might get confused about whether the `extern "C"' applied
only to the declared function or to both the function and the struct type
definition.
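
A minimal sketch contrasting the two forms (the struct and its bitfields are taken from the example above; note that as C++ was eventually standardized, `extern "C"` affects linkage only, so treating it as a layout guarantee remained a proposal, not adopted behavior):

```cpp
// The nested form: the braces make it unambiguous that the linkage
// specification covers the entire struct definition.
extern "C" {
    struct s {
        unsigned i : 7;
        unsigned j : 9;
    };
}

// The prefix form is the confusing one.  In a combined declaration
// such as
//
//     extern "C" struct s2 { /* ... */ } foobar(int i);
//
// a reader cannot tell at a glance whether extern "C" applies to the
// function foobar, to the struct definition, or to both.
```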

>Ron continues on, arguing strongly for the advantages of manual clustering
>over automatic clustering.  I disagree.  I think performance in other
>OOPLs and OODBs show that manual clustering doesn't work, because patterns
>of usage do not remain constant.  Thus clustering needs to be done 
>automatically, adapting to changing patterns of usage.

I was only arguing that "manual" clustering is simple to implement, simple
to understand, and that it is actually *needed* in some cases.  I stand
by these assertions; however, I'm now willing to admit that Jim A. (and
Microsoft) or some other compiler developer or vendor may be able to
conjure up some magic in the future which will give me faster code if
the compiler is allowed to take control of this "implementation detail"
in most cases.  I just don't want the control to be taken away from the
programmer irrevocably and in all cases.
-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

rfg@NCD.COM (Ron Guilmette) (08/24/90)

In article <ROLAND.90Aug20192216@wookumz.ai.mit.edu> roland@ai.mit.edu (Roland McGrath) writes:
+In article <1229@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
+
+   In the case of a constraint which would require compilers to maintain the
+   programmer-specified ordering of data fields, the implementation is trivially
+   easy.  (In fact, it is actually *harder* to do anything else!)
+
+So obviously lazy implementors will do what you want, and clever/hardworking
+implementors will do something more.

In a separate posting, I have scaled back my desires based upon your persuasive
arguments.  Let the clever/hardworking implementors do their best!  As long
as I have some way to get "as written" field ordering, I'll be happy.

+Although I think I consider { "as written" field ordering } less useful
+than you do, I am in favor of retaining a way to force a given field order.
+Using:
+
+extern "C"
+{
+  struct foo
+    {
+      char elt0;
+      int elt1;
+    };
+}
+
+does this quite well.

I agree 100%.  I hope the x3j16 membership is listening.

+   Are you suggesting that it may be possible to generate more efficient
+   executable code if arbitrary field re-ordering (by the compiler) is allowed?
+   If so, are you prepared to give realistic (not contrived) examples of such
+   improvements?
+
+I'm not so prepared.  Are you prepared to prove it impossible?

Nope.

+Where do you find these garden slugs with absolute prescience of all possible
+future computer architectures and C++ implementations?

In the garden, of course. :-) :-) :->

+   Right now, if I have the following C code:
+
+	   struct s {
+		   unsigned	field1:24;
+		   unsigned	field2:8;
+	   };
+
+   and this is in a program running on a 68000, I can get C compilers from at
+   least a dozen vendors that will lay this out *exactly* the same way.  Thus,
+   I have vendor independence.  That (and portability) are worth money to me.
+
+This is an excellent example.  Since we are speaking of C++, and the example is
+C code, you can simply put it within `extern "C"' and have all the guarantees
+you now have with the C code.

Well, I don't believe that I saw that stated in E&S, but I certainly would
*like* to see it in the ANSI C++ standard.

+You again presume foreknowledge of all possible future implementations.  I find
+such arrogance surprising from an otherwise reasonable person.

I have been accused of a lot of things, but never of being "otherwise
reasonable".  That's a first for me. :-)
-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

pcg@cs.aber.ac.uk (Piercarlo Grandi) (08/25/90)

On 16 Aug 90 19:20:16 GMT, jimad@microsoft.UUCP (Jim ADCOCK) said:

jimad> Actually, the will of this programmer is that C++ compilers generate the
jimad> fastest and most compact code possible from my class declarations.  It
jimad> is not C++ compilers that are thwarting this will, but rather people who
jimad> want to continue using C++ as a C assembly language.

These include Stroustrup, apparently. See the introduction to his first
C++ book... C++ *has been designed* as an OO MOHLL/SIL.

jimad> I want to be able to use C++ as a fast, efficient object oriented
jimad> language.  This requires compilers that have great optimization,
jimad> and great freedom to optimize.

Why don't you use Self, Objective C, or Eiffel?  Just because C++ is
fashionable?

IMNHO C++ is just about the only OO language around with widely available
compilers designed as a SIL/MOHLL.  Let's keep it that way.  People who
want a more applications-oriented language should choose something like
one of the languages above.  Once GNU CC 2.0 is released, I think
Objective C will grow very much in popularity, IMNHO.

jimad> There continues this division between C/C++ people who want a compiler
jimad> to "do what I say" versus "do your best optimizations."  I think this
jimad> issue should be addressed in the marketplace -- not in the language.

This of course kills any idea of portability.  Your programs start to have
semantics that depend, by construction, on both the compiler and the
language.  This is IMNHO *very* bad; vide the abominable pragma Pandora's
box of Ada and ANSI C.  It may be very good for compiler vendors, though,
especially those that have or think they will have a large market share, and
want to get entrenched in it.  Thank goodness AT&T does not reason that
way, because they are too large and diverse an organization themselves.

jimad> If you want to use C++ to play with machine registers, then
jimad> portability is not an issue to you anyway.

Ah no, not so easy. One can provide, thanks to the wonders of separate
compilation or conditional preprocessing, both a slow portable version
and a fast one that will not work on every implementation, but is still
expressed as much as possible in the base language.

jimad> Again, why should everyone's C++ compiler have to generate slower,
jimad> more memory intensive code in order to meet the needs of people who
jimad> feel they want to maintain intimate control over their compiler?

Maybe because the language has been designed for the latter group...

And because many times the compiler cannot understand the program better
than they do anyhow, or requiring the compiler to do it is not a win-win
situation, e.g. results in larger, slower, less reliable compilers.

A language _definition_ should be independent of compiler implementation
issues, but the language _design_ should well take into account such
pragmatics.  "Advanced" features are easily botched; consider ANSI C and
'volatile', whose _omission_ makes a program undetectably erroneous, and
which otherwise exists almost only to assist the optimizer.  This
particular problem does not apply to field reordering. There is another
problem though: that it makes the binary output of programs compiled
with different compilers or releases of the same compiler potentially
incompatible. Do we really want to force using XDR or ASN.1 throughout?

Consider also the conclusion that you and Guilmette seem to be reaching:
the compiler may reorder at will by default, as long as there is some
mechanism to override it. This seems the best solution, as long as you
do not see the additional cost in complexity/unreliability of the
manual, of the compiler, of the program build process. You have to weigh
these things carefully, and if you do not, you go down the slippery path
that ANSI C has taken.

So, are today's C/C++ multimegabyte compilers significantly more
reliable and faster, and do they generate better-quality code, than those
that run in 64KB[+64KB]?  IMNHO no, or at least not enough that the changed
pragmatics should influence the design of the language (as has happened,
for the worse, with ANSI C).

But maybe I am wrong, and the relative cost of compile-time resources
(time, space, programmer confusion, reliability) vs. runtime speed has
changed so much that a modest increase in the latter can be economically
bought at the price of a large increase in the former.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

dl@g.g.oswego.edu (Doug Lea) (08/25/90)

Perhaps among the best arguments for allowing compilers to at least
sometimes reorder fields is to simply make conceivable someday the use
of C++ adaptations of the clever MI layout algorithms described in
Pugh & Weddell's SIGPLAN '90 conference paper.

For what it's worth, I think that the suggestion that compilers obey
programmer field ordering only inside `extern "C"' sounds fine.
--
Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2688
email: dl@g.oswego.edu            or dl@cat.syr.edu
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl

jimad@microsoft.UUCP (Jim ADCOCK) (08/28/90)

In article <259400004@inmet> stt@inmet.inmet.com writes:

|Re: Allowing compilers to reorder fields "at will".
|
|Having used a language without strict field ordering
|rules, I can testify that it is a nightmare.  If you
|want the compiler to reorder your fields, then you
|should have to indicate it explicitly, perhaps via
|a "pragma" or equivalent.  Otherwise, you lose
|interoperability between successive versions of
|the same compiler, let alone interoperability between
|different compilers.

Maybe or maybe not, but C++ compilers already have the right to reorder
your fields.  Or are you proposing that the rules should be
changed to prevent compilers from reordering?  The only restriction
right now is that within a labeled section fields must be at increasing
addresses, which neither requires nor prevents compilers from changing
field orderings from your expectations, but rather leaves those decisions
to the sensibilities of the compiler vendor.  If some future C++ compiler
supported persistence and schema evolution, would you then demand a
user-specified field ordering?
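
The restriction described here can be illustrated directly.  This is a sketch (struct and member names are invented) of what the labeled-section rule does and does not pin down:

```cpp
#include <cstddef>  // offsetof

// Within one access-specifier-separated section, members declared
// later must land at higher addresses; across sections, relative
// placement is implementation dependent.
struct S {
    int a;    // a and b share a section, so
    int b;    // offsetof(S, b) > offsetof(S, a) is guaranteed
public:
    int c;    // but c may legally be placed before or after a and b
};
```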

jimad@microsoft.UUCP (Jim ADCOCK) (08/28/90)

In article <DL.90Aug25124455@g.g.oswego.edu> dl@g.oswego.edu writes:
|
|Perhaps among the best arguments for allowing compilers to at least
|sometimes reorder fields is to simply make conceivable someday the use
|of C++ adaptations of the clever MI layout algorithms described in
|Pugh & Weddell's SIGPLAN '90 conference paper.

Also, at the other end of the performance spectrum, imagine creating
an interactive C++ "interpreter" or an actual p-code compiler.  The goal is to
have minimal recompilation and relinking, in order to minimize the
response time to changes.  One simple scheme to accomplish this would
be to not embed objects, to inherit via pointer, and to always tack new
fields onto the end of an "object's" structure.  This would violate today's
field-ordering constraints [unless the user always added new fields to the
end of a labeled section.]
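
A sketch of what such a layout discipline might look like (the struct and member names are hypothetical; this is one way to read "inherit via pointer, append at the end"):

```cpp
#include <cstddef>  // offsetof

// Hypothetical object model for minimal recompilation: the base part
// is referenced through a pointer instead of being embedded, and any
// newly declared field is appended, so the offsets of existing fields
// never change when a class grows.
struct BaseRep {
    int x;
};

struct DerivedRep {
    BaseRep* base;  // "inherit via pointer" -- no embedded base subobject
    int y;          // original field
    int y_new;      // a later addition, tacked onto the end
};
```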

peterson@csc.ti.com (Bob Peterson) (08/28/90)

In article <56940@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>       ...
>                       Or are you proposing that the rules should be
>changed to prevent compilers from reordering?  The only restriction
>right now is that within a labeled section fields must be at increasing
>addresses.  Which neither requires nor prevents compilers from changing
>field orderings from your expectations, but rather leaves those decisions
>to the sensibilities of the compiler vendor.  If some future C++ compiler
>supported persistence and schema evolution, would you then demand a
>user specified field ordering?

  A frequently stated goal of object-oriented database (OODB) systems
developers is to support storage of objects over long periods of time
and concurrent access to the OODB by concurrently executing programs.
This implies that an object stored several years ago can be fetched today.
This, in turn, implies that the OODB must support evolution of a stored
object, should the definition of that object change.  Sharing implies
that different applications may request access to the same object at
roughly the same time.

  If a C++ compiler is able to reorder the fields of an object at will,
the object evolution problem faced by OODB developers is substantially
more complex than if field reordering is restricted.  Not only can an
object definition change as a result of an explicit user action, but
also simply because the user has upgraded to a new version of the compiler
or added a compiler from a different vendor.  Requiring a recompilation
of all programs may solve the problem for programs that don't save
objects.  However, for environments in which objects are retained for
substantial periods of time, or where objects are shared among
concurrently executing programs compiled with compilers using different
ordering rules, a recompilation requirement doesn't seem a viable
solution, IMHO.

  In this organization there are pieces of some programs too big to
compile with cfront 2.1 and Sun's C compiler.  We use g++ for these
programs.  Other parts of the system don't require use of g++.  I would
like to think that different pieces of this system would be able to
access the OODB without incurring the performance penalty of converting
the stored data based not only on machine architecture but also on
which compiler is in use.  At the very least, this is an additional, and
in my opinion unnecessary, constraint.

  If the C++ standard continues to specify that, by default, fields can be
reordered, the standard should also require that the rules governing
such reordering be made explicit and public.  An additional requirement
I'd like to see is that a conforming compiler have available a warning
stating that reordering is happening.  If these two requirements are
included, OODB vendors, as well as applications that write objects as
objects, would have some hope of understanding what a compiler is
doing.  Allowing compiler vendors to hide such details will be costly
in the long term.

    Bob

-- 
   Hardcopy    and       Electronic Addresses:        Office:
Bob Peterson           Compuserve: 70235,326          NB 2nd Floor CSC Aisle C3
P.O. Box 861686        Usenet: peterson@csc.ti.com
Plano, Tx USA 75086    (214) 995-6080 (work) or (214) 596-3720 (ans. machine)

howell@bert.llnl.gov (Louis Howell) (08/28/90)

In article <1990Aug27.212649.16101@csc.ti.com>, peterson@csc.ti.com
(Bob Peterson) writes:
|> [...]
|>   A frequently stated goal of object-oriented database (OODB)
|> systems developers is to support storage of objects over long
|> periods of time and concurrent access to the OODB by
|> concurrently executing programs.  This implies that an object
|> stored several years ago be fetched today.  This, in turn,
|> implies that the OODB support evolution of a stored object,
|> should the definition of that object change.  Sharing implies
|> that different applications may request access to the same
|> object at roughly the same time.
|> 
|>   If a C++ compiler is able to reorder the fields of an object
|> at will, the object evolution problem faced by OODB developers
|> is substantially more complex than if field reordering is
|> restricted.  Not only can an [...]

I don't buy this.  The representation of objects within a program
should have nothing to do with the representation of those objects
within files.  If an object needs a read/write capability, the
OODB developer should write explicit input and output functions
for that class, thus taking direct control of the ordering of
class members within a file.  For a large database this is
desirable anyway, since even in C the compiler is permitted to
leave space between members of a struct, and you would not want
to store this padding for every object in a file.
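
A minimal sketch of such explicit per-member I/O (the class and member names are invented for illustration): the read and write routines, not the compiler, fix the external field order.

```cpp
#include <iostream>
#include <sstream>

// The class controls its own external representation: members are
// written and read one at a time in a fixed order, so neither field
// reordering nor padding can leak into the file format.
class Point {
    int x_, y_;
public:
    Point(int x = 0, int y = 0) : x_(x), y_(y) {}
    void write(std::ostream& os) const { os << x_ << ' ' << y_ << '\n'; }
    void read(std::istream& is) { is >> x_ >> y_; }
    int x() const { return x_; }
    int y() const { return y_; }
};
```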

I really don't see why reordering of fields in C++ is any worse
than the padding between fields which already existed in C.
Both make the binary layout of a struct implementation-dependent.
I can sympathise with those who occasionally want direct control
over member arrangement, and it might be nice to provide some
kind of user override to give them the control they need, but
such control never really existed in an absolute sense even in C.
It certainly should not be the default, since most users never
need to worry about this type of detail.
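
The padding point is easy to demonstrate (a sketch; the exact sizes are implementation dependent, so only the inequality is portable):

```cpp
// Even in C, a struct's size need not be the sum of its members'
// sizes: the compiler may insert padding after c to align i, so the
// binary layout of a struct was never fully portable to begin with.
struct P {
    char c;
    int  i;
};
// On a typical machine with 4-byte ints and 4-byte alignment,
// sizeof(P) is 8, not 5.
```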

Louis Howell

#include <std.disclaimer>

jimad@microsoft.UUCP (Jim ADCOCK) (08/29/90)

In article <PCG.90Aug25160256@athene.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>On 16 Aug 90 19:20:16 GMT, jimad@microsoft.UUCP (Jim ADCOCK) said:
>
>jimad> Actually, the will of this programmer is that C++ compilers generate the
>jimad> fastest and most compact code possible from my class declarations.  It
>jimad> is not C++ compilers that are thwarting this will, but rather people who
>jimad> want to continue using C++ as a C assembly language.
>
>These include Stroustrup, apparently. See the introduction to his first
>C++ book... C++ *has been designed* as an OO MOHLL/SIL.

I think it would be best to let Bjarne speak for himself directly on this 
issue.  My impression is that his second book is carefully and deliberately
worded to allow wide latitude in the implementation of the language.

>jimad> I want to be able to use C++ as a fast, efficient object oriented
>jimad> language.  This requires compilers that have great optimization,
>jimad> and great freedom to optimize.
>
>Why don't you use Self, Objective C, or Eiffel?  Just because C++ is
>fashionable?

Certainly the fact that C++ is by far the most popular OOPL has
something to do with it.  [I wouldn't call C++ "fashionable" though -- I'd
call it pragmatic.  Eiffel and Self would be the "fashionable" languages,
to my mind.]  But also the fact remains that if one really wants to write
real-world commercial OO software, C++ is really the only language
up to the task. [IMHO]

>IMNHO C++ is just about the only OO language around with widely available
>compilers designed as a SIL/MOHLL.  Let's keep it that way.  People who
>want a more applications-oriented language should choose something like
>one of the languages above.  Once GNU CC 2.0 is released, I think
>Objective C will grow very much in popularity, IMNHO.

Clearly I disagree with your assessment of Obj-C.  And C++.

>jimad> There continues this division between C/C++ people who want a compiler
>jimad> to "do what I say" versus "do your best optimizations."  I think this
>jimad> issue should be addressed in the marketplace -- not in the language.
>
>This of course kills any idea of portability.  Your programs start to have
>semantics that depend, by construction, on both the compiler and the
>language.  This is IMNHO *very* bad; vide the abominable pragma Pandora's
>box of Ada and ANSI C.  It may be very good for compiler vendors, though,
>especially those that have or think they will have a large market share, and
>want to get entrenched in it.  Thank goodness AT&T does not reason that
>way, because they are too large and diverse an organization themselves.

I think you have this backwards.  In the absence of good optimizers, people
have to program around the limitations of their individual compilers, and
hand-optimize code for a particular situation, keeping it from being
reusable.  I don't particularly like pragmas, but I think they may be an
acceptable compromise to warn a compiler when people insist on writing hacks
that rightfully would fall outside the language.  Companies that design
portable compilers should tend to show a bias towards keeping the language
such that making portable compilers is easy.  Companies that design compilers
for a particular CPU or OS should tend to show a bias towards making the
language highly optimizable.  We should not be surprised if a tension 
between these two goals shows up as part of the standardization process.
[standard disclaimer]

>jimad> If you want to use C++ to play with machine registers, then
>jimad> portability is not an issue to you anyway.
>
>Ah no, not so easy. One can provide, thanks to the wonders of separate
>compilation or conditional preprocessing, both a slow portable version
>and a fast one that will not work on every implementation, but is still
>expressed as much as possible in the base language.

Agreed.  But does such code actually depend on the layout of a particular
CPU's registers or ports?  -- I think not.

>jimad> Again, why should everyone's C++ compiler have to generate slower,
>jimad> more memory intensive code in order to meet the needs of people who
>jimad> feel they want to maintain intimate control over their compiler?
>
>Maybe because the language has been designed for the latter group...

Hm.  I would think that now that C++ is the leading OOPL it needs to try
to meet the needs of the majority of its users.  I would not think that
the majority of people need or want to maintain intimate control of their
compiler.  I'd think the average OO programmer would be working on a much
higher level, using libraries others provide to do the bit twiddling.
Once these low level libraries are written, the many others can build on
these efforts.

>And because many times the compiler cannot understand the program better
>than they do anyhow, or requiring the compiler to do it is not a win-win
>situation, e.g. results in larger, slower, less reliable compilers.

If this opinion is correct, why would it not win in the marketplace?
Why cannot C++ programmers just say, hm, FuzzCo's Ultra-C++ compiler isn't
worth the time/size/money or complexity.  I'll just use this TinCo Tiny C++
compiler instead?  What's wrong with letting the marketplace decide these
tradeoffs?  Why is it necessary to force your particular engineering
design tradeoff beliefs into the language definition itself?

>Consider also the conclusion that you and Guilmette seem to be reaching:
>the compiler may reorder at will by default, as long as there is some
>mechanism to override it. This seems the best solution, as long as you
>do not see the additional cost in complexity/unreliability of the
>manual, of the compiler, of the program build process. You have to weigh
>these things carefully, and if you do not, you go down the slippery path
>that ANSI C has taken.

Actually, my conclusion is not that compilers may reorder at will.  My
conclusion is that compiler designers should have the freedom to decide
these issues for themselves, and that C++ programmers should have the
freedom to choose the compilers that best serve their needs.
Nothing I have proposed forces a vendor to offer a large, complicated 
compiler.

Again, compilers already have the right to almost completely reorder a
structure at will.  The *only* constraint in E&S is that within a labeled
section, member variables declared later will be placed at "higher" addresses.
The reason given for this constraint is backwards C compatibility.  Well,
the constraint is so weak as to offer little in backwards C compatibility.
So my suggestion is to remove it, or to relegate it to those features of
the language specifically addressing backwards compatibility.  To my eye,
having both the extern "C" construct and a separate field-ordering
restriction applied across the language looks like a simple historical
oversight that should be fixed.

>So, are today's C/C++ multimegabyte compilers significantly more
>reliable and faster, and do they generate better-quality code, than those
>that run in 64KB[+64KB]?  IMNHO no, or at least not enough that the changed
>pragmatics should influence the design of the language (as has happened,
>for the worse, with ANSI C).

Then don't buy or use such compilers.  Use one that suits your fancy.
I don't propose to try to force a compiler you don't like on you.
Why do you want to force your particular compiler design beliefs onto
the compiler I would use?  Again, my proposal is to leave these issues
to the sensibilities of the compiler designers, and the power of the
marketplace.  Over the near term, I'd be very surprised if any compiler
diverged too far from the cfront object model, because the cfront object
model represents the only object model that many programmers know.  So there
would be great market skepticism toward a compiler that introduced a very
different object model.  However, as time goes on, other OOPLs are going
to demonstrate very different object models, and I claim we want C++ to
be specified with the freedom to allow compilers to incorporate the best
of those new object models -- and/or to allow C++ compilers to be designed
simply for compatibility with these other object models.

>But maybe I am wrong, and the relative cost of compile-time resources
>(time, space, programmer confusion, reliability) vs. runtime speed has
>changed so much that a modest increase in the latter can be economically
>bought at the price of a large increase in the former.

I guess from my perspective, there is clearly not one answer to this question.
The tradeoff between development costs and runtime costs is very much
a function of the number of units of your program you intend to produce.
If only a couple of copies of a program are to be created, development cost
is paramount, and runtime cost is typically unimportant.  If a company
makes millions of copies of a program, however, runtime costs outweigh
all other factors.  A company that failed to shave the last few seconds
from their software in that situation would be costing society thousands,
if not millions, of dollars in wasted user productivity.

In the first scenario, a C++ programmer might want to use a p-code
C++ implementation that follows a minimal-recompilation strategy.  This
might be accomplished by inheriting via reference, not embedding objects,
and tacking any newly declared members onto the end of an object's section.
This is one object model totally different from the cfront model.

In the second scenario, a highly optimizing compiler might be used.  Pragmas
might be used to indicate when constants aren't.  Pragmas might be used to
indicate when a class is a leaf class.  Compilations might be optimized on
a module-global basis.  Smart linkers might winnow out vtable indirect calls.
Fat tables or other tricks might be used.  Fields might be reorganized based
on profiling information.  This would also lead to an object model very
different from cfront.

Thus, I don't believe one object model spans the needs of all C++ users
[nor all systems C++ might run on -- such as a Rekursiv-like machine,
or a Lisp-like machine, or a Smalltalk-like machine].  So I believe it
would be very short-sighted to try to specify the C++ language strictly
around a cfront-like object model.  Let's leave our options for the future
open.

jimad@microsoft.UUCP (Jim ADCOCK) (08/29/90)

In article <1990Aug27.212649.16101@csc.ti.com> peterson@csc.ti.com (Bob Peterson) writes:
>  If C++ standard continues to specify that, by default, fields can be
>reordered, the standard should also require that the rules governing
>such reordering be made explicit and public.  An additional requirement
>I'd like to see is that a conforming compiler have available a warning
>stating that reordering is happening.  If these two requirements are
>included OODB vendors, as well as applications that write objects as
>objects, would have some hope of understanding what a compiler is
>doing.  Allowing compiler vendors to hide such details will be costly
>in the long term.

Hm, I thought that OODBs were addressing this issue already.  If one has
software that is going to be used for a number of years, object formats
are going to change -- if not by automatic compiler intervention, then
by manual programmer intervention.  OODBs have to be designed to allow
for the evolution of the schemas used in the software, and the corresponding
object layouts.  Compilers have always hidden field layouts from users --
details of packing have never been exposed.  If some other tool, like an
OODB, needs to know these details, traditionally they are made available
via auxiliary files or information the compiler generates, such as debugging
information.  If your OODB strategy insists that the layout of objects
not change from now to eternity, then I don't understand how you can have
a reasonable software maintenance task.

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (08/29/90)

howell@bert.llnl.gov (Louis Howell) writes:
> peterson@csc.ti.com (Bob Peterson) writes:
>|> [...]
>|>   If a C++ compiler is able to reorder the fields of an object
>|> at will, the object evolution problem faced by OODB developers
>|> is substantially more complex than if field reordering is
>|> restricted.  Not only can an [...]
>
>I don't buy this.  The representation of objects within a program
>should have nothing to do with the representation of those objects
>within files. [...]

But the problem is more pervasive than just storage in _files_; you
have the identical situation in shared memory architectures, across
communication links, etc.  If you allow each compiler to make its
own decisions about how to lay out a structure, then you force _any_
programs sharing data across comm links _or_ time _or_ memory space
_or_ file storage to be compiled with compilers having the same
structure layout rules designed in, or to pack and unpack the data
with every sharing between separately compiled pieces of code, surely
an unreasonable requirement compared to the simpler one of setting
the structure layout rules once for all compilers?

The _data_ maintenance headaches probably overwhelm the _code_
maintenance headaches with the freely reordered structures paradigm.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

howell@bert.llnl.gov (Louis Howell) (08/29/90)

In article <1990Aug28.211752.24905@zorch.SF-Bay.ORG>,
 xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
|> howell@bert.llnl.gov (Louis Howell) writes:
|> > peterson@csc.ti.com (Bob Peterson) writes:
|> >|> [...]
|> >|>   If a C++ compiler is able to reorder the fields of an object
|> >|> at will, the object evolution problem faced by OODB developers
|> >|> is substantially more complex than if field reordering is
|> >|> restricted.  Not only can an [...]
|> >
|> >I don't buy this.  The representation of objects within a program
|> >should have nothing to do with the representation of those objects
|> >within files. [...]
|> 
|> But the problem is more pervasive than just storage in _files_; you
|> have the identical situation in shared memory architectures, across
|> communication links, etc.  If you allow each compiler to make its
|> own decisions about how to lay out a structure, then you force
|> _any_ programs sharing data across comm links _or_ time _or_ memory
|> space _or_ file storage to be compiled with compilers having the
|> same structure layout rules designed in, or to pack and unpack the
|> data with every sharing between separately compiled pieces of code,
|> surely an unreasonable requirement compared to the simpler one of
|> setting the structure layout rules once for all compilers?
|> 
|> The _data_ maintenance headaches probably overwhelm the _code_
|> maintenance headaches with the freely reordered structures
|> paradigm.

In short, you want four types of compatibility: "comm links", "time",
"memory space", and "file storage".  First off, "time" and "file
storage" look like the same thing to me, and as I said before I don't
think objects should be stored whole, but rather written and read
member by member by user-designed routines.  As for "memory space",
I think it reasonable that every processor in a MIMD machine, whether
shared memory or distributed memory, should use the same compiler.
This, and a requirement that a compiler should always produce the
same memory layout from a given class definition, even in different
pieces of code, give enough compatibility for MIMD architectures.

Finally, the issue of communication over comm links strikes me as
very similar to that of file storage.  If compatibility is essential,
design the protocol yourself; don't expect the compiler to do it for
you.  Pack exactly the bits you want to send into a string of bytes,
and send that.  You wouldn't expect to send structures from a Mac
to a Cray and have them mean anything, so why expect to be able to
send structures from an ATT-compiled program to a GNU-compiled
program?  If you want low-level compatibility, write low-level code
to provide it, but don't handicap the compiler writers.
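To make that concrete, here is roughly what I mean, sketched in C++ (the Message type and its fields are invented for the example; the only point is that the wire format is spelled out byte by byte, so it never depends on how any compiler lays out the struct):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical message type -- the names are illustrative only.
struct Message {
    std::uint32_t id;
    std::int16_t  temperature;
};

// Append 'v' to 'out' in big-endian order, one byte at a time,
// so the wire format is fixed regardless of struct layout.
static void put_u32(std::vector<unsigned char>& out, std::uint32_t v) {
    out.push_back(static_cast<unsigned char>(v >> 24));
    out.push_back(static_cast<unsigned char>(v >> 16));
    out.push_back(static_cast<unsigned char>(v >> 8));
    out.push_back(static_cast<unsigned char>(v));
}

static void put_u16(std::vector<unsigned char>& out, std::uint16_t v) {
    out.push_back(static_cast<unsigned char>(v >> 8));
    out.push_back(static_cast<unsigned char>(v));
}

// Pack exactly the bits we mean to send, member by member.
std::vector<unsigned char> pack(const Message& m) {
    std::vector<unsigned char> out;
    put_u32(out, m.id);
    put_u16(out, static_cast<std::uint16_t>(m.temperature));
    return out;
}
```

The receiver unpacks in the same fixed order, so an ATT-compiled program and a GNU-compiled program agree on the bytes even if they disagree about padding and field order in memory.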

Louis Howell

#include <std.disclaimer>

rae@gpu.utcs.toronto.edu (Reid Ellis) (08/30/90)

In <1313@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>I'm not keen on the idea that the use of the word "struct" (as opposed to
>"class") should be taken as the indicator of the programmer's desires in

If you actually do this, how about using "volatile"?  e.g.:

		volatile class foo {
			int a;
			char b;
			someType c;
		};

It even looks like what you mean.  Not that I think this is a good
idea either, however. :-)  Just suggesting a "nicer" syntax.

					Reid
--
Reid Ellis  264 Broadway Avenue, Toronto ON, M4P 1V9               Canada
rae@gpu.utcs.toronto.edu || rae%alias@csri.toronto.edu || +1 416 487 1383

diamond@tkou02.enet.dec.com (diamond@tkovoa) (08/31/90)

In article <259400004@inmet> stt@inmet.inmet.com writes:

>Re: Allowing compilers to reorder fields "at will".
>... you lose
>interoperability between successive versions of
>the same compiler, let alone interoperability between
>different compilers.

You lose nothing, because you never had such interoperability.
In C, padding can change from one release to the next, or depending on
optimization level.  In Pascal, a compiler can decline to do packing.
If I'm not mistaken, in Ada, bit-numbering does not have to match the
intuitive (big-endian or little-endian) numbering.

>Having used a language without strict field ordering
>rules, I can testify that it is a nightmare.

Surely you have such nightmares in every language you have used.
Except maybe assembly.  In order to avoid such nightmares, you have to
impose additional quality-of-implementation restrictions on the choice
of implementations that you are willing to buy.

-- 
Norman Diamond, Nihon DEC       diamond@tkou02.enet.dec.com
We steer like a sports car:  I use opinions; the company uses the rack.

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (09/01/90)

howell@bert.llnl.gov (Louis Howell) writes:
> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>|> howell@bert.llnl.gov (Louis Howell) writes:
>|> > peterson@csc.ti.com (Bob Peterson) writes:
>|> >|> [...]
>|> >|>   If a C++ compiler is able to reorder the fields of an object
>|> >|> at will, the object evolution problem faced by OODB developers
>|> >|> is substantially more complex than if field reordering is
>|> >|> restricted.  Not only can an [...]
>|> >
>|> >I don't buy this.  The representation of objects within a program
>|> >should have nothing to do with the representation of those objects
>|> >within files. [...]
>|> 
>|> [...] If you allow each compiler to make its own decisions about how
>|> to lay out a structure, then you force _any_ programs sharing data
>|> across comm links _or_ time _or_ memory space _or_ file storage to
>|> be compiled with compilers having the same structure layout rules
>|> designed in, or to pack and unpack the data with every sharing
>|> between separately compiled pieces of code, [...]

>In short, you want four types of compatibility: "comm links", "time",
>"memory space", and "file storage".  First off, "time" and "file
>storage" look like the same thing to me,

Not so.  If a program compiled with compiler A stores data in a file,
and a program compiled with compiler B can't extract it, that is one
type of compatibility problem to solve, and it can be solved with the
compilers at hand.

But if a program compiled with compiler A revision 1.0 stores data in
a file, and a program compiled with compiler A revision 4.0 cannot
extract it, that is a compatibility problem to solve of a different
type.  Mandating no standard for structure layout forces programmers
in both these cases to anticipate problems, unpack the data, and store
it in some unstructured format.  Tough on the programmer who realizes
this only when compiler A revision 4.0 can no longer read the structures
written to the file with compiler A revision 1.0; it may not be around
any more to allow a program to be compiled to read and rewrite that data.

>and as I said before I don't
>think objects should be stored whole, but rather written and read
>member by member by user-designed routines.

That is a portability versus time/space efficiency choice.  By refusing
to accept mandatory structure layout standards, compiler writers would
force that choice to be made in one way only.

>As for "memory space",
>I think it reasonable that every processor in a MIMD machine, whether
>shared memory or distributed memory, should use the same compiler.

That isn't good enough.  I've worked in shops with several million lines
of code (about 7.0 million) in executing software.  By mandating _no_ standards
for structure layout, you force that _all_ of this code be recompiled with
every new release of the compiler, if the paradigm of data sharing is a
shared memory environment.  Again, by refusing to make one choice, you
force several other choices in ways perhaps unacceptable to the compiler
user.  In this situation, that might well involve several man-years of
effort, and it is sure to invoke every bug in the new release of the
compiler simultaneously, and would very likely bring operations to a
standstill.  With no data structure layout standard, you have removed the
user's choice to recompile and test incrementally, or else forced him to
pack and unpack data even to share it in memory.

>This, and a requirement that a compiler should always produce the
>same memory layout from a given class definition, even in different
>pieces of code, give enough compatibility for MIMD architectures.

Not if you don't mandate that compatibility across time, it doesn't.

>Finally, the issue of communication over comm links strikes me as
>very similar to that of file storage.  If compatibility is essential,
>design the protocol yourself; don't expect the compiler to do it for
>you.  Pack exactly the bits you want to send into a string of bytes,
>and send that.  You wouldn't expect to send structures from a Mac
>to a Cray and have them mean anything, so why expect to be able to
>send structures from an ATT-compiled program to a GNU-compiled
>program?  If you want low-level compatibility, write low-level code
>to provide it, but don't handicap the compiler writers.

Same comments apply.  In a widespread worldwide network of communicating
hardware, lack of a standard removes the option to send structures intact.
One choice (let compiler writers have free rein for their ingenuity in
packing structures for size/speed) removes another choice (let programmers
have free rein for their ingenuity in accomplishing speedy and effective
communications).  Somebody loses in each case, and I see the losses on
the user side to far outweigh in cost and importance the losses on the
compiler vendor side.

Then again, I write application code, not compilers, which could
conceivably taint my ability to make an unbiased call in this case. ;-)

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

stephen@estragon.stars.flab.Fujitsu.JUNET (Stephen P Spackman) (09/03/90)

In article <1990Sep1.131041.15411@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
   howell@bert.llnl.gov (Louis Howell) writes:
   > xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
   >|> [...] If you allow each compiler to make its own decisions about how
   >|> to lay out a structure, then you force _any_ programs sharing data
   >|> across comm links _or_ time _or_ memory space _or_ file storage to
   >|> be compiled with compilers having the same structure layout rules
   >|> designed in, or to pack and unpack the data with every sharing
   >|> between separately compiled pieces of code, [...]
   >In short, you want four types of compatibility: "comm links", "time",
   >"memory space", and "file storage".  First off, "time" and "file
   >storage" look like the same thing to me,
   [...]
   But if a program compiled with compiler A revision 1.0 stores data in
   a file, and a program compiled with compiler A revision 4.0 cannot
   extract it, that is a compatibility problem to solve of a different
   type.  Mandating no standard for structure layout forces programmers
   in both these cases to anticipate problems, unpack the data, and store
   it in some unstructured format.  Tough on the programmer who realizes
   this only when compiler A revision 4.0 can no longer read the structures
   written to the file with compiler A revision 1.0; it may not be around
   any more to allow a program to be compiled to read and rewrite that data.

I feel a little odd posting here, being radically anti-c++ myself, but
I *did* drop in and something about this thread is bothering me. (-:
(honestly!) Look, if you're going to take an ugly language and then
bolt on a kitchen sink and afterburners :-), why not put in the one
thing that is REALLY missing from C:

                    * STANDARD BINARY TRANSPUT *

Can't we have the compiler provide the trivial little shim function
that will read and write arbitrary data structures to binary streams?
Most operating systems now seem to have standard external data
representations, which are intended for precisely this kind of
purpose, and a "default" version would not be hard to cook up (or
appropriate). The runtime overhead is usually dwarfed by transput
cost, and you get better RPC support as a freebie. That way only the
*external* representation format needs to be specified, and the
runtime image can be as zippy as it pleases.

You can even read and write code if you're willing to interpret it -
or call the compiler, if you've got it kicking around.
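To sketch what such compiler-generated shims might look like if you had to write them by hand today (the Point type is invented, and the four-byte big-endian integer encoding is just a stand-in for a real external data representation such as XDR):

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <sstream>

// A toy "external representation": every integer goes out as four
// big-endian bytes, XDR-style.  Only this external format is fixed;
// the in-memory layout of Point can be whatever the compiler likes.
struct Point { std::int32_t x, y; };

void xdr_put(std::ostream& os, std::int32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        os.put(static_cast<char>((static_cast<std::uint32_t>(v) >> shift) & 0xFF));
}

std::int32_t xdr_get(std::istream& is) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(is.get());
    return static_cast<std::int32_t>(v);
}

// The per-class shims -- exactly the "trivial little functions" a
// compiler could emit mechanically from the class definition.
void transput_write(std::ostream& os, const Point& p) {
    xdr_put(os, p.x);
    xdr_put(os, p.y);
}

Point transput_read(std::istream& is) {
    Point p;
    p.x = xdr_get(is);
    p.y = xdr_get(is);
    return p;
}
```

A structure written through such a shim can be read back by any program sharing the external format, whatever packing strategy either compiler chose internally.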

   >As for "memory space",
   >I think it reasonable that every processor in a MIMD machine, whether
   >shared memory or distributed memory, should use the same compiler.
   >[...]

Ahem. Offhand I can think of any number of HETEROGENEOUS shared-memory
machines. Mainframes have IOPs. Amigae share memory between big-endian,
word-aligned 680x0s and little-endian, byte-aligned 80x86es (-: amusingly
enough you can do things about bytesex by negating all the addresses
on one side and offsetting all the bases, but... :-).  Video
processors with very *serious* processing power and their own
(vendor-supplied) C compilers abound. Now, I'll grant, a *real*
operating system, (-: if someone would only write one for me, :-)
would mandate a common intermediate code so that vendors only supplied
back-ends, but even then I'm NOT going to pay interpretation overhead
on all variable accesses on, say, my video processor, my database
engine, my graph reduction engine and my vector processor, just
because my "main" CPU wants a certain packing strategy!

There *must* be a mechanism for binary transmission of binary data,
this must *not* be the programmer's responsibility, and while
localising all code generation in the operating system's caching and
replication component is the obvious thing to do, (a) that day hasn't
arrived yet and (b) the compiler will STILL have to get involved and
generate the appropriate type descriptors and shim functions.

stephen p spackman  stephen@estragon.uchicago.edu  312.702.3982

howell@bert.llnl.gov (Louis Howell) (09/05/90)

In article <1990Sep1.131041.15411@zorch.SF-Bay.ORG>,
xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
|> howell@bert.llnl.gov (Louis Howell) writes:
|> >In short, you want four types of compatibility: "comm links", "time",
|> >"memory space", and "file storage".  First off, "time" and "file
|> >storage" look like the same thing to me,
|> 
|> Not so.  If a program compiled with compiler A stores data in a file,
|> and a program compiled with compiler B can't extract it, that is one
|> type of compatibility problem to solve, and it can be solved with the
|> compilers at hand.
|> 
|> But if a program compiled with compiler A revision 1.0 stores data in
|> a file, and a program compiled with compiler A revision 4.0 cannot
|> extract it, that is a compatibility problem to solve of a different
|> type.  Mandating no standard for structure layout forces programmers
|> in both these cases to anticipate problems, unpack the data, and store
|> it in some unstructured format.  Tough on the programmer who realizes
|> this only when compiler A revision 4.0 can no longer read the structures
|> written to the file with compiler A revision 1.0; it may not be around
|> any more to allow a program to be compiled to read and rewrite that data.

I don't want to reduce this discussion to finger-pointing and
name-calling, but I think this hypothetical programmer deserved
what he got.  I think it's a useful maxim to NEVER write anything
important in a format that you can't read.  This doesn't
necessarily mean ASCII---there's nothing wrong with storing
signed or unsigned integers, IEEE format floats, etc., in binary
form, since you can always read the data back out of the file
in these formats.  If a programmer whines because he depended on
some nebulous "standard structure format" and got burned, then I
say let him whine.  Now if there actually were a standard---IEEE,
ANSI, or whatever---then the compilers should certainly support
it.  Recent comments in this newsgroup show, however, that there
isn't even a general agreement on what a standard should look like.
Let's let the state of the art develop to that point before we
start mandating standards.

|> [...]

|> >As for "memory space",
|> >I think it reasonable that every processor in a MIMD machine,
whether
|> >shared memory or distributed memory, should use the same compiler.
|> 
|> That isn't good enough.  I've worked in shops with several million
lines
|> of code (about 7.0) in executing software.  By mandating _no_
standards
|> for structure layout, you force that _all_ of this code be recompiled
with
|> every new release of the compiler, if the paradigm of data sharing is
a
|> shared memory environment.  Again, by refusing to make one choice,
you
|> force several other choices in ways perhaps unacceptable to the
compiler
|> user.  In this situation, that might well involve several man-years
of
|> effort, and it is sure to invoke every bug in the new release of the
|> compiler simultaneously, and would very likely bring operations to a
|> standstill.  With no data structure layout standard, you have removed
the
|> user's choice to recompile and test incrementally, or else forced him
to
|> pack and unpack data even to share it in memory.

This is the only one of your arguments that I can really sympathize
with.  I've never worked directly on a project of anywhere near that
size.  As a test, however, I just timed the compilation of my own
current project.  4500 lines of C++ compiled from source to
executable in 219 seconds on a Sun 4.  Scaling linearly to 7 million
lines gives 3.41e5 seconds or about 95 hours of serial computer
time---large, but doable.  Adding in the human time required to
deal with the inevitable bugs and incompatibilities, it becomes
clear that switching compilers is a major undertaking that should
not be undertaken more often than once a year or so.
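The scaling above, spelled out (nothing here beyond the numbers just quoted):

```cpp
#include <cassert>

// Linear scaling of compile time: if 'sample_lines' of code took
// 'sample_seconds' to build, estimate the time for 'lines' of code
// at the same rate.  A crude back-of-envelope model, of course.
double scaled_seconds(double lines, double sample_lines, double sample_seconds) {
    return lines * (sample_seconds / sample_lines);
}
```

With lines = 7e6, sample_lines = 4500, and sample_seconds = 219, this gives about 3.41e5 seconds, i.e. roughly 95 hours of serial machine time.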

The alternative, though, dealing with a multitude of different
modules each compiled under slightly different conditions, sounds
to me like an even greater nightmare.  Imagine a code that only
works when module A is compiled with version 1.0, module B only
works under 2.3, and so on.  Much better to switch compilers very
seldom.  If you MUST work that way, though, note that you would
not expect the ordering methods to change with every incremental
release.  Changes like that would constitute a major compiler
revision, and would happen only rarely.

You can still recompile and test incrementally if you maintain
separate test suites for each significant module of the code.  If
the only test is to run a single 7 million line program and see if
it smokes, your project is doomed from the start.  (1/2 :-) )

Again, most users don't work in this type of environment.  A
monolithic code should be written in a very stable language to
minimize revisions.  (Fortran 66 comes to mind. :-)  The price is
not using the most up to date tools.  C++ just isn't old enough
yet to be very stable.  If I suggested changing the meaning of
a Fortran format statement, I'd be hung from the nearest tree,
and I'd deserve it, too.

|> [...]

|> >Finally, the issue of communication over comm links strikes me as
|> >very similar to that of file storage.  If compatibility is essential,
|> >design the protocol yourself; don't expect the compiler to do it for
|> >you.  Pack exactly the bits you want to send into a string of bytes,
|> >and send that.  You wouldn't expect to send structures from a Mac
|> >to a Cray and have them mean anything, so why expect to be able to
|> >send structures from an ATT-compiled program to a GNU-compiled
|> >program?  If you want low-level compatibility, write low-level code
|> >to provide it, but don't handicap the compiler writers.
|> 
|> Same comments apply.  In a widespread worldwide network of communicating
|> hardware, lack of a standard removes the option to send structures intact.
|> One choice (let compiler writers have free rein for their ingenuity in
|> packing structures for size/speed) removes another choice (let programmers
|> have free rein for their ingenuity in accomplishing speedy and effective
|> communications).  Somebody loses in each case, and I see the losses on
|> the user side to far outweigh in cost and importance the losses on the
|> compiler vendor side.

I think Stephen Spackman's suggestion of standardizing the stream
protocol, but not the internal storage management, is the proper
way to go here.

|> Then again, I write application code, not compilers, which could
|> conceivably taint my ability to make an unbiased call in this case. ;-)

Hey, I'm a user too!  I do numerical analysis and fluid mechanics.
What I do want is the best tools available for doing my job.  If
stability were a big concern I'd work in Fortran---C++ is considered
pretty radical around here.  I think the present language is a
big improvement over alternatives, but it still has a way to go.
If we clamp down on the INTERNAL details of the compiler now, we
just shut the door on possible future improvements, and the action
will move on to the next language (D, C+=2, or whatever).  C++
just isn't old enough yet for us to put it out to pasture.

As a compromise, why don't we add to the language the option of
specifying every detail of structure layout---placement as well
as ordering.  This will satisfy users who need low-level control
over structures, without forcing every user to painfully plot
out every structure.  Just don't make it the default; most people
don't need this capability, and instead should be given the best
machine code the compiler can generate.
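Short of such a language extension, about the best a layout-dependent program can do today is state the layout it assumes and check it, so a compiler that chooses differently fails loudly instead of silently corrupting files.  A sketch (the Header type is invented for illustration):

```cpp
#include <cassert>
#include <cstddef>

// C++ gives no way to *dictate* member placement, but code that
// depends on a particular layout can make that dependence explicit.
struct Header {
    char          magic[4];
    unsigned long length;
};

// Sanity-check the layout our (hypothetical) file format assumes.
// These particular checks hold on any conforming compiler -- members
// are allocated in declaration order here -- but a real program
// would pin down the exact offsets and sizes its format requires,
// and those checks *can* fail when the compiler's padding changes.
bool layout_as_expected() {
    return offsetof(Header, magic) == 0
        && offsetof(Header, length) >= 4
        && sizeof(Header) >= offsetof(Header, length) + sizeof(unsigned long);
}
```

A program could call this at startup and refuse to touch its data files when the check fails, which at least converts "compiler reordered my fields" from silent corruption into a visible error.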

Louis Howell

#include <std.disclaimer>

peter@objy.objy.com (Peter Moore) (09/09/90)

In article <1990Sep1.131041.15411@zorch.SF-Bay.ORG>,
xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:

<< I will paraphrase, since the >>'s were getting too deep:  
       If the standard doesn't mandate structure layout, then compiler
	writers will be free to change structure layout over time and
	render all existing stored binary data obsolete
>>

Now hold on.  There may be arguments for enforcing structure layout,
but this sure isn't one of them. If the different releases of the
compiler change the internal representation of structures, then all old
binaries and libraries will become incompatible.  This is immediately
unacceptable to me, without any secondary worries about old binary
data.  If a vendor tried to do that, I would simply change vendors, and
never come back.  And no sane vendor would try such a change.  The
vendor himself has linkers, debuggers, libraries, and compilers for
other languages that all would change, not to mention thousands of
irate customers burning the vendor in effigy.

There are so many things that a vendor could change that would cause
incompatibility:  calling formats, floating point formats, object file
formats, etc.  The vendor could take 3 years to upgrade to ANSI or
insist on supplying non-standard features that can't be turned off.
Structure layout is just a small part.  By your argument, ANSI should
legislate them all, and that is unreasonable and WAY too restrictive on
implementations.

The standard can never protect you from an incompetent or malicious
vendor.  It can only act as a common ground for well intentioned
vendors and customers to meet.

	Peter Moore
	peter@objy.com

stephen@estragon.uchicago.edu (Stephen P Spackman) (09/10/90)

[Apologies in advance for the temperature level - I think my
thermostat is broke]

In article <1990Sep8.154622@objy.objy.com> peter@objy.objy.com (Peter Moore) writes:
   In article <1990Sep1.131041.15411@zorch.SF-Bay.ORG>,
   xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:

   << I will paraphrase, since the >>'s were getting too deep:  
	  If the standard doesn't mandate structure layout, then compiler
	   writers will be free to change structure layout over time and
	   render all existing stored binary data obsolete
   >>

   Now hold on.  There may be arguments for enforcing structure layout,
   but this sure isn't one of them. If the different releases of the
   compiler change the internal representation of structures, then all old
   binaries and libraries will become incompatible.  This is immediately
   unacceptable to me, without any secondary worries about old binary
   data.  If a vendor tried to do that, I would simply change vendors, and
   never come back.  And no sane vendor would try such a change.  The
   vendor himself has linkers, debuggers, libraries, and compilers for
   other languages that all would change, not to mention thousands of
   irate customers burning the vendor in effigy.

Now you hold on. Since the binary layout of structures is NOT defined,
any code that relies on it is BROKEN. Your old binaries, if they are
properly written, are not "damaged" by a binary representation
upgrade, any more than installing a new machine with a different
architecture on your local net breaks existing applications: if the
applications WEREN'T broken, they still aren't, because they do not
rely on undefined behaviour.

As for the libraries, I'm sure the vendor will be glad to supply new
copies of the ones he provides with the upgrade (and I assure you that
his recompiling them will not be half such a chore as you imply), and
your own you can rebuild with make(1).

Furthermore, you notice that your "solution" is insane: changing
vendors regularly will make the "problem" happen more often and more
severely, unless you ultimately succeed in finding one who has a
strict bug-maintenance policy and effectively never upgrades his
product. A more sane solution would be never to install upgrades, and
just learn to work around the limitations you encounter (like not
being able to generate code for a new architecture, for example).

   The standard can never protect you from an incompetent or malicious
   vendor.  It can only act as a common ground for well intentioned
   vendors and customers to meet.

Or, in this case, competent and well-intentioned vendors who try to
keep up with the technology.

I don't know if I'll ever end up writing commercial compilers, but if
I do, Mr. Moore, please do us both a favour and don't buy them.
Because you can *bet* that more than just binary layouts are going to
change as the optimisation library is extended.

stephen p spackman  stephen@estragon.uchicago.edu  312.702.3982