[comp.lang.c++] 64 bit architectures and C/C++

shap@shasta.Stanford.EDU (shap) (04/27/91)

Several companies have announced or are known to be working on 64 bit
architectures. It seems to me that 64 bit architectures are going to
introduce some nontrivial problems with C and C++ code.

I want to start a discussion going on this topic.  Here are some seed
questions:

1. Do the C/C++ standards need to be extended to cover 64-bit
environments, or are they adequate as-is?

2. If a trade-off has to be made between compliance and ease of
porting, what's the better way to go?

3. If conformance to the standard is important, then the obvious
choices are

	short	16 bits
	int	32 bits
	long	64 bits
	void *	64 bits

How bad is it if sizeof(int) != sizeof(long)?

4. Would it be better not to have a 32-bit data type and to make int
be 64 bits?  If so, how would 32- and 64- bit programs interact?

Looking forward to a lively exchange...

torek@elf.ee.lbl.gov (Chris Torek) (04/27/91)

In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>How bad is it if sizeof(int) != sizeof(long)?

This has been the case on PDP-11s for over 20 years.

It does cause problems---there is always software that makes invalid
assumptions---but typically long-vs-int problems, while rampant, are
also easily fixed.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

sarima@tdatirv.UUCP (Stanley Friesen) (04/28/91)

In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>Several companies have announced or are known to be working on 64 bit
>architectures. It seems to me that 64 bit architectures are going to
>introduce some nontrivial problems with C and C++ code.
 
>1. Do the C/C++ standards need to be extended to cover 64-bit
>environments, or are they adequate as-is?

They are adequate as is.  They make only minimum requirements for a
conforming implementation.  In particular, there is no reason why you
cannot have 64 bit long's (which is what I would do) or even 64 bit int's.
And you could easily have 64 and 128 bit floating point types (either
as 'float' and 'double' or as 'double' and 'long double' - both approaches
are standard conforming).
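
For instance, nothing stops a conforming implementation's <limits.h>
from reading like this (the values are illustrative, not taken from
any real compiler):

	#define CHAR_BIT  8
	#define SHRT_MAX  32767			/* 16 bit short */
	#define INT_MAX   2147483647		/* 32 bit int */
	#define LONG_MAX  9223372036854775807	/* 64 bit long */

All four meet or exceed the ANSI minimums.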

>2. If a trade-off has to be made between compliance and ease of
>porting, what's the better way to go?

In writing a new compiler from scratch (or even mostly from scratch) there
is no question, full ANSI compliance is absolutely necessary.
[An ANSI compiler may itself be less portable, but it allows the application
programmers to write portable code more easily].

In porting an existing compiler, do whatever seems most practical.

>3. If conformance to the standard is important, then the obvious
>choices are
 
>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits
 
OR:
	short	32 bits
	int	64 bits
	long	64 bits

OR:
	short	32 bits
	int	32 bits
	long	64 bits

Any one of the above may be the most appropriate depending on the
instruction set.  If there are no instructions for 16 bit quantities
then using 16 bit short's is a big loss.  And if it really is set up
as a hybrid 32/64 bit architecture, even the last may be useful.
[For instance, the Intel 80X86 series chips are hybrid 16/32 bit
architectures, so both 16 and 32 bit ints make sense].

>How bad is it if sizeof(int) != sizeof(long)?

Not particularly.  There are already millions of machines where this is
true - on PC class machines running MS-DOS sizeof(int) == sizeof(short)
for most existing compilers (that is, the sizes are 16, 16, 32).
[And on Bull mainframes, unless things have changed, all three are the
same size].

>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits?  If so, how would 32- and 64- bit programs interact?

Programs on different machines should not talk to each other in binary.
[See the long, acrimonious discussions about binary I/O right here].
And as long as you use either ASCII text or XDR representation for data
exchange, there is no problem.

However, I would be more likely to skip the 16 bit type than the 32 bit
type.  (Of course if the machine has a 16 bit add and not a 32 bit one ...).


In short, the idea is that C should translate as cleanly as possible into
the most natural data types for the machine in question.  This is what the
ANSI committee had in mind.
-- 
---------------
uunet!tdatirv!sarima				(Stanley Friesen)

bhoughto@pima.intel.com (Blair P. Houghton) (04/28/91)

In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>It seems to me that 64 bit architectures are going to
>introduce some nontrivial problems with C and C++ code.

Nope.  They're trivial if you didn't assume 32-bit architecture,
which you shouldn't, since many computers still have 36, 16, 8,
etc.-bit architectures.

>I want to start a discussion going on this topic.  Here are some seed
>questions:

Here's some fertilizer (but most of you consider it that way, at
any time :-) ):

>1. Do the C/C++ standards need to be extended to cover 64-bit
>environments, or are they adequate as-is?

The C standard allows all sorts of data widths, and specifies
a scad of constants (#defines in <limits.h>) to let you use
these machine-specific numbers in your code anonymously.

>2. If a trade-off has to be made between compliance and ease of
>porting, what's the better way to go?

If you're compliant, you're portable.

>3. If conformance to the standard is important, then the obvious
>choices are
>
>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits

The suggested choices are:

	short	<the shortest integer the user should handle; >= 8 bits> 
	int	<the natural width of integer data on the cpu; >= a short>
	long	<the longest integer the user should handle; >= an int>
	void *  <long enough to specify any location legally addressable>

There's no reason for an int to be less than the full
register-width, and no reason for an address to be limited
to the register width.

An interesting side-effect of using the constants is that
you never need to know the sizes of these things on your
own machine; i.e., use CHAR_BIT (the number of bits in a char)
and `sizeof int' (the number of chars in an int) and you'll
never need to know how many bits an int contains.
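
A minimal sketch (nothing in it assumes a particular word size):

	#include <limits.h>
	#include <stdio.h>

	int main(void)
	{
		/* sizeof counts chars; CHAR_BIT is bits per char */
		printf("bits in an int:  %d\n", (int)(sizeof(int) * CHAR_BIT));
		printf("bits in a long:  %d\n", (int)(sizeof(long) * CHAR_BIT));
		return 0;
	}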

>How bad is it if sizeof(int) != sizeof(long)?

It's only bad if you assume it's not true. (I confess:  I peeked.
I saw Chris' answer, and I'm not going to disagree.)

>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits?  If so, how would 32- and 64- bit programs interact?

Poorly, if at all.  Data transmission among architectures
with different bus sizes is a hairy issue of much aspirin.
The only portable method is to store and transmit the data
in some width-independent form, like morse-code or a text
format (yes, ASCII is 7 or 8 bits wide, but it's a
_common_ form of data-width hack, and if all else fails,
you can hire people to read and type it into your
machine).

>Looking forward to a lively exchange...

				--Blair
				  "Did anyone NOT bring potato salad?"

marc@dumbcat.sf.ca.us (Marco S Hyman) (04/29/91)

In article <224@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:
 > In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
 > >	short	16 bits
 > >	int	32 bits
 > >	long	64 bits
 > >	void *	64 bits
 >  
 > OR:
 > 	short	32 bits
 > 	int	64 bits
 > 	long	64 bits
 > 
 > OR:
 > 	short	32 bits
 > 	int	32 bits
 > 	long	64 bits

I hope not -- at least not without some other way of describing a 16-bit
value.  64-bit architecture machines will still have to communicate with other
machines that do support 16-bit values.  Swapping bytes between big and little
endian machines is bad enough. Think about the overhead of converting a pair
of bytes to a 16-bit value.  

 > Any one of the above may be the most appropriate depending on the
 > instruction set.  If there are no instructions for 16 bit quantities
 > then using 16 bit short's is a big loss.

Hmmm.  How would such a processor communicate with hardware devices requiring
16-bit I/O?  How would a structure that maps an external device's registers be
coded if the registers are 16-bits wide?  If there is a way to do these things
then a 16-bit wide data type is probably necessary.

-- 
// marc
// home: marc@dumbcat.sf.ca.us		pacbell!dumbcat!marc
// work: marc@ascend.com		uunet!aria!marc

shap@shasta.Stanford.EDU (shap) (04/29/91)

In article <295@dumbcat.sf.ca.us> marc@dumbcat.sf.ca.us (Marco S Hyman) writes:
>In article <224@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:

> > If there are no instructions for 16 bit quantities
> > then using 16 bit short's is a big loss.
>
>Hmmm.  How would such a processor communicate with hardware devices requiring
>16-bit I/O?

The answer, given the machine he poses, is that it doesn't, since it
doesn't support 16 bit ops.  You could build such a machine, provided
you were willing to do all the peripherals yourself (I'm not
recommending it).

However, there is a more compelling argument for 16 bit types in RPC,
extended precision math routines, etc.  'short' has the (un?)fortunate
property that it has been 16 bits on almost every machine known to
man, and a depressing amount of code appears to make the 
sizeof(short) == 16 bits assumption.

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (04/29/91)

shap@shasta.Stanford.EDU (shap) writes:

>2. If a trade-off has to be made between compliance and ease of
>porting, what's the better way to go?

User selectable.

>3. If conformance to the standard is important, then the obvious
>choices are

>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits

That depends on the natural address size of the machine.  If the
machine uses 32 bit addresses, then (void *) should be 32 bits.
I would not want my address arrays taking up more memory than is
needed.

Is it really necessary that sizeof(void *) == sizeof(long)?

>How bad is it if sizeof(int) != sizeof(long)?

Would not bother me as long as sizeof(int) <= sizeof(long)

>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits?  If so, how would 32- and 64- bit programs interact?

Again it would depend on the machine.  If the machine has both 32 bit
and 64 bit operations, then do include them.  If a 32 bit operation
is unnatural to the machine, then don't.  If it has 16 bit operations
then that makes sense for short.
-- 
 /***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu   |  Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks  |  people; CRIMINALS do!!  /
 \***************************************************************************/

jac@gandalf.llnl.gov (James A. Crotinger) (04/29/91)

  Anyone want to comment on experiences with Crays? I believe the C
compilers have sizeof(int) = sizeof(short) = sizeof(long) == 46 or 64
bits, depending on a compile time flag (Crays can do 46 bit integer
arithmetic using the vectorizing floating point processors, so that is
the default). 

  Binary communications between Crays and other computers is something
I haven't done, mostly because Cray doesn't support IEEE floating point.

  Jim



--
-----------------------------------------------------------------------------
James A. Crotinger     Lawrence Livermore Natl Lab // The above views 
jac@moonshine.llnl.gov P.O. Box 808;  L-630    \\ // are mine and are not 
(415) 422-0259         Livermore CA  94550      \\/ necessarily those of LLNL

campbell@redsox.bsw.com (Larry Campbell) (04/29/91)

In article <1991Apr29.050715.22968@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
->3. If conformance to the standard is important, then the obvious
->choices are
-
->	short	16 bits
->	int	32 bits
->	long	64 bits
->	void *	64 bits
-
-That depends on the natural address size of the machine.  If the
-machine uses 32 bit addresses, then (void *) should be 32 bits.
-I would not want my address arrays taking up more memory than is
-needed.
-
-Is it really necessary that sizeof(void *) == sizeof(long)?

Of course not.

We're currently porting a largish (150K lines) program to a machine
on which:

	short	(dunno, not at work now so I can't check)
	int	32 bits
	long	32 bits
	void *	128 bits

Thank *god* it has a fully-compliant ANSI compiler.

For extra credit:  can you guess what machine this is?
-- 
Larry Campbell             The Boston Software Works, Inc., 120 Fulton Street
campbell@redsox.bsw.com    Boston, Massachusetts 02109 (USA)

mvm@jedi.harris-atd.com (Matt Mahoney) (04/29/91)

When I need to specify bits, I'm usually forced to make the 
following assumptions:

	char	8 bits
	short	16 bits
	long	32 bits

since this is true on most machines.  Anything else would probably break
a lot of code.

-------------------------------
Matt Mahoney, mvm@epg.harris.com
#include <disclaimer.h>

warren@cbnewsh.att.com (warren.a.montgomery) (04/29/91)

There are probably a lot of hidden assumptions in programs that
sizeof(short)==2 and sizeof(long)==4 at this point, so any
assignment of meanings on a 64 bit machine (unless you invent type
"long long" for 64 bits and leave long and short at 32 and 16)
will cause some pain.

All this raises questions about C/C++'s philosophy of integer type
specification, which no doubt dates from the days when all
machines had only 2 and 4 byte integers.  C is unusual in my
experience at least in allowing the programmer explicit control
over how large integer variables will be, but not tying the
specification down to a specific size or value.  In PL/1, for
example, you explicitly state how many bits you want in a binary
integer.  This gives you much more predictability than "short" and
"long", which are more like hints to the compiler, but I suspect
that the extra precision frequently gets in the way.  (Both the
programmer and the machine wind up doing extra work when an
integer that needs to be big enough to hold 1,000,000 gets
declared to be 36 bits on a 36-bit machine (simply because the
programmer knew that was a magic number), and then ported to a
32-bit machine.)  I don't even know if it is possible in C to write a
declaration guaranteed to produce an integer of a specific size
(in bits or bytes) in a machine-independent way.

There are lots of ways a programmer may want to declare an integer:

1	Any convenient and reasonable size.
	
2	Any convenient size large enough to represent X.
	
3	The smallest convenient size large enough to represent X.

4	Exactly Y bits or bytes long.
	
5	Exactly the same size as datatype Z

1 seems to be the intent of "int".  "short", "long", and "char"
can be used to accomplish 4 for certain values of Y (which differ
on each machine).  The number of functions written to manipulate
24 bit integers, though, is evidence that this is an imperfect
solution.  2, 3, and 5 aren't directly expressible in C, but people
frequently mean this and use hidden assumptions about how big
short and long are in order to do it.  There are probably other
ways that would be useful to define the size of some integer type,
but these will do for a start.  How are things like this best
expressed in C or C++?  Do other languages provide better overall solutions?
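
For concreteness, 2 and 4 can at least be approximated today; here is
a sketch (the typedef and struct names are invented):

	#include <limits.h>

	/* case 2: any convenient size large enough to hold 1,000,000 */
	#if INT_MAX >= 1000000
	typedef int	holds_million;
	#else
	typedef long	holds_million;
	#endif

	/* case 4: exactly 24 bits; only a bit-field can say this, and
	   only on machines whose int is at least 24 bits wide */
	struct int24 { signed int v : 24; };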

-- 

	Warren Montgomery
	att!ihlpf!warren

robertk@lotatg.lotus.com (Robert Krajewski) (04/29/91)

Actually, there's one thing that people didn't mention -- the
feasibility of bigger lightweight objects. I'd assume that any
processor that advertised a 64-bit architecture would be able to
efficiently move around 64 bits at a time, so it would be very cheap
to move 8-byte (excuse me, octet) objects around by copying.

barmar@think.com (Barry Margolin) (04/29/91)

In article <295@dumbcat.sf.ca.us> marc@dumbcat.sf.ca.us (Marco S Hyman) writes:
>In article <224@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:
> > Any one of the above may be the most appropriate depending on the
> > instruction set.  If there are no instructions for 16 bit quantities
> > then using 16 bit short's is a big loss.
>
>Hmmm.  How would such a processor communicate with hardware devices requiring
>16-bit I/O?  How would a structure that maps an external device's registers be
>coded if the registers are 16-bits wide?  If there is a way to do these things
>then a 16-bit wide data type is probably necessary.

You could map each register into the bottom (or top, or middle, or
whatever) 16 bits of adjacent memory words or half-words.

Or, if you want to be really perverse, you make every fourth bit in the
word map into a bit in the register. :{)
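
In C the first mapping might look something like this (a sketch only;
the names and the one-register-per-word layout are assumptions):

	typedef unsigned long devword;	/* whatever type spans one bus word */

	struct dev_regs {
		devword status;		/* device register in bits 0..15 */
		devword control;	/* high bits wired to the bit-bucket */
	};

	#define REG16(w)	((unsigned)((w) & 0xFFFF))	/* useful half */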

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

turk@Apple.COM (Ken "Turk" Turkowski) (04/30/91)

shap@shasta.Stanford.EDU (shap) writes:

>3. If conformance to the standard is important, then the obvious
>choices are

>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits

>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits?  If so, how would 32- and 64- bit programs interact?

It is necessary to have 8, 16, and 32-bit data types, in order to be able
to read data from files.  I would suggest NOT specifying a size for the int
data type; this is supposed to be the most efficient integral data type
for a particular machine and compiler.

A lot of programs rely on the fact that nearly all C implementations
have a 32-bit long int.

I would suggest:

short	16 bits
long	32 bits
long long	64 bits
int	UNSPECIFIED
void *	UNSPECIFIED

This is patterned after ANSI floating-point extensions to accommodate
an extended format (i.e. "long double").

How about "double long", because it really is two longs?  (Donning flame
retardant suit).

What then would a 128-bit integer be?  long long long?  double double long?
long double long?  quadruple long?  How about the Fortran method: int*8?

Another proposal would be to invent a new word, like "big", "large",
"whopper", "humongous", "giant", "extra", "super", "grand", "huge",
"jumbo", "broad", "vast", "wide", "fat", "hefty", etc.

Whatever choice is made, there should be ready extensions to 128 and 256
bit integers, as well as 128 and 256-bit floating point numbers.

P.S. By the way, is there a word for floating-point numbers analogous
to what "int" is for integers?
-- 
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
Internet: turk@apple.com
Applelink: TURK
UUCP: sun!apple!turk

john@sco.COM (John R. MacMillan) (04/30/91)

shap <shap@shasta.Stanford.EDU> writes:
|Several companies have announced or are known to be working on 64 bit
|architectures. It seems to me that 64 bit architectures are going to
|introduce some nontrivial problems with C and C++ code.

In a past life I did a fair amount of work with C on a 64 bit
architecture, the C/VE compiler on NOS/VE.  C/VE was a pre-ANSI
compiler, but many of the comments I think still apply.

C/VE had 64 bit ints and longs, 32 bit shorts, 8 bit chars, and 48 bit
pointers (as an added bonus, the null pointer was not all bits zero,
but that's another headache entirely; ask me how many times I've
wanted to strangle a programmer who used bzero() to clear structures
that have pointers in them).
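
For the curious, the failure mode looks like this (a sketch with
invented names, using ANSI memset in place of bzero):

	#include <stddef.h>
	#include <string.h>

	struct node { struct node *next; int val; };

	void init_node(struct node *n)
	{
		memset(n, 0, sizeof *n);	/* all-bits-zero... */
		n->next = NULL;			/* ...but only this is a null
						   pointer on such machines */
	}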

|I want to start a discussion going on this topic.  Here are some seed
|questions:
|
|1. Do the C/C++ standards need to be extended to cover 64-bit
|environments, or are they adequate as-is?

The C standard certainly is; it's obvious a lot of effort went into
making sure it would be.

|2. If a trade-off has to be made between compliance and ease of
|porting, what's the better way to go?

I don't think there's any reason to trade off compliance with
standards; if you want to trade off ease of porting versus exploiting
the full power of the architecture that's another question.  The
answer is going to be different for different people.

If you make the only 64 bit data type be ``long long'' or some such,
it will make life much easier on porters, but most of the things you
port then won't take advantage of having 64 bit data types...

|3. If conformance to the standard is important, then the obvious
|choices are
|
|	short	16 bits
|	int	32 bits
|	long	64 bits
|	void *	64 bits

This isn't really a conformance issue.  The idea is to make the
machine's natural types fit the C types well.  However, if the
architecture can support all these sizes easily, then for ease of
porting it would be nice to have all of the common sizes available.
Many (often poorly written) C programs depend on, say, short being 16
bits, and having a 16 bit data type is the easiest way to port such
programs.

One problem with 32 bit ints and 64 bit pointers is that a lot of
(bad) code assumes you can put a pointer into an int, and vice versa.

As an aside, using things like ``typedef int int32'' is not always
the answer, especially if you don't tell the porter (or are
inconsistent about) whether this should be an integer data type that is
_exactly_ 32 bits or _at least_ 32 bits.
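
One convention that helps (the names below are invented) is to put the
promise in the name and verify the ``exactly'' case on each port:

	#include <limits.h>

	typedef long	int_least32;	/* at least 32 bits; any ANSI long works */
	typedef int	int_exact32;	/* exactly 32 bits; check per port */

	/* crude compile-time check: the array size turns negative, and
	   the compiler complains, if the assumption is wrong here */
	typedef char	check32[(sizeof(int_exact32) * CHAR_BIT == 32) ? 1 : -1];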

|How bad is it if sizeof(int) != sizeof(long)?

The C/VE compiler had sizeof(int) == sizeof(long) so I can't comment
on that one in particular, but...

|4. Would it be better not to have a 32-bit data type and to make int
|be 64 bits?  If so, how would 32- and 64- bit programs interact?

...there is a lot of badly written code out there, and no matter what
you do, you'll break somebody's bogus assumptions.  In particular a
lot of code makes assumptions about pointer sizes, whether they'll fit
in ints, and whether you can treat them like ints.

Expect porting to 64 bit architectures to be work, because it is.

sarima@tdatirv.UUCP (Stanley Friesen) (04/30/91)

In article <295@dumbcat.sf.ca.us> marc@dumbcat.sf.ca.us (Marco S Hyman) writes:
>I hope not -- at least not without some other way of describing a 16-bit
>value.  64-bit architecture machines will still have to communicate with other
>machines that do support 16-bit values.  Swapping bytes between big and little
>endian machines is bad enough. Think about the overhead of converting a pair
>of bytes to a 16-bit value.  

Binary communication between machines should use XDR *not* C structures.

As for how to do it, see below.

>Hmmm.  How would such a processor communicate with hardware devices requiring
>16-bit I/O?  How would a structure that maps an external device's registers be
>coded if the registers are 16-bits wide?  If there is a way to do these things
>then a 16-bit wide data type is probably necessary.

Well, two separate points.

First, if the machine does not have 16-bit operations in its instruction set,
then 16-bit I/O registers will probably be wired into 32-bit pseudo-words
(with the high-order 16-bits wired into the bit-bucket).  Remember, the lack
of 16-bit instructions would mean that 16-bit bus transfers are not possible.

Secondly, there are two possible implementations of this sort of thing.

First, C has this concept called bit fields, which can be of any size you
want - so if you *must* have a 16-bit quantity, use a 16-bit bit field.
(This also answers the previous objection - if you *must* be machine
dependent, at least make it stand out).


Alternatively, for the I/O register case, just send two sequential bytes.
(Unless the machine does not have byte addressing - which is quite possible
in a 64-bit super-computer).


Remember, binary compatibility between machines is a very shaky proposition
at best, so it is better not to depend on it at all.
-- 
---------------
uunet!tdatirv!sarima				(Stanley Friesen)

wmm@world.std.com (William M Miller) (04/30/91)

bhoughto@pima.intel.com (Blair P. Houghton) writes:
> The suggested choices are:
>
>         short   <the shortest integer the user should handle; >= 8 bits>

Actually, ANSI requires at least 16 bits for shorts (see SHRT_MIN and
SHRT_MAX in <limits.h>, X3.159-1989 2.2.4.2.1).

-- William M. Miller, Glockenspiel, Ltd.
   wmm@world.std.com

sarima@tdatirv.UUCP (Stanley Friesen) (04/30/91)

In article <1991Apr29.140256.27605@cbnewsh.att.com> warren@cbnewsh.att.com (warren.a.montgomery) writes:
<There are probably a lot of hidden assumptions in programs that
<sizeof(short)==2 and sizeof(long)==4 at this point, so any
<assignment of meanings on a 64 bit machine 

True, but such code is likely to cause pain at any time anyway.  Too many
currently *existing* machines violate this assumption.

<  I don't even know if it is possible in C to write a
<declaration guaranteed to produce an integer of a specific size
<(in bits or bytes) in a machine-independent way.

Well you can come rather close, though you need to do some special stuff
to do arithmetic on them:

struct sized {
	signed long value:SIZE;
};

will give you SIZE bits on almost all machines.
[To do arithmetic you must explicitly access the value member].

Remember: if *bits* are important, use bitfields.

<There are lots of ways a programmer may want to declare an integer:
<
<1	Any convenient and reasonable size.

Declare this an int.

<2	Any convenient size large enough to represent X.

Use either short or long depending on whether 16 bits is sufficient or not.
[Note that a minimum requirement of a certain number of bits *is* portable,
short's must be at least 16 bits in size, though they may be larger].	

<3	The smallest convenient size large enough to represent X.

Harder to do portably.  I would suggest using a typedef and a machine-specific
header file for each architecture.

<4	Exactly Y bits or bytes long.

The only portable way to do this is bitfields.
[On a *given* machine you may use a basic integer type if it is the right size].

<5	Exactly the same size as datatype Z
<
<  2, 3, and 5 aren't directly expressible in C, but people
<frequently mean this and use hidden assumptions about how big
<short and long are in order to do it.

But 2. *is* possible.  You can assume that a short is *at* *least* 16 bits,
and a long is *at* *least* 32 bits.  You just may get more than you absolutely
need.  (And of course char is at least 8 bits).

Just as long as you do not assume that any type is exactly some size you
are safe.
-- 
---------------
uunet!tdatirv!sarima				(Stanley Friesen)

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/01/91)

mvm@jedi.harris-atd.com (Matt Mahoney) writes:

>When I need to specify bits, I'm usually forced to make the 
>following assumptions:

>	char	8 bits
>	short	16 bits
>	long	32 bits

>since this is true on most machines.  Anything else would probably break
>a lot of code.

What would break if you did:

	char	8 bits
	short	16 bits
	int	32 bits
	long	64 bits

where any pointer or pointer difference would fit in 32 bits?
-- 
 /***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu   |  Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks  |  people; CRIMINALS do!!  /
 \***************************************************************************/

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/01/91)

john@sco.COM (John R. MacMillan) writes:

>One problem with 32 bit ints and 64 bit pointers is that a lot of
>(bad) code assumes you can put a pointer into an int, and vice versa.

>...there is a lot of badly written code out there, and no matter what
>you do, you'll break somebody's bogus assumptions.  In particular a
>lot of code makes assumptions about pointer sizes, whether they'll fit
>in ints, and whether you can treat them like ints.

For how long should we keep porting code, especially BAD CODE?  This sounds
a lot like school systems that keep moving failing students up each year
and we know what that results in.

IMHO, no code older than 8 years should be permitted to be ported and if
it is found to be "bad" code then it must have been written more than 8
years ago.
-- 
 /***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu   |  Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks  |  people; CRIMINALS do!!  /
 \***************************************************************************/

peter@llama.trl.OZ.AU (Peter Richardson - NSSS) (05/01/91)

In article <4068@inews.intel.com>, bhoughto@pima.intel.com (Blair P.
Houghton) writes:
> In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
> >It seems to me that 64 bit architectures are going to
> >introduce some nontrivial problems with C and C++ code.
> 
> Nope.  They're trivial if you didn't assume 32-bit architecture,
> which you shouldn't, since many computers still have 36, 16, 8,
> etc.-bit architectures.
> 
> >I want to start a discussion going on this topic.  Here are some seed
> >questions:

Hmmm.  As I understand it, if you want to write truly portable code, you
should never make assumptions about sizeof any integral types. We have
a local header file on each machine type defining Byte, DoubleByte etc.
For example, on sun4:

typedef unsigned char Byte;             // always a single byte
typedef unsigned short DoubleByte;      // always two bytes
typedef unsigned long QuadByte;         // always four bytes

If you want to use an int, use an int. If you want to use a 16 bit
quantity, use a DoubleByte. To port to new machines, just change the
header file. Purists may prefer "Octet" to "Byte".

It is up to the platform/compiler implementation to determine the
appropriate sizeof integral types. It should not be part of the
language.

> 
> Poorly, if at all.  Data transmission among architectures
> with different bus sizes is a hairy issue of much aspirin.
> The only portable method is to store and transmit the data
> in some width-independent form, like morse-code or a text
> format (yes, ASCII is 7 or 8 bits wide, but it's a
> _common_ form of data-width hack, and if all else fails,
> you can hire people to read and type it into your
> machine).

There is an international standard for doing this, called Abstract
Syntax Notation One (ASN.1), defined by ISO. It is based on the CCITT
standards X.208 and X.209 (I think). It is more powerful than either of
the proprietary standards XDR or NDR. Compilers are used to translate
ASN.1 data descriptions into C/C++ structures, and produce
encoder/decoders.  

---
Peter Richardson 		         Phone: +61 3 541-6342
Telecom Research Laboratories 	           Fax: +61 3 544-2362
					 Snail: GPO Box 249, Clayton, 3168
                                                Victoria, Australia
Internet: p.richardson@trl.oz.au
X400: g=peter s=richardson ou=trl o=telecom prmd=telecom006 admd=telememo c=au

bhoughto@pima.intel.com (Blair P. Houghton) (05/01/91)

In article <1991Apr30.140217.7065@world.std.com> wmm@world.std.com (William M Miller) writes:
>bhoughto@pima.intel.com (Blair P. Houghton) writes:
>>         short   <the shortest integer the user should handle; >= 8 bits>
>Actually, ANSI requires at least 16 bits for shorts (see SHRT_MIN and
>SHRT_MAX in <limits.h>, X3.159-1989 2.2.4.2.1).

I had my brain in packed-BCD mode that day, apparently :-/...
The minimum sizes for the four integer types are:

	char	8 bits
	short	16
	int	16
	long	32

Other than that, one need only ensure that short, int, and
long are multiples of the size of a char, e.g., 9, 27, 36, 36.

				--Blair
				  "Hike!"

dlw@odi.com (Dan Weinreb) (05/01/91)

In article <1991May1.012242.26211@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:

   From: phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN)
   Date: 1 May 91 01:22:42 GMT
   References: <168@shasta.Stanford.EDU> <1991Apr29.211937.10865@sco.COM>
   Organization: University of Illinois at Urbana

   IMHO, no code older than 8 years should be permitted to be ported 

I see from the Organization field in your mail header that you're from
a university.

turk@Apple.COM (Ken "Turk" Turkowski) (05/01/91)

jac@gandalf.llnl.gov (James A. Crotinger) writes:


>  Anyone want to comment on experiences with Crays? I believe the C
>compilers have sizeof(int) = sizeof(short) = sizeof(long) == 46 or 64
>bits, depending on a compile time flag (Crays can do 46 bit integer
>arithmetic using the vectorizing floating point processors, so that is
>the default). 

>  Binary communications between Crays and other computers is something
>I haven't done, mostly because Cray doesn't support IEEE floating point.

Crays are a pain.  Structures are padded to 64 bits, so you waste a lot
of memory in arrays if your structures only have 16 bits.

I thought the number was 48 bits, but you might be right.  I seem to recall
that logical operations couldn't work with 64 bits, but arithmetic operations
could, or vice versa.

If you want to write out a number between 8 and 64 bits, you need to break
it into bytes.  As long as you're doing this, you may as well write them
in the proper endian order.

It's not too bad to write a cray FP to IEEE converter.  One of the basic
rules of machine independent I/O, though, is to not write structures out
directly, but rather to go through a procedure for each element.  Same goes
for input.
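
Such a per-element procedure can be tiny.  A sketch (the function name
and the big-endian byte order are my choices):

	#include <stdio.h>

	/* write the low 32 bits of v, most significant byte first */
	int put32be(unsigned long v, FILE *fp)
	{
		if (putc((int)((v >> 24) & 0xFF), fp) == EOF) return EOF;
		if (putc((int)((v >> 16) & 0xFF), fp) == EOF) return EOF;
		if (putc((int)((v >>  8) & 0xFF), fp) == EOF) return EOF;
		return putc((int)(v & 0xFF), fp) == EOF ? EOF : 0;
	}
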
-- 
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
Internet: turk@apple.com
Applelink: TURK
UUCP: sun!apple!turk

jerry@talos.npri.com (Jerry Gitomer) (05/01/91)

mvm@jedi.harris-atd.com (Matt Mahoney) writes:

:When I need to specify bits, I'm usually forced to make the 
:following assumptions:

:	char	8 bits
:	short	16 bits
:	long	32 bits

:since this is true on most machines.  Anything else would probably break
:a lot of code.

	We are caught between the rock (wanting to take *full* advantage of
	the new wider register and memory data path machines 64 bits today,
	128 tomorrow, and 256 the day after tomorrow) and the hard place 
	(wanting to preserve code that was handcrafted to the
	idiosyncrasies of prior generation hardware).  Our choices are
	simple -- throw the performance of the new machines out the window
	or spend the time and money required to fix up the code so that it
	complies with the standard.  (Sure, standards aren't cast in
	concrete, but their life expectancy exceeds that of today's typical
	computer system).

	IMHO (now isn't that an arrogant phrase? :-) ) it is better to fix
	up the offending programs now than to do it later.  I say this
	because I presume that salaries will continue to increase, which
	will make it more expensive to fix things up later, and because
	staff turnover leads to a decrease over time in knowledge of the
	offending programs.

-- 
Jerry Gitomer at National Political Resources Inc, Alexandria, VA USA
I am apolitical, have no resources, and speak only for myself.
Ma Bell (703)683-9090  (UUCP:  ...uunet!uupsi!npri6!jerry )

cadsi@ccad.uiowa.edu (CADSI) (05/01/91)

From article <13261@goofy.Apple.COM>, by turk@Apple.COM (Ken "Turk" Turkowski):
> jac@gandalf.llnl.gov (James A. Crotinger) writes:
> 
> 
>>  Anyone want to comment on experiences with Crays? I believe the C
>>compilers have sizeof(int) = sizeof(short) = sizeof(long) == 46 or 64
>>bits, depending on a compile time flag (Crays can do 46 bit integer
>>arithmetic using the vectorizing floating point processors, so that is
>>the default). 
> 
>>  Binary communications between Crays and other computers is something
>>I haven't done, mostly because Cray doesn't support IEEE floating point.
> 
> Crays are a pain.  Structures are padded to 64 bits, so you waste a lot
> of memory in arrays if your structures only have 16 bits.
> 
> I thought the number was 48 bits, but you might be right.  I seem to recall
> that logical operations couldn't work with 64 bits, but arithmetic operations
> could, or vice versa.
>...

To further these comments, note that on a CRAY a (char *) is a bit-packed
structure itself.  In particular, the lower 48 bits are a pointer to
the word in memory where the character exists.  The top 3 bits of the
64-bit pointer are an index to the byte of the word of the actual
character.  Thus, an (int *) cannot be conveniently cast to a (char *).
I hate it when that happens.

|----------------------------------------------------------------------------|
|Tom Hite					|  The views expressed by me |
|Manager, Product development			|  are mine, not necessarily |
|CADSI (Computer Aided Design Software Inc.	|  the views of CADSI.       |
|----------------------------------------------------------------------------|

rli@buster.stafford.tx.us (Buster Irby) (05/02/91)

turk@Apple.COM (Ken "Turk" Turkowski) writes:

>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>to read data from files.  I would suggest NOT specifying a size for the int
>data type; this is supposed to be the most efficient integral data type
>for a particular machine and compiler.

You assume a lot about the data in the file.  Is it stored in a specific
processor format (a la Intel vs. Motorola)?  My experience has been that
binary data is not portable anyway.

daves@ex.heurikon.com (Dave Scidmore) (05/02/91)

In article <6157@trantor.harris-atd.com> mvm@jedi.UUCP (Matt Mahoney) writes:
>When I need to specify bits, I'm usually forced to make the 
>following assumptions:
>
>	char	8 bits
>	short	16 bits
>	long	32 bits
>
>since this is true on most machines.  Anything else would probably break
>a lot of code.

I'm surprised nobody has mentioned that the real solution to this kind
of portability problem is for the original programmer to use the
definitions in "types.h" that tell you how big chars, shorts, ints,
and longs are. I know that a lot of existing code does not take advantage
of the ability to use typedefs or #defines to alter the size of key
variables or adjust for numbers of bits for each, but doing so would
help prevent the kinds of portability problems mentioned. I always urge
people when writing their own code to be aware of size dependent code
and either use the existing "types.h", or make their own and use it to
make such code more portable. This won't help you when porting someone
else's machine-dependent (and dare I say poorly written) code, but the next
guy who has to port your code will have an easier time of it.
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com

daves@ex.heurikon.com (Dave Scidmore) (05/02/91)

warren@cbnewsh.att.com (warren.a.montgomery) writes:
>There are probably a lot of hidden assumptions in programs that
>sizeof(short)==2 and sizeof(long)==4 at this point, so any
>assignment of meanings on a 64 bit machine (unless you invent type
>"long long" for 64 bits and leave long and short at 32 and 16)
>will cause some pain.
>
>All this raises questions about C/C++'s philosophy of integer type
>specification, which no doubt dates from the days when all
>machines had only 2 and 4 byte integers.

I doubt such a day ever existed. The early K&R book referred to machines
with 9 bit chars and 36 bit integers when describing the number of bits
in various quantities.

>...  How are things like this best
>expressed in C or C++?
>Do other languages provide better overall solutions?

The important factor you have to keep in mind when dealing with C
or C++ is that they both attempt to keep the match between C expressions
and machine instructions as close as possible. This results in more
efficient code and faster execution, one of the strong points of C. In
this respect C has often been referred to as a glorified assembly language
since it allows you to use high-level language constructs, but still allows
you to use types and constructs which have a high degree of correlation
with the machine instruction set.

This approach places the burden for portability on the programmer rather than
on the compiler with the end result that the compiler need not waste time
dealing with variables that are of a size that is not a good match to
types supported by the underlying machine. I have yet to see a piece
of code that depends on sizeof(built in type) == X that could not be written
to be portable across almost any machine (except where no machine type is
large enough to hold the range of values needed). One approach is to isolate
dependencies using typedefs like:

typedef long	Int32Bit;
typedef short	Int16Bit;
typedef char	Int8Bit;

then use the appropriate type for the situation. This approach fails when
working on machines that are not based on 8 bit bytes, or that have odd
size integers. In this case using the header file "types.h" can help since
it tells you the size of all built in types. You can then use preprocessor
directives to choose the appropriate size for the situation.
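
For instance, a sketch of that selection done with <limits.h> (the
typedef name is invented):

	#include <limits.h>

	#if UINT_MAX == 0xFFFFFFFF
	typedef unsigned int	UInt32Bit;
	#elif ULONG_MAX == 0xFFFFFFFF
	typedef unsigned long	UInt32Bit;
	#else
	#error no 32-bit unsigned type on this machine
	#endif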

The answer to your question is that other languages do provide a "better"
overall solution, if the performance of the compiled code is not critical.
But where the need to translate high level language statements into as few
and as fast a set of machine language instructions as possible is the goal,
C offers a reasonable approach, and maybe even the "best solution" for that
situation.
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com

daves@ex.heurikon.com (Dave Scidmore) (05/02/91)

turk@Apple.COM (Ken "Turk" Turkowski) writes:

>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>to read data from files.

Bad practice!!!! This works fine if the one reading the data is always the
same as the one writing it.  But if, as you imply, these data sizes matter
because a machine must read files written by another machine, then storing
structures as binary images can result in severe problems.  Byte
ordering is a more fundamental problem than the size of types when trying to
read and write binary images.

The world of microcomputers is divided into two camps: those who store the least
significant byte of a 16 or 32 bit quantity in the lowest memory location
(as in Intel processors), and those which store the most significant byte in
the lowest memory location (as in Motorola processors). Given the value
0x12345678 each stores 32 bit quantities as follows:

Memory address:	0	1	2	3
0x78	0x56	0x34	0x12	LSB in lowest address (Intel convention)
0x12	0x34	0x56	0x78	MSB in lowest address (Motorola convention)

From this you can see that if a big-endian processor writes a 32 bit int
into memory a little endian processor will read it back backwards. The end
result is the need to swap all bytes within 16 and 32 bit quantities. When
reading structures from a file, this can only be done if you know the size
of each component of the structure and swap it after reading. In general
this is usually sufficient reason not to store binary images of data in files
unless you can assure that the machine reading the values will always follow
the same size and byte ordering convention.
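
The swap itself is short.  A sketch (the name is invented; it assumes
the value fits in 32 bits):

	/* reverse the byte order of a 32-bit quantity */
	unsigned long swap32(unsigned long x)
	{
		return ((x & 0x000000FFUL) << 24) |
		       ((x & 0x0000FF00UL) <<  8) |
		       ((x & 0x00FF0000UL) >>  8) |
		       ((x & 0xFF000000UL) >> 24);
	}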

>I would suggest NOT specifying a size for the int
>data type; this is supposed to be the most efficient integral data type
>for a particular machine and compiler.

I agree.

>A lot of programs rely on the fact that nearly all C implementations
>have a 32-bit long int.

The precedent for non-32 bit ints predates the microprocessor and anyone who
writes a supposedly "portable" program assuming long ints are 32 bits is
creating a complex and difficult mess for the person who has to port the
code to untangle.

>I would suggest:
>
>short	16 bits
>long	32 bits
>long long	64 bits
>int	UNSPECIFIED
>void *	UNSPECIFIED

I would suggest not making assumptions about the size of built in types
when writing portable code.
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/02/91)

dlw@odi.com (Dan Weinreb) writes:

>In article <1991May1.012242.26211@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:

>   From: phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN)
>   Date: 1 May 91 01:22:42 GMT
>   References: <168@shasta.Stanford.EDU> <1991Apr29.211937.10865@sco.COM>
>   Organization: University of Illinois at Urbana

>   IMHO, no code older than 8 years should be permitted to be ported 

>I see from the Organization field in your mail header that you're from
>a university.

I see from the domain in your return address you are from a commercial
organization.

SO............
-- 
 /***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu   |  Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks  |  people; CRIMINALS do!!  /
 \***************************************************************************/

john@sco.COM (John R. MacMillan) (05/02/91)

|>4. Would it be better not to have a 32-bit data type and to make int
|>be 64 bits?  If so, how would 32- and 64- bit programs interact?
|
|It is necessary to have 8, 16, and 32-bit data types, in order to be able
|to read data from files.

It's not necessary, but it does make it easier.

|I would suggest NOT specifying a size for the int
|data type; this is supposed to be the most efficient integral data type
|for a particular machine and compiler.
|
|[...]
|
|short	16 bits
|long	32 bits
|long long	64 bits
|int	UNSPECIFIED
|void *	UNSPECIFIED

Problem with this is that I don't think sizeof(long) is allowed to be
less than sizeof(int), which would constrain your ints to 32 bits.

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/02/91)

jerry@talos.npri.com (Jerry Gitomer) writes:

>	IMHO (now isn't that an arrogant phrase? :-) ) it is better to fix
>	up the offending programs now than to do it later.  I say this
>	because I presume that salaries will continue to increase, which
>	will make it more expensive to fix things up later, and because
>	staff turnover leads to a decrease over time in knowledge of the
>	offending programs.

Also, what about staff MORALE?  I don't know about a lot of other programmers,
but I for one would be much happier at the very least cleaning up old code and
making it work right (or better yet, rewriting it from scratch the way it SHOULD
have been done in the first place) than perpetuating bad designs of the past
which translate into inefficiencies of the future.

But if you are interested in getting things converted quickly, then just make
TWO models of the compiler.  You then assign a special flag name to make the
compiler work in such a way that it will avoid breaking old code.  Programs
written AFTER the compiler is ready should be required to compile WITHOUT that
flag.  You could call the flag "-badcode".  I think that might be a fair
compromise between getting all the old bad code to work now under the new
machine, while still promoting better programming practices for the present
and future (and flagging examples of what NOT to do).
-- 
 /***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu   |  Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks  |  people; CRIMINALS do!!  /
 \***************************************************************************/

john@sco.COM (John R. MacMillan) (05/02/91)

|For how long should we keep porting code, especially BAD CODE?  This sounds
|a lot like school systems that keep moving failing students up each year
|and we know what that results in.

Whether or not it's a good idea, people will keep porting bad code as
long as other people are willing to pay for it.

Users are really buying Spiffo 6.3.  They don't care how it's written;
they just like it.  So Monster Hardware, in an effort to boost sales,
wants to be able to sell their boxes as a platform for running Spiffo
6.3.  They don't care how it's written, they just want it.  The core
of Spiffo 6.3 is from Spiffo 1.0, and was written 10 years ago by 5
programmers who are now either VPs or no longer there, and who never
considered it would have to run on an MH1800 where the chars are 11
bits and ints are 33.

It happens.  Honest.  I suspect many of us know a Spiffo or two.

|IMHO, no code older than 8 years should be permitted to be ported and if
|it is found to be "bad" code then it must have been written more than 8
|years ago.

The first part simply won't happen if there's demand, and I'm not sure
I understand the second part.

cadsi@ccad.uiowa.edu (CADSI) (05/02/91)

From article <1991May01.172042.5214@buster.stafford.tx.us>, by rli@buster.stafford.tx.us (Buster Irby):
> turk@Apple.COM (Ken "Turk" Turkowski) writes:
> 
>>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>>to read data from files.  I would suggest NOT specifying a size for the int
>>data type; this is supposed to be the most efficient integral data type
>>for a particular machine and compiler.
> 
> You assume a lot about the data in the file.  Is it stored in a specific
> processor format (ala Intel vs Motorolla)?  My experience has been that
> binary data is not portable anyway.

Binary isn't in general portable.  However, using proper typedefs in
a class one can move binary read/write classes from box to box.  I think
the solution to the whole issue of sizeof(whatever) is to simply assume
nothing.  Always typedef.  It isn't that difficult, and code I've written this
way runs on things ranging from DOS machines to CRAY's COS (and UNICOS) without
code (barring the typedef header files) changes.

|----------------------------------------------------------------------------|
|Tom Hite					|  The views expressed by me |
|Manager, Product development			|  are mine, not necessarily |
|CADSI (Computer Aided Design Software Inc.	|  the views of CADSI.       |
|----------------------------------------------------------------------------|

steve@taumet.com (Stephen Clamage) (05/02/91)

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:

>But if you are interested in getting things converted quickly, then just make
>TWO models of the compiler.  You then assign a special flag name to make the
>compiler work in such a way that it will avoid breaking old code.

Some compilers already do this.  For example, our compilers (available from
Oregon Software) have a "compatibility" switch which allows compilation of
old-style code, including old-style preprocessing.  In this mode, ANSI
features (including ANSI preprocessing and function prototypes) are still
available, allowing gradual migration of programs from old-style to ANSI C.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

bright@nazgul.UUCP (Walter Bright) (05/03/91)

In article <12563@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
/In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
/>How bad is it if sizeof(int) != sizeof(long)?
/It does cause problems---there is always software that makes invalid
/assumptions---but typically long-vs-int problems, while rampant, are
/also easily fixed.

The most aggravating problem we have is that it seems we (Zortech) are the only
compiler for which:
	char
	signed char
	unsigned char
are all distinct types! For example,
	char *p;
	signed char *ps;
	unsigned char *pu;
	p = pu;			/* syntax error */
	p = ps;			/* syntax error */
It seems we are the only compiler that flags these as errors.
A related example is:
	int i;
	short *ps;
	ps = &i;		/* syntax error, for 16 bit compilers too */
I think a lot of people are in for a surprise when they port to 32 bit
compilers... :-)

jac@gandalf.llnl.gov (James A. Crotinger) (05/03/91)

cadsi@ccad.uiowa.edu (CADSI) writes:
> To further these comments, note that on a CRAY a (char *) is a bit-packed
> structure itself.  In particular, the lower 48 bits are a pointer to
> the word in memory where the character exists.  The top 3 bits of the
> 64-bit pointer are an index to the byte of the word of the actual
> character.  Thus, an (int *) cannot be conveniently cast to a (char *).
> I hate it when that happens.

  And in Fortran, they also pack the string length into the word
(only 32 bits are used for the word address). And the really awful thing
is that this packing scheme is different on the X/Y-MP class machines
than on the CRAY-2 machines. (I think the format for char *'s is different
as well). Ack! 

  As an aside, I wrote a nifty C++ class called FCD (Fortran character
descriptor) which makes calling Fortran routines from C++ much cleaner.
You prototype the fortran functions as taking FCD's rather than char *'s:

   void FORTRANFUNCTION( int &num, FCD string ); 

Then you just call it with your char *:

   main()
   {
      int num = 3;
      char * foo = "Hello";
      FORTRANFUNCTION( num, foo + 1 ); // Would break if this had been prototyped
				     // to take a char *. 
   }

The FCD class has an inline constructor which converts the char * to an
FCD. Much nicer than the hoops you have to jump through when using C.

Unfortunately, the mechanism used by Sun (and other UNIXes?) for passing
string lengths in Fortran is completely different. Rather than passing
a structure containing the pointer and the length, they push the lengths
onto the stack after pushing all the other args. This makes writing 
portable C++ code which interacts with Fortran libraries a pain.
 
  Jim
--
-----------------------------------------------------------------------------
James A. Crotinger     Lawrence Livermore Natl Lab // The above views 
jac@moonshine.llnl.gov P.O. Box 808;  L-630    \\ // are mine and are not 
(415) 422-0259         Livermore CA  94550      \\/ necessarily those of LLNL

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/03/91)

steve@taumet.com (Stephen Clamage) writes:

>Some compilers already do this.  For example, our compilers (available from
>Oregon Software) have a "compatibility" switch which allows compilation of
>old-style code, including old-style preprocessing.  In this mode, ANSI
>features (including ANSI preprocessing and function prototypes) are still
>available, allowing gradual migration of programs from old-style to ANSI C.

Good.

But is the rate of "gradual migration" catching up with, or falling behind,
all the porting of old-style?  From the sounds of some responses, it is
falling behind fast.
-- 
 /***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu   |  Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks  |  people; CRIMINALS do!!  /
 \***************************************************************************/

shap@shasta.Stanford.EDU (shap) (05/03/91)

In article <13229@goofy.Apple.COM> turk@Apple.COM (Ken "Turk" Turkowski) writes:
>I would suggest:
>
>short	16 bits
>long	32 bits
>long long	64 bits
>int	UNSPECIFIED
>void *	UNSPECIFIED
>
>This is patterned after ANSI floating-point extensions to accommodate
>an extended format (i.e. "long double").
>
>Another proposal would be to invent a new word, like "big", "large",
>"whopper", "humongous", "giant", "extra", "super", "grand", "huge",
>"jumbo", "broad", "vast", "wide", "fat", "hefty", etc.
>
>Whatever choice is made, there should be ready extensions to 128 and 256
>bit integers, as well as 128 and 256-bit floating point numbers.

Actually, that's what you did.  The 'long long' data type does not
conform to the ANSI standard.

The advantage to the approach
	short		16
	int		32
	long		32
	long long	64

is that fewer datatypes change size (this approach leaves only
pointers changing), and the code could conceivably have the same
integer sizes in 32- and 64-bit mode.

But isn't ANSI conformance a requirement?

shap@shasta.Stanford.EDU (shap) (05/03/91)

In article <4068@inews.intel.com> bhoughto@pima.intel.com (Blair P. Houghton) writes:
>In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>
>>2. If a trade-off has to be made between compliance and ease of
>>porting, what's the better way to go?
>
>If you're compliant, you're portable.

While I happen to agree with this sentiment, there is an argument that X
hundred million lines of C code can't be wrong.  The problem with
theology is that it's not commercially viable.

Reactions?

Jonathan

shap@shasta.Stanford.EDU (shap) (05/03/91)

In article <1991May01.172042.5214@buster.stafford.tx.us> rli@buster.stafford.tx.us writes:
>turk@Apple.COM (Ken "Turk" Turkowski) writes:
>
>>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>>to read data from files.  I would suggest NOT specifying a size for the int
>>data type; this is supposed to be the most efficient integral data type
>>for a particular machine and compiler.
>
>You assume a lot about the data in the file.  Is it stored in a specific
>processor format (a la Intel vs. Motorola)?  My experience has been that
>binary data is not portable anyway.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/03/91)

In article <1991May1.023356.8048@trl.oz.au>, peter@llama.trl.OZ.AU (Peter Richardson - NSSS) writes:
> Hmmm.  As I understand it, if you want to write truly portable code, you
> should never make assumptions about sizeof any integral types. We have
> a local header file on each machine type defining Byte, DoubleByte etc.
> For example, on sun4:
> 
> typedef unsigned char Byte;             // always a single byte
> typedef unsigned short DoubleByte;      // always two bytes
> typedef unsigned long QuadByte;         // always four bytes
> 
> If you want to use an int, use an int. If you want to use a 16 bit
> quantity, use a DoubleByte. To port to new machines, just change the
> header file. Purists may prefer "Octet" to "Byte".

Sorry.  You have just made a non-portable assumption, namely that there
*is* an integral type which holds an octet and that there *is* an
integral type which holds two octets, and so on.  If you want
"at least 8 bits", then use {,{,un}signed} char, and if you want
"at least 16 bits", then use {,unsigned} short.  The ANSI standard
guarantees those.  There is no need to introduce your own private
names for them.  If you want "exactly 8 bits" or "exactly 16 bits",
you have no reason to expect that such types will exist.
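
If exact widths really are required, the honest thing is to test for them
rather than assume them.  A sketch using <limits.h>, which ANSI guarantees:

	#include <limits.h>

	#if CHAR_BIT != 8
	#error "this code requires 8-bit chars"
	#endif
	#if USHRT_MAX != 65535
	#error "this code requires 16-bit shorts"
	#endif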

I am greatly disappointed that C++, having added so much to C, has not
added something like int(Low,High) to the language, which would stand
for the "most efficient" available integral type in which both Low and
High were representable.  The ANSI C committee were right not to add
such a construct to C, because their charter was to standardise, not
innovate.
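
In the meantime the selection can be done by hand, if less elegantly.  A
sketch for one particular range, 0 to 100000, picking the smallest
standard type that can represent it (the name count_t is arbitrary):

	#include <limits.h>

	#if SHRT_MAX >= 100000
	typedef short count_t;
	#elif INT_MAX >= 100000
	typedef int count_t;	/* true on 32-bit implementations */
	#else
	typedef long count_t;	/* ANSI guarantees LONG_MAX >= 2147483647 */
	#endif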

An anecdote which may be of value to people designing a C compiler for
64-bit machines:  there was a UK company who built their own micro-coded
machine, and wanted to put UNIX on it.  Their C compiler initially had
char=8, short=16, int=32, long=64 bits, sizeof (int) == sizeof (char*).
They changed their compiler in a hurry, so that long=32 bits; it was
less effort to do that than to fix all the BSD sources.  It also turned
out to have market value in that many of their customers had been just
as sloppy with VAX code.

sizeof (char) is fixed at 1.  However, it should be quite easy to set up
a compiler so that the user can specify (whether in an environment variable
or in the command line) what sizes to use for short, int, long, and (if you
want to imitate GCC) long long.  Something like
	setenv CINTSIZES "16,32,32,64"	# short,int,long,long long.
The system header files would have to use the default types (call them
__int, __short, and so on) so that only one set of system libraries would
be needed, and this means that using CINTSIZES to set the sizes to something
other than the defaults would make the compiler non-conforming.
Make the defaults the best you can, but if you let people override the
defaults then the task of porting sloppy code will be eased.  Other vendors
have found the hard way that customers have sloppy code.

-- 
Bad things happen periodically, and they're going to happen to somebody.
Why not you?					-- John Allen Paulos.

rli@buster.stafford.tx.us (Buster Irby) (05/03/91)

cadsi@ccad.uiowa.edu (CADSI) writes:

>From article <1991May01.172042.5214@buster.stafford.tx.us>, by rli@buster.stafford.tx.us (Buster Irby):
>> turk@Apple.COM (Ken "Turk" Turkowski) writes:
>> 
>>>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>>>to read data from files.  I would suggest NOT specifying a size for the int
>> 
>> You assume a lot about the data in the file.  Is it stored in a specific
>> processor format (a la Intel vs Motorola)?  My experience has been that
>> binary data is not portable anyway.

>Binary isn't in general portable.  However, using proper typedefs in
>a class one can move binary read/write classes from box to box.  I think
>the solution to the whole issue of sizeof(whatever) is to simply assume
>nothing.  Always typedef.  It isn't that difficult, and code I've done this
>way runs on things ranging from DOS machines to CRAY's COS (and UNICOS)
>without code changes (barring the typedef header files).

What kind of typedef would you use to swap the high and low bytes
in a 16-bit value?  An Intel or BIG_ENDIAN machine stores the
bytes in reverse order, while a Motorola or LITTLE_ENDIAN
machine stores the bytes in normal order (high to low).  There is
no way to fix this short of reading the file one byte at a time
and stuffing them into the right place.  The point I was trying
to make is that reading and writing a data file has absolutely
nothing to do with data types.  As we have already seen, there
are a lot of different machine types that support C, and as far
as I know, all of them are capable of reading binary files,
independent of data type differences.

The only sane way to deal with this issue is to never assume
anything about the SIZE or the ORDERING of data types, which is
basically what the C standard says.  It tells you that a long >=
int >= short >= char.  It says nothing about actual size or byte
ordering within a data type.  
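
For what it's worth, the byte-at-a-time approach costs only a few lines.
A sketch (mine, with error handling omitted) for a file format that
defines a 16-bit field as high byte first:

	#include <stdio.h>

	unsigned int get16(FILE *fp)	/* file stores high byte first */
	{
	    unsigned int hi = (unsigned int)getc(fp);
	    unsigned int lo = (unsigned int)getc(fp);
	    return (hi << 8) | lo;	/* host byte order never matters */
	}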

Another trap I ran across recently is the ordering of bit
fields.  On AT&T 3B2 machines the first bit defined is the high
order bit, but on Intel 386 machines the first bit defined is the
low order bit.  This means that anyone who attempts to write this
data to a file and transport it to another platform is in for a
surprise, they are not compatible.  Again, the C standard says
nothing about bit ordering, and in fact cautions you against
making such assumptions.
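
A sketch of the trap (the struct is legal everywhere; only its layout
differs):

	struct flags {
	    unsigned version : 4;	/* first bit field declared */
	    unsigned type    : 4;
	};
	/* On the 3B2 'version' lands in the high-order four bits of the
	   unit, on the 386 in the low-order four.  fwrite() the struct
	   on one machine and fread() it on the other, and the two
	   fields quietly trade places. */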

cadsi@ccad.uiowa.edu (CADSI) (05/03/91)

From article <471@heurikon.heurikon.com>, by daves@ex.heurikon.com (Dave Scidmore):
> warren@cbnewsh.att.com (warren.a.montgomery) writes:
>>There are probably a lot of hidden assumptions in programs that
>>sizeof(short)==2 and sizeof(long)==4 at this point, so any
>>assignment of meanings on a 64 bit machine (unless you invent type
>>"long long" for 64 bits and leave long and short at 32 and 16)
>>will cause some pain.
>>
>>All this raises questions about C/C++'s philosophy of integer type
>>specification, which no doubt dates from the days where all
>>machines had only 2 and 4 byte integers.
> 
> I doubt such a day ever existed. The early K&R book referred to machines
> with 9 bit chars and 36 bit integers when describing the number of bits
> in various quantities.
> 
>>...  How are things like this best
>>expressed in C or C++?
>>Do other languages provide better overall solutions?

I add another fly to the ointment.  What about bit-sliced machines?
At the low level, they really are 2- and 4-bit processors.  At the
high level (most human interaction levels) they are more like
32- and/or 64-bit machines.  So, is this always a separate issue from
the compiler?  I'm a little ignorant about the middle level of these
machines, which is why I ask.

|----------------------------------------------------------------------------|
|Tom Hite					|  The views expressed by me |
|Manager, Product development			|  are mine, not necessarily |
|CADSI (Computer Aided Design Software Inc.	|  the views of CADSI.       |
|----------------------------------------------------------------------------|

sarima@tdatirv.UUCP (Stanley Friesen) (05/04/91)

In article <699@taumet.com> steve@taumet.com (Stephen Clamage) writes:
>phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
>Some compilers already do this.  For example, our compilers (available from
>Oregon Software) have a "compatibility" switch which allows compilation of
>old-style code, including old-style preprocessing.

So does the System V Release 4 compiler.  It has *three* switch settings:
an 'old code' mode that compiles old-style C but complains about non-ANSI
code; a hybrid mode that mostly compiles ANSI C but still accepts non-
conflicting old-style constructs; and a *strict* ANSI mode that enforces
pure ANSI compliance.
-- 
---------------
uunet!tdatirv!sarima				(Stanley Friesen)

marc@dumbcat.sf.ca.us (Marco S Hyman) (05/05/91)

In article <226@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:
 > Binary communication between machines should use XDR *not* C structures.

Agreed.  (But please, not ASN.1/BER as someone else suggested :-)

 > First, C has this concept called bit fields, which can be of any size you
 > want - so if you *must* have a 16-bit quantity, use a 16-bit bit field.
 > (This also answers the previous objection - if you *must* be machine
 > dependent, at least make it stand out).

And in article <230@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:
 > Remember if *bits* are important use bitfields

The problem with using bit fields is that they might not be able to overlap
the <allocation unit> boundary (ARM, pp. 184-185: remember, we're talking C++
here) and that a programmer can't tell from looking at the code whether the
first field is placed in the high-order bits or the low-order bits.  I suppose
the former problem doesn't count because we want _smaller_ fields, and since
we are being machine-dependent by using bit fields anyway, the latter is not
much of a problem either.

These days I tend to think in terms of embedded processors with lots of
low-level peripheral device interface code.  A '64-bit only' processor
would not be a good match in this environment.
-- 
// marc
// home: marc@dumbcat.sf.ca.us		pacbell!dumbcat!marc
// work: marc@ascend.com		uunet!aria!marc

cadsi@ccad.uiowa.edu (CADSI) (05/06/91)

From article <1991May03.120455.158@buster.stafford.tx.us>, by rli@buster.stafford.tx.us (Buster Irby):
> cadsi@ccad.uiowa.edu (CADSI) writes:
> 
>>Binary isn't in general portable.  However, using proper typedefs in
>>a class one can move binary read/write classes from box to box.  I think
>>the solution to the whole issue of sizeof(whatever) is to simply assume
>>nothing.  Always typedef.  It isn't that difficult, and code I've done this
>>way runs on things ranging from DOS machines to CRAY's COS (and UNICOS)
>>without code changes (barring the typedef header files).
> 
> What kind of typedef would you use to swap the high and low bytes
> in a 16-bit value?  An Intel or BIG_ENDIAN machine stores the
> bytes in reverse order, while a Motorola or LITTLE_ENDIAN
> machine stores the bytes in normal order (high to low).  There is
> no way to fix this short of reading the file one byte at a time
> and stuffing them into the right place.  The point I was trying
> to make is that reading and writing a data file has absolutely
> nothing to do with data types.  As we have already seen, there
> are a lot of different machine types that support C, and as far
> as I know, all of them are capable of reading binary files,
> independent of data type differences.

The big/little endian problem is handled via swab calls.
AND, how do we know when to do this????  We just store
the needed info in a header record.
This header is read in block fashion and typedef'ed to the structure we need.
From there, that's all we need to continue.
The typedefs have to do with internal structures, NOT simple int, char
and those type things, except for the BYTE type.
Last but not least, you'll inevitably ask how we portably read that header.
Well, we store 'magic number' info and mess with things till the numbers are
read correctly.  Incidentally, that magic number also gives indications
of code revision level and therefore what will and won't be possible.
C'mon, this is not that difficult to comprehend.  You want portable files???
Make 'em yourself.  'C' gives you all the toys you need to do this.
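
For the curious, a sketch of the magic-number trick (the constants are
arbitrary, and the 32-bit long is itself an assumption the header has to
record):

	#include <stdio.h>

	#define MAGIC 0x01020304L

	int need_swap(FILE *fp)	/* 0: native order, 1: swap, -1: not ours */
	{
	    long magic;

	    if (fread(&magic, sizeof magic, 1, fp) != 1)
	        return -1;
	    if (magic == MAGIC)
	        return 0;	/* writer had our byte order */
	    if (magic == 0x04030201L)
	        return 1;	/* fully reversed: swab every field */
	    return -1;	/* unknown revision, or not our file */
	}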

[other stuff deleted - reference above]

|----------------------------------------------------------------------------|
|Tom Hite					|  The views expressed by me |
|Manager, Product development			|  are mine, not necessarily |
|CADSI (Computer Aided Design Software Inc.	|  the views of CADSI.       |
|----------------------------------------------------------------------------|

shap@shasta.Stanford.EDU (shap) (05/07/91)

In article <312@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>
>The most aggravating problem we have is it seems we (Zortech) are the only
>compiler for which:
>	char
>	signed char
>	unsigned char
>are all distinct types! For example,

One of us is misreading the standard.  If it's me maybe someone can set
me straight.

My reading of the ANSI standard was that 'char' was identical to
exactly one of 'unsigned char' or 'signed char', but that the choice
was up to the compiler.

Flagging suspect cases with a warning flag would be a useful feature,
but doing it unconditionally is a pain in the butt.

Have I misunderstood something in the standard?


Jonathan

turk@Apple.COM (Ken Turkowski) (05/07/91)

rli@buster.stafford.tx.us (Buster Irby) writes:
>An Intel or BIG_ENDIAN machine stores the
>bytes in reverse order, while a Motorola or LITTLE_ENDIAN
>machine stores the bytes in normal order (high to low).

You've got this perfectly reversed.  Motorola is a BIG_ENDIAN machine,
and Intel is a LITTLE_ENDIAN machine.  Additionally, there is no
such thing as "normal".

-- 
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
Internet: turk@apple.com
Applelink: TURK
UUCP: sun!apple!turk

turk@Apple.COM (Ken Turkowski) (05/07/91)

cadsi@ccad.uiowa.edu (CADSI) writes:
>The big/little endian problem is handled via swab calls.
>AND, how do we know when to do this????  We just store
>the needed info in a header record.
>This header is read in block fashion and typedef'ed to the structure we need.

What type of header do you suggest? This should be able to record
the ordering of shorts, longs, floats, and doubles, and might need
to specify floating-point format.
-- 
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
Internet: turk@apple.com
Applelink: TURK
UUCP: sun!apple!turk

mike@taumet.com (Michael S. Ball) (05/07/91)

In article <184@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>In article <312@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>>
>>The most aggravating problem we have is it seems we (Zortech) are the only
>>compiler for which:
>>	char
>>	signed char
>>	unsigned char
>>are all distinct types! For example,

>Have I misunderstood something in the standard?

You have.  The standard states that char is a distinct type which has
the same characteristics as either signed char or unsigned char.
This appears to be a fairly late change, and earlier versions of the
standard read as you assume.

It's a bit strange, but makes very little difference in C.  Carrying
the same definition over to C++ can make things very strange indeed.
The only solution given this definition is to make all characters
which are really characters "char", and use signed char and unsigned
char only for short arithmetic values.
-- 
Michael S. Ball			mike@taumet.com
TauMetric Corporation		(619)697-7607

chip@tct.com (Chip Salzenberg) (05/07/91)

According to bright@nazgul.UUCP (Walter Bright):
>The most aggravating problem we have is it seems we (Zortech) are the only
>compiler for which:
>	char
>	signed char
>	unsigned char
>are all distinct types!

G++ 1.39 knows that they're all distinct.  It's free, too.
-- 
Brand X Industries Sentient and Semi-Sentient Being Resources Department:
        Because Sometimes, "Human" Just Isn't Good Enough [tm]
     Chip Salzenberg         <chip@tct.com>, <uunet!pdn!tct!chip>

daves@ex.heurikon.com (Dave Scidmore) (05/08/91)

>rli@buster.stafford.tx.us (Buster Irby) writes:
>>An Intel or BIG_ENDIAN machine stores the
>>bytes in reverse order, while a Motorola or LITTLE_ENDIAN
>>machine stores the bytes in normal order (high to low).

In article <13357@goofy.Apple.COM> turk@Apple.COM (Ken Turkowski) writes:
>You've got this perfectly reversed.  Motorola is a BIG_ENDIAN machine,
>and Intel is a LITTLE_ENDIAN machine.  Additionally, there is no
>such thing as "normal".

Exactly.  Both conventions have good points and bad points.  The "normal"
Motorola convention starts to look a little less normal when dynamic
bus sizing is required, in which case all byte data comes over the most
significant data bus lines.  In addition, big-endian machines have values
that expand in size from the most significant location (i.e. two-byte
values at location X have their least significant byte in a different
location than four-byte values at the same location).  On the other hand,
the little-endian convention looks odd when you do a dump of memory and
try to find long words within a series of bytes.

In the end you can make either convention look "normal" by how you draw
the picture of it.  For example, which of these is normal for storing the
value 0x12345678?

Motorola:	Location	0	1	2	3
		Bytes		0x12	0x34	0x56	0x78

Intel:		Location	3	2	1	0
		Bytes		0x12	0x34	0x56	0x78
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com

daves@ex.heurikon.com (Dave Scidmore) (05/08/91)

In article <312@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>The most aggravating problem we have is it seems we (Zortech) are the only
>compiler for which:
>	char
>	signed char
>	unsigned char
>are all distinct types! For example,
>	char *p;
>	signed char *ps;
>	unsigned char *pu;
>	p = pu;			/* syntax error */
>	p = ps;			/* syntax error */
>It seems we are the only compiler that flags these as errors.
>A related example is:
>	int i;
>	short *ps;
>	*ps = &i;		/* syntax error, for 16 bit compilers too */
>I think a lot of people are in for a surprise when they port to 32 bit
>compilers... :-)

A lot of compiler writers think they are doing you a favor by not reporting
"trivial" errors. I would much rather have to say:
	char *p;
	signed char *ps;
	unsigned char *pu;
	p = (char*) pu;
	p = (char*) ps;
than to have some subtle error result when I unknowingly perform such an
assignment and some subtle side effect occurs.  For example, suppose the
above "unsigned char" pointer were used to read a buffer of characters
where the end of the buffer was indicated by the value 0x80.  Code written
initially on a nine-bit-per-byte machine would have no problem, but scan
that buffer through "p" on an eight-bit-per-byte machine and you will
never find a (signed) char equal to 0x80, so a bug will rear its ugly
head.  If the compiler warns that "p = pu" mixes incompatible types, I can
look at the statement and possibly remember that this won't work on an
eight-bit-per-byte machine.  If it is what I intended all along, I
*should* have used a cast so the next guy who looks at the code knows
that the conversion is intentional.
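
The bug in miniature (a sketch, assuming eight-bit two's-complement
chars):

	signed char buf[64];	/* filled elsewhere; 0x80 marks the end */
	signed char *sp = buf;

	while (*sp != 0x80)	/* always true: *sp promotes to an int in
				   -128..127, while 0x80 is the int 128 */
	    sp++;		/* so the scan runs off the end of buf */
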
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com

cadsi@ccad.uiowa.edu (CADSI) (05/08/91)

From article <2826D9E7.407F@tct.com>, by chip@tct.com (Chip Salzenberg):
> According to bright@nazgul.UUCP (Walter Bright):
>>The most aggravating problem we have is it seems we (Zortech) are the only
>>compiler for which:
>>	char
>>	signed char
>>	unsigned char
>>are all distinct types!
> 
> G++ 1.39 knows that they're all distinct.  It's free, too.

BUT, you're not supposed to make money on what you compile with G++,
right?  I thought I read that before.

|----------------------------------------------------------------------------|
|Tom Hite					|  The views expressed by me |
|Manager, Product development			|  are mine, not necessarily |
|CADSI (Computer Aided Design Software Inc.	|  the views of CADSI.       |
|----------------------------------------------------------------------------|

comeau@ditka.Chicago.COM (Greg Comeau) (05/09/91)

In article <312@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>char *p; signed char *ps; unsigned char *pu;
>p = pu;/* syntax error */ 	p = ps;	/* syntax error */
>It seems we [Zortech] are the only compiler that flags these as errors.

I'm quite sure we (Comeau C++) do too.

>A related example is: int i; short *ps; *ps = &i;

We'd flag this too (even if you left off the *! ;-}).

- Greg
-- 
	 Comeau Computing, 91-34 120th Street, Richmond Hill, NY, 11418
                          Producers of Comeau C++ 2.1
          Here:attmail.com!csanta!comeau / BIX:comeau / CIS:72331,3421
                     Voice:718-945-0009 / Fax:718-441-2310

comeau@ditka.Chicago.COM (Greg Comeau) (05/09/91)

In article <184@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>In article <312@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>>The most aggravating problem we have is it seems we (Zortech) are the only
>>compiler for which:  char, signed char, unsigned char
>>are all distinct types! For example,
>My reading of the ANSI standard was that 'char' was identically equally
>to exactly one of 'unsigned char' or 'signed char', but that the choice
>was up to the compiler.

That may be so, but that doesn't say that it's not another type in and
of itself even though we don't know its sign or which one "it's like".

In any event, ANSI C tells us "The three types char, signed char, and
unsigned char are collectively called the ``character types''", while the
ARM tells us "Plain char, signed char, and unsigned char are three
distinct types."

Remember also we've got the "unsigned preserving" vs. "value preserving"
mess to contend with (of which I currently forget which C++ follows).

- Greg
-- 
	 Comeau Computing, 91-34 120th Street, Richmond Hill, NY, 11418
                          Producers of Comeau C++ 2.1
          Here:attmail.com!csanta!comeau / BIX:comeau / CIS:72331,3421
                     Voice:718-945-0009 / Fax:718-441-2310

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/09/91)

In article <470@heurikon.heurikon.com>, daves@ex.heurikon.com (Dave Scidmore) writes:
> I'm surprised nobody has mentioned that the real solution to this kind
> of portability problem is for the original programmer to use the
> definitions in "types.h" that tell you how big chars, shorts, ints,
> and longs are.

Why be surprised?  I'm using an Encore Multimax running 4.3BSD, and
on this machine there _isn't_ any types.h file.  We have a copy of GCC,
so we _have_ access to the ANSI file, but that's <limits.h>, not "types.h".

-- 
Bad things happen periodically, and they're going to happen to somebody.
Why not you?					-- John Allen Paulos.

conor@lion.inmos.co.uk (Conor O'Neill) (05/09/91)

In article <179@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>While I happen to agree with this sentiment, there is an argument that X
>hundred million lines of C code can't be wrong.  The problem with
>theology is that it's not commercially viable.


Or did you mean "C hundred million lines of X code"...

(Apparently X even has such nasties buried inside it as expecting
successive calls to malloc to return increasing addresses, forcing the
heap to grow upwards.  Or so I'm informed.)
---
Conor O'Neill, Software Group, INMOS Ltd., UK.
UK: conor@inmos.co.uk		US: conor@inmos.com
"It's state-of-the-art" "But it doesn't work!" "That is the state-of-the-art".

chip@tct.com (Chip Salzenberg) (05/09/91)

According to cadsi@ccad.uiowa.edu (CADSI):
>From article <2826D9E7.407F@tct.com>, by chip@tct.com (Chip Salzenberg):
>> G++ 1.39 ... is free, too.
>
>BUT, you're not supposed to make money on what you compile with G++,
>right?  I thought I read that before.

Wrong.  You can make money off a program compiled with G++.  In fact,
we have our very first C++ project in beta test right now, and it's
all compiled with G++.

The trick is that you can't distribute a program containing GNU code
(like the GNU C++ library "libg++") unless you also distribute source
code to the whole program linked therewith.  We avoid this problem by
avoiding libg++.

A new version of the GNU public license may soon make a special case
of libraries, by requiring distribution of library source code, but
not the source code of the program(s) linked with the library.

Stay tuned in gcc.announce and gnu.misc.discuss.
-- 
Brand X Industries Sentient and Semi-Sentient Being Resources Department:
        Because Sometimes, "Human" Just Isn't Good Enough [tm]
     Chip Salzenberg         <chip@tct.com>, <uunet!pdn!tct!chip>

david@ap542.uucp (05/15/91)

-cadsi@ccad.uiowa.edu (CADSI) writes:
->by rli@buster.stafford.tx.us (Buster Irby):
->> cadsi@ccad.uiowa.edu (CADSI) writes:
->> 
->>>Binary isn't in general portable.  However, using proper typedefs in
->>>a class one can move binary read/write classes from box to box.  
->> 
->> What kind of typedef would you use to swap the high and low bytes
->> in a 16-bit value?  An Intel or BIG_ENDIAN machine stores the
->> bytes in reverse order, while a Motorola or LITTLE_ENDIAN
->> machine stores the bytes in normal order (high to low).  There is
->> no way to fix this short of reading the file one byte at a time
->> and stuffing them into the right place.  
->
->The big/little endian problem is handled via swab calls.
->AND, how do we know when to do this????  We just store
->the needed info in a header record.

NO, NO, NO!  The way to do this is to use XDR.
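
A sketch of what that looks like with Sun's RPC library (wasteful as
written, since a real program would create the stream once, not per
value):

	#include <stdio.h>
	#include <rpc/rpc.h>	/* XDR, xdrstdio_create, xdr_u_short */

	int put_u_short(FILE *fp, u_short val)
	{
	    XDR xdrs;
	    bool_t ok;

	    xdrstdio_create(&xdrs, fp, XDR_ENCODE);
	    ok = xdr_u_short(&xdrs, &val);	/* XDR defines the wire format */
	    xdr_destroy(&xdrs);
	    return ok;	/* 1 on success; the reader calls the same
			   routine on an XDR_DECODE stream */
	}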

**========================================================================**
David E. Smyth
david%ap542@ztivax.siemens.com	<- I can get your mail, but our mailer
				   is broken.  But I can post.  You figure...
**========================================================================**

ray@philmtl.philips.ca (Ray Dunn) (05/16/91)

In referenced article, bhoughto@pima.intel.com (Blair P. Houghton) writes:
>>2. If a trade-off has to be made between compliance and ease of
>>porting, what's the better way to go?
>
>If you're compliant, you're portable.

This is like saying that if the syntax is correct then the semantics must
be too; although, indeed, there is no need to trade compliance for
portability.

There are dangers clearly visible though.  All you can say is that your
compliant program has an excellent chance of compiling on another system
running a compliant compiler, not that it will necessarily work correctly
with the new parameters plugged in.

You only know a program is portable *after* you've tested it on another
system.  How many do that during the initial development cycle?

Porting is not an issue that goes away by writing compliant code - it may
in fact *hide* some of the problems.

If I wanted to be controversial, I might say that 'C's supposed
"portability" is a loaded cannon.  Bugs caused by transferring a piece of
software to another system will continue to exist, even in compliant
software.  Prior to "portable" 'C', porting problems were *expected*,
visible, and handled accordingly.

Will developers still assume these bugs to be likely, and handle
verification accordingly, or will they be lulled by "it compiled first
time" into thinking that the portability issue has been taken into account
up front, and treat it with less attention than it deserves?
-- 
Ray Dunn.                    | UUCP: ray@philmtl.philips.ca
Philips Electronics Ltd.     |       ..!{uunet|philapd|philabs}!philmtl!ray
600 Dr Frederik Philips Blvd | TEL : (514) 744-8987  (Phonemail)
St Laurent. Quebec.  H4M 2S9 | FAX : (514) 744-9550  TLX: 05-824090

jeh@cmkrnl.uucp (05/16/91)

In article <521@heurikon.heurikon.com>, daves@ex.heurikon.com (Dave Scidmore)
 writes:
> [...]
> On the other hand the little
> endian convention looks odd when you do a dump of memory and try to find long
> words within a series of bytes.

Yup.  But this can be solved.  The VAX is a little-endian machine, and VMS
utilities address [ahem] this problem by always showing hex contents with
increasing addresses going from right to left across the page.  Since
significance of the bytes (actually the nibbles) increases with increasing
addresses, this looks perfectly correct... the most significant nibble goes on
the left, just the way you'd "naturally" write it.  For example, the value 1
stored in 32 bits gets displayed as 00000001.  

If you get a hex-plus-Ascii dump, such as is produced by DUMP (for files) or
ANALYZE/SYSTEM (lets you look at live memory), the hex goes from right to left,
and the ascii from left to right, like this: 

SDA> ex 200;60
0130011A 0120011B 0130011E 0110011F  ......0... ...0.     00000200
01200107 02300510 04310216 04210218  ..!...1...0... .     00000210
01100103 01100104 01200105 01200106  .. ... .........     00000220
44412107 01100100 01100101 01100102  .............!AD     00000230
4B202020 20444121 44412106 42582321  !#XB.!AD!AD    K     00000240
00524553 55525055 53434558 454E5245  ERNEXECSUPRUSER.     00000250

In the last row, the string "EXEC" is at address 253, and the last byte on the
line, 25F, contains hex 00.  In the first row, the word (16 bits) at location
204 contains hex value 11E; if you address the same location as a longword, you
get the value 0130011E.  

This looks completely bizarre at first, but once you get used to it (a few
minutes or so for most folks) it makes perfect sense.  

The VAX is consistent in bit numbering too:  The least significant bit of a 
byte is called bit 0, and when you draw bit maps of bytes or larger items,
you always put the lsb on the right.  

	--- Jamie Hanrahan, Kernel Mode Consulting, San Diego CA
Chair, VMS Internals Working Group, U.S. DECUS VAX Systems SIG 
Internet:  jeh@dcs.simpact.com, hanrahan@eisner.decus.org, or jeh@crash.cts.com
Uucp:  ...{crash,scubed,decwrl}!simpact!cmkrnl!jeh

augliere@bbn.com (Reed Augliere) (05/18/91)

I've just installed the Oregon C++ compiler on a VaxStation 3100
running VMS 5-3.1 and am trying to get it to work with MOTIF 1.1
(DEC MOTIF Developer's Kit).  I'm getting all kinds of compile errors
which are apparently a consequence of conflicts between C++ keywords
(like "new" and "class") and identifiers in certain VMS header files, and
with what is apparently a VMS-specific datatype:  "externalref".

Has anybody tried to use the Oregon C++ compiler using MOTIF on VMS?
If so, have you encountered this problem and figured out a solution?

Any help will be greatly appreciated.


Reed Augliere
raugliere@bbn.com

bhoughto@pima.intel.com (Blair P. Houghton) (05/18/91)

In article <1991May15.190016.21817@philmtl.philips.ca> ray@philmtl.philips.ca (Ray Dunn) writes:
>In referenced article, bhoughto@pima.intel.com (Blair P. Houghton) writes:
>>If you're compliant, you're portable.
>
>You only know a program is portable *after* you've tested it on another
>system.  How many do that during the initial development cycle?

Well, I do, several times, to the point of working for a while on
one platform, moving to another, testing, working there for a while,
moving to a third, and so on.  It helps to aim for three targets.

>If I wanted to be controversial, I might say that 'C's supposed
>"portability" is a loaded cannon.  Bugs caused by transferring a piece of
>software to another system will continue to exist, even in compliant
>software.  Prior to "portable" 'C', porting problems were *expected*,
>visible, and handled accordingly.

Now they're bugs in the compiler, not just "issues of
implementation."

If you're using any C construct that produces different
behavior on disparate, conforming implementations, then
either one of those implementations is not conforming or
you are not using ANSI C, but rather have used some sort of
extension or relied on some sort of unspecified behavior,
and therefore your program is not compliant.

>Will developers still assume these bugs to be likely, and handle
>verification accordingly, or will they be lulled by "it compiled first
>time" into thinking that the portability issue has been taken into account
>up front, and treat it with less attention than it deserves?

Your question is all but naive.  I still get valid
enhancement requests on code that's several years old,
which means that I failed to design it to suit the needs of
my customer, which means it's buggy.  Routines that have
been compiled and/or run thousands of times in real-world
situations come up wanting.  Nobody sane assumes that
anything is right the first time (though one may determine
that the probability of failure is low enough to make an
immediate release feasible).

				--Blair
				  "I'm going to put all of this
				   on video and hawk it on cable
				   teevee in the middle of the
				   night while wearing pilled
				   polyester and smiling a lot."

ray@philmtl.philips.ca (Ray Dunn) (05/21/91)

In referenced article, bhoughto@pima.intel.com (Blair P. Houghton) writes:
>In referenced article, ray@philmtl.philips.ca (Ray Dunn) writes:
>>Prior to "portable" 'C', porting problems were *expected*,
>>visible, and handled accordingly.
>
>Now they're bugs in the compiler, not just "issues of
>implementation."

No - now they're "issues of *system dependencies*".

>>Will developers still assume these bugs to be likely, and handle
>>verification accordingly, or will they be lulled by "it compiled first
>>time" into thinking that the portability issue has been taken into account
>>up front, and treat it with less attention than it deserves?
>
>Your question is all but naive.

Only if you ignore the fact, which you seem to do, that many of the issues
of portability in the real world are created by differences in system
hardware, operating systems, and file management facilities.  This is true,
for example, of nearly all software which has a tightly coupled user
interface, or which is forced to process system-specific non-ASCII-stream
data files, or to interface with multi-tasking facilities.  Even
differences in floating-point handling can create major pains in the neck.

There's more to portability than 'C' conformity.
-- 
Ray Dunn.                    | UUCP: ray@philmtl.philips.ca
Philips Electronics Ltd.     |       ..!{uunet|philapd|philabs}!philmtl!ray
600 Dr Frederik Philips Blvd | TEL : (514) 744-8987  (Phonemail)
St Laurent. Quebec.  H4M 2S9 | FAX : (514) 744-9550  TLX: 05-824090
-- 
Ray Dunn.                    | UUCP: ray@philmtl.philips.ca
Philips Electronics Ltd.     |       ..!{uunet|philapd|philabs}!philmtl!ray
600 Dr Frederik Philips Blvd | TEL : (514) 744-8987  (Phonemail)
St Laurent. Quebec.  H4M 2S9 | FAX : (514) 744-9550  TLX: 05-824090