shap@shasta.Stanford.EDU (shap) (04/27/91)
Several companies have announced or are known to be working on 64-bit architectures. It seems to me that 64-bit architectures are going to introduce some nontrivial problems with C and C++ code. I want to start a discussion on this topic. Here are some seed questions:

1. Do the C/C++ standards need to be extended to cover 64-bit environments, or are they adequate as-is?

2. If a trade-off has to be made between compliance and ease of porting, what's the better way to go?

3. If conformance to the standard is important, then the obvious choices are

	short	16 bits
	int	32 bits
	long	64 bits
	void *	64 bits

How bad is it for sizeof(int) != sizeof(long)?

4. Would it be better not to have a 32-bit data type and to make int be 64 bits? If so, how would 32- and 64-bit programs interact?

Looking forward to a lively exchange...
torek@elf.ee.lbl.gov (Chris Torek) (04/27/91)
In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>How bad is it for sizeof(int) != sizeof(long)?

This has been the case on PDP-11s for over 20 years. It does cause problems---there is always software that makes invalid assumptions---but typically long-vs-int problems, while rampant, are also easily fixed.
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain: torek@ee.lbl.gov
sarima@tdatirv.UUCP (Stanley Friesen) (04/28/91)
In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>Several companies have announced or are known to be working on 64 bit
>architectures. It seems to me that 64 bit architectures are going to
>introduce some nontrivial problems with C and C++ code.

>1. Do the C/C++ standards need to be extended to cover 64-bit
>environments, or are they adequate as-is?

They are adequate as is. They make only minimum requirements for a conforming implementation. In particular, there is no reason why you cannot have 64-bit longs (which is what I would do) or even 64-bit ints. And you could easily have 64- and 128-bit floating point types (either as 'float' and 'double' or as 'double' and 'long double' - both approaches are standard conforming).

>2. If a trade-off has to be made between compliance and ease of
>porting, what's the better way to go?

In writing a new compiler from scratch (or even mostly from scratch) there is no question: full ANSI compliance is absolutely necessary. [An ANSI compiler may itself be less portable, but it allows the application programmers to write portable code more easily]. In porting an existing compiler, do whatever seems most practical.

>3. If conformance to the standard is important, then the obvious
>choices are
>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits

OR:
	short	32 bits
	int	64 bits
	long	64 bits

OR:
	short	32 bits
	int	32 bits
	long	64 bits

Any one of the above may be the most appropriate depending on the instruction set. If there are no instructions for 16-bit quantities then using 16-bit shorts is a big loss. And if it really is set up as a hybrid 32/64-bit architecture, even the last may be useful. [For instance, the Intel 80X86 series chips are hybrid 16/32-bit architectures, so both 16- and 32-bit ints make sense].

>How bad is it for sizeof(int) != sizeof(long)?

Not particularly.
There are already millions of machines where this is true - on PC class machines running MS-DOS, sizeof(int) == sizeof(short) for most existing compilers (that is, the sizes are 16, 16, 32). [And on Bull mainframes, unless things have changed, all three are the same size].

>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits? If so, how would 32- and 64- bit programs interact?

Programs on different machines should not talk to each other in binary. [See the long, acrimonious discussions about binary I/O right here]. And as long as you use either ASCII text or XDR representation for data exchange, there is no problem.

However, I would be more likely to skip the 16-bit type than the 32-bit type. (Of course if the machine has a 16-bit add and not a 32-bit one ...).

In short, the idea is that C should translate as cleanly as possible into the most natural data types for the machine in question. This is what the ANSI committee had in mind.
--
uunet!tdatirv!sarima  (Stanley Friesen)
bhoughto@pima.intel.com (Blair P. Houghton) (04/28/91)
In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>It seems to me that 64 bit architectures are going to
>introduce some nontrivial problems with C and C++ code.

Nope. They're trivial if you didn't assume a 32-bit architecture, which you shouldn't, since many computers still have 36-, 16-, 8-, etc.-bit architectures.

>I want to start a discussion going on this topic. Here are some seed
>questions:

Here's some fertilizer (but most of you consider it that way, at any time :-) ):

>1. Do the C/C++ standards need to be extended to cover 64-bit
>environments, or are they adequate as-is?

The C standard allows all sorts of data widths, and specifies a scad of constants (#defines, in <limits.h>) to let you use these machine-specific numbers in your code, anonymously.

>2. If a trade-off has to be made between compliance and ease of
>porting, what's the better way to go?

If you're compliant, you're portable.

>3. If conformance to the standard is important, then the obvious
>choices are
>
>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits

The suggested choices are:

	short	<the shortest integer the user should handle; >= 8 bits>
	int	<the natural width of integer data on the cpu; >= a short>
	long	<the longest integer the user should handle; >= an int>
	void *	<long enough to specify any location legally addressable>

There's no reason for an int to be less than the full register-width, and no reason for an address to be limited to the register width.

An interesting side-effect of using the constants is that you never need to know the sizes of these things on your own machine; i.e., use CHAR_BIT (the number of bits in a char) and `sizeof int' (the number of chars in an int) and you'll never need to know how many bits an int contains.

>How bad is it for sizeof(int) != sizeof(long)?

It's only bad if you assume it's not true. (I confess: I peeked. I saw Chris' answer, and I'm not going to disagree.)

>4.
>Would it be better not to have a 32-bit data type and to make int
>be 64 bits? If so, how would 32- and 64- bit programs interact?

Poorly, if at all. Data transmission among architectures with different bus sizes is a hairy issue of much aspirin. The only portable method is to store and transmit the data in some width-independent form, like Morse code or a text format (yes, ASCII is 7 or 8 bits wide, but it's a _common_ form of data-width hack, and if all else fails, you can hire people to read and type it into your machine).

>Looking forward to a lively exchange...

				--Blair
				  "Did anyone NOT bring potato salad?"
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (04/29/91)
shap@shasta.Stanford.EDU (shap) writes:
>2. If a trade-off has to be made between compliance and ease of
>porting, what's the better way to go?

User selectable.

>3. If conformance to the standard is important, then the obvious
>choices are
>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits

That depends on the natural address size of the machine. If the machine uses 32 bit addresses, then (void *) should be 32 bits. I would not want my address arrays taking up more memory than is needed.

Is it really necessary that sizeof(void *) == sizeof(long)?

>How bad is it for sizeof(int) != sizeof(long).

Would not bother me as long as sizeof(int) <= sizeof(long).

>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits? If so, how would 32- and 64- bit programs interact?

Again it would depend on the machine. If the machine has both 32 bit and 64 bit operations, then do include them. If a 32 bit operation is unnatural to the machine, then don't. If it has 16 bit operations then that makes sense for short.
--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at   \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
 \***************************************************************************/
jac@gandalf.llnl.gov (James A. Crotinger) (04/29/91)
Anyone want to comment on experiences with Crays? I believe the C compilers have sizeof(int) == sizeof(short) == sizeof(long), all either 46 or 64 bits depending on a compile time flag (Crays can do 46 bit integer arithmetic using the vectorizing floating point processors, so that is the default). Binary communication between Crays and other computers is something I haven't done, mostly because Cray doesn't support IEEE floating point.

					Jim
--
-----------------------------------------------------------------------------
James A. Crotinger      Lawrence Livermore Natl Lab  //  The above views
jac@moonshine.llnl.gov  P.O. Box 808; L-630      \\  //  are mine and are not
(415) 422-0259          Livermore CA 94550        \\/    necessarily those of LLNL
campbell@redsox.bsw.com (Larry Campbell) (04/29/91)
In article <1991Apr29.050715.22968@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
->3. If conformance to the standard is important, then the obvious
->choices are
-
-> short 16 bits
-> int 32 bits
-> long 64 bits
-> void * 64 bits
-
-That depends on the natural address size of the machine. If the
-machine uses 32 bit addresses, then (void *) should be 32 bits.
-I would not want my address arrays taking up more memory than is
-needed.
-
-Is it really necessary that sizeof(void *) == sizeof(long)?
Of course not.
We're currently porting a largish (150K lines) program to a machine
on which:
short (dunno, not at work now so I can't check)
int 32 bits
long 32 bits
void * 128 bits
Thank *god* it has a fully-compliant ANSI compiler.
For extra credit: can you guess what machine this is?
--
Larry Campbell The Boston Software Works, Inc., 120 Fulton Street
campbell@redsox.bsw.com Boston, Massachusetts 02109 (USA)
mvm@jedi.harris-atd.com (Matt Mahoney) (04/29/91)
When I need to specify bits, I'm usually forced to make the following assumptions:

	char	8 bits
	short	16 bits
	long	32 bits

since this is true on most machines. Anything else would probably break a lot of code.

-------------------------------
Matt Mahoney, mvm@epg.harris.com
#include <disclaimer.h>
robertk@lotatg.lotus.com (Robert Krajewski) (04/29/91)
Actually, there's one thing that people didn't mention -- the feasibility of bigger lightweight objects. I'd assume that any processor that advertised a 64-bit architecture would be able to efficiently move around 64 bits at a time, so it would be very cheap to move 8-byte (excuse me, octet) objects around by copying.
turk@Apple.COM (Ken "Turk" Turkowski) (04/30/91)
shap@shasta.Stanford.EDU (shap) writes:
>3. If conformance to the standard is important, then the obvious
>choices are
>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits
>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits? If so, how would 32- and 64- bit programs interact?

It is necessary to have 8, 16, and 32-bit data types, in order to be able to read data from files. I would suggest NOT specifying a size for the int data type; this is supposed to be the most efficient integral data type for a particular machine and compiler. A lot of programs rely on the fact that nearly all C implementations have a 32-bit long int. I would suggest:

	short		16 bits
	long		32 bits
	long long	64 bits
	int		UNSPECIFIED
	void *		UNSPECIFIED

This is patterned after ANSI floating-point extensions to accommodate an extended format (i.e. "long double").

How about "double long", because it really is two longs? (Donning flame retardant suit). What then would a 128-bit integer be? long long long? double double long? long double long? quadruple long? How about the Fortran method: int*8? Another proposal would be to invent a new word, like "big", "large", "whopper", "humongous", "giant", "extra", "super", "grand", "huge", "jumbo", "broad", "vast", "wide", "fat", "hefty", etc.

Whatever choice is made, there should be ready extensions to 128- and 256-bit integers, as well as 128- and 256-bit floating point numbers.

P.S. By the way, is there a word for floating-point numbers analogous to what "int" is for integers?
--
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
Internet: turk@apple.com
Applelink: TURK
UUCP: sun!apple!turk
john@sco.COM (John R. MacMillan) (04/30/91)
shap <shap@shasta.Stanford.EDU> writes:
|Several companies have announced or are known to be working on 64 bit
|architectures. It seems to me that 64 bit architectures are going to
|introduce some nontrivial problems with C and C++ code.

In a past life I did a fair amount of work with C on a 64 bit architecture, the C/VE compiler on NOS/VE. C/VE was a pre-ANSI compiler, but many of the comments I think still apply.

C/VE had 64 bit ints and longs, 32 bit shorts, 8 bit chars, and 48 bit pointers (as an added bonus, the null pointer was not all bits zero, but that's another headache entirely; ask me how many times I've wanted to strangle a programmer who used bzero() to clear structures that have pointers in them).

|I want to start a discussion going on this topic. Here are some seed
|questions:
|
|1. Do the C/C++ standards need to be extended to cover 64-bit
|environments, or are they adequate as-is?

The C standard certainly is adequate; it's obvious a lot of effort went into making sure it would be.

|2. If a trade-off has to be made between compliance and ease of
|porting, what's the better way to go?

I don't think there's any reason to trade off compliance with standards; if you want to trade off ease of porting versus exploiting the full power of the architecture, that's another question. The answer is going to be different for different people. If you make the only 64 bit data type be ``long long'' or some such, it will make life much easier on porters, but most of the things you port then won't take advantage of having 64 bit data types...

|3. If conformance to the standard is important, then the obvious
|choices are
|
|	short	16 bits
|	int	32 bits
|	long	64 bits
|	void *	64 bits

This isn't really a conformance issue. The idea is to make the machine's natural types fit the C types well. However, if the architecture can support all these sizes easily, then for ease of porting it would be nice to have all of the common sizes available.
Many (often poorly written) C programs depend on, say, short being 16 bits, and having a 16 bit data type is the easiest way to port such programs. One problem with 32 bit ints and 64 bit pointers is that a lot of (bad) code assumes you can put a pointer into an int, and vice versa.

As an aside, using things like ``typedef int int32'' is not always the answer, especially if you don't tell the porter (or are inconsistent about) whether this should be an integer data type that is _exactly_ 32 bits or _at least_ 32 bits.

|How bad is it for sizeof(int) != sizeof(long).

The C/VE compiler had sizeof(int) == sizeof(long) so I can't comment on that one in particular, but...

|4. Would it be better not to have a 32-bit data type and to make int
|be 64 bits? If so, how would 32- and 64- bit programs interact?

...there is a lot of badly written code out there, and no matter what you do, you'll break somebody's bogus assumptions. In particular a lot of code makes assumptions about pointer sizes, whether they'll fit in ints, and whether you can treat them like ints. Expect porting to 64 bit architectures to be work, because it is.
wmm@world.std.com (William M Miller) (04/30/91)
bhoughto@pima.intel.com (Blair P. Houghton) writes: > The suggested choices are: > > short <the shortest integer the user should handle; >= 8 bits> Actually, ANSI requires at least 16 bits for shorts (see SHRT_MIN and SHRT_MAX in <limits.h>, X3.159-1989 2.2.4.2.1). -- William M. Miller, Glockenspiel, Ltd. wmm@world.std.com
gwyn@smoke.brl.mil (Doug Gwyn) (04/30/91)
In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
>1. Do the C/C++ standards need to be extended to cover 64-bit
>environments, or are they adequate as-is?

This question presupposes something that is not true, namely that 64-bit environments differ from current environments. In fact, I've been using 64-bit C environments for years, in addition to 16-bit and 32-bit ones, with occasional dabbling in 60-bit environments. The C standard does not presuppose any particular architecture.

>2. If a trade-off has to be made between compliance and ease of
>porting, what's the better way to go?

There is no excuse for a new C implementation to not conform to the C standard. Note that the standard allows the C implementor much flexibility when it comes to architecturally-determined choices.

>3. If conformance to the standard is important, then the obvious
>choices are
>	short	16 bits
>	int	32 bits
>	long	64 bits
>	void *	64 bits

(You seem to have also assumed that a char is 8 bits.) There is nothing particularly "obvious" about these choices; I could readily imagine many other choices that would be both standard conforming and useful.

>How bad is it for sizeof(int) != sizeof(long).

There should not be any applications that depend on int and long having the same size.

>4. Would it be better not to have a 32-bit data type and to make int
>be 64 bits? If so, how would 32- and 64- bit programs interact?

I don't know what you mean by a "32-bit program".

>Looking forward to a lively exchange...

I don't see what there is to discuss. The C standard specifies minimum ranges for the basic types, and anything beyond that is up to the implementor to decide, taking into account his customers' needs.
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/01/91)
mvm@jedi.harris-atd.com (Matt Mahoney) writes:
>When I need to specify bits, I'm usually forced to make the
>following assumptions:
>	char	8 bits
>	short	16 bits
>	long	32 bits
>since this is true on most machines. Anything else would probably break
>a lot of code.

What would break if you did:

	char	8 bits
	short	16 bits
	int	32 bits
	long	64 bits

where any pointer or pointer difference would fit in 32 bits?
--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at   \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
 \***************************************************************************/
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/01/91)
john@sco.COM (John R. MacMillan) writes:
>One problem with 32 bit ints and 64 bit pointers is that a lot of
>(bad) code assumes you can put a pointer into an int, and vice versa.
>...there is a lot of badly written code out there, and no matter what
>you do, you'll break somebody's bogus assumptions. In particular a
>lot of code makes assumptions about pointer sizes, whether they'll fit
>in ints, and whether you can treat them like ints.

For how long should we keep porting code, especially BAD CODE? This sounds a lot like school systems that keep moving failing students up each year, and we know what that results in.

IMHO, no code older than 8 years should be permitted to be ported, and if it is found to be "bad" code then it must have been written more than 8 years ago.
--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at   \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
 \***************************************************************************/
peter@llama.trl.OZ.AU (Peter Richardson - NSSS) (05/01/91)
In article <4068@inews.intel.com>, bhoughto@pima.intel.com (Blair P. Houghton) writes:
> In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
> >It seems to me that 64 bit architectures are going to
> >introduce some nontrivial problems with C and C++ code.
>
> Nope. They're trivial if you didn't assume 32-bit architecture,
> which you shouldn't, since many computers still have 36, 16, 8,
> etc.-bit architectures.

Hmmm. As I understand it, if you want to write truly portable code, you should never make assumptions about the sizeof any integral types. We have a local header file on each machine type defining Byte, DoubleByte, etc. For example, on sun4:

	typedef unsigned char  Byte;        // always a single byte
	typedef unsigned short DoubleByte;  // always two bytes
	typedef unsigned long  QuadByte;    // always four bytes

If you want to use an int, use an int. If you want to use a 16 bit quantity, use a DoubleByte. To port to new machines, just change the header file. Purists may prefer "Octet" to "Byte".

It is up to the platform/compiler implementation to determine the appropriate sizeof integral types. It should not be part of the language.

> Poorly, if at all. Data transmission among architectures
> with different bus sizes is a hairy issue of much aspirin.
> The only portable method is to store and transmit the data
> in some width-independent form, like Morse code or a text
> format (yes, ASCII is 7 or 8 bits wide, but it's a
> _common_ form of data-width hack, and if all else fails,
> you can hire people to read and type it into your machine).

There is an international standard for doing this, called Abstract Syntax Notation One (ASN.1), defined by ISO. It is based on the CCITT standards X.208 and X.209 (I think). It is more powerful than either of the proprietary standards XDR or NDR.
Compilers are used to translate ASN.1 data descriptions into C/C++ structures, and to produce encoders/decoders.
---
Peter Richardson                  Phone: +61 3 541-6342
Telecom Research Laboratories     Fax:   +61 3 544-2362
Snail: GPO Box 249, Clayton, 3168 Victoria, Australia
Internet: p.richardson@trl.oz.au
X400: g=peter s=richardson ou=trl o=telecom prmd=telecom006 admd=telememo c=au
bhoughto@pima.intel.com (Blair P. Houghton) (05/01/91)
In article <1991Apr30.140217.7065@world.std.com> wmm@world.std.com (William M Miller) writes:
>bhoughto@pima.intel.com (Blair P. Houghton) writes:
>>	short	<the shortest integer the user should handle; >= 8 bits>
>Actually, ANSI requires at least 16 bits for shorts (see SHRT_MIN and
>SHRT_MAX in <limits.h>, X3.159-1989 2.2.4.2.1).

I had my brain in packed-BCD mode that day, apparently :-/...

The minimum sizes for the four integer types are:

	char	8 bits
	short	16
	int	16
	long	32

Other than that, one need only ensure that short, int, and long are multiples of the size of a char, e.g., 9, 27, 36, 36.

				--Blair
				  "Hike!"
dlw@odi.com (Dan Weinreb) (05/01/91)
In article <1991May1.012242.26211@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
From: phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN)
Date: 1 May 91 01:22:42 GMT
References: <168@shasta.Stanford.EDU> <1991Apr29.211937.10865@sco.COM>
Organization: University of Illinois at Urbana
IMHO, no code older than 8 years should be permitted to be ported
I see from the Organization field in your mail header that you're from
a university.
jerry@talos.npri.com (Jerry Gitomer) (05/01/91)
mvm@jedi.harris-atd.com (Matt Mahoney) writes:
:When I need to specify bits, I'm usually forced to make the
:following assumptions:
: char 8 bits
: short 16 bits
: long 32 bits
:since this is true on most machines. Anything else would probably break
:a lot of code.
We are caught between the rock (wanting to take *full* advantage of
the new wider register and memory data path machines 64 bits today,
128 tomorrow, and 256 the day after tomorrow) and the hard place
(wanting to preserve code that was handcrafted to the
idiosyncracies of prior generation hardware). Our choices are
simple -- throw the performance of the new machines out the window
or spend the time and money required to fix up the code so that it
complies with the standard. (Sure, standards aren't cast in
concrete, but their life exptancy exceeds that of today's typical
computer system).
IMHO (now isn't that an arrogant phrase? :-) ) it is better to fix
up the offending programs now than to do it later. I say this
because I presume that salaries will continue to increase, which
will make it more expensive to fix things up later, and because
staff turnover leads to a decrease over time in knowledge of the
offending programs.
--
Jerry Gitomer at National Political Resources Inc, Alexandria, VA USA
I am apolitical, have no resources, and speak only for myself.
Ma Bell (703)683-9090 (UUCP: ...uunet!uupsi!npri6!jerry )
rli@buster.stafford.tx.us (Buster Irby) (05/02/91)
turk@Apple.COM (Ken "Turk" Turkowski) writes:
>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>to read data from files. I would suggest NOT specifying a size for the int
>data type; this is supposed to be the most efficient integral data type
>for a particular machine and compiler.

You assume a lot about the data in the file. Is it stored in a specific processor format (a la Intel vs Motorola)? My experience has been that binary data is not portable anyway.
gwyn@smoke.brl.mil (Doug Gwyn) (05/02/91)
In article <13229@goofy.Apple.COM> turk@Apple.COM (Ken "Turk" Turkowski) writes:
>I would suggest:
>	short		16 bits
>	long		32 bits
>	long long	64 bits
>	int		UNSPECIFIED
>	void *		UNSPECIFIED

What on Earth do you mean by "UNSPECIFIED"? An implementation MUST make a definite choice here. The C language standard already contains all the requisite specifications.

Note that a standard-conforming implementation is obliged to diagnose use of any construct such as "long long". Therefore that is a stupid extension. I guess I shouldn't be surprised, however, given that the APW C math library functions were declared as returning type "extended" rather than the type "double" required by the C standard. It didn't dawn on them, apparently, that "double" would have best been implemented as SANE extended format in the first place.
daves@ex.heurikon.com (Dave Scidmore) (05/02/91)
In article <6157@trantor.harris-atd.com> mvm@jedi.UUCP (Matt Mahoney) writes:
>When I need to specify bits, I'm usually forced to make the
>following assumptions:
>
>	char	8 bits
>	short	16 bits
>	long	32 bits
>
>since this is true on most machines. Anything else would probably break
>a lot of code.

I'm surprised nobody has mentioned that the real solution to this kind of portability problem is for the original programmer to use the definitions in "types.h" that tell you how big chars, shorts, ints, and longs are. I know that a lot of existing code does not take advantage of the ability to use typedefs or #defines to alter the size of key variables or adjust for the number of bits in each, but doing so would help prevent the kinds of portability problems mentioned.

I always urge people when writing their own code to be aware of size dependent code and either use the existing "types.h", or make their own and use it to make such code more portable. This won't help you when porting someone else's machine dependent (and dare I say poorly written) code, but the next guy who has to port your code will have an easier time of it.
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com
daves@ex.heurikon.com (Dave Scidmore) (05/02/91)
turk@Apple.COM (Ken "Turk" Turkowski) writes:
>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>to read data from files.

Bad practice!!!! This works fine if the one reading the data is always the same as the one writing it, but if you mean that these data sizes are important for having a machine read files written by another machine, then storing structures as binary images can result in severe problems.

Byte ordering is a more fundamental problem than the size of types when trying to read and write binary images. The world of microcomputers is divided into two camps: those which store the least significant byte of a 16 or 32 bit quantity in the lowest memory location (as in Intel processors), and those which store the most significant byte in the lowest memory location (as in Motorola processors). Given the value 0x12345678, each stores 32 bit quantities as follows:

	Memory address:		0    1    2    3
				0x78 0x56 0x34 0x12	LSB in lowest address (Intel convention)
				0x12 0x34 0x56 0x78	MSB in lowest address (Motorola convention)

From this you can see that if a big-endian processor writes a 32 bit int into memory, a little-endian processor will read it back backwards. The end result is the need to swap all bytes within 16 and 32 bit quantities. When reading structures from a file, this can only be done if you know the size of each component of the structure and swap it after reading. In general this is usually sufficient reason not to store binary images of data in files unless you can assure that the machine reading the values will always follow the same size and byte ordering convention.

>I would suggest NOT specifying a size for the int
>data type; this is supposed to be the most efficient integral data type
>for a particular machine and compiler.

I agree.

>A lot of programs rely on the fact that nearly all C implementations
>have a 32-bit long int.
The precedent for non-32-bit ints predates the microprocessor, and anyone who writes a supposedly "portable" program assuming long ints are 32 bits is creating a complex and difficult mess for the person who has to port the code to untangle.

>I would suggest:
>
>	short		16 bits
>	long		32 bits
>	long long	64 bits
>	int		UNSPECIFIED
>	void *		UNSPECIFIED

I would suggest not making assumptions about the size of built-in types when writing portable code.
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/02/91)
dlw@odi.com (Dan Weinreb) writes:
>In article <1991May1.012242.26211@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
>   From: phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN)
>   Date: 1 May 91 01:22:42 GMT
>   References: <168@shasta.Stanford.EDU> <1991Apr29.211937.10865@sco.COM>
>   Organization: University of Illinois at Urbana
>   IMHO, no code older than 8 years should be permitted to be ported
>I see from the Organization field in your mail header that you're from
>a university.

I see from the domain in your return address you are from a commercial organization. SO............
--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at   \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
 \***************************************************************************/
john@sco.COM (John R. MacMillan) (05/02/91)
|>4. Would it be better not to have a 32-bit data type and to make int
|>be 64 bits? If so, how would 32- and 64- bit programs interact?
|
|It is necessary to have 8, 16, and 32-bit data types, in order to be able
|to read data from files.

It's not necessary, but it does make it easier.

|I would suggest NOT specifying a size for the int
|data type; this is supposed to be the most efficient integral data type
|for a particular machine and compiler.
|
|[...]
|
|	short		16 bits
|	long		32 bits
|	long long	64 bits
|	int		UNSPECIFIED
|	void *		UNSPECIFIED

Problem with this is that I don't think sizeof(long) is allowed to be less than sizeof(int), which would constrain your ints to 32 bits.
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/02/91)
jerry@talos.npri.com (Jerry Gitomer) writes:
>IMHO (now isn't that an arrogant phrase? :-) ) it is better to fix
>up the offending programs now than to do it later.  I say this
>because I presume that salaries will continue to increase, which
>will make it more expensive to fix things up later, and because
>staff turnover leads to a decrease over time in knowledge of the
>offending programs.

Also, what about staff MORALE? I don't know about a lot of other programmers, but I for one would be much happier at the very least cleaning up old code and making it work right (or better yet rewriting it from scratch the way it SHOULD have been done in the first place) than perpetuating bad designs of the past, which translate into inefficiencies of the future.

But if you are interested in getting things converted quickly, then just make TWO models of the compiler. You then assign a special flag name to make the compiler work in such a way that it will avoid breaking old code. Programs written AFTER the compiler is ready should be required to compile WITHOUT that flag. You could call the flag "-badcode".

I think that might be a fair compromise between getting all the old bad code to work now under the new machine, while still promoting better programming practices for the present and future (and flagging examples of what NOT to do).
--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at   \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
 \***************************************************************************/
john@sco.COM (John R. MacMillan) (05/02/91)
|For how long should we keep porting code, especially BAD CODE? This sounds |a lot like school systems that keep moving failing students up each year |and we know what that results in. Whether or not it's a good idea, people will keep porting bad code as long as other people are willing to pay for it. Users are really buying Spiffo 6.3. They don't care how it's written; they just like it. So Monster Hardware, in an effort to boost sales, wants to be able to sell their boxes as a platform for running Spiffo 6.3. They don't care how it's written, they just want it. The core of Spiffo 6.3 is from Spiffo 1.0, and was written 10 years ago by 5 programmers who are now either VPs or no longer there, and who never considered it would have to run on an MH1800 where the chars are 11 bits and ints are 33. It happens. Honest. I suspect many of us know a Spiffo or two. |IMHO, no code older than 8 years should be permitted to be ported and if |it is found to be "bad" code then it must have been written more than 8 |years ago. The first part simply won't happen if there's demand, and I'm not sure I understand the second part.
jfc@athena.mit.edu (John F Carr) (05/02/91)
In article <16023@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes: >Note that a standard-conforming implementation is obliged to diagnose >use of any construct such as "long long". Therefore that is a stupid >extension. I disagree. I want a compiler that supports ANSI features, but I would rather have "long long" cause the compiler to generate 64 bit code than cause the compiler to say "error: invalid type". I think the C standard is valuable because it is a list of what is valid C, not because it also says what is not valid C. -- John Carr (jfc@athena.mit.edu)
henry@zoo.toronto.edu (Henry Spencer) (05/02/91)
In article <1991May2.033545.15051@athena.mit.edu> jfc@athena.mit.edu (John F Carr) writes: >rather have "long long" cause the compiler to generate 64 bit code than >cause the compiler to say "error: invalid type". I think the C standard is >valuable because it is a list of what is valid C, not because it also says >what is not valid C. The C standard says both. However, why do you assume that the compiler must complain *or* generate 64-bit code? ANSI C does not prevent it from doing both. The only thing the standard requires is that violations of its constraints must draw at least one complaint. -- And the bean-counter replied, | Henry Spencer @ U of Toronto Zoology "beans are more important". | henry@zoo.toronto.edu utzoo!henry
cadsi@ccad.uiowa.edu (CADSI) (05/02/91)
From article <1991May01.172042.5214@buster.stafford.tx.us>, by rli@buster.stafford.tx.us (Buster Irby):
> turk@Apple.COM (Ken "Turk" Turkowski) writes:
>
>>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>>to read data from files.  I would suggest NOT specifying a size for the int
>>data type; this is supposed to be the most efficient integral data type
>>for a particular machine and compiler.
>
> You assume a lot about the data in the file.  Is it stored in a specific
> processor format (ala Intel vs Motorola)?  My experience has been that
> binary data is not portable anyway.

Binary isn't in general portable.  However, using proper typedefs in a class one can move binary read/write classes from box to box.  I think the solution to the whole issue of sizeof(whatever) is to simply assume nothing.  Always typedef.  It isn't that difficult, and code where I've done this runs on things ranging from DOS machines to CRAY's COS (and UNICOS) without code (barring the typedef header files) changes.

|----------------------------------------------------------------------------|
|Tom Hite                                    | The views expressed by me     |
|Manager, Product development                | are mine, not necessarily     |
|CADSI (Computer Aided Design Software Inc.  | the views of CADSI.           |
|----------------------------------------------------------------------------|
steve@taumet.com (Stephen Clamage) (05/02/91)
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes: >But if you are interested in getting things converted quickly, then just make >TWO models of the compiler. You then assign a special flag name to make the >compiler work in such a way that it will avoid breaking old code. Some compilers already do this. For example, our compilers (available from Oregon Software) have a "compatibility" switch which allows compilation of old-style code, including old-style preprocessing. In this mode, ANSI features (including ANSI preprocessing and function prototypes) are still available, allowing gradual migration of programs from old-style to ANSI C. -- Steve Clamage, TauMetric Corp, steve@taumet.com
bright@nazgul.UUCP (Walter Bright) (05/03/91)
In article <12563@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
/In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:
/>How bad is it for sizeof(int) != sizeof(long).
/It does cause problems---there is always software that makes invalid
/assumptions---but typically long-vs-int problems, while rampant, are
/also easily fixed.

The most aggravating problem we have is it seems we (Zortech) are the only compiler for which:
	char
	signed char
	unsigned char
are all distinct types!  For example,
	char *p;
	signed char *ps;
	unsigned char *pu;
	p = pu;		/* error: type mismatch */
	p = ps;		/* error: type mismatch */
It seems we are the only compiler that flags these as errors.  A related example is:
	int i;
	short *ps;
	ps = &i;	/* error: type mismatch, for 16 bit compilers too */
I think a lot of people are in for a surprise when they port to 32 bit compilers...  :-)
gwyn@smoke.brl.mil (Doug Gwyn) (05/03/91)
In article <1991May2.033545.15051@athena.mit.edu> jfc@athena.mit.edu (John F Carr) writes: -In article <16023@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes: ->Note that a standard-conforming implementation is obliged to diagnose ->use of any construct such as "long long". Therefore that is a stupid ->extension. -I disagree. I want a compiler that supports ANSI features, but I would -rather have "long long" cause the compiler to generate 64 bit code than -cause the compiler to say "error: invalid type". I think the C standard is -valuable because it is a list of what is valid C, not because it also says -what is not valid C. I think you missed the point. There are numerous CONFORMING ways in which additional integer types can be added to C. "long long" is NOT one of these, and a standard-conforming implementation is OBLIGED to diagnose the use of "long long", which violates the Constraints of X3.159-1989 section 3.5.2. Therefore "long long" is not a wise way to make such an extension.
gwyn@smoke.brl.mil (Doug Gwyn) (05/03/91)
In article <1991May01.222112.13130@sco.COM> john@sco.COM (John R. MacMillan) writes: >|It is necessary to have 8, 16, and 32-bit data types, in order to be able >|to read data from files. >It's not necessary, but it does make it easier. Not even that. Assuming that for some unknown reason you're faced with reading a binary file that originated on some other system, there is a fair chance that it used a "big endian" architecture while your system is "little endian" or vice-versa. Binary data transportability is a much thornier issue than most people realize.
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/03/91)
jfc@athena.mit.edu (John F Carr) writes: >I disagree. I want a compiler that supports ANSI features, but I would >rather have "long long" cause the compiler to generate 64 bit code than >cause the compiler to say "error: invalid type". I think the C standard is >valuable because it is a list of what is valid C, not because it also says >what is not valid C. I see nothing wrong with this. You have ANSI C and you have extensions. Of course YOUR extensions and MY extensions may not be the same, and may even be mutually exclusive. For each of us to ensure our code will compile on the other's compiler, we can restrict ourselves to ANSI C. On the other hand if we can get together and make our extensions the same, we widen the domain in which our non-standard code that takes advantage of these powerful features can be used. When I am writing ANSI C, it does help to have something jump in there and complain when I go beyond the standard. I believe in GCC this is "-pedantic" or something like that. -- /***************************************************************************\ / Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at \ \ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!! / \***************************************************************************/
jfc@athena.mit.edu (John F Carr) (05/03/91)
In article <1991May2.041911.14489@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes: >However, why do you assume that the compiler >must complain *or* generate 64-bit code? ANSI C does not prevent it from >doing both. The only thing the standard requires is that violations of its >constraints must draw at least one complaint. I know that diagnostics are not required to be fatal errors, but I would be annoyed to get a warning every time I compiled code that used a nonstandard extension. I think the gcc solution is a good one: support ANSI features by default, but only print warnings for use of extensions when the user asks for them. -- John Carr (jfc@athena.mit.edu)
shap@shasta.Stanford.EDU (shap) (05/03/91)
In article <13229@goofy.Apple.COM> turk@Apple.COM (Ken "Turk" Turkowski) writes:
>I would suggest:
>
>short     16 bits
>long      32 bits
>long long 64 bits
>int       UNSPECIFIED
>void *    UNSPECIFIED
>
>This is patterned after ANSI floating-point extensions to accommodate
>an extended format (i.e. "long double").
>
>Another proposal would be to invent a new word, like "big", "large",
>"whopper", "humongous", "giant", "extra", "super", "grand", "huge",
>"jumbo", "broad", "vast", "wide", "fat", "hefty", etc.
>
>Whatever choice is made, there should be ready extensions to 128 and 256
>bit integers, as well as 128 and 256-bit floating point numbers.

Actually, that's what you did.  The 'long long' data type does not conform to the ANSI standard.

The advantage of the approach

	short     16
	int       32
	long      32
	long long 64

is that fewer data types change size (this approach leaves only pointers changing), and the code could conceivably have the same integer sizes in 32- and 64-bit mode.  But isn't ANSI conformance a requirement?
shap@shasta.Stanford.EDU (shap) (05/03/91)
In article <4068@inews.intel.com> bhoughto@pima.intel.com (Blair P. Houghton) writes: >In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes: > >>2. If a trade-off has to be made between compliance and ease of >>porting, what's the better way to go? > >If you're compliant, you're portable. While I happen to agree with this sentiment, there is an argument that X hundred million lines of C code can't be wrong. The problem with theology is that it's not commercially viable. Reactions? Jonathan
shap@shasta.Stanford.EDU (shap) (05/03/91)
In article <1991May2.033545.15051@athena.mit.edu> jfc@athena.mit.edu (John F Carr) writes:
>In article <16023@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>Note that a standard-conforming implementation is obliged to diagnose
>>use of any construct such as "long long"...
>
>I disagree.  I want a compiler that supports ANSI features, but I would
>rather have "long long" cause the compiler to generate 64 bit code than
>cause the compiler to say "error: invalid type".  I think the C standard is
>valuable because it is a list of what is valid C, not because it also says
>what is not valid C.

Fortunately, you aren't the standard.

The standard is very precise.  It does not require that the use of an extension be an error.  It does REQUIRE that the compiler issue a diagnostic.  Something like

	file.c: 64: Thanks for using long long!

would conform.  Credit for the example to Dave Prosser of AT&T.

Jonathan
shap@shasta.Stanford.EDU (shap) (05/03/91)
In article <1991May01.172042.5214@buster.stafford.tx.us> rli@buster.stafford.tx.us writes:
>turk@Apple.COM (Ken "Turk" Turkowski) writes:
>
>>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>>to read data from files.  I would suggest NOT specifying a size for the int
>>data type; this is supposed to be the most efficient integral data type
>>for a particular machine and compiler.
>
>You assume a lot about the data in the file.  Is it stored in a specific
>processor format (ala Intel vs Motorola)?  My experience has been that
>binary data is not portable anyway.
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/03/91)
In article <1991May1.023356.8048@trl.oz.au>, peter@llama.trl.OZ.AU (Peter Richardson - NSSS) writes: > Hmmm. As I understand it. if you want to write truly portable code, you > should never make assumptions about sizeof any integral types. We have > a local header file on each machine type defining Byte, DoubleByte etc. > For example, on sun4: > > typedef unsigned char Byte; // always a single byte > typedef unsigned short DoubleByte; // always two bytes > typedef unsigned long QuadByte; // always four bytes > > If you want to use an int, use an int. If you want to use a 16 bit > quantity, use a DoubleByte. To port to new machines, just change the > header file. Purists may prefer "Octet" to "Byte". Sorry. You have just made a non-portable assumption, namely that there *is* an integral type which holds an octet and that there *is* an integral type which holds two octets, and so on. If you want "at least 8 bits", then use {,{,un}signed} char, and if you want "at least 16 bits", then use {,unsigned} short. The ANSI standard guarantees those. There is no need to introduce your own private names for them. If you want "exactly 8 bits" or "exactly 16 bits", you have no reason to expect that such types will exist. I am greatly disappointed that C++, having added so much to C, has not added something like int(Low,High) to the language, which would stand for the "most efficient" available integral type in which both Low and High were representable. The ANSI C committee were right not to add such a construct to C, because their charter was to standardise, not innovate. An anecdote which may be of value to people designing a C compiler for 64-bit machines: there was a UK company who built their own micro-coded machine, and wanted to put UNIX on it. Their C compiler initially had char=8, short=16, int=32, long=64 bits, sizeof (int) == sizeof (char*). They changed their compiler in a hurry, so that long=32 bits; it was less effort to do that than to fix all the BSD sources. 
It also turned out to have market value in that many of their customers had been just as sloppy with VAX code.

sizeof (char) is fixed at 1.  However, it should be quite easy to set up a compiler so that the user can specify (whether in an environment variable or in the command line) what sizes to use for short, int, long, and (if you want to imitate GCC) long long.  Something like

	setenv CINTSIZES="16,32,32,64"	# short,int,long,long long

The system header files would have to use the default types (call them __int, __short, and so on) so that only one set of system libraries would be needed, and this means that using CINTSIZES to set the sizes to something other than the defaults would make the compiler non-conforming.  Make the defaults the best you can, but if you let people over-ride the defaults then the task of porting sloppy code will be eased.  Other vendors have found the hard way that customers have sloppy code.
--
Bad things happen periodically, and they're going to happen to somebody.
Why not you?  -- John Allen Paulos.
shankar@hpcupt3.cup.hp.com (Shankar Unni) (05/03/91)
In comp.lang.c, torek@elf.ee.lbl.gov (Chris Torek) writes: > In article <168@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes: > >How bad is it for sizeof(int) != sizeof(long). > This has been the case on PDP-11s for over 20 years. > It does cause problems---there is always software that makes invalid > assumptions---but typically long-vs-int problems, while rampant, are > also easily fixed. Well, lint goes a long way towards pointing out int-long mismatches, etc. But in my (humble?) opinion, much trouble would be headed off if C compilers on 64-bit architectures would simply dispense with the 16-bit type and make sizes of int == long == void * == 64 bits, and short == 32 bits. Why is it so terribly important to have a 16-bit data type? In any case, memory is getting cheaper these days.. ----- Shankar Unni E-Mail: HP India Software Operation, Bangalore Internet: shankar@india.hp.com Phone : +91-812-261254 x417 UUCP: ...!hplabs!hpda!shankar
rli@buster.stafford.tx.us (Buster Irby) (05/03/91)
cadsi@ccad.uiowa.edu (CADSI) writes:
>From article <1991May01.172042.5214@buster.stafford.tx.us>, by rli@buster.stafford.tx.us (Buster Irby):
>> turk@Apple.COM (Ken "Turk" Turkowski) writes:
>>
>>>It is necessary to have 8, 16, and 32-bit data types, in order to be able
>>>to read data from files.  I would suggest NOT specifying a size for the int
>>
>> You assume a lot about the data in the file.  Is it stored in a specific
>> processor format (ala Intel vs Motorola)?  My experience has been that
>> binary data is not portable anyway.
>
>Binary isn't in general portable.  However, using proper typedefs in
>a class one can move binary read/write classes from box to box.  I think
>the solution to the whole issue of sizeof(whatever) is to simply assume
>nothing.  Always typedef.  It isn't that difficult, and code I've done this
>runs on things ranging from DOS machines to CRAY's COS (and UNICOS) without
>code (barring the typedef header files) changes.

What kind of typedef would you use to swap the high and low bytes in a 16 bit value?  An Intel or BIG_ENDIAN machine stores the bytes in reverse order, while a Motorola or LITTLE_ENDIAN machine stores the bytes in normal order (High to low).  There is no way to fix this short of reading the file one byte at a time and stuffing them into the right place.

The point I was trying to make is that reading and writing a data file has absolutely nothing to do with data types.  As we have already seen, there are a lot of different machine types that support C, and as far as I know, all of them are capable of reading binary files, independent of data type differences.  The only sane way to deal with this issue is to never assume anything about the SIZE or the ORDERING of data types, which is basically what the C standard says.  It tells you that a long >= int >= short >= char.  It says nothing about actual size or byte ordering within a data type.

Another trap I ran across recently is the ordering of bit fields.
On AT&T 3B2 machines the first bit defined is the high order bit, but on Intel 386 machines the first bit defined is the low order bit.  This means that anyone who attempts to write this data to a file and transport it to another platform is in for a surprise: they are not compatible.  Again, the C standard says nothing about bit ordering, and in fact cautions you against making such assumptions.
tmb@ai.mit.edu (Thomas M. Breuel) (05/04/91)
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:

   You have just made a non-portable assumption, namely that there *is* an
   integral type which holds an octet and that there *is* an integral type
   which holds two octets, and so on.  If you want "at least 8 bits", then
   use {,{,un}signed} char, and if you want "at least 16 bits", then use
   {,unsigned} short.  The ANSI standard guarantees those.  There is no
   need to introduce your own private names for them.  If you want
   "exactly 8 bits" or "exactly 16 bits", you have no reason to expect
   that such types will exist.

   I am greatly disappointed that C++, having added so much to C, has not
   added something like int(Low,High) to the language, which would stand
   for the "most efficient" available integral type in which both Low and
   High were representable.  The ANSI C committee were right not to add
   such a construct to C, because their charter was to standardise, not
   innovate.

I think allowing the programmer to specify arbitrary precision integers is equally bad, since it adds too much complexity to the language and to compilers.  A good compromise would be to provide a set of precisions that can be supported on current machines and extend the language standard as newer, more powerful machines become available.

In essence, this is actually what the C standard does, if one continues to use the current data types with roughly their current meaning: "short" is close to, and at least, 16 bits, and "long" is close to, and at least, 32 bits.  The emphasis here is on "close to", and this should probably be made explicit in the standard, since programmers pragmatically do, and need to, rely on it to be able to estimate what the space requirements of their programs will be.

When machines capable of handling larger integer data types become available, new names for the new data types should be introduced.  Perhaps a more consistent naming scheme would be good: int8 (>= 8 bit integer), ..., int128 (>= 128 bit integer), etc.

Thomas.
rockwell@socrates.umd.edu (Raul Rockwell) (05/04/91)
Richard A. O'Keefe:
   An anecdote which may be of value to people designing a C compiler
   for 64-bit machines: there was a UK company who built their own
   micro-coded machine, and wanted to put UNIX on it.  Their C compiler
   initially had char=8, short=16, int=32, long=64 bits, sizeof (int)
   == sizeof (char*).  They changed their compiler in a hurry, so that
   long=32 bits; it was less effort to do that than to fix all the BSD
   sources.

... eh??  any reason they couldn't have compiled with -Dlong=int ?

(Or, if you wanna be fancy, you could
	#define long _long
	typedef int _long;
)

seems rather silly to break the compiler just because of old code...

Raul Rockwell
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/05/91)
What would be the best way to do this:

I want to pass around integer numbers that I know will require more than 32 bits but not more than 63 bits.  An example of such a number is the number of microseconds in the century.  The uses include passing them to functions as arguments and receiving them back as return values.

I want to specify it sufficiently that a reasonable implementation on a 64 bit machine will in fact use the 64 bit integer instructions.  Whatever way it is to be specified should work on all such 64 bit machines.

If I were to use an array of smaller integers, I'd have to code specific macros or functions to apply operations to these values, operations that would preferably be written as simple arithmetic.  But the big deal is that a 64-bit machine would not get to use its 64-bit capabilities.  It does no good to get a 64-bit machine if it is just going to be doing 32-bit data operations all the time.  But of course I want to do it portably within the scope of 64 bit machines.

Shouldn't "long" always represent at least the longest natural operation width on the given architecture, so that it is at least POSSIBLE to code applications that need that architecture?  (I am speaking in terms of current and future directions in C, not in compatibility of old code, which is a separate issue)
--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu  | Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
\***************************************************************************/
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/05/91)
rockwell@socrates.umd.edu (Raul Rockwell) writes: >seems rather silly to break the compiler just because of old code... And it seems rather silly to prevent NEW code from being GOOD code just because of old code... New compilers can be made to have different modes, one for old traditional code, and one for new modern portable standard and possibly extended code. The only time you'd need that is when you are porting old code to a new machine and are not expecting to get the full benefit of the new machine (such as using only 32 bit operations on a 64 bit machine... ick). -- /***************************************************************************\ / Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at \ \ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!! / \***************************************************************************/
cadsi@ccad.uiowa.edu (CADSI) (05/06/91)
From article <1991May03.120455.158@buster.stafford.tx.us>, by rli@buster.stafford.tx.us (Buster Irby):
> cadsi@ccad.uiowa.edu (CADSI) writes:
>
>>Binary isn't in general portable.  However, using proper typedefs in
>>a class one can move binary read/write classes from box to box.  I think
>>the solution to the whole issue of sizeof(whatever) is to simply assume
>>nothing.  Always typedef.  It isn't that difficult, and code I've done this
>>runs on things ranging from DOS machines to CRAY's COS (and UNICOS) without
>>code (barring the typedef header files) changes.
>
> What kind of typedef would you use to swap the high and low bytes
> in a 16 bit value?  An Intel or BIG_ENDIAN machine stores the
> bytes in reverse order, while a Motorola or LITTLE_ENDIAN
> machine stores the bytes in normal order (High to low).  There is
> no way to fix this short of reading the file one byte at a time
> and stuffing them into the right place.  The point I was trying
> to make is that reading and writing a data file has absolutely
> nothing to do with data types.  As we have already seen, there
> are a lot of different machine types that support C, and as far
> as I know, all of them are capable of reading binary files,
> independent of data type differences.

The big/little endian problem is handled via swab calls.  AND, how do we know when to do this????  We just store the needed info in a header record.  This header is read in block fashion and typedef'ed to the structure we need.  From there, that's all we need to continue.

The typedefs have to do with internal structures, NOT simple int, char and those type things, except for the BYTE type.

Last but not least, you'll inevitably ask how we portably read that header.  Well, we store 'magic number' info and mess with things till the numbers are read correctly.  Incidentally, that magic number also gives indications of code revision level and therefore what will and won't be possible.  C'mon, this is not that difficult to comprehend.
You want portable files??? Make 'em yourself. 'C' gives you all the toys you need to do this. [other stuff deleted - reference above] |----------------------------------------------------------------------------| |Tom Hite | The views expressed by me | |Manager, Product development | are mine, not necessarily | |CADSI (Computer Aided Design Software Inc. | the views of CADSI. | |----------------------------------------------------------------------------|
boyne@hplvec.LVLD.HP.COM (Art Boyne) (05/06/91)
In comp.lang.c, bhoughto@pima.intel.com (Blair P. Houghton) writes:
> There's no reason for an int to be less than the full
> register-width, and no reason for an address to be limited
> to the register width.
Wrong! There is a *good* reason. On processors whose data bus width
is less than the register width (e.g., 68000/8/10), the performance penalty
for the extra data fetches may be significant. And since these processors
have only 16x16 multiplies and 32x16 divides, a 16-bit "int" type may make
a lot more sense than a 32-bit "int".
Typical applications also should have an impact on the choice. If the
compiler is intended to support general-purpose applications running on
a family of processors (e.g., 680x0), then perhaps it should be tailored
to somewhere mid- to high-range. On the other hand, one intended to
support embedded applications only (like the instrument controllers I
work with), had better look at the low end *very* carefully. 68000's
as instrument controllers are common here. 68020's are almost unheard-of.
32-bit ints are a detriment for typical instrument control applications,
in terms of RAM usage, ROM size, *and* performance.
For such CPU's and applications, it would be *really* helpful for the
compiler to support a 16 or 32 bit "int" switch. I wish the compiler
we use did.
Art Boyne, boyne@hplvla.hp.com
shap@shasta.Stanford.EDU (shap) (05/07/91)
In article <5535@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes: > >sizeof (char) is fixed at 1. However, it should be quite easy to set up >a compiler so that the user can specify (whether in an environment variable >or in the command line) what sizes to use for short, int, long, and (if you >want to imitate GCC) long long. Something like > setenv CINTSIZES="16,32,32,64" # short,int,long,long long. >The system header files would have to use the default types (call them >__int, __short, and so on) so that only one set of system libraries would >be needed, and this means that using CINTSIZES to set the sizes to something >other than the defaults would make the compiler non-conforming. In practice (having tried it once for other reasons), this doesn't work as well as you might like. The problem comes from the fact that the vendor doesn't control the independent software vendors. For their part, the ISV's want portability, so it's real hard to convince them of the merits of converting their header files. It also becomes an ongoing support and update nightmare. Jonathan
turk@Apple.COM (Ken Turkowski) (05/07/91)
rli@buster.stafford.tx.us (Buster Irby) writes:
>An Intel or BIG_ENDIAN machine stores the
>bytes in reverse order, while a Motorola or LITTLE_ENDIAN
>machine stores the bytes in normal order (High to low).

You've got this perfectly reversed.  Motorola is a BIG_ENDIAN machine, and Intel is a LITTLE_ENDIAN machine.  Additionally, there is no such thing as "normal".
--
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
Internet: turk@apple.com
Applelink: TURK
UUCP: sun!apple!turk
turk@Apple.COM (Ken Turkowski) (05/07/91)
cadsi@ccad.uiowa.edu (CADSI) writes: >The big/little endian problem is handled via swab calls. >AND, how do we know when to do this???? We just store >the needed info in a header record. >This header is read in block fashion and typedef'ed to the structure we need. What type of header do you suggest? This should be able to record the ordering of shorts, longs, floats, and doubles, and might need to specify floating-point format. -- Ken Turkowski @ Apple Computer, Inc., Cupertino, CA Internet: turk@apple.com Applelink: TURK UUCP: sun!apple!turk
gwyn@smoke.brl.mil (Doug Gwyn) (05/07/91)
In article <1991May4.202438.14664@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes: >I want to pass around integer numbers that I know will require more than >32 bits but not more than 63 bits. >I want to specify it sufficiently that a reasonable implementation on a 64 bit >machine will in fact use the 64 bit integer instructions. Whatever way it >is to be specified should work on all such 64 bit machines. The question is, should it also work on non-64 bit architectures? If not, just use "long". If so, you'll need some fairly obvious type definitions, macros, etc. To automatically configure your code to accommodate both types of architecture, you can make the definitions conditional on some arithmetic property in the preprocessor that will produce different results in the two environments; for example you could test for sign extension of the 32nd bit.
gwyn@smoke.brl.mil (Doug Gwyn) (05/07/91)
In article <TMB.91May3225038@volterra.ai.mit.edu> tmb@ai.mit.edu (Thomas M. Breuel) writes: >In essence, this is actually what the C standard does, if one >continues to use the current data types with roughly their current >meaning: "short" is close to, and at least 16 bits, and "long" is >close to, and at least 32 bit. The emphasis here is on "close to", and >this should probably be made explicit in the standard, since >programmers pragmatically do, and need to, rely on it to be able to >estimate what the space requirements of their programs will be. It was not the intention of the C standard to require the "close to" attribute as you describe it. Some architectures are such as to make that an impractical implementation, and on such architectures a char might even be 128 bits. (However, most implementations will make an exception for "char" and pack them fairly tightly into a word, even though it does slow down operations on char type considerably.)
john@sco.COM (John R. MacMillan) (05/07/91)
|>|It is necessary to have 8, 16, and 32-bit data types, in order to be able
|>|to read data from files.
|>It's not necessary, but it does make it easier.
|
|Not even that.  Assuming that for some unknown reason you're faced with
|reading a binary file that originated on some other system, there is a
|fair chance that it used a "big endian" architecture while your system
|is "little endian" or vice-versa.

I certainly didn't mean to imply that this would help something be
universally portable; rather that if you're _lucky_ and the endianness
is the same, having the right-sized data types available might make that
single port easier.

|Binary data transportability is a much thornier issue than most people
|realize.

Yes, I've seen many ``portable'' binary formats that simply weren't.
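When the endianness is not the same, the usual escape is to fix the file format's byte order and decode fields one byte at a time, as suggested earlier in the thread; a minimal sketch (the function name is ours):

```c
#include <stdint.h>

/* Decode a 32-bit big-endian field regardless of host byte order.
 * Shifting bytes into place sidesteps both endianness and any
 * in-memory struct padding. */
uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}
```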
msb@sq.sq.com (Mark Brader) (05/07/91)
> There are numerous CONFORMING ways in
> which additional integer types can be added to C.  "long long" is NOT
> one of these, and a standard-conforming implementation is OBLIGED to
> diagnose the use of "long long", which violates the Constraints of
> X3.159-1989 section 3.5.2.  Therefore "long long" is not a wise way
> to make such an extension.

I disagree.  I think "long long" is a preferable approach.

The Standard does not guarantee that there exists, in a C
implementation, any integral type wider than 32 bits.  A programmer
wishing to do arithmetic on integer values exceeding what can be stored
in 32 bits has three options:

(a) use floating point;
(b) represent each such value using more than one object of some
    existing integral type, e.g. using a "bignums package"; or
(c) use an integral type known to provide the required number of bits,
    and never port the program to machines where no such type exists.

Option (a) may not be feasible for a number of reasons, not least of
which is that significance may be lost using floating point -- the
Standard does not guarantee that any floating point type in a C
implementation can hold as many significant digits as a 32-bit integer
can.  Option (b) also has significant costs, so the programmer with
access to a suitable machine may choose to accept the portability loss
and choose (c).

Now, what would we like to happen if a program that assumed 64-bit
integers existed was ported to a machine where they didn't?  We would
like the compilation to fail, that's what!  Suppose that the
implementation defines long to be 64 bits; then, to force such a
failure, the programmer would have to take some explicit action, like

	assert (LONG_MAX >= 0777777777777777777777);

On the other hand, suppose that the implementation defines a separate
"long long" type for 64-bit integers.
Then when the user compiles the program on the 64-bit machine, they get: cc: warning: "long long" is an extension and not portable and, assuming a reasonable quality of implementation, they can eliminate this message with a cc option if desired. And if they do try to port, they get a fatal error in compilation. This behavior seems exactly right to me. Now, I am *not* saying that an implementation should necessarily make "long long" its *only* 64-bit integral type. It'd be wholly reasonable to define *both* "long" and "long long" as 64 bits. Just as a programmer uses "long" whenever more than 16 bits are *required*, although "int" may be the same as "long", "long long" could be used whenever more than 32 bits are required, although "long" might be the same as "long long". -- Mark Brader \ "He's suffering from Politicians' Logic." SoftQuad Inc., Toronto \ "Something must be done, this is something, therefore utzoo!sq!msb, msb@sq.com \ we must do it." -- Lynn & Jay: YES, PRIME MINISTER This article is in the public domain.
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/08/91)
msb@sq.sq.com (Mark Brader) writes: >Now, what would we like to happen if a program that assumed 64-bit >integers existed was ported to a machine where they didn't? We would How about coding it so that if a symbol such as "LONGLONG64" is not defined conditional compilation will fall back to code that invokes a bignum package for 64 bit ints. Then simply -DLONGLONG64 will get the good code. I just picked LONGLONG64 not knowing if there is a better thing to use. -- /***************************************************************************\ / Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at \ \ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!! / \***************************************************************************/
daves@ex.heurikon.com (Dave Scidmore) (05/08/91)
>rli@buster.stafford.tx.us (Buster Irby) writes:
>>An Intel or BIG_ENDIAN machine stores the
>>bytes in reverse order, while a Motorolla or LITTLE_ENDIAN
>>machine stores the bytes in normal order (High to low).

In article <13357@goofy.Apple.COM> turk@Apple.COM (Ken Turkowski) writes:
>You've got this perfectly reversed.  Motorola is a BIG_ENDIAN machine,
>and Intel is a LITTLE_ENDIAN machine.  Additionally, there is no
>such thing as "normal".

Exactly.  Both conventions have good points and bad points.  The
"normal" Motorola convention starts to look a little less normal when
dynamic bus sizing is required, in which case all byte data comes over
the most significant data bus lines.  In addition, big-endian machines
have values that expand in size from the most significant location
(i.e. two-byte values at location X have least significant bytes in a
different location than four-byte values at the same location).  On the
other hand, the little-endian convention looks odd when you do a dump of
memory and try to find long words within a series of bytes.  In the end
you can make either convention look "normal" by how you draw the picture
of it.  For example, which of these is normal for storing the value
0x12345678 ?

Motorola:  Location     0    1    2    3
           Bytes     0x12 0x34 0x56 0x78

Intel:     Location     3    2    1    0
           Bytes     0x12 0x34 0x56 0x78
--
Dave Scidmore, Heurikon Corp.
dave.scidmore@heurikon.com
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/09/91)
shankar@hpcupt3.cup.hp.com (Shankar Unni) writes:
>Well, lint goes a long way towards pointing out int-long mismatches, etc.
>But in my (humble?) opinion, much trouble would be headed off if C compilers
>on 64-bit architectures would simply dispense with the 16-bit type and make
>sizes of int == long == void * == 64 bits, and short == 32 bits.  Why is it
>so terribly important to have a 16-bit data type?

That DOUBLES the size of a program that is loading and working with
digitized audio samples that are more than 8 bits wide.

>In any case, memory is getting cheaper these days..

And software is getting bigger just as fast to fill it up.
--
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at   \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
\***************************************************************************/
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/09/91)
In article <470@heurikon.heurikon.com>, daves@ex.heurikon.com (Dave Scidmore) writes:
> I'm surprised nobody has mentioned that the real solution to this kind
> of portability problem is for the original programmer to use the
> definitions in "types.h" that tell you how big chars, shorts, ints,
> and longs are.

Why be surprised?  I'm using an Encore Multimax running 4.3BSD, and on
this machine there _isn't_ any types.h file.  We have a copy of GCC, so
we _have_ access to the ANSI file, but that's <limits.h>, not "types.h".
--
Bad things happen periodically, and they're going to happen to somebody.
Why not you?    -- John Allen Paulos.
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/09/91)
In article <TMB.91May3225038@volterra.ai.mit.edu>, tmb@ai.mit.edu (Thomas M. Breuel) writes: > I am greatly disappointed that C++, having added so much to C, has not > added something like int(Low,High) to the language, which would stand > for the "most efficient" available integral type in which both Low and > High were representable. The ANSI C committee were right not to add > such a construct to C, because their charter was to standardise, not > innovate. > > I think allowing the programmer to specify arbitrary precision > integers is equally bad, since it adds too much complexity to > the language and to compilers. But nowhere did I suggest "allowing the programmer to specify arbitrary precision integers". Read what I wrote: the compiler would select "the most efficient AVAILABLE integral type in which both Low and High were representable". I'm not talking about increasing the stock of integral types built into the compiler, simply talking about a way of selecting one of those types. For any given copy of <limits.h> I can define an M4 macro selected_integral_type(Low,High) which expands to char,short,int,long, &c. Indeed, I believe I posted such a macro to this group last year. My point is that this could have been built into the compiler TRIVIALLY. I really do mean "trivially"; we are talking about adding one 10-line function to a C compiler, plus a couple of productions in a Yacc grammar. What I was asking for is _less_ than what Pascal provides! -- Bad things happen periodically, and they're going to happen to somebody. Why not you? -- John Allen Paulos.
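O'Keefe's selected_integral_type can indeed be approximated with nothing more than <limits.h> and the preprocessor; a sketch for one fixed range (the typedef name is invented here):

```c
#include <limits.h>

/* Pick the narrowest standard signed type that can represent
 * -100000..100000, in the spirit of int(Low,High).  A compiler
 * built-in could do this for arbitrary bounds; the preprocessor
 * can only do it for bounds fixed at this point in the source. */
#if SCHAR_MAX >= 100000
typedef signed char range_t;
#elif SHRT_MAX >= 100000
typedef short range_t;
#elif INT_MAX >= 100000
typedef int range_t;
#else
typedef long range_t;
#endif
```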
clive@x.co.uk (Clive Feather) (05/09/91)
In article <1991May6.232116.11401@sq.sq.com> msb@sq.sq.com (Mark Brader) writes:
> I disagree.  I think "long long" is a preferable approach.
[...]
> A programmer wishing
> to do arithmetic on integer values exceeding what can be stored in
> 32 bits has three options:
[...]
> (c) use an integral type known to provide the required number of bits,
>     and never port the program to machines where no such type exists.
[...]
> Now, what would we like to happen if a program that assumed 64-bit
> integers existed was ported to a machine where they didn't?  We would
> like the compilation to fail, that's what!  Suppose that the implementation
> defines long to be 64 bits; then, to force such a failure, the programmer
> would have to take some explicit action, like
>
> assert (LONG_MAX >= 0777777777777777777777);
>
> On the other hand, suppose that the implementation defines a separate
> "long long" type for 64-bit integers.  Then when the user compiles the
> program on the 64-bit machine, they get:
>
> cc: warning: "long long" is an extension and not portable
>
> and, assuming a reasonable quality of implementation, they can eliminate
> this message with a cc option if desired.  And if they do try to port,
> they get a fatal error in compilation.
>
> This behavior seems exactly right to me.

If you want the compilation to fail, then what's wrong with the
following ?

#if LONG_MAX < 0xFFFFffffFFFFffff
??=error Long type not big enough for use.
#endif

This causes the compilation to fail only when long is not big enough,
does not require any new types in the implementation, and generates *no*
messages on a 64-bit-long system.

Notes: the use of a hex, rather than octal, constant with mixed case
makes it easier to count the number of digits, and the explicit trigraph
is used to choke (non-ANSI) implementations which don't have #error, and
which might object to it even when the condition of the #if is false.
--
Clive D.W.
Feather | IXI Limited | If you lie to the compiler, clive@x.co.uk | 62-74 Burleigh St. | it will get its revenge. Phone: +44 223 462 131 | Cambridge CB1 1OJ | - Henry Spencer (USA: 1 800 XDESK 57) | United Kingdom |
conor@lion.inmos.co.uk (Conor O'Neill) (05/09/91)
In article <179@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes: >While I happen to agree with this sentiment, there is an argument that X >hundred million lines of C code can't be wrong. The problem with >theology is that it's not commercially viable. Or did you mean "C hundred million lines of X code"... (Apparently X even has such nasties buried inside it as expecting that successive calls to malloc have higher addresses, forcing the heap to grow upwards.) (So I'm informed) --- Conor O'Neill, Software Group, INMOS Ltd., UK. UK: conor@inmos.co.uk US: conor@inmos.com "It's state-of-the-art" "But it doesn't work!" "That is the state-of-the-art".
det@nightowl.MN.ORG (Derek E. Terveer) (05/10/91)
msb@sq.sq.com (Mark Brader) writes:
>> There are numerous CONFORMING ways in
>> which additional integer types can be added to C.  "long long" is NOT
>> one of these, and a standard-conforming implementation is OBLIGED to
>> diagnose the use of "long long", which violates the Constraints of
>> X3.159-1989 section 3.5.2.  Therefore "long long" is not a wise way
>> to make such an extension.

>I disagree.  I think "long long" is a preferable approach.

>The Standard does not guarantee that there exists, in a C
>implementation, any integral type wider than 32 bits.  [...]

But the standard also does not guarantee (as far as I know) that there
doesn't exist a type wider than 32 bits.

What is wrong with simply implementing the following in a compiler?

	char  =  8 bits
	short = 16 bits
	int   = 32 bits
	long  = 64 bits
--
det@nightowl.mn.org
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/11/91)
det@nightowl.MN.ORG (Derek E. Terveer) writes: >What is wrong with simply implementing the following in a compiler? > char = 8 bits > short = 16 bits > int = 32 bits > long = 64 bits There is apparently (as some people complain about) code out there that depends upon the MAX size of the type. In other words, if "long" is longer than 32 bits, it breaks. But porting such code to a 64-bit machine *AND* writing good standardized code for the same machine are in mutual conflict because of this. The only way out I can see is for the compiler to default to what is the most reasonable for NEW AND GOOD code to be developed, and have some sort of flag or flags to allow it to be customized to better handle the cases of porting old code. I'd also suspect that if you can find code where the long depends on being exactly 32 bits, you could well find code where the int depends on being exactly 16 bits. So there probably is not one single ideal solution to the problem. So perhaps a system of flags like: -SHORTnn -INTnn -LONGnn Which actually change the sizes of the primitive types, giving a warning if the "short <= int <= long" constraint is violated (but do the compile as specified anyway). It would be an extension, not standard C. But when porting old code, we aren't addressing standards, are we? -- /***************************************************************************\ / Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at \ \ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!! / \***************************************************************************/
gwyn@smoke.brl.mil (Doug Gwyn) (05/11/91)
In article <45690005@hpcupt3.cup.hp.com> shankar@hpcupt3.cup.hp.com (Shankar Unni) writes:
>But in my (humble?) opinion, much trouble would be headed off if C compilers
>on 64-bit architectures would simply dispense with the 16-bit type and make
>sizes of int == long == void * == 64 bits, and short == 32 bits.  Why is it
>so terribly important to have a 16-bit data type?

It isn't important, except perhaps to people who are trying to port
poorly-implemented code that managed to depend on such a system-dependent
feature.  Such code undoubtedly has far worse problems than that anyway.

By the way, there is NO NEED to say what data size choices a 64-bit
implementation "should" make.  It "should" not matter to any sensible
application.
david@ap542.uucp (05/15/91)
-cadsi@ccad.uiowa.edu (CADSI) writes: ->by rli@buster.stafford.tx.us (Buster Irby): ->> cadsi@ccad.uiowa.edu (CADSI) writes: ->> ->>>Binary isn't in general portable. However, using proper typedefs in ->>>a class one can move binary read/write classes from box to box. ->> ->> What kind of typedef would you use to swap the high and low bytes ->> in a 16 bit value? An Intel or BIG_ENDIAN machine stores the ->> bytes in reverse order, while a Motorolla or LITTLE_ENDIAN ->> machine stores the bytes in normal order (High to low). There is ->> no way to fix this short of reading the file one byte at a time ->> and stuffing them into the right place. -> ->The big/little endian problem is handled via swab calls. ->AND, how do we know when to do this???? We just store ->the needed info in a header record. NO, NO, NO! The way to do this is to use XDR. **========================================================================** David E. Smyth david%ap542@ztivax.siemens.com <- I can get your mail, but our mailer is broken. But I can post. You figure... **========================================================================**
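XDR sidesteps header records and swab calls by fixing the wire format: integers travel as 4-byte big-endian quantities, so both ends agree in advance. A hand-rolled sketch of that wire convention (not the real xdr(3) API; the function name is ours):

```c
#include <stdint.h>

/* Encode an unsigned 32-bit value the way XDR puts it on the wire:
 * four bytes, most significant first, whatever the host order is. */
void put_xdr_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >>  8);
    p[3] = (unsigned char)(v);
}
```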
ray@philmtl.philips.ca (Ray Dunn) (05/16/91)
In referenced article, bhoughto@pima.intel.com (Blair P. Houghton) writes:
>>2. If a trade-off has to be made between compliance and ease of
>>porting, what's the better way to go?
>
>If you're compliant, you're portable.

This is like saying that if the syntax is correct, then the semantics
must be too, although, indeed, there is no need to trade compliance for
portability.  There are dangers clearly visible, though.

All you can say is that your compliant program has an excellent chance
of compiling on another system running a compliant compiler, not that it
will necessarily work correctly with the new parameters plugged in.

You only know a program is portable *after* you've tested it on another
system.  How many do that during the initial development cycle?

Porting is not an issue that goes away by writing compliant code - it
may in fact *hide* some of the problems.

If I wanted to be controversial, I might say that 'C's supposed
"portability" is a loaded cannon.  Bugs caused by transferring a piece
of software to another system will continue to exist, even in compliant
software.  Prior to "portable" 'C', porting problems were *expected*,
visible, and handled accordingly.

Will developers still assume these bugs to be likely, and handle
verification accordingly, or will they be lulled by "it compiled first
time" into thinking that the portability issue has been taken into
account up front, and treat it with less attention than it deserves?
--
Ray Dunn.                    | UUCP: ray@philmtl.philips.ca
Philips Electronics Ltd.     |       ..!{uunet|philapd|philabs}!philmtl!ray
600 Dr Frederik Philips Blvd | TEL : (514) 744-8987 (Phonemail)
St Laurent. Quebec.  H4M 2S9 | FAX : (514) 744-9550  TLX: 05-824090
jeh@cmkrnl.uucp (05/16/91)
In article <521@heurikon.heurikon.com>, daves@ex.heurikon.com (Dave Scidmore) writes:
> [...]
> On the other hand the little
> endian convention looks odd when you do a dump of memory and try to find long
> words within a series of bytes.

Yup.  But this can be solved.  The VAX is a little-endian machine, and
VMS utilities address [ahem] this problem by always showing hex contents
with increasing addresses going from right to left across the page.
Since significance of the bytes (actually the nibbles) increases with
increasing addresses, this looks perfectly correct... the most
significant nibble goes on the left, just the way you'd "naturally"
write it.  For example, the value 1 stored in 32 bits gets displayed as
00000001.

If you get a hex-plus-Ascii dump, such as is produced by DUMP (for
files) or ANALYZE/SYSTEM (lets you look at live memory), the hex goes
from right to left, and the ascii from left to right, like this:

SDA> ex 200;60
0130011A 0120011B 0130011E 0110011F ......0... ...0. 00000200
01200107 02300510 04310216 04210218 ..!...1...0... . 00000210
01100103 01100104 01200105 01200106 .. ... ......... 00000220
44412107 01100100 01100101 01100102 .............!AD 00000230
4B202020 20444121 44412106 42582321 !#XB.!AD!AD    K 00000240
00524553 55525055 53434558 454E5245 ERNEXECSUPRUSER. 00000250

In the last row, the string "EXEC" is at address 253, and the last byte
on the line, 25F, contains hex 00.  In the first row, the word (16 bits)
at location 204 contains hex value 11E; if you address the same location
as a longword, you get the value 0130011E.

This looks completely bizarre at first, but once you get used to it (a
few minutes or so for most folks) it makes perfect sense.  The VAX is
consistent in bit numbering too:  The least significant bit of a byte is
called bit 0, and when you draw bit maps of bytes or larger items, you
always put the lsb on the right.

---
Jamie Hanrahan, Kernel Mode Consulting, San Diego CA
Chair, VMS Internals Working Group, U.S.
DECUS VAX Systems SIG Internet: jeh@dcs.simpact.com, hanrahan@eisner.decus.org, or jeh@crash.cts.com Uucp: ...{crash,scubed,decwrl}!simpact!cmkrnl!jeh
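The right-to-left layout Hanrahan describes is straightforward to reproduce; a sketch that formats one 16-byte row in that style (the function and type names are ours, not VMS's):

```c
#include <ctype.h>
#include <stdio.h>

/* Format a 16-byte row VMS-style: four longwords in hex with the
 * lowest-addressed one on the right, then the bytes as ASCII in
 * ordinary left-to-right address order.  buf needs >= 53 bytes. */
void vms_dump_row(const unsigned char *p, char *buf)
{
    char *q = buf;
    int lw, b, i;

    for (lw = 3; lw >= 0; lw--) {        /* highest-addressed longword first */
        for (b = 3; b >= 0; b--)         /* most significant byte on the left */
            q += sprintf(q, "%02X", p[lw * 4 + b]);
        *q++ = ' ';
    }
    for (i = 0; i < 16; i++)             /* ASCII half of the dump */
        *q++ = isprint(p[i]) ? (char)p[i] : '.';
    *q = '\0';
}
```

With the value 1 in the first longword and zeroes elsewhere, the rightmost hex group comes out as 00000001, matching the example in the post.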
msb@sq.sq.com (Mark Brader) (05/18/91)
> > I think "long long" is a preferable approach. ... A programmer wishing > > to do arithmetic on integer values exceeding what can be stored in > > 32 bits has three options: > [...] > > (c) use an integral type known to provide the required number of bits, > > and never port the program to machines where no such type exists. > [...] > > Now, what would we like to happen if a program that assumed 64-bit > > integers existed was ported to a machine where they didn't? We would > > like the compilation to fail, that's what! Suppose that the implementation > > defines long to be 64 bits; then, to force such a failure, the programmer > > would have to take some explicit action, like > > assert (LONG_MAX >= 0777777777777777777777); > If you want the compilation to fail, then what's wrong with the > following ? > > #if LONG_MAX < 0xFFFFffffFFFFffff /* wrong, actually */ > ??=error Long type not big enough for use. > #endif I would take that to be "something like" my assert() example, and don't have a strong preference between one and the other. In ANSI C the use of an integer constant larger than ULONG_MAX is a constraint violation (3.1.3) anyway, so it really suffices to say 01777777777777777777777; and this has a certain charm to it. But the compiler might issue only a warning, rather than aborting the compilation. > This causes the compilation to fail only when long is not big enough, > does not require any new types in the implementation, and generates *no* > messages on an 64-bit-long system. But whichever of these the programmer uses, *it has to be coded explicitly*. My feeling is that enough of the 64-bit people [i.e. those worth $8 :-)] will carelessly omit to do so, once 64-bit machines [i.e. those worth $8! :-) :-)] become more common, as to create portability problems. It will, I fear, be exactly the situation that we've already seen where there's much too much code around that assumes 32-bit ints. 
> Notes: the use of a hex, rather than octal, constant with mixed case
> makes it easier to count the number of digits ...

The octal was for fun, since it was a "bad example" anyway.  However,
if one *is* going to amend it, it would be as well if the amended
version retained the correct value of the constant.  (It was LONG_MAX,
not ULONG_MAX, in that example.)

} But the standard also does not guarantee (as far as I know) that there
} doesn't exist [a type with] >32 bits.
}
} What is wrong with simply implementing the following in a compiler?
}	char  =  8 bits
}	short = 16 bits
}	int   = 32 bits
}	long  = 64 bits

Nothing -- unless, as I explained above, it leads to a community of
users who *expect* long to have 64 bits.

My own preference would in fact be to have a long long type, but for
*both* long and long long to be 64 bits.  (The long long type would
also imply such things as LL suffixes on constants, %lld printf formats,
appropriate type conversion rules, and so on.  I haven't examined the
standard exhaustively to see whether there's anything where the
appropriate extension is non-obvious, but certainly for most things it
is obvious.)

I would like to see "long long" established enough in common practice
that, in the *next* C standard, the section that now itemizes among
other things the following minimum values:

	SHRT_MAX   +32767
	INT_MAX    +32767
	LONG_MAX   +2147483647

will, *if* 64-bit machines are sufficiently common by then, leave those
values unchanged and add:

	LLONG_MAX  +9223372036854775807
--
Mark Brader            "'A matter of opinion'[?]  I have to say you are
SoftQuad Inc., Toronto  right.  There['s] your opinion, which is wrong,
utzoo!sq!msb, msb@sq.com  and mine, which is right." -- Gene Ward Smith

This article is in the public domain.
bhoughto@pima.intel.com (Blair P. Houghton) (05/18/91)
In article <1991May15.190016.21817@philmtl.philips.ca> ray@philmtl.philips.ca (Ray Dunn) writes: >In referenced article, bhoughto@pima.intel.com (Blair P. Houghton) writes: >>If you're compliant, you're portable. > >You only know a program is portable *after* you've tested it on another >system. How many do that during the initial development cycle? Well, I do, several times, to the point of working for a while on one platform, moving to another, testing, working there for a while, moving to a third, and so on. It helps to aim for three targets. >If I wanted to be controversial, I might say that 'C's supposed >"portability" is a loaded cannon. Bugs caused by transferring a piece of >software to another system will continue to exist, even in compliant >software. Prior to "portable" 'C', porting problems were *expected*, >visible, and handled accordingly. Now they're bugs in the compiler, not just "issues of implementation." If you're using any C construct that produces different behavior on disparate, conforming implementations, then either one of those implementations is not conforming or you are not using ANSI C, but rather have used some sort of extension or relied on some sort of unspecified behavior, and therefore your program is not compliant. >Will developers still assume these bugs to be likely, and handle >verification accordingly, or will they be lulled by "it compiled first >time" into thinking that the portability issue has been taken into account >up front, and treat it with less attention than it deserves? Your question is all but naive. I still get valid enhancement requests on code that's several years old, which means that I failed to design it to suit the needs of my customer, which means it's buggy. Routines that have been compiled and/or run thousands of times in real-world situations come up wanting. 
Nobody sane assumes that anything is right the first time (though one may determine that the probability of failure is low enough to make an immediate release feasible). --Blair "I'm going to put all of this on video and hawk it on cable teevee in the middle of the night while wearing pilled polyester and smiling a lot."
bret@orac.UUCP (Bret Indrelee) (05/20/91)
In article <1991May9.192156.19291@nightowl.MN.ORG> det@nightowl.MN.ORG (Derek E. Terveer) writes:
>
>What is wrong with simply implementing the following in a compiler?
>
>	char  =  8 bits
>	short = 16 bits
>	int   = 32 bits
>	long  = 64 bits

I agree, this is the best of many choices.  The problems with it are:

1) You will use more data space because longs are twice as large.  On a
   64-bit arch, this means problems in swapping.  There is enough VA
   space, you just wouldn't be using it as efficiently as if long were
   32 bits.

2) You break programs that assume int is going to match the size of
   anything.  Translation: you break programs that already cannot be
   ported between available 32-bit machines that make a different choice
   of (sizeof int == sizeof short) || (sizeof int == sizeof long).
   Fix the programs.

3) You break programs that don't use void pointers when they need a
   generic pointer.  See #2 above.

4) You may find new bugs in programs, where an overflow that you never
   knew existed on a 32-bit machine now makes your integer math come
   out different.

Most of these come down to problems with programs that already don't
work on existing 32-bit machines.  Start using typedef and INT_MAX,
people.  Your replacements will thank you rather than curse you.

-Bret
--
------------------------------------------------------------------------------
Bret Indrelee           | <This space left intentionally blink>
bret@orac.edgar.mn.org  | ;^)
bret@orac.UUCP (Bret Indrelee) (05/20/91)
In article <16103@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes: > >By the way, there is NO NEED to say what data size choices a 64-bit >implementation "should" make. It "should" not matter to any sensible >application. Except maybe the guy writing a device driver, where the person needs to exactly match the size of a data type to the size of a hardware register. Yet another reason to spread the sizes around. -Bret -- ------------------------------------------------------------------------------ Bret Indrelee | <This space left intentionally blink> bret@orac.edgar.mn.org | ;^)
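The driver case Indrelee raises is the one place where exact widths really are part of the contract; C99's <stdint.h> later supplied guaranteed names for them. A sketch of a purely hypothetical register block (the layout is invented for illustration):

```c
#include <stdint.h>

/* A hypothetical device register layout: each field must be
 * accessed at exactly its hardware width, so the exact-width
 * types of <stdint.h> (a later, C99 addition) are the right
 * tool rather than int or long. */
typedef struct {
    volatile uint16_t status;    /* 16-bit status register  */
    volatile uint16_t command;   /* 16-bit command register */
    volatile uint32_t data;      /* 32-bit data port        */
} dev_regs;
```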
clive@x.co.uk (Clive Feather) (05/21/91)
In article <1991May18.011520.8330@sq.sq.com> msb@sq.sq.com (Mark Brader) writes:
[>>> is msb, >> is myself]
>>> Suppose that the implementation
>>> defines long to be 64 bits; then, to force such a failure, the programmer
>>> would have to take some explicit action, like
>>> assert (LONG_MAX >= 0777777777777777777777);
>> If you want the compilation to fail, then what's wrong with the
>> following ?
>> #if LONG_MAX < 0xFFFFffffFFFFffff /* wrong, actually */
>> ??=error Long type not big enough for use.
>> #endif
> I would take that to be "something like" my assert() example, and don't
> have a strong preference between one and the other.

True.  The only advantage I claim for it is that it is a compile-time
test, rather than a run-time test.

> But whichever of these the programmer uses, *it has to be coded explicitly*.
> My feeling is that enough of the 64-bit people [i.e. those worth $8 :-)]
> will carelessly omit to do so, once 64-bit machines [i.e. those worth $8!
> :-) :-)] become more common, as to create portability problems.  It will,
> I fear, be exactly the situation that we've already seen where there's
> much too much code around that assumes 32-bit ints.

But no-one suggests forcing ints to be 32 bits just to solve this
problem.  In general, there is no sympathy for people who fail to code
for portability.  Why should we make an exception for this one case?
You might equally well say that people used to 36-bit machines shouldn't
have to write code like:

#if INT_MAX >= 0x1FFFF && INT_MIN <= -0x20000
typedef signed int native_int_like_type;
#else
typedef signed long native_int_like_type;
#endif
#if UINT_MAX >= 0x3FFFF
typedef unsigned int native_uint_like_type;
#else
typedef unsigned long native_uint_like_type;
#endif

if they want to continue to think in 18-bit ints.

> However, if one *is* going to amend it, it would be as well if the
> amended version retained the correct value of the constant.

Mea culpa.
--
Clive D.W.
Feather | IXI Limited | If you lie to the compiler, clive@x.co.uk | 62-74 Burleigh St. | it will get its revenge. Phone: +44 223 462 131 | Cambridge CB1 1OJ | - Henry Spencer (USA: 1 800 XDESK 57) | United Kingdom |
ray@philmtl.philips.ca (Ray Dunn) (05/21/91)
In referenced article, bhoughto@pima.intel.com (Blair P. Houghton) writes:
>In referenced article, ray@philmtl.philips.ca (Ray Dunn) writes:
>>Prior to "portable" 'C', porting problems were *expected*,
>>visible, and handled accordingly.
>
>Now they're bugs in the compiler, not just "issues of
>implementation."

No - now they're "issues of *system dependencies*".

>>Will developers still assume these bugs to be likely, and handle
>>verification accordingly, or will they be lulled by "it compiled first
>>time" into thinking that the portability issue has been taken into account
>>up front, and treat it with less attention than it deserves?
>
>Your question is all but naive.

Only if you ignore the fact, which you seem to do, that many of the
issues of portability in the real world are created by differences in
system hardware, operating systems and file management facilities.

This is true for nearly all software for example which has a tightly
coupled user interface, or which is forced to process system-specific
non-ascii-stream data files, or to interface with multi-tasking
facilities.  Even differences in Floating Point handling can create
major pains-in-the-neck.

There's more to portability than 'C' conformity.
--
Ray Dunn.                    | UUCP: ray@philmtl.philips.ca
Philips Electronics Ltd.     |       ..!{uunet|philapd|philabs}!philmtl!ray
600 Dr Frederik Philips Blvd | TEL : (514) 744-8987 (Phonemail)
St Laurent. Quebec.  H4M 2S9 | FAX : (514) 744-9550  TLX: 05-824090
timr@gssc.UUCP (Tim Roberts) (05/23/91)
In article <313@orac.UUCP> bret@orac.UUCP (Bret Indrelee) writes:
>In article <1991May9.192156.19291@nightowl.MN.ORG> det@nightowl.MN.ORG (Derek E. Terveer) writes:
>>
>>What is wrong with simply implementing the following in a compiler?
>>
>>	char  =  8 bits
>>	short = 16 bits
>>	int   = 32 bits
>>	long  = 64 bits
>
>I agree, this is the best of many choices.

Wrong! This is NOT necessarily the best of many choices. THINK about this
for a minute! We've had a lot of entirely useless philosophical discussion
on this issue.

What if your 64 bit architecture doesn't have any instructions to deal
with 16 bit units? You certainly aren't going to include something as a
fundamental type when your architecture can't easily deal with it, are
you? What if, going further, you can't manipulate 32 bit objects either?
On such a machine, you would probably create short=int=long=64 bits.

The point is this: C data types are intended to map into the fundamental
operating units of the underlying hardware. Discussing the correctness of
C data type sizing on 64-bit machines in the general case is a pointless
waste of network bandwidth.
-- 
timr@gssc.gss.com		Tim N Roberts, CCP
Graphic Software Systems	Beaverton, OR

This is a very long palindrome. .emordnilap gnol yrev a si sihT
henry@zoo.toronto.edu (Henry Spencer) (05/23/91)
In article <6659@gssc.UUCP> timr@gssc.UUCP (Tim Roberts) writes: >What if your 64 bit architecture doesn't have any instructions to deal with >16 bit units? ... Then it's going to be in big trouble trying to implement TCP/IP...! >The point is this: C data types are intended to map into the fundamental >operating units of the underlying hardware. Discussing the correctness of >C data type sizing on 64-bit machines in the general case is a pointless waste >of network bandwidth. Not really. There are some really sticky questions even for well-designed 64-bit machines, where there is no strong a priori preference for one scheme or the other. -- And the bean-counter replied, | Henry Spencer @ U of Toronto Zoology "beans are more important". | henry@zoo.toronto.edu utzoo!henry
dhoward@ready.eng.ready.com (David Howard) (05/23/91)
In article <6659@gssc.UUCP> timr@gssc.UUCP (Tim Roberts) writes: >... >What if your 64 bit architecture doesn't have any instructions to deal with >16 bit units? You certainly aren't going to include something as a fundamental >type when your architecture can't easily deal with it, are you? What if, going >further, you can't manipulate 32 bit objects either? On such a machine, you >would probably create short=int=long=64 bits. C compilers for the 80x86 abortchitecture have long=32 and pointer=32, neither of which is easily supported or natural on that chip. The question as to whether C types should map to the architecture or to what is easiest on the programmer is an interesting one.
bhoughto@pima.intel.com (Blair P. Houghton) (05/23/91)
In article <314@orac.UUCP> bret@orac.UUCP (Bret Indrelee) writes: >In article <16103@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes: >>It "should" not matter to any sensible application. > >Except maybe the guy writing a device driver, where the person needs >to exactly match the size of a data type to the size of a hardware >register. Yet another reason to spread the sizes around. Picayune semantics: "sensible applications" and "device drivers" are two entirely different laws of physics. More to the point: the driver developer is going to be doing many things more heinous than bit-fields; e.g., casting integer types to pointer types in order to reach memory mappings (even tricks with indexing "all of memory" require placing the base of the "all-of-memory array" at some defined point). ANSI C is specifically not designed for that sort of work. Such things are often better done in assembler, anyway (regardless of ease-of-maintenance). --Blair "The janitorial service industry is 7000 years old, and still nobody thinks there's any dirty work left to be done..."
steve@taumet.com (Stephen Clamage) (05/23/91)
dhoward@ready.eng.ready.com (David Howard) writes: >C compilers for the 80x86 abortchitecture have long=32 and pointer=32, >neither of which is easily supported or natural on that chip. Type long is required to be at least 32 bits. This is reasonable on 386/486. If it is not convenient on 8086/286, that is irrelevant, since the type cannot be smaller. No one would implement long as, say, 36 bits on these machines, since that would in fact be unnatural. Pointers on 8086/286 (or 386 in "real" mode) are either 16 or 32 bits, depending on whether they are "near" or "far", and consequently are easily supported and natural. -- Steve Clamage, TauMetric Corp, steve@taumet.com
phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (05/24/91)
timr@gssc.UUCP (Tim Roberts) writes:
>What if your 64 bit architecture doesn't have any instructions to deal with
>16 bit units? You certainly aren't going to include something as a fundamental
>type when your architecture can't easily deal with it, are you? What if, going
>further, you can't manipulate 32 bit objects either? On such a machine, you
>would probably create short=int=long=64 bits.

I believe the discussion centered around machines that indeed could
manipulate quantities in all the sizes. But you do have a valid point.

The concern I have in the matter is whether or not the capability to
manipulate quantities in 64-bit sizes is left out of the standardized
part of C.

>The point is this: C data types are intended to map into the fundamental
>operating units of the underlying hardware. Discussing the correctness of
>C data type sizing on 64-bit machines in the general case is a pointless waste
>of network bandwidth.

I believe C requires:

	short <= int <= long

But it is also suggested that the fundamental unit be defined as "int",
not as "long". Which way would you go?
-- 
/***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu | Guns don't aim guns at  \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks | people; CRIMINALS do!!  /
\***************************************************************************/
henry@zoo.toronto.edu (Henry Spencer) (05/24/91)
In article <4383@inews.intel.com> bhoughto@pima.intel.com (Blair P. Houghton) writes: >More to the point: the driver developer is going to be >doing many things more heinous than bit-fields... >ANSI C is specifically not designed for that sort of work. Au contraire; C was designed for that sort of work from the beginning, since that was its first major application, and ANSI C did not break this. One needs to be a bit careful nowadays about using things like "volatile", since modern C compilers are much more aggressive than the DMR original that was used to rewrite the Unix kernel in C, but that's a detail. >Such things are often better done in assembler, anyway >(regardless of ease-of-maintenance). They are *almost* always better done in C. Given a good compiler, it's rare for something to be doable in assembler but not in C. -- And the bean-counter replied, | Henry Spencer @ U of Toronto Zoology "beans are more important". | henry@zoo.toronto.edu utzoo!henry