dwb@houxh.UUCP (D.BECK) (09/05/85)
<> What are the reasons for using the type short in C ?  On machines that
have a different size for "int" and "short" the reason seems obvious:
space.  However, I would appreciate any thoughts on their usefulness
there also.  Specifically, I am interested in its use on machines where
sizeof (int) == sizeof (short).

Thx .....

! [allegra | ihnp4 ]! hru3c!peb
guy@sun.uucp (Guy Harris) (09/06/85)
> What are the reasons for using the type short in C ?  On
> machines that have a different size for "int" and "short"
> the reason seems obvious: space.  However, I would appreciate any thoughts
> on their usefulness there also.  Specifically, I am interested
> in its use on machines where sizeof (int) == sizeof (short).

AIEEEEEEEEEEEEEEEEEEEEEE!

You don't choose "int" vs. "short" or "int" vs. "long" based on how big
"int" is on your machine.  Doing that is not writing C code; it is writing
code for that particular machine that just happens to be written in C.  If
you write code that uses "int" instead of "long" because it happens to
work on your machine, sure as Goddess made little green apples *somebody*
is going to be on your doorstep bitching because your code breaks on their
machine.

The penalty for using "int" instead of "short" may not be as high (unless
you're describing some externally-defined data object, like something
recorded in a file or sent over a network, but in those circumstances the
size of data objects is the least of your worries - there's byte order,
1's complement vs. 2's complement, character code, floating point format,
structure padding, etc.).  But if you use "int" instead of "short" on a
machine where sizeof(int) == sizeof(short), then when that code gets moved
to a machine where sizeof(int) != sizeof(short), your program is going to
waste some space.  If the object being declared is a huge array, it is
going to waste a huge amount of space.

There is an unfortunate tendency for C programmers to think in terms of a
concrete machine that they're programming for, rather than an abstract
machine - or, even better, an abstract model of the particular computation
they're performing.  Thinking of data objects not as lumps of machine
words but as abstractions will, I suspect, improve the quality of your
code in general, and specifically its portability.

	Guy Harris
henry@utzoo.UUCP (Henry Spencer) (09/06/85)
> What are the reasons for using the type short in C ? ...
> Specifically... on machines where sizeof (int) = sizeof (short).

Sloppiness in this is common (although not nearly as disastrous as being
sloppy on machines where sizeof(int) == sizeof(long)!).  The main reason
for distinguishing between the two is portability, i.e. making your
programs work on machines which violate this property as well as machines
which satisfy it.
--
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry
nather@utastro.UUCP (Ed Nather) (09/07/85)
> There is an unfortunate tendency for C programmers to think in terms of a
> concrete machine that they're programming for, rather than an abstract
> machine - [...]
>
> 	Guy Harris

The only concrete machine I ever programmed just let you dial in how long
it was supposed to run before the stuff got all mixed up.  :-)
--
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.UTEXAS.EDU
franka@mmintl.UUCP (Frank Adams) (09/10/85)
Or to put it a little differently, use 'int' to optimize access time for the variable, and 'short' to optimize space used.
laura@l5.uucp (Laura Creighton) (09/10/85)
Remember that any code that you write is likely to be ported to machines
you never dreamed of.  [There are gremlins who wait until the crack of
7 a.m., when all hackers are blissfully asleep, to scrounge copies of your
source and distribute it to 50 random net sites all over the world.]  If
your int is the same size as your short it may make no difference to you
whether you use int or short, but it may matter a great deal to someone
else.  [The worst problem is people who think that sizeof(int) ==
sizeof(long), mostly because that is the way it works on their vax.  Move
their code to a pdp 11 and watch it dump core when things just aren't big
enough.]

There are days when I think that the declaration ``int'' should be banned,
to see if the code quality would improve.
--
Laura Creighton (note new address!)
sun!l5!laura	(that is ell-five, not fifteen)
l5!laura@lll-crg.arpa
preece@ccvaxa.UUCP (09/10/85)
> There is an unfortunate tendency for C programmers to think in terms of
> a concrete machine that they're programming for, rather than an
> abstract machine - or, even better, an abstract model of the particular
> computation they're performing.  Thinking of data objects not as lumps
> of machine words but as abstractions will, I suspect, improve the
> quality of your code in general, and specifically its portability.
/* Written 12:58 am Sep 6, 1985 by guy@sun.uucp in ccvaxa:net.lang.c */
----------
But 'int' is a perfectly good abstraction; more abstract than 'short' or
'long.'  The restriction of certain values to certain ranges CAN be part
of an abstraction, but it can also be an incidental factor that is only
useful because some machines make a distinction that makes it useful.
That says to me that use of 'short' or 'long' instead of 'int' shows more
attention to machine specificity.

It may be the case that in a certain piece of code it is possible to prove
that a variable's value must lie in a particular range.  If the programmer
specifies that range somehow, compilers for languages that support that
distinction can produce code taking advantage of it.  From the
programmer's point of view, however, that provable range is probably not
significant to her view of the process.  Often it doesn't matter to the
programmer whether the variable is real or integer, either, but that
abstraction is more deeply ingrained.
--
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece
bc@cyb-eng.UUCP (Bill Crews) (09/10/85)
> What are the reasons for using the type short in C ?  On
> machines that have a different size for "int" and "short"
> the reason seems obvious: space.  However, I would appreciate any thoughts
> on their usefulness there also.  Specifically, I am interested
> in its use on machines where sizeof (int) == sizeof (short).
>
> Thx

The percentage of machines in the universe that are either 16-bit or
32-bit machines is very high.  Therefore, one can get a great degree of
(but certainly not total) compatibility of communicated information by
refraining from declaring ints in structures, and instead declaring either
shorts or longs.  All compilers for such machines of which I am aware
implement short as 16 bits and long as 32 bits, although the definition of
int may go either way.  So, for my data files and communications
protocols, this is what I do.  If I run into a 36-bit or 48-bit machine, I
may have some work to do.

By the way, the uint16-type stuff mentioned lately on the net can help
here also.
--
/ \	Bill Crews  ( bc )
\__/	Cyb Systems, Inc
	Austin, Texas

[ gatech | ihnp4 | nbires | seismo | ucbvax ] ! ut-sally ! cyb-eng ! bc
bc@cyb-eng.UUCP (Bill Crews) (09/10/85)
> There is an unfortunate tendency for C programmers to think in terms of a
> concrete machine that they're programming for, rather than an abstract
> machine - or, even better, an abstract model of the particular computation
> they're performing.  Thinking of data objects not as lumps of machine words
> but as abstractions will, I suspect, improve the quality of your code in
> general, and specifically its portability.
>
> 	Guy Harris

This sounds great.  I agree with it to a point.  But doesn't it depend
upon what one is trying to accomplish?  Certainly those implementing
communications protocols DO care.  Kernel hackers probably care in lots of
places, too, especially those writing device drivers.

The net is that sometimes one wants to get close to the machine, and
sometimes one wants to use C as a high(er)-level language.  That's the
beauty of C; it CAN be used either way.  But it is up to the programmer.

If you really believe what you say, would you support the abolition of the
short and long data types?  Double as well?  I'd be interested.
--
/ \	Bill Crews  ( bc )
\__/	Cyb Systems, Inc
	Austin, Texas

[ gatech | ihnp4 | nbires | seismo | ucbvax ] ! ut-sally ! cyb-eng ! bc
guy@sun.uucp (Guy Harris) (09/12/85)
> > Thinking of data objects not as lumps of machine words
> > but as abstractions will, I suspect, improve the quality of your code in
> > general, and specifically its portability.
>
> This sounds great.  I agree with it to a point.  But doesn't it depend upon
> what one is trying to accomplish?  Certainly those implementing
> communications protocols DO care.  Kernel hackers probably care in lots
> of places, too, especially those writing device drivers.

"Do care" about what?  Even device drivers, protocol modules, etc. are
probably much more likely to be correct if you think of the objects they
manipulate abstractly.  Some bits of code in device drivers on machines
with memory-mapped I/O might have to treat structures which represent
device registers, say, as low-level objects, but 99% of the data the OS
manipulates - and probably a high percentage of the data the device driver
manipulates - are not such low-level objects.

> The net is that sometimes one wants to get close to the machine, and
> sometimes one wants to use C as a high(er)-level language.  That's the
> beauty of C; it CAN be used either way.  But it is up to the programmer.

Even when writing grubby device driver code, you should use C as a
higher-level language.  There's no benefit to be gained from thinking of,
say, the I/O operation/buffer header queue of a disk driver as a bunch of
words, some of which contain addresses, some of which contain counts, etc.
The ability of C to get "close to the machine" is vastly overemphasized;
99% of the code people write, even in OSes and the like, doesn't need to
get "close to the machine" in the same sense as an assembler language gets
"close to the machine" and, in most cases, *doesn't* get "close to the
machine" in that sense.

> If you really believe what you say, would you support the abolition of the
> short and long data types?  Double as well?  I'd be interested.

No, I don't support the abolition of those data types.
"int" means "most convenient integral type, guaranteed to hold numbers in the range -32767 to 32767 but not guaranteed to hold anything outside". Needless to say, this is an extremely inappropriate type to represent "number in the range -1 million to 1 million". The type "long" is needed for this (in ANSI C, "long"s are guaranteed to hold values between -(2^31-1) to (2^31-1)). "int" is "closer to the machine" than "long", so if anything "int" should go if you're trying to increase the distance of code from the machine, not "long" or "short". "int" should not be abolished, though, because in a lot of cases "short" and "long" overspecify the type and don't allow the language implementation enough freedom to choose the most appropriate type. If you have some variable which can use any reasonable amount of space without any serious effect on the space requirements of the program, and which is *never* going to be outside the range -32767 to 32767, "int" is the appropriate choice. Guy Harris
bc@cyb-eng.UUCP (Bill Crews) (09/13/85)
> Even when writing grubby device driver code, you should use C as a
> higher-level language.  There's no benefit to be gained from thinking of,
> say, the I/O operation/buffer header queue of a disk driver as a bunch of
> words, some of which contain addresses, some of which contain counts, etc.
> The ability of C to get "close to the machine" is vastly overemphasized; 99%
> of the code people write, even in OSes and the like, doesn't need to get
> "close to the machine" in the same sense as an assembler language gets
> "close to the machine" and, in most cases, *doesn't* get "close to the
> machine" in that sense.
>
> 	Guy Harris

You are invited to write a 3Com Ethernet driver that works on a PC (16-bit
int) and on a Cyb machine (32-bit int) without referring to longs or
shorts.  I.e., a 16-bit hardware register is a 16-bit hardware register,
despite your desire for abstraction.

I doubt we disagree fundamentally; I just think you are stating the case
too strongly.  My goal is to introduce environment dependencies only as
needed.  I nevertheless believe, for instance, that when Internet protocol
specifies that the header checksum is a 16-bit 2's-complement number, it
is wise to comply, if you want the packets to fly properly.
--
/ \	Bill Crews  ( bc )
\__/	Cyb Systems, Inc
	Austin, Texas

[ gatech | ihnp4 | nbires | seismo | ucbvax ] ! ut-sally ! cyb-eng ! bc
guy@sun.uucp (Guy Harris) (09/14/85)
> But 'int' is a perfectly good abstraction; more abstract than 'short'
> or 'long.'  The restriction of certain values to certain ranges CAN
> be part of an abstraction, but it can also be an incidental factor
> that is only useful because some machines make a distinction that
> makes it useful.  That says to me that use of 'short' or 'long'
> instead of 'int' shows more attention to machine specificity.

The use of "long" instead of "int" shows more attention to machine
specificity?  OK, we have the object "Internet address".  This object can
be represented as, among other things, a 32-bit quantity.  (We neglect the
problem of non-binary machines for the nonce.)  Implementing this object
with an "int" shows a hell of a lot of attention to machine specificity,
since it won't work worth a damn on a PDP-11, or any other machine with
"int"s less than 32 bits (like machines based on current 8086-family
chips, or 68000/68010/68008 machines with 16-bit-"int" compilers).
Implementing it with a "long" shows a lot less machine specificity, since
(according to the ANSI C standard) a "long" can hold numbers in the range
-2147483647 to 2147483647.  (On a two's complement machine, or a one's
complement machine, or even a sign-magnitude machine, this requires 32
bits.)  The same argument applies to "unsigned int" vs. "unsigned long".

If the 4.xBSD networking code had been written with less implicit
knowledge of the machines it would work on - i.e., if the type "long" had
been used where the C specification says it should be (the information
explicitly described in the ANSI C standard is here considered part of the
"implicit" specification of C - yes, it's folklore, but UNIX is still
dominated by folklore) - it would have moved to 2.9BSD more easily.  I
believe a certain popular news-reading system had much the same problem;
it stored the length of an article in an "int" instead of a "long".
Earlier versions of Berkeley Mail had the same problem (and "mailx" is
based on one of those earlier versions, alas).

"int" is to be used to implement "integer" objects whose value will
*never* be outside the range -32767 to 32767, and where the amount of
space taken up by the object is less important than the amount of time
required to manipulate it.  (Well, modulo machines with 16-bit data paths
and large address spaces, where "int"s are often 32 bits even though it
takes more time to manipulate them than it does to manipulate 16-bit
quantities.)

"short" is to be used to implement "integer" objects whose value will
never be outside that range but where the amount of space taken up by the
object is more important than the amount of time required to manipulate it
(either because there's a limit on the address space or physical memory
available, or because the object's representation must conform to some
externally-imposed restrictions).

"long" is to be used to implement "integer" objects whose value can be
outside the aforementioned range, or whose representation must conform to
some externally-imposed restriction that requires the use of "long".

Code that uses "int" to implement objects known to have values outside the
range -32767 to 32767 is incorrect C.  The ANSI standard explicitly
indicates this.  Even in the absence of such an explicit indication in an
official language specification document, this information should be
imparted to all people being taught C.

If you removed "long" from the C language, you would either 1) have a
language incapable of talking about numbers outside the range -32767 to
32767 or 2) have a language which requires at least an 18-bit machine and
probably at least a 32-bit machine.  "short" is less commonly used, since
it provides no guarantees about the range of integral values it can
represent that "int" doesn't provide.
However, correct C code can, in most if not all cases, be moved from one
machine to another (assuming no operating system dependencies) simply by
recompiling it.  Given that, there *is* a reason to use "short" instead of
"int" even if sizeof(short) == sizeof(int), and even if the data doesn't
have to conform to some external specification.  Thinking of "short" as a
compact form of "int", and using it wherever space-efficiency is *or might
be* of primary importance, will yield code that is more likely to run
happily on a variety of machines (and is less likely to piss off the guy
who has to get the program running efficiently on a computer other than
one of the ones the original programmer had in their shop).

> It may be the case that in a certain piece of code it is possible
> to prove that a variable's value must lie in a particular range.
> If the programmer specifies that range somehow, compilers for
> languages that support that distinction can produce code taking
> advantage of it.  From the programmer's point of view, however,
> that provable range is probably not significant to her view of
> the process.

In a lot of cases, I damn well hope that the provable range is significant
to the programmer's view of the process.  In the code

	int a[10];
	int i;

	i = <some value>;
	a[i]++;

if the programmer's view of the process does not include the (provable)
condition that "i" will never have a value outside the range 0 to 9, this
code is incorrect and, by Murphy's Law, will proceed to demonstrate that
fact at the worst possible moment.  Plenty of code demonstrates its
incorrectness in similar fashion; the code

	FILE *foo;
	char buf[SIZE];

	foo = fopen(<some_file_name>, "r");
	fgets(buf, SIZE, foo);

will so demonstrate on a Sun (or a CCI Power 5/20 or a lot of other
machines) simply by being run after ensuring that the file to be opened
and read does not exist.
Replacing it with

	foo = fopen(...);
	if (foo != NULL)
		fgets(...);

will probably render it provable that the "fgets" call won't screw up -
or, at least, won't screw up by reading from an unopened file.

I don't know whether there are proof techniques which are powerful enough
to prove the correctness of code involving subrange types in all
interesting cases and which are practical.  If there are, I'd like to see
them incorporated into compilers, and have the compiler refuse to generate
code unless 1) the necessary checks are put in or maybe 2) explicit
directives are inserted *into the code* to tell the compiler that you know
what you're doing and it should trust you.  (I don't want it to be a
compiler option; I want the code to explicitly indicate that it's being
unsafe.)

	Guy Harris
throopw@rtp47.UUCP (Wayne Throop) (09/15/85)
> > Even when writing grubby device driver code, you should use C as a
> > higher-level language.
> > 	Guy Harris
>
> You are invited to write a 3Com Ethernet driver that works on a PC
> (16-bit int) and on a Cyb machine (32-bit int) without referring to longs
> or shorts.  I.e., a 16-bit hardware register is a 16-bit hardware register,
> despite your desire for abstraction.
> 	Bill Crews

Well, it surely can't be done without referring to C's primitive types,
but I think that it should still be done abstractly.  You need a module
that "knows" about the physical-to-C-types mappings that you need to use,
and there you say

	typedef short	 packet_id_type;	/* network needs 16 bits signed */
	typedef unsigned packet_len_type;	/* network needs 32 bits unsigned */
	typedef long	 io_register_type;	/* a 32 bit signed register */
	... etc, etc

and everywhere else in the code, you refer to the abstract types
packet_id_type, packet_len_type, io_register_type, and so on.  Thus, the
machine-dependent part of things can be kept to a very small part of the
world, by abstracting the machine (or specification) dependent types.

Why is this an advantage?  Because it allows you to have a "handle" on
things that are, say, io_register_type.  If you just declared them "long"
everywhere, you couldn't tell them from file positions, or other things
that might be declared "long".  In this sense, using "long" or "short" or
"int" directly is often "too low level" a use of C.  If you want "a small
integer that won't get too large", use "short".  But if you want "a thing
that must be mapped to a specific size and shape in bits", use an abstract
type, not the (changeable) primitive type.

In "real" code for large systems, I'd hope to almost *never* see
*anything* declared to be any primitive type.
--
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!rtp47!throopw
guy@sun.uucp (Guy Harris) (09/16/85)
> You are invited to write a 3Com Ethernet driver that works on a PC
> (16-bit int) and on a Cyb machine (32-bit int) without referring to longs
> or shorts.  I.e., a 16-bit hardware register is a 16-bit hardware register,
> despite your desire for abstraction.

The object being described is a 16-bit hardware register.  As such,
technically, there's no abstract way to describe it in C except *maybe* a
bit-field.  What happens on an 18-bit machine?  One hopes that 1) the
register is an 18-bit register and 2) "short" is 18 bits on the machine
(or that both are 36 bits).  For that matter, what if the board has
memory-mapped I/O on one machine, but not on another (i.e., you have to
issue I/O instructions to read the device registers)?  If you truly want a
portable driver, you'll have to pick a level of abstraction *above* the
device registers and have 99% of the code in the driver deal with that
abstraction rather than the device registers.  The Ethernet driver has to
deal with a lot of objects that aren't part of the board - a 4.2 driver,
for instance, has to talk to the various protocol modules above it.  That
code doesn't have to get up-close and personal with the machine to the
degree that code that fiddles hardware registers does.

Also, if you can find *any* statement where I say that the use of "short"
and "long" violates proper use of abstractions, and should be avoided, I'd
really like to see it.  In fact, I believe quite the opposite.  In some
cases, "int" better fits the abstract object being described than "short"
or "long" does, and in some cases "short" or "long" better fits it.
"short" and "int" capture a requirement that -32767 to 32767 be the range
of values of this (integral) object equally well.  However, "short"
specifies the amount of space required more stringently than "int" does,
and "int" (sort of) specifies the required efficiency for instructions
that manipulate the object better than "short" does.
Choose the one that specifies the part you *do* care about (and in the
case of a 16-bit device register, "short", while *not* perfectly
specifying this, does so better than "int" does).  If you want something
that has a range of values of -2147483647 to 2147483647 (*not*
-2147483648 - the Sperry 100 users will get a bit annoyed if you assume
that all UNIX machines can represent -2147483648), you should use "long",
even if you "know" that you'll be running on a machine with 32-bit (or
wider) "int"s.

> I nevertheless believe, for instance, that when Internet protocol specifies
> that the header checksum is a 16-bit 2's-complement number, it is wise to
> comply, if you want the packets to fly properly.

So, if your language has a primitive integral data type that specifies
that objects of the type are 16 bits and that arithmetic is done in 1's
complement (not 2's complement - check the IP spec again), use it.  There
is no way in C to say "I want a 16-bit 1's complement number."  (Note that
the 4.2BSD VAX and Sun IP checksum routines both drop into assembler
language at times.)

	Guy Harris
tp@ndm20 (09/18/85)
> I doubt we disagree fundamentally; I just think you are stating the case too
> strongly.  My goal is to introduce environment dependencies only as needed.
> I nevertheless believe, for instance, that when Internet protocol specifies
> that the header checksum is a 16-bit 2's-complement number, it is wise to
> comply, if you want the packets to fly properly.

But what type is that in C?  On a Harris H-series machine, a short is 24
bits, as is an int.  A long is 48 bits.  Obviously this kind of code is
machine dependent.  I believe that the point Guy is trying to make is that
you CAN NOT assume that a short is 16 bits on *every* machine, because it
isn't, so you should recognize that what you are doing is machine
dependent, no matter HOW you code it in C.  If the exact number of bits is
mandated, it is a machine dependence how that will be handled.  If it is
not, then we should deal with abstractions to produce more portable code.

I always use the guideline that if a number is less than 16K, I use short,
and if it is known to be greater than that I use long.  If the range of
values is not well known, or space efficiency is not a great concern, I
use int, as that is presumably the most efficient integer type.  These
aren't the best guidelines in the world, and I am open to better
suggestions.  It would be nice to be able to declare a value range and let
the compiler pick the type based on machine characteristics (a la Pascal).

Terry Poot
Nathan D. Maier Consulting Engineers
(214)739-4741
Usenet: ...!{allegra|ihnp4}!convex!smu!ndm20!tp
CSNET:  ndm20!tp@smu
ARPA:   ndm20!tp%smu@csnet-relay.ARPA
preece@ccvaxa.UUCP (09/19/85)
> The use of "long" instead of "int" shows more attention to machine
> specificity?
/* Written 8:49 pm Sep 13, 1985 by guy@sun.uucp in ccvaxa:net.lang.c */
----------
Well, yes, to my mind.  The programmer, thinking abstractly about a
particular process, is likely to think of a variable quantity as an
integer or as a real.  Precision is a secondary consideration that,
generally, only becomes a primary consideration when the simpler
assumption fails.  The range of an integer is not likely to be a concern
until the programmer is faced with a case in which the default assumption
("it's an integer") fails.  So I submit that as a default, "int" is more
abstract than "short" or "long."  That's why the white book says an int is
"the natural size suggested by the host machine architecture."

In practice, the programmer usually will have a pretty good idea when a
quantity is likely to violate that default assumption on a particular
machine and will work around it accordingly (whether by changing "int" to
"long" or by providing a specialized data type if "long" isn't long
enough).

Now we're going to have a C standard pretty soon, in all probability, and
that may well change the default assumptions.  If "int" is truly defined
as a 16-bit quantity, I will probably change my default working habits to
use long -- otherwise my default abstraction would be violated too often.
Up until the present, however, we have been working with a language in
which the definitions of short, int, and long were specifically machine
dependent, and anyone porting software simply had to be aware of the
obvious places where machine dependencies showed up.

There are other languages where the purer abstraction is quite acceptable.
In Common Lisp an integer can get to be any size it needs to be; the
system will take care of keeping it in an appropriate kind of object for
the size it currently has.  On the other hand, there is a certain amount
of overhead in that approach that you might not want to swallow.
But it IS machine independent.
----------
> Code that uses "int" to implement objects known to have values outside
> the range -32767 to 32767 is incorrect C.  The ANSI standard explicitly
> indicates this.  Even in the absence of such an explicit indication in
> an official language specification document, this information should be
> imparted to all people being taught C.
----------
Now that there is a reasonably firm draft standard, this is a reasonable
statement.  Not very long ago it was religious dogma.
----------
> In a lot of cases, I damn well hope that the provable range is
> significant to the programmer's view of the process.  [gives
> example of variable used as index into fixed size array]
----------
Well, yes and no.  It is significant to the programmer that the value be a
legal index into the array, but that may not be a fixed range in the
programmer's mental model of the process.  That is, it may be temporarily
fixed, simply because it is necessary to provide a value for the
declaration, but that size may be incidental.  In such cases the well-bred
programmer will have provided a #define, with a suggestive name, for the
range of the array and anything that checks against it, but may not be
able to provide a range for the index other than "fitting within the
array," which may not be well modeled by architecturally convenient number
sizes.  An array with a dynamic size is another example.  The point is
that "fitting within the array" is a good and sufficient abstraction for
the programmer's view of the process.
--
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece
peter@graffiti.UUCP (Peter da Silva) (09/25/85)
Would it be a horrid assault on the spirit of 'C' to allow the following:

	int x:24;

which will be allocated the appropriate number of words for the machine
involved?  If this looks too much like a bit-field, and you're allergic to
that for some reason, how about:

	int:24 x;

Then you can define

	float foo:16;

if you really think you can do something useful with 16-bit floats.
Someone must use them for something...
jss@sjuvax.UUCP (J. Shapiro) (10/03/85)
I am inclined to prefer the use of int16, int32, int64, int8, char.  These
leave it entirely unambiguous which one you wanted, and they can be
typedefed by a standard header file to avoid a lot of machine dependency.
It ain't perfect, but it's pretty damn close.  Anyone with a 48 bit int
deserves what he gets ;-)

Jon Shapiro
--
Jonathan S. Shapiro
Haverford College

"It doesn't compile pseudo code... What do you expect for fifty dollars?"
	- M. Tiemann
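The header Jon describes might look like the following sketch, written for
one assumed machine where char is 8 bits, short is 16, and long is 32; the
whole point of the scheme is that only these lines change when those
assumptions do:

```c
/* Hypothetical machine-dependent mapping header.  The commented widths
   are assumptions about this one machine, not guarantees of C; porting
   means editing these typedefs and nothing else. */
typedef char  int8;    /* assumes a signed 8-bit char here */
typedef short int16;   /* assumes a 16-bit short here      */
typedef long  int32;   /* assumes a 32-bit long here       */
/* int64 omitted: 1985 C has no standard 64-bit integral type */

/* Code elsewhere then uses only the unambiguous names: */
int16 checksum;
int32 file_offset;
```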
guy@sun.uucp (Guy Harris) (10/05/85)
> Well, yes, to my mind.  The programmer, thinking abstractly about a
> particular process, is likely to think of a variable quantity as
> an integer or as a real.

Which is a dangerous pattern of thought.  "int"s don't behave like
integers - you were never guaranteed that an "int" could hold numbers
outside the range -32768 to 32767 (the first C implementation, remember,
was on a 16-bit machine) - and "float"s and "double"s definitely don't
behave like real numbers (otherwise, there wouldn't be a discipline called
"numerical analysis").  Given that the maximum absolute value which an
"int" can hold is fairly small, it's not particularly safe to ignore
considerations of range (it's not safe to ignore considerations of
precision in "real"s either, but integers are always precise).
Furthermore, if this variable quantity is to be used as, say, an array
subscript, and the range of valid values for that quantity is not a
primary consideration, that program stands a good chance of getting a
subscript range exception or a wild reference.

> In practice, the programmer usually will have a pretty good idea when a
> quantity is likely to violate that default assumption on a particular
> machine

I don't want them to have a pretty good idea when it's going to violate
that default assumption on a particular machine.  I want them to have a
pretty good idea when it's going to violate that default assumption on a
16-bit-"int" machine; then there won't be so much d*mn code out there that
needs a good going-over by "lint" before it'll run on PDP-11s and
8086-family-chip based machines (or before it'll handle files, or other
objects not constrained by the address space, bigger than 64KB on such
machines).

> Now we're going to have a C standard pretty soon, in all probability,
> and that may well change the default assumptions.
If "int" is > truly defined as a 16bit quantity, I will probably change my > default working habits to use long -- otherwise my default > abstraction would be violated too often. Sorry, "there wasn't a rigorous enough standard for C until recently" is not an excuse for using "int" to hold quantities which could, conceivably, be outside the range -32768 to 32767. K&R states quite clearly that "int" takes up 16 bits on PDP-11s, so there exists at least one (very popular) machine on which the "default assumption" about "int" would be violated. Given the number of PDP-11s (and IBM PCs) out there, that assumption is going to be violated quite often. Even before the ANSI C standard explicitly stated that you cannot count on an "int" holding values outside that range (it does *not* "define it as a 16bit quantity"; it defines it as a quantity which *may* hold values outside the range that fits in a 16-bit quantity, but which is not *guaranteed* to be able to hold values outside that range), it was unsafe and bad practice to so use "int". (Consider all the postings that say "our news system truncates items with more than 64KB, so could you please repost XXX" for an example of why it is a bad practice.) > Up until the present, however, we have been working with a language in > which the definition of short, int, and long were specifically machine > dependent, and anyone porting software simply had to be aware > of the obvious places where machine dependencies showed up. They're *still* machine-dependent; however, some of the constraints on them that were implicitly stated by the enumeration of C implementations in K&R are now stated explicitly. People *porting* software shouldn't have to be aware of those places. People *writing* software - even if they "know" that it'll never be ported - should be aware of them. 
Unfortunately, the people doing the porting get stuck with cleaning up after the mess left by the original author, and those people may have no way to force the original author to fix the problem for them. > > Code that uses "int" to implement objects known to have values outside > > the range -32767 to 32767 is incorrect C. The ANSI standard explicitly > > indicates this. Even in the absence of such an explicit indication in > > an official language specification document, this information should be > > imparted to all people being taught C. > ---------- > Now that there is a reasonably firm draft standard, this is a reasonable > statement. Not very long ago it was religious dogma. Hogwash. If you write code that assumes an "int" can hold things outside that range, you should put something like #ifdef pdp11 FORGET IT, NO WAY, GIVE UP AND GO HOME #endif to emphasise that this code will not run on a PDP-11. Not very long ago K&R stated that "int" was a 16-bit quantity on the PDP-11. At the very least, people writing that code should acknowledge the fact that it won't work on a PDP-11 (or IBM PC, or...). If they'd read the C Reference Manual in K&R, they'd have known that even before there was an ANSI standard. Guy Harris
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/06/85)
> I am inclined to prefer the use of int16, int32, int64, int8, char.
int16 => short
int32 => long
int64 => not a primitive data type on all implementations
int8 => signed char
char => char
Why add more symbols when you already have what is needed in the language?
brooks@lll-crg.ARpA (Eugene D. Brooks III) (10/06/85)
>I am inclined to prefer int8, int16, int32, ....
If that is your inclination then #define or typedef them.
You can even put it in "handy.h". Why bother the net with it?
preece@ccvaxa.UUCP (10/07/85)
> /* Written 1:46 am Oct 5, 1985 by guy@sun.uucp in ccvaxa:net.lang.c */
> Given that the maximum absolute value which an "int" can hold is
> fairly small, it's not particularly safe to ignore considerations of
> range (it's not safe to ignore considerations of precision in "real"s
> either, but integers are always precise).
----------
Well, when I say "ignoring considerations of precision" I'm speaking in ball-park terms. I know when I have to worry about a value possibly not fitting in 32 bits. Most of the quantities most of us deal with most of the time are small integers.
----------
> I don't want them to have a pretty good idea when it's going to violate
> that default assumption on a particular machine. I want them to have a
> pretty good idea when it's going to violate that default assumption on
> a 16-bit-"int" machine;
----------
Well, I can see how that would make life easier for you, but it's not really my problem. The project I work on would have saved a lot of time if the code we're porting hadn't been written for a system using memory-mapped files, but I don't curse the authors for writing for the environment they had.
----------
> K&R states quite clearly that "int" takes up 16 bits on PDP-11s, so
> there exists at least one (very popular) machine on which the "default
> assumption" about "int" would be violated.
----------
It also states quite clearly that on PDP-10s an int is 36 bits. I don't see why I should give more weight to one than the other. The definition of short, int, and long is EXPLICITLY machine dependent in K&R. The key phrase, in my opinion, is "'Plain' integers have the natural size suggested by the host machine architecture; the others are provided to meet special needs." That is pretty explicit.
----------
> (Consider all the postings that say "our news system truncates items
> with more than 64KB, so could you please repost XXX" for an example of
> why it is a bad practice.)
----------
What has that to do with anything? Somebody failed to anticipate future needs and used a short when she should have used a long. There are people who rely on two-digit year codes, too.
----------
> People *porting* software shouldn't have to be aware of those places.
> People *writing* software - even if they "know" that it'll never be
> ported - should be aware of them.
----------
Portability is one of many factors to be considered in setting local coding standards. I have spent a lot of time recently understanding code written for a very different environment and converting it to C. It had lots of size and byte-ordering problems. That's the breaks. It's not the authors' fault that I had different requirements than they did. Now, if I worked for a company in the business of writing software to be marketed across a wide variety of machines and architectures, we would have local conventions designed to make those transitions easy. But it is not our business to produce code that runs on PDP-11s, let alone (as you requested in a previous posting) code that runs efficiently on PDP-11s.
----------
> Hogwash. If you write code that assumes an "int" can hold things
> outside that range, you should put something like
>
>	#ifdef pdp11
>	FORGET IT, NO WAY, GIVE UP AND GO HOME
>	#endif
>
> to emphasise that this code will not run on a PDP-11.
----------
I don't consider the PDP-11 to be a sacred special case. We have code that would break on a PDP-10, too, because it would NOT get arithmetic exceptions where we expect them. So? Here's a concrete example. I don't blame Gosling for alignment problems we had in making Emacs run on our machines. It wasn't a consideration he should have had to worry about. I DO fault Unipress for having the same problem in the latest versions, because that IS something they should have to worry about.

I have no objection to the principle that we should try, other things being equal, to write portable code. But the FIRST consideration of good professional practice is to write code that is clear, maintainable, and efficient in the environment for which we are paid to produce it. It is not bad practice to put that environment first.
--
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece
seifert@hammer.UUCP (Snoopy) (10/07/85)
In article <1924@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>> I am inclined to prefer the use of int16, int32, int64, int8, char.
>
>int16 => short
>int32 => long
>int64 => not a primitive data type on all implementations
>int8 => signed char
>char => char
>
>Why add more symbols when you already have what is needed in the language?

For clarity and portability.

Snoopy
tektronix!tekecs!doghouse.TEK!snoopy
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/09/85)
> >Why add more symbols when you already have what is needed in the language? > > For clarity and portability. Neither of those has been demonstrated.
mwm@ucbopal.BERKELEY.EDU (Mike (I'll be mellow when I'm dead) Meyer) (10/09/85)
In article <1924@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>> I am inclined to prefer the use of int16, int32, int64, int8, char.
>
>int16 => short
>int32 => long
>int64 => not a primitive data type on all implementations
>int8 => signed char
>char => char
>
>Why add more symbols when you already have what is needed in the language?

Doug,

If someone on a machine that supports 60+ bit ints uses one in their code, and later you have to port it, you should hope they did:

	typedef long int60;	/* or whatever the type is for 60 bit ints */

and then used int60 instead of long. You see, if they do that, then when you compile the program and notice it giving you funny numbers (assuming, of course, that you notice :-), a single grep will find all the places where the variables you need to worry about are declared. Of course, if there were some standard place to look for those typedefs, and they had included that, then when you compiled the program, it would give you the same list as the grep.

Likewise, if *your* code uses int8/int16/whatever correctly, specifying the number of bits needed, then the file of typedefs will get them a reasonable type for those variables.

	<mike
hamilton@uiucuxc.CSO.UIUC.EDU (10/09/85)
too bad you can't do something like:

	#define INT(max) \		/* /usr/include/int.h */
	#if max<32768 \			/* machine-dependent */
		short \
	#else \
		long \
	#endif

	INT(20000) x;	/* -> short x; */
	INT(50000) y;	/* -> long y; */

with cpp. maybe m4? not real pretty, but then neither is "int16", etc. i think it makes more sense to declare value range(s) than significant bits. at the least, it provides an extra degree of self-documentation. (quick, somebody make me shut up before i say something nice about pascal!)

	wayne hamilton
	UUCP:	{ihnp4,pur-ee,convex}!uiucdcs!uiucuxc!hamilton
	ARPA:	hamilton@uiucuxc.cso.uiuc.edu
	CSNET:	hamilton%uiucuxc@uiuc.csnet
	USMail:	Box 476, Urbana, IL 61801
	Phone:	(217)333-8703
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/10/85)
> typedef long int60 ;	/* or whatever the type is for 60 bit ints */

My point is, if more than 32 bits are needed then this code is not going to port anyway, no matter what typedefs you use.

> Likewise, if *your* code uses int8/int16/whatever correctly, specifying the
> number of bits needed, then the file of typedefs will get them a
> reasonable type for those variables.

If you use the proper C data types, there will be no need to worry about this at all; the code will work on all standard-conforming implementations without any change whatsoever. There is no need to invent system-specific typedefs for any integer type through 32 bits, and for longer integer types typedefs are not sufficient.
mwm@ucbopal.BERKELEY.EDU (Mike (I'll be mellow when I'm dead) Meyer) (10/12/85)
In article <2032@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>> typedef long int60 ;	/* or whatever the type is for 60 bit ints */
>
>My point is, if more than 32 bits are needed then this code is
>not going to port anyway, no matter what typedefs you use.

If it's gotta be ported, then a port will consist of chasing down all the really long variables, and replacing expressions that use them with the appropriate calls on the mp library. It'd be nice if those variables were tagged for you. If it doesn't *have* to be ported, it would be nice if it wouldn't compile, so that you avoid the possibility of getting bogus answers. Admittedly, this problem could be solved by a C compiler that trapped integer overflows. Anybody got one :-).

>> Likewise, if *your* code uses int8/int16/whatever correctly, specifying the
>> number of bits needed, then the file of typedefs will get them a
>> reasonable type for those variables.
>
>If you use the proper C data types, there will be no need to worry
>about this at all; the code will work on all standard-conforming
>implementations without any change whatsoever. There is no need
>to invent system-specific typedefs for any integer type through 32
>bits, and for longer integer types typedefs are not sufficient.

You missed the key word, "reasonable." It's unreasonable to declare a large array with types twice as large as they need to be. It's non-portable to use the builtin name for the type of reasonable size. Ergo, a typedef that says what size it is, and a standard file that turns those typedefs into the correct builtin type for that machine. This allows for both reasonable and portable code.

	<mike
guy@sun.uucp (Guy Harris) (10/13/85)
> > I don't want them to have a pretty good idea when it's going to violate
> > that default assumption on a particular machine. I want them to have a
> > pretty good idea when it's going to violate that default assumption on
> > a 16-bit-"int" machine;
> Well, I can see how that would make life easier for you, but it's not
> really my problem. The project I work on would have saved a lot of
> time if the code we're porting hadn't been written for a system using
> memory-mapped files, but I don't curse the authors for writing for the
> environment they had.

One can write:

	int size_of_UNIX_file;

or one can write

	long size_of_UNIX_file;

The former is incorrect, and the latter is correct. The two are equivalent on 32-bit machines, so there is NO reason to write the former rather than the latter on a 32-bit machine. If one can write code for a more general environment with NO extra effort other than a little thought, one should curse an author who didn't make that extra effort.

> > (Consider all the postings that say "our news system truncates items
> > with more than 64KB, so could you please repost XXX" for an example of
> > why it is a bad practice.)
> What has that to do with anything? Somebody failed to anticipate
> future needs and used a short when she should have used a long.

The code in question uses an "int" where it should have used a "long". Using a "short" would have been *more* acceptable; the documentation for this system says <items> can be up to 65535 bytes long (2^32 bytes in 4.1c BSD). Since the only *real* constraint on the size of items is the amount of disk space available and the time taken to transmit the items, neither of which is significantly affected by the width of a processor's ALU and registers, the system should not make the maximum item size dependent on either of those two factors. The ideal would have been to use "long" instead of "int"; however, if the cost of converting item databases on PDP-11s would have been too high, using "short" would have been acceptable. Better still would have been to do something like

	#ifdef BACKCOMPAT
	typedef unsigned int itemsz_t;
	#else
	typedef unsigned long itemsz_t;
	#endif

and *not* restrict items to 65535 bytes by default; if it's really too much trouble for a site to convert its database, then they can build a version which is backwards-compatible with older versions.

> There are people who rely on two-digit year codes, too.

Yes, but how many of them rely on two-digit year codes on 16-bit machines and four-digit year codes on 32-bit machines? Not planning for future needs may be regarded as a misfortune; having a system like the aforementioned meet future needs or not depending on the width of a machine's registers looks like carelessness. (Sorry, Oscar, but I didn't think you'd mind...)

There are cases where the difference between a 16-bit machine and a 32-bit machine *is* relevant; an example would be a program which did FFTs of large data sets. I have no problem with 1) the program being written very differently for a PDP-11, which would have to do overlaying, or provide a software virtual memory system, or perform some other technique to do the FFTing on disk, and for a VAX, where you could (assuming you could keep the entire data set in *physical* memory) write it in a more straightforward fashion (although, if it *didn't* all fit in physical memory, it would have to use some techniques similar to the PDP-11 techniques to avoid thrashing), or 2) saying "this program needs a machine with a large address space".

> Portability is one of many factors to be considered in setting local
> coding standards. I have spent a lot of time recently understanding
> code written for a very different environment and converting it to C.
> It had lots of size and byte-ordering problems. That's the breaks.
> It's not the authors' fault that I had different requirements than they.

In many of these cases, there is little if any gain to be had by writing software in a non-portable fashion. Under those circumstances, it *is* the authors' fault that they did something one way when they could have done it another way with little or no extra effort. In the case of byte ordering, it takes more effort to write something so that data is portable between machines. If it's a question of a program which *doesn't* try to exchange data between machines and *still* fails on machines with a different byte order than the machine for which it was written, there'd better have been a significant performance improvement gained by not writing it portably. And in the case of using "long" vs. "int", there is NOTHING to be gained from using "int" instead of "long" on a 32-bit machine (on a truly 32-bit machine, "long"s and "int"s will both be 32-bit quantities unless the implementor of C was totally out to lunch), so it SHOULD NOT BE DONE. Period.

> But it is not our business to produce code that runs on PDP-11s,
> let alone (as you requested in a previous posting) code that runs
> efficiently on PDP-11s.

I made no such request, but I'll let that pass. If you can get code that runs on PDP-11s with no effort other than getting people to use C properly, it *is* your business to get them to so use C and write portable code whenever possible. If your system permits code to reference location 0 (or whatever location a null pointer points to, assuming it doesn't have a special "invalid pointer" bit pattern), it *is* your business not to write code which dereferences null pointers - such code is NOT valid C. Programmer X can get away with writing code like that, if they have such a system; programmers Y, Z, and W, who work for a company which does not permit code to get away with dereferencing null pointers, have every right to stick it to programmer X when their company's customers stick it to them because "your machine is broken and won't run this program". Saying "programmer X is not at fault" is blaming the victim, not the perpetrator.

> I have no objection to the principle that we should try, other things
> being equal, to write portable code. But the FIRST consideration of
> good professional practice is to write code that is clear,
> maintainable, and efficient in the environment for which we are paid
> to produce it. It is not bad practice to put that environment first.

If all other things are not equal, or close to it, I have no objection to unportable code. The trouble is that people don't even seem to try to write portable code when they *are* equal. It *is* bad practice to blindly assume that the environment you're writing for is the only interesting environment. Some minimum amount of thought should be given to portability, even if portability concerns are rejected. Can you absolutely guarantee that the people who paid you to write that code won't ever try to build it in a different environment? If not, by writing non-portable code you may end up costing them *more* money in the long run; it's more expensive to retroactively fix non-portable code than to write it portably in the first place. If somebody says that, now that ANSI C finally "defines 'int's as 16-bit quantities", they'll start thinking about when it's appropriate to use "long" and when it's appropriate to use "int", they haven't given the proper minimum amount of thought to portability.

	Guy Harris
henry@utzoo.UUCP (Henry Spencer) (10/13/85)
> I have no objection to the principle that we should try, other things
> being equal, to write portable code. But the FIRST consideration of
> good professional practice is to write code that is clear,
> maintainable, and efficient in the environment for which we are paid
> to produce it. It is not bad practice to put that environment first.

It must be nice to be so confident that your environment will never, ever, ever change radically. Situations where such confidence is justified are rare; perhaps your situation is such, but this is unusual. One major advantage of Unix is that it does *not* tie you to any single environment... but that advantage is wasted if your own code does. We may be especially conscious of this because our environment is scheduled to change radically sometime in the next year or so, but the principle is valid in general: making your code machine-dependent limits its lifetime. This is sometimes appropriate... but only sometimes. Less often than most people think.

With certain specific exceptions (e.g. device drivers, the insides of hot RasterOp implementations, the insides of strcpy(), etc.), portable C code is portably efficient as well. Clarity and maintainability are fairly orthogonal to portability; if anything, there is a positive correlation, because machine-dependent microsecond-grubbing tends to be unclear and hard to maintain too.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
bc@cyb-eng.UUCP (Bill Crews) (10/14/85)
> I have no objection to the principle that we should try, other things
> being equal, to write portable code. But the FIRST consideration of
> good professional practice is to write code that is clear,
> maintainable, and efficient in the environment for which we are paid
> to produce it. It is not bad practice to put that environment first.
> --
> scott preece

Yeah, I know, but logic and rationality aren't near as much fun as religion! It is MUCH better to tie up our phone lines and disk space with religious ranting and raving than with . . . (yuck!) . . . rationality. :-)
--
- bc -
..!{seismo,topaz,gatech,nbires,ihnp4}!ut-sally!cyb-eng!bc   (512) 458-6609
ado@elsie.UUCP (Arthur David Olson) (10/21/85)
> If you use the proper C data types, there will be no need to worry > . . .at all; the code will work on all standard-conforming > implementations without any change whatsoever. . . Hmmm. . .last time I looked there were no (as in zero) standard-conforming implementations (a small side effect of the standard not yet having been agreed to, no doubt). -- C is a Jack Benny/Mel Blanc trademark. -- UUCP: ..decvax!seismo!elsie!ado ARPA: elsie!ado@seismo.ARPA DEC, VAX and Elsie are Digital Equipment and Borden trademarks (Is ARPA a DARPA trademark?)
jsdy@hadron.UUCP (Joseph S. D. Yao) (10/29/85)
In article <2883@sun.uucp> guy@sun.uucp (Guy Harris) writes:
>One can write:
>	int size_of_UNIX_file;
>or one can write
>	long size_of_UNIX_file;
>The former is incorrect, and the latter is correct. ...

Actually, if you are trying to write portable code, NEITHER is correct. This particular problem is exactly why we have the typedef, off_t.

	off_t size_of_UNIX_file;

is correct for portable code. However, either of the above is correct for throwaway code on machines for which each one happens to be true. The trouble is, by not developing good (read: innocuous but portable) habits in throwaway code, if you suddenly decide that you are an Implementor of Portable Code, you will have a lot of trouble getting used to the "new" way of writing code.
--

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
franka@mmintl.UUCP (Frank Adams) (11/02/85)
[Not food]

In article <48@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>The trouble is, by not
>developing good (read: innocuous but portable) habits in throwaway code,
>if you suddenly decide that you are an Implementor of Portable Code,
>you will have a lot of trouble getting used to the "new" way of writing
>code.

The trouble is, you don't know what pieces of code aren't going to be thrown away. You may suddenly find that you *were* an Implementor of [Not Very] Portable Code. Better to do it right the first time.

Frank Adams                           ihpn4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108
meier@srcsip.UUCP (Christopher M. Meier) (11/05/85)
In article <48@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>
>Actually, if you are trying to write portable code, NEITHER is correct.
>This particular problem is exactly why we have the typedef, off_t.
>
>	off_t size_of_UNIX_file;
>
>is correct for portable code.
>
>However, either of the above is correct for throwaway code on machines
>for which each one happens to be true. The trouble is, by not
>developing good (read: innocuous but portable) habits in throwaway code,
>if you suddenly decide that you are an Implementor of Portable Code,
>you will have a lot of trouble getting used to the "new" way of writing
>code.
>--
>	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

Can someone suggest a good reference (or references) for developing good portable code? We are writing code that will eventually be used on machines other than our current 750 Vax running 4.2, and I would like to make sure we won't have to spend time rewriting code.

Christopher Meier			{ihnp4!umn-cs,philabs}!srcsip!meier
Honeywell Systems & Research Center	Signal & Image Processing / AIT
guy@sun.uucp (Guy Harris) (11/11/85)
> Can someone suggest a good reference (or references) for developing
> good portable code? We are writing code that will eventually be used
> on machines other than our current 750 Vax running 4.2, and I would
> like to make sure we won't have to spend time rewriting code.

No, but I think Harbison and Steele mentions some things in passing. Laura Creighton is thinking of writing such a book. Until it comes out, here are some rules:

Rule 1.  Run your code through "lint".
Rule 2.  Be careful about using "int".
Rule 3.  Run your code through "lint".
Rule 4.  Be careful about declaring functions which return things other than "int", like "long" or pointers.
Rule 5.  Run your code through "lint".
Rule 6.  There is NO rule 6.
Rule 7.  Run your code through "lint".
Rule 8.  Be careful about casting values to their proper type when passing them as arguments - if a routine expects a "long", or a pointer to a particular type, make sure it gets it. For instance, never pass 0 or NULL if the routine expects a pointer - *always* cast it to a pointer of the appropriate type.
Rule 9.  Run your code through "lint".
Rule 10. Never ever ever ever ever assume that you can dereference a pointer which may be null.
Rule 11. Run your code through "lint".
Rule 12. Never assume that the bytes of a word or a longword are in a particular order.
Rule 13. Run your code through "lint".
Rule 14. Never assume that "char" is signed.
Rule 15. Run your code through "lint".
Rule 16. Never assume that you can turn a "char *" which points into the middle of a word or longword into a "short *", "int *", or "long *" and use the pointer in question; VAXes don't impose boundary alignment restrictions, but lots and lots of other machines do.
Rule 17. Run your code through "lint".
Rule 18. Never assume what the padding between structure members is - it's machine-dependent.
Rule 19. Run your code through "lint".

	Guy Harris
jsdy@hadron.UUCP (Joseph S. D. Yao) (11/14/85)
Here are some more.

> Make as much static as possible. (No, not electricity.) Rephrase: restrict the scope of all variables and functions as much as possible. Use auto's and static's in preference to extern's.

> Declare all functions which return a value as such.

> If possible, declare non-value-returning functions as void.

> After (not if) you use lint, do as little type-casting as possible. Instead, take a long look at what you're doing. Are you forgetting to check return values? Passing more bits than you can use? ... THEN cast types.

> If you are using an extern in exactly the same way in code in different functions in different modules, perhaps you can make a single function to do all of this, and reduce the scope of the extern.

> Do not use constants in functions. Well, maybe 0. MAYBE 1 or -1. Never 0 for a null pointer, or end-of-string. NEVER strings.

> NUL is not NULL. ('\0' and (char *)0 may well be the same -- but it's not saying what you mean.) Or, did you mean to say that a nul character was the same object as a null pointer?

> Always check. Your return values. Your pointers. Your data, before you divide. This is not so much "portability" as "defensive programming," but what the hey.

More abounds. Want more bounds? Keep asking. ;-)
--

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}