james@parkridge.UUCP (06/17/87)
In article <12670@topaz.rutgers.edu> hedrick@topaz.rutgers.edu.UUCP writes:
>Unfortunately in C (as most other languages) there is no distinction
>between how you describe variables to be used within your program and
>how you describe external objects.  The result is that network code ...
>pointers.  But if we are to have any hope of writing portable network
>code, there has to be some way to say that something is a 16 or 32
>bit object.  Currently short and long are it.  Anybody have a better
>idea?  The only alternative I can think of is to use long:16 and
>long:32.  Presumably that would continue to work if longs expanded.

     Forgive me if this is a little bit naive, but what about having
system-wide constants that tell the compilers (for whichever languages are
available) what the sizes of the objects really are?  For example, cc would
know that chars are w bits long, ints are x, shorts are y, and longs are z.
All the user would have to do is set up defines (or whatever) that request
a minimum and maximum size for the objects required, and make sure that
these constraints are followed strictly within his/her code.  When the
compiler went at it, it would see the requested sizes and make sure that it
could satisfy them on the current machine while still following the K&R
rules.  If it couldn't, it would scream.  For example:

#define MIN_CHAR   8    /* Minimum sizes required, maxima are optional */
#define MAX_CHAR   8    /* Compiler has free rein to shift about       */
#define MIN_SHORT  8    /* sizes within the limits imposed here...     */
#define MAX_SHORT 16
#define MIN_INT    8
#define MAX_INT   16
#define MIN_LONG  16

     Anyone have any reasons why this sort of thing wouldn't work?  This is
just off the top of my head, but it seems reasonable if you really want
portable code and are prepared to put more work into it.
--
James R. Sheridan                        ..utzoo!parkridge!pcssun!james
Parkridge Computer Systems Inc.
710 Dorval Drive, Suite 115              YOU can help wipe out COBOL in
Oakville, Ontario, CANADA  L6K 3V7       our lifetime!!
(416) 842-6873
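A minimal sketch of how the minimum-size checks described in this post could
be written, assuming an ANSI-style preprocessor with #error and a <limits.h>
header; the thresholds and messages are illustrative and not part of the
original proposal:

        /* Refuse to compile if the machine cannot satisfy the program's
         * declared minimum sizes (hypothetical checks mirroring the
         * MIN_CHAR / MIN_SHORT / MIN_LONG requests above). */
        #include <limits.h>

        #if CHAR_BIT < 8
        #error "char narrower than 8 bits -- cannot satisfy MIN_CHAR"
        #endif

        #if SHRT_MAX < 32767
        #error "short narrower than 16 bits -- cannot satisfy MIN_SHORT"
        #endif

        #if LONG_MAX < 2147483647
        #error "long narrower than 32 bits -- cannot satisfy MIN_LONG"
        #endif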
jerry@oliveb.UUCP (07/09/87)
In article <1987Jun16.170300.9918@parkridge.uucp> james@parkridge.UUCP (James Sheridan) writes:
>     Forgive me if this is a little bit naive, but what about having
>system-wide constants that tell the compilers (for whichever languages are
>available) what the sizes of the objects really are?  For example, cc would
>know that chars are w bits long, ints are x, shorts are y, and longs are z.

It is an interesting idea, but I can see one problem.  Normally you load
your program with a previously compiled library.  The routines in the
library expect and return values of a specific size, not whatever size you
asked the compiler to use for your compilation.  And of course the system
calls have similar expectations.  For example, if you have some code that
insists that longs must be only 16 bits, the compiler should be able to
handle this easily.  However, if your program uses lseek then the arguments
are going to be a bit confused.

I prefer having new types, defined by some method, that allow a more
specific type definition.  In this way you can use an "int16" when you must
have a 16 bit integer and use a (long) cast if you must pass that to
something requiring a long.  For less stringent storage you can use a
generic long defined to be whatever is efficient on that system.  The
remaining problem is that the compiler may not support a type you need.
Something like an int12 or an int64 might work on some systems but isn't
likely to be available elsewhere.

On a related issue: is anyone familiar with a C compiler where int was not
the same size as short or long?  I mean where short was 16 bits, int was
32, and long was 64.

                                Jerry Aguirre
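A rough sketch of the "new types" approach described in this post, assuming
a machine-specific header picks the typedefs; the names int16/int32 and the
pdp11 test are hypothetical, and the library is still called with its own
expected types:

        /* Pin storage to an exact width, but cast back to the library's
         * type (plain long for lseek) at the call site. */
        #ifdef pdp11
        typedef int   int16;            /* int is 16 bits on the PDP-11   */
        typedef long  int32;
        #else
        typedef short int16;            /* the common 16-bit short case   */
        typedef long  int32;
        #endif

        extern long lseek();            /* library still traffics in long */

        void
        skip_record(fd, reclen)
        int fd;
        int16 reclen;                   /* storage is exactly 16 bits...  */
        {
                lseek(fd, (long) reclen, 1);    /* ...cast to long here   */
        }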
jbn@glacier.UUCP (07/09/87)
Newsgroups: comp.unix.wizards
Subject: Re: Type size problems
References: <3659@spool.WISC.EDU> <743@geac.UUCP>
Reply-To: jbn@glacier.UUCP (John B. Nagle)
Organization: Stanford University

     I did some work in this area at one time, back when Ada came in four
colors, and proposed some approaches that are sound but have more of a
Pascal or Ada flavor than C programmers are used to.  My basic position was
similar to that taken by the IEEE floating point standards people: the
important thing is to get the right answer.  As it turns out, with some
work in the compiler, we can do integer arithmetic in a completely portable
way with no loss in performance.

1.  Sizes belong to the program, not to the machine.  Thus, integer
variables should be declared by range, by giving a lower and upper bound
for the value.  (In Pascal, this is called a "subrange", reflecting the
assumption by Wirth that the type "integer" is somehow big enough for all
practical purposes.  This reflects the fact that he was using a Control
Data 6600, a machine with a 60-bit word, when he designed Pascal.)  For
example, in Pascal, one writes

        VAR x: 0..255;

2.  Named types (such as "int" and "short") should be predefined but not
built in, and thus redefinable if needed.  Some standard definitions such
as "unsigned_byte" should be defined the same way in all implementations,
but in general programmers should use ranges.  (Of course, when declaring a
range, expressions evaluatable at compile time should be allowed in range
bounds.  Pascal doesn't allow this, which results in great frustration.)

        VAR unsigned_short: 0..65535;

is a typical declaration in Pascal.  C should have equivalent syntax.  It's
silly that one has to guess what the type keywords mean in terms of numeric
value in each implementation yet can't simply write the range when you want
to.  Thus, if we had syntax in C for ranges, along the lines of

        range 0..65535 unsigned_short;

we could do in C what one can do in Pascal.  Given range declarations, one
can create the "fundamental" types of C:

        typedef range 0..255            unsigned_byte;
        typedef range -(2^15)..(2^15)-1 short;
        typedef range 0..(2^16)-1       unsigned_short;
        typedef range -(2^31)..(2^31)-1 long;
        typedef range 0..(2^31)-1       unsigned_long;

These should be in an include file, not built into the compiler.

3.  Now here's the good part.  The compiler has to pick the size of
intermediate results.  (When we write "X = (A+B)+C;", "A+B" generates an
intermediate result.)  The compiler should always pick a size for an
intermediate result such that it cannot overflow unless the final result
would also overflow.  This strange rule does what you want: if you write
"X = X+1", and X has the range -32768..32767 (what we usually call
"short"), then there's no need to compute a long result for "X+1", even
though, if X=32767, overflow would occur, because overflow would also occur
in the final result, which is an error.  (One would like to check for such
errors; on VAXen, one can enable such checking in the subroutine entry
mask.  But nobody does; I once built PCC with it enabled, and almost no
UNIX program would work.  More on this later.)  On the other hand, if one
writes "X = (A*B)/C;", and all variables are "short", the term "A*B" will
be computed as a "long" automatically, thus avoiding the possibility of
overflow.  (If you don't like that, you would write "X = ((short)(A*B))/C;"
and the compiler would recognize this as a statement that A*B should fit in
a "short".)
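For comparison, here is a sketch of what the rule in point 3 amounts to when
written by hand in present-day C, with the widening made explicit through
casts; it is an illustration of the intent, not the proposed compiler
behaviour:

        /* A, B, C and the result are all "short"; the product is carried
         * in a long so the intermediate cannot overflow unless the final
         * quotient would.  Under the proposal the casts are implicit. */
        short scale(A, B, C)
        short A, B, C;
        {
                return (short) (((long) A * (long) B) / (long) C);
        }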
4.  Sometimes, but not often, one wants overflow, usually because one is
doing checksumming, hashing, or modular arithmetic.  The right way to do
this is to provide modular arithmetic operators.  One should be able to
write

        X = MODPLUS(X,1,256);

and get "(X+1) % 256".  The compiler must recognize as special cases
modular arithmetic with bounds of 2^n, and especially 2^(8*b), and do those
efficiently.  The above example ought to compile into a simple byte-wide
add on machines that have the instruction to do it.

5.  Some intermediate results aren't computable on most machines.

        short X, A, B, C, D, E, F, G, H, I;
        X = (A * B * C * D * E * F * G * H) / I;

should generate an error message at compile time indicating that the
intermediate result won't fit in the machine.  If the user really wants
something like that evaluated (and recognize that for most random values
the above expression would overflow), some casts or coercions will be
necessary to tell the compiler what the user has in mind.  Note that some
programs that will compile on some machines won't compile on others.  This
is better than getting the wrong answer.

6.  Function declarations have to be available when calls are compiled, so
the compiler can see what types it is supposed to send.  Ada and Pascal
work this way, and C++ moves strongly in that direction.

7.  There probably shouldn't be a predefined type "int" or "integer" at
all.  (I've been thinking of publishing the thinking shown here under the
title "Type integer considered harmful".)  There's a general trend toward
making integer arithmetic portable in LISP, where unlimited-length integers
are often supported.  To the Common LISP programmer, the width of the
underlying machine's numeric unit is irrelevant.  The performance penalty
for this generality in LISP is high.  But we can achieve equivalent
portability in the hard-compiled languages with some effort.

     This discussion probably should move to the C or C++ groups.

                                        John Nagle
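A crude macro rendering of the MODPLUS operator from point 4, meant only to
pin down its semantics; a real implementation would live in the compiler so
that power-of-two moduli become a masked or byte-wide add:

        /* Hypothetical stand-in for the proposed MODPLUS operator.
         * Correct only for non-negative operands and only while x + y
         * stays within the range of the operand type; the point of a real
         * operator is that the compiler would widen or mask as needed and
         * pick the cheap instruction for moduli such as 256. */
        #define MODPLUS(x, y, m)  (((x) + (y)) % (m))

        /* e.g. an 8-bit checksum accumulator:  sum = MODPLUS(sum, c, 256); */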