cottrell@nbs-vms.ARPA (02/08/85)
/* -) This discussion is getting ugly. I am not a `dedicated amateur hacker.' I am quite good as a matter of fact. I care a lot about quality code, more than most people. I will therefore try & wrap this up.

-) I hope that ANSI gets the standard out soon. Please remember we are a privileged group in that we get to hear many viewpoints, including such annoying ones as myself. Many people don't even know there will be a standard. Lots of people work in a vacuum.

-) I am shocked to find out that pointers may be of different sizes, & may be of a different size than int. This has been true for so long many people just assumed it. I believe it should be true wherever possible for the following reason: if a default exists, it should be a useful one. Defining different sizes for these items gives credibility to the claim that C is dangerous. Just another accident waiting to happen.

-) Perhaps you forgot some of the attraxions of the language: a terse language that modelled the machine closely. No stupid booleans. All those casts are ugly. How are we to convert people to C if they must put up with all that verbosity? Shouldn't the compiler know? Okay, automatic casting will help *execution* correctness, but with the declarations for funxions in another file, the code will still read the same (casts optional). Mostly no one looks at header files.

-) I apologize for calling (presumably someone's favorite) machines `weird' or `braindamaged'. Let's say `less capable'. The pdp-11 was a landmark in machine design. It was byte addressable, had a stack, memory mapped i/o, an orthogonal instruxion set, and useful addressing modes. The vax continued this trend. Most micros (all?) are byte addressable. Most have an address space two to the size-of-register-in-bits power. Most of the machines designed before this were not byte addressable. Most of these machines had some very strange glitches. In short, weird. Some minis continued the mainframe trend and only addressed words. This sort of machine is an inhospitable host for the C language and some implementations are downright kluges. I claim that they don't run C but another language I would call `C--'.

-) While you are claiming that it is MY CODING PRACTICES (and evidently hordes of others, including 4.2bsd & sys III implementors) that are nonportable, I am claiming that it is THOSE WEIRD MACHINES that are nonportable. By changing the rules in the middle of the game, you are depriving me (and others) of the time-honored tradition of punning. I know it is easier to change the language than the machines. I say don't do it. Why encourage the production of out-dated hardware?

-) I still maintain that assigning zero to a pointer stores an unspecified number of zero bits. The nil/null ptr is a convention, just like null terminated strings. We all agree that zero is special because there is not likely to be real data there. The null ptr is an out-of-band value. We agreed to represent it in-band. Still, a piece of kernel code should be able to pick up the byte at address zero by:

    int j;
    char *p;
    p = 0;
    j = *p;

Allowing any other value to actually be stored breaks this. Besides, SHOW ME A C THAT USES ANYTHING OTHER THAN ZERO ON ANY MACHINE!!! K&R says of pointer to integer conversions: "The mapping funxion is also machine dependent, but is intended to be unsurprising to those who know the machine." I would be surprised at nonzero null ptrs.

-) Guy: if I want the square root of four, I do sqrt(4.0); NO CAST!
-) As for the Honeywell x16 & CDC 17xx which were 16-bit word-addressable only (I presume a word pointer in one word, one bit representing left/rite in the second word?) there is another solution, albeit klugy: put the whole thing in one word & restrict character addressing to the lower half of the address space. The Varian V7x series (upgraded 620i) uses this format, altho words can only reference 32k words because bit 15 is used for indirexion and byte addressing will not indirect. Yeah, I know, gross. This can be mitigated if they have memory management. Why would you need 64k words instead of 32k? Hey, it's finite. Get a bigger machine if you need one.

-) Perhaps I forgot the :-) on my `swapping by xor' article. Like Guy said, "cute", but not very user friendly.
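-) For those who missed the `swapping by xor' article, the trick was (roughly):

    x ^= y;
    y ^= x;
    x ^= y;

No temporary needed. Do it thru two pointers that name the same object & you zero the thing out instead of swapping it. Cute, like Guy said, tho maybe not user friendly.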
-) How many of you port what percentage of programs? I thought the intent of the standard was to not break existing programs. I claim that the standard should recognize the existing idioms. Languages are also defined by usage as well as specification. Pretty soon you will be passing descriptors around instead of objex!

-) I will present (in another article, this one's getting too long) a good reason for type punning. Stay tuned to this channel...

-) I started out to be reconciliatory; I am unfortunately (for y'all) more convinced of my position. Eat flaming death fascist media pigs! */

jsdy@SEISMO.ARPA (02/08/85)
I'm kinda tired of cottrell's insistent flaming. I am almost convinced that he does a lot of it just to get under the skins of people like Guy, who can get very righteously provoked. ;-)/;-S

I think most folk would agree that we like C's terse style, and that once we have its capabilities on our particular machine down pat, we can write incredibly clever and, yes, punning programs that do all manner of wonderful things. We can do this quickly, brightly, and with beauty, but not necessarily portably. And I'm using that word in a highly literal sense there! If it ports to 99.9% of correct compilers but not to 0.1%, then it is not portable. That is n o t to say it's bad code -- it does what you want, if you don't want to run on the 0.1%. I know that cottrell has said any number of times he doesn't, so his code is still GOOD CODE. At least, as far as this criterion goes.

However, there is an important subgroup of us to whom it is important to know exactly what is legal for 100% of all correct compilers, and what is not. This is a subgroup, not the whole. And to that group, it matters much that a program to be 100% portable must take into account "weird" machines and odd-sized word/byte/???? sizes. And, yes, pointers like the DEC-10 byte pointers! (18 bits address, 18 bits byte specifier, remember?)

There is room in this world for all of us. I fall into each group, on occasion, although I must admit that Guy sounds much purer than anything I've ever written. (Of course, I've never written ANSI code ...). However, there is not room in my notesfiles for so many flames! [;-)] I am not "the legendary Loren", and have trouble keeping up with my limited number of newsgroups. So, let's keep it to more light than heat, OK? ;-);-);-)

    Joe Yao
    hadron!jsdy@seismo.{ARPA,UUCP}
guy@rlgvax.UUCP (Guy Harris) (02/09/85)
> -) I am shocked to find out that pointers may be of different sizes, &
> may be of a different size than int. This has been true for so long
> many people just assumed it. I believe it should be true wherever possible
> for the following reason: if a default exists, it should be a useful one.
> Defining different sizes for these items gives credibility to the
> claim that C is dangerous. Just another accident waiting to happen.

The existence of automobiles is also "an accident waiting to happen" (although it didn't wait very long) in those terms. I don't blame the automobile, I blame the driver. I have no interest in seeing a governor placed on all cars that limits speed to 35MPH. Nor do I have an interest in seeing the requirement that all pointer types must be represented the same way placed on the C language. If people can't cope with machines that require (or, at least, strongly prefer) different pointer representations, that's their problem, not C's.

> -) Perhaps you forgot some of the attraxions of the language: a terse
> language that modelled the machine closely. No stupid booleans.
> All those casts are ugly. How are we to convert people to C if they
> must put up with all that verbosity? Shouldn't the compiler know?

That's why the ANSI C standard improved the declaration syntax for functions; yes, the compiler should know, and ANSI C compilers do know (except for functions with a variable number of arguments; the prime offender, "execl", is just syntactic sugar for something that can be done equally well with "execv").

> -) I apologize for calling (presumably someone's favorite) machines `weird'
> or `braindamaged'. Let's say `less capable'. The pdp-11 was a landmark
> in machine design. It was byte addressable, had a stack, memory mapped
> i/o, an orthogonal instruxion set, and useful addressing modes. The
> vax continued this trend. Most micros (all?) are byte addressable.

According to the Stanford MIPS people (see "Hardware/Software Tradeoffs for Increased Performance" in the Proceedings of the Symposium on Architectural Support for Programming Languages and Operating Systems, SIGARCH Computer Architecture News V10#2 and SIGPLAN Notices V17#4), you may be better off if you have a word-addressed machine and special pointers for accessing bytes. (In their case, byte and word pointers are both 32 bits long, but coercions are still not copies.)

> Most have an address space two to the size-of-register-in-bits power.

As has been said more times than I care to count, the 68000's registers are 32 bits long, but 32-bit arithmetic is less efficient than 16-bit arithmetic. I think that this is unfortunate, but it's a fact of life. There are good things to be said for 16-bit "int"s on a 68000.

> This sort of machine is an inhospitable host for the C language and
> some implementations are downright kluges. I claim that they don't
> run C but another language I would call `C--'.

You aren't the arbiter of the C language; if you want to hold that opinion you're welcome to it, but I suspect most people wouldn't agree. UNIX runs on the Sperry 1100; if users of UNIX on that machine (or other putatively "inhospitable" machines) have any comments on that point, I'd like to hear them.

> -) While you are claiming that it is MY CODING PRACTICES (and evidently
> hordes of others, including 4.2bsd & sys III implementors) that are
> nonportable, I am claiming that it is THOSE WEIRD MACHINES that are
> nonportable.
> By changing the rules in the middle of the game, you
> are depriving me (and others) of the time-honored tradition of punning.

Aside from any semantic quibbles about the meaning of "nonportable", I object to the reference to the "time-honored tradition of punning". Lots of traditions, like self-modifying code, were "time-honored" in the days of small slow machines which "needed" that sort of stuff. I can get away without punning 99.9% of the time; the other .1% of code can be "#ifdef"ed, or written in assembly language, or...

> -) I still maintain that assigning zero to a pointer stores an unspecified
> number of zero bits.

Maintain what you will, the *language spec*, such as it is, says no such thing. Your statement is merely a statement of preference, which people are at leisure to ignore.

> The null ptr is an out-of-band value. We agreed to represent it in-band.

Who's "we"? On many machines, there *is* no out-of-band value. On the VAX, 0xffffffff is arguably an out-of-band value, while on most UNIXes on the VAX 0x0 is an in-band value. On other machines, there *is* an out-of-band value, specified by the architectural spec as "how to represent a null pointer", and it need not consist of N zero bits.

> Still, a piece of kernel code should be able to pick up the byte at
> address zero by:
>	int j; char *p; p = 0; j = *p;
> Allowing any other value to actually be stored breaks this.

However, it doesn't break

    int j;
    char *p;
    j = 0;
    p = j;
    j = *p;

Admittedly, this is slightly less efficient, but the number of times when you execute code that is intended *only* to fetch the contents of location 0 (as opposed to code that fetches the contents of an arbitrary location;

    peek(addr)
    int addr;
    {
        return(*(char *)addr);
    }

even works if you say "j = peek(0)") is very small.

> Besides, SHOW ME A C THAT USES ANYTHING OTHER THAN ZERO ON ANY MACHINE!!!

Hello? Anybody from the Lawrence Livermore Labs S-1 project out there? Don't you have a special bit pattern for the null pointer?

> K&R says of pointer to integer conversions: "The mapping funxion is
> also machine dependent, but is intended to be unsurprising to those
> who know the machine." I would be surprised at nonzero null ptrs.

A subtle point; given a "char *" variable "p", the statement

    p = 0;

is different in character from both the statement

    p = 1;

and the statements

    i = 0;
    p = i;

given an "int" variable "i". Arguably, this is confusing and a mistake, but it is the clearest (and, probably, only correct) interpretation of what K&R says on the subject. The latter two sets of statements do this particular mapping; the former one is a special case which shoves a null character pointer into "p". The mapping function in the third set of statements is unsurprising.

If I ran the zoo, there would have been a special keyword "nil" or "null", and THAT would have been the way to specify null pointers; 50% of all these discussions wouldn't have occurred if that was done. Unfortunately, it's too late for that.

> -) Guy: if I want the square root of four, I do sqrt(4.0); NO CAST!

That's because the C language has a way of representing floating-point constants directly. It doesn't have a way of representing null pointers directly; instead, it has a sneaky language rule that says the symbol "0", when used in conjunction with a cast to a pointer or an expression involving pointers, is interpreted as a null pointer of the appropriate type. If there were, say, a null pointer operator like "sizeof", like

    null(char *)

you could pass null(char *) to a routine.
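Lacking such an operator, the only portable way today to hand a null pointer to a routine whose argument types aren't declared is to decorate the 0 with a cast; a minimal sketch (the routine "f" here is hypothetical):

    extern int f();       /* f is hypothetical; its argument types are unknown */

    caller()
    {
        f((char *)0);     /* right: what gets passed is a null char pointer */
        f(0);             /* unportable: what gets passed is an int 0 */
    }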
Alternatively, if the language had permitted you to declare the types of the arguments to a function since Day 1, calling a function which expects a "char *" as an argument would be an expression involving pointers, and the 0 (or "nil" or "null") would be interpreted as a null character pointer.

> -) How many of you port what percentage of programs? I thought the
> intent of the standard was to not break existing programs. I claim
> that the standard should recognize the existing idioms.

No, the intent of the standard is not to break existing *correct* programs. There exist programs, written by people at, among other places, a certain large West Coast university, which assume that location 0 contains a null string (although that crap seems to have disappeared as of 4.2BSD). Does this mean that all implementations of C must map location 0 into the address space and must put a zero byte there?

"=+" was a legal part of the language once. It has now disappeared; the System V compiler now only accepts "+=". More and more programs are properly declaring functions, casting pointers, etc. As such, I see no point in treating the passing of an undecorated 0 to a function whose argument types are undeclared as the passing of a null pointer.

    Guy Harris
    {seismo,ihnp4,allegra}!rlgvax!guy
kpmartin@watmath.UUCP (Kevin Martin) (02/10/85)
In article <8121@brl-tgr.ARPA> cottrell@nbs-vms.ARPA writes:
>-) I apologize for calling (presumably someone's favorite) machines `weird'
> or `braindamaged'. Let's say `less capable'.

Let's say 'capable of doing things other than you want done'. I doubt that a pianist would like being called 'weird', 'braindamaged', or even 'less capable' just because you happen to want to hear her play a trombone.

> Allowing any other value to actually be stored breaks this. Besides,
> SHOW ME A C THAT USES ANYTHING OTHER THAN ZERO ON ANY MACHINE!!!
> K&R says of pointer to integer conversions: "The mapping funxion is
> also machine dependent, but is intended to be unsurprising to those
> who know the machine." I would be surprised at nonzero null ptrs.

Then you obviously don't know the machine. A Honeywell DPS8, running CP-6, uses the bit pattern 06014 (36 bits) as the null pointer (byte offset zero in segment number 014). Pointer-to-int casts (and back) are accomplished by ex-oring with this value. The value 0 CAN'T be used, since this is a valid pointer.

>-) How many of you port what percentage of programs? I thought the
> intent of the standard was to not break existing programs.

I don't think it does break programs (except those which use =<op> and initializers without the '='). Your programs will continue to run on systems on which they previously ran. You may get more warnings when you compile them, though.

    Kevin Martin, UofW Software Development Group
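P.S. In case the mechanics are unclear, the casts amount to something like the following sketch (illustrative only, not our real compiler output; "long" stands in for the 36-bit word):

    #define NULLPTRBITS 06014L    /* the DPS8/CP-6 null pointer pattern */

    long ptr_to_int(p)            /* what (int)p does, in effect */
    long p;
    {
        return(p ^ NULLPTRBITS);
    }

    long int_to_ptr(i)            /* what (char *)i does, in effect */
    long i;
    {
        return(i ^ NULLPTRBITS);
    }

Note that int_to_ptr(0) comes out as 06014, the null pointer, and ptr_to_int(06014L) comes back as 0, which is why the null pointer still looks like integer zero in C source.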
jdb@mordor.UUCP (John Bruner) (02/10/85)

> Hello? Anybody from the Lawrence Livermore Labs S-1 project out there?
> Don't you have a special bit pattern for the null pointer?

I had prepared this reply with the intention of avoiding a reference to the S-1 Mark IIA. I've mentioned it several times recently and I wondered if people were getting tired of hearing about it. However, since you asked, the answer is YES.

Our machine has a 36-bit word and a 31-bit virtual address space. The 5 high-order bits of a pointer constitute its "tag", which specifies an assortment of things. Two values, 0 and 31, are invalid tags. An attempt to use a pointer manipulation instruction on words containing these tags will cause a trap. (This allows easy detection of indirection through integers if the integer is in the range -2**31..2**31-1.) A tag of 2 indicates a NIL pointer, which can be copied but not dereferenced.
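In C terms, the tag field amounts to something like this (a sketch only, using macros; the shift count assumes the full 36-bit word):

    #define TAG(w)         (((w) >> 31) & 037)      /* top 5 bits of the 36-bit word */
    #define TAG_NIL        2                        /* NIL pointer: copy OK, dereference traps */
    #define TAG_INVALID(t) ((t) == 0 || (t) == 31)  /* pointer operations trap */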
There are two operating system projects here. Amber, which is based quite a bit on the MULTICS model, is written in Pastel and uses the NIL pointer tag. The other operating system is UNIX. After a lot of grief because the tide of sloppy programs was too great, we decided to hack the microcode to allow 0-tagged pointers and use an integer zero as our NULL pointer. (There is a special microcode patch which must be applied before we boot UNIX.) We all regard this as WRONG, WRONG, WRONG, WRONG, WRONG, WRONG. It means that C and Pastel cannot easily share data structures, and it defeats a lot of the useful hardware type checking. We hope to develop a C front-end for our Pastel compiler so that C programs which run under Amber can use the NIL pointer properly.

(int)0 vs. (int *)0 has become a very sore point with me for this reason. I am firmly convinced that they are NOT the same and I am unhappy that we had to contort our implementation to match an assumption that is valid on simpler architectures.

Now, for the reply I just finished editing:

Although it is tempting to comment on the assertion that machines which differ from a PDP-11 or a VAX are "less capable", I'm not going to respond to that in this posting. Instead, I'd like to take the notion of "portability" in terms of "changing the rules in the middle of the game" a little bit further. Instead of starting with C under VAX UNIX, however, I want to start with the oldest C that I'm familiar with: the C compiler that came with the Sixth Edition of UNIX. (I trust that any unwarranted assumptions that I make about C based upon the Sixth Edition can be corrected by others who have used even earlier versions.)

Let me cite from the C Reference Manual for the Sixth Edition:

    2.3.2 Character constants

    [third paragraph] Character constants behave exactly like integers (not, in particular, like objects of character type). In conformity with the addressing structure of the PDP-11, a character constant of length 1 has the code for the given character in the low-order byte and 0 in the high-order byte; a character constant of length 2 has the code for the first character in the low byte and that for the second character in the high-order byte. Character constants with more than one character are inherently machine-dependent and should be avoided.

Nonetheless, programs used multi-character constants. One in particular that I'm very familiar with was APL\11. [The author of APL\11 was Ken Thompson (a.k.a. "/usr/sys/ken"), who I think we can agree is rather knowledgeable about C and UNIX.] Unfortunately, PCC generated two-character character constants in the opposite order from Ritchie's CC. The manual doesn't say that the results are compiler-dependent, so one should expect them to be the same for both compilers on the same machine. Hence, PCC (and thus the VAX "cc") is nonportable. (The first time I tried to move APL from a V6 PDP-11 to a 32/V VAX I had to find and fix 800 "new" errors.)

It is interesting that the assertion has been raised that -1 has always been the standard error return. Here's a simple program for copying standard input to standard output, from "Programming in C -- A Tutorial", section 7.

    main()
    {
        char c;
        while( (c = getchar()) != '\0' )
            putchar(c);
    }

This worked before the Standard I/O library "broke" it. getchar() used to return '\0' on EOF.
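Under STDIO, of course, the same loop has to hold getchar()'s result in an int so that EOF can be distinguished from a legitimate character; a minimal corrected sketch:

    #include <stdio.h>

    main()
    {
        int c;    /* int, not char: EOF is out of band */

        while ((c = getchar()) != EOF)
            putchar(c);
    }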
Also, programs which had previously used the old I/O library (with "fin" and "fout" -- anyone remember what fout = dup(1); did?) or the old Portable C library had to be changed to accommodate STDIO. I guess STDIO is nonportable too.

Several programs that I worked with assumed that integers were two bytes long. I guess the VAX is nonportable. [Gee, this is fun!]

Back in V6 there was no "/usr/include" -- you had to code the definitions of the system structures directly in your program or hunt down the kernel include files. The advent of "/usr/include" and the changes in the system calls broke several programs that coded these things directly. I guess even V7 is nonportable.

Then, of course, there are the totally unnecessary additions to C when it was hacked up for the phototypesetter 7 release. To take one example, consider "unsigned". Who needs "unsigned"? V6 was written without it -- if you needed an unsigned integer you could always use a character pointer. And I, for one, was quite happy to put a "#" on the first line of my C program if [if!] there were any #include or #define statements in my program. [Actually, sometimes I still do this, just to be obstinate!]

[I think I'm getting carried away. Time to come back to earth.]

As C has developed, it has provided more and more facilities for approaching problems in an abstract, machine-independent way. I for one applaud this growth. I *want* to plan my programs carefully, think about the issues involved, and have utilities like "lint" tell me when I'm being careless. I want to be able to move my programs to new machines without having to rewrite them. As much as I like PDP-11's, I no longer use them (at least, not with UNIX). Eventually I'll log off of a VAX for the last time. Computer architectures are changing, and someday even the assumption of a classical von Neumann architecture will be invalid. (This is already true for some machines.) If C continues to evolve, when that day comes C may still be around (in some form). I am certain that if it sticks to a rigid PDP/VAX view of the world it will be left behind in the dust.

--
John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
MILNET: jdb@mordor.ARPA [jdb@s1-c]    (415) 422-0758
UUCP: ...!ucbvax!dual!mordor!jdb    ...!decvax!decwrl!mordor!jdb