stevens@hsi.UUCP (11/08/83)
On a VAX running 4.1bsd I found some code in a tape reading program that went

    int length;

    length = read(tapefd, buf, 62000);
    if (length < 0)
        /* error */ ...
    if (length != -1)
        length &= 0xffff;

I removed the mask and sure enough, a read of a 3120 byte tape record returned a length of 65536+3120, hence the mask was required. I reduced the buffer size and the length to read to 30,000 bytes and the returned value became 3120, so the mask was not needed.

Anyone know what's going on? (The tape driver is tm.c.)

Richard Stevens
Health Systems International, New Haven, CT
{ decvax | hao | seismo | sdcsvax } ! kpno ! hsi ! stevens
ihnp4 ! hsi ! stevens
chris@umcp-cs.UUCP (11/13/83)
Re: read () from a tape gives ridiculous return value. [Bug is on 4.1BSD and possibly other versions of Unix].

Welcome to the Wonderful World of Weirdness -- tape drives. The 4.1BSD tm.c has a sign extension bug when filling in bp->b_resid. CVL has a fixed version of tm.c; here's a diff listing, with my comments, and some irrelevant stuff (Digi-Data stuff, and some error logging) removed.

*** /usr/src/sys/dev/tm.c Fri Oct 9 17:24:17 1981 [4.1 version]
--- tm.c Sat Nov 12 19:07:53 1983 [CVL's fixed version]
***************
This one looks like someone wasn't thinking...
*** 100,106
      u_short sc_dens;    /* prototype command with density info */
      daddr_t sc_timo;    /* time until timeout expires */
      short sc_tact;      /* timeout is active */
! } te_softc[NTM];
  #ifdef unneeded
  int tmgapsdcnt;         /* DEBUG */
  #endif
--- 100,117 -----
      u_short sc_dens;    /* prototype command with density info */
      daddr_t sc_timo;    /* time until timeout expires */
      short sc_tact;      /* timeout is active */
! } te_softc[NTE];        /* was NTM - JIP, CVL, 12/30/82 */
  #ifdef unneeded
  int tmgapsdcnt;         /* DEBUG */
  #endif
***************
Not all tape drives are the same speed...
*** 422,427
       * Set next state; give 5 minutes to complete
       * rewind, or 10 seconds per iteration (minimum 60
       * seconds and max 5 minutes) to complete other ops.
       */
      if (bp->b_command == TM_REW) {
          um->um_tab.b_active = SREW;
--- 433,440 -----
       * Set next state; give 5 minutes to complete
       * rewind, or 10 seconds per iteration (minimum 60
       * seconds and max 5 minutes) to complete other ops.
+      * Changed to allow 30 seconds per iteration, 10 min max,
+      * with 10 min rewind JIP
       */
      if (bp->b_command == TM_REW) {
          um->um_tab.b_active = SREW;
***************
*** 425,431
       */
      if (bp->b_command == TM_REW) {
          um->um_tab.b_active = SREW;
!         sc->sc_timo = 5 * 60;
      } else {
          um->um_tab.b_active = SCOM;
          sc->sc_timo =
--- 438,444 -----
       */
      if (bp->b_command == TM_REW) {
          um->um_tab.b_active = SREW;
!         sc->sc_timo = 10 * 60;
      } else {
          um->um_tab.b_active = SCOM;
          sc->sc_timo =
***************
*** 429,435
      } else {
          um->um_tab.b_active = SCOM;
          sc->sc_timo =
!             imin(imax(10*(int)-bp->b_repcnt,60),5*60);
      }
      if (bp->b_command == TM_SFORW || bp->b_command == TM_SREV)
          addr->tmbc = bp->b_repcnt;
--- 442,448 -----
      } else {
          um->um_tab.b_active = SCOM;
          sc->sc_timo =
!             imin(imax(30*(int)-bp->b_repcnt,60),10*60);
      }
      if (bp->b_command == TM_SFORW || bp->b_command == TM_SREV)
          addr->tmbc = bp->b_repcnt;
***************
I'm not sure why this change... maybe it has something to do with the error logging. [note: I collapsed two diff entries]
*** 616,622
       * If we were reading raw tape and the only error was that the
       * record was too long, then we don't consider this an error.
       */
!     if (bp == &rtmbuf[TMUNIT(bp->b_dev)] && (bp->b_flags&B_READ) && (addr->tmer&(TMER_HARD|TMER_SOFT)) == TMER_RLE)
!         goto ignoreerr;
      /*
--- 635,641 -----
       * If we were reading raw tape and the only error was that the
       * record was too long, then we don't consider this an error.
       */
! /*  if (bp == &rtmbuf[TMUNIT(bp->b_dev)] && (bp->b_flags&B_READ) && (addr->tmer&(TMER_HARD|TMER_SOFT)) == TMER_RLE)
!         goto ignoreerr; JIP CVL */
      /*
***************
*** 629,635
          ubadone(um);
          goto opcont;
      }
! } else
      /*
       * Hard or non-i/o errors on non-raw tape
       * cause it to close.
--- 656,662 -----
          ubadone(um);
          goto opcont;
      }
! } else {
      /*
       * Hard or non-i/o errors on non-raw tape
       * cause it to close.
***************
*** 634,639
       * Hard or non-i/o errors on non-raw tape
       * cause it to close.
       */
      if (sc->sc_openf>0 && bp != &rtmbuf[TMUNIT(bp->b_dev)])
          sc->sc_openf = -1;
      /*
--- 661,668 -----
       * Hard or non-i/o errors on non-raw tape
       * cause it to close.
       */
+ /* JIP CVL */ if ((addr->tmer&TMER_HARD)==0 &&
+         um->um_tab.b_errcnt) goto ignoreerr;
      if (sc->sc_openf>0 && bp != &rtmbuf[TMUNIT(bp->b_dev)])
          sc->sc_openf = -1;
  }
***************
*** 636,641
       */
      if (sc->sc_openf>0 && bp != &rtmbuf[TMUNIT(bp->b_dev)])
          sc->sc_openf = -1;
      /*
       * Couldn't recover error
       */
--- 665,671 -----
          um->um_tab.b_errcnt) goto ignoreerr;
      if (sc->sc_openf>0 && bp != &rtmbuf[TMUNIT(bp->b_dev)])
          sc->sc_openf = -1;
+ }
      /*
       * Couldn't recover error
       */
***************
[This here is your length error.]
*** 688,694
       */
      um->um_tab.b_errcnt = 0;
      dp->b_actf = bp->av_forw;
!     bp->b_resid = -addr->tmbc;
      ubadone(um);
      iodone(bp);
      /*
--- 729,739 -----
      }
  #endif ERRORLOG
      dp->b_actf = bp->av_forw;
!     /* allow for long reads JIP */
!     /* compiler bug!! casting as (short unsigned) before assigning to
!      * long doesn't do anything.
!      */
!     bp->b_resid = (-addr->tmbc) & 0xffff;
      ubadone(um);
      iodone(bp);
      /*

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci
UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet: chris@umcp-cs ARPA: chris.umcp-cs@CSNet-Relay
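
The arithmetic behind the length error can be checked with a small model. This is only a sketch under stated assumptions: it treats tmbc as a signed 16-bit register that is loaded with minus the requested byte count and counts up once per byte transferred, which is consistent with the numbers Stevens reports but is not taken from the driver source. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed model of the controller's byte-count register after a read:
 * it starts at -(requested) and has counted up by record_len bytes.
 * Only the low 16 bits survive, as they would in a 16-bit register. */
static int16_t tmbc_after_read(long requested, long record_len)
{
    uint16_t raw = (uint16_t)(record_len - requested);
    return (int16_t)raw;    /* wraps on two's-complement machines */
}

/* Models the 4.1BSD line:  bp->b_resid = -addr->tmbc;  */
static long resid_original(int16_t tmbc)
{
    return -(long)tmbc;
}

/* Models the CVL line:  bp->b_resid = (-addr->tmbc) & 0xffff;  */
static long resid_masked(int16_t tmbc)
{
    return (-(long)tmbc) & 0xffff;
}

/* read() reports the requested count minus the residual. */
static long read_len_original(long requested, long record_len)
{
    return requested - resid_original(tmbc_after_read(requested, record_len));
}

static long read_len_masked(long requested, long record_len)
{
    return requested - resid_masked(tmbc_after_read(requested, record_len));
}
```

Under this model a 62000-byte request does not fit in the signed register, so the unmasked residual comes out negative and the reported length is 65536 too large; a 30000-byte request fits, and both versions agree on 3120.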
buck%nrl-css@sri-unix.UUCP (03/02/84)
From: Joe Buck <buck@nrl-css>

There are two meanings for portability here; in the more realistic (but weaker) sense, this is a portable construct. Any C compiler can recover objects (array, structure, int, etc) of type y written with write(fd,&y,sizeof(y)) by using read(fd,&y,sizeof(y)). Of course sizeof(y) is different on different machines; that's the reason sizeof is included in the language, to take care of machine dependencies in an elegant way.

There's a second, tougher standard of portability. This is, what if machine A does the write and machine B does the read? For this case, even ints of the same size may be nonportable because the VAX and PDP-11 have one way of ordering bytes and everyone else (almost) has another. You have to encode everything as char values to have any hope at all of this type of portability; even then there are problems in that some bytes aren't eight bits.

In summary, the use of sizeof with read and write is the proper thing to do and should be encouraged.

ARPA: buck@nrl-css
UUCP: ...!decvax!nrl-css!buck
-Joe
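
The "encode everything as char values" approach can be sketched as follows. The helper names put_u32 and get_u32 are hypothetical, and the most-significant-byte-first order is an arbitrary choice made for the example; what matters is that writer and reader agree on it. Each char carries only eight bits of the value, so the scheme does not depend on the host's byte order.

```c
/* Store the low 32 bits of v as four chars, most significant first.
 * The byte order is fixed by the code, not by the machine. */
static void put_u32(unsigned char *out, unsigned long v)
{
    out[0] = (unsigned char)((v >> 24) & 0xff);
    out[1] = (unsigned char)((v >> 16) & 0xff);
    out[2] = (unsigned char)((v >> 8) & 0xff);
    out[3] = (unsigned char)(v & 0xff);
}

/* Rebuild the value from four chars written by put_u32(). */
static unsigned long get_u32(const unsigned char *in)
{
    return ((unsigned long)in[0] << 24) |
           ((unsigned long)in[1] << 16) |
           ((unsigned long)in[2] << 8)  |
            (unsigned long)in[3];
}
```

The four chars would then be handed to write() and read() as a plain char buffer instead of passing the address of the int itself.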
buck%nrl-css@sri-unix.UUCP (03/04/84)
From: Joe Buck <buck@nrl-css>

Well, almost. On machines with character pointers of different length and structure from other pointers (and in all cases, just to please lint) you should say

    read(fd, (char *) &y, sizeof y)

Ok Doug? By the way, does anyone know of such a Unix implementation (one in which the statement above, without the cast, won't work)?

-Joe
gwyn%brl-vld@sri-unix.UUCP (03/04/84)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

The (char *) is not "just to please lint". Different pointer types in general have different sizes so one MUST coerce the pointer to the type expected by the function. I know of C (not UNIX) implementations where the cast is definitely necessary. This usually occurs on word-addressable machines where a (char *) cannot be fully contained in a single word.

You seem to have a funny idea about "lint"'s purpose.
ark@rabbit.UUCP (Andrew Koenig) (03/05/84)
You are better off writing:

    read (fd, (char *) &y, sizeof (y))

It makes a difference on some machines.
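
A minimal sketch of the cast form in use, assuming a POSIX system. The struct rec type and the roundtrip helper are invented for the example. On current systems read() and write() are declared with void * parameters, so the conversion happens automatically and the cast is redundant there; it is written out because that is the form the thread recommends for compilers that do not know the parameter type.

```c
#include <unistd.h>

struct rec {
    int  id;
    long offset;
};

/* Write *in to one end of a pipe and read it back into *out from the
 * other end, passing the object's address as a char pointer each time.
 * Returns 0 on success, -1 on any failure or short transfer. */
static int roundtrip(const struct rec *in, struct rec *out)
{
    int fds[2];
    int ok;

    if (pipe(fds) != 0)
        return -1;

    ok = write(fds[1], (const char *) in, sizeof (*in)) == (ssize_t) sizeof (*in)
      && read(fds[0], (char *) out, sizeof (*out)) == (ssize_t) sizeof (*out);

    close(fds[0]);
    close(fds[1]);
    return ok ? 0 : -1;
}
```

This only demonstrates the same-machine case; nothing here makes the bytes meaningful to a machine with a different layout.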
hartwell%shasta@sri-unix.UUCP (03/14/84)
From: Steve Hartwell <hartwell@shasta>

I don't think this is non-portable. For a machine which has pointers of more than one width, the compiler can be expected to widen the shorter ones to the width of the largest as it is pushed onto the argument list, just as chars are promoted to ints when pushed. The called function will know it's stored that way and shorten it if it needs to before it's used.

So it doesn't matter what the type of "y" is in the read call is or what the actual width of &y is. It seems simple to me that there should be only one width of a pointer on the argument stack [Not necessarily the width of an int, either].

Steve Hartwell, Stanford University

p.s. this also speaks to the NULL vs. 0 vs. ((char *) 0) issue.
pb%camjenny@ucl-cs.arpa (03/15/84)
From: Piete Brooks <pb%camjenny@ucl-cs.arpa>

Consider the case of a WORD addressed machine, such as the PERQ. It encodes BYTE pointers specially, shifting them left. Thus the WORD pointer for an object and its BYTE pointer are NOT the same. As the compiler does not know that read expects a BYTE pointer, when given a WORD pointer, it will not CAST it for you, unless you EXPLICITLY tell it to.

It does keep one on one's toes ..........
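
To make the word-pointer versus byte-pointer distinction concrete, here is a toy model of a word-addressed memory. The encoding is hypothetical: it assumes 16-bit words and a byte pointer formed as the word index shifted left by one, with the low bit selecting the byte. It is not claimed to be the PERQ's actual format; the post says only that byte pointers are shifted left.

```c
#include <stdint.h>

#define NWORDS 8
static uint16_t memory[NWORDS];   /* toy word-addressed store */

typedef uint32_t word_ptr;        /* index of a word */
typedef uint32_t byte_ptr;        /* (word index << 1) | byte within word */

/* The conversion a (char *) cast would have to perform in this model. */
static byte_ptr to_byte_ptr(word_ptr w)
{
    return w << 1;
}

/* Fetch one byte through a byte pointer. */
static unsigned load_byte(byte_ptr p)
{
    uint16_t w = memory[p >> 1];
    return (p & 1) ? (unsigned)(w >> 8) & 0xff : (unsigned)w & 0xff;
}
```

In this model, handing the raw word pointer 2 to a routine that expects a byte pointer makes it look at word 1 instead of word 2, which is the kind of silent mismatch the explicit cast prevents.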
guy@rlgvax.UUCP (Guy Harris) (03/18/84)
> I don't think this is non-portable. For a machine which has pointers of more
> than one width, the compiler can be expected to widen the shorter ones to
> the width of the largest as it is pushed onto the argument list, just as
> chars are promoted to ints when pushed. The called function will know it's
> stored that way and shorten it if it needs to before it's used.

Anyone who expects a C compiler to do this is going to be sorely disappointed. There is NOTHING in K&R which says that this must be done, and there is no reason for a compiler to do so. It is explicitly stated in K&R that integer and floating point values are coerced to "int" and "double", respectively, so one would expect this to happen.

> So it doesn't matter what the type of "y" is in the read call is or what
> the actual width of &y is. It seems simple to me that there should be only
> one width of a pointer on the argument stack [Not necessarily the width of
> an int, either].

WHY? Why should there be only one width of a pointer on the argument stack? Just to make life easier for lazy programmers who refuse to write type-correct code no matter how often they've been told to? The analogy between different widths of "int" and different width of pointer breaks down because there is a semantic difference between "char", "short", "int", and "long" in the C language that pertains directly to the width of the object; the semantic difference between "char *" and "int *" has nothing to do with the *width* of those pointers - any difference or lack of same between their widths is purely a consequence of the C implementation and of the architecture of the underlying machine.

> p.s. this also speaks to the NULL vs. 0 vs. ((char *) 0) issue.

No, it doesn't. Even if a C compiler took the ill-advised step of "widening" pointers when passed as arguments, this would have no effect on a program which illegally passed an "int" of 0 to a routine expecting a pointer.

We've repeatedly heard ideas for changes to the C language or the C compiler to "solve" the "problem" caused by the facts that 1) C has several different pointer types which may not have identical implementations and 2) that null pointers in C are represented by coercing the "int" value 0 to a pointer, and that C has no way of telling the compiler what kinds of arguments a function takes, so the 0 value must be coerced explicitly with a cast when passed to a function.

THIS ISN'T A "PROBLEM", FOLKS, AND IT DOESN'T REQUIRE A "SOLUTION". The way to "solve" the "problem" is to write type-correct code and explicitly cast all NULLs or 0s passed as values to routines expecting pointers. This requires NO changes to C, or to any correct C compilers; it merely requires changes to incorrect C code and to the incorrect models of the C language held by certain programmers. And if you have trouble finding all the places you forgot to cast pointers, well, there's a very nice tool - at least on UNIX - to fix this. It's called "lint". USE IT.

This non-problem requires no further debate; there is only one correct way to deal with it.

Tired of explaining pointers, and tired of pointing people back to K&R,
Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy
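
The explicit cast on a null argument can be illustrated with a variable-argument function, where the compiler has no declared parameter type to convert to. This is a sketch only: count_strings is a made-up function, and it uses the <stdarg.h> interface, which is an assumption of the example rather than something the post mentions.

```c
#include <stdarg.h>
#include <stddef.h>

/* Count the string arguments that precede a terminating null char
 * pointer.  The callee fetches every variable argument as a char *,
 * so the caller must really pass a char * as the terminator. */
static int count_strings(char *first, ...)
{
    va_list ap;
    char *s = first;
    int n = 0;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        s = va_arg(ap, char *);
    }
    va_end(ap);
    return n;
}
```

A call would look like count_strings("a", "b", "c", (char *) 0). Writing a bare 0 as the last argument would pass an int, which need not have the same size or representation as a null char pointer.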
geoff@callan.UUCP (Geoff Kuenning) (03/19/84)
> You are better off writing:
>
>     read (fd, (char *) &y, sizeof (y))
>
> It makes a difference on some machines.

Actually, you are still non-portable if there is a possibility that the data will be read on a machine different from the one it was written on. Any of the following problems might crop up:

    Character sizes differ (yes, there are still 6-, 7-, and 9-bit bytes out there--GCOS, for example, uses 9 bits)
    Long sizes differ (less likely but conceivable)
    Byte orderings differ

I ran into the last one trying to read the Bell distribution tapes on a 68000. "cpio" writes the tape header with the type of construct suggested above, but writes the tape contents in character form. If I byte-swap the contents appropriately, the header gets screwed up because of 68000/vax byte ordering differences. "cpio" has a switch ('-c') to solve this problem by never writing binary data, but Bell (in their infinite wisdom) did not use this option when writing their distribution tapes.

                         _ _ _ _ _ _ _
(Isn't every computer a |d|i|g|i|t|a|l| computer?)
                         - - - - - - -

Geoff Kuenning
Callan Data Systems
...!ihnp4!sdcrdcf!trwrb!wlbr!callan!geoff
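
One way to cope with a binary header written on a machine of the other byte order is to test a known field and swap only the binary fields, leaving the character data alone. The sketch below uses 070707 (octal), the magic number of the old binary cpio header, as the known value; the function names are hypothetical, and it assumes the header fields are 16-bit quantities.

```c
#include <stdint.h>

#define CPIO_MAGIC 070707   /* octal; magic number of the binary header */

/* Exchange the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Classify a header by the magic field as this machine read it:
 *   0  written in this machine's byte order
 *   1  written in the opposite order; swap each 16-bit header field
 *  -1  not recognized as a binary header */
static int header_byte_order(uint16_t magic_as_read)
{
    if (magic_as_read == CPIO_MAGIC)
        return 0;
    if (swap16(magic_as_read) == CPIO_MAGIC)
        return 1;
    return -1;
}
```

Only the header fields would be passed through swap16(); the file contents are already in character form and must be left as they are.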
hartwell@Su-Shasta.ARPA (03/22/84)
From: Steve Hartwell <hartwell@Su-Shasta.ARPA>

I think you should curb your dogma, Guy.

I see a degree of cleanness in the basic datatype promotions from small ints to "generic" ints, and small floats to "generic" floats (that is, doubles) when passed as parameters, and I believe that that specification was made with the intention to aid in simplifying compiler implementation [ and not just because the architecture of the pdp11 demanded it, as suggested to me in a previous letter ].

And I see no reason why this concept should not be generalized to pointers as well. Passing a character as an argument to a function without an explicit cast is hardly reckless abandon; K&R say that chars are members of the int family and are treated that way. Why should passing a pointer be any more stringent?

It seems so much more conceptually clean to me to say that a (foo *) is a member of the pointer family and give the compiler implementors (and program writers) a break.

That is why I think that null pointers should be represented as NULL, whose definition is 0 *cast to a pointer to anything you like*; I don't believe that it should make a difference whether it is a (char *) 0 or a (struct _iob *)0.

My view is that program control and data management is complex enough as it is, and if by clustering basic operative groups {ints, floats, pointers} can free me from cast slavery then I would argue for it.

Fervent bible-thumping serves no purpose in a discussion which considers the merits and limitations of the standard(s); we /know/ what it SAYS; but you certainly hold no patent on its interpretation, or on what is worth cross-examining.

Steve Hartwell, Stanford
guy@rlgvax.UUCP (Guy Harris) (03/25/84)
> I think you should curb your dogma, Guy.

Sorry, my dogma is trained as a watchdog and has to bark at intruders.

> I see a degree of cleanness in the basic datatype promotions from small
> ints to "generic" ints, and small floats to "generic" floats (that is,
> doubles) when passed as parameters, and I believe that that specification
> was made with the intention to aid in simplifying compiler implementation
> [ and not just because the architecture of the pdp11 demanded it, as
> suggested to me in a previous letter ].

It's nice that you believe that, but do you have any evidence to back up that belief? Have you asked Dennis Ritchie about this?

By the way, the architecture of the PDP-11 *doesn't* demand it. Any instruction which pushes a byte onto the stack decrements the stack pointer by two, so if you do a

    movb frobozz,-(sp)

it'll push a word onto the stack. However, *nothing* in the PDP-11 architecture requires that the program referencing frobozz at some offset "frob(sp)" reference it with word instructions. And I don't see why it's any cleaner than the alternative.

As for floats and doubles, my uninformed guess (which is worth neither more nor less than your uninformed guess) is that one reason they promote floats to doubles in general is that they didn't want to have to write a compiler which generated "setd" and "setf" instructions. Fine, but it doesn't elevate the notion of "design your language around the person writing the compiler" to a general design principle. In fact, taken as such a principle, it's flatly *wrong*.

> And I see no reason why this concept should not be generalized to
> pointers as well. Passing a character as an argument to a function
> without an explicit cast is hardly reckless abandon; K&R say that
> chars are members of the int family and are treated that way. Why should
> passing a pointer be any more stringent?

I see no reason why it *should* be generalized to pointers.
Again, your comparison of "char" <-> "int" and "xxx *" <-> "char *" missed the point - K&R says "char"s are members of the "int" family but does NOT say ANYTHING remotely similar about a "pointer family". As such, passing characters without explicit casts isn't reckless abandon because K&R says explicitly that there is such a cast, but passing pointers without explicit casts is dangerous and wrong because K&R does not promise that any such cast will be done. If you don't like the C language's rules for dealing with pointers as parameters, fine; say that the language should be changed. Just don't add your own rules on top of K&R and claim that it's part of C.

To quote from your original article:

> I don't think this is non-portable. For a machine which has pointers of more
> than one width, the compiler can be expected to widen the shorter ones to
> the width of the largest as it is pushed onto the argument list, just as
> chars are promoted to ints when pushed. The called function will know it's
> stored that way and shorten it if it needs to before it's used.

This statement is flatly false. No exceptions, no appeals. There exist implementations of C in which the code

    read(fd,&y,sizeof(y));

will not properly execute. (Proof by counterexample.) Your claim that "the compiler can be expected to widen the shorter ones... just as chars are promoted to ints" is equally incorrect - see the same counterexample, and see K&R and notice the lack of any such claim.

C is not what people want it to be; pending either an ANSI C language standard, or public release of any of AT&T's internal C language standards, C is what K&R says it is. No more, no less. If you deny that, you're denying that there is any authoritative reference manual to C, which would render it useless as a language for writing portable code.

(By the way, I also note the use of the word "pushed" in the paragraph quoted.
The term "passed as an argument" should be used, because there's no guarantee that parameters will be passed on a simple stack. It indicates that a lot of the thinking on this question is based on the low-level details of how C is implemented. If C is to be used as a portable implementation language, however, people will just have to forget what they know about the C implementation most of the time and target their code for an abstract C implementation; otherwise, when their code is ported to a C implementation that doesn't reproduce the characteristics of the implementation they wrote the code for, it may not work.)

> It seems so much more conceptually clean to me to say that a (foo *) is
> a member of the pointer family and give the compiler implementors (and
> program writers) a break.

Again, why is this more conceptually clean? I haven't thought of pointers as a generic data type since I stopped programming in PL/I, lo these many years ago. ALGOL 68, PASCAL, Modula-2, Mesa, and many other languages have "pointer" as an adverb, so that you have "pointer to int" and "pointer to char" and "pointer to frobozz" - C is another of these languages. Several of these languages have a generic null pointer, but so does C, in a sense.

> That is why I think that null pointers should be represented as NULL,
> whose definition is 0 *cast to a pointer to anything you like*; I don't
> believe that it should make a difference whether it is a (char *) 0 or
> a (struct _iob *)0.

In all of these languages, except C, you declare the types of the arguments to a procedure when you want to use the procedure, and the compiler can automatically generate code to pass a null pointer of the appropriate type. C currently lacks this facility.
Why not ask for that facility, instead of changing the language in ways that:

    1) make current reasonable implementations non-conforming; and

    2) cause extra code to be generated (to cast the pointers to this "generic" type when passing them as parameters, and to cast them back to the appropriate type when the pointers are used)?

> My view is that program control and data management is complex enough
> as it is, and if by clustering basic operative groups {ints, floats, pointers}
> can free me from cast slavery then I would argue for it.

"program control and data management is complex enough as it is"? Sorry, son, if that's a plea for sympathy it fails miserably. Running your code through "lint" and throwing in a few casts doesn't cost much on top of the rest of the work I hope you put into the code you write. Referring to it as "cast slavery" is cute but wrong.

> Fervent bible-thumping serves no purpose in a discussion which considers the
> merits and limitations of the standard(s); we /know/ what it SAYS;
> but you certainly hold no patent on its interpretation, or on what
> is worth cross-examining.

1) Show me where you can interpret K&R as *requiring* the treatment of pointers you desire, not just *permitting* it (which it certainly does). Otherwise, my interpretation that it doesn't require your treatment of pointers *is* the only correct one. And, if that is the case, code which requires that treatment of pointers is incorrect code, and is not guaranteed to work on all implementations of C.

2) Throwing around terms like "Fervent bible-thumping" serves no purpose in this discussion at all. Here's what I said in the article you're responding to:

> Anyone who expects a C compiler to do this is going to be sorely disappointed.
> There is NOTHING in K&R which says that this must be done, and there is no
> reason for a compiler to do so. It is explicitly stated in K&R that integer
> and floating point values are coerced to "int" and "double", respectively, so
> one would expect this to happen.

The second and third sentences are true, as anyone with a copy of K&R can verify. The first is also true, if what another poster says about the Perq C compiler is true, namely that you *do* get bitten if you aren't type-correct in your handling of pointers.

The rest of my article wasn't bible-thumping, it was expressing frustration at dealing with code written by people who assume that 0 and NULL can freely be passed to routines expecting pointers without casting them. I have to pick up the "core" files when such a program dies on our 68K-based machines, and I have to fix them. I'm justifiably tired of doing so. (I'm also tired of dealing with programs that either flatly assume that there's a null string at whatever location NULL points to, or just assume that trying to dereference a NULL pointer is harmless; unfortunately, I suspect people are going to continue to write that kind of code. As such, I suspect that even if C were changed to permit you to explicitly declare the arguments that a function takes, people would still not bother using it and cause the same old problems all over again.)

At this point, I say debate about the subject doesn't help much. The facts about how the language *is* (not how it *should be*) have been laid out more times than I can count by several people; if people still aren't convinced that until the language changes they'll just have to start casting their pointers, they're not ever going to be convinced. I'll just hope that most people start casting their pointers properly, or that some way of declaring function argument types enters the language and people start using it, and that I rarely have to deal with code that doesn't properly coerce pointers.

Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy
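
The facility described above, declaring a function's argument types so that the compiler can pass a null pointer of the right type, later entered the language as ANSI C function prototypes. It was not part of the K&R C this thread is about. The sketch below uses made-up function names to show the effect.

```c
#include <stddef.h>

/* Prototype: the parameter type is declared before the call site. */
static int is_null(const char *p);

/* Because the prototype is in scope, the plain 0 below is converted
 * to a null (const char *) as it is passed; no cast is written. */
static int call_with_plain_zero(void)
{
    return is_null(0);
}

static int is_null(const char *p)
{
    return p == NULL;
}
```

The conversion applies only to parameters the prototype names; arguments matching a trailing "..." still need the explicit cast.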
jbf@ccieng5.UUCP (Jens Bernhard Fiederer) (03/28/84)
Speak for yourself. The need to cast every null pointer argument to the specific pointer required IS A PROBLEM WITH THE C LANGUAGE. It is a bloody pain, not to mention a waste of time. I do it, but I would rather not.

Three cheers for the set of languages that support the "lazy" programmer! Three cheers for the empty set!

Azhrarn
--
Reachable as ....allegra![rayssd,rlgvax]!ccieng5!jbf
Or just address to 'native of the night' and trust in the forces of evil.