dhesi@bsu-cs.UUCP (Rahul Dhesi) (07/15/87)
In article <6109@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >There are several UNIX library routines in various implementations >that attempt to return a -1 value for a function whose return type >is (char *). . . . >I would hope that all these >botches could be fixed (certainly in any proposed standards!). . . >Phase 2 -- change these functions to return >NULL (of the appropriate type) on failure. On the contrary, I think we need more distinguished pointer values, not just a single zero or NULL value. I have a set of custom I/O routines that use the pointer value NOFILE to indicate that no file could be opened (equivalent to (char *) 0 in current C implementations) and another pointer value NULLFILE to indicate that the custom I/O library routines should ignore all output to this file (equivalent to the value (char *) -1 but conceptually equivalent to opening /dev/null for output, except that a file descriptor isn't wasted and the existence of a null device or its name need not be presumed). Consider again how gets(3) indicates end-of-file and error. If there were two distinguished pointer values, one could test for both end-of-file and for error without using the botched-up errno. Upward compatibility will prohibit doing this for gets(), but the need for more than one distinguished pointer is clear. We already have a (char *) -1; let's just give it a different name and keep it. Then sbrk can return ERRPTR on error, and we can define ERRPTR as (char *) -1, and remain fully compatible to boot. -- Rahul Dhesi UUCP: {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi
ron@topaz.rutgers.edu (Ron Natalie) (07/16/87)
> On the contrary, I think we need more distinguished pointer values, not > just a single zero or NULL value. I have a set of custom I/O routines > that use the pointer value NOFILE to indicate that no file could be > opened (equivalent to (char *) 0 in current C implementations) and > another pointer value NULLFILE to indicate that the custom I/O library > routines should ignore all output to this file (equivalent to the value > (char *) -1 but conceptually equivalent to opening /dev/null for > output, except that a file descriptor isn't wasted and the existence of > a null device or its name need not be presumed). Actually, I think your example is sloppy programming, but there is no problem with defining a NULLFILE in C. They already do it with standard I/O with stdin, stdout, and stderr. Do this... FILE null_file; #define NULLFILE &null_file Great, now you have a new pointer value, guaranteed to be unique and to point to nothing else that your I/O routines can test for. This requires no modifications to existing compilers and it obviates the need for special case code to map your "-1" pointers to something that is storable in the architecture that the machine is working with. Many machines have no representable "(char *) -1." Some consider it a botch to even do (char *) 0. -Ron
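A minimal sketch of the sentinel-object idea in the post above, assuming an implementation where `FILE` is a complete object type so that a `FILE` variable can be declared. The names `null_file`, `NULLFILE`, and `my_write` are hypothetical, chosen only for illustration:

```c
#include <stdio.h>

/* Hypothetical sentinel: never opened; only its address is ever used. */
static FILE null_file;
#define NULLFILE (&null_file)

/* Write len bytes to fp, silently discarding them if fp is the sentinel. */
size_t my_write(FILE *fp, const char *buf, size_t len)
{
    if (fp == NULLFILE)
        return len;               /* pretend everything was written */
    return fwrite(buf, 1, len, fp);
}
```

Because `NULLFILE` is the address of a real object, it cannot collide with any stream returned by `fopen`, and no integer-to-pointer cast is involved.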
gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/16/87)
In article <846@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes: >We already have a (char *) -1; let's just give it a different name and keep it. You missed my point entirely! "(char *)-1" is MEANINGLESS in some C implementations and CANNOT be returned as a function value. Sure, using symbolic names such as SIG_ERR is helpful, since that permits all implementations to provide a meaningful definition. However, my specific recommendation that routines such as sbrk() be changed to return NULL ((char *)0) instead of (char *)-1 is based on there being just one "failed" value, along with the desire to be able to "phase in" the new semantics. If you really want to have to test for multiple failure modes every time you use such a function, more power to you, but PLEASE do not attempt to dictate the numeric values for pointers -- use symbolic names in the specification.
guy%gorodish@Sun.COM (Guy Harris) (07/16/87)
> On the contrary, I think we need more distinguished pointer values, not > just a single zero or NULL value. Well, you're completely wrong. > I have a set of custom I/O routines that use the pointer value NOFILE to > indicate that no file could be opened (equivalent to (char *) 0 in current > C implementations) and another pointer value NULLFILE to indicate that > the custom I/O library routines should ignore all output to this file > (equivalent to the value (char *) -1 but conceptually equivalent to > opening /dev/null for output, except that a file descriptor isn't wasted > and the existence of a null device or its name need not be presumed). There are plenty of other ways to do this. Consider having a flag in whatever data structure the custom I/O routines refer to that says "ignore all output to this file". Then set that bit in the cases where you would otherwise return NULLFILE. > Consider again how gets(3) indicates end-of-file and error. If there > were two distinguished pointer values, one could test for both > end-of-file and for error without using the botched-up errno. RTFM. You can do that *now*; look up "ferror" and "feof" in the appropriate manual page. > Upward compatibility will prohibit doing this for gets(), but the need > for more than one distinguished pointer is clear. You haven't shown that yet. > We already have a (char *) -1; let's just give it a different name and > keep it. Let's not. Consider a system where "(char *)-1" would evaluate to a pointer that *could* point to a legitimate object. Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com
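As a sketch of the `feof`/`ferror` approach mentioned above, the following hypothetical helper (`read_line` and `enum read_status` are illustrative names, not part of any library) reports end-of-file and error separately without a second distinguished pointer. It uses `fgets` rather than `gets` so that the buffer size is bounded:

```c
#include <stdio.h>

/* Hypothetical result codes for one line read. */
enum read_status { READ_OK, READ_EOF, READ_ERROR };

/* Read one line into buf and classify why the read stopped, if it did. */
enum read_status read_line(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) != NULL)
        return READ_OK;
    /* fgets returned NULL: ask the stream which condition occurred. */
    if (ferror(fp))
        return READ_ERROR;
    return READ_EOF;              /* no error, so the stream hit end-of-file */
}
```

The single NULL return is enough here because the stream itself remembers which condition was hit.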
chips@usfvax2.UUCP (Chip Salzenberg) (07/19/87)
In article <846@bsu-cs.UUCP>, dhesi@bsu-cs.UUCP writes:
> In article <6109@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>)
> writes:
> >There are several UNIX library routines in various implementations
> >that attempt to return a -1 value for a function whose return type
> >is (char *). . . .
> >Phase 2 -- change these functions to return
> >NULL (of the appropriate type) on failure.
>
> On the contrary, I think we need more distinguished pointer values, not
> just a single zero or NULL value. I have a set of custom I/O routines
> that use the pointer value NOFILE to indicate that no file could be
> opened (equivalent to (char *) 0 in current C implementations) and
> another pointer value NULLFILE to indicate that the custom I/O library
> routines should ignore all output to this file.
> [contention that (FILE *) -1 should be portable, so it can be used as
> an alternative invalid pointer]

That's easy to do, and there's no need for "another NULL". Declare a global variable of type FILE and use its address as a "magic number". For example:

--- C code follows ---
    FILE nullfile;      /* This is never opened, read, etc. */
    ...
    FILE *logfile;
    logfile = myopen("logfile", "w");
    ...
    mywrite(logfile, buf, bufsiz);
    ...
    FILE *myopen(fname, fmode)
    char *fname, *fmode;
    {
        if (no_disk_io)
            return (&nullfile);
        else
            return (fopen(fname, fmode));
    }

    int mywrite(fp, buf, len)
    FILE *fp;
    char *buf;
    int len;
    {
        if (fp == NULL)
            big_problem();      /* :-) */
        else if (fp == &nullfile)
            return (len);       /* do nothing */
        else
            return (fwrite(buf, 1, len, fp));
    }
--- End of C code ---

> Consider again how gets(3) indicates end-of-file and error. If there
> were two distinguished pointer values, one could test for both
> end-of-file and for error without using the botched-up errno.

But where would it end? No, if there must be an invalid pointer -- which there is :-) -- then it must be one of a kind.
> Rahul Dhesi UUCP: {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi -- Chip Salzenberg UUCP: "uunet!ateng!chip" or "chips@usfvax2.UUCP" A.T. Engineering, Tampa Fidonet: 137/42 CIS: 73717,366 "Use the Source, Luke!" My opinions do not necessarily agree with anything.
henry@utzoo.UUCP (Henry Spencer) (07/19/87)
> On the contrary, I think we need more distinguished pointer values, not > just a single zero or NULL value. I have a set of custom I/O routines > that use the pointer value NOFILE... and another pointer value NULLFILE... This is utterly trivial to do without any language extensions whatsoever. Put the following code fragment into a file and include it in your library: char myio_no; char myio_null; And then include this in the include file for your library: #define NOFILE ((FILE *)&myio_no) #define NULLFILE ((FILE *)&myio_null) This works just fine without any unportable messes like "(char *) -1". It isn't quite so handy for system calls, although equivalents could be devised. -- Support sustained spaceflight: fight | Henry Spencer @ U of Toronto Zoology the soi-disant "Planetary Society"! | {allegra,ihnp4,decvax,utai}!utzoo!henry
bilbo.dobson@CS.UCLA.EDU (Peter Dobson) (07/22/87)
> From: Rahul Dhesi <dhesi@bsu-cs.uucp> > Newsgroups: comp.lang.c > Subject: Distinguished pointers (was Re: Weird syscall returns) > Date: 15 Jul 87 16:14:16 GMT > To: info-c@brl-smoke.arpa In article <846@bsu-cs.UUCP> dhesi@bsu-cs.uucp (Rahul Dhesi) writes: > On the contrary, I think we need more distinguished pointer values, not > just a single zero or NULL value. I have a set of custom I/O routines > that use the pointer value NOFILE to indicate that no file could be > opened (equivalent to (char *) 0 in current C implementations) and > another pointer value NULLFILE to indicate that the custom I/O library > routines should ignore all output to this file (equivalent to the value > (char *) -1 but conceptually equivalent to opening /dev/null for > output, except that a file descriptor isn't wasted and the existence of > a null device or its name need not be presumed). > Consider again how gets(3) indicates end-of-file and error. If there > were two distinguished pointer values, one could test for both > end-of-file and for error without using the botched-up errno. > Upward compatibility will prohibit doing this for gets(), but the need > for more than one distinguished pointer is clear. We already have a > (char *) -1; let's just give it a different name and keep it. Then > sbrk can return ERRPTR on error, and we can define ERRPTR as > (char *) -1, and remain fully compatible to boot. > -- > Rahul Dhesi UUCP: {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi A portable way to do this is to declare a global data object that is just used as a location for a pointer to point to, like: /* outside a function declaration */ char nofile, nullfile; #define NULLFILE &nullfile #define NOFILE &nofile This way the pointers will be valid values on all machines, and won't contain a value that is valid in a pointer to point to any data object (as the characters nullfile, and nofile aren't used.) 
This may fail on machines where pointers can't be cast from type (char *) to another type and back again. For example, in Microsoft C on MS-DOS, in some memory models a pointer to character cast to pointer to function and then cast back to pointer to character doesn't work. I don't think this is likely to be a problem on any implementations where all the pointer types point to data. --- Peter
steve@nuchat.UUCP (Steve Nuchia) (08/01/87)
In article <6129@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> symbolic names such as SIG_ERR is helpful, since that permits all
> implementations to provide a meaningful definition. However, my specific

This reminds me of an issue regarding symbolic constants and the switch construct that has come up a handful of times porting stuff. For example, let's say we are switching on an input character, we wrote the code for UNIX, and we are porting it to OS9.

    switch ( getchar() )
    {
    ......
    case '\n':
    case '\r':
        common code for the "enter" morpheme
    ......
    }

Now we find ourselves on OS9, where the compiler says that '\n' and '\r' are both 13. (They thoughtfully provide '\l' == 11). Now the switch doesn't work, even though all it needs is to be collapsed.

This same situation arises in porting from a featureful unix (say bsd) to a feature-sparse environment, say a 3b2. It happens, perhaps not "often", but it happens, that error indicators and the like wind up with aliased names and a similar switch syntax error comes up.

To "fix" the problem we have to make the code not reverse portable - we have to break it to get it to work. I'm not sure I want to require the compiler to recognise and allow (warning?) the special case of stacked identical case values, but it might be worth looking into.

More random thoughts for your consideration, Steve Nuchia {{soma,academ}!uhnix1,sun!housun}!nuchat!steve
peter@sugar.UUCP (Peter da Silva) (08/01/87)
> Now we find ourselves on OS9, where the compiler says that '\n' and
> '\r' are both 13. (They thoughtfully provide '\l' == 11). Now the
> switch doesn't work, even though all it needs is to be collapsed.

While there's nothing inherently wrong with having non linefeed newlines, the 'C' compiler should deal with it in such a way as not to break existing code. MS-DOS uses CR/LF as the newline, which is broken, but at least the fix is handled by the I/O library... which allows normal text-oriented stuff to work. Of course this breaks other stuff (fseek on text files... it needn't, but it does).

Suggestion: lie and say '\r' is LF.

Suggestion: give up and accept that if your file formats differ you're going to have to use #ifdefs. At least it's not like RSX where lines are variable length records with a word length and a bunch of byte data.

PS: seek is handled badly by a number of compilers and libraries on non UNIX systems. The UNIX manual states that on some systems offsets are magic cookies, and that the only reliable way to get an offset in these systems is to read to that point, yet. Some libraries don't include seek, because they would have to use magic cookies. Some libraries require 'C' programs to access unblocked files only. One that I know of physically copies a file to an unblocked file when you open it in 'C', so you can seek. An Atari-800 'C' compiler implements 'note()' and 'point()' with identical semantics to ftell() and fseek(), because "you can't seek to a random location in the file"... which is true but irrelevant. I wish these people would RTFM.
-- -- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter (I said, NO PHOTOS!)
richard@aiva.ed.ac.uk (Richard Tobin, JANET: R.Tobin@uk.ac.ed ) (08/03/87)
In article <8317@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes: >Put the following code fragment into a file and include it in your library: > > char myio_no; > char myio_null; > >And then include this in the include file for your library: > > #define NOFILE ((FILE *)&myio_no) > #define NULLFILE ((FILE *)&myio_null) Is this valid in all Ansi-conforming implementations? That is, is a comparison like if(file == NOFILE) valid? My copy of the draft C standard says (apropos of pointer comparisons): "If the objects pointed to are not members of the same aggregate object, the result is undefined" and that seems to apply here. I assume this restriction is to allow segmented architectures to just compare the segment offset (or is there another reason?). Of course, if the routines can be arranged to always return pointers from the same array, then the out-of-band values could be in the same array, and all would be well. My copy of the C standard is a little (18 months) out of date, so maybe this has changed. -- Richard Tobin, JANET: R.Tobin@uk.ac.ed AI Applications Institute, ARPA: R.Tobin%uk.ac.ed@cs.ucl.ac.uk Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin
mouse@mcgill-vision.UUCP (der Mouse) (08/05/87)
In article <8317@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes: >> I think we need more distinguished pointer values [...]. I have >> [need for other FILE * values] > This is utterly trivial to do [in current C]. Put the following code > fragment into a file and include it in your library: > char myio_no; > char myio_null; > And then include this in the include file for your library: > #define NOFILE ((FILE *)&myio_no) > #define NULLFILE ((FILE *)&myio_null) > This works just fine without any unportable messes like "(char *)-1". Hm. I see no guarantee that NOFILE != NULLFILE. Do it right: FILE myio_no; FILE myio_null; #define NOFILE (&myio_no) #define NULLFILE (&myio_null) der Mouse (mouse@mcgill-vision.uucp)
kent@xanth.UUCP (Kent Paul Dolan) (08/05/87)
In article <274@nuchat.UUCP> steve@nuchat.UUCP (Steve Nuchia) writes: [...] >For example, lets say we are switching on an input character, we >wrote the code for UNIX, and we are [porting] it to OS9. > > switch ( getchar() ) > { > ...... > case '\n': > case '\r': > common code for the "enter" morpheme > ...... > } > >Now we find ourselves on OS9, where the compiler says that '\n' and >'\r' are both 13. (They thoughtfully provide '\l' == 11). Now the >switch doesn't work, even though all it needs is to be collapsed. > Steve Nuchia > {{soma,academ}!uhnix1,sun!housun}!nuchat!steve What ends up happening if the cases didn't have common code; that is, you did something different for '\n' than for '\r'? Seems you could have 1) compiler bletches, or 2) subtle error sneaks through. Kent, the man from xanth.
kdmoen@watcgl.UUCP (08/05/87)
In article <274@nuchat.UUCP> steve@nuchat.UUCP (Steve Nuchia) writes:
> switch ( getchar() )
> {
> ......
> case '\n':
> case '\r':
> common code for the "enter" morpheme
> ......
> }
>Now we find ourselves on OS9, where the compiler says that '\n' and
>'\r' are both 13. (They thoughtfully provide '\l' == 11). Now the
>switch doesn't work, even though all it needs is to be collapsed.
>To "fix" the problem we have to make the code not reverse portable -
>we have to break it to get it to work. I'm not sure I want to require
>the compiler to recognise and allow (warning?) the special case of
>stacked identical case values, but it might be worth looking into.

This isn't pretty, but it solves your problem:

    switch ( getchar() )
    {
    ......
    case '\n':
    #if '\r' != '\n'
    case '\r':
    #endif
        common code for the "enter" morpheme
    ......
    }

-- Doug Moen University of Waterloo Computer Graphics Lab UUCP: {ihnp4,watmath}!watcgl!kdmoen INTERNET: kdmoen@cgl.waterloo.edu
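The preprocessor guard suggested above can be put in a compilable form as follows. This is only a sketch: `is_enter` is a hypothetical function name, and it assumes the value the preprocessor assigns to a character constant in `#if` matches the value used in ordinary expressions, which the C standard leaves implementation-defined.

```c
/* Hypothetical classifier: returns 1 if c is treated as the "enter" key. */
int is_enter(int c)
{
    switch (c) {
    case '\n':
#if '\r' != '\n'      /* emit the second label only when the values differ */
    case '\r':
#endif
        return 1;
    default:
        return 0;
    }
}
```

On a compiler where the two constants happen to share a value, the duplicate label disappears and the switch still compiles; elsewhere both labels fall through to the same code.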
flaps@utcsri.UUCP (08/05/87)
>> Now we find ourselves on OS9, where the compiler says that '\n' and
>> '\r' are both 13. (They thoughtfully provide '\l' == 11). Now the
>> switch doesn't work, even though all it needs is to be collapsed.

[ "the switch" had a "case '\r':" immediately followed by "case '\n':". ]

In article <452@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>Suggestion: give up and accept that if your file formats differ you're
>going to have to use #ifdefs...

DON'T USE IFDEFS! It's very easy to accidentally overuse ifdefs, but when possible you should use the other features of the pre-processor because they are often a more direct way of saying what you want.

    switch(...) {
    case '\n':
    #ifndef OS9
    case '\r':

is WRONG. The test line should be:

    #if '\n' != '\r'

-- // Alan J Rosenthal // \\ // flaps@csri.toronto.edu, {seismo!utai or utzoo}!utcsri!flaps, \// flaps@toronto on csnet, flaps at utorgpu on bitnet. "To be whole is to be part; true voyage is return."
guy%gorodish@Sun.COM (Guy Harris) (08/06/87)
> My copy of the draft C standard says (apropos of pointer comparisons):
>
> "If the objects pointed to are not members of the same aggregate object, the
> result is undefined"

The October 1, 1986 draft says this apropos of relational operators on pointers, but !NOT! apropos of equality operators. Does your draft say this even about comparisons for equality?

> and that seems to apply here.

Since this is a comparison for equality, it *doesn't* apply. It says, apropos of equality operators,

    If two pointers to objects or functions compare equal, they point to the same object or function, respectively.

which pretty clearly indicates that comparison of pointers for equality is meant to work regardless of whether the two pointers point into the same array or not.

> I assume this restriction is to allow segmented architectures to just
> compare the segment offset (or is there another reason?).

The restriction on *relational* operators is there because there may not be a straightforward order that can be imposed on addresses in general (either because the address space is segmented, or for any other reason); if the two addresses point to members of the same array, "less than" and "greater than" can be defined purely within the terms of the language by saying that pointer A is less than/greater than pointer B iff the array index of the object pointed to by pointer A is less than/greater than the array index of the object pointed to by pointer B.

For *equality* operators, it is clear that they want to *forbid* segmented architectures from just comparing the segment offset; doing the comparison that way would be horribly bogus and stupid.

So the answer is "yes, this is valid in all ANSI-conforming implementations."

Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com
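The distinction drawn above between equality and relational operators can be illustrated with a small sketch. All names here (`no_marker`, `null_marker`, `is_no_marker`, `comes_before`) are hypothetical and exist only for the example:

```c
#include <stddef.h>

/* Hypothetical sentinel objects; only their addresses matter. */
char no_marker;
char null_marker;

/* Equality between pointers to unrelated objects is well defined. */
int is_no_marker(const char *p)
{
    return p == &no_marker;
}

/* Relational comparison is only relied on within a single array. */
int comes_before(const int *base, size_t i, size_t j)
{
    return &base[i] < &base[j];   /* both operands point into base[] */
}
```

The first function is the kind of test a `file == NOFILE` check performs; the second is the only situation in which `<` on pointers is given a meaning by the language.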
gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/07/87)
In article <5193@utcsri.UUCP> flaps@utcsri.UUCP (Alan J Rosenthal) writes:
- case '\n':
-#ifndef OS9
- case '\r':
-is WRONG. The test line should be:
-#if '\n' != '\r'
It's all wrong anyway -- in C, '\r' and '\n' represent distinct
(whitespace) characters.
drw@cullvax.UUCP (Dale Worley) (08/07/87)
guy%gorodish@Sun.COM (Guy Harris) writes: > [The Draft Standard] says, apropos of equality operators, > > If two pointers to objects or functions compare equal, they > point to the same object or function, respectively. > > which pretty clearly indicates that comparison of pointers for > equality is meant to work regardless of whether the two pointers > point into the same array or not. This statement leaves a really nasty ambiguity: If two pointers compare *unequal*, then do they *not* point to the same thing? It seems possible that an implementation could have multiple representations of pointers to a particular object, but how are we to make sure that int X[10]; int *a, *b; a = X; b = X; a++; b++; a == b always returns true? Dale -- Dale Worley Cullinet Software ARPA: cullvax!drw@eddie.mit.edu UUCP: ...!seismo!harvard!mit-eddie!cullvax!drw OS/2: Yesterday's software tomorrow Nuclear war? There goes my career!
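For reference, the fragment in the post above can be wrapped into a compilable function. This is only a sketch (`stepped_pointers_equal` is a hypothetical name). Later revisions of the C standard (C99 onward) phrase the equality rule as "if and only if", under which this comparison is required to yield true; the draft wording quoted in the thread is the weaker one-directional form being questioned.

```c
/* Returns nonzero if two pointers stepped identically still compare equal. */
int stepped_pointers_equal(void)
{
    int X[10];
    int *a = X;
    int *b = X;

    a++;
    b++;
    return a == b;    /* both now designate X[1] */
}
```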
gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/08/87)
In article <1437@cullvax.UUCP> drw@cullvax.UUCP (Dale Worley) writes: >This statement leaves a really nasty ambiguity: If two pointers >compare *unequal*, then do they *not* point to the same thing? ... This is essentially the "aliasing" problem, which was addressed at the June X3J11 meeting. See my previous summary from that meeting (posted yesterday) for the current rules.
richard@aiva.ed.ac.uk (Richard Tobin) (08/09/87)
In article <25045@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes (in reply to me): >> "If the objects pointed to are not members of the same aggregate object, the >> result is undefined" >The October 1, 1986 draft says this apropos of relational operators >on pointers, but !NOT! apropos of equality operators. Does your >draft says this even about comparisons for equality? No, but it doesn't say this either (at least not in the section entitled "C.3.9 Equality operators"): > If two pointers to objects or functions compare equal, they > point to the same object or function, respectively. All it says is: "The == (equal to) and the != (not equal to) operators are analogous to the relational operators except for their lower precedence." So it seems that the standard has been clarified since the version I have. >For *equality* operators, it is clear that they want to *forbid* >segmented architectures from just comparing the segment offset; doing >the comparison that way would be horribly bogus and stupid. It certainly would, which is why I wanted to check up on it. Thanks for your help. -- Richard Tobin, JANET: R.Tobin@uk.ac.ed AI Applications Institute, ARPA: R.Tobin%uk.ac.ed@cs.ucl.ac.uk Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin
levy@ttrdc.UUCP (08/11/87)
In article <6242@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes: < In article <5193@utcsri.UUCP> flaps@utcsri.UUCP (Alan J Rosenthal) writes: < - case '\n': < -#ifndef OS9 < - case '\r': < -is WRONG. The test line should be: < -#if '\n' != '\r' < < It's all wrong anyway -- in C, '\r' and '\n' represent distinct < (whitespace) characters. What is an implementation of C supposed to do on an OS/machine/character-code combination that doesn't have the foggiest that there is such a thing as distinct "new line" and "carriage return" characters? From the looks of the discussion here, I'd gather that OS9 is just such a beast and its C compiler is making the best of this brain damaged situation that it can. [Or perhaps, as Guy Harris likes to say, it "ain't C." :-) ] -- |------------Dan Levy------------| Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa, | an Engihacker @ | vax135}!ttrdc!ttrda!levy | AT&T Computer Systems Division | Disclaimer: i am not a Yvel Nad |--------Skokie, Illinois--------|
guy%gorodish@Sun.COM (Guy Harris) (08/12/87)
> < It's all wrong anyway -- in C, '\r' and '\n' represent distinct
> < (whitespace) characters.
>
> What is an implementation of C supposed to do on an OS/machine/character-code
> combination that doesn't have the foggiest that there is such a thing
> as distinct "new line" and "carriage return" characters?

1) Give up in despair.

2) Fake it.

What they are NOT supposed to do, under ANY circumstance, is to make '\r' and '\n' be the same thing! Doug is right; they *are* distinct characters. From K&R:

2.4.3 Character constants

Certain non-graphic characters, the single quote ' and the backslash \, may be represented according to the following table of escape sequences:

    newline           NL (LF)   \n
    horizontal tab    HT        \t
    backspace         BS        \b
    carriage return   CR        \r
    form feed         FF        \f
    backslash         \         \\
    single quote      '         \'
    bit pattern       <ddd>     \<ddd>

And from the ANSI C standard:

2.2.2 Character display semantics

The *active position* is that location on a display device where the next character output by the "fputc" function would appear. The intent of writing a printable character (as defined by the "isprint" function) to a display device is to display a graphic representation of that character at the active position and then advance the active position to the next position on the current line. The direction of printing is locale-specific. If the active position is at the final position of a line (if there is one), the behavior is unspecified.

Alphabetic escape sequences representing nongraphic characters in the execution character set are intended to produce actions on display devices as follows:

...

\n (*new line*) Moves the active position to the initial position of the next line.

\r (*carriage return*) Moves the active position to the initial position of the current line.

...

Each of these escape sequences shall produce a unique implementation-defined value which can be stored in a single "char" object.
The external representations in a text file need not be identical to the internal representations, and are outside the scope of this Standard. On top of all this, having '\n' and '\r' be the same character violates the Principle of Least Surprise, at least if the system uses ASCII (which I presume OS9 does). > From the looks of the discussion here, I'd gather that OS9 is just such a > beast and its C compiler is making the best of this brain damaged situation > that it can. Uh-uh. No way. They had no excuse; they screwed up. If they could, in any way, cause a display device to do the aforementioned, and can represent these instructions to the display device in a file, they should have arranged that the C library produce the appropriate instructions on output. If they couldn't do that, then they either shouldn't have implemented C ("give up in despair") or they should have had '\n' be LF (i.e., '\012') and '\r' be CR (i.e., '\015'), and had the *C library* act the same way when told to output either character ("fake it"). In other words, it really *ain't* C. Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com
peter@sugar.UUCP (Peter da Silva) (08/15/87)
> What is an implementation of C supposed to do on an OS/machine/character-code > combination that doesn't have the foggiest that there is such a thing > as distinct "new line" and "carriage return" characters? From the looks of > the discussion here, I'd gather that OS9 is just such a beast and its C > compiler is making the best of this brain damaged situation that it can. There is no "new line" character in ASCII. UNIX uses "line feed" as the new line character. OS/9 uses "carriage return". I'd say the 'C' language itself is suffering from parochialism here. Disclaimer: I have never used OS/9, and I don't know anything more of this aspect of the operating system than what I have read here. I would say that since OS/9 is a UNIX lookalike, using CR for NL instead of LF was probably not the best choice... but at least it's better than using both of them. -- -- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter (I said, NO PHOTOS!)
am@cam-cl.UUCP (08/17/87)
In article <854@mcgill-vision.UUCP> mouse@mcgill-vision.UUCP (der Mouse) writes: >> char myio_no; >> char myio_null; >> And then include this in the include file for your library: >> #define NOFILE ((FILE *)&myio_no) >> #define NULLFILE ((FILE *)&myio_null) > >Hm. I see no guarantee that NOFILE != NULLFILE. Do it right: > >FILE myio_no; >FILE myio_null; >#define NOFILE (&myio_no) >#define NULLFILE (&myio_null) > Hm. I see no guarantee that FILE is not a function type, which invalidates your suggestion. :-)
mouse@mcgill-vision.UUCP (der Mouse) (08/18/87)
[someone recommends the following, for getting distinguished FILE * values]
>> char myio_no;
>> char myio_null;
>> #define NOFILE ((FILE *)&myio_no)
>> #define NULLFILE ((FILE *)&myio_null)

[then I say]
> Hm. I see no guarantee that NOFILE != NULLFILE. Do it right:
> FILE myio_no;
> FILE myio_null;
> #define NOFILE (&myio_no)
> #define NULLFILE (&myio_null)

I got a letter from someone I can't reply to (aimt!breck - aimt isn't in our uucp maps), asking why this change is necessary:

> I would have sworn that the original would have sufficed. The two
> character variables {myio_no,myio_null} are different variables and
> the compiler had better put them in different locations. If you
> don't mind taking the time to reply, how can NOFILE and NULLFILE be
> equal in the first case?

Well, since I can't mail, and there are likely others suffering from the same confusion, I'll explain. &myio_no is definitely not equal to &myio_null. However, this is not true of ((FILE *)&myio_no) and ((FILE *)&myio_null). (To be sure, most common machines are byte-addressable and have C implementations that would indeed result in NOFILE != NULLFILE, but portability only among the common byte-addressable machines isn't what we're after here.)

Let us postulate a word-addressed 32-bit machine (by "word" I mean a 16-bit quantity). On this machine, let us say, a natural pointer is a 32-bit quantity, meaning the address space is 8 gigabytes (2^32 x 2 bytes). A char will surely be 8 bits. A char pointer will then be a 32-bit pointer plus at least one more bit indicating which of the two chars in the addressed word the pointer points to. Now let us suppose that structures must be word-aligned (very probable on a word-addressed machine). Then also casting from (char *) to (FILE *) will probably just consist of dropping the extra bit (this is permitted by the C definition ever since K&R, see below for supporting quote).
Now let us suppose that the compiler optimizes for space (possibly at the direction of the user) and puts myio_no and myio_null in the same word. Then &myio_no != &myio_null, but the only difference is in the extra bit, so ((FILE *)&myio_no) == ((FILE *)&myio_null). The legality of the postulated property of pointer conversion goes clear back to K&R, and I doubt it's changed in ANSI, because they are paying even more attention to portability than K&R were (I don't have a copy of the draft, and won't until it's available in machine-readable form). From Appendix A: C Reference Manual, 14.1, Explicit pointer conversions: Certain conversions involving pointers are permitted but have implementation-dependent aspects. [...] A pointer may be converted to any of the integral types large enough to hold it. [...] An object of integral type may be explicitly converted to a pointer. The mapping always carries an integer converted from a pointer back to the same pointer, but is otherwise machine dependent. A pointer to one type may be converted to a pointer to another type. The resulting pointer may cause addressing exceptions upon use [...]. It is guaranteed that a pointer to an object of a given size may be converted to a pointer to an object of a smaller size and back again without change. Note that nothing is said about converting a pointer to an object of a given size to a pointer to an object of a larger size, unless the original pointer was obtained by the reverse cast, which is not the case here. der Mouse (mouse@mcgill-vision.uucp)
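The postulated machine can be modelled in ordinary C to make the collision concrete. This is a simulation under the assumptions stated in the post, not a description of any real architecture; `struct char_ptr`, `struct word_ptr`, and the helper functions are hypothetical names:

```c
#include <stdint.h>

/* Simulated pointer formats for the hypothetical word-addressed machine. */
struct char_ptr { uint32_t word; unsigned byte; };  /* byte selects char 0 or 1 */
struct word_ptr { uint32_t word; };                 /* word-aligned objects only */

/* Models the cast (FILE *)p: the byte selector is simply dropped. */
struct word_ptr to_word_ptr(struct char_ptr p)
{
    struct word_ptr w = { p.word };
    return w;
}

/* Two char pointers are equal only if word and byte selector both match. */
int char_ptr_equal(struct char_ptr a, struct char_ptr b)
{
    return a.word == b.word && a.byte == b.byte;
}

/* Two word pointers are equal if they name the same word. */
int word_ptr_equal(struct word_ptr a, struct word_ptr b)
{
    return a.word == b.word;
}
```

If two chars share a word, say at simulated addresses `{100, 0}` and `{100, 1}`, they compare unequal as `char_ptr` values but equal after `to_word_ptr`, which is the NOFILE == NULLFILE collision described above.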
guy%gorodish@Sun.COM (Guy Harris) (08/19/87)
> There is no "new line" character in ASCII. UNIX uses "line feed" as the new
> line character. OS/9 uses "carriage return". I'd say the 'C' language itself
> is suffering from parochialism here.

No, not really; this is no more parochial than using "0" (converted to
the appropriate type) to represent a null pointer. In both cases, it is
possible to properly implement C; you just have to clear your mind of
two notions: that using "0" to represent a null pointer means a null
pointer must consist of all zero bits, and that '\n' standing for LF
means lines in the native OS's file system must end with an LF.

If OS/9 uses CR as the line-terminator character, the C I/O routines for
OS/9 (i.e., "printf", "fputs", etc.) should translate LF into CR on
output, and translate CR into LF on input. This may, of course, require
them not to ignore the "b" modifier on "fopen" modes, so that in "text"
mode this translation is performed and in "binary" mode it isn't, but
that's life.

Of course this raises the question of what the C I/O routines should
translate CR to on output, or what should be translated to CR on input,
but then if OS/9 doesn't provide a mechanism to "move the active
position (of an output device) to the initial position of the current
line" they do, technically, have a problem with implementing ANSI C, at
least.

	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com
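
The text-mode translation described above might look roughly like the following. This is a minimal sketch, not the OS/9 library: HOST_EOL, translate_out, and translate_in are hypothetical names, and a real stdio would do this inside its buffering layer. The open question of what a literal '\r' should become is deliberately left unanswered; the sketch passes it through unchanged on output.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: the host OS ends text lines with CR (0x0D),
   while '\n' in the C program has a different value (LF on an ASCII host). */
#define HOST_EOL 0x0D

/* Copy n chars from src to dst on the way out to a file.
   In text mode each '\n' becomes the host line terminator; in binary
   mode ("b" in the fopen mode) every char passes through untouched. */
size_t translate_out(char *dst, const char *src, size_t n, int text_mode)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++)
        dst[i] = (text_mode && src[i] == '\n') ? (char)HOST_EOL : src[i];
    return n;
}

/* Copy n chars from src to dst on the way in from a file.
   In text mode each host line terminator becomes '\n'; in binary mode
   nothing is changed. */
size_t translate_in(char *dst, const char *src, size_t n, int text_mode)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++)
        dst[i] = (text_mode && src[i] == (char)HOST_EOL) ? '\n' : src[i];
    return n;
}
```

The program itself only ever sees '\n'; the choice of terminator stays inside the library, which is the point of the "b" modifier.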
henry@utzoo.UUCP (Henry Spencer) (08/20/87)
> There is no "new line" character in ASCII. UNIX uses "line feed" as the new
> line character. OS/9 uses "carriage return". I'd say the 'C' language itself
> is suffering from parochialism here.

Tsk, tsk. How many of the people pontificating about ASCII have actually
read the ASCII standard? Down in the fine print, it says loud and clear
that IF a single character is used for the new line function, it shall
be the one otherwise known as linefeed, and "newline" is then a
legitimate name for it. I don't vouch for the wording -- my copy of the
standard isn't handy -- but the meaning is clear. OS/9 is simply wrong
here.
-- 
Support sustained spaceflight: fight | Henry Spencer @ U of Toronto Zoology
the soi-disant "Planetary Society"!  | {allegra,ihnp4,decvax,utai}!utzoo!henry
chips@usfvax2.UUCP (Chip Salzenberg) (08/20/87)
In article <503@sugar.UUCP>, peter@sugar.UUCP (Peter da Silva) writes:
>> What is an implementation of C supposed to do on an OS/machine/character-code
>> combination that doesn't have the foggiest that there is such a thing
>> as distinct "new line" and "carriage return" characters? From the looks of
>> the discussion here, I'd gather that OS9 is just such a beast and its C
>> compiler is making the best of this brain damaged situation that it can.

This compiler botched the job.

> There is no "new line" character in ASCII. UNIX uses "line feed" as the new
> line character. OS/9 uses "carriage return". I'd say the 'C' language itself
> is suffering from parochialism here.

Not quite; no useful language can pander to every (hostile? :-])
environment.

> I would say that
> since OS/9 is a UNIX lookalike, using CR for NL instead of LF was probably
> not the best choice... but at least it's better than using both of them.

I once re-targeted a UNIX C compiler from the Z-80 to the 6809, so as to
compile OS-9 programs. My solution to the above-mentioned problem was to
define '\n' as 0x0D and '\r' as 0x0A. This did not produce the correct
behavior for '\r', but it did prevent the ('\r' == '\n') problem; and
since the OS-9 I/O services consider 0x0D as `newline' (CR-LF), it would
have been _very_ difficult to make '\r' behave correctly anyway.
-- 
Chip Salzenberg   UUCP: "uunet!ateng!chip"  or  "chips@usfvax2.UUCP"
A.T. Engineering, Tampa   Fidonet: 137/42   CIS: 73717,366
"Use the Source, Luke!"   My opinions do not necessarily agree with anything.
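
For readers who want to see the swap concretely, here is one way a compiler's escape handling could express it. This is a guess at the shape of such a table, not the code from the re-targeted compiler: target_escape_value and check_escape_mapping are hypothetical names, and only the two escapes discussed above are handled.

```c
#include <assert.h>

/* Hypothetical helper: the target character code the compiler emits for
   a C escape letter when generating code for OS-9. */
int target_escape_value(char letter)
{
    switch (letter) {
    case 'n': return 0x0D;   /* OS-9 treats 0x0D as its line terminator */
    case 'r': return 0x0A;   /* swapped, so that '\r' != '\n' still holds */
    default:  return -1;     /* other escapes are outside this sketch */
    }
}

/* Sanity check on the property the swap is meant to preserve. */
void check_escape_mapping(void)
{
    assert(target_escape_value('n') != target_escape_value('r'));
}
```

As noted in the post, this keeps the two constants distinct but does not make '\r' behave like a carriage return on output.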
peter@sugar.UUCP (Peter da Silva) (08/23/87)
In article <8444@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
> Tsk, tsk. How many of the people pontificating about ASCII have actually
> read the ASCII standard? Down in the fine print, it says loud and clear
> that IF a single character is used for the new line function, it shall be
> the one otherwise known as linefeed, and "newline" is then a legitimate
> name for it. I don't vouch for the wording -- my copy of the standard
> isn't handy -- but the meaning is clear. OS/9 is simply wrong here.

The man from the Zoo comes through again. Wow. Amazing. I never knew
that.

I never even thought about looking at the ASCII standard. Where do you
get it?
-- 
-- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter (I said, NO PHOTOS!)
henry@utzoo.UUCP (Henry Spencer) (08/27/87)
> I never even thought about looking at the ASCII standard. Where do you get
> it?

Its full name is "ANSI X3.4-1977, American National Standard Code for
Information Interchange". ANSI is at 1430 Broadway, NYC 10018; be warned
that their prices are high.

In case anyone's interested, the relevant item is in section 5.2, in the
description of LF:

    Where appropriate, this character may have the meaning "New Line"
    (NL), a format effector that advances the active position to the
    first character position on the next line. Use of the NL convention
    requires agreement between sender and recipient of data.

There is no analogous clause under the description of CR.
-- 
"There's a lot more to do in space   | Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry