greim@sbsvax.UUCP (Michael Greim) (07/03/89)
In article <192@dg.dg.com> in comp.bugs.sys5 rec@dg.dg.com (Robert Cousins) reported a bug in csh. Here is a description of it and a fix. Absorb, apply and enjoy, -mg Symptoms: >I have found that there are several basic bugs in the Cshell history >mechanism which can be quite problematic. These were originally called >to my attention by the work of a couple of researchers at the U of W in >Madison. Bug 1: > > !a%999999999f > >which causes 999999999 spaces to be output to the screen. (Fewer on >16 bit machines.) It seems that the error message in the history >module goes through the first argument of a printf() call and can be >interpreted as a format. A second bug, which happens on fewer machines >takes place with: > > !a%f%f > >which can cause csh to dump core from a floating point error. [...] Diagnosis: When printing an error message the csh function "error" passes its first argument to its printf. If this string contains a "%", printf tries to evaluate the following as a format accessing its argument list, which is empty quite probably. This will cause unpredictable behaviour, like core dumps, strange numbers, ... Therapy: Apply the following context diff to your source of 43BSD csh. This fix works like this: - in error only a string ending in "@" is passed as format string to printf. - only internal error messages needing arguments need format specifiers, so I add a trailing "@" to any such message - any string not ending with "@" is passed to printf with "%s" as first parameter. *** sh.dol.c.old Mon Jul 3 15:18:22 1989 --- sh.dol.c Mon Jul 3 15:12:39 1989 *************** *** 159,165 if (c == c1) break; if (c == '\n' || c == DEOF) ! error("Unmatched %c", c1); if ((c & (QUOTE|TRIM)) == ('\n' | QUOTE)) --wp, ++i; if (--i <= 0) --- 159,165 ----- if (c == c1) break; if (c == '\n' || c == DEOF) ! error("Unmatched %c@", c1); /* mg 3-jul-89 */ if ((c & (QUOTE|TRIM)) == ('\n' | QUOTE)) --wp, ++i; if (--i <= 0) *** sh.err.c.old Mon Jul 3 15:18:36 1989 --- sh.err.c Mon Jul 3 15:16:10 1989 *************** *** 34,39 { register char **v; register char *ep; /* * Must flush before we print as we wish output before the error --- 34,40 ----- { register char **v; register char *ep; + register int i; /* mg 3-jul-89 */ /* * Must flush before we print as we wish output before the error *************** *** 53,58 /* * A zero arguments causes no printing, else print * an error diagnostic here. */ if (s) printf(s, arg), printf(".\n"); --- 54,65 ----- /* * A zero arguments causes no printing, else print * an error diagnostic here. + * mg 3-jul-89 : If error is called with s containing a part of + * the users input, and if this contains a "%", printf tries to evaluate + * this as a format reference. Try : "!a%100s". + * I fixed this: internal error messages which should be processed as + * format strings, will end in "@", all else is NOT interpreted as a + * format string. */ if (s) { i = strlen (s) - 1; *************** *** 54,61 * A zero arguments causes no printing, else print * an error diagnostic here. */ ! if (s) ! printf(s, arg), printf(".\n"); didfds = 0; /* Forget about 0,1,2 */ if ((ep = err) && errspl) { --- 61,74 ----- * format strings, will end in "@", all else is NOT interpreted as a * format string. */ ! if (s) { ! i = strlen (s) - 1; ! if (s[i] == '@') { ! s[i] = '\0'; ! printf(s, arg), printf(".\n"); ! } else ! printf("%s\n", s); ! } didfds = 0; /* Forget about 0,1,2 */ if ((ep = err) && errspl) { *** sh.func.c.old Mon Jul 3 15:18:55 1989 --- sh.func.c Mon Jul 3 15:12:55 1989 *************** *** 1004,1010 while (*cp && *cp == *str) cp++, str++; if (*cp) ! error("Bad scaling; did you mean ``%s''?", str0); } plim(lp, hard) --- 1004,1010 ----- while (*cp && *cp == *str) cp++, str++; if (*cp) ! error("Bad scaling; did you mean ``%s''?@", str0); /* mg 3-jul-89 */ } plim(lp, hard) *** sh.glob.c.old Mon Jul 3 15:19:06 1989 --- sh.glob.c Mon Jul 3 15:13:10 1989 *************** *** 139,145 if (gpathp != gpath + 1) { *gpathp = 0; if (gethdir(gpath + 1)) ! error("Unknown user: %s", gpath + 1); (void) strcpy(gpath, gpath + 1); } else (void) strcpy(gpath, value("home")); --- 139,145 ----- if (gpathp != gpath + 1) { *gpathp = 0; if (gethdir(gpath + 1)) ! error("Unknown user: %s@", gpath + 1); /* mg 3-jul-89 */ (void) strcpy(gpath, gpath + 1); } else (void) strcpy(gpath, value("home")); *** sh.sem.c.old Mon Jul 3 15:19:21 1989 --- sh.sem.c Mon Jul 3 15:13:22 1989 *************** *** 401,405 return; if ((stb.st_mode & S_IFMT) == S_IFCHR) return; ! error("%s: File exists", cp); } --- 401,405 ----- return; if ((stb.st_mode & S_IFMT) == S_IFCHR) return; ! error("%s: File exists@", cp); /* mg 3-jul-89 */ } -- Michael Greim Email : greim@sbsvax.informatik.uni-saarland.dbp.de or : ...!uunet!unido!sbsvax!greim # include <disclaimers/std.h>
argv%eureka@Sun.COM (Dan Heller) (07/05/89)
Why are people so stuck on using printf? Michael Greim finds a bug in csh because it misuses printf, yet the fix (altho it works) continues to use printf -- I have nothing against using printf, but this is a very costly function when you compare it to something like fputs or puts. (Have you ever seen the source to printf()?) But more importantly, it causes severe bugs with programs that are sometimes hard to trace. In article greim@sbsvax.UUCP (Michael Greim) writes: > ! if (s) { > ! i = strlen (s) - 1; > ! if (s[i] == '@') { > ! s[i] = '\0'; > ! printf(s, arg), printf(".\n"); > ! } else > ! printf("%s\n", s); > ! } How about this: if (s[i] == '@') { s[i] = '\0'; printf(s, arg), puts("."); } else puts(s); Now don't get me wrong, I realize that this is a trivial "simple" thing that one might say, "give me a break." But the fact that people are not as conscientious about how printf is used is the reason that the bugs recently found is csh are created. A much more important bug that I've found as a result of the same errors is when programs write out data to files using fprintf. As soon as there is a %s in the data written, you just created a junk file. And this is also the type of bug that doesn't present itself frequently. dan <island!argv@sun.com> ----- My postings reflect my opinion only -- When on the net, I speak for no company.
chk@dciem.dciem.dnd.ca (C. Harald Koch) (07/05/89)
In article <113630@sun.Eng.Sun.COM> island!argv@sun.com (Dan Heller) writes: >Why are people so stuck on using printf? Because everybody's first C program, taken from K&R, is: main() { printf("Hello, World!\n"); } From step one, we are taught to use it for everything. -- C. Harald Koch NTT Systems, Inc., Toronto, Ontario chk@gpu.utcs.utoronto.ca, chk@dretor.dciem.dnd.ca, chk@chkent.UUCP University: Like a software house, except the software's free, and it's usable, and it works, and if it breaks they'll quickly tell you how to fix it, and ...
greim@sbsvax.UUCP (Michael Greim) (07/06/89)
In article <113630@sun.Eng.Sun.COM>, argv%eureka@Sun.COM (Dan Heller) writes: > Why are people so stuck on using printf? Michael Greim finds a bug > in csh because it misuses printf, yet the fix (altho it works) continues > to use printf -- I have nothing against using printf, but this is a very > costly function when you compare it to something like fputs or puts. > (Have you ever seen the source to printf()?) But more importantly, it > causes severe bugs with programs that are sometimes hard to trace. 1.) I did not find the bug in printf, Robert Cousins did. I rather found a fix for it. 2.) Yes stdio's printf is very costly, but ... 3.) ... csh uses its own printf. In fact, printf just calls _doprnt. _doprnt calls strout to emit parts of its stuff, which in turn calls putchar. There is no puts or fputs. 4.) Of course I could have used putchar. But for the strings I would have needed a loop -> more code and error messages occur fairly seldom, so there is not much gain in trying to safe some microseconds by taking the shortcut for the calling sequence. 5.) _doprnt is written in VAX assembler. So on other machines it must either be rewritten in the appropriate assembler, or a C implementation of printf must be used. There might not even be a putchar available after all. In my testversion I use the C printf from vi. 6.) Yes, I have seen the innards of printf. In my own programs I try to avoid printf (and scanf) if possible. (See "strings", recently published in comp.sources.misc) > Now don't get me wrong, I realize that this is a trivial "simple" thing > that one might say, "give me a break." But the fact that people are not > as conscientious about how printf is used is the reason that the bugs > recently found is csh are created. A much more important bug that I've > found as a result of the same errors is when programs write out data to > files using fprintf. As soon as there is a %s in the data written, you > just created a junk file. And this is also the type of bug that doesn't Normally one gets a core dump fairly quick. How can you create a file using fprintf ("%s", n); ? (Which is what I think you are writing about) And then there is lint ... -mg -- Michael Greim Email : greim@sbsvax.informatik.uni-saarland.dbp.de or : ...!uunet!unido!sbsvax!greim [.signature removed by the board of censors for electronic mail's main executive computer because it contained a four letter word ("word")]
argv%eureka@Sun.COM (Dan Heller) (07/07/89)
In article <771@sbsvax.UUCP> greim@sbsvax.UUCP (Michael Greim) writes: > In article <113630@sun.Eng.Sun.COM>, argv%eureka@Sun.COM (Dan Heller) writes: > > Why are people so stuck on using printf? Michael Greim finds a bug Basically, I apologize for the "flame" that people seem to think I posted. I wasn't flaming really. I just meant to point out that puts() will never be slower than printf(), so to avoid "other bugs" (see below), it's safer to use puts when that's what you really meant. Sorry, Michael. > > As soon as there is a %s in the data written, you > > just created a junk file. And this is also the type of bug that doesn't > Normally one gets a core dump fairly quick. > How can you create a file using > fprintf ("%s", n); > ? (Which is what I think you are writing about) > And then there is lint ... No, I was talking about the type of bug that lint can't catch. That is, I see this type of usage all the time: ... send_to_file("this is a string"); ... send_to_file(s) char *s; { extern FILE *fp; fprintf(fp, s); } It is this type of bug that eventually catches up to you. For example, a hypothetical mail program could extract the return address of a user and try to print it: extern char *get_return_address(); char *addr; addr = get_return_address(message_3); ... printf(addr); Now, suppose the address that was returned was: argv%island@sun.com What do you suppose will happen? This is the type of warning I was trying to convey to people. Sorry if I implied anything else... dan <island!argv@sun.com> ----- My postings reflect my opinion only -- not the opinion of any company.
allyn@hp-sdd.hp.com (Allyn Fratkin) (07/10/89)
In article <769@sbsvax.UUCP>, greim@sbsvax.UUCP (Michael Greim) writes: > This fix works like this: > - in error only a string ending in "@" is passed as format string > to printf. > - only internal error messages needing arguments need format specifiers, > so I add a trailing "@" to any such message > - any string not ending with "@" is passed to printf with "%s" as first > parameter. nothing personal, but what a disgusting hack. you use @ as the *last* character of the string to indicate the presence of a second parameter? wouldn't it have been much easier to use the first char? but why didn't you just change the calls error(singlearg); to error("%s", singlearg); seems like it would have been much cleaner. (you could have done it with sed! but i won't get into that now :-) ) -- From the virtual mind of Allyn Fratkin allyn@sdd.hp.com San Diego Division - or - Hewlett-Packard Company uunet!ucsd!hp-sdd!allyn
greim@sbsvax.UUCP (Michael Greim) (07/10/89)
In article <2218@hp-sdd.hp.com>, allyn@hp-sdd.hp.com (Allyn Fratkin) writes: > In article <769@sbsvax.UUCP>, greim@sbsvax.UUCP (Michael Greim) writes: > > This fix works like this: > > - in error only a string ending in "@" is passed as format string > > to printf. > > - only internal error messages needing arguments need format specifiers, > > so I add a trailing "@" to any such message > > - any string not ending with "@" is passed to printf with "%s" as first > > parameter. > > nothing personal, but what a disgusting hack. you use @ as the *last* > character of the string to indicate the presence of a second parameter? > wouldn't it have been much easier to use the first char? I never denied that it was a disgusting hack. Most changes in csh will result in such a thing. When I decided on how to fix it, I had the following goals in mind: - The fixes should be to as little files as possible. If people have done changes to csh, they may have to do the new fixes by hand, and will be really pissed off, if they have so much to do. (Our csh contains a lot of changes, e.g. it has a version command, which prints out csh's version, some special csh variables which change certain (mis-)behaviours; the format of "time" is changed ...) - there are a lot more invocations to error with one argument than there are to error with two arguments. - No new function unless necessary. - No new global variables unless necessary. - Didn't want to spend too much time on it. - I did not use "@" as the first character, because the command !@%10s would then still produce 10 spaces, as error would see "@%10s", think: "aha, it starts with @, so it is a format string", call printf ("%10s", ... which will produce 10 space, and maybe a core dump. I decided on "@" because no valid error message from inside csh can end with a "@". > > but why didn't you just change the calls > error(singlearg); > to > error("%s", singlearg); > > seems like it would have been much cleaner. Maybe, but there are several error function, which package the messages in several ways. If one uses "error(fmt, arg)", one must replace some calls to seterr2 and seterrc, one must trace possible calls to "error(av)" "Perror" or to "error" of the form "error(err)" to see, where the "av" really comes from, and probably change some other calls too, if the "av" can contain a "%". I preferred the simple, though ugly way, as I did not want to spend a lot of time on the investigation whether any alternative might yield a more beautiful fix. > From the virtual mind of Allyn Fratkin allyn@sdd.hp.com > San Diego Division - or - > Hewlett-Packard Company uunet!ucsd!hp-sdd!allyn -mg -- Michael Greim Email : greim@sbsvax.informatik.uni-saarland.dbp.de or : ...!uunet!unido!sbsvax!greim [.signature removed by the board of censors for electronic mail's main executive computer because it contained a four letter word ("word")]
greim@sbsvax.UUCP (Michael Greim) (07/10/89)
In article <114074@sun.Eng.Sun.COM>, argv%eureka@Sun.COM (Dan Heller) writes: = In article <771@sbsvax.UUCP> greim@sbsvax.UUCP (Michael Greim) writes: = > In article <113630@sun.Eng.Sun.COM>, argv%eureka@Sun.COM (Dan Heller) writes: = > > Why are people so stuck on using printf? Michael Greim finds a bug = = Basically, I apologize for the "flame" that people seem to think I = posted. I wasn't flaming really. I just meant to point out that = puts() will never be slower than printf(), so to avoid "other bugs" = (see below), it's safer to use puts when that's what you really meant. = Sorry, Michael. No offense taken. As to "puts() will never be slower than printf()": I recall reading somewhere that some people think about implementing puts() using printf(). = = No, I was talking about the type of bug that lint can't catch. That is, = I see this type of usage all the time: = = ... = send_to_file("this is a string"); = ... = = send_to_file(s) = char *s; = { = extern FILE *fp; = fprintf(fp, s); = } You must look at some strange progams :-) I aggree that this is bad practice and people should be made aware of it. -mg -- Michael Greim Email : greim@sbsvax.informatik.uni-saarland.dbp.de or : ...!uunet!unido!sbsvax!greim [.signature removed by the board of censors for electronic mail's main executive computer because it contained a four letter word ("word")]
frank@zen.co.uk (Frank Wales) (07/12/89)
In article <774@sbsvax.UUCP> greim@sbsvax.UUCP (Michael Greim) writes: >As to "puts() will never be slower than printf()": I recall reading >somewhere that some people think about implementing puts() using printf(). If I ever found that a C library I was using had puts() (or any other straightforward string function) implemented by calling printf(), I'd be round at the vendor's factory gates with a large shovel to reprogram the clown responsible. Just in case you know someone who's thinking about it. -- Frank Wales, Systems Manager, [frank@zen.co.uk<->mcvax!zen.co.uk!frank] Zengrange Ltd., Greenfield Rd., Leeds, ENGLAND, LS9 8DB. (+44) 532 489048 x217
greim@sbsvax.UUCP (Michael Greim) (07/14/89)
In article <1640@zen.co.uk>, frank@zen.co.uk (Frank Wales) writes: > In article <774@sbsvax.UUCP> greim@sbsvax.UUCP (Michael Greim) writes: > >As to "puts() will never be slower than printf()": I recall reading > >somewhere that some people think about implementing puts() using printf(). > > If I ever found that a C library I was using had puts() (or any other > straightforward string function) implemented by calling printf(), I'd > be round at the vendor's factory gates with a large shovel to reprogram > the clown responsible. Just in case you know someone who's thinking > about it. If it isn't too far away, I'd join you :-) I've dug in our archives, and lo! I have found the reference in question. Here it is: #From: bostic@ucbvax.BERKELEY.EDU (Keith Bostic) #Newsgroups: comp.bugs.4bsd #Subject: Re: [n]eqn(1) speedup #Keywords: [n]eqn munches too much CPU time #Message-ID: <25589@ucbvax.BERKELEY.EDU> #Date: 11 Aug 88 17:52:19 GMT #References: <583@sbsvax.UUCP> #Organization: University of California at Berkeley #Lines: 13 #Posted: Thu Aug 11 18:52:19 1988 # #In article <583@sbsvax.UUCP>, bs@sbsvax.UUCP (Bernard Sieloff) writes: #> Subject: [n]eqn(1) speedup #> Input to eqn with no effect to eqn itself will be passed unaltered to #> stdout, but by calling printf()! #> Eqn behavior is heavily sped up, if the "passing"-printf() calls are #> changed to fputs() calls. # #This is implementation dependent. On the current BSD system, printf #is significantly faster, for strings that don't contain percent signs, #than [f]puts calls. I expect to make [f]puts just call printf in the #near future. # #Keith Bostic > -- > Frank Wales, Systems Manager, [frank@zen.co.uk<->mcvax!zen.co.uk!frank] > Zengrange Ltd., Greenfield Rd., Leeds, ENGLAND, LS9 8DB. (+44) 532 489048 x217 -mg -- Michael Greim Email : greim@sbsvax.informatik.uni-saarland.dbp.de or : ...!uunet!unido!sbsvax!greim [.signature removed by the board of censors for electronic mail's main executive computer because it contained a four letter word ("word")]