bph@buengc.BU.EDU (Blair P. Houghton) (06/15/89)
In article <4529@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>scs@adam.pika.mit.edu (Steve Summit) writes:
>>It is only a miserable problem when scanf
>>is being used for interactive user input, which is what everybody
>>uses it for.

"Eeep", and "yoicks!"

>Anyone using scanf directly for interactive input... or for any input at
>all... should have their head examined.
>
>The only really safe way to use scanf() without freaking out the casual
>user of your code is to do something like this:
>
>    fgets(buffer, sizeof buffer, stdin);
>    sscanf(buffer, fmt, args...);

I'll go along with the "don't use it for interactive input" idea, but
not the "for any input at all"...

When filtering tabular data from files, or dealing in a situation where
a precise syntax is necessary, the `fgets(..); sscanf(..)' doublet just
adds uncertainty and complexity to a simple problem to which scanf is
suited ideally.

If there is an error reading something in that case, you usually force
a barf.  It's irrelevant whether the input gets discarded.

The point is, don't reject scanf() just because it's unsuited to a
problem you aren't solving.

--Blair
  "We pay csh for used car's..."
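A minimal sketch of the fgets()/sscanf() doublet quoted above, assuming a hypothetical two-field record (an integer id followed by a floating-point value). The names parse_record, read_record, and LINE_MAX_LEN are illustrative and do not come from the thread.

```c
#include <stdio.h>

/* Hypothetical limit on line length, chosen only for this sketch. */
enum { LINE_MAX_LEN = 256 };

/* Parse one already-read line of the assumed form "<int> <double>".
   Returns 1 on success, 0 if the line is malformed. */
int parse_record(const char *line, int *id, double *value)
{
    char extra;

    /* The trailing %c catches junk after the two fields: a well-formed
       line yields exactly 2 conversions, never 3. */
    return sscanf(line, "%d %lf %c", id, value, &extra) == 2;
}

/* Read one whole line from fp, then parse it.
   Returns 1 on success, 0 on a malformed line, EOF at end of input. */
int read_record(FILE *fp, int *id, double *value)
{
    char buffer[LINE_MAX_LEN];

    if (fgets(buffer, sizeof buffer, fp) == NULL)
        return EOF;
    return parse_record(buffer, id, value);
}
```

Because fgets() consumes the whole line before any parsing happens, a malformed line costs exactly one line of input, and the next call starts cleanly on the following line. Lines longer than the buffer are split across calls in this sketch.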
peter@ficc.uu.net (Peter da Silva) (06/15/89)
In article <3145@buengc.BU.EDU>, bph@buengc.BU.EDU (Blair P. Houghton) writes:
> When filtering tabular data from files, or dealing in a situation where
> a precise syntax is necessary, the `fgets(..); sscanf(..)' doublet just
> adds uncertainty and complexity to a simple problem to which scanf is
> suited ideally.

Eeep and Yoicks yourself.

I suspect this is another religious issue, but given scanf's habit of
trashing indeterminate amounts of input and ignoring newlines if you
have anything wrong with your format string, well... precision and
scanf just don't belong in the same sentence.

I tend to stick with strspn() and strtok(), myself.
--
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Business: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Personal: ...!texbell!sugar!peter, peter@sugar.hackercorp.com.
mpl@cbnewsl.ATT.COM (michael.p.lindner) (06/16/89)
In article <4563@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes:
> In article <3145@buengc.BU.EDU>, bph@buengc.BU.EDU (Blair P. Houghton) writes:
> > When filtering tabular data from files, ....
> > .... scanf is
> > suited ideally.
>
> I tend to stick with strspn() and strtok(), myself.
>
> Peter da Silva, Xenix Support, Ferranti International Controls Corporation.

Hope you never have to do anything complex.  If you call strtok on a
string in the middle of a strtok of another string it trashes its state
information on the first string (a little known feature), which can
cause extremely elusive bugs.  My previous project got stuck with this
when we started using some library code which called it.  For this
reason, I avoid strtok like the plague in all but the simplest
applications.

Mike Lindner
attunix!mpl
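To illustrate the state problem described above, here is a hedged sketch of a tokenizer whose scan position is held by the caller instead of inside the function. tok_next and count_words are hypothetical names invented for this example; they are not library functions.

```c
#include <string.h>

/* tok_next: hypothetical helper, not a library function.  Like strtok()
   it writes '\0' into the string, but the scan position lives in *state,
   which the caller owns, so two scans can be interleaved safely. */
char *tok_next(char **state, const char *delims)
{
    char *start = *state;
    char *end;

    if (start == NULL)
        return NULL;
    start += strspn(start, delims);        /* skip leading delimiters */
    if (*start == '\0') {
        *state = NULL;                     /* nothing left to scan */
        return NULL;
    }
    end = start + strcspn(start, delims);  /* find the end of the token */
    if (*end != '\0') {
        *end = '\0';                       /* terminate the token in place */
        *state = end + 1;
    } else {
        *state = NULL;                     /* token ran to end of string */
    }
    return start;
}

/* Count the words in ';'-separated lines, scanning words inside the loop
   that scans lines.  The text is modified in place. */
int count_words(char *text)
{
    char *lines = text;
    char *line;
    int n = 0;

    while ((line = tok_next(&lines, ";")) != NULL) {
        char *words = line;                /* independent inner state */
        while (tok_next(&words, " ") != NULL)
            n++;
    }
    return n;
}
```

Written with strtok() in both loops, the inner scan would overwrite the single hidden position that the outer scan relies on, which is the elusive bug described in the post. Keeping the position in a caller-owned variable avoids that.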
daveh@marob.masa.com (Dave Hammond) (06/17/89)
In article <824@cbnewsl.ATT.COM> mpl@cbnewsl.ATT.COM (michael.p.lindner) writes:
>> > When filtering tabular data from files, ....
>>
>in the middle of a strtok of another string it trashes its state information
>on the first string (a little known feature), which can cause extremely
>elusive bugs.

Not only that, since it delimits the returned token by replacing the
terminating blank with a null, you are forced to work with a copy of
the input line, if for some reason the complete line must survive the
call to strtok().

I prefer using strpbrk(line, " \t\n") and either copying, or just
peeking at the token, whichever is required.
--
Dave Hammond
daveh@marob.masa.com
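A rough sketch of the strpbrk() approach mentioned above, under the assumption that tokens are separated by spaces, tabs, or newlines. peek_token is a hypothetical helper name; the input line is never written to.

```c
#include <string.h>

/* peek_token: hypothetical helper.  Copies the first whitespace-delimited
   token of line into out (capacity outsize, truncating if needed) and
   leaves line itself untouched.  Returns a pointer just past the token,
   or NULL if the line holds no further token. */
const char *peek_token(const char *line, char *out, size_t outsize)
{
    const char *end;
    size_t len;

    line += strspn(line, " \t\n");      /* skip leading whitespace */
    if (*line == '\0' || outsize == 0)
        return NULL;

    end = strpbrk(line, " \t\n");       /* first delimiter after the token */
    if (end == NULL)
        end = line + strlen(line);      /* token runs to end of string */

    len = (size_t)(end - line);
    if (len >= outsize)
        len = outsize - 1;              /* truncate to fit the buffer */
    memcpy(out, line, len);
    out[len] = '\0';
    return end;
}
```

Calling peek_token() again with the returned pointer walks through the remaining tokens, and the complete line survives because only the copy in out is ever terminated.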
diamond@diamond.csl.sony.junet (Norman Diamond) (06/20/89)
In article <1134@vsi.COM> friedl@vsi.COM (Stephen J. Friedl) writes:
>since strtok places NULs in the string, the
>environment was getting corrupted for the child.
>Neither of these are bugs -- they are documented parts of the
>function -- but nevertheless we have been hit with these gotchas.

You mean that these are design bugs instead of coding bugs.
They are documented bugs instead of undocumented bugs.
Just like gets() has some documented design bugs.

Funny, existing practices that consisted of documented bugs really
have been standardized.  Only existing practices that consisted of
quasi-documented but necessary features have been omitted from the
standardization.
--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
  The above opinions are claimed by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo,
  Stanford, or Anterior, then their administrators must have approved of
  these opinions.
gwyn@smoke.BRL.MIL (Doug Gwyn) (06/28/89)
In article <10397@socslgw.csl.sony.JUNET> diamond@csl.sony.junet (Norman Diamond) writes:
>You mean that these are design bugs instead of coding bugs.
>They are documented bugs instead of undocumented bugs.
>Just like gets() has some documented design bugs.
>Funny, existing practices that consisted of documented bugs really
>have been standardized.  Only existing practices that consisted of
>quasi-documented but necessary features have been omitted from the
>standardization.

I don't know what you mean by that last sentence.

Certainly some of the features inherited from the Base Documents were
misdesigned in the eyes of many of us, including perhaps a majority of
X3J11.  Here are the most likely alternatives that faced the committee:

    (1) Omit the misdesigned function from the Standard.
    (2) Specify a different behavior for the function than it had
        in existing practice, to correct the problem.
    (3) Add a newly named function with an improved design.
    (4) Standardize the function the way it actually exists.

Obviously, some of these alternatives are mutually exclusive.

It should be pretty obvious what the pros and cons are for each of
these alternatives.  Since the primary charter of X3J11 was to
standardize existing practice, when it was clear and unambiguous,
alternative (4) was used for essentially all the functions in the Base
Documents.  Alternative (3) was avoided except when there was a
pressing need, as with the localization support.  Unlike many so-called
"standardization" committees, X3J11 did not feel their job was to
design a lot of new, unproven stuff then push it as a "standard".
Alternative (1) for the most part would have been defaulting on the
committee's primary responsibility.  Alternative (2) would have caused
major compatibility and transition problems.

I have to say that I resent the tone of your criticism.  X3J11 did an
excellent job of standardizing the C programming language, and you
could have participated if you had chosen to do so.  There were many
factors that had to be carefully evaluated in arriving at the final
specification.  It is easy to imply that you could have done better
yourself, but I seriously doubt it.
jss@hector.UUCP (Jerry Schwarz) (06/28/89)
In article <10397@socslgw.csl.sony.JUNET> diamond@csl.sony.junet (Norman Diamond) writes:

[Some discussion of design flaws in "strtok" and "gets" omitted]

>
>Funny, existing practices that consisted of documented bugs really
>have been standardized.  Only existing practices that consisted of
>quasi-documented but necessary features have been omitted from the
>standardization.
>

I strongly object to the tone of the above paragraph.  It suggests
(without coming right out and saying it) that the deliberations of the
ANSI C committee were subject to some systematic effect that damaged
the design of ANSI C, without suggesting what that influence was.  Was
it incompetence, improper goals, maliciousness, greed, haste, or
something else?  Since no specific charges are made they can't be
refuted.

Probably nobody agrees with all the decisions made by the committee.
(I happen to agree with it on "strtok" and disagree on "gets", but that
isn't particularly relevant.)

For the record, I never served on the committee although I know some of
the people who have.

Jerry Schwarz
peter@ficc.uu.net (Peter da Silva) (07/22/89)
In article <824@cbnewsl.ATT.COM>, mpl@cbnewsl.ATT.COM (michael.p.lindner) writes:
> In article <4563@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes about scanf:
> > I tend to stick with strspn() and strtok(), myself.

> Hope you never have to do anything complex.  If you call strtok on a string
> in the middle of a strtok of another string it trashes its state information
> on the first string (a little known feature), which can cause extremely
> elusive bugs.

Sounds like something too fancy for scanf, too.  But thanks for the
info... I've never gotten that complex with strtok.  By that time I'm
usually stepping through the string by hand
(while (strchr(legalchars, *s)) s++;)...

[files away information that there are broken implementations of strtok...]
--
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Business: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Personal: ...!texbell!sugar!peter, peter@sugar.hackercorp.com.
mcdonald@uxe.cso.uiuc.edu (07/22/89)
Re:Re:Re:Re:Re:Re:Problems with scanf (and strtok)

I once (and only once!) wrote a "commercial" program.  It is for use by
dumb students (well, relatively - they have made it half way through a
Junior level Quantum Mechanics course and survived many 3 dimensional
PDE's, Hermite, Legendre, and Associated Laguerre polynomials and
mathematical induction), and I watched them use it; some of them,
still, in 1985, had never touched a computer before.  I finally gave up
trying to use ANY canned input routine, and wrote my own that scanned
the input character by character, giving a hopefully appropriate,
meaningful error message as they typed the offending character (not
requiring a carriage return before giving a message).

For programs that only I use, I use scanf all the time.  For programs
I buy, I like quick and efficient error messages.  Maybe we need a new
acronym: WYDIIWYSI: When you do it is when you see it!  It still
bothers me to see a C compiler issue 100 error messages when the
program contains only one bug!

Doug McDonald
friedl@vsi.COM (Stephen J. Friedl) (07/23/89)
In article <4563@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes:
> I tend to stick with strspn() and strtok(), myself.

In article <824@cbnewsl.ATT.COM>, mpl@cbnewsl.ATT.COM (michael.p.lindner) writes:
> Hope you never have to do anything complex.  If you call strtok on a string
> in the middle of a strtok of another string it trashes its state information
> on the first string (a little known feature), which can cause extremely
> elusive bugs.

In addition, strtok() considers multiple occurrences of the separating
token to be one, so you can't use it for obvious things like parsing
an /etc/passwd line.

One more thing.  If you use strtok to pick apart an environment
variable, be sure to copy the environment string somewhere before you
tear into it.  We had a program whose child processes were always
failing, and it drove us nuts until we realized that it was strtok
again.  We were picking apart $PATH earlier in the program, and since
strtok places NULs in the string, the environment was getting corrupted
for the child.

Neither of these are bugs -- they are documented parts of the
function -- but nevertheless we have been hit with these gotchas.

Steve
--
Stephen J. Friedl / V-Systems, Inc. / Santa Ana, CA / +1 714 545 6442
3B2-kind-of-guy / friedl@vsi.com / {attmail, uunet, etc}!vsi!friedl
---> vsi!bang!friedl <-- NEW

"Friends don't let friends run Xenix" - me
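As an illustration of the empty-field problem described above, here is a sketch of a splitter that keeps empty fields, so a passwd-style line with an empty password field still yields every column. split_fields is a hypothetical helper name, and the sample line in the usage note is made up.

```c
#include <string.h>

/* split_fields: hypothetical helper.  Splits line in place at every
   occurrence of sep (which must not be '\0'), keeping empty fields.
   Stores at most max field pointers and returns how many were stored.
   If the line has more than max fields, the extra ones are dropped. */
int split_fields(char *line, char sep, char **fields, int max)
{
    char *p = line;
    int n = 0;

    while (n < max) {
        fields[n++] = p;        /* a field starts here, possibly empty */
        p = strchr(p, sep);
        if (p == NULL)
            break;              /* that was the last field */
        *p++ = '\0';            /* terminate it and step past the separator */
    }
    return n;
}
```

Given "guest::100:50::/home/guest:/bin/sh", this returns seven fields, two of them empty, whereas strtok() with ":" skips the empty ones and shifts the later columns. Like strtok(), the helper writes NULs into its argument, so a string obtained from getenv() should be copied into a private buffer first, as the post advises.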
dal@midgard.Midgard.MN.ORG (Dale Schumacher) (08/04/89)
In article <4596@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
|In article <824@cbnewsl.ATT.COM>, mpl@cbnewsl.ATT.COM (michael.p.lindner) writes:
|> Hope you never have to do anything complex.  If you call strtok on a string
|> in the middle of a strtok of another string it trashes its state information
|> on the first string (a little known feature), which can cause extremely
|> elusive bugs.
|
|Sounds like something too fancy for scanf, too.  But thanks for the info... I've
|never gotten that complex with strtok.  By that time I'm usually stepping
|through the string by hand (while (strchr(legalchars, *s)) s++;)...

Be careful here.  Since strchr() will match the '\0' at the end of
legalchars, you may walk right out of the string if all characters are
legal!  I use a macro like the following:

    #define IN_SET(set,c) ((c) && strchr((set), (c)))

Note, this is NOT a "safe" macro, so be sure c has no side-effects...

|[files away information that there are broken implementations of strtok...]

The operation of strtok() described above is NOT broken, it's documented.
It is also somewhat less useful than it could be due to its "interesting"
quirks, but it IS defined as working that way.
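A small sketch of the IN_SET macro from the post in use, wrapped in a hypothetical helper named skip_legal. The point it shows is that the scan stops at the terminating '\0' even when every character is legal.

```c
#include <string.h>

/* The macro from the post.  It evaluates c twice, so the argument must
   not have side effects (no IN_SET(set, *s++)). */
#define IN_SET(set, c) ((c) && strchr((set), (c)))

/* skip_legal: hypothetical helper.  Returns a pointer to the first
   character of s that is not in legalchars, or to the terminating '\0'
   if every character is legal. */
const char *skip_legal(const char *s, const char *legalchars)
{
    while (IN_SET(legalchars, *s))
        s++;                    /* increment kept outside the macro */
    return s;
}
```

Without the (c) && guard, strchr() would match the terminator of legalchars once *s reached '\0', and the loop would run past the end of s. The standard strspn() function gives the same result as this loop, since s + strspn(s, legalchars) points at the same character.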
nather@ut-emx.UUCP (Ed Nather) (08/04/89)
In article <1122@midgard.Midgard.MN.ORG>, dal@midgard.Midgard.MN.ORG (Dale Schumacher) writes:
> The operation of strtok() described above is NOT broken, it's documented.
> It is also somewhat less useful than it could be due to its "interesting"
> quirks, but it IS defined as working that way.

Gosh, that makes programming really easy!  Just throw something
together, document all the bugs, and you're done!

In my view, the operation of strtok() --- and, to a considerable
extent, the operation of scanf() --- are both broken, documentation
notwithstanding.  I have totally avoided scanf() for 8 years, and will
continue to do so.  I wrote my own small versions of strtok() after
reading its description, so I have never used the "official" one.
--
Ed Nather
Astronomy Dept, U of Texas @ Austin