dan@bbncd@sri-unix.UUCP (07/22/83)
From: Dan Franklin <dan@bbncd>

UNIX developers were once justifiably proud of the fact that pipes were an INTERACTIVE way of coupling programs; if the standard input and output of a pipeline were the terminal, you could type a line at it and immediately see the result. But now that stdio is used everywhere, this feature has been "fixed": a pipeline is usually NOT interactive. I found this very annoying when I tried to use m4 with adb, and ended up adding a flag to m4 to force non-buffering. Bell must have encountered this problem too, since cat -u forces non-buffering. But cat -u is clearly wrong (even though Rob Pike blesses it); the problem is with stdio, not with individual commands, and we should not change all the UNIX commands that might be used in an interactive pipeline to implement -u.

The question is, what should be done? I have just encountered this problem again with a different program, and I would like to solve it once and for all. One solution would be to have stdio test an fd to see if it is a pipe (somehow), and line-buffer stdout if it is, treating it like a terminal. This could be pretty expensive compared to writing in 1-kbyte blocks, so it might be better to leave the default the way it is and provide a way to tell stdio to line-buffer on demand.

The obvious way to communicate with stdio, without changing every command, is to use the environment. Before writing, stdio would look in the environment to see if there were any special buffering instructions. Example:

	STDOUT=LINE_BUFFERED m4 | adb

Is this reasonable? Is there a better way?

	Dan Franklin
kdp@hplabs.UUCP (Ken Poulton) (07/25/83)
Yes! I remember interactive pipelines and they were very useful at times. 'cat -u' is certainly a kluge, and an environment variable does seem like the right way to control stdio buffering. Note that environment variables need not be set at all unless you wish to change from the default buffering.

I guess I would set up an alias to do the environment setting, so I could make a pipe run interactively with less typing:

	% ip m4 | adb

One might even consider having the shell set that environment variable for you whenever the final STDOUT came to the terminal.

	Ken Poulton
	...!hplabs!kdp
noel@cubsvax.UUCP (07/27/83)
I have run into the stdout buffering headache several times, most painfully when piping from an interactive fortran program. (Who wants to hack a -u flag into a large f77 program? I reluctantly did the equivalent.) My problem when using "script" or "tee" was that stdout wasn't flushed when the program did a read on stdin.

Is there a more general problem? I suppose in some cases you might need LINE buffering even when it's a unidirectional communication path too (the pipe reader wants to read line by line (why??)).

Considering just the "interactive" case: why couldn't stdio be smart enough to check for stdout being a pipe, then flush it when stdin is read? This way you wouldn't pay the performance penalty of ALWAYS line buffering on pipes, but would fix the (most common?) problem with the current scheme. What am I missing?

	Noel Kropf		harpo!rocky2!cubsvax!noel
	1002 Fairchild		philabs!cmcl2!rocky2!cubsvax!noel
	Columbia University	New York NY 10027	212-280-5517
dan@bbncd@sri-unix.UUCP (07/29/83)
From: Dan Franklin <dan@bbncd>

Thanks for the comments. Since no one came up with anything better than using the environment, that's what I will do. In case anyone else is interested in solving this problem, here's exactly how I plan to do it (when I finally get around to it). I plan to have three choices, each set by assigning a value to STDOUT in the environment:

	STDOUT=NBUF	# no buffering
	STDOUT=LBUF	# line buffering
	STDOUT=BBUF	# block buffering

If there is an explicit setbuf(stdout, ...) in the program, it seems like the environment setting ought to override it, at least in the case where the environment specifies less buffering than the setbuf. I plan to have it override all the time. The environment variable would also override stdio's determination of the buffering to use based on isatty(). The same mechanism will also work for STDERR. (STDERR=BBUF could be handy when you have a program producing lots of error output, all unbuffered.)

	Dan Franklin
gwyn%brl-vld@sri-unix.UUCP (07/31/83)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

There are two obvious choices to solve the problem of buffered output not appearing (i.e. being flushed out) before an input operation:

(1) Write the program so it fflush()es relevant output before trying the read (this works with all versions of stdio and puts the decision where it really belongs, in the programmer's lap);

(2) Write stdio itself so ALL output streams to terminals (and pipes) are flushed whenever input is attempted from ANY terminal (or pipe). There is no reason to handle stdin and stdout only, since that only helps in some cases but does not solve the general problem, if it is a problem.

Personally I favor alternative (1). According to UNIX System V manuals, Bell has adopted alternative (2) for terminals only, not pipes. Berkeley appears to have adopted (2) for stdout/stdin terminals only. I think programmers ought to learn to program carefully rather than rely on library code to make up for their sloppiness.
pdl@root44.UUCP (Dave Lukes) (08/01/83)
Agreed: stdio's buffering is painful. (BTW, to all of you out there in the dark ages: SIII and upwards ALSO have an m4 flag to do line buffering (yuk).)

The problem with the environment hack is that it doesn't really offer fine enough control; you probably don't want any temp. files etc. being line buffered just because the standard output is a pipe. I suppose it could be extended to allow you to say `gimme block buffering on files, single-character buffering on terminals, and line buffering on pipes', but the syntax would no doubt be so horrendous that no-one would use it.

(BTBTW: you can GUESS if it's a pipe by doing a seek on it and seeing if you get errno == ESPIPE, or, with SIII (rah, rah), you can see if ((statbuf.st_mode & S_IFMT) == S_IFIFO).)

'Nuff said (he he)

	Dave Lukes (...!vax135!ukc!root44!pdl)
henry@utzoo.UUCP (Henry Spencer) (08/03/83)
At one point, I believe the folks at Duke implemented an alternative to normal stdio buffering: all characters supplied by a single (e.g.) printf call went out as a single write, but there was no buffering across calls. They did this mostly to get the performance of big writes to ttys without the complexities of trying to guess when to flush the buffers, but they claimed that it bought most of the speed of full buffering without the annoyances. I don't believe they did this on pipes, but it might be worth considering. Any notions how much of a performance penalty it would involve? -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
mjl@ritcv.UUCP (Mike Lutz) (08/04/83)
I disagree with Doug Gwyn's suggestion that programmers should be responsible for fflush()ing "relevant" output before doing input; what is more, I don't think it's a question of sloppiness/neatness. Taken to an extreme, the use of libraries at all is sloppy (printf is the lazy way of getting around number conversions, and the window package is a crutch for those too indolent to keep track of what's on the screen). I don't think Doug meant to be so pedantic as to forbid all library routines, but I also believe the issues in an I/O package's functional design are subtle.

If programmers embed flushing decisions in their code, users are denied the option of solving the buffering problem THEIR way. Second, every programmer has a slightly different notion of when to call fflush(), so the behavior of commands in a pipeline will be even more unpredictable than now. (One could invent "standards" for the use of fflush(), but then why not relieve the programmer of an unnecessary burden and implement the standard as part of the library?) Finally, who has the time or patience to go through every program on the system inserting calls to fflush()?

All of this is not to suggest that I have the ultimate answer to the buffering problem; I don't (and I have reservations about the proposals that have been made). However, if and when a suitable answer becomes available, I hope it's part and parcel of the standard I/O package.

	Mike Lutz
	{allegra,seismo}!rochester!ritcv!mjl
trt@rti-sel.UUCP (08/05/83)
The stdout buffering discussion has occurred at least twice before on Usenet. Some blame stdio for being insufficiently clever, some blame programmers for not defending against stdio's cleverness; I blame Dennis Ritchie for not taking the buffering problem seriously. (He once said he never wanted people to use the 'setbuf' hack, that buffering in stdio should be transparent. Alas, it is not.)

HOW TO LIVE WITH STDIO BUFFERING

Programming 'correctly' in the presence of stdio buffering is a bit hard to define, but you probably need to put an fflush(stdout) before each of the following system calls:

	read, pause (also 'sleep'), fork, exec[lv], kill, wait, ioctl ('stty')

Oddly, exit(II) is OK since it "knows" about stdio. Oh yes, do not use lseek. If you are debugging, put an fflush in just before it core dumps. Oh, and do an fflush before a long-running computation. Do you need to fflush before a 'system'? It gets awful, doesn't it.

I have found that portability and efficiency are well served by turning on stdio buffering at the very beginning of main():

	setbuf(stdout, BUFFERED_STDIO);

That means I have to put fflush()es all over the place so the program works, but that is OK because if I do not, the program will not work in the presence of clever stdio systems anyway. For portability, the following are handy (but not perfect):

	#ifndef BUFFERED_STDIO
	#define BUFFERED_STDIO (char *)malloc((unsigned)BUFSIZ)
	#endif
	#ifndef UNBUFFERED_STDIO
	#define UNBUFFERED_STDIO (char *)NULL
	#endif

	Tom Truscott

P.S. I have an old article on what is wrong with stdio buffering and how it was fixed at Duke. Anyone want it re-submitted?
noel@cubsvax.UUCP (08/05/83)
How about a load-time option which would make stdio buffer line by line? One could "cc dumb.c -lflush" and /lib/libflush.a would be a version of some subset of stdio which did whatever one considers best -- I vote for either line-by-line or per-printf flushes on ALL output files. This would obviate the need for inserting fflush()es to extract the output of a program which bombs due to illegal behaviour like "bus error", "segmentation violation", funny asynchronous behaviour, etc.

Of course if you really felt you needed to optimize for file writes and only flush for pipes and ttys, you could have yet another version of the load library which did that. Of course the different versions of the library would be #ifdef compilations.

	Noel Kropf		harpo!rocky2!cubsvax!noel
	1002 Fairchild		philabs!cmcl2!rocky2!cubsvax!noel
	Columbia University	New York NY 10027	212-280-5517
gwyn@brl-vld@sri-unix.UUCP (08/13/83)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

Your approach certainly seems to assume the least about the details of any particular STDIO package's buffering algorithm, although it entails a lot of effort on the part of the programmer (which I find acceptable, but others haven't).

HOWEVER, some brain-damaged STDIO packages will not let you buffer more than a line at a time to terminals (and maybe pipes) even via setbuf(). Your suggestions still work in such a case, and that is the best one could hope for given that STDIO implementation. It is a pity, though, that one can't buffer up a screenload of output to a terminal when he is already taking such pains to setbuf() and fflush() appropriately.

It might be useful for you to re-post your memo on the subject, for those who haven't been receiving this list very long.