dana@rucs.runet.edu (Dana Eckart) (01/19/91)
Does there exist a piece of software (or is it even possible) to compile
a pipe?  In particular, suppose you had 
	ls -l | fgrep "Dec" | cut -f 4
is there any way to compile the above pipeline so that the pieces can
communicate more quickly?  I am looking for a general solution, not
one that works only for the above example.
The question arises because I have constructed some small programs which 
become VERY slow when piped together.  It appears that if I can get around 
the slow speed of standard (character based) i/o that things will be MUCH 
faster.
Although I suspect I am stuck (unless I rewrite my code - combining the
separate programs into a single program), perhaps some kind netter will be
able to save me a great deal of grief.
Thanks in advance...
J Dana Eckart     INTERNET: dana@rucs.runet.edu
                     SNAIL: P.O. Box 10865/Blacksburg, VA  24062-0865
barmar@think.com (Barry Margolin) (01/19/91)
In article <1991Jan18.193234.216@rucs.runet.edu> dana@rucs.runet.edu (Dana Eckart) writes:
>Does there exist a piece of software (or is it even possible) to compile
>a pipe?  In particular, suppose you had
>
>	ls -l | fgrep "Dec" | cut -f 4
>
>is there anyway to compile the above pipeline so that the pieces can
>communicate more quickly.  I am looking for a general solution, not
>one that works only for the above example.
I'm not really sure I (or you) understand what you expect the pipe to be
compiled into.  On Unix, each program has to be run in its own process,
so they're going to have to use some form of inter-process communication
to feed the data to each other.  There are shell script compilers, but
all they do is save the overhead of parsing the commands and interpreting
shell built-ins; the compiled script still runs each command in its own
process and sets up pipes for them to communicate.
>The question arises because I have constructed some small programs which
>become VERY slow when piped together.  It appears that if I can get around
>the slow speed of standard (character based) i/o that things will be MUCH
>faster.
If the programs that are used in the pipeline do character-at-a-time I/O,
then speeding up the pipeline isn't going to help.  Compiling the pipeline
wouldn't change the programs; they'll still be doing character I/O.
I strongly doubt that the speed of the pipe is the limiting factor; this
is a pretty simple mechanism whose performance is extremely important to
most Unix implementors.  I just timed the following on a Sun-4/330
running SunOS 4.0.3:
	cat file file file | cat >/dev/null
"file" is a 4Mb file on an NFS server.  The SunOS version of "cat" uses
mmap() to read in files named as arguments, so once it is all paged into
memory (I ran the command until it got zero page faults) nearly all the
overhead should be in the pipe (about 95% of the CPU time was system
time, and I doubt I was spending much time in the null device driver).
I was getting about 4Mbyte/CPU-second throughput.
And I think most stdio implementations don't actually do
character-at-a-time I/O.  getc() and putc() are usually implemented as
macros that read/write a buffer, and don't actually do any I/O until the
buffer is empty/full (putc()'s output buffer will also be flushed if you
call fflush()).
>Although I suspect I am stuck (unless I rewrite my code - combining the
>pieces programs into a single program), perhaps some kind netter will be
>able to save me a great deal of grief.
Have you actually profiled your programs and found that they are spending
most of their time doing I/O to pipes?
--
Barry Margolin, Thinking Machines Corp.
barmar@think.com
{uunet,harvard}!think!barmar
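The buffering point can be illustrated with a small sketch.  It is
written in Python rather than C stdio, and CountingSink is a hypothetical
stand-in for the write end of a pipe: it does no real I/O and only counts
how often the underlying write is called.  The 4096-byte buffer size is
an assumption chosen for the example, not a value taken from any
particular stdio implementation.

```python
import io


class CountingSink(io.RawIOBase):
    """Hypothetical stand-in for a pipe's write end; counts write() calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, data):
        # Each call here models one write(2) system call.
        self.calls += 1
        self.nbytes += len(data)
        return len(data)


def write_unbuffered(payload):
    """Hand the sink one byte per call, like true character-at-a-time I/O."""
    sink = CountingSink()
    for i in range(len(payload)):
        sink.write(payload[i:i + 1])
    return sink.calls


def write_buffered(payload, bufsize=4096):
    """Write one byte at a time through a buffer, as putc() typically does."""
    sink = CountingSink()
    out = io.BufferedWriter(sink, buffer_size=bufsize)
    for i in range(len(payload)):
        out.write(payload[i:i + 1])
    out.flush()  # like fflush(): push out whatever is still buffered
    return sink.calls
```

With a 10000-byte payload the unbuffered version reaches the sink 10000
times, while the buffered one reaches it only a handful of times.  If a
program really is slow in a pipeline, the number of underlying writes is
one of the first things worth measuring.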
darcy@druid.uucp (D'Arcy J.M. Cain) (01/19/91)
In article <1991Jan18.193234.216@rucs.runet.edu> Dana Eckart writes:
>Does there exist a piece of software (or is it even possible) to compile
>a pipe?  In particular, suppose you had
>	ls -l | fgrep "Dec" | cut -f 4
>is there anyway to compile the above pipeline so that the pieces can
>communicate more quickly.  I am looking for a general solution, not
>one that works only for the above example.
I don't see how.  Any program that was created from the above line would
have to do everything the shell does when it sees that line and that
program has to be loaded and run as well.  If anything such a program
would slow it down.
Just a thought BTW - are you running out of memory?  If you are right
at the low limit you may be swapping when you get a large enough pipe.
My motherboard died recently and I have been running on a borrowed one
with less memory and I see a lot of slowdown with it.  The swapping is
quite noticeable.
--
D'Arcy J.M. Cain (darcy@druid) |
D'Arcy Cain Consulting         |   There's no government
West Hill, Ontario, Canada     |   like no government!
+1 416 281 6094                |
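What "everything the shell does" amounts to can be sketched in a few
lines.  This is an illustration in Python, not a description of any real
shell or script compiler; run_pipeline is a hypothetical helper name.
The point is that a program generated from the pipeline would still have
to start one process per command and join them with pipes.

```python
import subprocess


def run_pipeline(stages):
    """Run each argv list in `stages` as its own process, joined by pipes.

    Returns (output of the last stage, list of exit codes).
    """
    if not stages:
        raise ValueError("need at least one command")
    procs = []
    prev_out = None
    for argv in stages:
        # The first stage reads nothing; later stages read the previous pipe.
        stdin = subprocess.DEVNULL if prev_out is None else prev_out
        p = subprocess.Popen(argv, stdin=stdin, stdout=subprocess.PIPE)
        if prev_out is not None:
            # The parent drops its copy of the read end so that end-of-file
            # and broken-pipe conditions reach the children properly.
            prev_out.close()
        prev_out = p.stdout
        procs.append(p)
    output = procs[-1].stdout.read()
    procs[-1].stdout.close()
    codes = [p.wait() for p in procs]
    return output, codes
```

Called as run_pipeline([["ls", "-l"], ["fgrep", "Dec"], ["cut", "-f", "4"]]),
it would start three processes and two pipes between them, exactly the
work the shell was already doing.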
raja@bombay.cps.msu.edu (Narayan S. Raja) (01/19/91)
In article <1991Jan18.230530.9331@convex>, (Tom Christiansen) writes:
< From the keyboard of dana@rucs.runet.edu (Dana Eckart):
< :Does there exist a piece of software (or is it even possible) to compile
< :a pipe?  In particular, suppose you had 
< :
< :	ls -l | fgrep "Dec" | cut -f 4
< :
< :is there anyway to compile the above pipeline so that the pieces can
< :communicate more quickly.  I am looking for a general solution, not
< :one that works only for the above example.
< In general, the answer to whether things like this can be automagically
< compiled is no, because you can't know what all the pieces are a priori.
However, wouldn't pipes be speeded up considerably
on a Sun by mounting /tmp as a tmpfs filesystem
(i.e. memory-based filesystem)?  Apparently tmpfs
is *really* quick under SunOS 4.1.1.
Pardonnez-moi if this is a dumb suggestion.
Narayan Sriranga Raja.
mike (Michael Stefanik) (01/20/91)
In article <1991Jan18.193234.216@rucs.runet.edu> rucs.runet.edu!dana (Dana Eckart) writes:
>
>Does there exist a piece of software (or is it even possible) to compile
>a pipe?  In particular, suppose you had
>
>	ls -l | fgrep "Dec" | cut -f 4
>
>is there anyway to compile the above pipeline so that the pieces can
>communicate more quickly.  I am looking for a general solution, not
>one that works only for the above example.
Unless I'm reading you wrong, you seem to think that pipes are some coded
mechanism for communication between processes; it isn't.  An (anonymous)
pipe is a temporary entity created in the filesystem by the kernel on
behalf of two related processes that want to communicate.  It is useful
to think of a pipe as a regular file, in which one process is writing to on
one end, and another process is reading from on the other end.
Typically, a pipe can buffer up to about 5K of data flowing through the
pipe.  When the pipe "fills up", the writing process is blocked until
the reading process reads from the pipe.  Similarly, the reading process
will block on an empty pipe, until the writing process writes something.
Should the reading process die and the writing process attempt to write
on the pipe, a signal will be sent (SIGPIPE) to the offending writing
process (which tells it that there is no longer anything out there to
read from the pipe).  If this wasn't done, the writing process would deadlock
when the pipe buffer filled, waiting for a reading process that no longer
existed.
So, after this brief overview of piping, the answer is, no, you cannot
"compile" pipes to increase the speed of reads and writes to the pipe.
An excellent reference would be Bach's book on UNIX System V.
--
Michael Stefanik, Systems Engineer (JOAT), Briareus Corporation
UUCP: ...!uunet!bria!mike
--
technoignorami (tek'no-ig'no-ram`i) a group of individuals that are constantly
found to be saying things like "Well, it works on my DOS machine ..."
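The buffering and SIGPIPE behaviour described above can be probed
directly.  The following is a rough sketch in Python, assuming a
POSIX-like system; pipe_capacity and write_after_reader_gone are
hypothetical helper names.  The capacity it reports depends on the
kernel, so the "about 5K" figure will not match every system.

```python
import os


def pipe_capacity(chunk=512):
    """Fill a pipe nobody reads from and report how many bytes it accepted.

    The write end is made non-blocking, so a full pipe raises
    BlockingIOError instead of putting the writer to sleep.
    """
    r, w = os.pipe()
    os.set_blocking(w, False)
    total = 0
    try:
        while True:
            total += os.write(w, b"x" * chunk)
    except BlockingIOError:
        pass  # the pipe is full; a blocking writer would stall here
    finally:
        os.close(r)
        os.close(w)
    return total


def write_after_reader_gone():
    """Write to a pipe whose read end is closed; return True on EPIPE.

    CPython ignores SIGPIPE by default, so the failed write shows up as
    BrokenPipeError rather than killing the process, which is what the
    signal would do to a typical C program.
    """
    r, w = os.pipe()
    os.close(r)
    try:
        os.write(w, b"hello")
    except BrokenPipeError:
        return True
    finally:
        os.close(w)
    return False
```

The 512-byte chunk is an assumption chosen to stay within the smallest
PIPE_BUF that POSIX allows, so that each non-blocking write is accepted
whole or not at all.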
tchrist@convex.COM (Tom Christiansen) (01/20/91)
From the keyboard of uunet!bria!mike (Michael Stefanik):
:Unless I'm reading you wrong, you seem to think that pipes are some coded
:mechanism for communication between processes; it isn't.  An (anonymous)
:pipe is a temporary entity created in the filesystem by the kernel on
:behalf of two related processes that want to communicate.
No, it's not.  On BSD systems, pipe(2) is implemented as a
semi-disabled version of socketpair(2).  It's all IPC -- no
filesystem activity is involved.  All the world is not a Vax,
nor is it SysV.
:It is useful
:to think of a pipe as a regular file, in which one process is writing to on
:one end, and another process is reading from on the other end.
:
:Typically, a pipe can buffer up to about 5K of data flowing through the
4k on my system.
:pipe.  When the pipe "fills up", the writing process is blocked until
:the reading process reads from the pipe.  Similarly, the reading process
:will block on an empty pipe, until the writing process writes something.
:Should the reading process die and the writing process attempt to write
:on the pipe, a signal will be sent (SIGPIPE) to the offending writing
:process (which tells it that there is no longer anything out there to
:read from the pipe).  If this wasn't done, the writing process would deadlock
:when the pipe buffer filled, waiting for a reading process that no longer
:existed.
This is all true and useful information.  (As far as I know.)
:So, after this brief overview of piping, the answer is, no, you cannot
:"compile" pipes to increase the speed of reads and writes to the pipe.
But you can often rearrange your program so it doesn't shove the data
through a bunch of processes' address spaces.  A good example is the
slow old makewhatis script, which runs much faster when coded to do
the work entirely in one process.
--tom
--
"Hey, did you hear Stallman has replaced /vmunix with /vmunix.el?  Now
he can finally have the whole O/S built-in to his editor like he
always wanted!"
--me (Tom Christiansen <tchrist@convex.com>)
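As a small illustration of doing the work in one process, here is a
sketch of the `fgrep "Dec" | cut -f 4` part of the original pipeline
folded into a single function.  It is Python, grep_and_cut is a
hypothetical name, and it only approximates the two tools: a plain
substring match for fgrep, and cut's default tab delimiter with
undelimited lines passed through whole.

```python
def grep_and_cut(lines, needle="Dec", field=4, delim="\t"):
    """Yield field number `field` of every line that contains `needle`.

    Both steps happen in one pass in one process, so no data is copied
    through a pipe between them.
    """
    for line in lines:
        line = line.rstrip("\n")
        if needle not in line:
            continue  # the fgrep step: drop non-matching lines
        if delim not in line:
            # cut prints lines with no delimiter unchanged (unless -s is given)
            yield line
            continue
        parts = line.split(delim)
        # the cut step: an empty string if the line has too few fields
        yield parts[field - 1] if field <= len(parts) else ""
```

Something like `for out in grep_and_cut(open("listing.txt")): print(out)`
would then replace two of the three processes, where listing.txt is a
made-up file name holding the saved `ls -l` output.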
gwyn@smoke.brl.mil (Doug Gwyn) (01/20/91)
In article <1991Jan19.072755.3291@msuinfo.cl.msu.edu> raja@cpswh.cps.msu.edu writes:
>However, wouldn't pipes be speeded up considerably
>on a Sun by mounting /tmp as a tmpfs filesystem
No, genuine pipes are NOT files in /tmp!
mike@bria (01/20/91)
Tom Christiansen writes:
>From the keyboard of uunet!bria!mike (Michael Stefanik):
>>Unless I'm reading you wrong, you seem to think that pipes are some coded
>>mechanism for communication between processes; it isn't.  An (anonymous)
>>pipe is a temporary entity created in the filesystem by the kernel on
>>behalf of two related processes that want to communicate.
>
>No, it's not.  On BSD systems, pipe(2) is implemented as a
>semi-disabled version of socketpair(2).  It's all IPC -- no
>filesystem activity is involved.  All the world is not a Vax,
>nor is it SysV.
Yup, and as I was writing this I thought of mentioning it, but decided
not to (someone who's having troubles with SV3 pipes ain't gonna glean
much from the BSD socket mechanism.)
However, let's not forget that member of the pipe family that is a
certain member of the filesystem, namely, the "named pipe".
Hmmm ... there are anonymous pipes and named pipes.  How about the
"incognito pipe"?
--
Michael Stefanik, Systems Engineer (JOAT), Briareus Corporation
UUCP: ...!uunet!bria!mike
--
technoignorami (tek'no-ig'no-ram`i) a group of individuals that are constantly
found to be saying things like "Well, it works on my DOS machine ..."
guy@auspex.auspex.com (Guy Harris) (01/21/91)
>However, let's not forget that member of the pipe family that is a
>certain member of the filesystem, namely, the "named pipe".
It may be a member of the file system, but in many flavors of UNIX -
including SunOS and, I think, S5R4 - a named pipe may have a *name*
that's in the file system, but I/O to or from a named pipe doesn't go
through the file system.
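A named pipe is easy to try out.  The sketch below is Python on a
POSIX-like system; fifo_roundtrip and the file name demo.fifo are made
up for the example.  It shows that the FIFO has a directory entry that
stat() reports as a FIFO, and that bytes written at one end come out at
the other.  It cannot show where the kernel keeps the data in between,
which is the part that varies between UNIX flavors.

```python
import os
import stat
import tempfile
import threading


def fifo_roundtrip(message=b"hello through a FIFO\n"):
    """Create a named pipe, send `message` through it, and read it back.

    Returns (whether stat() reports a FIFO, the bytes read back).
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.fifo")
        os.mkfifo(path)  # creates the name in the file system
        is_fifo = stat.S_ISFIFO(os.stat(path).st_mode)

        def writer():
            # Opening for writing blocks until a reader opens the FIFO.
            with open(path, "wb") as f:
                f.write(message)

        t = threading.Thread(target=writer)
        t.start()
        # Opening for reading blocks until the writer opens the FIFO.
        with open(path, "rb") as f:
            data = f.read()  # returns at EOF, once the writer closes
        t.join()
        return is_fifo, data
```

A thread plays the writer here only to keep the example in one process;
in practice the two ends would normally be unrelated processes that
agree on the path.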
klaus@cnix.uucp (klaus u schallhorn) (01/21/91)
In article <373@bria> uunet!bria!mike (Michael Stefanik) writes:
>In article <1991Jan18.193234.216@rucs.runet.edu> rucs.runet.edu!dana (Dana Eckart) writes:
>>
>>Does there exist a piece of software (or is it even possible) to compile
>>a pipe?  In particular, suppose you had
>>
>>	ls -l | fgrep "Dec" | cut -f 4
>>
>>is there anyway to compile the above pipeline so that the pieces can
>>communicate more quickly.  I am looking for a general solution, not
>>one that works only for the above example.
>
>Unless I'm reading you wrong, you seem to think that pipes are some coded
>mechanism for communication between processes; it isn't.  An (anonymous)
>pipe is a temporary entity created in the filesystem by the kernel on
>behalf of two related processes that want to communicate.  It is useful
>to think of a pipe as a regular file, in which one process is writing to on
>one end, and another process is reading from on the other end.
>
But only to THINK of a pipe as a file, under unix there never IS a file.
The fact that pipes are implemented as files on certain other operating
systems probably led to confusion someplace.
klaus
>--
>Michael Stefanik, Systems Engineer (JOAT), Briareus Corporation
>UUCP: ...!uunet!bria!mike
>--
>technoignorami (tek'no-ig'no-ram`i) a group of individuals that are constantly
>found to be saying things like "Well, it works on my DOS machine ..."
--
George Orwell was an Optimist
guy@auspex.auspex.com (Guy Harris) (01/23/91)
>But only to THINK of a pipe as a file, under unix there never IS a file.
Well, under *some* versions of UNIX (V7, S3, and I think it's still true
in S5, prior to S5R4), a pipe is sort-of implemented as a file, complete
with an inode *and* a list of N direct blocks pointed to by that inode;
those blocks really do end up containing the data in the file, although
if you're not unlucky, the data will be consumed before the block ever
has to be written to disk.
rorex@locus.com (Phil Rorex) (01/24/91)
>In article <1991Jan18.193234.216@rucs.runet.edu> Dana Eckart writes:
>>Does there exist a piece of software (or is it even possible) to compile
>>a pipe?  In particular, suppose you had
>>	ls -l | fgrep "Dec" | cut -f 4
>>is there anyway to compile the above pipeline so that the pieces can
>>communicate more quickly.  I am looking for a general solution, not
>>one that works only for the above example.
>
>
>I don't see how.  Any program that was created from the above line would
>have to do everything the shell does when it sees that line and that
>program has to be loaded and run as well.  If anything such a program
>would slow it down.
>
>Just a thought BTW - are you running out of memory?  If you are right
                              ^^^^^^^^^^^^^^^^^^^^^
I agree.  Don't overlook this.
>at the low limit you may be swapping when you get a large enough pipe.
>My motherboard died recently and I have been running on a borrowed one
>with less memory and I see a lot of slowdown with it.  The swapping is
>quite noticeable.
>
>--
>D'Arcy J.M. Cain (darcy@druid) |
>D'Arcy Cain Consulting         |   There's no government
>West Hill, Ontario, Canada     |   like no government!
>+1 416 281 6094                |
I've split up many a long pipe because of excessive paging.  On a heavily
loaded machine,
	ls -l > /tmp/tmp.$$.1
	fgrep "Dec" < /tmp/tmp.$$.1 > /tmp/tmp.$$.2
	cut -f 4 < /tmp/tmp.$$.2
	rm /tmp/tmp.$$.[12] &
can get in and out before all the pieces of the pipeline
	ls -l | fgrep "Dec" | cut -f 4
ever even get loaded in.
BTW, I've found egrep to be faster than fgrep on the paging Unixes I've
been on for scenarios like yours.
Your mileage may vary.
--
                 _
+1 213 337-5062 |_) |_  . |   ...!{ucla-se|uunet}!lcc!rorex
Phillip Rorex   |   | ( | |   rorex@locus.com
Disclaimer: I speak only for myself
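The same one-stage-at-a-time idea can be sketched outside the shell.
This is Python, run_staged is a hypothetical helper name, and anonymous
temporary files stand in for the /tmp/tmp.$$.N files: each command runs
to completion before the next one starts, so only one of them needs to
be resident at a time.

```python
import contextlib
import subprocess
import tempfile


def run_staged(stages):
    """Run each argv list in `stages` to completion, one after another.

    Each stage writes its output to a temporary file, which becomes the
    next stage's standard input.  Returns the last stage's output.
    """
    with contextlib.ExitStack() as stack:
        prev = None
        for argv in stages:
            out = stack.enter_context(tempfile.TemporaryFile())
            stdin = subprocess.DEVNULL if prev is None else prev
            # check=True raises CalledProcessError if a stage fails.
            subprocess.run(argv, stdin=stdin, stdout=out, check=True)
            out.seek(0)  # rewind so the next stage reads from the start
            prev = out
        return prev.read() if prev is not None else b""
```

Whether this beats a real pipeline depends on the machine.  It trades
concurrency for a smaller working set, which is only a win when paging
is the bottleneck.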