jwu@kepler.com (Jasper Wu) (05/15/91)
I am having a problem with pipelined tar in bash and hope someone can help me find out why.

When I do

    zcat foo.tar.Z | tar tvfB -

or

    uncompress < foo.tar.Z | tar tvfB -

it gives me the table of contents correctly but reports the error message "Broken pipe" on stderr when it finishes. However, if I add a "cat" to the pipeline, as in

    cat foo.tar.Z | uncompress | tar tvfB -

then it works fine (i.e., no error message). All of the commands above work fine in csh or sh.

The machine is a SPARCstation running SunOS 4.1.1; bash was compiled with Sun's cc. The particular file in my case is bison-1.14.tar.Z (245 KB), if that matters.

Any ideas? Has anyone experienced the same problem? Did I miss something obvious (I'm new to bash)? Any comments will be greatly appreciated.

--jasper
weimer@garden.ssd.kodak.com (Gary Weimer (253-7796)) (05/15/91)
In article <586@kepler1.kepler.com>, jwu@kepler.com (Jasper Wu) writes:
|>
|> I have some problem when using pipelined tar in bash and hope someone
|> can help me find out why.
|>
|> When I do
|>     zcat foo.tar.Z | tar tvfB -
|> or  uncompress < foo.tar.Z | tar tvfB -
|>
|> it gives me the table of contents correctly but reports an error
|> message "Broken pipe" to stderr when it finishes. However, if i add
|> a "cat" to the pipeline as
|>     cat foo.tar.Z | uncompress | tar tvfB -
|>
|> then it works fine (i.e., no error message). All commands above work fine
|> in csh or sh.

In csh I use:

    uncompress -c foo.tar.Z | tar tvf -
                                   ^^
(I've never had a problem not using B for tar). I don't know if this will fix your problem or not.

weimer@ssd.kodak.com ( Gary Weimer )
heinz@cc.univie.ac.at (05/17/91)
In <1991May15.155040.19078@ssd.kodak.com> weimer@garden.ssd.kodak.com (Gary Weimer (253-7796)) writes:

>In article <586@kepler1.kepler.com>, jwu@kepler.com (Jasper Wu) writes:
>|>
>|> I have some problem when using pipelined tar in bash and hope someone
>|> can help me find out why.
>|>
>|> When I do
>|>     zcat foo.tar.Z | tar tvfB -
>|> or  uncompress < foo.tar.Z | tar tvfB -
>|>
>|> it gives me the table of contents correctly but reports an error
>|> message "Broken pipe" to stderr when it finishes. However, if i add
>|> a "cat" to the pipeline as
>|>     cat foo.tar.Z | uncompress | tar tvfB -
>|>
>|> then it works fine (i.e., no error message). All commands above work fine
>|> in csh or sh.

>In csh I use:
>    uncompress -c foo.tar.Z | tar tvf -
>                                   ^^
>(I've never had a problem not using B for tar). I don't know if this
>will fix your problem or not.

This does *not* fix the problem. I've been using bash for quite a long time now, and have had the same experience as Jasper. For bash, it doesn't make any difference whether you use 'uncompress -c' or 'zcat' (zcat, compress and uncompress are links to the same program), and it doesn't make any difference whether you use 'tar' with -B or not.

I'm sorry I can't give a solution to the problem. All I know is that bash reports a broken pipe if one of the processes making up the pipe is killed or terminated abnormally (well, it doesn't have to be terminated abnormally, it only needs to be terminated).

As I haven't had time to look into bash's source code yet, I don't know whether the actual problem is a bug in bash or abnormal behaviour of zcat that the other shells don't reveal. It might also be a 'timing problem', as zcat surely terminates before tar, and this may cause bash to report the pipe as being broken.

--
Heinz M. Herbeck                 / Trust me, I know
heinz@sophie.pri.univie.ac.at    / what I'm doing !
Vienna University, Austria       / (Sledge Hammer)
byron@archone.tamu.edu (Byron Rakitzis) (05/17/91)
In article <heinz.674472878@cc.univie.ac.at> heinz@cc.univie.ac.at writes:
>In <1991May15.155040.19078@ssd.kodak.com> weimer@garden.ssd.kodak.com (Gary Weimer (253-7796)) writes:
>>In article <586@kepler1.kepler.com>, jwu@kepler.com (Jasper Wu) writes:
>>|> I have some problem when using pipelined tar in bash and hope someone
>>|> can help me find out why.
>>|>
>>|> When I do
>>|>     zcat foo.tar.Z | tar tvfB -
>>|> or  uncompress < foo.tar.Z | tar tvfB -
>>|>
>>|> it gives me the table of contents correctly but reports an error
>>|> message "Broken pipe" to stderr when it finishes. However, if i add
>>|> a "cat" to the pipeline as
>>|>     cat foo.tar.Z | uncompress | tar tvfB -
>>|> then it works fine (i.e., no error message). All commands above work fine
>>|> in csh or sh.
>
>>    uncompress -c foo.tar.Z | tar tvf -
>>                                   ^^
>
>This does *not* fix the problem. [...]
>I'm sorry I can't give a solution to the problem - all that I
>know is that bash reports a broken pipe if one of the processes
>making up the pipe is killed or terminated abnormally (well, it
>doesn't have to be terminated abnormally, it only needs to be
>just terminated). [...]
>It also might be a 'timing problem' as zcat surely terminates
>before tar, and this may cause bash to report the pipe as being broken.

Come again? zcat surely terminates before tar? Surely not! The answer to all this is quite simple. When I do:

    cat /usr/dict/words | sed 10q

what happens? sed reads 10 lines of input, and then quits. However, cat does not know it is writing to a pipe, so it keeps dumping stuff to its stdout. Something has to stop it, so Unix has a signal called SIGPIPE. cat therefore dies with the signal "SIGPIPE". Some shells report this with a "broken pipe" message, because the tail of the pipe died before the head.

Now, I have not looked at the bash source either, but my guess is that the code does not check the status of each exiting pipeline member, because in the line

    cat foo.tar.Z | uncompress | tar ft -

the "uncompress|tar" section of the pipeline will break when tar finishes printing the table of contents of the tar file.

Finally, if I type

    (echo hi; echo there) | sed 1q

then a shell will most likely not report any error, since "hi\n" and "there\n" will fit into the pipe buffer.

Hope this clears things up a little.

(In my opinion, a shell should never report broken pipes, since they are usually a part of normal operation. However, it can be handy to check the exit status of all pipe members. I have written a shell which does this:

    cat /usr/dict/words | tail -r | exit 42
    echo $status

prints the output

    0 sigpipe 42

but does not report the broken pipe explicitly.)

--
Byron Rakitzis
byron@archone.tamu.edu
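The mechanism Byron describes can be observed directly. What follows is only a minimal sketch, assuming a POSIX sh, the `yes` and `mktemp` utilities, and SIGPIPE left at its default action; the variable names are illustrative and do not come from the thread. A writer whose output fits into the pipe buffer exits normally, while one that keeps writing after the reader has gone is terminated.

```shell
#!/bin/sh
# Plain POSIX sh has no per-member status array for pipelines, so each
# writer's subshell records the writer's exit status in a temporary file.
small_file=$(mktemp)
big_file=$(mktemp)

# Case 1: nine bytes fit in the pipe buffer, so the writer finishes normally
# even though the reader (sleep) never reads anything.
( printf 'hi\nthere\n'; echo $? >"$small_file" ) | sleep 1

# Case 2: yes writes forever. Once sleep exits, no process has the pipe open
# for reading, and the pending write fails with SIGPIPE (or with EPIPE if
# the signal happens to be ignored).
( yes; echo $? >"$big_file" ) | sleep 1

small_status=$(cat "$small_file")
big_status=$(cat "$big_file")
rm -f "$small_file" "$big_file"

echo "small writer status:   $small_status"
echo "endless writer status: $big_status"
```

On most systems the second status is 141, that is 128 + 13 with SIGPIPE being signal 13, which is how sh-style shells encode death by signal in `$?`; the exact number depends on the shell and platform.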
byron@archone.tamu.edu (Byron Rakitzis) (05/20/91)
Heinz (heinz@cc.univie.ac.at) sent me some personal mail which I could not reply to (is there another address I could use to get mail to you, Heinz?). However, he raised an interesting point:

Given a pipeline

    foo | tar ft -

it seems clear that tar must read to EOF in order to determine whether the tar file that foo writes has come to an end or not. Therefore a normal instance of

    foo | tar ft -

should not cause a pipe to break, since tar will always terminate after foo. I have no clue why tar is exiting prematurely. If anyone can shed light on the matter, I think Heinz and I would appreciate it. (Servus, Heinz!)

--
Byron Rakitzis
byron@archone.tamu.edu
pfalstad@phoenix.princeton.edu (Paul Falstad) (05/21/91)
byron@archone.tamu.edu (Byron Rakitzis) wrote:
>Heinz (heinz@cc.univie.ac.at) sent me some personal mail which I could
>not reply to (is there another address I could use to get mail to you,
>Heinz?). However, he raised an interesting point:
>
>Given a pipeline
>
>    foo | tar ft -
>
>it seems clear that tar must read to EOF in order to determine whether
>the tar file that foo writes has come to an end or not. Therefore a
>normal instance of
>
>    foo | tar ft -
>
>should not cause a pipe to break, since tar will always terminate after
>foo. I have no clue why tar is exiting prematurely. If anyone can shed
>light on the matter, I think Heinz and I would appreciate it. (Servus,
>Heinz!)

I don't know the tarfile format, since I don't have source, but let's see:

    % ls a
    b c
    % tar cvf foo a
    a/
    a/b
    a/c
    % tar tvf foo
    drwxr-xr-x pfalstad/student  0 May 20 20:05 1991 a/
    -rw-r--r-- pfalstad/student 29 May 20 20:05 1991 a/b
    -rw-r--r-- pfalstad/student 29 May 20 20:05 1991 a/c
    % ls -l foo
    -rw-r--r-- 1 pfalstad 10244 May 20 20:13 foo

(that's quite a big file for only 58 bytes of data. Must be lots of padding at the end)

    % cat /etc/motd /usr/dict/words >>foo
    % tar tvf foo
    drwxr-xr-x pfalstad/student  0 May 20 20:05 1991 a/
    -rw-r--r-- pfalstad/student 29 May 20 20:05 1991 a/b
    -rw-r--r-- pfalstad/student 29 May 20 20:05 1991 a/c
    % man tar | sed -n 242,246p
    If there are multiple archive files on a tape, each is
    separated from the following one by an EOF marker. tar
    does not read the EOF mark on the tape after it finishes
    reading an archive file because tar looks for a special
    header to decide when it has reached the end of the archive.

Now if

    % ...

--
Paul Falstad                     | 10 PRINT "PRINCETON CS"
pfalstad@phoenix.princeton.edu   | 20 GOTO 10
heinz@cc.univie.ac.at (05/22/91)
In <16345@helios.TAMU.EDU> byron@archone.tamu.edu (Byron Rakitzis) writes:

>Heinz (heinz@cc.univie.ac.at) sent me some personal mail which I could
>not reply to (is there another address I could use to get mail to you,
>Heinz?). However, he raised an interesting point:

Try one of the following:

    heinz@sophie.pri.univie.ac.at   (<-- preferred)
    hh@eacpc1.tuwien.ac.at
    A4424GAF at AWIUNI11.BITNET
    herbeck@rice.edu

>Given a pipeline
>    foo | tar ft -
>it seems clear that tar must read to EOF in order to determine whether
>the tar file that foo writes has come to an end or not. Therefore a
>normal instance of
>    foo | tar ft -
>should not cause a pipe to break, since tar will always terminate after
>foo. I have no clue why tar is exiting prematurely. If anyone can shed
>light on the matter, I think Heinz and I would appreciate it. (Servus,
>Heinz!)

Yep, I do appreciate it. (Servus, Byron ! :)

I looked up the format of a tar-file (tar(5)), which is as follows:

    A ``tar tape'' or file is a series of blocks. Each block is of
    size TBLOCK. A file on the tape is represented by a header block
    which describes the file, followed by zero or more blocks which
    give the contents of the file. At the end of the tape are two
    blocks filled with binary zeros, as an EOF indicator.

    The header block looks like:

        #define TBLOCK 512
        #define NAMSIZ 100

        union hblock {
                char dummy[TBLOCK];
                struct header {
                        char name[NAMSIZ];
                        char mode[8];
                        char uid[8];
                        char gid[8];
                        char size[12];
                        char mtime[12];
                        char chksum[8];
                        char linkflag;
                        char linkname[NAMSIZ];
                } dbuf;
        };

(quoted from the man-page)

This proves what was intuitively clear: there's no 'directory' contained in a tar-file (how would you efficiently maintain a directory on a physical tape ? :) So tar has to scan the entire output from the first process in the pipe and terminates after this process.

This does not explain the broken pipe, though. I tried the following:

    cat <some_long_file> | more

and killed 'more' by pressing 'q' at the first prompt (so more terminates first). No 'Broken Pipe'. Then I tried:

    echo Hallo | (sleep 10; more)   # first process terminates first, since
                                    # 'Hallo' should fit into the pipe's buffer

No 'Broken Pipe' either. So is this problem specific to tar ???? I have not encountered it anywhere else yet.

Maybe I should take the time and hack up the source code of bash, but I'm not sure if it's worth the effort. Anyone who might have a clue please let me know. It doesn't really bother me if a pipe breaks (unless it happens in my bathroom :), but it is something that shouldn't happen, and I wonder why it does.

Greetings, HH
djm@eng.umd.edu (David J. MacKenzie) (05/23/91)
I think the reason for the broken pipe is that 'tar tf -' ignores the padding at the end of the tar file; tar files are always an even multiple of the blocksize in length, so they get padded with garbage or nulls. The 'foo' program writes the nulls into the pipe, but tar never reads them. -- David J. MacKenzie <djm@eng.umd.edu> <djm@ai.mit.edu>
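The padding David mentions is easy to measure. This is a rough sketch only, assuming a POSIX sh and a `tar` that accepts `cf` and `-C` (GNU tar, bsdtar and BusyBox tar do); the file and variable names are made up for illustration. It archives a six-byte file and reports how large the archive is and how much of it consists of NUL bytes.

```shell
#!/bin/sh
# Build a tiny archive in a scratch directory.
workdir=$(mktemp -d)
printf 'hello\n' >"$workdir/a"            # 6 bytes of payload
tar cf "$workdir/foo.tar" -C "$workdir" a

# Total size, number of non-NUL bytes, and what is left over as NULs.
size=$(( $(wc -c <"$workdir/foo.tar") ))
non_nul=$(( $(tr -d '\000' <"$workdir/foo.tar" | wc -c) ))
nul_bytes=$(( size - non_nul ))
remainder=$(( size % 512 ))
rm -rf "$workdir"

echo "archive size:  $size bytes"
echo "size mod 512:  $remainder"
echo "NUL bytes:     $nul_bytes"
```

GNU tar, for instance, pads its output to a 10240-byte record by default, so nearly all of such an archive is NULs. A reader that stops at the two-block end marker leaves the rest unread in the pipe, which is the situation David describes.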
chet@odin.INS.CWRU.Edu (Chet Ramey) (05/23/91)
In article <heinz.674916703@cc.univie.ac.at> heinz@cc.univie.ac.at () writes:

>This does not explain the broken pipe, though. I tried the following:
>
>    cat <some_long_file> | more
>
>and killed 'more' by pressing 'q' at the first prompt (so more terminates
>first). No 'Broken Pipe'.

I guess I'll take a shot at this one.

First of all, other shells (csh and ksh for sure) special-case the message printed when a child process dies due to an interrupt (SIGINT) or a broken pipe (SIGPIPE). Bash does not skip over SIGPIPE, hence the unexpected `Broken Pipe' message.

>Then I tried:
>
>    echo Hallo | (sleep 10; more)   # first process terminates first, since
>                                    # 'Hallo' should fit into the pipe's buffer
>
>No 'Broken Pipe' either.

Try

    slc2$ cat /etc/termcap | sleep 1
    Broken pipe

(I also get the `Broken Pipe' message when I do `cat /etc/termcap | more' and immediately hit `q'.)

The broken pipe/SIGPIPE/EPIPE happens to the *first* process in a pipeline; the error occurs when an attempt is made to write on a pipe when no process has it open for reading.

The process must exit due to the SIGPIPE, by the way -- no message will be printed if it catches the SIGPIPE and calls exit(), unless the fatal signal handler is coded like this:

    fatal(sig)
    int sig;
    {
            cleanup();
            _exit(128+sig);
    }

>Maybe I should take the time and hack up the source code of bash,
>but I'm not sure if it's worth the effort.

It's a several-minute job, to be sure ;-)

Chet

--
Chet Ramey                              Internet: chet@po.CWRU.Edu
Case Western Reserve University         NeXT Mail: chet@macbeth.INS.CWRU.Edu

``Now, somehow we've brought our sins back physically -- and they're pissed.''
martin@mwtech.UUCP (Martin Weitzel) (05/24/91)
In article <1991May22.192914.22142@usenet.ins.cwru.edu> chet@po.CWRU.Edu writes:

>The process must exit due to the SIGPIPE, by the way -- no message will be
>printed if it catches the SIGPIPE and calls exit(), unless the fatal signal
>handler is coded like this:
>
>fatal(sig)
>int sig;
>{
>       cleanup();
>       _exit(128+sig);
        ^^^^^^^^^^^^^^ rather: kill(getpid(), sig); ????
>}

Hmm, I know that the shell encodes the information that some program was terminated by a signal this way in $? - but the other way round should also be true? On which system and for which shell?

I've just run a quick test: I started a child sub-shell from the shell prompt and terminated it with exit N (for several N just above 128). I saw no message from the parent shell, though the exit status was transferred correctly according to $?. (In case it matters: I ran the test with the Bourne Shell on ISC's UNIX/386 2.2.)

--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
chet@odin.INS.CWRU.Edu (Chet Ramey) (05/24/91)
In article <1147@mwtech.UUCP> martin@mwtech.UUCP (Martin Weitzel) writes:

>>      _exit(128+sig);
>       ^^^^^^^^^^^^^^ rather: kill(getpid(), sig); ????

You're right, but you have to deal with all the differences between the BSD/Posix signals and the Sys V signals. Simply replacing the call to _exit with a call to kill will cause an infinite loop.

>Hmm, I know that the shell encodes the information that some program
>was terminated by a signal this way in $? - but the other way round
>should also be true? On which system and for which shell?

No system, and for no shell. I made a mistake.

Chet
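The point Martin and Chet settle on can be illustrated at the shell level. The sketch below is an analogy in POSIX sh, not the C handler from the thread, and it assumes SIGPIPE starts out at its default action; `cleanup` and `fatal` are placeholder names. The first child traps SIGPIPE, restores the default action and re-sends the signal to itself. Without that reset the handler would simply be entered again, which is presumably the loop Chet mentions. The second child just exits with the same number.

```shell
#!/bin/sh
# Child 1: the handler cleans up, restores the default action, then
# re-sends the signal so the process really dies from SIGPIPE.
sh -c '
    cleanup() { :; }            # hypothetical placeholder for real cleanup
    fatal() {
        cleanup
        trap - PIPE             # without this, kill would re-enter fatal
        kill -PIPE $$
    }
    trap fatal PIPE
    kill -PIPE $$
'
reraised_status=$?

# Child 2: terminates normally but hands the same number to exit, which is
# what a handler ending in _exit(128+sig) amounts to.
sh -c "exit $reraised_status"
exited_status=$?

echo "died from re-raised signal: \$? = $reraised_status"
echo "plain exit, same number:    \$? = $exited_status"
```

Both children leave the same value in `$?`, yet only the first was terminated by a signal as far as the parent's wait status is concerned. That is consistent with Martin's test, where exiting with a number just above 128 produced no message from the parent shell.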