[comp.lang.c] dup

platt@ndla.UUCP (Daniel E. Platt) (02/03/90)

Greetings!

I have a question about what happens to the buffering of stdio
when stdio is re-directed via a dup() from a pipe()'ed file
descriptor before exec'ing.

For example, I've done the following:

	int	fd1[2], 	/* child read, parent write */
		fd2[2];		/* child write, parent read */

	/* ... */

	pipe(fd1);
	pipe(fd2);
	if(fork() == 0){

		close(fd1[1]);	/* close superfluous end */
		close(0);	/* close stdin */
		dup(fd1[0]);	/* redirect stdin to come from fd1[0] */
		close(fd2[0]);	/* close superfluous end */
		close(1);	/* close stdout */
		dup(fd2[1]);	/* redirect stdout to go to fd2[1] */
		execvp(*av, av);/* av is declared char **av; elsewhere */
		_exit(127);	/* reached only if execvp() fails */
	}

	close(fd1[0]);
	close(fd2[1]);

	/* ... */

The result of all of this is that the program in *av will be executed
with its stdin reading from fd1[0] and its stdout writing to fd2[1]; the
parent writes to the stdin of *av through fd1[1] and reads the output of
*av from fd2[0].  But the question is what happens in the child.
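
For reference, a minimal sketch of the same plumbing with error checks,
using dup2() in place of the close()/dup() pairs.  The name spawn_piped
and its parameters are made up for illustration, and it assumes that
descriptors 0 and 1 are already open in the parent, so the pipe ends
land on 3 or higher.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdin and stdout on pipes.
 * On success returns the child's pid and stores the parent's write end
 * in *to_child and the parent's read end in *from_child.
 * Returns -1 on failure.
 * Assumes fds 0 and 1 are open in the caller, so pipe fds are >= 3. */
pid_t spawn_piped(char *const argv[], int *to_child, int *from_child)
{
    int in[2];   /* parent writes in[1], child reads in[0]   */
    int out[2];  /* child writes out[1], parent reads out[0] */
    pid_t pid;

    if (pipe(in) == -1)
        return -1;
    if (pipe(out) == -1) {
        close(in[0]); close(in[1]);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(in[0]); close(in[1]);
        close(out[0]); close(out[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: put the pipe ends on fds 0 and 1, then drop the originals */
        if (dup2(in[0], STDIN_FILENO) == -1 ||
            dup2(out[1], STDOUT_FILENO) == -1)
            _exit(126);
        close(in[0]); close(in[1]);
        close(out[0]); close(out[1]);
        execvp(argv[0], argv);
        _exit(127);  /* reached only if execvp() fails */
    }

    /* parent: keep only the ends it uses */
    close(in[0]);
    close(out[1]);
    *to_child = in[1];
    *from_child = out[0];
    return pid;
}
```

The parent then writes to *to_child and reads from *from_child; closing
*to_child is what delivers end-of-file on the child's stdin.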

In the child, if I have something like:

	/* ... */

	while(scanf("%d", &i) == 1)
		printf("%d", i * i);

	/* ... */

without first calling:
	
	setbuf(stdin, NULL);
	setbuf(stdout, NULL);

what happens is the parent hangs.  It would appear that the child won't
write its buffer until it fills it up... just like it was writing to
a disk.  However, with the setbuf()'s present, there is no hangup.  I assume
that it knows to create the buffer when it determines that it has been
re-directed.  However, doesn't it know that it was redirected from
a pipe as opposed to being re-directed to or from a disk file?  If I'm
trying to do this to a program for which I only have the binary, and which
uses stdio buffered, is there a way to fool it into not using a buffer?

Thanks in advance! :-)

Dan Platt

chris@mimsy.umd.edu (Chris Torek) (02/04/90)

This has little to do with C per se, and belongs in comp.unix.questions.

In article <273@ndla.UUCP> platt@ndla.UUCP (Daniel E. Platt) writes:
>I have a question about what happens to the buffering of stdio
>when stdio is re-directed via a dup() from a pipe()'ed file
>descriptor before exec'ing.

exec() throws away the current program, *including all information
stdio has ever built up*.  (Some people attempt to shut off buffering
by adding a `setbuf(stdout, (char *)NULL)' call before the exec.  This
cannot work, because exec() throws out that information.)

>In the [child program with stdin & stdout being a pipe], if I have
>something like:
>
>	while(scanf("%d", &i) == 1)
>		printf("%d", i * i);
>
>without first calling:
>	
>	setbuf(stdin, NULL);
>	setbuf(stdout, NULL);

(Unless you have a machine that uses function prototypes, these should
be `setbuf(stdin, (char *)NULL)' and `setbuf(stdout, (char *)NULL)'.)

>what happens is the parent hangs.  It would appear that the child won't
>write its buffer until it fills it up... just like it was writing to
>a disk.

In some versions of Unix, a pipe *is* a disk file.  (In others it is a
`stream' or a `socket'; but in most if not all Unix systems, a pipe is
not a `tty' device.)  Stdio believes in buffering: with the exception of
`tty' devices, all output is buffered in large chunks.  On `tty's
(whatever those are: in 4BSD, a tty is a device that allows
ioctl(TIOCGETP)), stdio buffers output only to a newline (or a `large
chunk', whichever comes first).

>However, with the setbuf()'s present, there is no hangup.  I assume
>that it knows to create the buffer when it determines that it has been
>re-directed.

The child is a completely new program.  The first time it tries to
do something with stdio, it decides that it should buffer stdout.  If
stdout is a `tty', stdio buffers it up to newlines; otherwise stdio
buffers it fully.
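
That decision comes down to one test on the descriptor.  A sketch, not
any particular library's source; default_output_mode is a hypothetical
name, and real implementations add special cases (stderr is normally
unbuffered, for instance):

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical: the setvbuf() mode a traditional stdio would choose for
 * an output stream on fd.  Terminals get line buffering; pipes, files
 * and everything else get full buffering. */
int default_output_mode(int fd)
{
    return isatty(fd) ? _IOLBF : _IOFBF;
}
```

A pipe descriptor gives _IOFBF here, the same answer a disk file would get.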

>However, doesn't it know that it was redirected from
>a pipe as opposed to being re-directed to or from a disk file?

Stdio cares not in the least about anything except `tty' devices.

>If I'm trying to do this to a program for which I only have the binary,
>and which uses stdio buffered, is there a way to fool it into not using
>a buffer?

Such a program is, as far as I am concerned, buggy.  Programs that
interact, but depend on `tty' style buffering, are in need of repair.
ANY program that intends to interact---whether with a person or another
program---should, if it uses stdio, use fflush() before each input
request to make sure that its output goes out.  I think it was a
mistake to add line-buffering to stdio at all: convenient perhaps, but
a mistake, for it voids the model `everything is a byte stream file'.
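
A sketch of that discipline applied to the squaring loop from the
question.  The function name, the FILE * parameters, and the newline
after each result are additions for illustration; the original loop used
stdin and stdout directly and printed no separator.

```c
#include <stdio.h>

/* Read integers from `in` and write each square to `out`, flushing
 * pending output before every input request.  Returns the number of
 * values processed. */
int square_stream(FILE *in, FILE *out)
{
    int i, count = 0;

    for (;;) {
        fflush(out);                    /* push output out before waiting */
        if (fscanf(in, "%d", &i) != 1)
            break;
        fprintf(out, "%d\n", i * i);
        count++;
    }
    return count;
}
```

Called as square_stream(stdin, stdout), this behaves the same whether
stdout is a tty, a pipe, or a file, and keeps the buffer for bulk output.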

Programs that cheat---that call setbuf to disable all buffering---are
not so broken, but are performance disasters.  The right solution is to
put fflush calls into the offending program.  Of course, this is sometimes
not possible.

If all else fails, the last trick is to run the program after
connecting its stdout to what the program will think *is* a `tty'.
Most modern Unix systems have `pseudo terminals' that can be used for
this purpose.  In extreme situations, you can connect a real terminal
port back to a second terminal port, and make the machine talk to
itself over a serial line.  (This particular kludge is something you
would expect of an IBM O/S, not of Unix.)
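
For the pseudo-terminal route, a rough sketch using openpty(), a BSD
call that is also in glibc (declared in <pty.h> on Linux and in <util.h>
or <libutil.h> on the BSDs and macOS; older systems need -lutil).  It is
not universal, and spawn_on_pty is a made-up name.  The child gets the
slave side as stdin, stdout, and stderr, so its stdio sees a tty:

```c
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>
#if defined(__APPLE__) || defined(__NetBSD__) || defined(__OpenBSD__)
#include <util.h>
#elif defined(__FreeBSD__)
#include <libutil.h>
#else
#include <pty.h>
#endif

/* Hypothetical helper: run argv[0] with a pseudo-terminal as its stdin,
 * stdout and stderr.  On success returns the child's pid and stores the
 * master side of the pty in *master_fd.  Returns -1 on failure. */
pid_t spawn_on_pty(char *const argv[], int *master_fd)
{
    int master, slave;
    pid_t pid;

    if (openpty(&master, &slave, NULL, NULL, NULL) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(master);
        close(slave);
        return -1;
    }
    if (pid == 0) {
        /* child: the slave side becomes fds 0, 1 and 2 */
        close(master);
        if (dup2(slave, STDIN_FILENO) == -1 ||
            dup2(slave, STDOUT_FILENO) == -1 ||
            dup2(slave, STDERR_FILENO) == -1)
            _exit(126);
        if (slave > STDERR_FILENO)
            close(slave);
        execvp(argv[0], argv);
        _exit(127);  /* reached only if execvp() fails */
    }

    /* parent: talk to the child through the master side only */
    close(slave);
    *master_fd = master;
    return pid;
}
```

The parent reads the child's output from the master descriptor and
writes the child's input to it.  Terminal processing still applies:
newlines normally arrive as CR LF, and anything written to the master is
echoed back unless echo is turned off with tcsetattr().
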
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

platt@ndla.UUCP (Daniel E. Platt) (02/06/90)

To make a long story short:

I have a program that wants to FTP a slew of files from a box with
lots of disk space to a box with small disk space on demand.  Since
I have LOTS of files, I only want this program to invoke FTP once,
and then the parent can block on a read of 'ftp's output to see
when the file was transferred, rather than breaking the network
connection, and re-invoking 'ftp' every time I wanted another file.

With this in mind:
I want to open two pipes between a parent and a child; the parent has
opened one pipe dup()'ed in the child process to stdin, and the other
pipe opened by the parent was dup()'ed in the child to stdout.  The
parent writes to the pipe duped to the child's stdin, and reads from
the pipe dup()'ed to the child's stdout.  The child then exec's 'ftp'.
'Ftp's stdout is buffered.  When it detects that it isn't attached to
a tty device, it acts like it's buffering to disk, and never gets around
to writing the short responses back to the parent.  BESIDES putting
'setbuf()'s or 'fflush()'s into 'ftp' -- a program which I don't have
the code to anyway -- how can I get 'ftp' to send its responses
unbuffered?

In article <22266@mimsy.umd.edu>, chris@mimsy.umd.edu (Chris Torek) writes:
> This has little to do with C per se, and belongs in comp.unix.questions.

I receive a partial news feed at my home system.  I don't have lots of
resources to clutter up my system with; this was the most germane
group of those I subscribe to.  Sorry... I'm really not trying to
screw things up for other people...

> 
> In article <273@ndla.UUCP> platt@ndla.UUCP (Daniel E. Platt) writes:
> >I have a question about what happens to the buffering of stdio
> >when stdio is re-directed via a dup() from a pipe()'ed file
> >descriptor before exec'ing.
> 
> exec() throws away the current program, *including all information
> stdio has ever built up*.  (Some people attempt to shut off buffering
> by adding a `setbuf(stdout, (char *)NULL)' call before the exec.  This
> cannot work, because exec() throws out that information.)

I know that 'exec' throws out all buffer stuff.  I've not lost buffer
stuff.  The problem is that I'm trying to set up two way communications
between a parent and 'ftp', and 'ftp's stdout is buffered.  Besides,
I have not re-directed stdio in the parent process, only in the child
process.  Also, all I have in the parent before fdopen() is file
descriptors returned by pipe().  There's no buffer to set to zero
in this communication stream.

> 
> >In the [child program with stdin & stdout being a pipe], if I have
> >something like:
> >
> >	while(scanf("%d", &i) == 1)
> >		printf("%d", i * i);
> >
> >without first calling:
> >	
> >	setbuf(stdin, NULL);
> >	setbuf(stdout, NULL);
> 
> (Unless you have a machine that uses function prototypes, these should
> be `setbuf(stdin, (char *)NULL)' and `setbuf(stdout, (char *)NULL)'.)
> 
> >what happens is the parent hangs.  It would appear that the child won't
> >write its buffer until it fills it up... just like it was writing to
> >a disk.
> 
> In some versions of Unix, a pipe *is* a disk file.  (In others it is a
> `stream' or a `socket'; but in most if not all Unix systems, a pipe is
> not a `tty' device.)  Stdio believes in buffering: with the exception of
> `tty' devices, all output is buffered in large chunks.  On `tty's
> (whatever those are: in 4BSD, a tty is a device that allows
> ioctl(TIOCGETP)), stdio buffers output only to a newline (or a `large
> chunk', whichever comes first).

This is the problem; I am trying to invoke a program with pipes installed
both to stdin and stdout (in this, I've succeeded).  However, this program
has used buffered stdio routines for VERY low volume screen IO.  What
happens is the buffering blocks my read() in the parent.

> 
> >However, with the setbuf()'s present, there is no hangup.  I assume
> >that it knows to create the buffer when it determines that it has been
> >re-directed.
> 
> The child is a completely new program.  The first time it tries to
> do something with stdio, it decides that it should buffer stdout.  If
> stdout is a `tty', stdio buffers it up to newlines; otherwise stdio
> buffers it fully.

I know about this; it's in the man pages about stdio.  Again, I was
hoping that somebody would have a bright idea about how to unblock
buffers; perhaps to get it to think it was still attached to a tty?

> 
> >However, doesn't it know that it was redirected from
> >a pipe as opposed to being re-directed to or from a disk file?
> 
> Stdio cares not in the least about anything except `tty' devices.

This is sad; it makes it difficult to call 'ftp' from other programs;
or actually lots of other programs too.

> 
> >If I'm trying to do this to a program for which I only have the binary,
> >and which uses stdio buffered, is there a way to fool it into not using
> >a buffer?
> 
> Such a program is, as far as I am concerned, buggy.  Programs that
> interact, but depend on `tty' style buffering, are in need of repair.
> ANY program that intends to interact---whether with a person or another
> program---should, if it uses stdio, use fflush() before each input
> request to make sure that its output goes out.  I think it was a
> mistake to add line-buffering to stdio at all: convenient perhaps, but
> a mistake, for it voids the model `everything is a byte stream file'.

I'm glad that you feel about this the same way I do.  From your
description, you're not saying the parent program is incorrect in
concept or in execution, but that the program that the parent
is invoking, namely 'ftp', is buggy (I'm asserting this since this
program was a *Program that interacts, but depends on `tty' style
buffering*).  From my standpoint, it has rendered 'ftp' unusable
as a solution to my problem.

> 
> Programs that cheat---that call setbuf to disable all buffering---are
> not so broken, but are performance disasters.  The right solution is to
> put fflush calls into the offending program.  Of course, this is sometimes
> not possible.

I don't have the source to 'ftp' to put in the 'fflush()'s.  The volume
of information that I'm passing between parent and child isn't that
great however, so I'm NEVER going to come close to 1K before I need
to flush the buffer in any event.

> 
> If all else fails, the last trick is to run the program after
> connecting its stdout to what the program will think *is* a `tty'.
> Most modern Unix systems have `pseudo terminals' that can be used for
> this purpose...

I think I may try it.  Unfortunately, this is VERY system specific.  I
believe they are called `pty00' - `ptyxx' on the systems at work,
but on my home system, they're called 'pts0', etc.

By the way, I haven't tried messing with this sort of kludge much before.
I assume this type of device is essentially a 'special' file, and that it
acts sort of like a FIFO?

>                ...In extreme situations, you can connect a real terminal
> port back to a second terminal port, and make the machine talk to
> itself over a serial line.  (This particular kludge is something you
> would expect of an IBM O/S, not of Unix.)

This is gross and disgusting :-(  Besides, I don't want to connect
cables every time I want to run my application on a new system.  I'd
run out of cables.

> -- 
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
> Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

I appreciate the information in this article; it pretty much confirms
what I thought (I didn't mess up on my homework).  This is very
unfortunate however... I was hoping that I HAD screwed up, and that there
WAS an easy way around my problem.  The `pseudoterminals' sound like the
best option to solve this problem in the short run.

Dan
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
||                                     1(914)945-1173            ||
||        Dan Platt                    1(914)941-2474            ||
||        Watson (IBM)                 PLATT@YKTVMV.BITNET       ||
||                           ..!uunet!bywater!scifi!ndla!platt   ||
||                                                               ||
||     The opinions expressed here do not necessarily reflect    ||
||                   those of my employer!                       ||
||                                                               ||
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-