[comp.lang.c] erroneous "hello" from forked "hello world" process!

mef@romulus.rutgers.edu (Marc Fiuczynski) (10/01/90)

For my Operating System Design class we had to write some trivial 
programs using the fork() and wait() commands.  The teaching assistant
came across an interesting problem.  After compiling the following:

 
#include <stdio.h>

main ()
{
	int pid = 1;
	printf("Hello\n");
	pid=fork();
	if(pid==0){
		printf("World\n");
	}
}

We should get the output:

Hello
World

However, when redirecting the output of a.out to a file and then doing a 
cat on the file we would get either:

a.out > filename
cat filename

Hello
Hello 
World

or 

Hello
World 
Hello

Does anyone know what is going on?   Is there a problem because the output
of two processes is being redirected to the same file, or what?  Any help
with this would be appreciated.  
-- 
Marc Fiuczynski
mef@remus.rutgers.edu

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (10/01/90)

In article <Sep.30.21.09.46.1990.2881@romulus.rutgers.edu>, mef@romulus.rutgers.edu (Marc Fiuczynski) writes:
> #include <stdio.h>
> main ()
>     {
>         int pid = 1;
>         printf("Hello\n");
>         pid = fork();
>         if (pid == 0) printf("World\n");
>     }
> [Output includes two "Hello"s when redirected to a file.]

Engrave this on the tablets of your mind:
	stdio is BUFFERED!
Neither the "Hello\n" nor the "World\n" is being written by either
process until the buffer fills or the buffer is flushed when the file
is closed at process termination.  The processes share a UNIX file
descriptor and that descriptor's file pointer, but the stdio buffer is
*copied*, not shared.  The simplest fix is to make stdout unbuffered:
	setbuf(stdout, (char*)NULL);
If you have setvbuf(),
	setvbuf(stdout, (char*)NULL, _IOLBF, 0);
will flush the buffer after every line, which may suffice.  Either call
must be made before the first output to stdout.

Best of all is to treat shared files as shared resources and interlock
them properly, using semaphores or whatever, and fflush when "ownership"
of the file passes from one process to another.

-- 
Fixed in the next release.

randy@csseq.tamu.edu (Randy Hutson) (10/01/90)

In article <Sep.30.21.09.46.1990.2881@romulus.rutgers.edu> mef@romulus.rutgers.edu (Marc Fiuczynski) writes:

>#include <stdio.h>
>
>main ()
>{
>	int pid = 1;
>	printf("Hello\n");
>	pid=fork();
>	if(pid==0){
>		printf("World\n");
>	}
>}

[Marc explains that when the above program is executed and redirected to
a file, the expected string "Hello\nWorld\n" is not in the file, but instead
either "Hello\nHello\nWorld\n" or "Hello\nWorld\nHello\n".]

After a fork, a child process will inherit the stdio buffers of its
parent.  In your case, printf("Hello\n") was not sufficient to flush
the stdio buffer of the parent process, so "Hello\n" was written to a
buffer but not to the file.  Right after the fork, "Hello\n" was in the 
stdio buffers of both processes.  Then after the child executed 
printf("World\n"), its buffer contained "Hello\nWorld\n" while the
parent's buffer still contained only "Hello\n".  The order in which
the two processes terminate (with their buffers being flushed) is not 
defined, hence you sometimes got "Hello\nHello\nWorld\n" (the parent
exited first) and other times got "Hello\nWorld\nHello\n" (the child 
exited first).

You will probably get the output you desire if you don't redirect
the output of your program.  This is because stdio normally line-buffers
output to a terminal, so the '\n' in printf("Hello\n") is
sufficient to flush the buffer.  In any case, a "correct" version
of your program follows with a fflush(stdout) executed right after
the first printf:


#include <stdio.h>

main ()
{
	int pid;
	printf("Hello\n");
	fflush(stdout);
	pid=fork();
	if(pid==0){
		printf("World\n");
	}
}


Randy Hutson
randy@csseq.tamu.edu

chris@mimsy.umd.edu (Chris Torek) (10/01/90)

In article <Sep.30.21.09.46.1990.2881@romulus.rutgers.edu>
mef@romulus.rutgers.edu (Marc Fiuczynski) writes:
>For my Operating System Design class we had to write some trivial 
>programs using the fork() and wait() commands.

fork() and wait() are not part of the C language, so this belongs in
comp.unix.programmer.  I have redirected followups.  (Old-timers may
note that I try to be impartial about all misdirected stuff, not just
MS-DOS stuff).

His TA expects

>	printf("Hello\n");
>	pid=fork();
>	if(pid==0){
>		printf("World\n");
>	}

to produce

>Hello
>World

but instead it produces one of `Hello\nHello\nWorld' or `Hello\nWorld\nHello'.

>Does anyone know what is going on?

Your TA should; he or she should certainly know about the boundaries
between `things in the operating system' and `things not in the
operating system' in order to work with an OS class in the first place.
printf() is part of the C runtime system, *not* the Unix OS; fork()
and wait() are part of the Unix OS, not the C runtime system.  Thus
there is no guarantee about the way the two interact.

In particular, when printing to a `buffered' file the runtime system
buffers (hence the name) the text until it has a `suitable amount' to
write at once.  It then sends it all out with one or more write() system
calls.  Since fork() simply duplicates all the data in the program, it
duplicates any buffered text.  Thus the program above becomes one
process with `Hello\n' buffered, and one with `Hello\nWorld\n' buffered.

The real oddity is not that the word `Hello' is ever duplicated, but
rather that it is sometimes *not* duplicated.  This happens whenever
the string `Hello\n' is written out before the program forks, which
includes whenever the standard output is line buffered or unbuffered.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

cpcahil@virtech.uucp (Conor P. Cahill) (10/01/90)

In article <Sep.30.21.09.46.1990.2881@romulus.rutgers.edu> mef@romulus.rutgers.edu (Marc Fiuczynski) writes:
>
>main ()
>{
>	int pid = 1;
>	printf("Hello\n");
>	pid=fork();
>	if(pid==0){
>		printf("World\n");
>	}
>}

The problem here is that when you redirect the program to a file, the
standard output is fully buffered.  So the hello is buffered until you flush
the buffer (something that is done automatically at exit) and is therefore
replicated when the program forks.

When the standard output is not redirected, the output will be line buffered
and therefore upon seeing the \n following the hello, it is written out and
hence gone before the program duplicates itself.

-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170 

lanzo@wgate.UUCP (Mark Lanzo) (10/01/90)

In article <Sep.30.21.09.46.1990.2881@romulus.rutgers.edu> mef@romulus.rutgers.edu (Marc Fiuczynski) writes:
>
>For my Operating System Design class we had to write some trivial 
>programs using the fork() and wait() commands.  The teaching assistant
>came across an interesting problem.  After compiling the following:
>
> 
>#include <stdio.h>
>
>main ()
>{
>	int pid = 1;
>	printf("Hello\n");
>	pid=fork();
>	if(pid==0){
>		printf("World\n");
>	}
>}
>

[ ... Description of how output gets repeated twice in file if 
stdout is redirected ... ]

I believe your problem is that you are not flushing your open file buffers
before doing your fork() call.  Remember that the "stdio" package is a
buffered I/O package layered on top of the level-1 I/O calls (like "open",
"close", "read", and "write");  hence, some of the data you are writing
isn't really being written immediately, but is instead deferred until
a buffer gets filled.

Hence, you are doing a 'printf("Hello\n")' which is causing the character
string to be written into an anonymous buffer hanging off of the FILE
structure.  When the buffer is filled, the stdio package will do something
which pretty much amounts to an 'fflush(stdout)'.

What is happening in your case is this:

   1)  You print the characters "Hello\n" into a buffer.
   2)  You fork the process.  Both copies of the program still have
       this data in their buffer!
   3)  Both copies continue to run, eventually flushing their buffers.
       The order in which the output from the parent and child processes
       gets intermingled is undefined.

The thing which really confuses you is that you don't see this behavior
when you *aren't* redirecting your output.  It does exactly what you
expected.  This is because the stdio package normally uses a "line buffered"
or "unbuffered" mode of I/O when writing to a terminal and hence your
output does get flushed prior to the 'fork()' call.

The quick cure?  Flush the output stream(s):

	printf("Hello\n");
	fflush(stdout);
	pid = fork();

Your system may have a better means for flushing buffers, like 
'fflushall()' [ours doesn't].
-- 
Mark Lanzo                      | Wandel & Goltermann Technologies, Inc.
uunet!wgate!lanzo               | 1030 Swabia Court, Research Triangle Park
lanzo@wgate.wgate.com ??        | North Carolina 27709-3585
                                | Phone: (919) 941-5730  FAX: (919) 941-5751

tim@proton.amd.com (Tim Olson) (10/01/90)

In article <Sep.30.21.09.46.1990.2881@romulus.rutgers.edu> mef@romulus.rutgers.edu (Marc Fiuczynski) writes:

[program performs a printf("Hello\n"), followed by a fork(), then a
printf("World\n"); Marc wonders why "Hello" is printed twice...]

| does anyone know what is going on?   Is there a problem because there are
| two processes output being redirected to the filename or what?  Any help
| with this would be appreciated.  

Here is what is going on:

Your program uses the stdio package to print the output.  This package
performs buffering in order to improve performance, as it is faster to
collect many characters to send to one write() system call, rather
than to perform a system call each time.

For tty output, stdio usually buffers only a line, calling write()
when a newline character is reached.  However, when writing to a file,
it usually buffers more, typically 1K characters (see the BUFSIZ
definition in <stdio.h>).

When you perform the fork() operation, the child process gets an exact
copy of the parent, including all open files, all buffered output,
etc.  Thus, when redirecting standard output to a file, the first
printf("Hello\n") is buffered when the fork() occurs, giving both
parent and child a copy of this buffer.  The child goes on to
printf("World\n"), while the parent simply exits.  In the process of
exiting, all buffered output is flushed.

The order of execution of the two processes is undefined, so the final
output may be either:

Hello			Hello
World		or	Hello
Hello			World


To fix this problem, any buffered output should be flushed before
forking:

#include <stdio.h>

main()
{
	printf("Hello\n");
	fflush(stdout);
	fflush(stderr);
	if (fork() == 0) {
		printf("World\n");
	}
}



	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)

jak@sactoh0.SAC.CA.US (Jay A. Konigsberg) (10/02/90)

>In article <Sep.30.21.09.46.1990.2881@romulus.rutgers.edu> mef@romulus.rutgers.edu (Marc Fiuczynski) writes:
>#include <stdio.h>
>
>main ()
>{
>printf("Hello\n");
>if(fork()==0)
>    printf("World\n");
>}
>

This is an exercise from chapter 7 of Bach's "The Design of the UNIX
Operating System".  The goal isn't to know how to correct the output, but
to describe why it acts the way it does.

Several people have already talked about stdio being buffered and fork()'s
child inheriting the file descriptors, so I won't describe it again.

However, there are lots of neat exercises in this book. Perhaps more
of them should be posted?

-- 
-------------------------------------------------------------
Jay @ SAC-UNIX, Sacramento, Ca.   UUCP=...pacbell!sactoh0!jak
If something is worth doing, it's worth doing correctly.