[comp.lang.c] File descriptors and streams and co

carroll@s.cs.uiuc.edu (04/17/89)

I must be missing something - given that
FILE *my_file;
has been properly set up (with fopen(), no errors, etc.), why can't you
switch stdin by having another variable
FILE *tmp;
and doing
tmp = stdin; stdin = my_file;

'stdin' is declared as a pointer, and so setting the pointer to point at
a different FILE I/O block should cause routines that use it to read from
that file instead. You can then restore by
stdin = tmp;

Mr. Salz indicated that things are more complex than this. Is this because
of library routines with file descriptor 0 wired in, or because file info
is kept in places other than the FILE I/O block stdin points to? While
this might not work in all cases, it seems to me it should in the original
case, if twiddling the file descriptor in the block works. (i.e., it must
not be hard-wired and must look in the FILE I/O block for things).

Alan M. Carroll                "And then you say,
carroll@s.cs.uiuc.edu           We have the Moon, so now the Stars..."  - YES
CS Grad / U of Ill @ Urbana    ...{ucbvax,pur-ee,convex}!s.cs.uiuc.edu!carroll

kremer@cs.odu.edu (Lloyd Kremer) (04/19/89)

In article <207600018@s.cs.uiuc.edu> carroll@s.cs.uiuc.edu (Alan M. Carroll)
writes:
>I must be missing something - given that
>FILE *my_file;
>has been properly set up (with fopen(), no errors, etc.), why can't you
>switch stdin by having another variable
>FILE *tmp;
>and doing
>tmp = stdin; stdin = my_file;

There are several concepts missing here.  Although any discussion of I/O
is, strictly speaking, not relevant to the C language, in practice almost
every C program does some I/O, and hopefully the commonly used interfaces are
sufficiently consistent across operating systems, at least conceptually, to
make a discussion here useful.  When I say "conceptually," I mean that even
if it isn't really implemented in this way, if you assume that it is, your
program will work properly in all cases.

In virtually any system that has a UNIX ancestry or that attempts to emulate
the UNIX I/O methodology, the following should be conceptually correct.  The
names of the various internal objects may vary or may not be defined.

Low level I/O consists of a number (often 20) of integer file descriptors that
can be returned from low level I/O calls such as open, creat, and dup.  In
some systems the symbol _NFILE is #define'd as this maximum number of open
files.  High level I/O consists of a low level file combined with associated
buffering.  The buffering avoids the necessity of a system I/O call to process
every character.

A FILE is typedef'd or #define'd as a struct containing a low level file
descriptor and a few other members pertaining to the buffering (type of
buffering, pointers to the buffer, count of characters in the buffer,
read/write capability, error flags, etc.).  The first three FILEs are
provided pre-opened on file descriptors 0, 1, and 2, which are normally
inherited from the parent process.  Those descriptors are open to the same
things and in the same modes as they were in the parent.
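
As a small illustration of the descriptor that sits behind each pre-opened
stream, here is a sketch that assumes a POSIX system, where fileno() reports
the descriptor underlying a FILE (ISO C alone does not provide it).  The
helper name show_std_descriptors is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for fileno() under strict ISO modes */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print the descriptor behind each pre-opened stream.
 * Returns whatever fprintf() returns (negative on output error). */
int show_std_descriptors(FILE *out)
{
    assert(out != NULL);
    return fprintf(out, "stdin=%d stdout=%d stderr=%d\n",
                   fileno(stdin), fileno(stdout), fileno(stderr));
}
```

Calling show_std_descriptors(stderr) on a POSIX system should report 0, 1,
and 2, which is the link between the high level names and the low level
descriptors described above.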

There is an array of _NFILE FILEs often called _iob or some similar name.

The names stdin, stdout, and stderr are usually #define'd as the addresses
of the first three of these FILEs (structs).

	#define stdin  (&_iob[0])
	#define stdout (&_iob[1])
	#define stderr (&_iob[2])

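Hence stdin cannot, in general, be used as an lvalue.  With the definitions
above, "stdin = my_file;" expands to "(&_iob[0]) = my_file;", which is not a
valid assignment.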

This is the reason that "changing stdin" is, in general, non-trivial.
Stdin cannot be changed; it's the address of an absolute location in memory;
it's immovable.  When we speak of "changing stdin", we mean changing the
*contents* of the structure referenced by stdin.  This involves clearing out
the previous contents properly, calling fflush() where needed to preclude any
data loss, and then opening the new file.  The open must be arranged so that
*stdin (_iob[0]) will be selected as its FILE structure, the FILE structure
will contain file descriptor 0 (this is not automatic; it must be arranged),
the file descriptor will be validly open, it will be open for reading, the
FILE will be set for reading ("r"), and all the other structure members will
be properly and consistently initialized for the new stream.
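
The standard library packages most of these steps in freopen(), which closes
whatever the stream was attached to and reopens the same FILE object on a new
file.  What follows is a minimal sketch, assuming a hosted ISO C library; the
helper name redirect_stdin is hypothetical.  It covers only the first half of
the original question: ISO C offers no portable way to return stdin to its
previous source afterwards.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reattach stdin to the file at `path`.
 * Returns 0 on success, -1 on failure.  After a failed freopen()
 * the stream has been closed and must not be used. */
int redirect_stdin(const char *path)
{
    assert(path != NULL);
    return freopen(path, "r", stdin) != NULL ? 0 : -1;
}
```

After a successful call, getchar(), scanf(), and every other routine that
reads stdin take their input from the named file.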

I have found that programmers who perform surgery on stdio without due regard
for these considerations produce programs whose I/O sort of works most of
the time, but suffers occasional lost data, misdirected data, invalid file
descriptor problems, and memory errors.  Moral: be sure you understand both
low level and high level I/O, and the relationships between them, before you
start rewiring them in the middle of an executable.

-- 

					Lloyd Kremer
					Brooks Financial Systems
					...!uunet!xanth!brooks!lloyd
					Have terminal...will hack!

bobmon@iuvax.cs.indiana.edu (RAMontante) (04/19/89)

I've approached this "changing stdin" idea from the other way around:
all my I/O calls are of the form

	fgets(buffer, size, MyIn);
	fputs(string, MyOut);

and the initialization logic is sort of like:

	if (input_name_supplied) {
		MyIn = fopen(input_name,"r");
		/* what? me, error-check? */
	} else
		MyIn = stdin;

	/* likewise for output */

For what kinds of situation is this approach inadequate?
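
For reference, the initialization above might look like this with the error
check filled in.  This is only a sketch: the helper names open_input and
close_input are hypothetical, and a NULL name stands in for "no input name
supplied".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return a stream for `name`, or stdin when no
 * name was supplied.  Returns NULL (after reporting) if fopen() fails. */
FILE *open_input(const char *name)
{
    FILE *fp;

    if (name == NULL)
        return stdin;
    fp = fopen(name, "r");
    if (fp == NULL)
        perror(name);
    return fp;
}

/* Close the stream only if open_input() opened it; stdin is left alone. */
void close_input(FILE *fp)
{
    assert(fp != NULL);
    if (fp != stdin)
        fclose(fp);
}
```

The rest of the program then reads through the returned pointer, e.g.
fgets(buffer, size, MyIn), and never names stdin directly.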

hamilton@osiris.cso.uiuc.edu (04/26/89)

rds95@leah.Albany.Edu says:

> I want to be able to make "stdin" read from someplace besides, well,
> standard input in the middle of my program, and then go back to where
> it was again.

i would do it like this:

	void fexchange(a, b)
	FILE *a;
	FILE *b;
	{
		FILE tmp;

		tmp = *a;
		*a = *b;
		*b = tmp;
	}

	FILE *my_file;

	/* open my_file */

	/* exchange stdin & my_file */
	fexchange(stdin, my_file);

	/* ... use stdin ... */

	/* restore stdin */
	fexchange(stdin, my_file);

	fclose(my_file);

i've been using this technique for years in an application where i
wanted a nested "#include" capability for interactive input.
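
A nested include facility can also be built without copying FILE structures,
by keeping a stack of FILE pointers and always reading from the top, in the
spirit of the MyIn approach earlier in the thread.  This is a minimal sketch
and not the code described above; the names input_push, input_getline, and
MAX_DEPTH are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_DEPTH 8              /* arbitrary nesting limit for this sketch */

static FILE *input_stack[MAX_DEPTH];
static int input_depth = 0;      /* 0 means "read from stdin" */

/* Start reading from `path`; returns 0 on success, -1 on failure. */
int input_push(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    if (input_depth == MAX_DEPTH)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    input_stack[input_depth++] = fp;
    return 0;
}

/* Read one line from the innermost open file, falling back to the
 * enclosing file (and finally stdin) as each one reaches end of file. */
char *input_getline(char *buf, int size)
{
    assert(buf != NULL && size > 0);
    while (input_depth > 0) {
        if (fgets(buf, size, input_stack[input_depth - 1]) != NULL)
            return buf;
        fclose(input_stack[--input_depth]);
    }
    return fgets(buf, size, stdin);
}
```

The caller invokes input_push() when it sees an include directive and uses
input_getline() for all reads; stdin itself is never modified.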

	wayne hamilton
	U of Il and US Army Corps of Engineers CERL
UUCP:	{convex,uunet}!uiucuxc!osiris!hamilton
ARPA:	hamilton@osiris.cso.uiuc.edu	USMail:	Box 476, Urbana, IL 61801
CSNET:	hamilton%osiris@uiuc.csnet	Phone:	(217)333-8703

chris@mimsy.UUCP (Chris Torek) (04/27/89)

In article <1239500006@osiris.cso.uiuc.edu>
hamilton@osiris.cso.uiuc.edu writes:
>void fexchange(a, b)
>FILE *a;
>FILE *b;
>{
>	FILE tmp;
>
>	tmp = *a;
>	*a = *b;
>	*b = tmp;
>}

This is not a good idea.  Consider the following (legal) extraction
from a hypothetical implementation of stdio:

stdio.h:
	typedef struct _file {
		int	putc_freespace;
		int	getc_unconsumed;
		unsigned char *putc_ptr;
		unsigned char *getc_ptr;
		unsigned char *buffer;
		int	bufsize;
		int	flags;
	} FILE;

	FILE	_iob[20];

	#define	stdin (&_iob[0])
	#define	stdout (&_iob[1])
	#define	stderr (&_iob[2])

	#define	fileno(fp) ((fp) - _iob)

The result of fexchange(stdin, f), in this implementation, is to replace
the counts and pointers for file descriptor 0 (stdin) with those for
file descriptor fileno(f).  When stdin->getc_unconsumed (copied
from f->getc_unconsumed) goes to zero, the program refills the buffer
from file descriptor 0 ... the original stdin, not f!

The only reason fexchange() works in existing stdio implementations is
that they happen to store the file descriptor in the FILE structure
(rather than making it implicit, as above).  Some SysV implementations
store some of their information outside the FILE structure, however,
making this doubly dangerous.  (Storing important information outside
the FILE objects is not illegal, merely stupid.)
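
The effect can be reproduced with a toy model of the hypothetical
implementation above.  This is a sketch only: the toy_ names are invented
here so as not to collide with the real stdio, and the structure is cut down
to two members.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the hypothetical stdio above: the descriptor is not
 * stored in the structure but implied by its position in the array. */
typedef struct toy_file {
    int getc_unconsumed;            /* bytes left in the read buffer */
    const unsigned char *getc_ptr;  /* next byte to hand out */
} TOY_FILE;

TOY_FILE toy_iob[20];

#define toy_fileno(fp) ((int)((fp) - toy_iob))

/* Same structure swap as fexchange(), applied to the toy type. */
void toy_exchange(TOY_FILE *a, TOY_FILE *b)
{
    TOY_FILE tmp;

    assert(a != NULL && b != NULL);
    tmp = *a;
    *a = *b;
    *b = tmp;
}
```

After toy_exchange(&toy_iob[0], &toy_iob[5]), slot 0 holds the buffered
bytes that belonged to slot 5, but toy_fileno(&toy_iob[0]) is still 0, so
the next refill would read descriptor 0 again.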
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris