[comp.os.minix] Stdio

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (09/08/89)

Well, well, a little competition.  Here am I trying to put together
my very own stdio for network distribution when along comes Earl
Chew's package.  It looks well done, it runs very fast - but I still
think mine is better.  Here are the high points of mine:

+ ANSI compatibility
	As far as I can tell (working from secondary sources like K&R2),
	and with the exception of various small items noted below, the
	package is ANSI compatible.  That includes prototypes when
	__STDC__ is defined and (almost) no name space pollution.
+ POSIX compatibility
	The package is also POSIX compatible, with the limitations that
	no kernel changes allow.
+ New include files
	New header files per ANSI (except float.h, locale.h, and math.h).
	Includes a working stdarg.h.  Updated versions of POSIX header
	files (but lots of work still to be done).
+ Support for fast / debugging stdio functions
	Default compile generates code that does lots of argument checking
	(default action is to return whatever failure code the function
	is supposed to).  Compiling with -DNDEBUG produces fast library.
+ Functions behind all macros
	Functions accessible if the macros in stdio.h/ctype.h are #undef'ed.
? Statically allocated FILEs
	+ John Vaudin's complaints are moot
	- Extra statically allocated storage (~200 bytes)
+ Support for [rwa]+ modes in (f|fd|fre)open
	Specifying "a" means that writes are always at end of file
	(special code included for the standard Minix, works automatically
	if Simon Poole's patches are applied to FS).
	Specifying "+" means both reads and writes are possible, but
	code is included to only allow a change after fseek() (I think
	this is ANSI, but haven't checked).
+ ANSI additions to stdio
	fsetpos() / fgetpos()
	remove()
	rename() from Freeman Pascal
+ getc()/putc() are macros
	But they're unsafe (i.e. they evaluate their arguments more than once).
+ fgetc()/fputc() can be safe macros
+ ungetc() works once on unbuffered files
? Different semantics for gets()
	Non-null strings that end without '\n' at EOF don't return EOF.
+ Flushing tty output on tty input
	If stdout is attached to a tty, it is flushed before reads on stdin.
? Line buffering not implemented
	In my opinion, _IOLBF is a flawed concept.  ANSI says it has to
	be there, so the hooks are there (i.e. requests are honoured but
	subsequently ignored).  Adding support for it in fputc() is
	reasonably easy, but (in my opinion) a waste of time.

That's about it.  With my stdio, compiled with -DNDEBUG, times on Earl
Chew's tests are comparable to his code.  My debugging code is slower,
but it checks for more error conditions (than either the existing stdio
or Earl's).

I could use four or five beta testers.  Anyone want to volunteer?

Norbert

ast@cs.vu.nl (Andy Tanenbaum) (09/09/89)

In article <18952@princeton.Princeton.EDU> nfs@notecnirp.UUCP (Norbert Schlenker) writes:
>
>Well, well, a little competition.  Here am I trying to put together
>my very own stdio for network distribution when along comes Earl
>Chew's package.  It looks well done, it runs very fast - but I still
>think mine is better.  Here are the high points of mine:

I am quite willing to replace Patrick's stdio with a better one.  In my view
"better" refers to the following characteristics, in order of importance
(1 = most important, 3 = least important):

1. POSIX/ANSI compatibility
2. Clean, straightforward, easy to understand and modify code
3. Performance, both size and speed

Since I am not in any great hurry, I hope that several beta testers
volunteer and report back their results.  Maybe after that some more
volunteers could try to compare both packages and post the results.
Are there any more rewrites of stdio waiting in the wings?

Andy Tanenbaum (ast@cs.vu.nl)

P.S. To Earl Chew: All my mail to you bounces.  Please send me a bang path
from, say, uunet.

cechew@bruce.OZ (Earl Chew) (09/09/89)

From article <18952@princeton.Princeton.EDU>, by nfs@notecnirp.Princeton.EDU (Norbert Schlenker):
> 
> Chew's package.  It looks well done, it runs very fast - but I still
> think mine is better.  Here are the high points of mine:

That's hardly surprising ;-)

> 
> + ANSI compatibility

I think that if you look at mine, you will also find that I've gone to some
trouble to include function prototypes and the like when __STDC__ is defined.
Some more work has been done on this since the last set of patches to get gcc
to make less noise.

> + POSIX compatibility

I'm not in a position to say how Posix compatible my stdio is since I haven't
received my copy of the standard yet :-(.

> + New include files

The only header file I have included is stdio.h. It's the only one needed to
get stdio running.

> + Support for fast / debugging stdio functions

Fast - but not much debugging.

> + Functions behind all macros

I don't know how useful this is. fgetc and fputc are function versions of putc
and getc, but other than that...

> ? Statically allocated FILEs
> 	+ John Vaudin's complaints are moot
> 	- Extra statically allocated storage (~200 bytes)

I have already posted saying that John Vaudin's complaints have been
addressed without the need to allocate anything that wasn't allocated in the
original posting. stdin, stdout and stderr are allocated at compile time (apart
from their buffers).

> + Support for [rwa]+ modes in (f|fd|fre)open
> 	Specifying "a" means that writes are always at end of file
> 	(special code included for the standard Minix, works automatically
> 	if Simon Poole's patches are applied to FS).

I'm not clear about this. Is it specifically stated in Posix that `a' mode will
always write at the end of the file? Does that mean that fseek() on a stream
opened in `a' mode will fail? I am about to put in code for three argument
opens which are supported by Simon Poole's patches. Are these going into 1.4b?

> 	Specifying "+" means both reads and writes are possible, but
> 	code is included to only allow a change after fseek() (I think
> 	this is ANSI, but haven't checked).

Done.

> + ANSI additions to stdio
> 	fsetpos() / fgetpos()

Haven't seen these before --- I'd better go find out what they do.

> 	remove()
> 	rename() from Freeman Pascal

Are these in stdio? Seems to me to be a strange place to put them. What has
remove() and rename() got to do with stream input and output?

> + fgetc()/fputc() can be safe macros

I thought that fgetc and fputc were _meant_ to be functions. If you want macros
you can use getc and putc.

> + ungetc() works once on unbuffered files

Yes, already in there.

> ? Different semantics for gets()
> 	Non-null strings that end without '\n' at EOF don't return EOF.

gets() should return NULL on EOF --- I think that was a typo. Are the semantics
that different? These are the semantics according to the manual page on the Sun
here. A simple test program shows that both stdios on the Pyramid and Sun
systems perform the same actions on non-null strings without a \n before the
EOF, ie don't return EOF unless the string is empty. My stdio follows the same
action.

> + Flushing tty output on tty input
> 	If stdout is attached to a tty, it is flushed before reads on stdin.

Yes. This is standard procedure although not done by the old Minix stdio.
Caused me some heartache once --- I thought my program had crashed when I
didn't see a prompt. I did eventually see the prompt --- after I had typed the
input :-(.

> ? Line buffering not implemented

Line buffering is implemented. It does cause complications but the thing is that
you don't want to have to put an explicit fflush() after all console messages.
Then again you don't want the (short) messages to be stuck in the buffer until
the program terminates. You could instead try for no buffering at all or a short
buffer (which just about amounts to the same thing but could also suffer from
the same shortcomings as a largish buffer) but there is a dramatic difference,
especially on Minix, between line buffering and _no_ buffering. Try it and
see.

Earl

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (09/09/89)

In article <1521@bruce.OZ> cechew@bruce.OZ (Earl Chew) writes:
>From article <18952@princeton.Princeton.EDU>, by nfs@notecnirp.Princeton.EDU (Norbert Schlenker):

Let me state up front that I have access only to unofficial versions of
the ANSI C standard (like K&R 2nd edition).  If I'm wrong about what I
write, I'd like to be corrected.  If you have answers to some of the
questions posed below, please email me.

I do have a copy of the Posix standard.  Integrating it into my fuzzy
notion of the C standard has been an interesting experience.

>> + New include files
>
>The only header file I have included is stdio.h. It's the only one needed to
>get stdio running.

I admit I've gone a bit overboard with the include files.  But I figured
that if I had to go through all the ANSI/Posix pain for stdio.h, I might
as well do the whole thing.

>> + Functions behind all macros
>
>I don't know how useful this is. fgetc and fputc are function versions of putc
>and getc, but other than that...

ANSI says you gotta do this.  It must be possible for a user to #undef
a standard library function's name and still have things work.

>> + Support for [rwa]+ modes in (f|fd|fre)open
>> 	Specifying "a" means that writes are always at end of file
>> 	(special code included for the standard Minix, works automatically
>> 	if Simon Poole's patches are applied to FS).
>
>I'm not clear about this. Is it specifically stated in Posix that `a' mode will
>always write at the end of the file? Does that mean that fseek() on a stream
>opened in `a' mode will fail? I am about to put in code for three argument
>opens which are supported by Simon Poole's patches. Are these going into 1.4b?

Posix says you have to do this.  ANSI says you have to do this.  If a file
is opened for append, any write() goes at the end of the file.  I'm not
sure whether fseek() fails - it might just be ignored for normal "a" mode.
That's what I did.  (Question: What does X3J11 require?)  For "a+" mode,
fseek() has to seek.  As an aside, Posix says that the underlying lseek()
in such a case works normally (i.e. it sets the file pointer according to
what the user requested).  But any subsequent write() resets the file
pointer to the current end of file.  Simon Poole's patches seemed to do
this correctly, but I haven't looked closely.
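
[A minimal sketch of the "a" mode rule described above, written against
a modern hosted C library rather than either Minix package.  The helper
name and the scratch-file path are assumptions made for illustration.
It rewinds an append-mode stream and then checks that the write still
lands at the end of the file.]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if a write on an "a"-mode stream lands at end of file even
 * after an explicit seek to the start, 0 if it does not, and -1 on an
 * I/O failure.  `path` names a scratch file chosen by the caller. */
static int append_ignores_seek(const char *path)
{
    char contents[16] = "";
    FILE *fp;

    assert(path != NULL);

    /* Create the file with some initial contents. */
    if ((fp = fopen(path, "w")) == NULL) return -1;
    fputs("abc", fp);
    if (fclose(fp) != 0) return -1;

    /* Reopen for append, rewind, and write one more character. */
    if ((fp = fopen(path, "a")) == NULL) return -1;
    fseek(fp, 0L, SEEK_SET);      /* may succeed, but must not move writes */
    fputc('X', fp);
    if (fclose(fp) != 0) return -1;

    /* Read the file back and compare. */
    if ((fp = fopen(path, "r")) == NULL) return -1;
    if (fgets(contents, sizeof contents, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    remove(path);
    return strcmp(contents, "abcX") == 0;
}
```

[Calling append_ignores_seek() with any writable scratch path should
return 1 on a library that follows the rule.]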

I second Earl's request for information on whether 1.4b will address this.
From ast's summary posting, it looks like 1.4b is a bunch of patches at
the command level, just like 1.4a, so I suspect the answer is NO.

>> + ANSI additions to stdio
>> 	fsetpos() / fgetpos()
>
>Haven't seen these before --- I'd better go find out what they do.

Cousins to fseek() and ftell(), respectively.  That's how I implemented them.

>> 	remove()
>> 	rename() from Freeman Pascal
>
>Are these in stdio? Seems to me to be a strange place to put them. What has
>remove() and rename() got to do with stream input and output?

My feelings exactly.  But the ANSI standard says they belong there.
So does perror(), by the way.

>> + fgetc()/fputc() can be safe macros
>
>I thought that fgetc and fputc were _meant_ to be functions. If you want macros
>you can use getc and putc.

Well, my reading of the standard is that any function can be implemented as
a safe macro.  Some functions are explicitly given license to be done as
unsafe macros (like getc/putc).  Since there is a performance hit for using
functions, I thought it might be nice to have the option.
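
[A sketch of what "unsafe" means here.  The struct, the macro and the
helper names are all hypothetical stand-ins, not code from either
package.  The macro mentions its argument several times, so an argument
with side effects runs more than once; the function evaluates it exactly
once.]

```c
#include <assert.h>
#include <stddef.h>

struct mini_stream {              /* hypothetical stand-in for FILE */
    const unsigned char *data;
    size_t pos, len;
};

/* Unsafe: the argument `s` appears several times in the expansion. */
#define MINI_GETC(s) \
    ((s)->pos < (s)->len ? (int)(s)->data[(s)->pos++] : -1)

/* Safe: a function evaluates its argument exactly once. */
static int mini_fgetc(struct mini_stream *s)
{
    assert(s != NULL);
    return MINI_GETC(s);
}

static int evaluations;           /* how often the argument expression ran */

static struct mini_stream *counted(struct mini_stream *s)
{
    evaluations++;
    return s;
}

/* Number of times the argument expression runs for one macro call. */
static int evals_with_macro(struct mini_stream *s)
{
    evaluations = 0;
    (void)MINI_GETC(counted(s));
    return evaluations;
}

/* Number of times the argument expression runs for one function call. */
static int evals_with_function(struct mini_stream *s)
{
    evaluations = 0;
    (void)mini_fgetc(counted(s));
    return evaluations;
}
```

[On a non-empty stream evals_with_macro() reports more than one
evaluation, while evals_with_function() reports exactly one.]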

>> ? Different semantics for gets()
>> 	Non-null strings that end without '\n' at EOF don't return EOF.
>
>gets() should return NULL on EOF --- I think that was a typo. Are the semantics
>that different? These are the semantics according to the manual page on the Sun
>here. A simple test program shows that both stdios on the Pyramid and Sun
>systems perform the same actions on non-null strings without a \n before the
>EOF, ie don't return EOF unless the string is empty. My stdio follows the same
>action.

(Question: What does the C standard say?)  I don't know what to make of this.
The systems to which I have access are BSD'ish and gets() doesn't return
EOF unless it's at EOF and it got exactly 0 characters.  But the old
stdio didn't act that way.  I have matched the BSD/Chew semantics; I just
don't know whether it's right.  And I thought I'd better say so.
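
[A small sketch of the end-of-file behaviour under discussion.  It uses
fgets() as a stand-in because gets() is no longer part of standard C;
that substitution, and the helper name, are assumptions.  fgets()
returns NULL only when end of file is hit with no characters read, so a
final line without '\n' is still returned.]

```c
#include <assert.h>
#include <stdio.h>

/* Writes `text` to a temporary file, then counts how many fgets() calls
 * succeed before NULL is returned.  Returns -1 if the scratch file
 * cannot be created.  Lines are assumed to be shorter than the buffer. */
static int lines_before_null(const char *text)
{
    char line[128];
    int count = 0;
    FILE *fp;

    assert(text != NULL);
    if ((fp = tmpfile()) == NULL) return -1;
    fputs(text, fp);
    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL)
        count++;                  /* a partial last line counts too */
    fclose(fp);
    return count;
}
```

[For example, "one\ntwo" yields two successful reads even though the
second line has no newline, and an empty file yields none.]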

>> ? Line buffering not implemented
>
>Line buffering is implemented. It causes complications but the thing is that
>you don't want to have to put an explicit fflush() after all console messages.
>Then again you don't want the (short) messages to be stuck in the buffer until
>the program terminates. You could instead try for no buffering at all or a short
>buffer (which just about amounts to the same thing but could also suffer from
>the same shortcomings as a largish buffer) but there is a dramatic difference,
>especially on Minix, between line buffering and _no_ buffering. Try it and
>see.

Well, I've thought about this a little more and can understand that point
of view.  Now, if we could just get people to stop writing programs that
dumped core or went into infinite loops, we wouldn't need _IOLBF :-)  As
that seems a little unlikely, I suppose I'll put the code in. (Sounds of
me throwing a tantrum should be interpolated here.)  This wouldn't bother
me so much if there was a performance hit only for _IOLBF streams; the
point is that the code has to check everywhere, especially in the putc()
macro that I worked at so hard to make fast!
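
[A rough sketch of where the extra check ends up.  The buffer struct and
every name below are hypothetical and far simpler than a real FILE; the
use of /dev/null as a sink is also an assumption.  The point is the
second condition in obuf_putc(), which every character pays for once
line buffering is supported.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define OBUF_SIZE 64

struct obuf {                     /* hypothetical output buffer */
    char   data[OBUF_SIZE];
    size_t used;
    int    fd;
    int    line_buffered;
    int    flushes;               /* instrumentation for the demo only */
};

/* Writes out everything buffered so far. */
static int obuf_flush(struct obuf *b)
{
    size_t done = 0;
    while (done < b->used) {
        ssize_t n = write(b->fd, b->data + done, b->used - done);
        if (n <= 0) return -1;
        done += (size_t)n;
    }
    b->used = 0;
    b->flushes++;
    return 0;
}

static int obuf_putc(struct obuf *b, int c)
{
    b->data[b->used++] = (char)c;
    /* The second test is the per-character cost of supporting _IOLBF. */
    if (b->used == OBUF_SIZE || (b->line_buffered && c == '\n'))
        return obuf_flush(b) == 0 ? c : -1;
    return c;
}

/* Sends `text` to /dev/null and reports how many flushes were needed. */
static int count_flushes(const char *text, int line_buffered)
{
    struct obuf b = { {0}, 0, -1, 0, 0 };

    assert(text != NULL);
    b.line_buffered = line_buffered;
    if ((b.fd = open("/dev/null", O_WRONLY)) < 0) return -1;
    while (*text != '\0')
        if (obuf_putc(&b, (unsigned char)*text++) < 0) break;
    if (b.used > 0) obuf_flush(&b);   /* final flush, as at fclose() */
    close(b.fd);
    return b.flushes;
}
```

[Three short lines cost three flushes when line buffered and a single
flush when fully buffered, which is the trade-off argued over above.]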

>Earl

Before this turns into a war, let me say that this exchange is going to
improve both of our stdio packages.  So far, Earl's package has gotten
me to dynamically allocate the buffers for stdin and stdout, an addition
that I think is worthwhile.  My notes above will probably goad Earl into
improvements for his code too.  Thanks a bunch.

Goads from Earl and others gratefully received.

Norbert

ast@cs.vu.nl (Andy Tanenbaum) (09/10/89)

In article <18971@princeton.Princeton.EDU> nfs@notecnirp.UUCP (Norbert Schlenker) writes:
>In article <1521@bruce.OZ> cechew@bruce.OZ (Earl Chew) writes:
>From ast's summary posting, it looks like 1.4b is a bunch of patches at
>the command level, just like 1.4a, so I suspect the answer is NO.

You are correct. 1.4b will be the second release of Bruce Evans's kernel
plus a lot of stuff I have been accumulating in commands, lib etc.  It will
not start on POSIX.  It won't include either of the new stdio packages
because I would like to wait for Norbert's and then see what the majority
opinion of them is.  Starting in a couple of weeks I will begin the change
to POSIX, starting with the headers.  Is the ANSI C standard actually
published yet?  I have an old draft of it, but I don't know the status.
I do have the final, published POSIX standard, however.

On a related note, we have converted the ACK compiler to be ANSI standard
and it is too big to fit in 64K + 64K.  Such is life.  Thus V2.0 will
continue with the K&R compiler, however with floating point and a separate
assembler and linker, as well as bug fixes.  While this is a pity, it is
not so awful.  I think K&R C will continue to exist for many years, given
the large volume of code in it.  Does anyone have a list of features that
are legal in K&R C and forbidden or discouraged in ANSI C?  If so, please
post them, and I will try to free the code of them.  If I can manage to
write MINIX in the subset of C that is common to K&R and ANSI, then the
code will compile with the K&R compiler and be legal ANSI as well.

Earl: Please send me your bang path.

Andy Tanenbaum (ast@cs.vu.nl)

henry@utzoo.uucp (Henry Spencer) (09/11/89)

In article <3196@ast.cs.vu.nl> ast@cs.vu.nl (Andy Tanenbaum) writes:
>... Is the ANSI C standard actually
>published yet?  I have an old draft of it, but I don't know the status.

They're still working on pushing it through the bureaucracy.  There have
been some unexpected major delays; a turkey whose submission to one of
the public-comment periods somehow got lost has been making trouble.
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

chasm@attctc.Dallas.TX.US (Charles Marslett) (09/13/89)

In article <1521@bruce.OZ>, cechew@bruce.OZ (Earl Chew) writes:
> From article <18952@princeton.Princeton.EDU>, by nfs@notecnirp.Princeton.EDU (Norbert Schlenker):
> I don't know how useful this is. fgetc and fputc are function versions of putc
> and getc, but other than that...
> I'm not clear about this. Is it specifically stated in Posix that `a' mode will
> always write at the end of the file? Does that mean that fseek() on a stream
> opened in `a' mode will fail? I am about to put in code for three argument
> opens which are supported by Simon Poole's patches. Are these going into 1.4b?

As I understand the standard (corrections will be appreciated if anyone has
a copy of a recent draft handy!), Posix requires that all writes go to
the end of the file, but fseek is allowed (if the open included "r"?), and
reads will be from the repositioned file pointer.

I hope someone will correct this if I am wrong on any of the points (I do
not have a copy of the Posix standard).

> Earl

Charles

cechew@bruce.OZ (Earl Chew) (09/13/89)

I am about to post a much revised version of stdio. AST, I have sent you
numerous bits of mail with my address embedded within, but I haven't heard from
you. Is this because you didn't receive them, or the return address is still
wrong?

Some comments:

From article <18971@princeton.Princeton.EDU>, by nfs@notecnirp.Princeton.EDU (Norbert Schlenker):
> In article <1521@bruce.OZ> cechew@bruce.OZ (Earl Chew) writes:
>>From article <18952@princeton.Princeton.EDU>, by nfs@notecnirp.Princeton.EDU (Norbert Schlenker):
>>> + Functions behind all macros
>>
>>I don't know how useful this is. fgetc and fputc are function versions of putc
>>and getc, but other than that...
> 
> ANSI says you gotta do this.  It must be possible for a user to #undef
> a standard library function's name and still have things work.

What I had originally meant was, "I don't know how useful it is to have fgetc
and fputc as macros". I haven't received my Posix stuff yet so I don't know
just how many of the macros have to have functions behind them. I have worked
on the assumption that all need to have functions. Are getc and putc exceptions?

>>> + Support for [rwa]+ modes in (f|fd|fre)open
> 
> Posix says you have to do this.  ANSI says you have to do this.  If a file

I have reworked my code (a little) to attempt to mimic this behaviour. Minix
(without Simon Poole's patches) does not currently support O_APPEND. This means
behaviour can be mimicked by calling lseek() before any write. This won't get
it quite right if two processes compete for access to the file since one
process may lseek() before the other gets a chance to write. Support for
O_APPEND is in my code also and it is compiled in if your fcntl.h indicates that
it is there.
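
[A minimal sketch of the lseek()-before-write emulation described
above, using POSIX calls on a modern system.  The function names and the
scratch-file path are assumptions; this is not the code from the posted
package.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Emulates append-mode writes on a descriptor opened without O_APPEND.
 * Not atomic: another process can write between the lseek() and the
 * write(), which is the race described above. */
static ssize_t append_write(int fd, const void *buf, size_t len)
{
    if (lseek(fd, (off_t)0, SEEK_END) == (off_t)-1)
        return -1;
    return write(fd, buf, len);
}

/* Demo: returns 1 if the emulated append lands after the existing data
 * even though the file offset was rewound first, 0 if not, and -1 if
 * the scratch file `path` cannot be created. */
static int emulated_append_demo(const char *path)
{
    char contents[8] = "";
    int ok, fd;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    ok = write(fd, "abc", 3) == 3
      && lseek(fd, (off_t)0, SEEK_SET) == 0    /* rewind on purpose */
      && append_write(fd, "X", 1) == 1
      && lseek(fd, (off_t)0, SEEK_SET) == 0
      && read(fd, contents, sizeof contents - 1) == 4;
    close(fd);
    unlink(path);
    return ok && strcmp(contents, "abcX") == 0;
}
```

[With a single writer the result matches O_APPEND; with two writers
sharing the file it can interleave badly, as noted.]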

>>> + ANSI additions to stdio
>>> 	fsetpos() / fgetpos()
> 
> Cousins to fseek() and ftell(), respectively.  That's how I implemented them.

Cousins -- how close? Do they have exactly the same calling sequence and the
same semantics?

>>> 	remove()
>>> 	rename() from Freeman Pascal
>>
> My feelings exactly.  But the ANSI standard says they belong there.
> So does perror(), by the way.

Inserted perror, remove and rename.

I have been having problems with the semantics of fdopen. What _should_ happen
in the following cases. Note that the question is not "What _does_ happen on
machine x in the following cases?"

1. if (fdopen(0, "r") == stdin)
     puts("Y");
   else
     puts("N");

   So what gets printed?

2. fclose(fdopen(0, "r"));
   len = read(0, buf, 1);

   Does the read succeed (ie is file descriptor 0 still active)?

3. fclose(fdopen(0, "r"));
   printf("%d\n", getchar);

   What gets printed by the printf? EOF? Is stdin still active?

4. for (; fdopen(0, "r") != NULL; ) ;

   Does the loop terminate? When?

Earl

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (09/14/89)

In article <1529@bruce.OZ> cechew@bruce.OZ (Earl Chew) writes:
>...
>From article <18971@princeton.Princeton.EDU>, by nfs@notecnirp.Princeton.EDU (Norbert Schlenker):
>> In article <1521@bruce.OZ> cechew@bruce.OZ (Earl Chew) writes:
>>>From article <18952@princeton.Princeton.EDU>, by nfs@notecnirp.Princeton.EDU (Norbert Schlenker):
>>>> + Functions behind all macros
...
ANSI requires that every function defined as being in the standard library
must be implemented as a function.  If you want to implement some standard
functions as safe macros, that's OK, but the function still has to be
available.  Any user must be able to get at the function by #undef'ing the
macro name or taking the address of the function name (I haven't tried this,
but I bet the ACK compiler won't manage the last one).

>>>> + Support for [rwa]+ modes in (f|fd|fre)open
>
>I have reworked my code (a little) to attempt to mimic this behaviour. Minix
>(without Simon Poole's patches) does not currently support O_APPEND. This means
>behaviour can be mimicked by calling lseek() before any write. This won't get
>it quite right if two processes compete for access to the file since one
>process may lseek() before the other gets a chance to write. Support for
>O_APPEND is in my code also and it is compiled in if your fcntl.h indicates that
>it is there.

I hadn't thought of the problem with shared file descriptors.  This is
not even a little bit nice.  My code will have the same problem.

>>>> + ANSI additions to stdio
>>>> 	fsetpos() / fgetpos()
...
int fgetpos(FILE *stream, fpos_t *pos);
	Stores the current value of the file position in *pos.
int fsetpos(FILE *stream, const fpos_t *pos);
	Sets the file position for the stream to that defined by *pos.

My code just has a typedef long fpos_t and appropriate macros that
invoke ftell() and fseek().
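
[A sketch of that approach, written as functions rather than macros and
with hypothetical my_ names so as not to clash with the real library.
It assumes, as above, that a long offset is enough to describe a file
position.]

```c
#include <assert.h>
#include <stdio.h>

typedef long my_fpos_t;           /* hypothetical; the real type is fpos_t */

/* Stores the current file position in *pos; 0 on success, -1 on error. */
static int my_fgetpos(FILE *stream, my_fpos_t *pos)
{
    long offset;

    assert(stream != NULL && pos != NULL);
    offset = ftell(stream);
    if (offset == -1L) return -1;
    *pos = offset;
    return 0;
}

/* Restores a position saved by my_fgetpos(); 0 on success, -1 on error. */
static int my_fsetpos(FILE *stream, const my_fpos_t *pos)
{
    return fseek(stream, *pos, SEEK_SET) == 0 ? 0 : -1;
}

/* Demo: save the position after two characters, read on, restore, and
 * check that the same character comes back.  Returns 1 on success. */
static int position_round_trip(void)
{
    my_fpos_t saved;
    int before, after;
    FILE *fp = tmpfile();

    if (fp == NULL) return -1;
    fputs("hello", fp);
    rewind(fp);
    (void)fgetc(fp);
    (void)fgetc(fp);
    if (my_fgetpos(fp, &saved) != 0) { fclose(fp); return -1; }
    before = fgetc(fp);           /* third character */
    (void)fgetc(fp);
    if (my_fsetpos(fp, &saved) != 0) { fclose(fp); return -1; }
    after = fgetc(fp);            /* third character again */
    fclose(fp);
    return before == 'l' && after == 'l' && saved == 2;
}
```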
...
>I have been having problems with the semantics of fdopen. What _should_ happen
>in the following cases. Note that the question is not "What _does_ happen on
>machine x in the following cases?"
>
>1. if (fdopen(0, "r") == stdin)
>     puts("Y");
>   else
>     puts("N");
>
>   So what gets printed?

Whatever you like.  Posix doesn't say "shall" or "should" or "must", so 
do whatever you want.

>2. fclose(fdopen(0, "r"));
>   len = read(0, buf, 1);
>
>   Does the read succeed (ie is file descriptor 0 still active)?

No.  ANSI says that fclose() closes the file, not just the stream
associated with it.

>3. fclose(fdopen(0, "r"));
>   printf("%d\n", getchar);
>
>   What gets printed by the printf? EOF? Is stdin still active?

Surely you mean getchar().  As for the questions, here's my opinion:
The file is closed, but stdin looks like a valid stream (assuming
that the answer to question 1 above is no).  The getchar() should
return EOF, so printf() should print it.

>4. for (; fdopen(0, "r") != NULL; ) ;
>
>   Does the loop terminate? When?

Depends on the answer to 1 again.  I think it should terminate after
FOPEN_MAX streams have been opened.
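
[A sketch that checks the answer to question 2 on a POSIX system.  A
pipe stands in for descriptor 0 so that the real stdin is left alone;
that substitution and the function name are assumptions.  After
fclose() on the fdopen()'d stream, a read() on the descriptor should
fail with EBADF.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the descriptor is unusable after fclose() on a stream
 * made by fdopen(), 0 if it is still open, -1 on setup failure. */
static int fclose_closes_descriptor(void)
{
    int fds[2];
    char c;
    ssize_t n;
    FILE *fp;

    if (pipe(fds) != 0) return -1;
    if ((fp = fdopen(fds[0], "r")) == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    fclose(fp);                   /* also closes fds[0] */
    errno = 0;
    n = read(fds[0], &c, 1);      /* expected to fail: descriptor is gone */
    close(fds[1]);
    return n == -1 && errno == EBADF;
}
```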

>Earl

Norbert

ast@cs.vu.nl (Andy Tanenbaum) (09/14/89)

In article <1529@bruce.OZ> cechew@bruce.OZ (Earl Chew) writes:
>I am about to post a much revised version of stdio. AST, I have sent you
>numerous bits of mail with my address embedded within, but I haven't heard from
>you. 

I didn't get anything.  On the other hand, I am not going to do anything about
stdio now.  After Norbert posts his, I'll wait until the dust settles and see
which seems better.  I'm not in any hurry, so you have plenty of time to work
on yours.  It seems odd that we have so many mail problems.  I send and
receive several messages a day to/from Bruce Evans, who is in your general
neck of the woods, and that works flawlessly.

Andy Tanenbaum (ast@cs.vu.nl)

jburnes@pnet02.gryphon.com (Jim Burnes) (09/19/89)

Andrew:

I guess I sent you a letter previous to this posting, but here goes just in
case you miss that one.  Could you please tell me if ACK can be used as a
cross compiler and if the sources are available.  I would like to use it to
cross compile to both Intel and non-intel, segmented and non-segmented
architectures.  Is this a doable thing and where can I get ACK?  Also...can I
get a discount if I get both the 68000 and Intel versions of MINIX?

Jim Burnes

UUCP: {ames!elroy, <routing site>}!gryphon!pnet02!jburnes
INET: jburnes@pnet02.gryphon.com

cechew@bruce.OZ (Earl Chew) (09/29/89)

The next six kits will contain an updated version of my stdio. I have gone to
some lengths to try to make it ANSI and Posix conformant. There was a bug
report to do with fwrite --- that code has been rewritten. There is a problem
with ((int) ((unsigned char) (x))) under Minix cc. This has been dealt with by
explicitly masking out the unwanted bits.
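
[The post does not show the fix itself, so the following is only a
guess at its general shape: mask the value instead of relying on the
(unsigned char) cast.  The function name is hypothetical.]

```c
#include <assert.h>
#include <limits.h>

/* Maps a possibly sign-extended character value into 0..UCHAR_MAX by
 * masking off the high bits rather than casting. */
static int to_uchar_int(int c)
{
    return c & UCHAR_MAX;
}
```

[With 8-bit characters, to_uchar_int(-1) gives 255 and ordinary ASCII
values pass through unchanged.]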

Other problems that need resolving are listed in 0_bugs.txt.

The code has been hacked a bit to compile under ANSI compilers which support
function prototypes. stdarg.h is provided although stdio will compile with
either varargs.h or stdarg.h.

A configuration script is provided but I have not verified its operation under
Minix. It is not necessary to use this script.

Earl
-- 
Earl Chew, Dept of Computer Science, Monash University, Australia 3168
ARPA: cechew%bruce.cs.monash.oz.au@uunet.uu.net  ACS : cechew@bruce.oz

Fred van Kempen <waltje@minixug.hobby.nl> (08/02/90)

Hello All,

This may be a somewhat silly question for someone who maintains a
MINIX archive, but: can anyone send me a stdio(3) package with a
fopen(3) that understands the r+ and w+ operations, as well as the
wb and rb ones?

for(i = 0; i < 1000; i++)
	printf("Thankxxx a lot!\n");

Fred.
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
| MINIX User Group Holland   UUCP: waltje@minixug.hobby.nl      |
| c/o Fred van Kempen,         or: hp4nl!hgatenl!minixug!waltje |
| Hoefbladhof  27                                               |
| 2215 DV  VOORHOUT         "Love is - what you want it to be.  |
| The Netherlands                               Alannah Myles"  |
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+