[comp.os.minix] Zen and the Art of Library Programming

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (12/27/89)

I think that a philosophical point has to be discussed, and the recent
posting of v1.5 makes me think that now is a good time.  I'll get to the
philosophy at the end; at the moment, let me motivate the discussion.

Consider the recent implementation of assert.c (a helper function for
the assert() macro defined in <assert.h>).  As posted, it reads thus:

----------------------------------------------------------------------------
#include <lib.h>
/* This is the routine called by <assert.h>. */

#include <stdio.h>
#include <stdlib.h>

void __assert(file, line)
char *file;
long line;
{
  fprintf(stderr, "Assertion error in file %s on line %u\n", file, line);
  abort();
}
----------------------------------------------------------------------------

What's wrong with this routine?  There is a problem with the declaration
of "line", which appears here as a long but which is likely to be compiled
as an int in the calling routine (the PC Minix 1.3 compiler does this).

But that's not the major problem at all.  The BIG problem is that __assert()
calls fprintf() and abort() and I assert that this is NOT A GOOD THING!

For one thing, it is possible to write an ANSI C conforming program which
will not work with this definition of __assert().  Imagine:

----------------------------------------------------------------------------
#include <assert.h>
#include <stdio.h>

abort()
{
  while (1)
	fprintf(stderr, "Stupid jerk!\n");
}

main()
{
  assert(0==1);
}
----------------------------------------------------------------------------

This is a valid ANSI C program which should terminate with a SIGABRT signal.
Instead, the linker will use the program's definition of abort(), and this
program will go into an infinite loop when run.  (Just as an aside, this is
a useful test program for many compilers that claim ANSI compatibility.)

It can be argued that this is a rather contrived example, and I suppose it is.
But consider a piece of a rather more reasonable program and a situation that
I have actually run into when porting to Minix:

----------------------------------------------------------------------------
long int len(path)
char *path;
{
/* some baloney */
  return (long) buf.st_size;
}

... more functions, some of which call len() ...
----------------------------------------------------------------------------

This won't link under PC Minix 1.3.  Why?  Because buried in lib/call.c is
a definition of a function called len(), and call.s is going to be linked
into any non-trivial Minix program.  The name conflict causes the link to
fail.  The len() function in the program has to be renamed, for no good
reason other than a sloppy implementation of the Minix library.

Similar problems are easy to find.  Fwrite() in <stdio> uses write() [from
POSIX <unistd>], but the program will probably fail if a program has a 
function called write().  Calloc() calls malloc() - what if a program 
supplies its own?  Malloc calls brk() - what if a program supplies its own?
Such programs are perfectly possible and perfectly legal.  They will not
work under Minix and I think that something has to be done to fix this up.

Here is my suggestion.  It's not pretty, and it is going to add a procedure
call layer onto a path that is probably already too long.  But it will work,
and I don't think anything simpler will do the trick.

Every (!) library function becomes a simple front-end for another function
which does the work.  Every library function can call ONLY the underlying
functions (I like to call them underware :-)  To avoid ANSI namespace
conflicts, I suggest the underware's names be prefixed with two underscores.
Definitions to aid in this should be added to <lib.h>.  To make this a
little more concrete, let's try the following:

--------------------------------- lib.h ------------------------------------
...
#define LIB_ABORT	__abort
#define LIB_GETPID	__getpid
#define LIB_KILL	__kill
...
-------------------------------- abort.c -----------------------------------
/* abort.c						ANSI 4.10.4.1
 *	void abort(void);
 *
 *	Causes abnormal program termination by raising SIGABRT.
 */

#include <stdlib.h>
#include "lib.h"

#ifdef abort
#undef abort
#endif

PUBLIC void abort()
{
  LIB_ABORT();
}
-------------------------------- ABORT.c -----------------------------------
/* ABORT.c						Implementation
 *	void LIB_ABORT(void);
 *
 *	Provides underlying functionality for ANSI abort().
 */

#include <signal.h>
#include "lib.h"

PUBLIC void LIB_ABORT()
{
  LIB_KILL(LIB_GETPID(), SIGABRT);
}
------------------------------- GETPID.c -----------------------------------
-------------------------------- KILL.c ------------------------------------
etc. etc.
----------------------------------------------------------------------------

See how abort() works?  Now we can fix __assert() pretty easily too - rather
than calling abort(), we make it call LIB_ABORT() (i.e. __abort()) instead.
(Fprintf() is another problem, but we can fix that similarly.)  No program
that follows ANSI dictates can mess up this scheme.  No other reasonable
program should :-)

I believe that the long run benefits of this approach outweigh the costs of
doing it once correctly.  I am going to do something like this for myself,
no matter what.  I can even see a simple method, using the peephole optimizer
posted once upon a time, of removing virtually all the space and time overhead
that this method entails.

Does anyone have any comments and/or suggestions?

Norbert

meulenbr@cstw68.prl.philips.nl (Frans Meulenbroeks) (01/02/90)

In article <22630@princeton.Princeton.EDU> nfs@notecnirp.UUCP (Norbert Schlenker) writes:
>Every (!) library function becomes a simple front-end for another function
>which does the work.  Every library function can call ONLY the underlying
>functions (I like to call them underware :-)  To avoid ANSI namespace
>conflicts, I suggest the underware's names be prefixed with two underscores.

I like the idea & hate the performance loss.
One potential problem exists: 
implementations may restrict the significance of an identifier to 6
characters, and may ignore case distinctions. (ansi draft sec. 3.1)
I don't know if this applies to minix, but if so, name clashes might
occur by prepending the names with __
(eg fputc and fputs could become __fput).
Frans Meulenbroeks        (meulenbr@cst.prl.philips.nl)
	Centre for Software Technology
	( or try: ...!mcvax!phigate!prle!cst!meulenbr)

Leisner.Henr@xerox.com (marty) (01/03/90)

Norbert Schlenker says: {{


----------------------------------------------------------------------------
#include <assert.h>
#include <stdio.h>

abort()
{
  while (1)
	fprintf(stderr, "Stupid jerk!\n");
}

main()
{
  assert(0==1);
}

----------------------------------------------------------------------------

This is a valid ANSI C program which should terminate with a SIGABRT
signal.
Instead, the linker will use the program's definition of abort(), and this
program will go into an infinite loop when run.  (Just as an aside, this is
a useful test program for many compilers that claim ANSI compatibility.)
}}

What is the compiler supposed to do with the program?  Generate a warning?
Generate a fatal error?  Its valid C code.  Why should the compiler (for
the language) know anything about the library functions (this ain't PL/1).

I quickly glanced through the May, 88 dpans and found nothing to confirm or
deny it.  Anyone have a reference?

While having no proof one way or another, I maintain   redefining library
functions is within the spirit of C.  I often do it since the library
version doesn't do what I want or has problems or whatever.  Recompiling
the source code with magic macros to use different functions (i.e. cc
-Dabort=myabort) is insufficient -- what if I want to library functions to
use my version ?


A common mistake by beginning C programmers is to use library functions
names for their own functions and then watch the chaos that follows (read
and write are good examples, especially when the program does STDIO) .

Part of the spirit of C I enjoy so much is the assumption the programmer
knows what he's doing (maybe he wants to redefine write() -- i.e. perhaps
do some special profiling and then generate the syscall).

One more note:
Does the Minix compiling system properly understand  public/private?  I
understood at one time static really didn't do anything regarding
name-space pollution.

marty
ARPA:	leisner.henr@xerox.com
GV:  leisner.henr
NS:  leisner:wbst139:xerox
UUCP:  hplabs!arisia!leisner

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (01/03/90)

In article <830@prles2.UUCP> meulenbr@cst.prl.philips.nl (Frans Meulenbroeks) writes:
>In article <22630@princeton.Princeton.EDU> nfs@notecnirp.UUCP (Norbert Schlenker) writes:
>>Every (!) library function becomes a simple front-end for another function
>>which does the work.  Every library function can call ONLY the underlying
>>functions (I like to call them underware :-)  To avoid ANSI namespace
>>conflicts, I suggest the underware's names be prefixed with two underscores.
>
>I like the idea & hate the performance loss.

The performance loss can be made very minor.  The simplest way is to change
the headers so that every library function has a corresponding macro, and
the macros themselves call the underware functions.  I personally disapprove
strongly of this, because of the increased complexity of the headers, but
the possibility exists.

Another possibility, to which I alluded in my original posting, is to use
the posted peephole optimizer to turn something like:

unsigned int alarm(seconds)
unsigned int seconds;
{
  return __alarm(seconds);
}

into (in alarm.s):

jmp ___alarm

The code generated by cc for alarm() and other front-end routines is very
easy to recognize, and the peephole optimizer would have no trouble making
such a substitution.

A third possibility that springs instantly to mind is the abolition
of callm1() and callm3() in favour of macros.  They have to go anyway,
because of their namespace pollution (along with len() and send() and ...).
That compensates for the level added by the ANSIfication.

>One potential problem exists: 
>implementations may restrict the significance of an identifier to 6
>characters, and may ignore case distinctions. (ansi draft sec. 3.1)
>I don't know if this applies to minix, but if so, name clashes might
>occur by prepending the names with __
>(eg fputc and fputs could become __fput).

Granted.  This is already a problem with Minix, since asld only uses
eight characters.  My original proposal hid the __* names in a header,
with them replaced in the library code by macros like LIB_ALARM.  (Upon
further thought, I believe it would be better to use macros of the
form __ALARM for alarm()'s underware routine.)  Then the header writer
has to be cognizant of name space collisions, but the problem doesn't
arise elsewhere.

>Frans Meulenbroeks        (meulenbr@cst.prl.philips.nl)
>	Centre for Software Technology
>	( or try: ...!mcvax!phigate!prle!cst!meulenbr)

Norbert

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (01/03/90)

In article <7259@nigel.udel.EDU> Leisner.Henr@xerox.com (marty) writes:
>Norbert Schlenker says: {{
>
>
>----------------------------------------------------------------------------
>#include <assert.h>
>#include <stdio.h>
>
>abort()
>{
>  while (1)
>	fprintf(stderr, "Stupid jerk!\n");
>}
>
>main()
>{
>  assert(0==1);
>}
>
>----------------------------------------------------------------------------
>
>This is a valid ANSI C program which should terminate with a SIGABRT
>signal.
>Instead, the linker will use the program's definition of abort(), and this
>program will go into an infinite loop when run.  (Just as an aside, this is
>a useful test program for many compilers that claim ANSI compatibility.)
>}}
>
>What is the compiler supposed to do with the program?  Generate a warning?
>Generate a fatal error?  Its valid C code.  Why should the compiler (for
>the language) know anything about the library functions (this ain't PL/1).

The compiler should compile the program without error or warning, because
the program has no errors in it.  The point is that the assert() macro has
a guaranteed effect (according to ANSI), and the last part of that effect
is to terminate the program by calling the abort() function.  The abort()
function cited by ANSI is that of the ANSI standard, not a programmer
supplied one (i.e. ANSI expects the program to die after assert(0==1)).
A C compiler/library/linker combination that can't do this is BROKEN.

>I quickly glanced through the May, 88 dpans and found nothing to confirm or
>deny it.  Anyone have a reference?

From the Dec'88 draft:

  Section 4.2.1.1 - The assert macro
	...
	When it is executed, if expression is false, the assert macro writes
	information about the particular call that failed ...  It then calls
	the abort function.
	...
	Forward references: the abort function (4.10.4.1).

>While having no proof one way or another, I maintain   redefining library
>functions is within the spirit of C.
  But that was before ANSI.

>					  I often do it since the library
>version doesn't do what I want or has problems or whatever.  Recompiling
>the source code with magic macros to use different functions (i.e. cc
>-Dabort=myabort) is insufficient -- what if I want to library functions to
>use my version ?
  But that was before ANSI.

>A common mistake by beginning C programmers is to use library functions
>names for their own functions and then watch the chaos that follows (read
>and write are good examples, especially when the program does STDIO) .
  This is a common mistake, and the chaos is real.  This is a side effect
  of the Unix philosophy best summarized as "Manual! What manual?  This is
  Unix, son, you just gotta know!" (cribbed from a Usenet signature).  It
  sure is fun to watch someone make this mistake - it's no fun to do it.

>Part of the spirit of C I enjoy so much is the assumption the programmer
>knows what he's doing (maybe he wants to redefine write() -- i.e. perhaps
>do some special profiling and then generate the syscall).
  Ah, but you are guilty of reminiscing now.  That's the spirit of K&R -
  C is just a high level assembler and if you want to do something, you
  can do it.  But ANSI C is a different language - it was defined partly
  to guarantee that even schmucks could write programs that work.

  If you want to redefine write(), it's easy to do in Minix since you have
  the source code.  It's easy to do in most Unix systems today, since they
  aren't ANSI aware.  It will even be easy to do in the ANSIfied world.
  But it won't be easy to make an ANSI fwrite() use your rewritten write()
  then.  And that's as it should be.

  Consider, for example, the code for a remarkable new program that dials
  random bank computers and has them send you money.  It is obvious that
  everyone should have this program.  You NEED this program.  It is written
  in C for a WHIZZBANG Model 67C/X and works fine there.  It uses <stdio.h>.
  It also has an internal routine named write(), which is just fine since
  a German wrote the WHIZZBANG's operating system, and the system call
  underlying fwrite() isn't write() but schreib() instead.  Write(),
  regrettably, dials the FBI and informs them of the scam.  Now don't you
  wish you had an ANSI conforming library?

>One more note:
>Does the Minix compiling system properly understand  public/private?  I
>understood at one time static really didn't do anything regarding
>name-space pollution.

  Not as far as I know.  This is a sore point with me (and I assume many
  others).  Eventually, there has got to be another ACK compiler that
  fixes this very significant defect.

>marty

Norbert

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (01/03/90)

In article <22702@princeton.Princeton.EDU> nfs@notecnirp.UUCP (Norbert Schlenker) writes:
>    (regarding the utility of static with the ACK C compiler)
>  Not as far as I know.  This is a sore point with me (and I assume many
>  others).  Eventually, there has got to be another ACK compiler that
>  fixes this very significant defect.

This just goes to show you that a little testing is always worthwhile.
(At least it saves you from having to extricate a foot from your mouth.)
I tried a static function ... lo and behold, it worked just fine.  When
was this fixed?

Norbert

tim@nucleus.amd.com (Tim Olson) (01/03/90)

In article <22702@princeton.Princeton.EDU> nfs@notecnirp.UUCP (Norbert Schlenker) writes:
| In article <7259@nigel.udel.EDU> Leisner.Henr@xerox.com (marty) writes:
| >Norbert Schlenker says: {{
| >
| >
| >----------------------------------------------------------------------------
| >#include <assert.h>
| >#include <stdio.h>
| >
| >abort()
| >{
| >  while (1)
| >	fprintf(stderr, "Stupid jerk!\n");
| >}
| >
| >main()
| >{
| >  assert(0==1);
| >}
| >
| >----------------------------------------------------------------------------
| >
| >This is a valid ANSI C program which should terminate with a SIGABRT
| >signal.

No, it isn't valid.

| >Instead, the linker will use the program's definition of abort(), and this
| >program will go into an infinite loop when run.  (Just as an aside, this is
| >a useful test program for many compilers that claim ANSI compatibility.)

This is legal, since the operation of the program is undefined.

| >What is the compiler supposed to do with the program?  Generate a warning?
| >Generate a fatal error?  Its valid C code.  Why should the compiler (for
| >the language) know anything about the library functions (this ain't PL/1).
| 
| The compiler should compile the program without error or warning, because
| the program has no errors in it.

Yes, it does have an error (see below):

| The point is that the assert() macro has
| a guaranteed effect (according to ANSI), and the last part of that effect
| is to terminate the program by calling the abort() function.  The abort()
| function cited by ANSI is that of the ANSI standard, not a programmer
| supplied one (i.e. ANSI expects the program to die after assert(0==1)).
| A C compiler/library/linker combination that can't do this is BROKEN.

This is not correct.  The program is erroneous because it defines an
identifier with the same name as an identifier that is reserved
(namely abort).  The behavior of the program is undefined.  See
Section 4.1.2.1 (Reserved Identifiers).

	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (01/04/90)

In article <28591@amdcad.AMD.COM> tim@amd.com (Tim Olson) writes:
| After a lengthy quote from <22702@princeton.Princeton.EDU> and
|			     <7259@nigel.udel.EDU>, parts of which follow:
|| >This is a valid ANSI C program which should terminate with a SIGABRT
|| >signal.
|
|No, it isn't valid.
|| The compiler should compile the program without error or warning, because
|| the program has no errors in it.
|
|Yes, it does have an error (see below):
|
|| The point is that the assert() macro has
|| a guaranteed effect (according to ANSI), and the last part of that effect
|| is to terminate the program by calling the abort() function.  The abort()
|| function cited by ANSI is that of the ANSI standard, not a programmer
|| supplied one (i.e. ANSI expects the program to die after assert(0==1)).
|| A C compiler/library/linker combination that can't do this is BROKEN.
|
|This is not correct.  The program is erroneous because it defines an
|identifier with the same name as an identifier that is reserved
|(namely abort).  The behavior of the program is undefined.  See
|Section 4.1.2.1 (Reserved Identifiers).
|
|	-- Tim Olson

Quite right - thanks for pointing this out.  So __assert() can be left
as originally posted by Andy Tanenbaum.  Mea culpa ...

There is still a problem with a great deal of library code, however.
None of the Minix (Unix) system calls are currently valid for use by
the library.  Although abort() is reserved by ANSI, write() [for example]
is not.  So an ANSI fwrite() cannot use a POSIX write() - it will have to
use some underlying function like __write().  And the bottom Minix level
(callm1/callm3/len/sendrec/begsig/etc.) certainly has to be fixed.

Norbert