[net.unix] gripes about error reporting

tanner@ki4pv.UUCP (Tanner Andrews) (05/10/86)

Well, there are a few other things that also generate fairly standard
signals, and which are not strictly pdp-11 problems.  I am in general
avoiding the old pdp names for them:
	HANG_UP		loss of carrier
	USER_INT	^C (or whatever)
	CORE_DMP	^\ (or whatever)
	PLEASE_DIE	kill proc
	DIE_YGS_PIG	kill -9 proc (YGS --> you gravy sucking)
	TIME_OUT	alarm() can cause this

I admit that most of these aren't hardware faults, but they certainly
are signals that need to be included in most every eunuchs
implementation.
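
For readers more used to the conventional names, the list above lines up
with the usual <signal.h> constants roughly as follows.  This is only a
sketch: the struct and the signo_for() helper are hypothetical names made
up for illustration, not part of any system.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table pairing the informal names with <signal.h> constants. */
struct signame {
    const char *name;   /* informal name from the list above */
    int         signo;  /* conventional signal number */
    const char *cause;  /* what usually generates it */
};

static const struct signame signames[] = {
    { "HANG_UP",     SIGHUP,  "loss of carrier" },
    { "USER_INT",    SIGINT,  "^C (or whatever)" },
    { "CORE_DMP",    SIGQUIT, "^\\ (or whatever)" },
    { "PLEASE_DIE",  SIGTERM, "kill proc" },
    { "DIE_YGS_PIG", SIGKILL, "kill -9 proc" },
    { "TIME_OUT",    SIGALRM, "alarm() can cause this" },
};

/* Return the signal number for an informal name, or -1 if it is unknown. */
int signo_for(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof signames / sizeof signames[0]; i++)
        if (strcmp(signames[i].name, name) == 0)
            return signames[i].signo;
    return -1;
}
```

So signo_for("PLEASE_DIE") yields SIGTERM, the signal a plain kill sends by
default.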

-- 
<std dsclm, copies upon request>	   Tanner Andrews

chris@umcp-cs.UUCP (Chris Torek) (05/11/86)

In article <406@houligan.UUCP> daemon@houligan.UUCP writes:
>Another problem: Recently, I wrote a device driver for an exclusive-use
>device [...].  I finally settled on EBUSY [for an `in use' error],
>but if you run that through perror, you get "mount device busy"....

4.3BSD's perror says "Device busy".

>The second gripe involves utilities.  A number of them don't bother to do
>perror or anything like it; they just say "can't open file" or something
>like that.

I have a feeling this is caused by two things:  First, perror() is
not always sufficient.  Second, it is not as well known as it should
be, perhaps because of the first problem.  I have been pushing my
own `error' routine.  It seems to work pretty well.  There are a
few things I might change, but it currently works as follows:

    ERROR(3)            UNIX Programmer's Manual             ERROR(3)

    NAME
	 error - error message output routine

    SYNOPSIS
	 error(quit, syserr, format [, arg ] ...  )
	 int quit, syserr;
	 char *format;

    DESCRIPTION
	 Error prints an error message on the standard error output,
	 just as fprintf(stderr, format [, arg ] ...  ) would, but
	 preceded by the program's name, and followed by the error
	 message that perror(3) would print if the system error
	 number syserr were in the global variable errno.  After
	 printing a terminating newline, if quit is nonzero, error
	 performs an exit(quit).

	 If syserr is zero, the system error message is suppressed;
	 error may thus be used for user errors.

	 Examples
	 To abort a program if a file cannot be opened for reading:

	      extern int errno;
	       ...
	      if ((fp = fopen(filename, "r")) == NULL)
		      error(1, errno, "can't open %s for reading", filename);

	 To abort after a user error:

	      error(1, 0, "usage: foo bar");

	 To gripe about a system error, without exiting:

	      extern int errno;
	       ...
	      if (unlink(tempfile))
		      error(0, errno, "warning: couldn't remove %s", tempfile);

    AUTHOR
	 Chris Torek

    SEE ALSO
	 printf(3S), perror(3), exit(3)

    BUGS

    Printed 5/10/86           U of MD Local                         1

The changes I might make involve the following:

	- error() cannot print to stdout (perhaps it should take
	  a `FILE *' or a file descriptor); and

	- error() cannot get the system error number itself (this
	  is a minor nuisance); and

	- the name `error' may be too presumptuous.

However, it is refreshing to be able to go through my old code
and replace umpteen occurrences of

	if ((fd = open(file, mode)) < 0) {
		(void) fprintf(stderr, "myprog: cannot open ");
		perror(arg);
		exit(1);
	}

with

	if ((fd = open(file, mode)) < 0)
		error(1, errno, "cannot open %s", arg);
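
For anyone who wants to try the interface, here is a minimal sketch of a
routine matching the manual page above.  It is not the actual source.  It
assumes a C compiler with <stdarg.h>, a global named progname holding the
program's name, and ": " as the separator before the system message (the
manual page does not specify either).  The names xerror and xerror_to are
hypothetical, chosen to avoid clashing with any existing error(); the
xerror_to variant takes a `FILE *' so that output is not tied to stderr.

```c
#include <assert.h>
#include <errno.h>   /* for callers that pass errno as syserr */
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the program's name lives in a global set early in main(). */
static const char *progname = "myprog";

/* Shared worker: write "progname: message[: system error]\n" to out. */
static void verror_to(FILE *out, int syserr, const char *format, va_list ap)
{
    assert(out != NULL && format != NULL);
    fprintf(out, "%s: ", progname);
    vfprintf(out, format, ap);
    if (syserr != 0)                 /* zero suppresses the system text */
        fprintf(out, ": %s", strerror(syserr));
    fputc('\n', out);
}

/* Same contract as the manual page: report on stderr, exit if quit != 0. */
void xerror(int quit, int syserr, const char *format, ...)
{
    va_list ap;

    va_start(ap, format);
    verror_to(stderr, syserr, format, ap);
    va_end(ap);
    if (quit != 0)
        exit(quit);
}

/* Variant that takes the output stream as its first argument. */
void xerror_to(FILE *out, int quit, int syserr, const char *format, ...)
{
    va_list ap;

    va_start(ap, format);
    verror_to(out, syserr, format, ap);
    va_end(ap);
    if (quit != 0)
        exit(quit);
}
```

With this, xerror(1, errno, "cannot open %s", arg) behaves like the
two-line replacement shown above, and xerror_to(stdout, 0, 0, "...") sends a
plain message to stdout without exiting.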
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

phaedrus@eneevax.UUCP (Praveen Kumar) (05/11/86)

I use Chris Torek's error routine all the time in my code.  Let me
tell you, it is great.  It makes writing error handling code a hell
of a lot easier.  It also makes the code easier to read and the
routine works on all Un*x systems that I have worked on...4.x, SysV,
and SysIII.

praveen

-- 
"This ain't my goddamn planet, understand monkey-boy!"

phaedrus@eneevax.umd.edu or {seismo,allegra}!umcp-cs!eneevax!phaedrus

henry@utzoo.UUCP (Henry Spencer) (05/14/86)

While not wanting to run down what Chris Torek has done, or discourage
people from using it (*any* serious effort to do good error-handling is
a win over most existing Unix programs!), it is most unfortunate that
it uses the same name as the error() routine in Kernighan&Pike, but with
a different and incompatible calling sequence and semantics.
-- 
Join STRAW: the Society To	Henry Spencer @ U of Toronto Zoology
Revile Ada Wholeheartedly	{allegra,ihnp4,decvax,pyramid}!utzoo!henry

jso@edison.UUCP (John Owens) (05/14/86)

> When was the last time
> you got an "IOT trap - core dumped" on, say, your NCR machine?  I got one
> on my Gould PN9000 recently, and I still haven't figured out where it came
> from.

It's rather obscure, but the abort() library function causes a process
to send a SIGIOT to itself....  I believe this was chosen because it
was very rare to get an IOT in a normal user-mode process; you almost
had to be writing in assembler.
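
This is easy to check on a POSIX system.  The sketch below forks a child
that calls abort() and reports which signal killed it.  It assumes fork(),
waitpid() and setrlimit() are available; the function name
signal_from_abort is made up for the example.  The C standard and POSIX
call the signal SIGABRT, and many systems keep SIGIOT as another name for
the same number.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls abort(); return the signal that killed it,
 * or -1 if the fork failed or the child did not die from a signal. */
int signal_from_abort(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };

        setrlimit(RLIMIT_CORE, &no_core);  /* do not leave a core file */
        abort();                           /* child raises the signal */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The return value can be compared against SIGABRT (or against SIGIOT where
that name is defined).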

	John Owens @ General Electric Company	(+1 804 978 5726)
	edison!jso%virginia@CSNet-Relay.ARPA		[old arpa]
	edison!jso@virginia.EDU				[w/ nameservers]
	jso@edison.UUCP					[w/ uucp domains]
	{cbosgd allegra ncsu xanth}!uvacs!edison!jso	[roll your own]

phaedrus@eneevax.UUCP (Praveen Kumar) (05/16/86)

>From: henry@utzoo.UUCP (Henry Spencer)

>While not wanting to run down what Chris Torek has done, or discourage
>people from using it (*any* serious effort to do good error-handling is
>a win over most existing Unix programs!), it is most unfortunate that
>it uses the same name as the error() routine in Kernighan&Pike, but with
>a different and incompatible calling sequence and semantics.

Agreed, but Chris's code is better.  I mean we can't keep saying that
K&R are the Holy Stone Tablets anymore.  They don't even have typedefs
or enums in there.
-- 
"This ain't my goddamn planet, understand monkey-boy!"

phaedrus@eneevax.umd.edu or {seismo,allegra}!umcp-cs!eneevax!phaedrus

john@polyof.UUCP ( John Buck ) (05/18/86)

> 
>     Is there anyone out there who is as frustrated as I am about the way
> that UN*X reports errors?  I'm getting pretty tired of trying to, say,
> .
> .
> .
> Stack overflow - obvious.
> Boundary condition ...
>
> Dave Cornutt
> Gould Computer Systems 

Seems to me that you are guilty of the same "crimes" you accuse the "pdp11"
folks of.  Several of your so-called "machine-independent" traps, like
Stack overflow and Boundary condition, are slanted toward the (clumsy)
Gould-like architecture.  Many machines know how to handle stack violations
correctly (by growing the stack; in fact, not growing the stack automatically
is probably a flaw in the implementation or architecture).
Imagine getting a stack overflow on a pdp11.  What does that mean (in a user
program, that is)?  Is it the same as a segmentation violation (stack collides
with data), or what?  Very unclear.

Boundary violations (doubleword, whatever) also fall into this category.
Some machines allow 'odd' addresses for words (Intel, for example), others
do not.  Hence, Boundary condition is not machine independent.

The bottom line is, it is very difficult to isolate the individual
flaws/features of every architecture and have accurate error reporting
under a single OS (that is portable).
It is not impossible, but difficult.  Software compatibility must be
maintained as well.  If the signal structure is changed for each machine,
this becomes difficult.

John Buck
Polytechnic Inst. of NY
Route 110
Farmingdale, NY 11735
trixie!polyof!john
iguana!polyof!john

aglew@ccvaxa.UUCP (05/20/86)

~> Chris Torek's error() routine

I've been pushing a similar version of this routine that I call errorf():

    ERRORF(3)            UNIX Programmer's Manual             ERRORF(3)

    NAME
	 errorf - error message output routine

    SYNOPSIS
	 errorf(quit, format [, arg ] ...  )
	 int quit;
	 char *format;

    DESCRIPTION
	...

Pretty much the same, except that I never printed errno and the error
message - I left it up to the user to do that.  I did, however, provide a
function errnostr(errno) that returned a (char *) message like perror(),
suitable for embedding: 
	
    if( read(fd,buf,len) < 0 ) 
	errorf(-1,"%s trouble reading",errnostr(errno));

I have also nearly always found it useful to make the command line
parameters of a program into globals. I usually have Argc and Argv globals
set up in crt0.

I think that a good standard for error messages is a worthwhile idea. I am
certainly willing to accept Chris's error function, if it seems that it will
be accepted. However, as Chris remarks, the name `error()' is unfortunately
confusing - I think `errorf()' is a bit less confusing, and is agreeably
similar to `printf'.  I would also be strongly against adding more
parameters, like a FILE * for output stream - the beauty of it is being
short and simple, not very distracting. Providing the errno as an argument
is probably good, although I will continue to use errnostr for other reasons.

As for output redirection, may I suggest that the thing to do is to provide
a global pointer to an output function of your choice? I used this to
provide an easy way of directing output to a full screen menu display - a
lot easier than setting up a stream that behaved correctly on full screen
displays.
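
A minimal sketch of how these pieces could fit together, under stated
assumptions: errnostr() is implemented here as a thin wrapper around
strerror(), the global hook is called errorf_output, and errorf_to_buffer
is a made-up example of an alternative output function.  None of these
details come from the real errorf source, and the program-name prefix is
left out to keep the sketch short.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Default sink: write the finished message to stderr. */
static void errorf_to_stderr(const char *msg)
{
    fputs(msg, stderr);
}

/* Global hook: point it at any function that accepts the final message. */
void (*errorf_output)(const char *msg) = errorf_to_stderr;

/* Example alternative sink: keep the most recent message in a buffer,
 * e.g. for a full-screen display to draw later. */
char last_message[512];

void errorf_to_buffer(const char *msg)
{
    strncpy(last_message, msg, sizeof last_message - 1);
    last_message[sizeof last_message - 1] = '\0';
}

/* Assumed implementation: the text perror() would print for errnum. */
const char *errnostr(int errnum)
{
    return strerror(errnum);
}

/* Format the message, append a newline, hand it to the hook,
 * and exit if quit is nonzero. */
void errorf(int quit, const char *format, ...)
{
    char msg[512];
    va_list ap;
    int n;

    va_start(ap, format);
    n = vsnprintf(msg, sizeof msg - 1, format, ap);  /* room for '\n' */
    va_end(ap);
    if (n < 0)
        n = 0;                                       /* formatting failed */
    if ((size_t)n > sizeof msg - 2)
        n = (int)(sizeof msg - 2);                   /* message truncated */
    msg[n] = '\n';
    msg[n + 1] = '\0';
    errorf_output(msg);
    if (quit != 0)
        exit(quit);
}
```

Setting errorf_output = errorf_to_buffer redirects every later errorf()
call without touching the call sites, which is the attraction of a hook
over an extra `FILE *' parameter.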

Andy "Krazy" Glew. Gould CSD-Urbana.    USEnet:  ihnp4!uiucdcs!ccvaxa!aglew
1101 E. University, Urbana, IL 61801    ARPAnet: aglew@gswd-vms

henry@utzoo.UUCP (Henry Spencer) (05/23/86)

> >... same name as the error() routine in Kernighan&Pike...
> 
> ... I mean we can't keep saying that
> K&R are the Holy Stone Tablets anymore.  They don't even have typedefs
> or enums in there.

True, but I wasn't referring to K&R.  K&P is much more recent and is
much copied.
-- 
Join STRAW: the Society To	Henry Spencer @ U of Toronto Zoology
Revile Ada Wholeheartedly	{allegra,ihnp4,decvax,pyramid}!utzoo!henry