[comp.lang.c] stdio error detection

ok@quintus.UUCP (Richard A. O'Keefe) (11/26/87)

I am trying to make my programs as bullet-proof as possible.
One thing I am worried about is fclose().
- Under what conditions (other than not-open file or invalid address &c)
  can this return an error result?
- How can I tell what went wrong?
- What, if anything, can I *do* about it?

Another thing is fopen().  How can I find out whether a NULL result means
- fopen can't obtain another file descriptor, or
- fopen can't malloc() another struct _iob, or
- fopen is not allowed to open the file
PORTABLY?  At the moment, if I really want to bullet-proof my code, I
am reduced to
- checking access() to see if the file exists
- malloc()ing a FILE block and freeing it to ensure that there is room
- then calling fopen()
which is so painful that I usually don't do it, and just say
	<actual program name>: cannot (read|write|append) <file name>

Yes, I have read the manual.  I have read the V7, 4.2BSD, V.2, and
SunOS 3.2 manual section 2 and 3 from divider to divider, also the
SVID and volume 1 of the SAS Lattice C manual for IBM MVS & VM/CMS
AND the October '86 draft of the ANSI C standard.

Someone in this newsgroup earlier suggested checking errno.  If I
remember correctly, his code had the form
	if (--stdio function fails--) { perror(--suitable argument--); ...
This doesn't seem like a good idea.  The effect of the stdio functions
on errno is totally undefined in all cases; errno is only defined when a
*system call* fails (not when it succeeds).  It would be really nice if
the stdio functions were defined to set errno (any ANSI C people care to
comment?). Of course there is no official errno code which means that
"malloc() ran out" -- another nasty gap.

chris@mimsy.UUCP (Chris Torek) (11/27/87)

In article <289@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A.
O'Keefe) writes:
<I am trying to make my programs as bullet-proof as possible.
<One thing I am worried about is fclose(). ... Another thing is fopen().
<... Someone in this newsgroup earlier suggested checking errno.  If I
<remember correctly, his code had the form
<	if (--stdio function fails--) { perror(--suitable argument--); ...
<This doesn't seem like a good idea.  The effect of the stdio functions
<on errno is totally undefined in all cases; errno is only defined when a
<*system call* fails (not when it succeeds).

Yes; this is a problem.  The 4.3BSD fopen routine sets errno to
ENOMEM or EMFILE when it runs out of `FILE's.  In general, though,
there is no way to tell why a library routine failed and what can
be done to fix it.

I dislike the errno mechanism, but given its existence, I think that
library routines ought to set it on errors.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@brl-smoke.ARPA (Doug Gwyn ) (11/27/87)

In article <289@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>- Under what conditions (other than not-open file or invalid address &c)
>  can this return an error result?

Assuming no bugs in your code, fclose() can fail only if the stream had some
unwritten buffered data and an I/O error occurred while writing it out.

>- How can I tell what went wrong?

Clear errno before invoking fclose(), then if it fails, examine errno.

>- What, if anything, can I *do* about it?

Probably not much.  A disk filled up or went off-line or something like that.

>Another thing is fopen().  How can I find out whether a NULL result means
>...

Again, the simplest thing is to let errno tell you.

>which is so painful that I usually don't do it, and just say
>	<actual program name>: cannot (read|write|append) <file name>

That is the traditional way to report the failure.  Finer discrimination
between causes of stdio errors is, as you remark, a pain.  There is also
no standardization for low-level error causes outside the UNIX environment,
so a fully portable approach to reporting details of errors is impossible.

>	if (--stdio function fails--) { perror(--suitable argument--); ...
>This doesn't seem like a good idea.

True in general, but if you assume that the stdio function failed because
some system call failed, it is fairly reliable.

>It would be really nice if
>the stdio functions were defined to set errno (any ANSI C people care to
>comment?).

Set errno to what?  There is no way the C standards committee is going to
attempt to identify all OS-dependent low-level error causes and provide
standard encodings for all of them!

> Of course there is no official errno code which means that
>"malloc() ran out" -- another nasty gap.

In a UNIX environment, ENOMEM (or, less likely, EAGAIN) are the errno
codes for inability to obtain more memory for the process.  If malloc()
cannot obtain enough memory, one of these codes will be set in errno.
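
A minimal sketch of the "clear errno, call, then examine errno" approach described above, written in present-day C. The names open_or_report and close_or_report are hypothetical, and the code assumes that the library leaves a meaningful value in errno when fopen() or fclose() fails, which POSIX systems do but ISO C does not promise. If errno is still zero after the failure, it falls back to the traditional bare message.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Open a file; on failure print a diagnostic to stderr and return NULL.
 * errno is cleared first so that a stale value left by some earlier
 * failure is not mistaken for the cause of this one.
 * Assumption: the library sets errno when fopen() fails.
 */
FILE *open_or_report(const char *prog, const char *path, const char *mode)
{
    FILE *fp;

    errno = 0;
    fp = fopen(path, mode);
    if (fp == NULL) {
        if (errno != 0)
            fprintf(stderr, "%s: cannot open %s: %s\n",
                    prog, path, strerror(errno));
        else
            fprintf(stderr, "%s: cannot open %s\n", prog, path);
    }
    return fp;
}

/*
 * Close a stream; on failure print a diagnostic and return -1.
 * The same assumption about errno applies to fclose().
 */
int close_or_report(const char *prog, const char *path, FILE *fp)
{
    errno = 0;
    if (fclose(fp) != 0) {
        if (errno != 0)
            fprintf(stderr, "%s: error closing %s: %s\n",
                    prog, path, strerror(errno));
        else
            fprintf(stderr, "%s: error closing %s\n", prog, path);
        return -1;
    }
    return 0;
}
```

The stream must not be used again after close_or_report() returns, whether or not it reported an error.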

ok@quintus.UUCP (Richard A. O'Keefe) (11/28/87)

In article <6748@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn)
replied to my question about stdio error handling.
One of the things he said was
> Assuming no bugs in your code, fclose() can fail only if the stream had some
> unwritten buffered data and an I/O error occurred while writing it out.
and I'm very pleased to see this, because I'd never thought of that.
But when he says
> Again, the simplest thing is to let errno tell you.
I was disappointed, because the whole point of my message is that it
DOESN'T.  He says of this use of errno:
> True in general, but if you assume that the stdio function failed because
> some system call failed, it is fairly reliable.
No way is it reliable.  (a) I have no way of telling whether a stdio
function failed because some system call failed or for some other
reason -- that is what I was ASKING for!  (b) There is nothing to stop
a valid implementation of stdio calling some other system call after
the failed one, and that undefines errno even if the later system call
succeeds.
> In a UNIX environment, ENOMEM (or, less likely, EAGAIN) are the errno
> codes for inability to obtain more memory for the process.  If malloc()
> cannot obtain enough memory, one of these codes will be set in errno.
In SunOS 3.2 the malloc() family is explicitly defined to set errno, and
the different cases are clearly distinguished.  However, this is NOT in
the SVID, and it is NOT in any System V manual I've checked.  POSIX
(the IEEE 1003.1 standard) inherits malloc() from the C standard, which
is silent on this point.  (It says that some functions *must* set errno,
and that some functions *needn't*, but is silent about most and never
says *which* value errno is set to.)

Here at Quintus, I don't have to worry about porting my code to MS-DOS,
but in general I *do* have to worry about 4.2, 4.3, V.2, V.3, VMS, and
SAS Lattice C for MVS and VM/CMS.  Modulo the choice of setbuf, setvbuf,
or setbuffer, all these implementations of stdio look pretty close, and
I was hoping that maybe they had more in common than the manuals said.
One of the things that bothers me about fclose() is this:
	suppose f is a valid pointer to an open output stdio stream,
	but that fclose(f) returns EOF -- perhaps because of a write
	error when flushing remaining buffered output, or perhaps
	because of an unlucky interrupt (check man 2 close).
	IS f CLOSED?
When this is inside a loop processing a couple of hundred files, one
after the other, if f is *not* closed I can run out of streams despite
having taken care to close everything as soon as possible.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (11/28/87)

In article <290@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
- In article <6748@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn) ... said:
- > ... if you assume that the stdio function failed because
- > some system call failed, it is fairly reliable.
- No way is it reliable.  (a) I have no way of telling whether a stdio
- function failed because some system call failed or for some other
- reason -- that is what I was ASKING for!  (b) There is nothing to stop
- a valid implementation of stdio calling some other system call after
- the failed one, and that undefines errno even if the later system call
- succeeds.

With the exception of fopen() failing because there are no statically-
allocated FILE structures left, and ungetc() beyond the available space
for pushback, (again assuming no bugs in the application code) the way
that stdio functions fail is for some system call to fail.  A system
call failure sets errno and practically nothing in the C library clears
errno, so it is "sticky".  Errno can be relied on to indicate the reason
for the most recent system call (or math routine) failure.  In the case
of stdio, this will be the low-level reason the stdio function failed.

- > In a UNIX environment, ENOMEM (or, less likely, EAGAIN) are the errno
- > codes for inability to obtain more memory for the process.  If malloc()
- > cannot obtain enough memory, one of these codes will be set in errno.
- In SunOS 3.2 the malloc() family is explicitly defined to set errno, and
- the different cases are clearly distinguished.  However, this is NOT in
- the SVID, and it is NOT in any System V manual I've checked.  POSIX
- (the IEEE 1003.1 standard) inherits malloc() from the C standard, which
- is silent on this point.

UNIX implementations of malloc() can fail to obtain needed space only if
an attempted sbrk() system call fails, and sbrk() failure sets one of the
errno values I indicated (BSD systems may have slightly different values).

Again, the key is to realize that eventually these functions make one
or more system calls in the process of servicing the request, and the
reason for failure will be recorded in errno, even though this isn't
advertised by the official interface definition.

- One of the things that bothers me about fclose() is this:
- 	suppose f is a valid pointer to an open output stdio stream,
- 	but that fclose(f) returns EOF -- perhaps because of a write
- 	error when flushing remaining buffered output, or perhaps
- 	because of an unlucky interrupt (check man 2 close).
- 	IS f CLOSED?

That depends on the implementation.  A quality implementation will make
sure that the OS channel (file descriptor, on UNIX) is deallocated by
fclose() no matter what happens during the final write().

atbowler@orchid.waterloo.edu (Alan T. Bowler [SDG]) (11/30/87)

Complete error handling is intrinsically OS and machine dependent.
If you are worried about a bullet proof portable solution, then
you define your own cover functions for fopen(), fclose() etc
with the semantics you want.  Typically, these will be 5-20 lines of code.
Then as part of the porting exercise for the program, you code
implementation-specific versions of the functions, making use of the
implementation documentation of how these things can fail, and how
you detect this.
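
As an illustration of such cover functions, here is a sketch of one possible pair for a POSIX-like system. The names xfopen, xfclose and enum xio_status are hypothetical, and the mapping from errno values is an assumption about the host; a port to another system would keep the interface and replace the function bodies.

```c
#include <errno.h>
#include <stdio.h>

/* Failure classes chosen by the program, not by the host system. */
enum xio_status {
    XIO_OK,
    XIO_NOT_FOUND,
    XIO_DENIED,
    XIO_NO_RESOURCES,
    XIO_OTHER
};

/*
 * Cover function for fopen().  This body is the POSIX flavour: it
 * assumes fopen() leaves the cause of the failure in errno.
 */
FILE *xfopen(const char *path, const char *mode, enum xio_status *status)
{
    FILE *fp;

    errno = 0;
    fp = fopen(path, mode);
    if (fp != NULL) {
        *status = XIO_OK;
        return fp;
    }
    switch (errno) {
    case ENOENT: case ENOTDIR:
        *status = XIO_NOT_FOUND;
        break;
    case EACCES: case EPERM: case EROFS:
        *status = XIO_DENIED;
        break;
    case EMFILE: case ENFILE: case ENOMEM:
        *status = XIO_NO_RESOURCES;
        break;
    default:
        *status = XIO_OTHER;
        break;
    }
    return NULL;
}

/* Cover function for fclose(): reports only success or failure. */
enum xio_status xfclose(FILE *fp)
{
    return fclose(fp) == 0 ? XIO_OK : XIO_OTHER;
}
```

The rest of the program tests only the xio_status values, so the system-dependent knowledge stays inside these two functions.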

markb@bucc2.UUCP (12/01/87)

/* Written  9:47 pm  Nov 26, 1987 by brl-smoke.ARPA!gwyn in bucc2:comp.lang.c */
/* End of text from bucc2:comp.lang.c */

richard@aiva.ed.ac.uk (Richard Tobin) (12/01/87)

In article <290@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe)
points out some problems in determining what went wrong if a standard io
call failed.

>One of the things that bothers me about fclose() is this:
>	suppose f is a valid pointer to an open output stdio stream,
>	but that fclose(f) returns EOF -- perhaps because of a write
>	error when flushing remaining buffered output, or perhaps
>	because of an unlucky interrupt (check man 2 close).
>	IS f CLOSED?
>When this is inside a loop processing a couple of hundred files, one
>after the other, if f is *not* closed I can run out of streams despite
>having taken care to close everything as soon as possible.

On a unix system (I don't know about others) you could use open()
followed by fdopen() instead of fopen().  Then you'd know the underlying
file descriptor, so that you could call close() after fclose() just
to be sure.  (There are also other unportable ways to get the file
descriptor associated with a stream.)  That will stop you running out
of file descriptors, at least.

-- 
Richard Tobin,                         JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,             ARPA:  R.Tobin%uk.ac.ed@nss.cs.ucl.ac.uk
Edinburgh University.                  UUCP:  ...!ukc!ed.ac.uk!R.Tobin
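
A sketch of the open()/fdopen() idea above, for POSIX systems only. The struct and function names are hypothetical. The extra close() is attempted only when fclose() reports failure; if the descriptor was already released it simply fails. In a multi-threaded program that second close() could hit a descriptor number that another thread has since reused, so treat this as an illustration of the suggestion rather than a recommendation.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* A stdio stream together with the descriptor it was built on. */
struct tracked_stream {
    FILE *fp;
    int fd;
};

/*
 * Open path for writing with open() + fdopen() so that the underlying
 * descriptor is known.  Returns 0 on success, -1 on failure.
 */
int tracked_open_write(struct tracked_stream *ts, const char *path)
{
    ts->fp = NULL;
    ts->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (ts->fd < 0)
        return -1;
    ts->fp = fdopen(ts->fd, "w");
    if (ts->fp == NULL) {
        (void) close(ts->fd);   /* no stream was built; release the fd */
        ts->fd = -1;
        return -1;
    }
    return 0;
}

/*
 * Close the stream.  If fclose() reports an error, also close the
 * descriptor directly in case the library left it open.
 * Returns the result of fclose().
 */
int tracked_close(struct tracked_stream *ts)
{
    int rc = fclose(ts->fp);

    if (rc != 0)
        (void) close(ts->fd);   /* fails harmlessly if already closed */
    ts->fp = NULL;
    ts->fd = -1;
    return rc;
}
```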

henry@utzoo.UUCP (Henry Spencer) (12/02/87)

> One thing I am worried about is fclose().
> - Under what conditions (other than not-open file or invalid address &c)
>   can this return an error result?

As others have mentioned, the buffer flush that often accompanies fclose()
can produce an I/O error.  Not common.  However, checking for it is *vital*
on systems with disk quotas, because that flush may blow the quota and fail.
If you don't check the fclose() result, you get a silently truncated file.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry
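
A small sketch of the check described above. The function name save_buffer is hypothetical. It reports success only if every byte was accepted by fwrite() and the final flush inside fclose() also succeeded, so a quota or disk-full failure at close time is not silently ignored.

```c
#include <stdio.h>

/*
 * Write len bytes from data to the file named path.
 * Returns 0 only if the whole buffer was written and fclose()
 * succeeded; returns -1 otherwise, in which case the file may be
 * missing or truncated.
 */
int save_buffer(const char *path, const char *data, size_t len)
{
    FILE *fp = fopen(path, "w");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fwrite(data, 1, len, fp) == len);
    if (fclose(fp) != 0)        /* the buffered data is flushed here */
        ok = 0;
    return ok ? 0 : -1;
}
```

With a small buffer, fwrite() normally just copies into the stdio buffer and succeeds; the write to the device happens inside fclose(), which is why its result matters.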

ok@quintus.UUCP (Richard A. O'Keefe) (12/03/87)

If you're interested in fclose(), here is the conclusion so far
from mail several people were kind enough to send me:
    -	because of buffered output,
	anything that can go wrong with fputc() can go wrong with fclose().
    -	even if you have successfully fflush()ed just before fclose()ing,
	some sort of device control error might happen
    -	when an error occurs,
	everyone thinks the stdio stream ought to be closed,
	but there is no guarantee that it is.
    -	in EUUG V7 UNIX, fclose() called fflush(), close(), and free()
	in that order.  An error might be signalled because of any of
	them, so errno could be anything, but at least in that version
	of UNIX fclose() always freed up the stdio stream.
My conclusion is that I ought to check for an error return from fclose(),
because there *might* be lost data.  In fact, I ought to fclose() or
fflush() stdout before exiting, to be sure I don't miss lost data.  But
there is nothing I can do to tell the difference between lost data and
a device that won't close.  At least with respect to UNIX, I'm not
worrying about lost streams any more.

I have definitely learned something from this:  it had never occurred
to me before that my programs ought to end with something like
	if (fflush(stdout))
	    error_exit("probable data loss from stdout");
Ouch.
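
The fragment above can be fleshed out as follows. This is only a sketch: finish_stdout is a hypothetical name standing in for whatever error_exit() would do, and it returns a status instead of exiting so that the caller decides what happens next. It also consults ferror() so that a write error recorded earlier in the run is not missed.

```c
#include <stdio.h>

/*
 * Flush stdout and report whether all buffered output appears to
 * have reached its destination.
 * Returns 0 if so, -1 if data was probably lost.
 */
int finish_stdout(const char *prog)
{
    if (fflush(stdout) != 0 || ferror(stdout)) {
        fprintf(stderr, "%s: probable data loss from stdout\n", prog);
        return -1;
    }
    return 0;
}
```

A program would call it as the last thing in main(), for example `return finish_stdout(argv[0]) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;` with <stdlib.h> included.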

If you're interested in a rambling discussion of error codes,
continue, otherwise stop now.

In article <289@cresswell.quintus.UUCP> I said
> It would be really nice if the stdio functions were defined to set
> errno (any ANSI C people care to comment?).
In article <13100001@bucc2>, markb@bucc2.UUCP replied:
> Set errno to what?  There is no way the C standards committee is going to
> attempt to identify all OS-dependent low-level error causes and provide
> standard encodings for all of them!

This is a red herring:  maybe the latest draft is radically different,
but in the October '86 draft of the ANSI C standard *NO* values of
errno are defined at all.  The standard *does* say that certain
specified functions set errno to indicate an error, but says nothing
about what they might set it TO.  All I was suggesting here was that
ANSI C should do the same sort of thing for fclose() that it already
does for signal().  Here is what the text says:
	Otherwise, a value of SIG_ERR is returned, and
	>> the integer expression errno is set to indicate the error. <<

The October 86 draft says of fclose:
	The fclose function returns zero if the stream was successfully
	closed or nonzero if any errors were detected or if the stream
	was already closed.
All I'm asking for here is for something like
	If the stream was successfully closed, the fclose function
	returns zero.
  NEW	If any errors were detected or if the stream was already
  NEW   closed, the fclose function returns nonzero, and
  NEW	the integer expression errno is set to indicate the error.

Several people told me that all my problems could be solved by
setting errno to 0 beforehand and checking it afterwards.  This is
not defined to work for ANYTHING, even a straight system call.  A
successful system call is entitled to set errno to anything it pleases.

Even supposing an implementation of stdio uses straight V7 system
calls and nothing else and that those system calls do not change
errno when they are successful, there is no reason to expect the
last unsuccessful system call done by an implementation of some
stdio routine to be the one which is responsible for the failure.
Suppose an implementation of fopen(3s), having found that open(2)
failed, tried to call stat(2) in an attempt to diagnose the error
for me, and that while executing stat(2) an I/O error occurred.
Then errno would be set to reflect the incomplete status of some
operation I neither know nor care about, AND THAT WOULD BE LEGAL.

How difficult could it be for a C implementor to ensure that the
value given to errno reflected the error responsible for the failure
of the stdio operation?  [Strictly speaking it is impossible, but
that's true of everything, not just stdio.  See signal().]  Quite
easy:  just put errno in a safe place and move it back just before
returning the failure code.  Yes, the actual numbers would be
implementation defined, but ALL errno numbers are implementation
defined.  That's a great deal better than not having any errno at all.
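
The "safe place" technique is the same one an application can use in its own routines. A sketch, with the hypothetical name open_pair and assuming a POSIX-like fopen() that sets errno on failure: when the second open fails, the first stream has to be closed again, and that cleanup must not be allowed to overwrite the errno value that explains the failure.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Open an input file and an output file as a pair.
 * Returns 0 on success.  On failure returns -1 with both streams
 * set to NULL and errno describing the open that failed.
 */
int open_pair(const char *in_path, const char *out_path,
              FILE **in, FILE **out)
{
    int saved;

    *out = NULL;
    *in = fopen(in_path, "r");
    if (*in == NULL)
        return -1;              /* errno is as fopen() left it */

    *out = fopen(out_path, "w");
    if (*out == NULL) {
        saved = errno;          /* put errno in a safe place */
        (void) fclose(*in);     /* cleanup, which may change errno */
        *in = NULL;
        errno = saved;          /* move it back before returning */
        return -1;
    }
    return 0;
}
```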

To repeat markb's comment:
> Set errno to what?  There is no way the C standards committee is going to
> attempt to identify all OS-dependent low-level error causes and provide
> standard encodings for all of them!
Why not?  The COBOL committee made the attempt (:-).
Seriously, provided that host-specific details are accessible some other
way (which might as well be host-specific itself), there aren't all that
many different cases.  With a little bit of mental effort, it is easy to
find error classes which are not just host-independent, but aren't even
tied to files as such.  Here are some, with UNIX examples.

E_DEADLY_PARAMETERS
    The parameters are so scrambled we nearly died trying to read them
	EFAULT	(typically means wild address, e.g. 0)

E_INVALID_PARAMETERS
    We could find the parameters, but they didn't make sense
	EINVAL	(file name syntax error)

E_NO_SUCH_OBJECT
    We found the parameters and they made sense but there is no such object
	ENOENT	(missing file or directory)

E_OBJECT_ALREADY_EXISTS
    We found the parameters and they made sense but there already is
    an object of that sort so you can't create one
	EEXIST, EADDRINUSE

E_WRONG_TYPE
    We found the object ok, but it is not the right type of object
	ENOTDIR, EISDIR, ENOTTY, ESPIPE, EPROTOTYPE, ENOTSOCK

E_NOT_ALLOWED
    We found the object ok, but you aren't allowed to do that operation
	EPERM	(no write permission), EROFS, EACCES

E_BUSY_TRY_LATER
    We found the object ok, and you can do that, but it's busy with
    another caller right now.  Try again later.
	EBUSY, ETXTBSY, EALREADY, EWOULDBLOCK, EOPNOTSUPP

E_WRONG_STATE
    We found the object ok, and you might be able to do that, but you
    have the object in the wrong state just now.
	EISCONN, ENOTCONN

E_RESOURCE_EXCEEDED
    We found the object and you can do that, but you ran out of X
	(processes: EAGAIN, memory: ENOMEM, per process file
	table: EMFILE, system file table: ENFILE, link count field:
	EMLINK, file size: EFBIG, socket buffers: ENOBUFS, &c)

E_OPERATION_FAILED
    We found the object and you can do that, but something went wrong
	EIO,	(physical I/O error, RFS locks lost, &c)
	ETIMEDOUT, ECONNREFUSED, ...

You want more information about various errors.  For example, if you
can't find a file:  is the device off-line, which directory is missing,
or is it the file proper?  But you can't pack it all into one integer,
and the availability of further details would be system-specific.
That doesn't mean that a coarse classification such as I've sketched
above would be any less use, or that it would be system-specific.
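
To make the classification concrete, here is a sketch of a function that maps some POSIX errno values onto the classes listed above. The enum and the function name classify_errno are hypothetical, only errno codes from the POSIX base set are used, and the choice to treat unrecognized codes as E_OPERATION_FAILED is an assumption of this sketch.

```c
#include <errno.h>

/* The coarse, host-independent error classes from the list above. */
enum err_class {
    E_NONE,
    E_DEADLY_PARAMETERS,
    E_INVALID_PARAMETERS,
    E_NO_SUCH_OBJECT,
    E_OBJECT_ALREADY_EXISTS,
    E_WRONG_TYPE,
    E_NOT_ALLOWED,
    E_BUSY_TRY_LATER,
    E_WRONG_STATE,
    E_RESOURCE_EXCEEDED,
    E_OPERATION_FAILED
};

/* Map a host errno value onto one of the coarse classes. */
enum err_class classify_errno(int code)
{
    switch (code) {
    case 0:
        return E_NONE;
    case EFAULT:
        return E_DEADLY_PARAMETERS;
    case EINVAL:
        return E_INVALID_PARAMETERS;
    case ENOENT:
        return E_NO_SUCH_OBJECT;
    case EEXIST:
        return E_OBJECT_ALREADY_EXISTS;
    case ENOTDIR: case EISDIR: case ENOTTY: case ESPIPE:
        return E_WRONG_TYPE;
    case EPERM: case EROFS: case EACCES:
        return E_NOT_ALLOWED;
    case EBUSY: case ETXTBSY:
        return E_BUSY_TRY_LATER;
    case EISCONN: case ENOTCONN:
        return E_WRONG_STATE;
    case EAGAIN: case ENOMEM: case EMFILE: case ENFILE:
    case EMLINK: case EFBIG:
        return E_RESOURCE_EXCEEDED;
    default:
        return E_OPERATION_FAILED;   /* EIO and anything unrecognized */
    }
}
```

A caller would save errno immediately after the failing call and pass the saved value in, since later calls may change it.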

This was something of a digression.  It was NOT the job of the ANSI
C committee to redesign the errno mechanism.  For ANSI C, specifying
that errno is set, without saying what it is set to, is the right
choice.  Given that the description of <stdio> has always been
incredibly vague, ANSI C's leaving it undefined whether errno is set
to indicate a stdio error, or some other error, or the phase of the
moon, is probably the right choice too:  no VALID existing program,
it appears, could have depended on the setting of errno after a
stdio operation, so no valid existing program will be broken by
leaving it undefined.  But it would be a help to new code if the
stdio operations were defined to set errno on error, and it would be
interesting to know whether the ANSI C committee considered doing
this, and what practical obstacles they found.

I used to be rather fond of C, but this error stuff is quite
incredibly bad.  The problem isn't really the language; it's
the libraries.  For a really horrible example of an under-specified
library package, look up "hsearch" in the SVID or a System V manual.

daveb@geac.UUCP (12/03/87)

In article <9022@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>As others have mentioned, the buffer flush that often accompanies fclose()
>can produce an I/O error.  Not common.  However, checking for it is *vital*
>on systems with disk quotas, because that flush may blow the quota and fail.

  I might add that it is just plain good practice, on Unix or
whatever, whether or not you have disk quotas:
  1. Some machines are inherently small (desktop stuff),
  2. Some machines are run at close to 100% full (lots!), and
  3. Some machines suffer horrible faults at 100% full...

  David (Mr Reality) Haynes, the systems administrator here, noted
that we fall into class three (we've got a Berzerkly box).

--dave (oh my aching disk drive) c-b
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

dsill@NSWC-OAS.arpa (Dave Sill) (12/03/87)

In article <336@cresswell.quintus.UUCP>, Richard O'Keefe <quintus!ok> writes:

	[library functions should set errno on error]

I've always thought there should be more attention paid to libraries,
they're such an important part of the language.  It's almost common
knowledge that FORTRAN floating point libraries are superior to those
found with C compilers.  This is understandable, given that FORTRAN is
used almost exclusively for scientific computations.  So it should
follow, then, that C's libraries, being for a "systems programming"
language would be adept at handling errors.

Unfortunately, they're not.  The de facto level of error detection is a
binary flag; a function can say either "here's the result, everything
worked" or "oops, something's wrong, that didn't work".  I don't know
how much the dpANS says about library functions, but I can understand
why X3J11 wouldn't want to touch this with a ten-foot can opener.  It
might have been more in line with the POSIX standard to change the
behavior of standard library functions, but it still would have
required breaking new ground, i.e., there is none of the "prior art"
that standards groups like so much.

>I used to be rather fond of C, but this error stuff is quite
>incredibly bad.  The problem isn't really the language; it's
>the libraries.

I agree, but come to think of it, I don't know of a language with
error handling significantly better.  BASIC?  FORTRAN?  Pascal?  When
a file `open' or `close' fails, can you find out why?

Rather than messing with errno, I think a new variable, say, liberr,
should be used.  An include file, say liberr.h, could contain macro
definitions for the various types of errors.  A macro named LIBERR
could also be defined in liberr.h so code could be written that would
take advantage of liberr if it was available or handle errors in the
usual way if it's not.  Even better would be to have LIBERR be a
predefined macro like ANSI, unix, vax, et cetera.

There might be a better way.  Any ideas?

----
These are my personal opinions.
----

"Don't let existing sub-optimal solutions cloud your vision."
			-- Fred Fish

rpd@F.GP.CS.CMU.EDU (Richard Draves) (12/03/87)

stdio error detection should certainly be improved.  However, please
don't consider errno-like schemes that use global variables.  They
fail miserably in multi-threaded environments.

Rich

decot@hpisod2.UUCP (12/05/87)

     EEEERRRRRRRRCCCCTTTTLLLL((((2222))))                 EEEExxxxppppeeeerrrriiiimmmmeeeennnnttttaaaallll                  EEEERRRRRRRRCCCCTTTTLLLL((((2222))))




     NNNNAAAAMMMMEEEE
          errctl - specify what to do when system calls fail

     SSSSYYYYNNNNOOOOPPPPSSSSIIIISSSS
          iiiinnnntttt ((((((((****eeeerrrrrrrrccccttttllll)))) ((((_f_u_n_c))))))))
          iiiinnnntttt ((((****_f_u_n_c))))(((())));;;;

          ####iiiinnnncccclllluuuuddddeeee <<<<ssssiiiiggggnnnnaaaallll....hhhh>>>>
          ####iiiinnnncccclllluuuuddddeeee <<<<eeeerrrrrrrrnnnnoooo....hhhh>>>>

          iiiinnnntttt _f_u_n_c ((((_c_a_l_l_i_d,,,, _s_y_s_e_r_r_n_o,,,, _r_e_t_v_a_l,,,, _a_r_g_s))))
          iiiinnnntttt _c_a_l_l_i_d,,,, _s_y_s_e_r_r_n_o,,,, ****_r_e_t_v_a_l,,,, ****_a_r_g_s;;;;

     DDDDEEEESSSSCCCCRRRRIIIIPPPPTTTTIIIIOOOONNNN
          _E_r_r_c_t_l specifies what do to when the calling process calls a
          function listed in Section 2 of this manual and the function
          fails.  This occurs whenever the system would ordinarily set
          the  value  of the external integer variable eeeerrrrrrrrnnnnoooo to a non-
          zero value.  Using this facility, the programmer can  supply
          a custom function to provide modular handling of exceptional
          errors.

          The value of _f_u_n_c indicates the  action  to  be  taken  when
          subsequent system calls fail.  The possible values are:

          EEEERRRRRRRR____DDDDFFFFLLLL        Upon the failure of a system  call,  set  the
                         value  of  eeeerrrrrrrrnnnnoooo  accordingly, and return the
                         value  ordinarily  returned   by   the   call
                         (usually ----1111) unmodified.  This is the default
                         action.

          EEEERRRRRRRR____IIIIGGGGNNNN        Upon the failure of a  system  call,  do  not
                         change  the  value  of  eeeerrrrrrrrnnnnoooo, but return the
                         value  ordinarily  returned   by   the   call
                         (usually ----1111) unmodified.

          _f_u_n_c_t_i_o_n _a_d_d_r_e_s_s
                         Upon the failure of a system call, invoke the
                         indicated  handling  function.  The arguments
                         passed  to  the  handling  function  are   as
                         follows:

                         _c_a_l_l_i_d         A   system   call   identifier
                                        representing  the  system call
                                        that failed.  Values  for  the
                                        identifier  are  of  the  form
                                        SSSSYYYYSSSS_____C_A_L_L, where  _C_A_L_L  is  the
                                        name   of   the   system  call
                                        converted   to   upper   case.
                                        These  values  are  defined in
                                        <<<<ssssiiiiggggnnnnaaaallll....hhhh>>>>.



     Hewlett-Packard Company       - 1 -                   Dec 4, 1987






     EEEERRRRRRRRCCCCTTTTLLLL((((2222))))                 EEEExxxxppppeeeerrrriiiimmmmeeeennnnttttaaaallll                  EEEERRRRRRRRCCCCTTTTLLLL((((2222))))




                         _s_y_s_e_r_r_n_o       The eeeerrrrrrrrnnnnoooo value  corresponding
                                        to the failure (see _e_r_r_n_o(2)).

                         _r_e_t_v_a_l         A  pointer   to   a   location
                                        containing  the  integer value
                                        that   ordinarily   would   be
                                        returned  by the failed system
                                        call.

                         _a_r_g_s           An integer  pointer  parameter
                                        whose   usage   is   currently
                                        implementation-defined.

                         The external integer variable  eeeerrrrrrrrnnnnoooo  is  not
                         affected unless explicitly changed during the
                         execution of the handling function.

                         The integer value  left  in  *_r_e_t_v_a_l  by  the
                         handling  function  is  returned  to the user
                         program as the apparent return value  of  the
                         system call.

                         The integer value returned  by  the  handling
                         function determines whether the failed system
                         call is to be restarted after  completion  of
                         the handling function.  If the returned value
                         is  non-zero,  the  system   call   will   be
                         restarted.

          The status of system call error  handling  is  inherited  by
          child  processes  created by fork(2) or vfork(2), and is set
          to ERR_DFL on successful calls to exec(2).

     RETURN VALUE
          Errctl(2) returns the most recently installed previous value
          of func installed by the calling process.

     EXAMPLES
          In a program that occasionally creates additional processes,
          changing  demand  for  system  resources  could  prevent the
          processes from being created.  In such a case, errctl(2) can
          be used to handle such exceptions flexibly:

               #include <signal.h>
               #include <errno.h>

               int errhandler();

               main()
               {
                   ...
                   (void) errctl(errhandler);
                   ...
                   system("something");
                   ...
               }

               int errhandler(callid, syserrno, retval, args)
               int callid, syserrno, *retval, *args;
               {
                   switch (callid)
                   {
                   case SYS_FORK:
                   case SYS_VFORK:
                       if (syserrno == EAGAIN)
                       {
                           sleep(4);
                           return (1);
                       }
                   }

                   return (0);
               }

     SEE ALSO
          intro(2), errno(2), exec(2), fork(2).

     Hewlett-Packard Company       - 3 -                   Dec 4, 1987

simon@its63b.ed.ac.uk (ECSC68 S Brown CS) (12/07/87)

In article <10649@brl-adm.ARPA> dsill@NSWC-OAS.arpa (Dave Sill) writes:
>>I used to be rather fond of C, but this error stuff is quite
>>incredibly bad.  The problem isn't really the language; it's
>>the libraries.
>
>Rather than messing with errno, I think a new variable, say, liberr,
>should be used.  An include file, say liberr.h, could contain macro
>definitions for the various types of errors.  A macro named LIBERR
>could also be defined in liberr.h so code could be written that would
>take advantage of liberr if it was available or handle errors in the
>usual way if it's not.  Even better would be to have LIBERR be a
>predefined macro like ANSI, unix, vax, et cetera.
>

This still has the same problem as with "errno"- namely that you're trying
to describe a general ``error condition'' using a single number! I'm told
that VMS (but it's a good idea for all that...) provides a stack of error 
values which allows a program to search backward to find out what the "real" 
error was, depending on what kind of detail is required. If you have several
levels of library calls between you and the system call that failed, this
can be extremely useful- it's not really much use having an error-value
if you can't even tell what system call it came from (let alone what parameters
were *passed* to that system call to cause it to fail!).

A *decent* error-returning mechanism would describe:

	1. What call (syscall or library call) failed.
	   This could be a number- you could use something like internet
	   addressing to put some kind of structure into it:
		libc.stdio.fopen
	2. Why it failed.
	   Simple E-numbers will do for this (although I suppose they'd
	   have to be grouped for different libraries):
		E_STDIO.E_CANNOT_OPEN_FILE
	3. What value it returned.
		(FILE *)NULL
	4. What parameters were passed to it.
	   This is the most difficult one, because it would have to have
	   some kind of idea as to the types involved. It could (I suppose)
	   deal only with string types (and convert any other type into
	   "printable" form by doing the equivalent of sprintf()'ing it).
	   It also has to be a "list", which means it would probably have
	   to be done using something like "argc,argv":
		argc: 2
		argv: "mumble.splat", "r"

If the error is not "dealt with", then this information should propagate
down (together with the info from the callee's failure), and so on...

So, if you do a
	fopen("mumble.splat","r")
and it fails, then the following would be left on the stack (in some format
or other) to be dealt with by some error-diagnosing function:

	kernel.open:
		param 1: "mumble.splat" [string]
		param 2: 0 [int]
		returns: -1 [int]
		error: E_KERNEL.ENOENT
	libc.stdio.fopen:
		param 1: "mumble.splat" [string]
		param 2: "r" [string]
		returns: 0 [FILE *]
		error: E_LIBC.E_STDIO.E_CANNOT_OPEN_FILE

The error-diagnosing stuff could then print something *useful* such as
	stdio fopen: couldn't open file "mumble.splat" for reading, because:
	    kernel open: no file or directory "mumble.splat"

(and of course the format of these messages could be user-configurable, so
that noddies would just get the information they need, whereas people who
understand what they're doing could get reams and reams of info- just by setting
some environment parameter to the appropriate value).
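
The bookkeeping described above can be sketched in C. This is only an
illustration under stated assumptions: the names err_frame, err_push,
err_clear and err_format, the fixed stack depth, and the idea of passing
the parameters as one pre-formatted string are all hypothetical and come
from neither stdio nor any real library.

```c
#include <stdio.h>

#define ERR_STACK_MAX 8              /* hypothetical fixed depth */

/* One record per failed call; every name here is illustrative. */
struct err_frame {
    const char *call;                /* e.g. "libc.stdio.fopen" */
    const char *reason;              /* e.g. "E_CANNOT_OPEN_FILE" */
    const char *params;              /* parameters in printable form */
};

static struct err_frame err_stack[ERR_STACK_MAX];
static int err_depth = 0;

/* Forget all recorded failures. */
void err_clear(void)
{
    err_depth = 0;
}

/* Record a failure, innermost call first.  Only the pointers are
   stored, so the strings must outlive the stack entry.
   Returns 0 on success, -1 if the stack is full. */
int err_push(const char *call, const char *reason, const char *params)
{
    if (err_depth >= ERR_STACK_MAX)
        return -1;
    err_stack[err_depth].call = call;
    err_stack[err_depth].reason = reason;
    err_stack[err_depth].params = params;
    err_depth++;
    return 0;
}

/* Format the stack into buf, outermost failure first, with each
   inner cause indented four more columns.  Returns the number of
   frames that fitted completely. */
int err_format(char *buf, size_t size)
{
    size_t used = 0;
    int i, written = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';
    for (i = err_depth - 1; i >= 0; i--) {
        int indent = 4 * (err_depth - 1 - i);
        int n = snprintf(buf + used, size - used, "%*s%s: %s (%s)%s\n",
                         indent, "", err_stack[i].call,
                         err_stack[i].reason, err_stack[i].params,
                         i > 0 ? ", because:" : "");
        if (n < 0 || (size_t)n >= size - used) {
            buf[used] = '\0';        /* drop the partial line */
            break;
        }
        used += (size_t)n;
        written++;
    }
    return written;
}
```

Pushing the kernel.open frame and then the libc.stdio.fopen frame from
the example, err_format() gives the fopen failure on the first line and
the open failure indented beneath it.  A real facility would also need
to copy the strings and keep the stack per process; this sketch does
neither.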

Of course, all this stuff would have to be known by the compiler, and I'm sure
it'd be dead slow to execute!

-- 
--------------------------------------------------
| Simon Brown                                    |
| Laboratory for Foundations of Computer Science |
| Department of Computer Science                 |
| University of Edinburgh, Scotland, UK.         |
--------------------------------------------------
 UUCP:  uunet!mcvax!ukc!lfcs!simon
 ARPA:  simon%lfcs.ed@nss.cs.ucl.ac.uk      "Life's like that, you know"
 JANET: simon@uk.ac.ed.lfcs

henry@utzoo.UUCP (Henry Spencer) (12/07/87)

> >It would be really nice if the stdio functions were defined to set errno...
> 
> Set errno to what?  There is no way the C standards committee is going to
> attempt to identify all OS-dependent low-level error causes and provide
> standard encodings for all of them!

It's not necessary to do that.  It suffices to say that the function in
question sets errno and that strerror() can be used to turn the value of
errno into a human-readable string.  This does reduce the usefulness a
bit, since your program can't analyze the precise cause and do something
intelligent about it, but it's better than nothing.  Analysis of precise
causes is tricky anyway, since there is some variation in errno codes
even between different Unixes.
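
As a hedged illustration of this suggestion: the sketch below assumes
an implementation whose fopen() sets errno on failure (POSIX systems
do; the C standard itself does not promise it).  The helper name
open_or_explain is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open a file and, on failure, leave a
   human-readable explanation in msg.  Returns what fopen() returned. */
FILE *open_or_explain(const char *path, const char *mode,
                      char *msg, size_t size)
{
    FILE *fp;

    errno = 0;                       /* so a stale value is not reported */
    fp = fopen(path, mode);
    if (fp == NULL) {
        /* strerror() runs before snprintf() is entered, so errno
           still holds whatever fopen() left there. */
        snprintf(msg, size, "cannot open %s: %s", path,
                 errno != 0 ? strerror(errno) : "unknown error");
    }
    return fp;
}
```

The program cannot act on the precise cause this way, but the user at
least sees more than "cannot read <file name>".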
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

oleg@quad1.quad.com (Oleg Kiselev) (12/09/87)

In article <290@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>Here at Quintus, ... [I] have to worry about 4.2, 4.3, V.2, V.3, VMS, and
>SAS Lattice C for MVS and VM/CMS. 

The best solution would be to write your own portable libraries with your own
error handling mechanisms.  That's what was done here, at Quadratron.  So far,
this approach seems to be the best way to solve portability problems of complex
software systems.
-- 
Oleg Kiselev  --  oleg@quad1.quad.com -- {...!psivax|seismo!gould}!quad1!oleg
HASA, "A" Division

DISCLAIMER:  I don't speak for my employers.

hydrovax@nmtsun.nmt.edu (M. Warner Losh) (12/16/87)

In article <814@its63b.ed.ac.uk>, simon@its63b.ed.ac.uk (ECSC68 S Brown CS) writes:
> I'm told
> that VMS (but it's a good idea for all that...) provides a stack of error 
> values which allows a program to search backward to find out what the "real" 
> error was, depending on what kind of detail is required. If you have several
> levels of library calls between you and the system call that failed, this
> can be extremely useful- it's not really much use having an error-value
> if you can't even tell what system call it came from (let alone what parameters
> were *passed* to that system call to cause it to fail!).
VMS provides this facility AND a way of printing out the nested stack.  For
example, if a file can't be copied using the COPY command (unix types read cp)
then you get something like
%COPY-F-NOTCOPIED, File FOO.ARD;1 not copied
-SYSTEM-F-READERR, Fatal read error in device.

The second part varies tremendously, and can thus tell you that the file
can't be copied because of file protection, or file lock, or a write error
in the output, or .......

The point of all of this is that it makes error handling VERY EASY and 
almost ENJOYABLE.  You know what went wrong where, and can usually
do something half way intelligent about it.  (But not always, alas....)

-- 
bitnet:	lush@nmt.csnet			M. Warner Losh
csnet:	warner%hydrovax@nmtsun
uucp:	...{cmcl2, ihnp4}!lanl!unmvax!nmtsun!warner%hydrovax
	...{cmcl2, ihnp4}!lanl!unmvax!nmtsun!hydrovax
Warning:  Hydrovax is both a machine, and an account, so be careful.

peter@sugar.UUCP (Peter da Silva) (12/20/87)

In article <206@aiva.ed.ac.uk>, richard@aiva.ed.ac.uk (Richard Tobin) writes:
> (There are also other unportable ways to get the file
> descriptor associated with a stream.)

What's so unportable about fileno(fp)?
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

peter@sugar.UUCP (Peter da Silva) (12/20/87)

hydrovax@nmtsun.nmt.edu (M. Warner Losh) writes:
> %COPY-F-NOTCOPIED, File FOO.ARD;1 not copied
> -SYSTEM-F-READERR, Fatal read error in device.
> 
> The second part varies tremendously, and can thus tell you that the file
> can't be copied because of file protection, or file lock, or a write error
> in the output, or .......

What's the advantage of this over perror?

cp: foo.ard: I/O error.

The only problem being convincing programmers to USE perror. If you use
perror *as it exists* correctly (i.e., only immediately after an error
return) you have no problems.
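
One way to respect the "only immediately after an error return" rule
is to copy errno at once and report from the copy.  A minimal sketch,
assuming fopen() sets errno on failure as it does on POSIX systems;
open_check and report_saved are hypothetical names.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print what perror() would, but from an errno
   value captured at the point of failure. */
void report_saved(FILE *out, const char *prog, const char *what,
                  int saved_errno)
{
    fprintf(out, "%s: %s: %s\n", prog, what, strerror(saved_errno));
}

/* Try to open path for reading.  Returns 0 on success, otherwise the
   errno value captured right after the failed fopen(). */
int open_check(const char *path, FILE *log)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        int saved = errno;           /* capture before anything else */
        fprintf(log, "giving up on %s\n", path);  /* may change errno */
        report_saved(log, "cp", path, saved);
        return saved;
    }
    fclose(fp);
    return 0;
}
```

Calling perror() after the intervening fprintf() instead could report
whatever that call left in errno rather than the reason fopen() failed.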
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.