[comp.lang.c] Managing error strings in C

blambert@lotus.lotus.com (Brian Lambert) (01/11/91)

Hi:

I was wondering if anyone out there had any clever ways of handling
error messages in C.  That is, in small/medium-size programs one
usually winds up with all sorts of:

    PrintLog("Memory allocation error");

lines in the program.  I have seen code where people define an array of
char pointers to error messages used in the program such as:

    char *errors[] = {
        "Memory allocation error",
        "Can't open file",
        ...
    };

    #define NO_MEMORY    0
    #define CANT_OPEN    1

and then prints out the error message by:

    PrintLog(errors[NO_MEMORY]);

I'm sure there are a zillion ways to do this.  I have used this method
in the past, but was never very happy with it.  (It's not very elegant,
and is difficult to maintain as one must be sure to use the proper
number in the #define.)

Got a better idea?

Thanks!

--

Brian Lambert
Lotus Development Corporation
blambert@lotus.com

dave@cs.arizona.edu (Dave P. Schaumann) (01/11/91)

In article <1991Jan10.122227@lotus.lotus.com> blambert@lotus.lotus.com (Brian Lambert) writes:
>Hi:
>
>I was wondering if anyone out there had any clever ways of handling
>error messages in C. [...]
>
> [ method deleted ]
>
>I'm sure there are a zillion ways to do this.  I have used this method
>in the past, but was never very happy with it.  (It's not very elegant,
>and is difficult to maintain as one must be sure to use the proper
>number in the #define.)
>
>Got a better idea?

Sure do.  I don't know how 'clever' it is, but it's worked well for me in
the past.  Really, it's just a variation on what you had, but I think a bit
cleaner.

In some .h file, I have an enum type:
typedef enum { NO_MEM, FOO_BARRED, BAR_FOOED, CODE_SPAMMED } error_t ;

Then, I can just say something like 'error( NO_MEM, <helpful strings> )'
and the routine error will have a switch on every name in 'error_t'.

Additionally, for a large program, it may be helpful to categorize the
errors.  For example, last spring I wrote a small compiler for a class, and
I had syntax_error(), semantic_error(), and system_error().  Each one of
these dealt with an error condition in a different way (besides just printing
a message): syntax_error() also dealt with token skipping, semantic_error()
dealt with manipulating the symbol table, and system_error() called exit(1). :)

The main advantage of this method over what you had is that it is easily
expandable.  If I want to add a new error name, I just add it to the enum list,
and add a new entry in the switch.  The way you had it, you would also have to
make sure the value of the error name corresponded with the position of the
error text in your array.
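
A minimal sketch of the enum-plus-switch idea, assuming C99 snprintf().
The enum names come from the post; the message texts, the type name
error_code, and the helpers error_text() and format_error() are made up
for illustration and are not the original routine.

```c
#include <stdio.h>

/* Error names from the post; the texts below are invented. */
typedef enum { NO_MEM, FOO_BARRED, BAR_FOOED, CODE_SPAMMED } error_code;

/* Map an error name to its text with a switch on every name. */
const char *error_text(error_code e)
{
    switch (e) {
    case NO_MEM:       return "out of memory";
    case FOO_BARRED:   return "foo was barred";
    case BAR_FOOED:    return "bar was fooed";
    case CODE_SPAMMED: return "code was spammed";
    }
    return "unknown error";    /* only reached for out-of-range values */
}

/* Build "<text>: <detail>" in buf and return buf. */
char *format_error(char *buf, size_t size, error_code e, const char *detail)
{
    snprintf(buf, size, "%s: %s", error_text(e), detail);
    return buf;
}
```

Adding an error then means one new enum name and one new case; there is
no index to keep in step with an array.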

>Thanks!

De nada.

>Brian Lambert
>Lotus Development Corporation
>blambert@lotus.com


Dave Schaumann		| You are in a twisty maze of little
dave@cs.arizona.edu	| C statements, all different.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (01/11/91)

In article <636@caslon.cs.arizona.edu> dave@cs.arizona.edu (Dave P. Schaumann) writes:
> In some .h file, I have an enum type:
> typedef enum { NO_MEM, FOO_BARRED, BAR_FOOED, CODE_SPAMMED } error_t ;
> Then, I can just say something like 'error( NO_MEM, <helpful strings> )'
> and the routine error will have a switch on every name in 'error_t'.

While this does let you add errors easily, it doesn't handle
user-defined error messages. This version does, and is portable to
compilers without enum:

#define NO_MEM 1
#define NO_MEM_ERR "Out of memory"
#define CANT_OPEN 2
#define CANT_OPEN_ERR "Can't open"
..

struct { int n; char *s; } errors[] = {
{ NO_MEM, NO_MEM_ERR } ,
{ CANT_OPEN, CANT_OPEN_ERR } ,
..
};

(``const char'' would be better under ANSI.) The error checker can just
do a linear search. The only thing you have to make sure of is that all
the error numbers are different.
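
The linear search could look roughly like this. The table is the one
above with the ANSI const added; the function name error_lookup() is an
assumption, not part of the original post.

```c
#include <stddef.h>

#define NO_MEM 1
#define NO_MEM_ERR "Out of memory"
#define CANT_OPEN 2
#define CANT_OPEN_ERR "Can't open"

static const struct { int n; const char *s; } errors[] = {
    { NO_MEM, NO_MEM_ERR },
    { CANT_OPEN, CANT_OPEN_ERR },
};

/* Linear search by error number; returns NULL if n is not in the table. */
const char *error_lookup(int n)
{
    size_t i;

    for (i = 0; i < sizeof errors / sizeof errors[0]; i++)
        if (errors[i].n == n)
            return errors[i].s;
    return NULL;
}
```

Because the lookup goes by value rather than by position, the order of
the table entries does not matter.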

---Dan

gamerine@bluemoon.uucp (Glenn Amerine) (01/11/91)

blambert@lotus.lotus.com (Brian Lambert) writes:

> Hi:
> 
> I was wondering if anyone out there had any clever ways of handling
> error messages in C.  That is, in small/medimum size programs one
> usually winds up with all sorts of:
> 
>     PrintLog("Memory allocation error");

The C Users Group had a real nifty way of doing this with conditional
macros to either get the value macro or the message macro, but I
tossed it on my "neat-stuff-I-gotta-try" stack and can't find it
right now. I'll look for it for you if you want.
The thing that really struck me as neat about the method is that it is
very easy to keep the right error values with the right error messages.

Glenn

chip@tct.uucp (Chip Salzenberg) (01/12/91)

According to dave@cs.arizona.edu (Dave P. Schaumann):
>In some .h file, I have an enum type:
>typedef enum { NO_MEM, FOO_BARRED, BAR_FOOED, CODE_SPAMMED } error_t ;

We use a similar approach.

>Then, I can just say something like 'error( NO_MEM, <helpful strings> )'
>and the routine error will have a switch on every name in 'error_t'.
>
>Additionally, for a large program, it may be helpful to catagorize the
>errors.

Our errors are classified using an ErrClass structure, which contains
(among other things) the name of the error class and the address of a
function that, given the error code, returns a descriptive string.
The error() function has this prototype:

    typedef long ErrCode;
    void error(long action, const ErrClass *class,
               ErrCode code, const char *fmt, ...);

The remainder of the parameters are used a la printf() to generate
the error message.

The action is a bit map of things to do, such as writing the message
to a log (typically a circular file), writing it to stderr, and/or
exiting.

Applications can have application-specific error handlers that use and
modify the action value.

This error handling support has been used in two large applications;
it seems to work quite well.
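
A rough sketch of how such an interface might fit together. Only the
ErrCode typedef and the error() prototype come from the post; the
ErrClass layout, the ERR_* action bits, the err_log stream and the
sample io_errors class are all assumptions for illustration.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

typedef long ErrCode;

/* Assumed layout: a class name plus a code-to-description function. */
typedef struct {
    const char *name;
    const char *(*describe)(ErrCode code);
} ErrClass;

/* Assumed action bits. */
#define ERR_LOG    0x1L    /* write the message to err_log */
#define ERR_STDERR 0x2L    /* write the message to stderr */
#define ERR_EXIT   0x4L    /* exit after reporting */

FILE *err_log;             /* assumed log stream; NULL means no log */

void error(long action, const ErrClass *class,
           ErrCode code, const char *fmt, ...)
{
    char buf[512];
    int n;
    va_list ap;

    /* "<class>: <description>: " followed by the printf-style detail */
    n = snprintf(buf, sizeof buf, "%s: %s: ",
                 class->name, class->describe(code));
    if (n < 0) {
        n = 0;
        buf[0] = '\0';
    }
    if ((size_t)n < sizeof buf) {
        va_start(ap, fmt);
        vsnprintf(buf + n, sizeof buf - (size_t)n, fmt, ap);
        va_end(ap);
    }
    if ((action & ERR_LOG) && err_log != NULL)
        fprintf(err_log, "%s\n", buf);
    if (action & ERR_STDERR)
        fprintf(stderr, "%s\n", buf);
    if (action & ERR_EXIT)
        exit(EXIT_FAILURE);
}

/* A made-up error class to show the calling convention. */
static const char *io_describe(ErrCode code)
{
    switch (code) {
    case 1:  return "cannot open";
    case 2:  return "read failed";
    default: return "unknown I/O error";
    }
}

const ErrClass io_errors = { "io", io_describe };
```

A call such as error(ERR_LOG | ERR_STDERR, &io_errors, 2, "file %s",
name) would then log and print the message without exiting.
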
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
       "If Usenet exists, then what is its mailing address?"  -- me
             "c/o The Daily Planet, Metropolis."  -- Jeff Daiell

browns@iccgcc.decnet.ab.com (Stan Brown) (01/14/91)

In article <2779.278d8741@iccgcc.decnet.ab.com> I wrote:
   [scheme of storing messages in an external file]

In article <645@caslon.cs.arizona.edu>, dave@cs.arizona.edu (Dave P. Schaumann) writes:
> Agreed.  If you've got 50K of error messages, this is the way to go.  A
> refinement I would make is to make the error file random access, so the
> whole file wouldn't have to be read in to get one message.

Messages are of the form XKKNN, where NN is serial within XKK.  A small
program reads the message file, sorts it, and prepends the fseek( )
numbers for all the XKK values that occur.  The real program reads the
table in at the beginning of execution, then if it needs a message it
randomly accesses XKK00 and reads sequentially from there.  This means
that only the message codes, not their locations, are compiled into the
program.  (The message file changes frequently during development.)
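
A simplified sketch of the lookup, under several assumptions: each line
of the (already sorted) file is "XKKNN text", with a three-character
group prefix and a two-digit serial, and the offset table is built by
scanning the file at startup rather than being prepended by a separate
program as described above. The names build_index() and get_message()
are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_GROUPS 64

struct group { char key[4]; long offset; };  /* "XKK" and where it starts */

static struct group groups[MAX_GROUPS];
static int ngroups;

/* Scan the sorted file once, recording where each XKK group begins.
   Returns the number of groups, or -1 if the table is full. */
int build_index(FILE *fp)
{
    char line[256];
    long pos;

    ngroups = 0;
    rewind(fp);
    for (;;) {
        pos = ftell(fp);
        if (fgets(line, sizeof line, fp) == NULL)
            break;
        if (strlen(line) < 5)
            continue;                        /* not a message line */
        if (ngroups == 0 ||
            strncmp(groups[ngroups - 1].key, line, 3) != 0) {
            if (ngroups == MAX_GROUPS)
                return -1;
            memcpy(groups[ngroups].key, line, 3);
            groups[ngroups].key[3] = '\0';
            groups[ngroups].offset = pos;
            ngroups++;
        }
    }
    return ngroups;
}

/* Seek to the group of code ("XKKNN") and read forward to the matching
   line.  Copies the text into buf; returns buf, or NULL if not found. */
char *get_message(FILE *fp, const char *code, char *buf, size_t size)
{
    char line[256];
    int i;

    for (i = 0; i < ngroups; i++)
        if (strncmp(groups[i].key, code, 3) == 0)
            break;
    if (i == ngroups || fseek(fp, groups[i].offset, SEEK_SET) != 0)
        return NULL;
    while (fgets(line, sizeof line, fp) != NULL &&
           strncmp(line, code, 3) == 0) {
        if (strncmp(line, code, 5) == 0 && line[5] == ' ') {
            line[strcspn(line, "\n")] = '\0';
            snprintf(buf, size, "%s", line + 6);
            return buf;
        }
    }
    return NULL;
}
```

Only the codes are compiled in; the offsets are recomputed whenever the
message file changes.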

> |   2. Messages can be translated to other languages without 
> |changing the program.
> What about other program text?  You wouldn't be able to take advantage of
> this unless *all* of your messages were stored externally.

Right. Actually all text _is_ in the file, but I mentioned only errors
because that was the subject of the thread.  BTW, the decimal point '.'
is in the file because in some non-English languages that would be ','.

Please do not attribute these remarks to any other person or company.
                                       email (I think): browns@ab.com
Stan Brown, Oak Road Systems, Cleveland, Ohio, USA    +1 216 371 0043

salomon@ccu.umanitoba.ca (Dan Salomon) (01/16/91)

In article <1991Jan10.122227@lotus.lotus.com> blambert@lotus.lotus.com (Brian Lambert) writes:
>Hi:
>
>I was wondering if anyone out there had any clever ways of handling
>error messages in C.  That is, in small/medimum size programs one
>usually winds up with all sorts of:
>
>    PrintLog("Memory allocation error");
>
>lines in the program.  I have seen code where people define an array of
>char pointers to error messages used in the program such as:
>
>    char *errors[] = {
>        "Memory allocation error",
>        "Can't open file",
>        ...
>    };
>
>    #define NO_MEMORY    0
>    #define CANT_OPEN    1

One method that worked fairly well for me was to write an error
message preprocessor program.  The error messages are stored in
an easy to maintain data file, and a program generates .h files
from that data file.   I.e:

		-------------------
		|  Error message  |
		|  data file      |
		|  "errors.dat"   |
		-------------------
			|
			V
			|
		-------------------
		|  Error message  |
		|  preprocessor   |
		-------------------
	       /                   \ 
	      /                     \ 
	     /                       \
    -------------------       -------------------
    |   Error code    |       |  Error message  |
    | definition file |       |  init file      |
    |  "err_codes.h"  |       |  "err_mess.h"   |
    -------------------       -------------------

The files will look something like the following:

errors.dat
----------
NO_MEMORY  "Memory allocation error"
CANT_OPEN  "Can't open file"
  ...          ...

err_codes.h
-----------
#define NO_MEMORY    0
#define CANT_OPEN    1
   ...

err_mess.h
----------
char *errors[] = {
    "Memory allocation error",
    "Can't open file",
       ...
    };

The preprocessor is trivial to write, since it is just copying strings
from one file to another and maintaining a counter.  The data file is
easy to maintain, since it doesn't need to include all the messy C
punctuation.  A utility like make can be used to rerun the preprocessor
automatically every time the data file is modified.
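
As a sketch of how small such a preprocessor can be, the function below
copies each name and quoted string to the two outputs while keeping a
counter. It assumes one 'NAME "message"' pair per line in errors.dat;
the function name generate() and the buffer sizes are arbitrary
choices, and the original program may have looked quite different.

```c
#include <stdio.h>

/* Read 'NAME "message"' lines from dat.  Write one #define per name to
   codes and the matching array initializer to mess.
   Returns the number of messages processed. */
int generate(FILE *dat, FILE *codes, FILE *mess)
{
    char line[512], name[64], text[400];
    int count = 0;

    fprintf(mess, "char *errors[] = {\n");
    while (fgets(line, sizeof line, dat) != NULL) {
        /* name = first word; text = rest of the line, quotes included */
        if (sscanf(line, " %63s %399[^\n]", name, text) != 2)
            continue;                   /* skip blank or malformed lines */
        fprintf(codes, "#define %-12s %d\n", name, count);
        fprintf(mess, "    %s,\n", text);
        count++;
    }
    fprintf(mess, "};\n");
    return count;
}
```

A main() would open errors.dat, err_codes.h and err_mess.h and pass
the three streams to generate().
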
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba, Canada  R3T 2N2 / (204) 275-6682

harrison@necssd.NEC.COM (Mark Harrison) (01/18/91)

In article <1991Jan10.122227@lotus.lotus.com>,
blambert@lotus.lotus.com (Brian Lambert) writes:

> I was wondering if anyone out there had any clever ways of handling
> error messages in C.

> I have seen code where people define an array of
> char pointers to error messages used in the program such as:
> 
> I have used this method
> in the past, but was never very happy with it.  (It's not very elegant,
> and is difficult to maintain as one must be sure to use the proper
> number in the #define.)

On several compiler related projects I have worked on, we did it
something like this:

1.  Create a text file that incorporates an identifier, error message,
    and help text:

ERR_NOSYM "Undefined symbol: %s"
          "Undefined symbol: \fIsymbol-name\fP"
{
	You have referenced a symbol which you have not defined.
	blah blah...
}

2.  Generate a .h file that has the #defines:

	#define ERR_NOSYM 101
	etc...

3.  Generate a .c file that has the initialized strings:

	char *errmessages[] = { "Undefined symbol: %s", ... };

4.  Use a standard error function to display error messages.  Use
    varargs to grab the parameters passed:

	err(ERR_NOSYM, symname);

5.  Generate a beautifully formatted troff document that can
    be put into the user docs.  (This is the good part.)
    Documentation people will like you, you will like not
    having to proofread some obsolete error appendix, etc.

One major benefit of this is that it makes it easy to move from
(human) language to language.  Just give the text file to your
translator, and when he gives it back, generate a new version
of your product.

We used nawk to generate the .h, .c, and .doc files.
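
Step 4 might be sketched as follows, using ANSI <stdarg.h> in place of
the older varargs. It assumes the generated codes are consecutive and
start at 101, so that a code minus ERR_BASE indexes errmessages[]; the
post does not say how codes map to strings. ERR_NOFILE, ERR_BASE,
verr_format() and err_format() are hypothetical names.

```c
#include <stdarg.h>
#include <stdio.h>

#define ERR_BASE   101          /* assumed first generated code */
#define ERR_NOSYM  101
#define ERR_NOFILE 102          /* hypothetical second entry */

static const char *errmessages[] = {
    "Undefined symbol: %s",
    "Cannot open file %s at line %d",
};

/* Format the message for code into buf using the arguments in ap. */
int verr_format(char *buf, size_t size, int code, va_list ap)
{
    size_t count = sizeof errmessages / sizeof errmessages[0];

    if (code < ERR_BASE || (size_t)(code - ERR_BASE) >= count)
        return snprintf(buf, size, "Unknown error %d", code);
    return vsnprintf(buf, size, errmessages[code - ERR_BASE], ap);
}

/* Same, but taking the arguments directly; handy for testing. */
int err_format(char *buf, size_t size, int code, ...)
{
    va_list ap;
    int n;

    va_start(ap, code);
    n = verr_format(buf, size, code, ap);
    va_end(ap);
    return n;
}

/* The standard error function: format, then print to stderr. */
void err(int code, ...)
{
    char buf[256];
    va_list ap;

    va_start(ap, code);
    verr_format(buf, sizeof buf, code, ap);
    va_end(ap);
    fprintf(stderr, "%s\n", buf);
}
```

With this in place, err(ERR_NOSYM, symname) prints the formatted text
on stderr.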

Mark.
-- 
Mark Harrison             harrison@necssd.NEC.COM
(214)518-5050             {necntc, cs.utexas.edu}!necssd!harrison
standard disclaimers apply...

gwyn@smoke.brl.mil (Doug Gwyn) (01/18/91)

In article <689@caslon.cs.arizona.edu> dave@cs.arizona.edu (Dave P. Schaumann) writes:
>I have heard arguments against using enum which go "it wasn't there when I
>started using C".

The arguments that I have heard are:
	(1) It wasn't in K&R 1st Edition, so many existing compilers
	don't support it.
	(2) It was misimplemented in many compilers (notably some
	releases of PCC), so it is difficult to use.

I agree that if you can count on enum support that works right, it is
a valuable language construct.  For maximum portability at present you
might want to avoid using it, however.

sja@sirius.hut.fi (Sakari Jalovaara) (01/20/91)

> a) "Error %{1}d on line %{2}d" and b)  "Line %{2}d: error %{1}d detected"

Pretty much the tracks I was thinking along - except that the type of
the object (here "d" for "int") being printed does not really belong
in the message format string: if a user can customize his own messages
imagine what happens with

	"Line %{2}s: error %{1}f detected"
		  ^ oops       ^ oops

A program should not crash if it is given bad input and a user should
not need to know things like the types used internally in a program.
Here is how I thought an error text might be printed:

	message (HELLO_WORLD, (char *) 0);
	message (ERROR_ON_LINE,
			/* error name / type / argument */
			"line",         "d",   line_number,
			"error",        "s",   error_text,
			(char *) 0);

and the error text database would look something like

	HELLO_WORLD Hello, world!
	ERROR_ON_LINE Error {error} on line {line}

with automatically generated indexes in the beginning of the file for
faster access.

Implementing that is left as an exercise for the reader.
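
One possible shape for that exercise, as a sketch only. It formats into
a buffer instead of printing, keeps the "database" in a compiled-in
array rather than an indexed file, and supports just the "d" and "s"
types; message_format(), MESSAGE_COUNT and MAX_ARGS are assumed names.
The point it illustrates is that the argument types stay in the program
and the message text only refers to names.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum { HELLO_WORLD, ERROR_ON_LINE, MESSAGE_COUNT };

/* Stand-in for the message database. */
static const char *templates[MESSAGE_COUNT] = {
    "Hello, world!",
    "Error {error} on line {line}",
};

#define MAX_ARGS 8

/* Expand {name} placeholders of message id into buf.  The variable
   arguments are (name, type, value) triples ended by (char *) 0. */
void message_format(char *buf, size_t size, int id, ...)
{
    const char *names[MAX_ARGS];
    char values[MAX_ARGS][64];
    const char *p, *end, *name, *type;
    size_t out = 0, len;
    int nargs = 0, i;
    va_list ap;

    if (size == 0)
        return;
    buf[0] = '\0';
    if (id < 0 || id >= MESSAGE_COUNT)
        return;

    /* Render every argument to text first. */
    va_start(ap, id);
    while (nargs < MAX_ARGS && (name = va_arg(ap, char *)) != NULL) {
        type = va_arg(ap, char *);
        if (strcmp(type, "d") == 0)
            snprintf(values[nargs], sizeof values[nargs], "%d",
                     va_arg(ap, int));
        else if (strcmp(type, "s") == 0)
            snprintf(values[nargs], sizeof values[nargs], "%s",
                     va_arg(ap, char *));
        else
            break;                  /* unknown type: stop reading */
        names[nargs++] = name;
    }
    va_end(ap);

    /* Copy the template, substituting known {name} placeholders. */
    for (p = templates[id]; *p != '\0' && out + 1 < size; ) {
        if (*p == '{' && (end = strchr(p, '}')) != NULL) {
            len = (size_t)(end - p - 1);
            for (i = 0; i < nargs; i++)
                if (strlen(names[i]) == len &&
                    strncmp(names[i], p + 1, len) == 0)
                    break;
            if (i < nargs) {
                out += (size_t)snprintf(buf + out, size - out, "%s",
                                        values[i]);
                if (out >= size)
                    out = size - 1;     /* output was truncated */
                p = end + 1;
                continue;
            }
        }
        buf[out++] = *p++;   /* literal text or an unknown placeholder */
    }
    buf[out] = '\0';
}
```

A placeholder the program does not know is copied through unchanged, so
a badly edited message produces odd output rather than a crash.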

(Even that isn't enough if you want really fancy messages - how can
you pluralize words or determine if you need "a" or "an" articles etc.
For a multi-lingual system doing that takes separate binaries for
different human languages or writing the message system in an
interpreted language.  Yuck.)

***

The MIT Athena project has some kind of an error reporting tool that
takes specifications like

	ec ZERR_AUTHFAIL, "Server could not verify authentication"

and turns them into .c and .h files (the messages are compiled into the
a.out).  It is called "et" and is distributed at least as part of
Zephyr (ftp athena-dist.mit.edu).  I haven't tried it.
									++sja

adrian@mti.mti.com (Adrian McCarthy) (01/22/91)

In article <2271@wzv.win.tue.nl> wietse@wzv.win.tue.nl (Wietse Venema) writes:
>Error numbers, file names, line numbers etc. are often available as
>global variables. These do not need to be passed to the message
>formatter at all, if it understands format strings that look like:
>
>	"Error %{error_number} on line %{line_number}"
>
>and consults the corresponding global variables to do %{...} expansion.

This has its good and bad sides.

Good:  It's efficient and solves the problem at hand.

Bad:  If the message printer has to know about global variables specific to
a particular program, it can't be re-used in another without modification.

(This could be solved by using a structure that maps names used in the error
messages to global variables used in the particular program.  Handling the
various types could be tricky.)
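
Such a mapping structure might look like the sketch below, which
handles the type problem with a tag per entry. The globals, the
var_map layout and render_var() are assumptions for illustration; a
message printer would call render_var() for each %{...} it meets.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical program globals named in the format strings. */
int error_number;
int line_number;
const char *file_name = "";

enum var_type { VAR_INT, VAR_STRING };

struct var_map {
    const char *name;       /* name used inside %{...} */
    enum var_type type;     /* how to interpret addr */
    const void *addr;       /* address of the program's variable */
};

/* The only program-specific part; the printer itself stays generic. */
static const struct var_map vars[] = {
    { "error_number", VAR_INT,    &error_number },
    { "line_number",  VAR_INT,    &line_number },
    { "file_name",    VAR_STRING, &file_name },
};

/* Render the current value of the named variable into buf.
   Returns 0 on success, -1 if the name is not in the table. */
int render_var(const char *name, char *buf, size_t size)
{
    size_t i;

    for (i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        if (strcmp(vars[i].name, name) != 0)
            continue;
        if (vars[i].type == VAR_INT)
            snprintf(buf, size, "%d", *(const int *)vars[i].addr);
        else
            snprintf(buf, size, "%s",
                     *(const char *const *)vars[i].addr);
        return 0;
    }
    return -1;
}
```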

Bad:  It tempts programmers to put too much information into global variables
which renders the program difficult to modify and maintain.

Aid.  (adrian@gonzo.mti.com)

ted@isgtec.uucp (Ted Richards) (01/22/91)

In article <1991Jan19.181522.23871@santra.uucp> sja@sirius.hut.fi (Sakari Jalovaara) writes:
 
> (Even that isn't enough if you want really fancy messages - how can
> you pluralize words or determine if you need "a" or "an" articles etc.
> For a multi-lingual system doing that takes separate binaries for
> different human languages or writing the message system in an
> interpreted language.  Yuck.)

No it doesn't.  All you need are separate messages for singular and
plural, for each type of word that could appear following "a" or "an",
etc.
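
In its simplest form that could be sketched as below; the message names
and the helper format_files_copied() are invented for illustration.
Note that the rule for choosing between the two messages (count == 1)
is itself an English-specific assumption that stays in the code.

```c
#include <stdio.h>

enum { FILES_COPIED_ONE, FILES_COPIED_MANY };

/* One message per grammatical form; a translator edits only these. */
static const char *messages[] = {
    "%d file copied",
    "%d files copied",
};

/* Pick the singular or plural message and format it into buf. */
int format_files_copied(char *buf, size_t size, int count)
{
    int id = (count == 1) ? FILES_COPIED_ONE : FILES_COPIED_MANY;

    return snprintf(buf, size, messages[id], count);
}
```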

--
Ted Richards          ...uunet!utai!lsuc!isgtec!ted         ted@isgtec.UUCP
ISG Technologies Inc.   3030 Orlando Dr. Mississauga  Ont.  Canada   L4V 1S8

karl@ima.isc.com (Karl Heuer) (01/22/91)

In article <1991Jan19.163652.9203@hemel.bull.co.uk> boyce@hemel.bull.co.uk (David Boyce) writes:
>>[Moving all your text strings, including printf formats, into an external
>>file also allows it to be translated into other natural languages.]
>
>There is a problem here, in that different human languages do not
>necessarily have the same sentence structure.  [So you need to allow argument
>reordering as well.]  Is there anything like this already around?

X/Open has it.  "%4$d" means print the 4th argument in decimal.  Note that the
naive implementation of pulling the nth word from the stack doesn't work even
on a vaxlike architecture, since you really need the nth argument rather than
the nth word.  (E.g. printf("%2$d %3$d %1$f %4$f", 1.0, 2, 3, 4.0) must work.)
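
A small sketch of argument reordering with the X/Open notation. It
assumes a C library whose printf family accepts "%n$" (this is a
POSIX/X/Open extension, not ISO C) and C99 snprintf(); the two format
strings stand in for two translations, and format_report() is a
made-up helper.

```c
#include <stdio.h>

/* Two wordings of the same message; each one picks its own argument
   order with "%n$", so the calling code never changes. */
static const char *formats[] = {
    "Error %1$d on line %2$d",
    "Line %2$d: error %1$d detected",
};

/* Arguments are always passed in the same order: error, then line. */
int format_report(char *buf, size_t size, int which, int error, int line)
{
    return snprintf(buf, size, formats[which], error, line);
}
```

A format that uses "%n$" must use it for every conversion; numbered and
unnumbered conversions cannot be mixed in one string.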

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint

brad@SSD.CSD.HARRIS.COM (Brad Appleton) (01/23/91)

Just wanted to add my $.02 ...

So far, all I've seen on this thread talks only about associating text and a
unique number with an error, as in:

  { EFOO, "unable to foo" },
          .
          .
          .

The stuff on parameter expansion/substitution in the format string is
interesting, however I believe one more thing should be included as part
of the "error" object - namely a severity (or level), as in:


  { EFOO, WARNING, "unable to foo" },
          .
          .
          .

The most common severity types are:
   1) Diagnostic (information messages only)
   2) Warnings
   3) Non-Fatal Errors
   4) Fatal Errors

I have also seen more "severity" levels used, adding an element of user
control to warnings and non-fatal errors, such as conditionally prompting
to continue or forcing termination after warnings and/or non-fatal errors
(such behavior may have been specified on the command line and/or in an
environment variable).

I think it is important to have some sense of severity associated with the
error in addition to its text and its id! As to the kind and number of 
severities or levels, that by itself is a worthy topic for discussion!
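
A sketch of an error table that carries a severity. The four levels
follow the list above; EBAR, the severity labels, and format_error()
(including its "should the caller terminate?" return value) are
assumptions for illustration.

```c
#include <stddef.h>
#include <stdio.h>

enum severity { DIAGNOSTIC, WARNING, NONFATAL, FATAL };
enum { EFOO = 1, EBAR };        /* EBAR is a made-up second error */

struct error_def {
    int id;
    enum severity severity;
    const char *text;
};

static const struct error_def error_defs[] = {
    { EFOO, WARNING, "unable to foo" },
    { EBAR, FATAL,   "unable to bar" },
};

/* Labels indexed by enum severity. */
static const char *severity_names[] = {
    "info", "warning", "error", "fatal"
};

/* Build "<severity>: <text>" in buf.  Returns nonzero when the error
   is fatal (or unknown), i.e. when the caller should terminate. */
int format_error(char *buf, size_t size, int id)
{
    size_t i;

    for (i = 0; i < sizeof error_defs / sizeof error_defs[0]; i++) {
        if (error_defs[i].id == id) {
            snprintf(buf, size, "%s: %s",
                     severity_names[error_defs[i].severity],
                     error_defs[i].text);
            return error_defs[i].severity == FATAL;
        }
    }
    snprintf(buf, size, "fatal: unknown error %d", id);
    return 1;
}
```

User-controlled behavior, such as treating warnings as fatal, could be
layered on by adjusting the comparison in the return statement.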

______________________ "And miles to go before I sleep." ______________________
 Brad Appleton           brad@ssd.csd.harris.com       Harris Computer Systems
                             uunet!hcx1!brad           Fort Lauderdale, FL USA
~~~~~~~~~~~~~~~~~~~~ Disclaimer: I said it, not my company! ~~~~~~~~~~~~~~~~~~~

gwyn@smoke.brl.mil (Doug Gwyn) (01/24/91)

In article <1991Jan22.021205.29632@dirtydog.ima.isc.com> karl@ima.isc.com (Karl Heuer) writes:
>X/Open has it.  "%4$d" means print the 4th argument in decimal.

Apart from the implementation issues you mentioned, I'd like to point
out that the character '$' is not universally available.