[comp.lang.c] passing NULL to functions

dave@sds.UUCP (dave schmidt x194) (04/26/87)

In article <5794@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> In article <888@viper.UUCP> john@viper.UUCP (John Stanley) writes:
> >	myfunc( i1, NULL, i2 );
> No matter what NULL is defined as, the above usage is non-portable and
> will break on some systems.  The problem is that the widths of pointers
> of various types are in general different (and different from an (int)),
> so that the wrong parameter alignment will occur unless one happens to
> be lucky.  If you pass NULL as a function parameter, you should always
> cast it to the correct pointer type, as in
> 	myfunc( i1, (struct foo *)NULL, i2 );
> 
> It is certainly true that there is a lot of code that makes this
> mistake; I must have fixed several hundred occurrences of this
> particular bug by now.  That does not make it any less erroneous.

I agree; the problem is compounded by the fact that certain C library
routines encourage this sort of thinking.  For example, the MicroSoft
Xenix manual says of time():
	long time(tloc)
	long *tloc;

... If 'tloc' (taken as an integer) is nonzero, the return value is also
stored in the location to which 'tloc' points.

While I think that this clearly indicates you should write:
"longvar = time((long *)NULL);" rather than "longvar = time(NULL);",
I have seen more instances of the latter than the former in other people's
code.  Note too that what the manual says is very misleading; assume
that 'tloc' is a 32-bit pointer on a 64K byte boundary so that its value
is, for example, 0xffff0000.  Taking this as an integer (it doesn't say
LONG integer) which is 16 bits will give you 0.  I assume the
manual means 

... if 'tloc' (taken as an integer of the requisite size) is nonzero, ...

but that's not what it says.

Worse yet, consider what the manual says about tmpnam():

	char *tmpnam(s)
	char *s;

... If 's' is NULL, ...  If 's' is not NULL ...

This CLEARLY implies that "tmpnam(NULL)" is CORRECT, which it is ***NOT***.
I think we should roast weenies who write manual pages like this before
we roast programmers who pass NULL as an argument to a function.

To ease the problems associated with this, I advocate a #define that a friend
of mine suggested:

#define NIL(type)	((type)NULL)

so that a programmer can write

 	myfunc( i1, NIL(struct foo *), i2 );

and have it be portable and the intention clear.  In a similar vein,

#define	ALLOC(type, qty)	((type *)calloc((qty), sizeof(type)))

which I have seen in several programs.
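
A minimal sketch of how the two macros might be used together, assuming a
compiler that provides <stdarg.h>.  The names struct foo, count_foos and
demo_count are hypothetical and exist only for illustration.  A variadic
function is used because its trailing arguments never get a declared type,
so the null pointer that ends the list has to carry its type itself.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>

#define NIL(type)        ((type)NULL)
#define ALLOC(type, qty) ((type *)calloc((qty), sizeof(type)))

struct foo { int id; };   /* hypothetical type, for illustration only */

/* Hypothetical variadic function: counts struct foo pointers up to the
   terminating null pointer.  Each argument is read as a (struct foo *),
   so the terminator must be passed as exactly that type. */
int count_foos(struct foo *first, ...)
{
    va_list ap;
    int n = 0;
    struct foo *p;

    va_start(ap, first);
    for (p = first; p != NIL(struct foo *); p = va_arg(ap, struct foo *))
        n++;
    va_end(ap);
    return n;
}

/* Allocates two objects and passes them with a correctly typed terminator.
   Returns the count, or -1 if an allocation failed. */
int demo_count(void)
{
    struct foo *a = ALLOC(struct foo, 1);
    struct foo *b = ALLOC(struct foo, 1);
    int n = -1;

    if (a != NIL(struct foo *) && b != NIL(struct foo *))
        n = count_foos(a, b, NIL(struct foo *));
    free(a);
    free(b);
    return n;
}
```
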



Dave Schmidt

edw@ius2.cs.cmu.edu (Eddie Wyatt) (04/27/87)

In article <150@sds.UUCP>, dave@sds.UUCP (dave schmidt x194) writes:
> 
> In article <5794@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> > In article <888@viper.UUCP> john@viper.UUCP (John Stanley) writes:
> > will break on some systems.  The problem is that the widths of pointers
> > of various types are in general different (and different from an (int)),
> > 

  Are you saying that you are not guaranteed sizeof(char *) == sizeof(int *)
sizeof(long *) == sizeof(whatever *)?  I'm amazed at this because I have
never run across this problem but then again I haven't worked on that many
different architectures.  Not that I doubt you, but could you show me
where you benefit or need pointers of different size?

> 
> #define	ALLOC(type, qty)	((type *)calloc((qty), sizeof(type)))
> Dave Schmidt

I prefer:

#define alloc(type)		((type *) malloc(sizeof(type)))
#define alloc_array(type,size)	((type *) malloc(sizeof(type)*(size)))
#define mfree(x)		free(x) /* helps when debugging malloc
					 problems */
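
For the array form, the size argument is worth wrapping in parentheses
inside the macro body.  The sketch below computes byte counts only, so the
arithmetic can be checked without allocating anything; both macro names
and both function names are hypothetical.

```c
#include <stddef.h>

/* Byte counts only; hypothetical names used to compare the two spellings. */
#define BYTES_UNPARENTHESIZED(type, size) (sizeof(type) * size)
#define BYTES_PARENTHESIZED(type, size)   (sizeof(type) * (size))

/* With n + 1 as the argument, the first form expands to
   sizeof(long) * n + 1, the second to sizeof(long) * (n + 1). */
size_t bytes_unparenthesized(size_t n)
{
    return BYTES_UNPARENTHESIZED(long, n + 1);
}

size_t bytes_parenthesized(size_t n)
{
    return BYTES_PARENTHESIZED(long, n + 1);
}
```

With an expression such as n + 1 the unparenthesized form asks for too few
bytes, which is the kind of malloc problem that is hard to track down later.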

-- 
					Eddie Wyatt

jbs@eddie.MIT.EDU (Jeff Siegal) (04/28/87)

In article <1129@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
>  Are you saying that you are not guaranteed sizeof(char *) == sizeof(int *)
>sizeof(long *) == sizeof(whatever *)?  I'm amazed at this because I have
>never run across this problem but then again I haven't worked on that many
>different architectures.  Not that I doubt you, but could you show me
>where you benefit or need pointers of different size?

On word-addressed machines (like the PDP-10), a pointer to a smaller
object (smaller than a word, that is) must include not only the
address of the word containing the object, but also the offset within
the word.  This also applies to byte-addressed machines which support
bitfields (except that pointers to bitfields do not exist in C). 

Jeff Siegal

gwyn@brl-smoke.UUCP (04/28/87)

In article <150@sds.UUCP> dave@sds.UUCP (dave schmidt x194) writes:
>#define NIL(type)	((type)NULL)
>#define	ALLOC(type, qty)	((type *)calloc((qty), sizeof(type)))

These are good; they explicitly address the important issue that the
data type is an important factor.  However, I would use malloc() rather
than calloc().  It isn't clear what calloc()'s auto-filling with 0 should
do; my interpretation is that it should fill with 0 "bytes" (usually the
same as 0-valued (char)s), but that's not necessarily going to make any
sense for floating-point or pointer data.  It also adds overhead when one
does one's own initialization of the allocated thingy.  I think calloc()
should be gotten rid of.
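
A minimal sketch of the alternative, assuming a hypothetical struct node
with one floating-point member and one pointer member: allocate with
malloc() and assign each member explicitly, so nothing depends on all-zero
bytes happening to represent 0.0 or a null pointer.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {              /* hypothetical type, for illustration only */
    double weight;
    struct node *next;
};

/* Allocates one node and initializes every member by assignment.
   Returns a null pointer if malloc() fails. */
struct node *new_node(void)
{
    struct node *n = (struct node *)malloc(sizeof(struct node));

    if (n != NULL) {
        n->weight = 0.0;                  /* a real zero, whatever its bits */
        n->next = (struct node *)NULL;    /* a real null pointer */
    }
    return n;
}
```
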

gwyn@brl-smoke.UUCP (04/28/87)

In article <1129@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
>  Are you saying that you are not guaranteed sizeof(char *) == sizeof(int *)
>sizeof(long *) == sizeof(whatever *)?

Yup.  Consider a word-addressed machine (no byte addressing), for example
a 16-bit one to make it likely that there simply aren't any spare address bits.
To select a byte within a word you'll have to use more than 16 bits, so a
(char *) would probably be 32 bits while an (int *) would probably be 16.

Although I made up this example, there in fact are C implementations using
different sizes for different pointer types.  I think Prime or DGC do.

>#define alloc(type)		((type *) malloc(sizeof(type)))

Yup, although you might want one more set of () around the first `type'.

>#define mfree(x)		free(x)

More convenient is		free((char *)(x))	/* (void *) if supported */

henry@utzoo.UUCP (Henry Spencer) (04/28/87)

>   Are you saying that you are not guaranteed sizeof(char *) == sizeof(int *)
> sizeof(long *) == sizeof(whatever *)? ...

Precisely.  The major reason this arises is that on word-addressed machines,
e.g. Data General systems (unless they've changed since I knew them), the
"normal" pointer format points only to a word, and there are no spare bits
in the pointer that could be used to point to a byte within the word.  (The
use of such spare bits is a major reason why the representation of "char *"
can differ from the representation of "int *" even on machines where the
sizes of the two pointers are the same.)  There is just no way to point to
a character except to define a longer kind of pointer to hold the extra
information.

Actually, there is nothing guaranteeing that *any* of those sizes are the
same, but sizeof(char *) is usually the anomalous one in practice.
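
A small sketch for checking this on a given implementation; the function
names are hypothetical.  Equal sizes on one machine prove nothing about
another, and equal sizes do not imply equal representations either.

```c
#include <stdio.h>
#include <stddef.h>

/* Returns 1 if these object-pointer types happen to share a size on this
   implementation, 0 otherwise.  Either answer is permitted. */
int object_pointer_sizes_match(void)
{
    return sizeof(char *) == sizeof(int *)
        && sizeof(int *) == sizeof(long *);
}

/* Prints the sizes so they can be compared across machines. */
void print_pointer_sizes(void)
{
    printf("char *        : %lu\n", (unsigned long)sizeof(char *));
    printf("int *         : %lu\n", (unsigned long)sizeof(int *));
    printf("long *        : %lu\n", (unsigned long)sizeof(long *));
    printf("int (*)(void) : %lu\n", (unsigned long)sizeof(int (*)(void)));
}
```
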
-- 
"If you want PL/I, you know       Henry Spencer @ U of Toronto Zoology
where to find it." -- DMR         {allegra,ihnp4,decvax,pyramid}!utzoo!henry

gemini@homxb.UUCP (Rick Richardson) (04/30/87)

In article <1129@ius2.cs.cmu.edu>, edw@ius2.cs.cmu.edu.UUCP writes:
> Not that I doubt you, but could you show me
> where you benefit or need pointers of different size?

Medium model on an 80x8x family (arch flames to /dev/null), where pointers
to functions are 32 bits, but pointers to data are 16 bits, to name one
example.

Rick Richardson, PC Research, Inc: (201) 922-1134  ..!ihnp4!castor!pcrat!rick
	         when at AT&T-CPL: (201) 834-1378  ..!ihnp4!castor!polux!rer

karl@haddock.UUCP (04/30/87)

In article <150@sds.UUCP> dave@sds.UUCP (dave schmidt x194) writes:
>I agree; the problem is compounded by the fact that certain C library
>routines encourage this sort of thinking.  For example, the MicroSoft
>Xenix manual says of time(): "long time(tloc) long *tloc; ... If 'tloc'
>(taken as an integer) is nonzero, the return value is also stored in the
>location to which 'tloc' points."  [...] I assume the manual means "... if
>'tloc' (taken as an integer of the requisite size) is nonzero, ..."

Neither phrasing is correct.  A pointer variable which contains a null pointer
need not result in zero if cast back to an arithmetic value.

>Worse yet, consider what the manual says about tmpnam():
>"char *tmpnam(s) char *s; ... If 's' is NULL, ...  If 's' is not NULL ..."
>This CLEARLY implies that "tmpnam(NULL)" is CORRECT, which it is ***NOT***.

I think the phrasing here is better than above, although "a null pointer" or
"(char *)0" or "(char *)NULL" might be better still (to eliminate the "clear
implication" you find).  I don't believe your inference, btw; I think it
clearly implies that the code will be testing the value with the equivalent of
"if (s == NULL) ...", which is perfectly correct.  It's unfortunate that the
cast is required, and that the user has to already know that; but (until ANSI)
them's the rules.
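
A minimal sketch of the distinction, assuming a hypothetical routine
make_name() with a tmpnam()-like interface (the name, the buffer size and
the string it produces are all made up).  Inside the routine the comparison
against NULL is fine, because the constant is converted to the type of s.
The cast belongs at the call, where a compiler without a prototype in scope
has no way to know which pointer type is wanted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical routine: if s is a null pointer the result goes into an
   internal static buffer, otherwise into the caller's buffer, which is
   assumed to hold at least 8 chars. */
char *make_name(char *s)
{
    static char internal[8];

    if (s == NULL)          /* compared as a null pointer of type char * */
        s = internal;
    strcpy(s, "tmp0001");   /* 7 characters plus the terminating 0 */
    return s;
}

/* The call as the thread recommends writing it. */
char *make_name_in_static_buffer(void)
{
    return make_name((char *)NULL);
}
```
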

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

am@cl.cam.ac.uk (Alan Mycroft) (05/01/87)

In article <5804@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <1129@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
>>  Are you saying that you are not guaranteed sizeof(char *) == sizeof(int *)
>>sizeof(long *) == sizeof(whatever *)?
>
>Yup.
Very reasonable.  As other correspondents pointed out elsewhere this has
ramifications w.r.t. passing NULL to non-prototype functions.
If sizeof(int *) != sizeof(char *) then *NO* definition of NULL
(in the ANSI draft as it stands)
can allow correct compilation of (the by supposition buggy, but common, code):
    #include <stdio.h>
    extern void f();    /* or implicit, or varargs prototype */
    g() { f(NULL,NULL); }
if the actual definition of f was:
    void f(x,y) int *x; char *y; { ... }

The problem is that the user is currently required to cast NULL to correct
pointer types for non-prototype functions.

As I pointed out to ANSI, one alternative suggestion is
"If a function (f above) has no prototype in scope, or a varargs part
of a prototype then all pointer arguments are widened to (void *)."
Correspondingly, a function defined by
    void f(x,y) int *x; char *y; { ... }
will need to generate relevant narrowing code (this would always be
null on vax-like machines).
In this case NULL is only sensibly defined as ((void *)0).

This always works, both for prototype and non-prototype forms.
The only thing to watch out for is that varargs functions will need
to use (type *)va_arg(ap, void *) rather than va_arg(ap, type *).

To convince the world that this is not unreasonable, just consider
the exactly analogous situation with 'float' and 'double' (in this
role (type *) and (void *) respectively).
prototype in scope:  float (effectively) left alone
prototype not in scope:  float widened to double on call, narrowed on entry.
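
The float/double half of the analogy can be sketched with a variadic
function, assuming <stdarg.h> is available; sum_floats is a hypothetical
name.  The pointer-widening rule above is a proposal and is not what this
code demonstrates.

```c
#include <stdarg.h>

/* Hypothetical variadic function: sums `count` floating-point arguments.
   A float passed through the "..." part is widened to double by the
   default argument promotions, so it is fetched as double and narrowed
   again on this side of the call. */
float sum_floats(int count, ...)
{
    va_list ap;
    float total = 0.0f;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += (float)va_arg(ap, double);
    va_end(ap);
    return total;
}
```
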

john@uw-nsr.UUCP (John Sambrook) (05/01/87)

In article <258@homxb.UUCP> gemini@homxb.UUCP (Rick Richardson) writes:
>In article <1129@ius2.cs.cmu.edu>, edw@ius2.cs.cmu.edu.UUCP writes:
>> Not that I doubt you, but could you show me
>> where you benefit or need pointers of different size?
>
>Medium model on an 80x8x family (arch flames to /dev/null), where pointers
>to functions are 32 bits, but pointers to data are 16 bits, to name one
>example.

I believe the Honeywell GPS-45 series systems have character pointers that
are 6 bytes in length while pointers to other objects are 4 bytes.  

-- 
John Sambrook                           Work: (206) 545-7433
University of Washington WD-12          Home: (206) 487-0180
Seattle, Washington  98195              UUCP: uw-beaver!uw-nsr!john

throopw@dg_rtp.UUCP (Wayne Throop) (05/03/87)

> henry@utzoo.UUCP (Henry Spencer)
>>   Are you saying that you are not guaranteed sizeof(char *) == sizeof(int *)
>> sizeof(long *) == sizeof(whatever *)? ...
> Precisely.  The major reason this arises is that on word-addressed machines,
> e.g. Data General systems (unless they've changed since I knew them), the
> "normal" pointer format points only to a word, and there are no spare bits
> in the pointer that could be used to point to a byte within the word.  (The
> use of such spare bits is a major reason why the representation of "char *"
> can differ from the representation of "int *" even on machines where the
> sizes of the two pointers are the same.)  There is just no way to point to
> a character except to define a longer kind of pointer to hold the extra
> information.

This is all perfectly correct, except that DG machines aren't examples
of this phenomenon, quite.  DG machines' main pointer format is a
pointer to a 16-bit word.  However, it has a "spare bit" in the form of
an "indirect bit", which indicates that this pointer points to a pointer
to the desired word.  This indirect bit is almost never used, and IS
never used by the C compiler.  And so, instructions were introduced
(during the evolution from the original Nova architecture to the current
MV architecture) which manipulated pointers to 8-bit bytes, with no
indirect bit.  They thus effectively cover the same address space, and
both (on an MV) fit into 32 bits.  As a later addition, instructions
that reference individual bits were added, and the pointers here cannot
fit into 32 bits.  They occupy 64 bits (which is somewhat bigger than
strictly necessary, but it was the format that was chosen, for various
reasons).

Nevertheless, most instructions naturally deal with 16-bit granular
pointers, and only character-manipulation instructions deal with 8-bit
granular pointers.  So there is a strong motive for using the two
different pointer formats for their respective purposes, that is, to
indicate the addresses of characters by 8-bit granular pointers (most
often called bytepointers), and the addresses of everything else by
16-bit granular pointers (called wordpointers).  Again, both of these
pointer formats fit into 32 bits, but the bits are interpreted somewhat
differently.  Thus, it is important in DG's implementation of C to use
pointers in a type-correct way.

The moral is: ALWAYS pass the correct pointer type.  If you have a
compiler that implements prototypes, you can pass an uncast NULL.  If
you do not, you must ALWAYS cast NULL to the appropriate type before
passing.  If you don't do this, your code will not be portable, and that
is that.  There is NOTHING that a compiler implementor can do to help
you on all machines if you make this coding mistake.  Making NULL 
((void *)0) won't help in the general case.  And that is all there is to
that.

(Of course, making NULL be ((char *)0) is an even worse faux-pas.  There
is no guarantee that casting a null pointer of one type to another type
gives sensible results, though it generally does.)

(Further, the incorrect, widespread and apparently incorrigible belief
that NULL should be #defined to be something other than 0 causes me
personally never to code NULL, but always a correctly cast 0.  Just a
personal prejudice of mine.)
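
A minimal sketch of the two cases, assuming a compiler that implements
prototypes; count_null and the two calling functions are hypothetical
names.

```c
#include <stddef.h>

/* Prototype in scope: the compiler knows both parameter types and
   converts each argument to the matching pointer type. */
int count_null(int *x, char *y);

int call_with_prototype(void)
{
    return count_null(NULL, NULL);      /* converted by the prototype */
}

int call_with_casts(void)
{
    /* The form required when no prototype is visible at the call. */
    return count_null((int *)0, (char *)0);
}

/* Returns how many of its two arguments are null pointers. */
int count_null(int *x, char *y)
{
    return (x == NULL) + (y == NULL);
}
```
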

--
"Mowe bwiefing?"
"More briefing!"
                                --- Elmer & Daffy
-- 
Wayne Throop      <the-known-world>!mcnc!rti!dg_rtp!throopw

cccmark@ucdavis.UUCP (Mark Nagel) (05/03/87)

In article <1000@uw-nsr.UUCP> john@uw-nsr.UUCP (John Sambrook 5-7433) writes:
>In article <258@homxb.UUCP> gemini@homxb.UUCP (Rick Richardson) writes:
>>In article <1129@ius2.cs.cmu.edu>, edw@ius2.cs.cmu.edu.UUCP writes:
>>> Not that I doubt you, but could you show me
>>> where you benefit or need pointers of different size?
>>
>>Medium model on an 80x8x family (arch flames to /dev/null), where pointers
>>to functions are 32 bits, but pointers to data are 16 bits, to name one
>>example.
>
>I believe the Honeywell GPS-45 series systems have character pointers that
>are 6 bytes in length while pointers to other objects are 4 bytes.  
>

I hadn't realized there were actually machines with different sized pointers
before this -- what a sheltered life!  Anyway, I have been thinking about this
and was wondering if the function prototyping available in ANSI C will take
care of the problem mentioned.  Obviously, 0 (or NULL) can be formed into the
correct bit pattern if necessary when it appears as an r-value.  The correct
value can be inferred from the l-value.  But in parameter passing, the type
of the argument is unknown, so different sized pointers will blow it up.
Since function prototypes will allow knowledge of function argument types,
will the passing of NULL be a problem any longer?


-- 

- Mark Nagel

"Don't ever let anything mechanical know you are in a hurry."

ucdavis!deneb!cccmark@ucbvax.berkeley.edu               (ARPA)
mdnagel@ucdavis                                         (BITNET)
...!{sdcsvax|lll-crg|ucbvax}!ucdavis!deneb!cccmark      (UUCP)

Disclaimer: If you were my employer would *you* be responsible for my
	    opinions?

greg@utcsri.UUCP (05/04/87)

In article <5804@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <1129@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
>>#define alloc(type)		((type *) malloc(sizeof(type)))
>
>Yup, although you might want one more set of () around the first `type'.
>

But then alloc(int) -> (((int) *) malloc (sizeof(int)))	is a syntax error.

The problem you seem to be addressing is that (type *) is not always a
cast to pointer to type: e.g.  int (*)() means pointer to func returning int,
but ( int (*)() * ) means you get a syntax error.

There is no cure for this; it is a result of C's bass-ackward type syntax.
That definition of alloc simply cannot be used with types containing
array or function notation ( although you can always typedef them to
avoid having ()'s and [n]'s in the 'alloc()' ).
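
A minimal sketch of the typedef route, assuming the alloc() macro quoted
above; int_func, forty_two and call_through_allocated_pointer are
hypothetical names.

```c
#include <stdlib.h>

#define alloc(type) ((type *) malloc(sizeof(type)))

/* Hypothetical typedef that hides the function-pointer syntax. */
typedef int (*int_func)(void);

static int forty_two(void) { return 42; }

/* Allocates room for one function pointer, stores one in it, and calls
   through it.  Returns -1 if the allocation failed. */
int call_through_allocated_pointer(void)
{
    int_func *slot = alloc(int_func);   /* (int_func *) malloc(sizeof(int_func)) */
    int result = -1;

    if (slot != NULL) {
        *slot = forty_two;
        result = (*slot)();
        free(slot);
    }
    return result;
}
```
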

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...

tps@sdchem.UUCP (Tom Stockfisch) (05/04/87)

In article <4724@utcsri.UUCP> greg@utcsri.UUCP (Gregory Smith) writes:

>>In article <1129@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:

>>>#define alloc(type)		((type *) malloc(sizeof(type)))

>...problem...is that (type *) is not always a
>cast to pointer to type: e.g.  int (*)() means pointer to func returning int,
>but ( int (*)() * ) means you get a syntax error.
>
>There is no cure for this; it is a result of C's bass-ackward type syntax.

>Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg


There is a workaround, but you might not like it:

# define palloc( ptype )	(  (ptype)malloc( sizeof( *(ptype)0 ) )  )

now to get a 
	
	int	(*)()

you must do

	palloc( int (**)() )
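
A minimal sketch of this workaround in use.  It assumes sizeof is applied
to the expression *(ptype)0; the operand of sizeof is not evaluated, so
nothing is actually dereferenced.  The names seven and palloc_demo are
hypothetical.

```c
#include <stdlib.h>

/* Takes the pointer type itself and sizes the allocation from what that
   pointer type points to. */
#define palloc(ptype) ((ptype) malloc(sizeof(*(ptype)0)))

static int seven(void) { return 7; }

/* Allocates room for one pointer to function returning int, stores one
   in it, and calls through it.  Returns -1 if the allocation failed. */
int palloc_demo(void)
{
    int (**slot)(void) = palloc(int (**)(void));
    int result = -1;

    if (slot != NULL) {
        *slot = seven;
        result = (**slot)();
        free(slot);
    }
    return result;
}
```
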

|| Tom Stockfisch, UCSD Chemistry	tps%chem@sdcsvax.ucsd.edu
					or  sdcsvax!sdchem!tps

edw@ius2.cs.cmu.edu (Eddie Wyatt) (05/06/87)

  Someone suggested that the 'type' argument to the macro definition

	#define alloc(type)	((type *) malloc(sizeof(type)))

should have parentheses around it.  Well consider the macro expansion of

	x = alloc(int *);

Currently the expansion is

	x = ((int * *) malloc(sizeof(int *)));

If the 'type' argument is parenthesized then it would expand to

	x = (((int *) *) malloc(sizeof(int *)));

which is  syntactically incorrect (or at least the SUN compiler says it is).
Also I've added the macro definition :

	#define NULLPTR(type)	((type *) 0)

I know all of you out there will be relieved to hear that :-).

-- 
					Eddie Wyatt

e-mail: edw@ius2.cs.cmu.edu

lmiller@venera.isi.edu.UUCP (05/06/87)

In article <728@sdchema.sdchem.UUCP> tps@sdchemf.UUCP (Tom Stockfisch) writes:
>In article <4724@utcsri.UUCP> greg@utcsri.UUCP (Gregory Smith) writes:
>
>>>In article <1129@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
>
>>>>#define alloc(type)		((type *) malloc(sizeof(type)))
>
>>...problem...is that (type *) is not always a
>>cast to pointer to type: e.g.  int (*)() means pointer to func returning int,
>>but ( int (*)() * ) means you get a syntax error.
>>
>>There is no cure for this; it is a result of C's bass-ackward type syntax.
>
>>Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg

ETC.

The type of (int (*)()) is "pointer to function returning int".  If this is
the type in the allocation (type *) malloc(sizeof(type)), then we're
allocating space for a "function returning int".  I don't think we can do
that.  Here's a little program, and the output of lint (4.3 UNIX on VAX
8650).

#include <stdio.h>
main()
{
  int (*x)();		/* ptr to function returning int */

  x = (int (*)()) malloc (sizeof (int ()));
}

test.c:
test.c(6): compiler error: compiler takes size of function

Larry Miller
lmiller@venera.isi.edu