[net.unix] rcs blows up on suns

dpn@panda.UUCP (Dale P. Nielsen) (08/29/85)

RCS blows up on Suns in part because Suns insist that a null pointer be a
pointer to a null string; zero pointers won't do.  Before I start digging
through it, is anyone out there in net-land aware of a fix to this problem?
If so, please send me mail!  Thank you.

--

			--Dale P. Nielsen
			  GenRad, Incorporated
			  Concord, Massachusetts

			  {decvax,linus,masscomp,mit-eddie}!genrad!panda!dpn

jww@sdcsvax.UUCP (Joel West) (08/31/85)

> RCS blows up on Suns in part because Suns insist that a null pointer be a
> pointer to a null string; zero pointers won't do.

Is this for real?  Does any other system adopt this perverse convention?

	Joel West	CACI, Inc. - Federal (c/o UC San Diego)
	{ucbvax,decvax,ihnp4}!sdcsvax!jww
	jww@SDCSVAX.ARPA

west@sdcsla.UUCP (Larry West) (08/31/85)

>> RCS blows up on Suns in part because Suns insist that a null pointer be a
>> pointer to a null string; zero pointers won't do.
>
>Is this for real?  Does any other system adopt this perverse convention?

I (and others here) have been using RCS on Suns extensively for
years.   I've never had any problems with it, and I don't think
anyone else has had anything major go wrong.   So I doubt that
the bad RCS code was that distributed by SMI.

I suspect that what the original poster was complaining about was
that things like:
	if ( *p ) ... ;
[equal time for those who prefer more characters:]
	if ( *p != '\0' ) ... ;
cause address faults on the Suns when (p == 0).

When "p" == 0, of course, the above code is improper, unless your
address space goes down to zero [on the Suns, it doesn't].   This will work
on a Vax (e.g.), for no particularly good reason.   If you ever
intend to use something other than a Vax, you'd be safer writing
things like:
	if ( p && *p ) ... ;

I happen to like the way it works on the Vax, but it's a fluke
and not to be counted on in general.
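
The guarded test above can be wrapped in a small helper.  This is only
a minimal sketch, not code from RCS; the name `is_nonempty` is
hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if p points at a non-empty string,
 * 0 if p is a null pointer or points at an empty string.
 * The null test must come first: && short-circuits, so *p is never
 * evaluated when p is null.
 */
int is_nonempty(const char *p)
{
    return p != NULL && *p != '\0';
}
```

Calling `is_nonempty(NULL)` returns 0 without touching location 0, so
it behaves the same on a Vax and on a machine that faults on address 0.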
-- 

Larry West			Institute for Cognitive Science
(USA+619-)452-6220		UC San Diego (mailcode C-015) [x6220]
ARPA: <west@nprdc.ARPA>		La Jolla, CA  92093  U.S.A.
UUCP: {ucbvax,sdcrdcf,decvax,ihnp4}!sdcsvax!sdcsla!west OR ulysses!sdcsla!west

west@sdcsla.UUCP (Larry West) (09/06/85)

In article <961@sdcsla.UUCP> west@sdcsla.UUCP (Larry West) writes:
>>> RCS blows up on Suns in part because Suns insist that a null pointer be a
>>> pointer to a null string; zero pointers won't do.
>>
>>Is this for real?  Does any other system adopt this perverse convention?

>I (and others here) have been using RCS on Suns extensively for
>years.   I've never had any problems with it, and I don't think
>anyone else has had anything major go wrong.   So I doubt that
>the bad RCS code was that distributed by SMI.

I was wrong here -- Diana L. Syriac (genrad!panda!dls) sent me
an example command sequence that dumps core on Suns:
	rcs -l foo.c
	rcs -u foo.c

This is indeed a null-pointer problem.   The RCS code is clearly
incorrect here, in that it assumes a null pointer is okay for
"strcmp" or "strcpy".

My apologies to Dale Nielsen (genrad!panda!dpn) for assuming
he was incorrect in his assessment of the problem, simply because
I've never had RCS dump core on me.   Similar apologies to the
net for adding noise to the net.unix group.

As I stated in my original reply, dereferencing zero is always
a bad idea, even though it `works' (returns 0) on Vaxen.   So,
it isn't Sun's fault per se, and I don't think RCS is really
a supported product of Sun Microsystems... (anyone from Sun
care to comment on this?).

Larry
-- 

Larry West				(USA+619-)452-6771
Institute for Cognitive Science
UC San Diego (mailcode C-015)
La Jolla, CA  92093  U.S.A.

ARPA:	<west@nprdc.ARPA>	or	<west@ucsd.ARPA>
UUCP:	{ucbvax,sdcrdcf,decvax,ihnp4}!sdcsvax!sdcsla!west
  or	{sun,ulysses}!sdcsla!west

guy@sun.uucp (Guy Harris) (09/09/85)

> As I stated in my original reply, dereferencing zero is always
> a bad idea, even though it `works' (returns 0) on Vaxen.   So,
> it isn't Sun's fault per se, and I don't think RCS is really
> a supported product of Sun Microsystems... (anyone from Sun
> care to comment on this?).

RCS is not a supported product of Sun Microsystems.  SMI doesn't distribute
RCS on any of its standard distribution tapes.

Here's a fix to "rcs.c" to keep it from dropping core (this fix was
originally discovered on another 68000-family machine which prohibits
dereferencing null pointers):

*** rcs.c.broken	Sun Sep  8 15:25:18 1985
--- rcs.c	Sun Sep  8 15:26:51 1985
***************
*** 982,988
          dummy.nextlock=next=Locks;
          trail = &dummy;
          while (next!=nil) {
!                numr = strcmp(num, next->delta->num);
                 if ((whor=strcmp(who,next->login))==0 &&
                    (num==nil || numr==0))
                          break; /* found a lock */

--- 982,989 -----
          dummy.nextlock=next=Locks;
          trail = &dummy;
          while (next!=nil) {
!                if(num!=nil)
!                        numr = strcmp(num, next->delta->num);
                 if ((whor=strcmp(who,next->login))==0 &&
                    (num==nil || numr==0))
                          break; /* found a lock */
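
The same idea can be shown as a self-contained sketch.  This is not the
RCS source: `struct lock`, its fields, and `find_lock` are hypothetical
simplifications (the real code reaches the revision number through
next->delta->num).  A null `num` is taken to mean "any revision held by
this user", so strcmp is only called when `num` is non-null.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, flattened stand-in for an RCS lock list entry. */
struct lock {
    const char  *login;   /* user holding the lock */
    const char  *num;     /* locked revision number */
    struct lock *next;
};

/*
 * Return the first lock held by `who`.  If `num` is non-null the lock
 * must also be on that revision; if `num` is null, any revision matches.
 * strcmp is never handed a null pointer.
 */
struct lock *find_lock(struct lock *locks, const char *who, const char *num)
{
    struct lock *l;

    for (l = locks; l != NULL; l = l->next) {
        if (strcmp(who, l->login) != 0)
            continue;                       /* someone else's lock */
        if (num == NULL || strcmp(num, l->num) == 0)
            return l;                       /* found a lock */
    }
    return NULL;
}
```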

	Guy Harris

campbell@maynard.UUCP (Larry Campbell) (09/09/85)

> ...
> As I stated in my original reply, dereferencing zero is always
> a bad idea, even though it `works' (returns 0) on Vaxen.   So,
> it isn't Sun's fault per se, and I don't think RCS is really
> a supported product of Sun Microsystems... (anyone from Sun
> care to comment on this?).
> -- 
> Larry West				(USA+619-)452-6771
> UUCP:	{ucbvax,sdcrdcf,decvax,ihnp4}!sdcsvax!sdcsla!west
>   or	{sun,ulysses}!sdcsla!west

A minor point, but dereferencing zero only "works" on Vaxen that are
running Berkeley Unix (don't know about USG).  VMS sets page zero to
no access; this is one of the few areas where I concede VMS a point
over Unix.  It catches an all-too-common programming error.
-- 
Larry Campbell                     decvax!genrad
The Boston Software Works, Inc.                 \
120 Fulton St.                 seismo!harvard!wjh12!maynard!campbell
Boston MA 02109                         /       /
                                   ihnp4  cbosgd

ARPA: campbell%maynard.uucp@harvard.arpa

thomson@uthub.UUCP (Brian Thomson) (09/10/85)

> ...
>> As I stated in my original reply, dereferencing zero is always
>> a bad idea, even though it `works' (returns 0) on Vaxen.   So,
>
> A minor point, but dereferencing zero only "works" on Vaxen that are
> running Berkeley Unix (don't know about USG).

The ideal situation for someone developing code is an OS that does not
map location 0, but the best environment for someone who is interested
in using existing programs from other machines is one that requires
him to do no more than type 'make'.  This means being as forgiving
as possible about things like null pointers.
-- 
		    Brian Thomson,	    CSRI Univ. of Toronto
		    {linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!uthub!thomson

guy@sun.uucp (Guy Harris) (09/11/85)

> A minor point, but dereferencing zero only "works" on Vaxen that are
> running Berkeley Unix (don't know about USG).

Prior to the paging release of System V for the VAX (S5R2V2), S5 UNIX
suffered from the same problem as Berkeley UNIX - it allowed you to
reference location 0.  By default, it still does, but there is a linker
switch to build an executable image with no page 0.  John Bruner of Lawrence
Liverwurst Laboratory posted a set of kernel and linker changes to 4.2BSD
to make it support no-page-0 executables also.

	Guy Harris

steve@tove.UUCP (Steve D. Miller) (09/11/85)

In article <2772@sun.uucp> guy@sun.uucp (Guy Harris) writes:
>> As I stated in my original reply, dereferencing zero is always
>> a bad idea, even though it `works' (returns 0) on Vaxen.   So,
>> it isn't Sun's fault per se, and I don't think RCS is really
>> a supported product of Sun Microsystems... (anyone from Sun
>> care to comment on this?).
>
>RCS is not a supported product of Sun Microsystems.  SMI doesn't distribute
>RCS on any of its standard distribution tapes.
>
>Here's a fix to "rcs.c" to keep it from dropping core (this fix was
>originally discovered on another 68000-family machine which prohibits
>dereferencing null pointers):
>
>*** rcs.c.broken	Sun Sep  8 15:25:18 1985
>--- rcs.c	Sun Sep  8 15:26:51 1985
>***************
>*** 982,988
>          dummy.nextlock=next=Locks;
>          trail = &dummy;
>          while (next!=nil) {
>!                numr = strcmp(num, next->delta->num);
>                 if ((whor=strcmp(who,next->login))==0 &&
>                    (num==nil || numr==0))
>                          break; /* found a lock */
>
>--- 982,989 -----
>          dummy.nextlock=next=Locks;
>          trail = &dummy;
>          while (next!=nil) {
>!                if(num!=nil)
>!                        numr = strcmp(num, next->delta->num);
>                 if ((whor=strcmp(who,next->login))==0 &&
>                    (num==nil || numr==0))
>                          break; /* found a lock */
>
>	Guy Harris


   I think that there are a goodly number of null pointer/strcmp bugs
in rcs; we had a minimally used version that no one pushed too hard until
recently, when some of these bugs began to pop up.  Delving into the
source, I found a *lot* of places that looked like they potentially
had this bug; I'm pretty sure that I had to fix one (not the one above)
just to get it to run at all back when I first brought it up here.
Since I was feeling lazy and in a hurry, I just wrote another strcmp
that does relatively intelligent things with null pointers and changed
the makefile so that everything that looked suspicious had my version
linked in.  I readily admit that I should have tracked down all the
potential strangenesses and fixed them individually, but it looks like
it'd take a lot of time to do and I'm swamped as it is...
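
A replacement of that kind might look roughly like the sketch below.
The post does not say what the replacement actually did with null
pointers; treating a null pointer as an empty string is an assumption
made here for illustration, and the name `safe_strcmp` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical null-tolerant comparison.
 * Assumption: a null pointer compares as if it were the empty string.
 */
int safe_strcmp(const char *a, const char *b)
{
    if (a == NULL)
        a = "";
    if (b == NULL)
        b = "";
    return strcmp(a, b);
}
```

This keeps the program from faulting, but it also hides the fact that
the caller passed a null pointer in the first place.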

	-Steve

jdb@mordor.UUCP (John Bruner) (09/11/85)

In article <151@maynard.UUCP> campbell@maynard.UUCP (Larry Campbell) writes:
>A minor point, but dereferencing zero only "works" on Vaxen that are
>running Berkeley Unix (don't know about USG).  VMS sets page zero to
>no access; this is one of the few areas where I concede VMS a point
>over Unix.  It catches an all-too-common programming error.

I implemented an (optional) new object format with page 0 unmapped in
VAX 4.2BSD.  The changes to the kernel and a few user-level programs
(e.g. the linker) are minimal.  A program in this format (which I
call Z0MAGIC (0420) or -Z format) will receive a SIGBUS if it attempts
to indirect through NULL.  This has been very useful in finding
NULL-dereferencing bugs, both in locally-written code and in
distributed software.

I posted the set of changes to "net.unix-wizards" several months
ago.  If you missed them and would like a copy, mail me a note.
-- 
  John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
  MILNET: jdb@mordor [jdb@s1-c.ARPA]	(415) 422-0758
  UUCP: ...!ucbvax!dual!mordor!jdb 	...!seismo!mordor!jdb

guy@sun.uucp (Guy Harris) (09/14/85)

> In article <2772@sun.uucp> guy@sun.uucp (Guy Harris) writes:
> > (Whole damn article cited!)

A summary would have been sufficient.

>    I think that there are a goodly number of null pointer/strcmp bugs
> in rcs; we had a minimally used version that no one pushed too hard until
> recently, when some of these bugs began to pop up.  Delving into the
> source, I found a *lot* of places that looked like they potentially
> had this bug; I'm pretty sure that I had to fix one (not the one above)
> just to get it to run at all back when I first brought it up here.

We brought up the RCS that came off the 4.2BSD tape at CCI on our Power
5/20s; not only did they prohibit null pointer dereferencing, but they also
had 16-bit "int"s and 32-bit pointers, so all the null pointers passed as
arguments had to be properly cast and functions had to be properly declared.
We never saw any null-pointer-dereference problems other than the one
listed.  We may not have exercised all the paths through RCS, so there may
be others lurking.
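
The casting issue shows up most clearly with variable-argument
functions, where no prototype tells the compiler to convert a bare 0 to
a pointer.  The sketch below is an illustration only; `count_args` is a
hypothetical function, not part of RCS.

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical variadic function: counts its string arguments up to a
 * terminating null pointer.  The terminator must be passed as a
 * pointer, e.g. (const char *)0.  A bare 0 is passed as an int, which
 * is the wrong size where int is 16 bits and pointers are 32 bits.
 */
int count_args(const char *first, ...)
{
    va_list ap;
    const char *s;
    int n = 0;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}
```

A correct call is `count_args("a", "b", (const char *)0)`; writing the
last argument as plain `0` happens to work only where an int and a
pointer are passed the same way.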

> Since I was feeling lazy and in a hurry, I just wrote another strcmp
> that does relatively intelligent things with null pointers...

The only intelligent thing to do with a null pointer is to avoid
dereferencing it, and the most intelligent way to do that is to say "if this
pointer is null, it probably means that some argument wasn't supplied or
something like that.  As such, I probably want to do very different
processing - something like using the default value for that argument, or
not do whatever processing uses that argument's value, or something like
that.  If I do so, I'll probably automatically avoid dereferencing that null
pointer."
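
As a minimal sketch of that reasoning, assuming a null argument means
"not supplied" and that a default exists to fall back on (the name
`rev_or_default` is hypothetical):

```c
#include <stddef.h>

/*
 * Hypothetical helper: a null `rev` is taken to mean the caller did
 * not supply a revision, so the default is used instead.  The null
 * pointer is tested, never dereferenced.
 */
const char *rev_or_default(const char *rev, const char *default_rev)
{
    return rev != NULL ? rev : default_rev;
}
```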

	Guy Harris

asw@rlvd.UUCP (Antony Williams) (09/20/85)

In article <151@maynard.UUCP> campbell@maynard.UUCP (Larry Campbell) writes:
>A minor point, but dereferencing zero only "works" on Vaxen that are
>running Berkeley Unix (don't know about USG).  VMS sets page zero to
>no access; this is one of the few areas where I concede VMS a point
>over Unix.  It catches an all-too-common programming error.

This is a difficult decision for implementors:  the error is so common
that disabling address zero causes just about every Unix program to dump
core under some circumstances.  One instance I recall is that PIC
will dereference null pointers if given syntactically incorrect
input.  It works fine with correct input.  The problem with PIC
is exacerbated in that the null pointer is used as a pointer to
various kinds of structure, with further pointers at various
offsets:  V7 Unix on PDP11/70 used to ensure a few bytes of zeros
at address zero, but behaviour like that of PIC seems to require
an unknowable number of zeros to avoid the error.
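
The point about offsets can be made concrete with a sketch.  The
structure below is hypothetical, not taken from PIC.  If a null pointer
is address 0, then reading a member through it touches the bytes from
that member's offset up to the offset plus the member's size, so the
amount of zeroed memory needed depends on the structure layout.

```c
#include <stddef.h>

/* Hypothetical parse-tree node, for illustration only. */
struct node {
    int          kind;
    char         name[32];
    struct node *left;
    struct node *right;
};

/*
 * Number of bytes starting at address 0 that would have to be mapped
 * (and zero) for a read of np->right to "work" when np is a null
 * pointer represented as address 0.
 */
size_t zeros_needed_for_right(void)
{
    return offsetof(struct node, right) + sizeof(struct node *);
}
```

A larger structure, or a member further from the start, needs
correspondingly more zeroed bytes.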
-- 
--------------------------------------------------
UK JANET:	asw@uk.ac.rl.vd
Usenet:		{... | mcvax}!ukc!rlvd!asw
ARPAnet:	asw%rlvd@ucl-cs.arpa