[net.bugs.usg] Comments on the preceding bugs in System V

bruce@stride.UUCP (Bruce Robertson) (03/26/86)

Just a quick question... these are *serious* bugs in a *production* version
of System V.  These aren't bugs due to our port to the 68000.  Also, these
are NEW bugs; none of them existed in Release 1 of System V. I know that
AT&T does Beta testing.  All of the bugs showed up within days of installing
the new stdio package, and were tracked down and fixed within a couple of
weeks.  I don't understand how they could have been missed by a
thorough Beta test.  Does AT&T just ignore Beta test results, or is it
just very careless when choosing Beta sites?
-- 

	Bruce Robertson
	UUCP: cbosgd!utah-cs!utah-gr!stride!bruce
	ARPA: stride!bruce@utah-gr.arpa

rcj@burl.UUCP (Curtis Jackson) (03/28/86)

We are running SVR2 on a Vax 11/780 and have been for quite some time.
We have noticed none of the problems you mentioned, and I even tried
to duplicate a couple of them using your examples with no success --
everything worked fine.  Now, are you *absolutely* sure that it wasn't
your port to the 68000 that caused these problems?  Can anyone else
on the net with an un-ported SVR2 duplicate any/all of these problems?
-- 

The MAD Programmer -- 919-228-3313 (Cornet 291)
alias: Curtis Jackson	...![ ihnp4 ulysses cbosgd allegra ]!burl!rcj
			...![ ihnp4 cbosgd decvax watmath ]!clyde!rcj

gwyn@brl-smoke.UUCP (03/28/86)

In article <566@stride.stride.UUCP> bruce@stride.UUCP (Bruce Robertson) writes:
>Just a quick question... these are *serious* bugs in a *production* version
>of System V.  These aren't bugs due to our port to the 68000.  Also, these
>are NEW bugs; none of them existed in Release 1 of System V. I know that
>AT&T does Beta testing.  All of the bugs showed up within days of installing
>the new stdio package, and were tracked down and fixed within a couple of
>weeks.  I don't understand how they could have been missed by a
>thorough Beta test.  Does AT&T just ignore Beta test results, or is it
>just very careless when choosing Beta sites?

It would be interesting to hear how AT&T tests their code.
I have made literally over a thousand bug fixes to SVR2 (V1)
user-mode source code so far, and as you observe, many of
the bugs were newly introduced.  Interestingly, although I'm
one of the major redistributors of UNIX System V source code
to UNIX System vendors, I've never even been approached by
AT&T about being a code or SVID reviewer, test site, etc.

By the way, thanks for sharing your bug fixes.

bruce@stride.UUCP (03/29/86)

AT&T turned us down to be a Beta site for Version 3, because they felt
that we "aren't committed to UNIX".  I don't see how they get that idea,
since UNIX is the primary operating system on our machines.
-- 

	Bruce Robertson
	UUCP: cbosgd!utah-cs!utah-gr!stride!bruce
	ARPA: stride!bruce@utah-gr.arpa

guy@sun.uucp (Guy Harris) (03/31/86)

> We are running SVR2 on a Vax 11/780 and have been for quite some time.
> We have noticed none of the problems you mentioned, and I even tried
> to duplicate a couple of them using your examples with no success --
> everything worked fine.  Now, are you *absolutely* sure that it wasn't
> your port to the 68000 that caused these problems?  Can anyone else
> on the net with an un-ported SVR2 duplicate any/all of these problems?

In no way does the fact that a particular example can be made to work on a
VAX prove that the bugs are due to the port, except in the sense that if
they'd ported to the VAX instead of the 68000 the bugs wouldn't have shown
up.  There are a LOT of bugs in S5R2V1 (note that he was reporting bugs in
V2) which don't show up on a VAX but definitely show up on machines where
there is no location 0 in a process' address space.

Programs which have bugs which cause them to dereference null pointers are
less likely to fail on VAX 4BSD or VAX System V because those systems start
the text at location 0, so if you illegally use a null pointer as if it
pointed to something the machine can at least fetch data using that pointer.
(VAX/VMS, however, does it right; the first page of the address space is not
mapped.)  However, many other systems do not have a location 0 in the user's
address space, and any attempt to dereference a null pointer will get caught
immediately.
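
To make the failure mode concrete, here is a minimal sketch of the kind of
guard that such code tends to be missing.  The helper name "name_matches"
is hypothetical and does not come from any System V source; it only
illustrates the pattern.  Passing a null pointer straight to strcmp() may
appear to work where location 0 is readable, but it is still an error; the
explicit test below avoids the dereference on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from any real library): returns 1 if the two
 * strings are equal, 0 otherwise.  A null pointer is treated as "no
 * match" instead of being handed to strcmp(), which would dereference it.
 */
int name_matches(const char *name, const char *wanted)
{
    if (name == NULL || wanted == NULL)
        return 0;               /* never dereference a null pointer */
    return strcmp(name, wanted) == 0;
}
```

A caller that obtains "name" from a routine which can return a null pointer
(getenv(), for instance) can then pass the result along without a separate
check at every call site.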

Changes have been posted for 4BSD that optionally build programs which, when
loaded into processes, have no location 0; System V Release 2 Version 2
already has this option - the "-z" flag to "ld".  I strongly suggest that
all future System V releases be built entirely with this option, so that
null pointer problems will get caught *before* the source gets shipped to
people with machines where this is not an option.  (I'd also suggest that
"-z" be made the default, but this would probably upset people by forcing
them to fix their code.)

My present company, and the company I worked at before then, both have
machines where processes do not have a location 0 which is accessible in
user mode; it is a real pain fixing all the "perfectly good" C code out
there to work on these machines.  The fact that it happens to work on VAXes
doesn't make it correct.
-- 
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.arpa	(yes, really)