[comp.lang.c] Messing with 0 ptr on m68020 & sys V / 68

riku@clinet.FI (Riku Kalinen) (11/25/88)

First of all: I know that messing around with 0 pointers is asking for
trouble. The following is kinda interesting, however..

Hardware: Motorola 8400 business - unix box with m68020
Software: System V / 68 ver. 5.3.1 (?) [ Sys V rel 3 ]

When I run the following program, it gives me 1024 '\0's and then a core dump
(when offset reaches 1024).

Seems that my process's address space contains 1 k of read-only nulls at the
very beginning.

Questions:
  1) Why? This causes reference thru 0 ptr to return 0 instead of core dump.
     (Of course, if I try to write something there, everything crashes as it
      should.)
  2) Who sets up process's memory when it is started? Kernel?
  3) Are there any good reasons to do this?

/* --- clip --- clip --- clip --- */

/* baz.c - mess around with null pointer. */

#include <stdio.h>
#include <ctype.h>

main()
{
  register char *base = 0;
  register unsigned long offset;
  register char ch;

  for (offset = (unsigned long) 0; offset < (unsigned long) 2000; offset ++)
    {
      ch = *(base + offset); /* dumps core when offset == 1024 == 1k */
      printf ("%04lx = %d\n", offset, (int) ch);
    }
}

/* --- clip --- clip --- clip --- */

Please send responses via email, I'll summarize if I get something usable.

And, PLEASE, PLEASE don't tell me this should never be done.
I know it already, but sometimes it is fun to try something impossible 8-) .


-- 
Riku "the bit" Kalinen                    Internet      : riku@clinet.FI
                                          Elisa/s.mail  : funet:riku@clinet.fi
City Lines Inc, Helsinki, Finland         Telephone int : +358 0 694 1056
    -- "..We are what we are and it's never enough.." (Chris de Burgh) --

bengsig@orcenl.uucp (Bjorn Engsig) (11/28/88)

In article <784@clinet.FI>, riku@clinet.FI (Riku Kalinen) writes:
> Questions:
>   1) Why? This causes reference thru 0 ptr to return 0 instead of core dump.
>      (Of course, if I try to write something there, everything crashes as it
>       should.)
>   2) Who sets up process's memory when it is started? Kernel?
>   3) Are there any good reasons to do this?
This "bug" has to be present for many of the utilities to run.  In SCCS,
for example, lots of programs have a chain of pointers which ends by just
referencing (xxx *)0 for reading.  That's just the way it is coded.  Your
OS vendor then has to allow you to read from address 0, whatever is there,
but you can of course not count on the contents, and (as you point out)
writing there gives you a bus error (or segmentation violation).
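
A rough sketch of the kind of pointer chain meant here, with the walk
written so that it stops at the null pointer instead of reading through it.
The node type and function name are hypothetical and not taken from the SCCS
sources; this only illustrates the pattern.

```c
#include <stddef.h>

/* Hypothetical node type; not taken from SCCS. */
struct node {
    int value;
    struct node *next;
};

/* Sums the values in a chain that ends with a null next pointer.
   The loop tests p itself before reading p->value, so the terminating
   null pointer is never dereferenced. */
int sum_chain(const struct node *head)
{
    int total = 0;
    const struct node *p;

    for (p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}
```

A loop that instead tests a field of the node (say, p->value != 0) before
testing p reads through the null pointer at the end of the chain, and only
terminates on systems where location 0 happens to read as zero.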
-- 
Bjorn Engsig, ORACLE Europe      \ /    "Hofstaedter's Law:  It always takes
 ..!uunet!mcvax!orcenl!bengsig    X      longer than you expect, even if you
phone:  +31 21 59 56 411         / \     take into account Hofstaedter's Law"

df@nud.UUCP (Dale Farnsworth) (11/28/88)

Riku Kalinen (riku@clinet.fi) writes:
> Hardware: Motorola 8400 business - unix box with m68020
> Software: System V / 68 ver. 5.3.1 (?) [ Sys V rel 3 ]
> 
> Seems that my process's address space contains 1 k read-only nulls in
> very beginning.
> 
> Questions:
>   1) Why? This causes reference thru 0 ptr to return 0 instead of core dump.

Yes, that's all it was intended to do.

>   2) Who sets up process's memory when it is started? Kernel?

Yes.  Actually, the null page doesn't get allocated at the start; it only
gets allocated (by the kernel) when it's first read.

>   3) Are there any good reasons to do this?

This is debatable.

The reasons are historical.  Once upon a time, the standard UNIX distribution
from AT&T ran on the PDP-11 family.  The instruction which began each program
(from crt0.o) just "happened" to have a zero-valued first byte.  Unfortunately,
a number of programs in the distribution relied on this "feature".  (They
contained constructs like this: { char *s; if (*s) x(); } rather than
{ char *s; if (s && *s) x(); } .)  When we did our first port to the 68000,
the instruction at location zero happened to *not* have a zero first byte.
These programs failed to operate on the 68000 port as they did on the PDP-11.
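
For illustration, the guarded form can be written out as a small function.
This is only a sketch; has_text is a hypothetical name, not something from
the distribution.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if s points to a non-empty string,
   0 if s is a null pointer or points to an empty string.
   && evaluates left to right and stops at the first false operand,
   so *s is never evaluated when s is null. */
int has_text(const char *s)
{
    return s != NULL && *s != '\0';
}
```

The unguarded test, if (*s), gives the same answer for a null s only when
location 0 is readable and happens to hold a zero byte.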

Arguably, the correct thing to do was to track down all such programs and
fix them.  This would have been time consuming and it would have been
difficult to know when the task was completed, since many of the programs
failed in subtle ways or in paths that were rarely executed.

We did an easy fix which provided the same behavior on the 68000 port as on
the PDP-11.  We added an innocuous instruction to the beginning of crt0.o
which provided a zero at location 0.  (This also had the side effect of
making it as difficult to debug null pointer dereferencing problems on the
68000 port as on the PDP-11.)

Later, when we did the port of SVR3 to the 68020, the first word of crt0.o
was no longer mapped to location 0.  We again had to make the decision.
What was the correct behavior of the system when a program referenced
location 0? Of course the technically correct thing to do was to signal
an exception.  This would still require that the incorrect programs
described above be repaired.  (In fact, we have since fixed these programs,
but it is difficult to be sure that we have found them all.)  Complicating
the issue was the fact that VARs and end-users had written programs which
made use of the "zero at location zero" behavior and compatibility with
past releases was (and remains) important.

We opted to map a "zero on demand" page to page zero.  This provides the
"zero at location zero" behavior with no overhead to a process which does
not use the behavior.  Unfortunately, it remains more difficult to debug null
pointer dereferencing problems than if an exception were signaled.
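
As a loose modern analogue (not the SVR3 kernel mechanism), a user program
on a current UNIX can request a read-only, zero-filled anonymous mapping
with mmap().  The sketch below assumes a system that provides MAP_ANONYMOUS
(for example Linux); the function names are hypothetical.  It lets the
kernel choose the address, since current systems typically refuse to map
page zero for ordinary processes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* assumption: needed for MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Maps len bytes of read-only anonymous memory at a kernel-chosen address.
   Anonymous mappings read as zero.  Returns NULL on failure. */
const unsigned char *map_zero_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : (const unsigned char *) p;
}

/* Returns 1 if every one of the len bytes at p reads as zero, else 0. */
int all_zero(const unsigned char *p, size_t len)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < len; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

Reads from such a region return zeros, while a write faults because the
mapping is PROT_READ only, which mirrors the behavior described in the
original question.  A real program would release the region with munmap().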

I must emphasize that programs should *not* rely on this behavior.  It
may not be maintained in future ports.  In fact, some existing Motorola
UNIX ports already signal an exception on a read of location zero.

These are my personal recollections and do not represent Motorola policy.

-Dale

-- 
Dale Farnsworth		602-438-3092	noao!asuvax!nud!df

henry@utzoo.uucp (Henry Spencer) (11/29/88)

In article <784@clinet.FI> riku@clinet.fi (Riku Kalinen) writes:
>Seems that my process's address space contains 1 k read-only nulls in
>very beginning.
>
>  1) Why? This causes reference thru 0 ptr to return 0 instead of core dump.

Somebody is making a concession to badly-written programs.  Sun had to bite
the bullet and fix this because the Sun 1 just couldn't do "*0", and their
fixes got propagated back into 4.nBSD.  System V, especially its early
releases, was a very different story, and provided considerable incentive
for having a readable location 0.

>  2) Who sets up process's memory when it is started? Kernel?

Right.  Actually it's a three-way collaboration between the kernel, the
details specified by ld, and the file format (which limits what ld can say).
-- 
SunOSish, adj:  requiring      |     Henry Spencer at U of Toronto Zoology
32-bit bug numbers.            | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (11/30/88)

In article <1564@nud.UUCP> df@nud.UUCP (Dale Farnsworth) writes:
>... Once upon a time, the standard UNIX distribution
>from AT&T ran on the PDP-11 family.  The instruction which began each program
>(from crt0.o) just "happened" to have a zero-valued first byte...

Not quite correct.  The first instruction did not in fact have a zero low
byte (or a zero high byte).  However, when a pdp11 program was compiled
split-space, to get maximum address space on a large 11, a one-word "shim"
was inserted at location 0 in data space to ensure that no legitimate
variable ever got put there (since C guarantees that &x != 0 for any x).
The shim, unfortunately, was a zero.  Most large and complex programs
needed split space, so their developers got used to having a readable
zero at location zero.

Friends of mine (at HCR) ran into this when using overlaying to make the
big stuff work on small (non-split-space) 11s.  They ended up inserting
a contrived instruction with a zero low byte at the beginning of the
startup code.  (They would have preferred a zero word, but that's a
HALT instruction -- illegal in user mode -- on the 11!)
-- 
SunOSish, adj:  requiring      |     Henry Spencer at U of Toronto Zoology
32-bit bug numbers.            | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

guy@auspex.UUCP (Guy Harris) (11/30/88)

>This "bug" has to be present for many of the utilities to run.

No, it doesn't.  You can fix the utilities; that's what was done at Sun
for all the cases noted.

There is (as has been stated zillions of times) no requirement in any C
specification that you be able to dereference a null pointer.  Some
systems disallow it (VMS is one that comes to mind).

>In e.g. the SCCS, lots of programs have a chain of pointers, which end by just
>referencing (xxx *)0 for reading.  That's just the way it is coded.

Well, it's not coded that way in the Sun version, because....

>Your OS vendor then has to allow you to read from address 0,

Sun doesn't allow it, and neither do some other vendors.

Motorola may have given up in disgust and mapped a bunch of zeroes there.

andrew@frip.gwd.tek.com (Andrew Klossner) (12/02/88)

>> This "bug" has to be present for many of the utilities to run.

> No, it doesn't.  You can fix the utilities; that's what was done at Sun
> for all the cases noted.

You can fix the utilities, but you can't cause all your potential
customers to fix their code before they try it on your system.

>> Your OS vendor then has to allow you to read from address 0,

> Sun doesn't allow it, and neither do some other vendors.

To our delight.  We get orders from customers whose VAX-developed,
buggy code runs on our workstations but not on Suns.  Perhaps Motorola
is mining the same market opportunity.

  -=- Andrew Klossner   (uunet!tektronix!hammer!frip!andrew)    [UUCP]
                        (andrew%frip.gwd.tek.com@relay.cs.net)  [ARPA]

guy@auspex.UUCP (Guy Harris) (12/02/88)

 >To our delight.  We get orders from customers whose VAX-developed,
 >buggy code runs on our workstations but not on Suns.  Perhaps Motorola
 >is mining the same market opportunity.

Could be, but Sun's revenues continue to grow well, demonstrating that
disallowing "*0" is not an absolute bar to success....