[comp.unix.aix] Invalid pointer traps

brandis@inf.ethz.ch (Marc Brandis) (02/11/91)

I noticed that when a program under AIX on the S/6000 reads through an invalid
pointer, no exception is reported but instead the value 0 is returned. On write,
the exception is reported. This seems to be true for the whole address range
from 0 up to 7fffffff.

As I understand the hardware documentation, the MMU does in fact detect the
illegal access, so the whole thing can only be a matter of the operating system.

Is there a way to turn this off so that all accesses to unmapped pages report
an exception, not only writes?

Thanks for any pointers.


Marc-Michael Brandis
Computer Systems Laboratory, ETH-Zentrum (Swiss Federal Institute of Technology)
CH-8092 Zurich, Switzerland
email: brandis@inf.ethz.ch

jfc@athena.mit.edu (John F Carr) (02/12/91)

In article <24518@neptune.inf.ethz.ch> brandis@inf.ethz.ch (Marc Brandis) writes:
>As I understand the hardware documentation, the MMU does in fact detect the
>illegal access, so the whole thing can only be a matter of the operating system.

The problem is, page zero is mapped and readable.  The MMU detects no
exception.

>Is there a way to turn this off so that all accesses to unmapped pages report
>an exception, not only writes?

There doesn't appear to be a way in the current OS version to unmap page
zero.  Even using the linker -T and -D flags to move the program text and
data out of page zero doesn't help (it doesn't put anything in page zero,
but it leaves it accessible).

Speculation: IBM found too many things broke when they made NULL pointer
dereferences trap.  The documentation even says that *(int *)0 == 0.  AIX
1.1 made NULL pointer reads trap, and IBM changed this for AIX 1.2 to allow
reads from location 0.  I don't know if the AIX 1 developers talk to the AIX
3 developers or not.

--
    John Carr (jfc@athena.mit.edu)

brandis@inf.ethz.ch (Marc Brandis) (02/12/91)

In article <1991Feb12.033513.27494@athena.mit.edu> jfc@athena.mit.edu (John F Carr) writes:
>The problem is, page zero is mapped and readable.  The MMU detects no
>exception.
>
>Speculation: IBM found too many things broke when they made NULL pointer
>dereferences trap.  The documentation even says that *(int *)0 == 0.  AIX
>1.1 made NULL pointer reads trap, and IBM changed this for AIX 1.2 to allow
>reads from location 0.  I don't know if the AIX 1 developers talk to the AIX
>3 developers or not.

It is not just that page zero is mapped. Programs that are compiled and linked
without any special options get the start of the text segment at address 
0x10000000 and the start of the data segment at 0x20000000. The stack seems to
grow downwards from 0x2ffffffc. I wrote a small C program that tries to read
from each page starting at 0. It did not stop until it reached the page at
address 0x20044000, which is just above the end of the data segment.

Looking at this, it seems that the OS maps just everything below the data 
segment as readable as well as an area suitable for the stack (which was 
0x2df80000 up to 0x2ffffffc for my test program). 

These pages cannot all be mapped when the program starts, as this would mean
that several hundred megabytes of memory would have to be mapped. Since the
inverted page table architecture of the S/6000 does not easily allow sharing
of pages, this would result in an awful lot of real memory used up for this
purpose. Considering the high amount of paging activity when I am running my
test program, it seems that the pages become allocated when accessed.

From all this together, I would guess that the MMU detects the access to an
unmapped page, causes an exception and that the trap handler in turn allocates
a new page if its address is below the data segment limit. I really do not
understand what this should be good for. But anyway, even if this may have a
use somewhere, it should be possible to turn it off.

So, once again, does anybody know a way to turn it off, or does at least 
somebody have an explanation why this has been implemented like that and
whether we can expect this to change in future releases of AIX?

Any help or pointers appreciated. Thanks.


Marc-Michael Brandis
Computer Systems Laboratory, ETH-Zentrum (Swiss Federal Institute of Technology)
CH-8092 Zurich, Switzerland
email: brandis@inf.ethz.ch

prener@arnor.uucp (02/13/91)

Actually, the compiler's optimizer takes advantage of the fact that location
0 is guaranteed to be readable and to contain a zero.  This permits pointer-based
loads to be done speculatively, on occasion, which, in turn, can improve
the instruction scheduling.
-- 
                                   Dan Prener (prener @ ibm.com)
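
The kind of transformation described above can be imitated at source level.
The sketch below is not compiler output and not taken from any IBM
documentation; it is a hypothetical illustration in which a zero-filled
sentinel node (zero_node) stands in for the readable, zero-filled page at
address 0, so that the hoisted loads stay legal on systems that trap on NULL.
All names are assumptions.

```c
#include <stddef.h>

struct node {
    int key;
    const struct node *next;
};

/* Stand-in for page zero: readable and all zeros. On a system where
 * address 0 reads as zero, NULL itself could play this role. */
static const struct node zero_node = { 0, NULL };
#define END_OF_LIST (&zero_node)

/* Conventional search over a NULL-terminated list: the pointer is
 * tested before anything is loaded through it. */
static const struct node *find_checked(const struct node *p, int key)
{
    while (p != NULL) {
        if (p->key == key)
            return p;
        p = p->next;
    }
    return NULL;
}

/* Hoisted search over a list terminated by END_OF_LIST: both loads are
 * issued before the end-of-list test, which is only safe because the
 * terminator is guaranteed to be readable. */
static const struct node *find_hoisted(const struct node *p, int key)
{
    for (;;) {
        int k = p->key;                 /* speculative load */
        const struct node *n = p->next; /* speculative load */

        if (p == END_OF_LIST)
            return NULL;                /* loaded values are discarded */
        if (k == key)
            return p;
        p = n;
    }
}
```

In find_hoisted the scheduler is free to overlap the loads with the compare
and branch, which is the benefit claimed for a readable location 0.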

jeffs@soul.esd.sgi.com (Jeff Smith) (02/14/91)

In article <1991Feb12.033513.27494@athena.mit.edu>, jfc@athena.mit.edu
(John F Carr) writes:
|> Speculation: IBM found too many things broke when they made NULL pointer
|> dereferences trap.  The documentation even says that *(int *)0 == 0.  AIX
|> 1.1 made NULL pointer reads trap, and IBM changed this for AIX 1.2 to allow
|> reads from location 0.  I don't know if the AIX 1 developers talk to the AIX
|> 3 developers or not.

Lots of things do break when *(char *)0 != 0.  On early AIX/ps 1, *(char *)0
was really 'L'.  The coff header was mapped in at address 0, and the first
byte of the magic number corresponded to 'L', I believe.  This broke lots of
utilities from the RT tree (parts of PS/2 and RS/6000 AIX started here).

I wasn't around when they made *(char *)0 trap, but I know it wasn't an
easy time.  Surprised they changed it back in 1.2 though.  I used 1.2 for
8 months or so, but never noticed it.

And no, the AIX 1 developers (PS/2) and the AIX 3 (RS/6000) developers
don't talk much.

jeffs@sgi.com

jfh@greenber.austin.ibm.com (John F Haugh II) (02/15/91)

In article <1991Feb13.223557.3901@odin.corp.sgi.com> jeffs@sgi.com writes:
>And no, the AIX 1 developers (PS/2) and the AIX 3 (RS/6000) developers
>don't talk much.

This is a completely untrue statement.  The department which I work
in is comprised of a considerable number of version 3 developers.
Not only myself, but three others from the same v3 area, plus several
others from different v3 groups.

There is a significant amount of communication from my area (PS/2,
S/370) to the S/6000 area.  Many of the programmers who worked on
the most recent round of PS/2 development here in Austin were
contractors whose stint with IBM ran out as development on AIX v3
wound down.  The same is true from the PS/2 side - as development
on the latest PS/2 update/etc expired, a good number of contractors
have gone back to AIX v3 as contractors.

So, not only is there direct communication, there is also a good deal
of cross-pollination.  What you must remember is that AIX v1 and
AIX v3 are two different products running on two very different
hardware platforms.  There are restrictions on the PS/2 side that
just don't exist on the S/6000, and vice versa.  For example, v3
just won't fit on a 9370 or PS/2.
-- 
John F. Haugh II      |      I've Been Moved     |    MaBellNet: (512) 838-4340
SneakerNet: 809/1D064 |          AGAIN !         |      VNET: LCCB386 at AUSVMQ
BangNet: ..!cs.utexas.edu!ibmchs!auschs!snowball.austin.ibm.com!jfh (e-i-e-i-o)

brandis@inf.ethz.ch (Marc Brandis) (02/18/91)

The replies I got so far about NULL pointer accesses seem to indicate the
following two things.

	1) The instruction scheduler may decide to execute a load speculatively
	   before it has checked the pointer.

	2) A lot of code would break if *(NULL) != 0.

Before I comment on these issues, let me briefly explain what I found on the
machine. The whole address range from 0x0 to the end of the data segment is
readable, with some areas also writable. There is an additional area for the
stack which is both readable and writable. Since the data segment is usually
allocated at address 0x20000000, this results in a space of about 600 megabytes
which is readable. Due to the inverted page tables in the S/6000, these are
not just shared page table entries pointing to a single page but in fact are 
allocated in memory the first time they are accessed. A program that I wrote
to scan this area caused a lot of paging traffic.
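
The size of the readable area can be checked with a little arithmetic. The
sketch below uses the addresses reported earlier in this thread for one test
program (readable from 0 up to 0x20044000, stack from 0x2df80000 up to
0x2ffffffc) and assumes 4 KB pages; the constant and function names are made
up for illustration, and the figures will differ for other programs.

```c
/* Assumption: 4 KB pages. */
#define ASSUMED_PAGE_SIZE 4096UL

/* Addresses observed for one test program, as reported in the thread. */
#define LOW_AREA_START   0x00000000UL
#define LOW_AREA_END     0x20044000UL  /* first page that trapped */
#define STACK_AREA_START 0x2df80000UL
#define STACK_AREA_END   0x30000000UL  /* one past the last word, 0x2ffffffc */

/* Number of bytes in the half-open range [start, end). */
static unsigned long span_bytes(unsigned long start, unsigned long end)
{
    return end - start;
}

/* Number of pages needed to cover the given number of bytes. */
static unsigned long pages_for(unsigned long bytes)
{
    return (bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}

/* Total readable bytes: low area plus stack area. */
static unsigned long total_readable_bytes(void)
{
    return span_bytes(LOW_AREA_START, LOW_AREA_END)
         + span_bytes(STACK_AREA_START, STACK_AREA_END);
}
```

With these figures the low area is 537,149,440 bytes and the stack area
34,078,720 bytes, together 571,228,160 bytes or 139,460 pages, which is in
line with the rough figure given above.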

So, all of this has been implemented somewhere in the page fault handler
of AIX.

Let me now comment on the above two points. I do not think that there are very
many cases in which the speculative execution would be of any help, as the
S/6000 does speculative execution in hardware. You can safely place the branch
in front of the load and the machine will speculatively execute the load until
it is sure whether the result is needed or not. Note that this is
advantageous compared to using a load that has been moved in front of the
branch, as for such a load all kinds of exceptions or cache misses have to be
handled even if the result turns out not to be needed.

In fact, I looked at the code generated by the XLC compiler with all 
optimizations turned on for several pieces of code that traverse data 
structures including NULL pointers, and I did not find a single example where
the compiler inserted such a speculatively executed load.

I tend to believe more in the second issue that has been raised, that a lot of
code would break. However, I checked several UNIX systems and I found that this
does not seem to be the case for all systems. The test is whether the machines
trap on a read from address 0, sending a segmentation violation signal to the
application.

	Sun-3, SunOS 4.0.3			traps
	SparcStation, SunOS 4.0.3		traps
	DECstation 5000, Ultrix V4.1 (Rev. 52)	traps
	IBM RS/6000 530, AIX 3			no trap
	Sequent Symmetry S81, Dynix V3.0.17.9	no trap
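
A test of this kind can be sketched as follows. This is not the program that
produced the table above; it is a minimal illustration that assumes a POSIX
system with fork and waitpid, and the name null_read_traps is made up. The
read is done in a child process so that the parent can report the outcome
either way.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if reading an int from address 0 kills the child with a
 * signal, 0 if the read completes, and -1 if fork or waitpid fails. */
static int null_read_traps(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: keep the address in a volatile variable so the compiler
         * cannot see a constant null pointer and must emit a real load. */
        volatile uintptr_t zero = 0;
        volatile int value = *(volatile int *)zero;
        (void)value;
        _exit(0);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

A result of 1 corresponds to "traps" in the table and 0 to "no trap".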

So, it must be possible to make common UNIX programs run without having reads
from 0 returning 0. I have to say that I feel a little bit insecure using
these systems, as programs that need to access stuff that they do not have
allocated contain bugs for sure. 

But still, it would be great if there were a way to turn this off. In the long
run I hope IBM is able to write their code so that it does not require
reading through a NULL pointer.


Marc-Michael Brandis
Computer Systems Laboratory, ETH-Zentrum (Swiss Federal Institute of Technology)
CH-8092 Zurich, Switzerland
email: brandis@inf.ethz.ch