brandis@inf.ethz.ch (Marc Brandis) (02/11/91)
I noticed that when a program under AIX on the S/6000 reads through an invalid pointer, no exception is reported but instead the value 0 is returned. On write, the exception is reported. This seems to be true for the whole address range from 0 up to 7fffffff.

As I understand the hardware documentation, the MMU does in fact detect the illegal access, so the whole thing can only be a matter of the operating system.

Is there a way to turn this off so that all accesses to unmapped pages report an exception, not only writes?

Thanks for any pointers.

Marc-Michael Brandis
Computer Systems Laboratory, ETH-Zentrum (Swiss Federal Institute of Technology)
CH-8092 Zurich, Switzerland
email: brandis@inf.ethz.ch
jfc@athena.mit.edu (John F Carr) (02/12/91)
In article <24518@neptune.inf.ethz.ch> brandis@inf.ethz.ch (Marc Brandis) writes:
>As I understand the hardware documentation, the MMU does in fact detect the
>illegal access, so the whole thing can only be a matter of the operating system.

The problem is, page zero is mapped and readable. The MMU detects no exception.

>Is there a way to turn this off so that all accesses to unmapped pages report
>an exception, not only writes?

There doesn't appear to be a way in the current OS version to unmap page zero. Even using the linker -T and -D flags to move the program text and data out of page zero doesn't help (it doesn't put anything in page zero, but it leaves it accessible).

Speculation: IBM found too many things broke when they made NULL pointer dereferences trap. The documentation even says that *(int *)0 == 0. AIX 1.1 made NULL pointer reads trap, and IBM changed this for AIX 1.2 to allow reads from location 0. I don't know if the AIX 1 developers talk to the AIX 3 developers or not.

--
John Carr (jfc@athena.mit.edu)
brandis@inf.ethz.ch (Marc Brandis) (02/12/91)
In article <1991Feb12.033513.27494@athena.mit.edu> jfc@athena.mit.edu (John F Carr) writes:
>The problem is, page zero is mapped and readable. The MMU detects no
>exception.
>
>Speculation: IBM found too many things broke when they made NULL pointer
>dereferences trap. The documentation even says that *(int *)0 == 0. AIX
>1.1 made NULL pointer reads trap, and IBM changed this for AIX 1.2 to allow
>reads from location 0. I don't know if the AIX 1 developers talk to the AIX
>3 developers or not.

It is not just that page zero is mapped. Programs that are compiled and linked without any special options get the start of the text segment at address 0x10000000 and the start of the data segment at 0x20000000. The stack seems to grow downwards from 0x2ffffffc.

I wrote a small C program that tries to read from each page starting at 0. It did not stop until it reached the page at address 0x20044000, which is just above the end of the data segment. Looking at this, it seems that the OS maps everything below the end of the data segment as readable, as well as an area suitable for the stack (which was 0x2df80000 up to 0x2ffffffc for my test program).

These pages cannot all be mapped when the program starts, as this would mean that several hundred megabytes of memory would have to be mapped. Since the inverted page table architecture of the S/6000 does not easily allow sharing of pages, this would result in an awful lot of real memory being used up for this purpose. Considering the high amount of paging activity when I am running my test program, it seems that the pages are allocated when they are first accessed.

From all this together, I would guess that the MMU detects the access to an unmapped page and causes an exception, and that the trap handler in turn allocates a new page if the address is below the data segment limit. I really do not understand what this should be good for. But anyway, even if this may have a use somewhere, it should be possible to turn it off.

So, once again, does anybody know a way to turn it off, or does at least somebody have an explanation why this has been implemented like that and whether we can expect this to change in future releases of AIX?

Any help or pointers appreciated. Thanks.

Marc-Michael Brandis
Computer Systems Laboratory, ETH-Zentrum (Swiss Federal Institute of Technology)
CH-8092 Zurich, Switzerland
email: brandis@inf.ethz.ch
prener@arnor.uucp (02/13/91)
Actually, the compiler optimization takes advantage of the fact that location 0 is guaranteed to be readable and to contain a zero. This permits pointer-based loads to be done speculatively, on occasion, which, in turn, can improve the instruction scheduling.

--
Dan Prener (prener @ ibm.com)
jeffs@soul.esd.sgi.com (Jeff Smith) (02/14/91)
In article <1991Feb12.033513.27494@athena.mit.edu>, jfc@athena.mit.edu (John F Carr) writes:
|> Speculation: IBM found too many things broke when they made NULL pointer
|> dereferences trap. The documentation even says that *(int *)0 == 0. AIX
|> 1.1 made NULL pointer reads trap, and IBM changed this for AIX 1.2 to allow
|> reads from location 0. I don't know if the AIX 1 developers talk to the AIX
|> 3 developers or not.

Lots of things do break when *(char *)0 != 0. On early AIX/ps 1, *(char *)0 was really 'L'. The coff header was mapped in at address 0, and the first byte of the magic number corresponded to 'L', I believe. This broke lots of utilities from the RT tree (parts of PS/2 and RS/6000 AIX started here).

I wasn't around when they made *(char *)0 trap, but I know it wasn't an easy time. Surprised they changed it back in 1.2 though. I used 1.2 for 8 months or so, but never noticed it.

And no, the AIX 1 developers (PS/2) and the AIX 3 (RS/6000) developers don't talk much.

jeffs@sgi.com
jfh@greenber.austin.ibm.com (John F Haugh II) (02/15/91)
In article <1991Feb13.223557.3901@odin.corp.sgi.com> jeffs@sgi.com writes:
>And no, the AIX 1 developers (PS/2) and the AIX 3 (RS/6000) developers
>don't talk much.

This is a completely untrue statement. The department which I work in is comprised of a considerable number of version 3 developers. Not only myself, but three others from the same v3 area, plus several others from different v3 groups. There is a significant amount of communication from my area (PS/2, S/370) to the S/6000 area.

Many of the programmers who worked on the most recent round of PS/2 development here in Austin were contractors whose stint with IBM ran out as development on AIX v3 wound down. The same is true from the PS/2 side - as development on the latest PS/2 update/etc expired, a good number of contractors have gone back to AIX v3 as contractors. So, not only is there direct communication, there is also a good deal of cross-pollination.

What you must remember is that AIX v1 and AIX v3 are two different products running on two very different hardware platforms. There are restrictions on the PS/2 side that just don't exist on the S/6000, and vice versa. For example, v3 just won't fit on a 9370 or PS/2.

--
John F. Haugh II      | I've Been Moved | MaBellNet: (512) 838-4340
SneakerNet: 809/1D064 | AGAIN !         | VNET: LCCB386 at AUSVMQ
BangNet: ..!cs.utexas.edu!ibmchs!auschs!snowball.austin.ibm.com!jfh (e-i-e-i-o)
brandis@inf.ethz.ch (Marc Brandis) (02/18/91)
The replies I got so far about NULL pointer accesses seem to indicate the following two things.

1) The instruction scheduler may decide to execute a load speculatively before it has checked the pointer.

2) A lot of code would break if *(NULL) != 0.

Before I comment on these issues, let me briefly explain what I found on the machine. The whole address range from 0x0 to the end of the data segment is readable, with some areas also writable. There is an additional area for the stack which is both readable and writable. Since the data segment is usually allocated at address 0x20000000, this results in a readable space of more than 500 megabytes. Due to the inverted page tables in the S/6000, these are not just shared page table entries pointing to a single page; the pages are in fact allocated in memory the first time they are accessed. A program that I wrote to scan this area caused a lot of paging traffic. So, this whole behaviour has been implemented somewhere in the page fault handler of AIX.

Let me now comment on the above two points. I do not think that there are very many cases in which the speculative execution would be of any help, as the S/6000 does speculative execution in hardware. You can safely place the branch in front of the load and the machine will speculatively execute the load until it is sure whether the result is needed or not. Note that this is advantageous compared to using a load that has been moved in front of the branch, as with the latter all kinds of exceptions or cache misses have to be handled. In fact, I looked at the code generated by the XLC compiler with all optimizations turned on for several pieces of code that traverse data structures including NULL pointers, and I did not find a single example where the compiler inserted such a speculatively executed load.

I tend to believe more in the second issue that has been raised, that a lot of code would break. However, I checked several UNIX systems and I found that this does not seem to be the case for all systems.
The test is whether the machines trap on a read from address 0, sending a segmentation violation signal to the application.

    Machine                  OS                        Read from address 0
    Sun-3                    SunOS 4.0.3               traps
    SparcStation             SunOS 4.0.3               traps
    DECstation 5000          Ultrix V4.1 (Rev. 52)     traps
    IBM RS/6000 530          AIX 3                     no trap
    Sequent Symmetry S81     Dynix V3.0.17.9           no trap

So, it must be possible to make common UNIX programs run without having reads from 0 returning 0. I have to say that I feel a little bit insecure using the systems that do not trap, as programs that need to access memory that they have not allocated contain bugs for sure. But still, it would be great if there were a way to turn this off. In the long run I hope IBM is able to write their code so that it does not require reading through a NULL pointer.

Marc-Michael Brandis
Computer Systems Laboratory, ETH-Zentrum (Swiss Federal Institute of Technology)
CH-8092 Zurich, Switzerland
email: brandis@inf.ethz.ch