[comp.unix.i386] Implementing NULL trapping on AT&T SVR3.2

junk1@cbnews.att.com (eric.a.olson) (07/06/90)

In article <1990Jul5.174608.17336@eci386.uucp> clewis@eci386.UUCP (Chris Lewis) writes:
>
>On System V (I'm 386/ix 1.0.6), the memory layout of an executable
>program is controlled by a default loader control file ("ifile"),
...
>386 one uses the "defaults" built into "ld"'s binary, which I can't
>seem to be able to reconstruct from the 386/ix Guide entries for
>the loader.  Eg: by manually creating an ifile, I can't seem to be
>able to build a binary that runs (and many variants won't even link - the
>examples seem defective).  
...
>
>Anyways, two questions:
>	1) Has anybody got a working ifile for a 386 UNIX system
>	   that I could try playing with?
>	2) Has anybody got a working ifile for 386 UNIX systems
>	   that explicitly maps *out* at least the first couple
>	   of pages at virtual 0 so that null dereferences fault?
>	   Is this possible?  (does the 386/ix execution model
>	   memory requirements forbid this?)
	
	I'd  like to do this too, but I've been seeing the same
	results you mention... The '-z' flag causes the loader
	to issue a complaint that the default bond address is
	not within allocated memory, and no ifile that I can 
	construct seems to produce a runnable program.

	Has anybody (Conor?) been able to do this?
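
To check whether a particular link really leaves address 0 unmapped, a small probe along these lines can help. It is only a sketch: the name null_read_faults is made up for illustration, and it assumes fork() and waitpid() with the POSIX wait macros, which an older System V release may not provide in this form (plain wait() would be the substitute there).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe: read address 0 in a child process.
 * Returns 1 if the child was killed by a signal (the read faulted),
 * 0 if the read succeeded, and -1 if the probe itself could not run.
 */
int null_read_faults(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Both volatiles keep the compiler from removing the load.
           Reading through a null pointer is undefined in C; doing
           it deliberately is the whole point of the probe. */
        volatile int *volatile p = 0;
        int value = *p;
        (void)value;
        _exit(0);               /* reached only if page 0 is readable */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? 1 : 0;
}
```

Linked into a program built with the ifile under test, a return value of 1 means the null read was trapped and 0 means page 0 is still mapped.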

pcg@cs.aber.ac.uk (Piercarlo Grandi) (07/07/90)

   In article <1990Jul5.174608.17336@eci386.uucp> clewis@eci386.UUCP
   (Chris Lewis) writes:
   
   On System V (I'm 386/ix 1.0.6), the memory layout of an executable
   program is controlled by a default loader control file ("ifile"),
   ...
   386 one uses the "defaults" built into "ld"'s binary, which I can't
   seem to be able to reconstruct from the 386/ix Guide entries for
   the loader.

You cannot. The example assumes a linker primitive that is not actually
there: the one that tells you how long the COFF header is. Without it
you must waste almost a pageful in the executable...

   	2) Has anybody got a working ifile for 386 UNIX systems
   	   that explicitly maps *out* at least the first couple
   	   of pages at virtual 0 so that null dereferences fault?
   	   Is this possible?  (does the 386/ix execution model
   	   memory requirements forbid this?)

That is pretty easy. All you have to do is to read, as a preliminary, the
Unix Papers (SAMS) article on the port of System V to the 386, as there
are a couple of non-obvious tricks: you must make the data begin at the
same offset within the page as where the code ends, and you must make the
code begin -- within the loadable file itself -- at a page boundary.

I had posted some months ago a full set of patches to g++ 1.36.x that
contained this ifile, and the ifile itself separately. If any kind soul
has saved it, they might want to repost it (should go in the frequently
asked questions writeup) or send it to Chris Lewis (my copy is on my
home machine, i.e. not handy here).

Another alternative is to use the gdb patches that enable watchpoints,
and set a watchpoint on address 0.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

pcg@cs.aber.ac.uk (Piercarlo Grandi) (07/09/90)

In article I, <PCG.90Jul7172536@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk
(Piercarlo Grandi), write:

      In article <1990Jul5.174608.17336@eci386.uucp> clewis@eci386.UUCP
      (Chris Lewis) writes:

      On System V (I'm 386/ix 1.0.6), the memory layout of an executable

By the way: take advantage of ISC's *generous* upgrade policy and get up
to 2.x, which is vastly improved, or get (for probably much less than
the upgrade cost) ESIX rev. D, which apparently has the Berkeley FFS, as
well as RFS, TCP/IP, X11, etc...

      program is controlled by a default loader control file ("ifile"),
      ...
      386 one uses the "defaults" built into "ld"'s binary, which I can't
      seem to be able to reconstruct from the 386/ix Guide entries for
      the loader.

	[ ... ]

   I had posted some months ago a full set of patches to g++ 1.36.x that
   contained this ifile, and the ifile itself separately. If any kind soul
   has saved it, they might want to repost it (should go in the frequently
   asked questions writeup) or send it to Chris Lewis (my copy is on my
   home machine, i.e. not handy here).

Well, given my infinite generosity I have myself brought over from home
the ifile concerned, embellished it a bit, and here it is:

-----------------------cut here-----------------------------------
/*
    Copyright 1989,1990 Piercarlo Grandi. All rights reserved.

    This source is free software; you can redistribute it and/or
    modify it under the terms of the GNU General Public License as
    published by the Free Software Foundation; either version 1, or
    (at your option) any later version.

    This source is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You may have received a copy of the GNU General Public License
    along with this source; if not, write to the Free Software
    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/

/*
    This is a set of sysV 3.2 directives to assist with making the -z
    option of ld(1) work.  This option undefines a stretch of memory
    starting with virtual address 0, thus helping to catch stray
    memory references (typically indirections thru the 0 pointer).

    Unfortunately -z only redefines the memory map; this script must
    be also used to ensure that the first section (.text) begins at
    the first valid virtual memory map location and that it begins in
    the executable file at a page boundary, so that demand loading is
    still possible.

    On a sysV/386 pages are 0x1000 or 4K bytes long, and segments are
    0x400000 or 4M bytes long.
*/

/*
    Just for curiosity, here are the directives that would set up the
    memory map appropriately (well, the stack is a bit bogus); if you
    use these, you can leave option -z out, but you get a limit on the
    number of supported shared libraries.  Note that -z starts code
    at 0x00020000; by manipulating three values you can change that.

    Note that the addresses of the shared libraries after the 1st are
    a bit speculative, as are the origin and length of the stack.

    The kernel and uarea ranges are there only if you want to do funny
    things; they could be easily left out. If you want to use them,
    you have to use a noload section.
*/

/*
MEMORY
{
	code	(RXI)		: origin=0x00020000,length=0x003e0000
	data	(RWXI)		: origin=0x00400000,length=0x00400000
	stack	(RWX)		: origin=0x40000000,length=0x40000000

	code1	(RX)		: origin=0xa0000000,length=0x00400000
	data1	(RX)		: origin=0xa0400000,length=0x00400000
	code2	(RX)		: origin=0xa0800000,length=0x00400000
	data2	(RX)		: origin=0xa0c00000,length=0x00400000
	code3	(RX)		: origin=0xa1000000,length=0x00400000
	data3	(RX)		: origin=0xa1400000,length=0x00400000
	code4	(RX)		: origin=0xa1800000,length=0x00400000
	data4	(RX)		: origin=0xa1c00000,length=0x00400000

	kernel	(RX)		: origin=0xd0010000,length=0x003f0000
	uarea	(R)		: origin=0xe0000000,length=0x00020000
}
*/

SECTIONS
{
    /*
	Ensure that text is the first section loaded. Note that we align the
	start of code to the first 4K bytes in the object file to make it
	possible to demand load it. We could have instead aligned it to the
	address immediately after the end of the COFF headers, but ld does not
	give us a primitive with the size of the COFF header. We therefore
	align code to a page boundary, and this incidentally leaves the first
	4K bytes free for the COFF headers. They should never even approach
	that size, so it is a bit of disk space waste, but demand loading
	is important, and also peace of mind that they do not overwrite the
	beginning of the code section.
    */

    .text	BIND(0x00020000)	/* -z starts virtual mem here	*/
		BLOCK(0x00001000):	/* Align text in file to page	*/
    {
	*(.init)
	*(.text)
	*(.fini)
    }

    /*
        Ensure that data and bss begin at the next region boundary
        (0x400000) and that they begin at an offset within the page
        that is the same as the offset of the end of the text region
        (note that we *know* that text begins on a page boundary
        here).  This may waste some bytes in the first page of the
        data+bss region, but allows it to overlap the text region in
        the page table, thus saving a lot of page table space.  See
        the relevant article in Unix Papers (SAMS).
    */

    GROUP	BIND(NEXT(0x00400000) + SIZEOF(.text)%0x1000):
    {
	.data			: { }
	.bss			: { }
    }
}
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
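
The arithmetic behind the GROUP BIND expression in the ifile above can be sketched in C. This is an illustration only: the function names are made up, and next_multiple() models NEXT() as rounding up to a multiple of its argument, which is an assumption about how ld treats an address that already sits on a boundary.

```c
#include <assert.h>

#define PAGE_BYTES   0x1000UL    /* 4K pages, per the ifile comments */
#define REGION_BYTES 0x400000UL  /* 4M segments, per the ifile comments */

/* Smallest multiple of align that is >= addr (assumed model of NEXT()). */
unsigned long next_multiple(unsigned long addr, unsigned long align)
{
    assert(align > 0);
    return ((addr + align - 1) / align) * align;
}

/*
 * Address at which .data would be bound, mirroring
 *     NEXT(0x00400000) + SIZEOF(.text) % 0x1000
 * for a .text section bound at the page-aligned address text_addr.
 */
unsigned long data_bind_addr(unsigned long text_addr, unsigned long text_size)
{
    unsigned long text_end = text_addr + text_size;

    assert(text_addr % PAGE_BYTES == 0);
    return next_multiple(text_end, REGION_BYTES) + text_size % PAGE_BYTES;
}
```

For example, with .text bound at 0x00020000 and 0x1234 bytes long, the text ends at offset 0x234 within its page, and data_bind_addr() returns 0x00400234, which has the same offset within its own page.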

fnf@riscokid.UUCP (Fred Fish) (07/10/90)

In article <PCG.90Jul9124816@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>	possible to demand load it. We could have instead aligned it to the
>	address immediately after the end of the COFF headers, but ld does not
>	give us a primitive with the size of the COFF header. We therefore
>	align code to a page boundary, ...

The implementations I am familiar with support the construct SIZEOF(.headers)
to obtain the aggregate size of all the COFF headers.  I don't know if this
is just undocumented in some implementations or actually nonexistent in them.

-Fred

clewis@eci386.uucp (Chris Lewis) (07/11/90)

I've come up with a partial (non-generalized) solution to implementing
NULL dereference catching...  (Incidentally, you changed the subject
to SVR3.2[.2] - we're 386/ix 1.0.6 which is SVR3.0, but I think my
partial solution will still work on SVR3.2)

In article <PCG.90Jul7172536@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
 
|   In article <1990Jul5.174608.17336@eci386.uucp> clewis@eci386.UUCP
|   (Chris Lewis) writes:
    
|>On System V (I'm 386/ix 1.0.6), the memory layout of an executable
|>program is controlled by a default loader control file ("ifile"),
|>    ...
|>386 one uses the "defaults" built into "ld"'s binary, which I can't
|>seem to be able to reconstruct from the 386/ix Guide entries for
|>the loader.
 
| You cannot. The example assumes a linker primitive that is not actually
| there: the one that tells you how long the COFF header is. Without it
| you must waste almost a pageful in the executable...

I managed to construct an ifile that does almost what I want:

>SECTIONS
>{
>    .text 0xd0 : { *(.init) *(.text) *(.fini) }
>    GROUP BIND ( NEXT(0x400000) +
>	((SIZEOF(.text) + ADDR(.text)) % 0x2000)) :
>	{
>	    .data : { }
>	    .bss : { }
>	}
>}

The 0xd0 is in replacement of the "sizeof_headers" primitive that's
missing according to the Guide (p 12-14).  The 0xd0 is formed by
computing:

	sizeof(struct filehdr) + sizeof(struct aouthdr) +
	    4 * sizeof(struct scnhdr)

The 4 is the number of sections that dump -h tells you about when
you link without the ifile (.text/.data/.bss/.comment).  If you
use shared libraries in your link (-lc_s), this number goes to 7 (constant
becomes 0x148).  These constants may be slightly different on your
machine - you'll have to figure out how big filehdr, aouthdr and scnhdr
are, and run the loader without an ifile to figure out how many sections
would be in the output file.
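
As a worked version of that computation, here is a minimal C sketch. The three structure sizes are assumptions (20, 28 and 40 bytes, the usual COFF layout, which does reproduce the 0xd0 and 0x148 figures above); on a real system they should come from sizeof() on the structures in <filehdr.h>, <aouthdr.h> and <scnhdr.h>. The macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed on-disk sizes in bytes; verify with sizeof() on the target. */
#define FILEHDR_BYTES 20UL   /* sizeof(struct filehdr) */
#define AOUTHDR_BYTES 28UL   /* sizeof(struct aouthdr) */
#define SCNHDR_BYTES  40UL   /* sizeof(struct scnhdr)  */

/* Total size of the headers for an output file with nsections sections. */
unsigned long coff_headers_size(unsigned int nsections)
{
    assert(nsections > 0);
    return FILEHDR_BYTES + AOUTHDR_BYTES
         + (unsigned long)nsections * SCNHDR_BYTES;
}

/* Print the constant to use as the .text origin in the ifile. */
void print_text_origin(unsigned int nsections)
{
    printf("%u sections: .text origin 0x%lx\n",
           nsections, coff_headers_size(nsections));
}
```

With 4 sections the sum is 20 + 28 + 160 = 208 = 0xd0, and with 7 sections it is 20 + 28 + 280 = 328 = 0x148.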

[Explanation: link .init/.text/.fini at 0xd0, link .data and .bss
contiguously starting at 0x400000, with the same starting offset
within a page as the end of the .text area - the 0x400000 is the "region"
according to the SAMS book where data is supposed to go.]

If you change the 0x00d0 to 0x10d0 or 0x20d0 or 0x30d0 etc., page 0
is not mapped into memory and the program will fault on a null-dereference.
Yeah!  What I still cannot do is make this ifile independent of the
size of the headers....  It appears as if the loader (kernel?) automatically
prepends the headers to the output .text section in the executing image,
and you have to get the offset of the .text section to be equal to the
size of the headers if you want the resultant "virtual" .text to start
at an offset of 0 within a page.  If I set the constant to 0x20d0,
virtual address 0x2000 has the filehdr magic number again....
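
The page arithmetic described here can be checked with a few lines of C. This is a sketch under the assumptions stated in the thread: 4K pages, and headers that are mapped immediately below .text in the running image. All names are invented for the example.

```c
#include <assert.h>

#define PAGE_BYTES 0x1000UL   /* 4K pages */

/* Virtual address where the mapped image (headers, then .text) starts. */
unsigned long image_start(unsigned long text_addr, unsigned long hdr_size)
{
    assert(text_addr >= hdr_size);
    return text_addr - hdr_size;
}

/* Nonzero if the prepended headers begin exactly on a page boundary. */
int image_page_aligned(unsigned long text_addr, unsigned long hdr_size)
{
    return image_start(text_addr, hdr_size) % PAGE_BYTES == 0;
}

/* Nonzero if the image starts above page 0, leaving page 0 unmapped. */
int page_zero_unmapped(unsigned long text_addr, unsigned long hdr_size)
{
    return image_start(text_addr, hdr_size) >= PAGE_BYTES;
}
```

With the 0xd0 header size, a .text origin of 0x20d0 gives an image start of 0x2000, which is page aligned and above page 0; an origin of 0xd0 puts the headers at address 0, so page 0 stays mapped.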

| That is pretty easy. All you have to do is to read, as a preliminary, the
| Unix Papers (SAMS) article on the port of System V to the 386, as there
| are a couple of non-obvious tricks: you must make the data begin at the
| same offset within the page as where the code ends, and you must make the
| code begin -- within the loadable file itself -- at a page boundary.

I borrowed a copy, and it wasn't too helpful - it is too high-level to mention
that file headers get automatically prepended to the .text area and it's the
file headers that must start at a page boundary...

Thank you Piercarlo for getting me onto the right track.
-- 
Chris Lewis, Elegant Communications Inc, {uunet!attcan,utzoo}!lsuc!eci386!clewis
Ferret mailing list: eci386!ferret-list, psroff mailing list: eci386!psroff-list