[comp.os.minix] MINIX memory management/protection

ballou@brahms.Berkeley.EDU.UUCP (02/05/87)

In article <511@bobkat.UUCP> m5d@bobkat.UUCP (Mike McNally (dlsh)) writes:
>In article <1169@steinmetz.steinmetz.UUCP> davidsen@kbsvax.UUCP (william E Davidsen) writes:
>>
>>If you allocate a full 64k to data, there is hardware protection: you
>>can't address more than that. This assures that any program which
>>doesn't deliberately set out to cause problems will not modify the
>>code. ...
>
>What about a program with a bug in it?  Like "strcpy(a, b)" when "a" is
>not quite what I meant?  It's real easy to make this kind of mistake;
>how many times while debugging a program on a VAX (or whatever) do you
>get SIGBUS or SIGSEGV?
[Omitted here:  a description of a phenomenon with which I'm sure we are
 all too painfully familiar -- how wild pointers can crash programs and
 machines.]

	I think you have missed a key point here which depends on the iAPX86
architecture.  Because addresses are constructed as SEGMENT:OFFSET, as long
as the compiler generates no code that would reload the segment registers
and as long as you do not use any assembly code in your program that does
this, then you are physically constrained to 64K bytes starting at absolute
location 16 * SEGMENT.  Since pointers are passed only as offsets, the
worst you can do is scribble over your 64K segment.
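
The address arithmetic described above can be sketched in C. This is only
an illustration: the names phys_addr and advance_offset are made up for the
sketch and do not come from MINIX or from any compiler's runtime.

```c
#include <stdint.h>

/* 8086 address formation: physical = 16 * SEGMENT + OFFSET, truncated to
   the 20 address lines of the 8086. (Hypothetical helper name.) */
uint32_t phys_addr(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* In the small model a pointer is only the 16-bit offset, so arithmetic on
   it wraps at 64K and stays inside the same segment. (Hypothetical name.) */
uint16_t advance_offset(uint16_t offset, uint16_t delta)
{
    return (uint16_t)(offset + delta);
}
```

For example, phys_addr(0x1234, 0x0010) is 0x12350, and advance_offset(0xFFFF, 1)
wraps to 0 rather than reaching the next segment.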

	However, this is a blatant lie, and it is possible you are
vindicated.  For, if auto variables are allocated on the stack, one
could still easily lose.  Since the stack must lie in the same segment
as the static data (otherwise, pointers must have segments associated
with them to distinguish between auto and static variables), it is
possible to scribble over the stack.  In doing so, one could alter a
return address and find oneself in another process, or perhaps the
kernel.  Also, equally likely is that one would try to execute data
and encounter an illegal opcode.  I believe (but I am not certain)
that this halts the 8086.
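
A rough simulation of the stack scribble, as a sketch only. It assumes a toy
frame layout (an 8-byte auto buffer with the saved 16-bit return IP directly
above it; a real frame usually also holds a saved BP), and the names seg and
overrun_return_ip are invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SEG_SIZE 0x10000u

/* One simulated 64K data/stack segment (a model, not MINIX code). */
static uint8_t seg[SEG_SIZE];

/* Place an 8-byte buffer at buf_off with the saved return IP stored just
   above it, then copy src into the buffer with no bounds check, the way
   strcpy would. Returns the IP that a near RET would pop afterwards. */
uint16_t overrun_return_ip(uint16_t buf_off, uint16_t saved_ip, const char *src)
{
    uint16_t ret_off = (uint16_t)(buf_off + 8);
    uint16_t dst = buf_off;

    seg[ret_off] = (uint8_t)(saved_ip & 0xFF);            /* low byte first */
    seg[(uint16_t)(ret_off + 1)] = (uint8_t)(saved_ip >> 8);

    for (size_t i = 0; ; i++) {
        seg[dst] = (uint8_t)src[i];
        if (src[i] == '\0')
            break;
        dst = (uint16_t)(dst + 1);   /* offsets wrap inside the segment */
    }
    return (uint16_t)(seg[ret_off] | (seg[(uint16_t)(ret_off + 1)] << 8));
}
```

With a saved IP of 0x0123, copying "short" leaves the return IP intact, while
copying the 10-character string "AAAAAAAAAB" replaces it with 0x4241. Either
way the popped value is still only a 16-bit offset.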

	So, in summary, it turns out you are right, in an indirect
sort of way.
--------
Kenneth R. Ballou			ARPA:  ballou@brahms.berkeley.edu
Department of Mathematics		UUCP:  ...!ucbvax!brahms!ballou
University of California
Berkeley, California  94720

beattie@netxcom.UUCP (02/06/87)

In article <888@cartan.Berkeley.EDU> ballou@brahms.Berkeley.EDU (Kenneth R. Ballou) writes:
>In article <511@bobkat.UUCP> m5d@bobkat.UUCP (Mike McNally (dlsh)) writes:
>>In article <1169@steinmetz.steinmetz.UUCP> davidsen@kbsvax.UUCP (william E Davidsen) writes:
>>>
>>>If you allocate a full 64k to data, there is hardware protection: you
>>>can't address more than that. This assures that any program which
>>>doesn't deliberately set out to cause problems will not modify the
>>>code. ...
>>
>>What about a program with a bug in it?  Like "strcpy(a, b)" when "a" is
>>not quite what I meant?  It's real easy to make this kind of mistake;
>>how many times while debugging a program on a VAX (or whatever) do you
>>get SIGBUS or SIGSEGV?
>[Omitted here:  a description of a phenomenon with which I'm sure we are
> all too painfully familiar -- how wild pointers can crash programs and
> machines.]
>
>	I think you have missed a key point here which depends on the iAPX86
>architecture.  Because addresses are constructed as SEGMENT:OFFSET, as long
>as the compiler generates no code that would reload the segment registers
>and as long as you do not use any assembly code in your program that does
>this, then you are physically constrained to 64K bytes starting at absolute
>location 16 * SEGMENT.  Since pointers are passed only as offsets, the
>worst you can do is scribble over your 64K segment.
>
>	However, this is a blatant lie, and it is possible you are
>vindicated.  For, if auto variables are allocated on the stack, one
>could still easily lose.  Since the stack must lie in the same segment
>as the static data (otherwise, pointers must have segments associated
>with them to distinguish between auto and static variables), it is
>possible to scribble over the stack.  In doing so, one could alter a
>return address and find oneself in another process, or perhaps the
>kernel.  Also, equally likely is that one would try to execute data
>and encounter an illegal opcode.  I believe (but I am not certain)
>that this halts the 8086.
>
>--------
>Kenneth R. Ballou			ARPA:  ballou@brahms.berkeley.edu
>Department of Mathematics		UUCP:  ...!ucbvax!brahms!ballou
>University of California
>Berkeley, California  94720


Actually, if you arrange your program so that the data follows the code
(which MINIX does), then since all jumps/calls/rets (except the long
versions) are CS:OFFSET, you may jump into your data, but not outside of
your process space.
-- 
-----------------------------------------------------------------------
Brian Beattie			| Phone: (703)749-2365
NetExpress Communications, Inc.	| uucp: seismo!sundc!netxcom!beattie
1953 Gallows Road, Suite 300	|
Vienna,VA 22180			|

sl@van-bc.UUCP (02/07/87)

In article <888@cartan.Berkeley.EDU> ballou@brahms.Berkeley.EDU (Kenneth R. Ballou) writes:
>In article <511@bobkat.UUCP> m5d@bobkat.UUCP (Mike McNally (dlsh)) writes:
>>In article <1169@steinmetz.steinmetz.UUCP> davidsen@kbsvax.UUCP (william E Davidsen) writes:
>>>
>>>If you allocate a full 64k to data, there is hardware protection: you
>>>can't address more than that. This assures that any program which
>>>doesn't deliberately set out to cause problems will not modify the
>>>code. ...
>>
>>What about a program with a bug in it?  Like "strcpy(a, b)" when "a" is
>>not quite what I meant?  It's real easy to make this kind of mistake;
>>how many times while debugging a program on a VAX (or whatever) do you
>>get SIGBUS or SIGSEGV?
>[Omitted here:  a description of a phenomenon with which I'm sure we are
> all too painfully familiar -- how wild pointers can crash programs and
> machines.]
>
>	I think you have missed a key point here which depends on the iAPX86
>architecture.  Because addresses are constructed as SEGMENT:OFFSET, as long
>as the compiler generates no code that would reload the segment registers
>and as long as you do not use any assembly code in your program that does
>this, then you are physically constrained to 64K bytes starting at absolute
>location 16 * SEGMENT.  Since pointers are passed only as offsets, the
>worst you can do is scribble over your 64K segment.
>
>	However, this is a blatant lie, and it is possible you are
>vindicated.  For, if auto variables are allocated on the stack, one
>could still easily lose.  Since the stack must lie in the same segment
>as the static data (otherwise, pointers must have segments associated
>with them to distinguish between auto and static variables), it is
>possible to scribble over the stack.  In doing so, one could alter a
>return address and find oneself in another process, or perhaps the
>kernel.  Also, equally likely is that one would try to execute data
>and encounter an illegal opcode.  I believe (but I am not certain)
>that this halts the 8086.
>
>	So, in summary, it turns out you are right, in an indirect
>sort of way.
>--------


NO! RTFM! This is a very safe way to keep programs from clobbering the
system.

To quote from the Intel iAPX 88 Book:

	"For an intrasegment CALL, SP (the stack pointer) is decremented by two
	and IP is pushed onto the stack. The target procedure's relative
	displacement (up to +- 32k) from the CALL instruction is then added to
	the instruction pointer. This CALL instruction form is "self-relative"
	and appropriate for position-independent (dynamically relocatable)
	routines in which the CALL and its target are moved together in the same
	segment.

	"For an intersegment direct CALL, SP is decremented by two, and CS is
	pushed onto the stack. CS is replaced by the segment word contained in
	the instruction.....

	"... RET pops the word at the top of the stack (pointed to by register
	SP) into the instruction pointer and increments SP by two. If RET is
	intersegment, the word at the new top of stack is popped into the CS
	register.......



If properly done, the compiler never emits code to modify ANY of the four
segment registers. This includes the code segment register. Calls and
returns have two versions, inter- and intrasegment. You simply have to
restrict yourself to using only intrasegment calls, returns, jumps, and
loads, and NOTHING your program does will go outside of your 64k segments.
This includes such things as clobbering the stack as you described.
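
The CALL/RET behaviour quoted above can be modelled with a toy register
file. This is a sketch under assumptions: struct cpu and the function names
are invented here, the stack is a word array standing in for SS:SP, and IP
is assumed to already point past the CALL instruction when near_call runs.

```c
#include <stdint.h>

#define STACK_WORDS 0x8000u

/* Just enough state to follow CALL and RET (hypothetical model). */
struct cpu {
    uint16_t cs, ip, sp;
    uint16_t stack[STACK_WORDS];   /* indexed by sp / 2 */
};

static void push(struct cpu *c, uint16_t v)
{
    c->sp = (uint16_t)(c->sp - 2);
    c->stack[c->sp / 2] = v;
}

static uint16_t pop(struct cpu *c)
{
    uint16_t v = c->stack[c->sp / 2];
    c->sp = (uint16_t)(c->sp + 2);
    return v;
}

/* Intrasegment CALL: push IP, add the signed displacement. CS untouched. */
void near_call(struct cpu *c, int16_t disp)
{
    push(c, c->ip);
    c->ip = (uint16_t)(c->ip + disp);
}

/* Intrasegment RET: pop IP only. CS untouched. */
void near_ret(struct cpu *c)
{
    c->ip = pop(c);
}

/* Intersegment RET: pop IP, then pop CS. */
void far_ret(struct cpu *c)
{
    c->ip = pop(c);
    c->cs = pop(c);
}
```

In this model, overwriting the saved word before near_ret changes only IP,
while the same stack contents fed to far_ret would also replace CS.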


The only possible way out is to have self-modifying code :-). Again, simply
restrict your emitted code to not modifying anything relative to the CS
register, and ensure that when you load the program, the 64k code
segment does not overlap the 64k data segment.


This has been used for years in such systems as Coherent, the Mark
Williams Version 7 Unix look-alike. I used it for about a year and can't
remember ever having it blow up because of a runaway program (while doing
software development).


-- 
Stuart Lynne	ihnp4!alberta!ubc-vision!van-bc!sl     Vancouver,BC,604-937-7532
Today's feature: Perry Mason Solves the Case of the Buried Clock, Erle
Stanley Gardner, 1943. A bank clerk boasted brazenly about a $90,000
embezzlement, and an alarm clock ticked away cheerfully underground.

kneller@ucsfcgl.UUCP (02/08/87)

In article <299@netxcom.UUCP> beattie@netxcom.UUCP (Brian Beattie) writes:
> [ ... ]
	Long discussion about getting out of your process by an
	errant return.
>
>Actually, if you arrange your program so that the data follows the code
>(which MINIX does), then since all jumps/calls/rets (except the long
>versions) are CS:OFFSET, you may jump into your data, but not outside of
>your process space.

Of course, if your data contains the opcode for RETF ...
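
To make the point concrete, here is a sketch that counts bytes in a data
area which would act as far control transfers if execution landed on them.
The function names are made up for the illustration. The opcode values are
the standard 8086 encodings: 9A (direct far CALL), CA and CB (RETF with and
without an immediate), EA (direct far JMP). A byte-wise scan is only a
rough measure: it ignores the indirect FF forms and cannot tell opcodes
from operand bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if b is a one-byte opcode for a direct far CALL/JMP or a RETF. */
int is_far_transfer_opcode(uint8_t b)
{
    return b == 0x9A || b == 0xCA || b == 0xCB || b == 0xEA;
}

/* Count the bytes in buf that would decode as a far transfer if the CPU
   started executing at that byte. */
size_t count_far_transfer_bytes(const uint8_t *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (is_far_transfer_opcode(buf[i]))
            n++;
    }
    return n;
}
```

A near RET (C3) is not counted, since it pops only IP.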
-----
	Don Kneller
UUCP:	...ucbvax!ucsfcgl!kneller
ARPA:	kneller@cgl.ucsf.edu
BITNET:	kneller@ucsfcgl.BITNET

mark@ems.UUCP (02/08/87)

>>>In article <1169@steinmetz.steinmetz.UUCP> davidsen@kbsvax.UUCP (william E Davidsen) writes:
>>>>
>>>>If you allocate a full 64k to data, there is hardware protection: you
>>>>can't address more than that. This assures that any program which
>>>>doesn't deliberately set out to cause problems will not modify the
>>>>code. ...
>>> [ other comments deleted ]
In article <299@netxcom.UUCP> beattie@netxcom.UUCP (Brian Beattie) writes:
>
>Actually, if you arrange your program so that the data follows the code
>(which MINIX does), then since all jumps/calls/rets (except the long
>versions) are CS:OFFSET, you may jump into your data, but not outside of
>your process space.

Are you really sure that you want to execute your data?  Think about this
for a minute.  Who is to say what fun instructions your data would look like?
For example, it could change your CS:, DS:, ES: or SS: and then, even
if it did nothing else, and (miraculously) returned to the code at some
kind of reasonable point, what is going to happen?  Can you say that the
program will not write outside of the process space?  I can't, and I have
seen this happen.  There is no substitute for a hardware MMU.

Do not underestimate the problems that executing data can produce.

--
Mark H. Colburn          mark@ems.uucp      
EMS/McGraw-Hill          {rutgers|amdahl|ihnp4}!{dayton|meccts}!ems!mark
9855 West 78th Street     
Eden Prairie, MN 55344   (612) 829-8200 x235

ballou@brahms.Berkeley.EDU.UUCP (02/09/87)

In article <352@van-bc.UUCP> sl@van-bc.UUCP (Stuart Lynne) writes:
>In article <888@cartan.Berkeley.EDU> ballou@brahms.Berkeley.EDU (Kenneth R. Ballou) writes:
[Omitted here:  various discussions about whether there is hardware protection
from clobbering the system with a runaway program.]
>NO! RTFM! This is a very safe way to keep programs from clobbering the
>system.

	I propose to demonstrate that your claim of safety is false.
Consider carefully the following possible scenario.

	1.  Since stack and data are in the same segment (this is forced upon
you once you decide that pointers are represented by 16 bit offsets into a
segment, since you must be able to take addresses of both static and auto
variables), it is possible that with a wild pointer you could overwrite the
stack frame associated with the current procedure call.  In particular, you
could alter the return address for your procedure.  Now, you are correct in
asserting that since the return address is a 16 bit offset, you will return to
some location in your code segment.

	2.  However, no guarantee is made that the address to which you return
will be the address of one of the instructions in your code.  Consider in
particular that the byte at the location to which you return is EA hex.  If
you RTFM, you will note that this is the direct far JMP instruction.  The
effect of this is that the next two bytes will be stuffed into the IP
register, and the two bytes thereafter will go into the CS register.  At
this point, you are
officially in for disaster.  You might either end up in another process's code
segment, in a data segment (in which case, watch the CPU hang because of an
illegal instruction which is eventually going to occur from trying to execute
data), or in some unallocated region of memory.  In any case, you have lost
big.  
	I believe Don Kneller also pointed out less disastrous losses; you
might find yourself executing code which reloads a segment register.  In any
case, there are enough things that could go wrong that the idea of this scheme
being safe is incorrect.
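
The scenario in step 2 can be sketched as a small decoder. It assumes the
direct intersegment JMP encoding (opcode EA hex, then a 16-bit offset, then
a 16-bit segment, each low byte first); the struct and function names are
invented for this illustration.

```c
#include <stdint.h>

struct far_target {
    uint16_t ip;
    uint16_t cs;
};

/* If p[0..4] encodes a direct far JMP, fill *t with the new IP and CS and
   return 1; otherwise return 0 and leave *t alone. */
int decode_far_jmp(const uint8_t p[5], struct far_target *t)
{
    if (p[0] != 0xEA)
        return 0;
    t->ip = (uint16_t)(p[1] | (p[2] << 8));
    t->cs = (uint16_t)(p[3] | (p[4] << 8));
    return 1;
}

/* Physical address the CPU would fetch from next: 16 * CS + IP. */
uint32_t far_target_phys(struct far_target t)
{
    return (((uint32_t)t.cs << 4) + t.ip) & 0xFFFFFu;
}
```

The five bytes EA 00 00 00 F0, for instance, decode to F000:0000, physical
address 0xF0000, which is nowhere near the process's own 64K.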

	SUMMARY:  Simply using a small memory model is not enough to guarantee
safety of your process.
--------
Kenneth R. Ballou			ARPA:  ballou@brahms.berkeley.edu
Department of Mathematics		UUCP:  ...!ucbvax!brahms!ballou
University of California
Berkeley, California  94720

allbery@ncoast.UUCP (02/10/87)

As quoted from <888@cartan.Berkeley.EDU> by ballou@brahms.Berkeley.EDU.UUCP:
+---------------
| 	However, this is a blatant lie, and it is possible you are
| vindicated.  For, if auto variables are allocated on the stack, one
| could still easily lose.  Since the stack must lie in the same segment
| as the static data (otherwise, pointers must have segments associated
| with them to distinguish between auto and static variables), it is
| possible to scribble over the stack.  In doing so, one could alter a
| return address and find oneself in another process, or perhaps the
| kernel.  Also, equally likely is that one would try to execute data
| and encounter an illegal opcode.  I believe (but I am not certain)
| that this halts the 8086.
+---------------

If the only way to exit to the kernel is a TRAP instruction, which will be
controlled, then the program will only use NEAR RET instructions, and so
cannot RET into another segment by changing a return address.

Executing data still remains a problem; a bit of data that just happens to
look like (say) JMP FAR F000:0000 will do a good job of hanging the system.
Invalid opcodes?  Sure enough, I see no indication of a trap vector for an
illegal instruction.  So executing data as program remains a potential problem.

Or does it?  One can always use separate I & D and fill unused sections of the
code segment with 0xCC (INT 3), which can then be made synonymous with the
exit syscall.  Of course, the MINIX assembler needs to be fixed to work with
separate I & D, but after that you CANNOT find yourself in the data segment.
You then have to do assembler munging -- deliberately -- to crash the system.
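
A minimal sketch of the fill idea, assuming a loader that has the whole 64K
code segment available as a byte buffer. load_code_segment is a made-up
name, not a MINIX routine, and mapping INT 3 to the exit syscall is left to
the kernel's interrupt handler.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CODE_SEG_SIZE 0x10000u
#define INT3_OPCODE   0xCC      /* one-byte INT 3 */

/* Fill the whole code segment with INT 3, then copy the program text to
   its start. Returns 0 on success, -1 if the text does not fit. */
int load_code_segment(uint8_t *seg, const uint8_t *text, size_t text_len)
{
    if (text_len > CODE_SEG_SIZE)
        return -1;
    memset(seg, INT3_OPCODE, CODE_SEG_SIZE);
    memcpy(seg, text, text_len);
    return 0;
}
```

Any near jump or return that lands beyond the program text then hits an
INT 3 on its first fetch. A wild transfer into the middle of a real
instruction is not caught by this.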
-- 
++Brandon (Resident Elf @ ncoast.UUCP)
 ____   ______________
/    \ / __   __   __ \   Brandon S. Allbery	    <backbone>!ncoast!allbery
 ___  | /__> /  \ /  \    aXcess Co., Consulting    ncoast!allbery@Case.CSNET
/   \ | |    `--, `--,    6615 Center St. #A1-105 	   (...@relay.CS.NET)
|     | \__/ \__/ \__/    Mentor, OH 44060-4101     
\____/ \______________/   +1 216 974 9210

davidsen@steinmetz.UUCP (02/11/87)

In article <888@cartan.Berkeley.EDU> ballou@brahms.Berkeley.EDU (Kenneth R. Ballou) writes:
$In article <511@bobkat.UUCP> m5d@bobkat.UUCP (Mike McNally (dlsh)) writes:
$>In article <1169@steinmetz.steinmetz.UUCP> davidsen@kbsvax.UUCP (william E Davidsen) writes:
$>>=== what I wrote ===
$>
$>What about a program with a bug in it?  Like "strcpy(a, b)" when "a" is
$>not quite what I meant?  It's real easy to make this kind of mistake;
$>how many times while debugging a program on a VAX (or whatever) do you
$>get SIGBUS or SIGSEGV?
$[Omitted here:  a description of a phenomenon with which I'm sure we are
$ all too painfully familiar -- how wild pointers can crash programs and
$ machines.]
$
$
$	However, this is a blatant lie, and it is possible you are
$vindicated.  For, if auto variables are allocated on the stack, one
$could still easily lose.  Since the stack must lie in the same segment
$as the static data (otherwise, pointers must have segments associated
$with them to distinguish between auto and static variables), it is
$possible to scribble over the stack.  In doing so, one could alter a
$return address and find oneself in another process, or perhaps the
                                    ^^^^^^^^^^^^^^^^
$kernel.  Also, equally likely is that one would try to execute data
 ^^^^^^

NO. You can find yourself somewhere in your own code segment. There
is no way to wind up somewhere else. After you get to your own code
segment there is a possibility of executing a far call, jump, or return.
Since the return is 'under' the auto variables (lower address) and
the most common errors run beyond the end of an array, this is unlikely
but not impossible.
-- 
bill davidsen			sixhub \
      ihnp4!seismo!rochester!steinmetz ->  crdos1!davidsen
				chinet /
ARPA: davidsen%crdos1.uucp@ge-crd.ARPA (or davidsen@ge-crd.ARPA)