[comp.arch] PEP: Page Execution Privilege

dtynan@sultra.UUCP (Der Tynan) (09/29/88)

I was thinking last night about an improvement to the standard User/Supervisor
status bit.  Before I let you in on what I was thinking, I must give the
disclaimer that some "brilliant" ideas I've had at 4:00 in the morning have
been pretty dumb, in the cold light of day.

Rather than have a standard U/S bit in the CPU status register, it might make
more sense to assign it to each I-page in a paged-MMU system.  The idea is that
certain functions within marked pages would carry a higher privilege than the
rest.  This could best be used to alter certain key variables in the UN*X
kernel.  An example might be getpid(), which is a fairly nondescript system
call.  Instead of doing a trap to the kernel (with all the incredible overhead
involved), an application calls the standard library (which is in a page with
special access to the running kernel) which pulls the process ID right out of
the proc structure.  Date & time could work the same way.  Even the calls that
change UID/GID could live in the standard library (although this raises serious
security problems).  Programs such as 'ps' need not touch /dev/kmem, but could
go straight to the kernel itself and pull out the appropriate data (sort of like
open-heart surgery).  All in all, the user application cannot touch the kernel,
but those instructions fetched from certain pages (which are X-only) could.
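
To make the proposal concrete, here is a minimal sketch in Python.  It assumes
a 4 KB page size, a made-up page table, and hypothetical names (PAGE_TABLE,
load, getpid_from_library); it models no real MMU.  The only point is that the
access check looks at the page the instruction was fetched from, rather than
at a mode bit in the CPU status register.

```python
PAGE_SIZE = 4096  # assumed page size

# Hypothetical page table: page number -> attributes.
# "priv" marks an instruction page whose code may touch kernel data;
# "kernel" marks a data page that only such code may touch.
PAGE_TABLE = {
    0: {"priv": False, "kernel": False},  # ordinary user code and data
    1: {"priv": True, "kernel": False},   # library page (execute-only)
    2: {"priv": False, "kernel": True},   # kernel data, e.g. the proc structure
}

# Pretend the process ID lives at this address inside the kernel data page.
PID_ADDR = 2 * PAGE_SIZE + 16
KERNEL_DATA = {PID_ADDR: 1234}


def load(pc, addr):
    """Read addr on behalf of the instruction that was fetched from pc."""
    code_page = PAGE_TABLE[pc // PAGE_SIZE]
    data_page = PAGE_TABLE[addr // PAGE_SIZE]
    if data_page["kernel"] and not code_page["priv"]:
        raise PermissionError("kernel data touched from an unprivileged page")
    return KERNEL_DATA.get(addr, 0)


def getpid_from_library():
    """Stands in for library code located in the privileged page 1."""
    return load(pc=1 * PAGE_SIZE + 0x40, addr=PID_ADDR)
```

Under these assumptions getpid_from_library() succeeds without a trap, while
the same load issued from page 0 raises PermissionError.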

My question is this:  Does any system implement a scheme like this?  What are
the problems with doing this?  And is there anyone actually researching this?

Another application might be to use shared libraries between different
applications (I believe AT&T does this in SYSV3?), by mapping the library
into the appropriate page of the user process (again, X-only).  Any comments?
						- Der

-- 
Reply:	dtynan@sultra.UUCP		(Der Tynan @ Tynan Computers)
	{mips,pyramid}!sultra!dtynan
	Cast a cold eye on life, on death.  Horseman, pass by...    [WBY]

bga@raspail.UUCP (Bruce Albrecht) (09/29/88)

In article <2550@sultra.UUCP>, dtynan@sultra.UUCP (Der Tynan) writes:
> Rather than have a standard U/S bit in the CPU status register, it might make
> more sense to assign it to each I-page in a paged-MMU system.  The idea is that
> certain functions within marked pages would carry a higher privilege than the
> rest.  This could best be used to alter certain key variables in the UN*X
> kernel.  [ Examples deleted. ]
> 
> My question is this:  Does any system implement a scheme like this?  What are
> the problems with doing this?  And is there anyone actually researching this?

This scheme has been around for a long time.  One of the first systems that
I'm aware of that used this was Multics, which has been described in a book
by Organick (MIT Press, not sure of exact title).  Control Data's Cyber 180
architecture supports this, and its NOS/VE operating system uses it.

I think the scheme is known as rings of protection.  I'm most familiar with
the Cyber implementation, so I will describe it.  The Cyber has a 16-level
ring hierarchy, with ring 0 the most privileged, and ring 15 the least.  Most
user tasks run in ring 11.  The memory is segmented, with 2**12 segments of
2**31 bytes.  Each segment has a three-number ring attribute associated with it
that is used to determine access.  Processes running at a ring number less 
than or equal to the first memory ring number can read or write to the memory,
processes running at a ring number less than or equal to the second memory
ring number can read the memory, and if the segment is executable, processes
running at a ring number less than or equal to the third memory ring number
can call a subroutine in the segment.  The second and third ring numbers are
also used to determine whether the process ring number is lowered during a
subroutine call.
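
The three-number rule above can be sketched in a few lines of Python.  This
follows only the description in this post; the names Segment and allowed, and
the example ring numbers, are hypothetical, and the lowering of the process
ring during a call is not modeled.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    r1: int  # highest ring number that may read or write
    r2: int  # highest ring number that may read
    r3: int  # highest ring number that may call into the segment
    executable: bool = False


def allowed(ring, seg):
    """Return the set of operations a process running in `ring` may perform."""
    ops = set()
    if ring <= seg.r1:
        ops |= {"read", "write"}
    if ring <= seg.r2:
        ops.add("read")
    if seg.executable and ring <= seg.r3:
        ops.add("call")
    return ops


# Hypothetical segment: writable by rings 0-3, readable by rings 0-11,
# callable by rings 0-13.
seg = Segment(r1=3, r2=11, r3=13, executable=True)
```

With this example segment, a user task in ring 11 may read and call but not
write, and a task in ring 15 may do nothing at all.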

In NOS/VE, all of the operating system routines are known to your task, and
therefore can be accessed by simple subroutine calls.  Although data that
doesn't contain "secure" information (such as date and time) could be defined
as externally available and simply referenced directly, NOS/VE usually requires
you to call a subroutine to get it.

Another interesting feature of NOS/VE is that all files are considered to be
memory segments, and all I/O is done by paging.  You can either access files
through the record manager, or you can access the file as though it was just
a part of your memory, and put (and modify) data structures in it using
memory allocation and pointers.

Bruce

barmar@think.COM (Barry Margolin) (09/30/88)

What you described is a simplified version of the ring protection
scheme used on Multics.  On Multics, each memory segment has three
ring numbers (integers from 0 to 7) associated with it, called its
write ring, its read ring, and its execute ring.  Additionally, each
process has a current ring of execution.  You can read a segment if
your ring of execution is no greater than the segment's read ring, you can
write it if it is no greater than the write ring, and you can execute it if
you are between the write and execute rings.  The read and execute
rings are normally the same, but if the read ring is less than the
execute ring then the segment is called a gate.  If your ring of
execution is between the read ring and the execute ring, and you use a
subroutine call instruction to transfer into a gate, then a ring
crossing occurs and your ring of execution is lowered to the read ring
of the segment; the matching return instruction restores the ring of
execution.  Gates also have special protection that allows them to
only be called at specified entrypoints, so that a caller can't
transfer to the instruction after some access checks are performed.
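
A minimal Python sketch of the gate mechanism as described in this post
follows.  It is not a model of the real Multics hardware: the class names,
the saved-ring stack, and the entry_points field are assumptions made for
illustration, and the comparisons are taken as inclusive, which the
[0, 0, 0] example later in this post requires.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    write_ring: int
    read_ring: int
    execute_ring: int
    entry_points: frozenset = frozenset()  # offsets a gate may be entered at


@dataclass
class Process:
    ring: int
    saved_rings: list = field(default_factory=list)

    def call(self, seg, offset):
        """Subroutine call into seg at offset, crossing rings through a gate."""
        if not (seg.write_ring <= self.ring <= seg.execute_ring):
            raise PermissionError("caller is outside the execute bracket")
        crossing = self.ring > seg.read_ring
        if crossing and offset not in seg.entry_points:
            raise PermissionError("not a declared gate entry point")
        self.saved_rings.append(self.ring)
        if crossing:
            self.ring = seg.read_ring  # ring of execution is lowered

    def ret(self):
        """Matching return: restore the caller's ring of execution."""
        self.ring = self.saved_rings.pop()


# Hypothetical gate with ring numbers [0, 0, 5] and one entry point.
gate = Segment(write_ring=0, read_ring=0, execute_ring=5,
               entry_points=frozenset({0}))
```

A process in ring 4 that calls gate at offset 0 runs in ring 0 until it
returns; a call to any other offset, or from ring 6, is refused.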

This works well with dynamic linking of subroutines.  When a routine
is dynamically linked, a shared memory segment is mapped onto the file
containing the library, and the ring numbers of the segment are taken
from its attributes in the file system (in fact, this is how ALL file
access is done on Multics -- there are no kernel interfaces that do
explicit file I/O, although there is a user-ring library that
implements various file access methods using shared segments mapped
onto files).

All supervisor data is in segments whose ring numbers are [0, 0, 0]
(this is [write, read, execute]), so that they can only be accessed
while executing in ring 0.  System calls are implemented by doing a
subroutine call to a segment whose ring numbers are [0, 0, 5], i.e. a
gate from ring 5 to 0, and most normal users run in ring 4.  A number
of privileged system operations that aren't actually part of the
supervisor are in rings 2 and 3.

There is still a need for a supervisor/user state in the processor,
though.  When you are in ring 0 you are in supervisor state, which
allows execution of privileged instructions (such as performing I/O,
or manipulating segment descriptor registers).

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

henry@utzoo.uucp (Henry Spencer) (10/01/88)

In article <2550@sultra.UUCP> dtynan@sultra.UUCP (Der Tynan) writes:
>Rather than have a standard U/S bit in the CPU status register, it might make
>more sense to assign it to each I-page in a paged-MMU system...

Things at least vaguely along those lines have been done.  There is one
major problem that has to be solved:  how do you prevent a user from
branching to some well-chosen place in the *middle* of a privileged
routine?  Say, for example, bypassing some of the legality checks at
the beginning?  One needs some hardware-enforced notion of entry points,
so that transitions from lower privilege to greater privilege get done
only in authorized ways.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu

smryan@garth.UUCP (Steven Ryan) (10/01/88)

>There is still a need for a supervisor/user state in the processor,
>though.  When you are in ring 0 you are in supervisor state, which
>allows execution of privileged instructions (such as performing I/O,
>or manipulating segment descriptor registers).

From what I understand, the much maligned 286 associates instruction
privileges with ring numbers so that a separate supervisor/user state
is unnecessary.

PLS@cup.portal.com (10/01/88)

I'm sure others will also point out: Multics worked very much this way, except
that the privileges were on a segment rather than a page basis. The same
idea is being incorporated into the present Honeywell-Bull GCOS system.

  ++PLS

gillies@p.cs.uiuc.edu (10/01/88)

Well, your system isn't exactly like Multics, but it has some
similarities.  A central idea of Multics is that the rights of a
program are determined by its addressing descriptors.  The descriptors
are basically capabilities for memory segments, and they contain
rights bits.  You have decided to append your rights bits to each
fixed-size page; Multics appends its rights bits to each variable-size
data segment capability.  The Multics way has an advantage -- 
different programs may efficiently share the same segment, with
different rights for each program.

There are a few problems with your method, however.  One problem: How
do you prevent a user-call from transferring to some random
non-instruction on one of these protected pages?  In Multics, you
could only jump to the beginning of a protected segment, I believe
(ENTER).

When are these bits validated?  Are they kept in some sort of cache
and validated on every use?  What happens when someone wants to
"declassify" one of these segments?

A fundamental advantage of the Multics design is that it's simple
enough to implement in hardware, and yet it supports access-control
list protection.  A fundamental flaw of the Multics design is that
ring zero is omnipotent, and domain trust must nest.  There exist
capability-based protection schemes (e.g. Plessey System 250) with
essentially NO omnipotent protection domains (except for the necessary
one that manufactures capabilities), and no nesting of trust.

Thus, I believe it is still an open problem to design a hardware
scheme that simultaneously supports access-control-list protection
efficiently, but does away with this omnipotent supervisor domain, and
does not enforce the Multics nesting constraint.

Don Gillies, Dept. of Computer Science, University of Illinois
1304 W. Springfield, Urbana, Ill 61801
ARPA: gillies@cs.uiuc.edu   UUCP: {uunet,ihnp4,harvard}!uiucdcs!gillies

bvs@light.uucp (Bakul Shah) (10/02/88)

In article <1988Sep30.170503.19191@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>In article <2550@sultra.UUCP> dtynan@sultra.UUCP (Der Tynan) writes:
>>Rather than have a standard U/S bit in the CPU status register, it might make
>>more sense to assign it to each I-page in a paged-MMU system...
>
>Things at least vaguely along those lines have been done.  There is one
>major problem that has to be solved:  how do you prevent a user from
>branching to some well-chosen place in the *middle* of a privileged
>routine?  Say, for example, bypassing some of the legality checks at
>the beginning?  One needs some hardware-enforced notion of entry points,
>so that transitions from lower privilege to greater privilege get done
>only in authorized ways.

Hardware-enforced entry points are not needed if you
use indirection.  Make sure only privileged jump
tables are accessible from a non privileged place.
So a call to a privileged routine will be a call to a
jump table address and another jump from there to
the real routine.  I think this idea can be easily
extended to switch from one protection domain to
another (and they don't have to be rings).

Seems to me that RISC processors like the AMD29000 can
be modified fairly easily and are ideally suited for
this sort of thing.

----
Bakul Shah <..!{ucbvax,sun,uunet}!amdcad!light!bvs>

henry@utzoo.uucp (Henry Spencer) (10/04/88)

In article <1988Oct1.115519.11020@light.uucp> bvs@light.UUCP (Bakul Shah) writes:
>>... how do you prevent a user from
>>branching to some well-chosen place in the *middle* of a privileged
>>routine? ...
>
>Hardware-enforced entry points are not needed if you
>use indirection.  Make sure only privileged jump
>tables are accessible from a non privileged place.

Ah, but now we need three levels of protection:  user, jump table, and
privileged.  Your privileged jump tables *are* hardware-enforced entry
points.  (P.S. the jump-indirect instruction is going to have to be
careful that it can't be fooled.  Consider a machine like the 68020
that will do unaligned fetches:  jump indirect via an unaligned address
in the jump table, that picks up some bytes from one address and some
from the next and treats the combination as a privileged address.)
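
The unaligned-fetch hazard can be shown with a short Python sketch.  The
table contents and the fetch_target helper are hypothetical; the entries are
packed big-endian only because the 68020 is a big-endian machine.

```python
import struct

# Hypothetical jump table holding two 32-bit privileged entry addresses.
TABLE = struct.pack(">II", 0x00012000, 0x00034000)


def fetch_target(offset, require_alignment):
    """Fetch the 32-bit jump target stored at byte `offset` in the table."""
    if require_alignment and offset % 4 != 0:
        raise ValueError("unaligned jump-table reference")
    (target,) = struct.unpack_from(">I", TABLE, offset)
    return target
```

An aligned fetch at offset 0 yields 0x00012000 as intended.  An unaligned
fetch at offset 2 straddles both entries and yields 0x20000003, an address
nobody authorized as an entry point, unless the alignment check rejects it.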
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu

bvs@light.uucp (Bakul Shah) (10/05/88)

[Henry Spencer:]
>>>... how do you prevent a user from
>>>branching to some well-chosen place in the *middle* of a privileged
>>>routine? ...

[Me:]
>>Hardware-enforced entry points are not needed if you
>>use indirection.  Make sure only privileged jump
>>tables are accessible from a non privileged place.

[Henry Spencer:]
>Ah, but now we need three levels of protection:  user, jump table, and
>privileged.  Your privileged jump tables *are* hardware-enforced entry
>points.  (P.S. the jump-indirect instruction is going to have to be
>careful that it can't be fooled.  Consider a machine like the 68020
>that will do unaligned fetches:  jump indirect via an unaligned address
>in the jump table, that picks up some bytes from one address and some
>from the next and treats the combination as a privileged address.)

No, we don't!  I should have used the term ``a table of
jump instructions'' instead of `a jump table'.  Call TO
an address in this table will be allowed because the
table is eXecutable from user mode.  Jump FROM the table
is allowed to land in any executable privileged page
because the table is privileged.  Yes, this scheme is
tricky and may not work if you can jump into the middle of
an instruction. On a RISC processor where the jump delay
slot is exposed to the software, the table contains nops
in addition to jumps.

Example:

	; user mode code
		...
		call	lr0, foo
		 nop	; delay slot instruction
		...

	; privileged table, executable from user mode
	privileged_table:
		...
	foo:
		jmp	real_foo
		 nop
	bar:
		jmp	real_bar
		 nop
		...

	; privileged code, not executable from user mode
	real_foo:
		...
		jmpi	lr0
	real_bar:
		...
		jmpi	lr0
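
The page attributes this example relies on can be written down as a small
Python sketch.  The attribute names and the may_transfer rule are one
reading of the scheme described above, not a description of any existing
MMU, and the page labels are hypothetical.

```python
# Hypothetical attributes for the three kinds of page in the example.
PAGES = {
    "user": {"privileged": False, "user_exec": True},   # ordinary user code
    "table": {"privileged": True, "user_exec": True},   # table of jumps
    "real": {"privileged": True, "user_exec": False},   # real_foo, real_bar
}


def may_transfer(src, dst):
    """May an instruction fetched from page src transfer control to page dst?

    Privileged pages may jump anywhere; unprivileged pages may only reach
    pages that are marked executable from user mode.
    """
    return PAGES[src]["privileged"] or PAGES[dst]["user_exec"]
```

Under this rule the call from user code to foo is allowed, the jmp from the
table to real_foo is allowed, and a direct branch from user code into
real_foo is refused.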

----
Bakul Shah <..!{ucbvax,sun,uunet}!amdcad!light!bvs>

rpw3@amdcad.AMD.COM (Rob Warnock) (10/05/88)

Concerning the notion of a privileged jump table:

Note that the DEC KL-10 (a.k.a. "late-model" PDP-10) had just such a
"protected jump" instruction, used for allowing users to attach to and
jump to execute-only segments of proprietary code only at defined places.
There was a particular jump instruction ("JRST") that had an otherwise
unused bit in it. If an attempt was made to jump or call a protected
segment, the access was allowed iff the target of the jump/call was
a "JRST magic_bit, some_address". These instructions were called "portals",
and in fact the assembler opcode "PORTAL" expanded into the "JRST magic_bit,".
Such private segments often had a table of PORTAL instructions at the beginning,
so that calling table+offset was a call to some function.
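
The portal check can be sketched in Python as follows.  The instruction
encoding here is invented for illustration (tuples of opcode, flag, and
operand), and PORTAL_BIT is a stand-in for the "magic bit"; none of it is
the actual KL-10 instruction format.

```python
PORTAL_BIT = 1  # hypothetical flag value standing in for the "magic bit"

# Hypothetical protected segment: address -> (opcode, flags, operand).
PROTECTED = {
    100: ("JRST", PORTAL_BIT, 200),  # PORTAL to the first function
    101: ("JRST", PORTAL_BIT, 300),  # PORTAL to the second function
    200: ("MOVE", 0, 0),             # body of the first function
    300: ("JRST", 0, 200),           # an ordinary jump, not a portal
}


def enter(target):
    """Jump from unprotected code to `target` inside the protected segment.

    Allowed only if the instruction at the target is a JRST with the portal
    bit set; returns the address execution continues at.
    """
    opcode, flags, operand = PROTECTED[target]
    if opcode == "JRST" and flags & PORTAL_BIT:
        return operand
    raise PermissionError("target is not a PORTAL instruction")
```

Entering at 100 or 101 succeeds and continues at the function body, while
entering at 200 or 300 is refused even though both hold valid instructions.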

Unfortunately (if my vague recollection is correct), the magic bit in
question was used for something in kernel mode, so this method was not
useful for changing between user and kernel states, only for unprotected-user
to proprietary-execute-only-user transitions. (*sigh*) 


Rob Warnock
Systems Architecture Consultant

UUCP:	  {amdcad,fortune,sun}!redwood!rpw3
ATTmail:  !rpw3
DDD:	  (415)572-2607
USPS:	  627 26th Ave, San Mateo, CA  94403

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (10/05/88)

In article <76700049@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:

>do you prevent a user-call from transferring to some random
>non-instruction on one of these protected pages?  In Multics, you
>could only jump to the beginning of a protected segment, I believe
>(ENTER).

May I assume that the mechanism permitted unrestricted branching WITHIN
a particular code segment, and only restricted accesses OUTSIDE of the
current segment via a protected CALL/ENTER?

This mechanism would seem to be a very low overhead means of providing
"top half of the kernel" type services to a process, since it provides
a protection mechanism which doesn't require a context switch.  It also
extends nicely to a multiprocessor environment: anything which doesn't
require single threaded access could be in the protected system segments.
So, how come everybody doesn't do it?


-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

andrew@frip.gwd.tek.com (Andrew Klossner) (10/07/88)

[]

	"This mechanism would seem to be a very low overhead means of
	providing "top half of the kernel" type services to a process,
	since it provides a protection mechanism which doesn't require
	a context switch."

It also doesn't give you any of the benefits of a context switch.  If
all you want is naked entry to kernel code, you can usually get it on a
conventional system with little overhead.  For example, the "trap to
kernel" instructions on the M88k take just three cycles.  Then the
kernel is responsible for saving registers, etc.

  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]

bwong@sundc.UUCP (Brian Wong) (10/07/88)

We've mentioned domain-based systems a lot.  What are the most appropriate
readings on this, given a less-than-graduate-level of architectural study
and not a lot of interest in the electronics of this?
-- 
Brian Wong					Sun Microsystems
bwong@sun.com					Vienna, Va.  703-883-1243

bga@raspail.UUCP (Bruce Albrecht) (10/07/88)

In article <16017@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
] In article <76700049@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
] 
] >do you prevent a user-call from transferring to some random
] >non-instruction on one of these protected pages?  In Multics, you
] >could only jump to the beginning of a protected segment, I believe
] >(ENTER).
] 
] May I assume that the mechanism permitted unrestricted branching WITHIN
] a particular code segment, and only restricted accesses OUTSIDE of the
] current segment via a protected CALL/ENTER?
] 
] So, how come everybody doesn't do it?

Well, the Control Data Cybers do it, so why don't you all go out and buy one.
We could certainly use the orders!

Bruce

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (10/07/88)

In article <10435@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
>>	"This mechanism would seem to be a very low overhead means of
>>	providing "top half of the kernel" type services to a process,
>>	since it provides a protection mechanism which doesn't require
>>	a context switch."

>It also doesn't give you any of the benefits of a context switch.  If
>all you want is naked entry to kernel code, you can usually get it on a
>conventional system with little overhead.  For example, the "trap to
>kernel" instructions on the M88k take just three cycles.  Then the
>kernel is responsible for saving registers, etc.

It doesn't give you ANY of the benefits of a context switch?

The idea is to provide system level services that a user process
cannot be trusted to provide for itself, such as figuring out what block of
data to read on the disk, but which do not require single threaded access,
such as allocating/deallocating a block on the disk.  This is a very
important distinction in a multiprocessor system, since it is not a very
good idea to single thread the entire set of system services if you have
more than two processors.

Now, since most of the kernel services provided, even in Unix, are in doing
things that do not require single-threading, there is a very definite 
advantage in splitting things up this way in a multiprocessor system.  And,
there may be no disadvantage on a single processor system, since you have
avoided adding a context switch.




-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

gillies@p.cs.uiuc.edu (10/13/88)

Re:  What are good things to read?

I don't think there are any good surveys on general protection in
computer systems.  There is a 1975 paper by Saltzer & Schroeder
(Proceedings of IEEE, September), but it was probably written too
early, has a very baroque way of looking at things, and the English
is very poor (needlessly complex).

If you're interested in capability-based computers, may I suggest you
read "Capability-Based Computer Systems", by Henry M. Levy, published
by Digital Press (DEC), Bedford, MA, 1984.  The book is second to none
on the subject of capability-based systems.


Don Gillies, Dept. of Computer Science, University of Illinois
1304 W. Springfield, Urbana, Ill 61801      
ARPA: gillies@cs.uiuc.edu   UUCP: {uunet,ihnp4,harvard}!uiucdcs!gillies