phil@amdcad.UUCP (Phil Ngai) (11/18/86)
I'd like to propose something here. Have you ever spent a few days tracking
down a bug caused by writing beyond the bounds of an array and trashing a
vital data structure which only gets noticed many cycles later? Strings, of
course, are arrays.

Suppose every data structure were in its own segment. And of course, that
every segment were big enough to hold any data structure you needed, so that
you didn't need to manage multiple segments for one data structure. Then when
a bug tries to access beyond the end of an array, the bad reference is trapped
at the time of dereference instead of invisibly (at the time) trashing an
innocent data structure that happened to be in the right (wrong) place.

Would this be worth doing? Of course, it would complicate the OS's memory
management duties. But think about it.
-- 
The distance from the North end of Vietnam to the South end is about the same
as the distance from New York to Florida.

Phil Ngai +1 408 749 5720
UUCP: {ucbvax,decwrl,hplabs,allegra}!amdcad!phil
ARPA: amdcad!phil@decwrl.dec.com
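A minimal C sketch of the failure mode Phil describes (the struct layout and
names are invented for illustration): the overrun itself is silent, and the
damage shows up only when the neighboring field is read much later. With each
object in its own segment, the bad store would fault on the spot instead.

    #include <stdio.h>
    #include <string.h>

    struct layout {
        char name[8];        /* too small for the string stored below */
        int  record_count;   /* the "innocent" data that gets trashed */
    };

    int main(void)
    {
        struct layout d;
        d.record_count = 42;

        /* The overrun happens here; nothing traps.  (Exact behaviour is
         * undefined in C -- the point is that the store goes unnoticed.) */
        strcpy(d.name, "this string is too long");

        /* ...and the damage is seen only "many cycles later". */
        printf("record_count = %d\n", d.record_count);
        return 0;
    }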
dhp@ihlpa.UUCP (Douglas H. Price) (11/19/86)
> Suppose every data structure were in its own segment.
>
> Would this be worth doing?

This is (if I understand correctly) exactly what happened in the
object-oriented environment of the Intel 432 processor. Each data (and code)
object could only be accessed in the appropriate context. All objects not
explicitly referenced in the current execution context were turned off, and
would cause a detectable fault.

The problem with doing this on a 286, for instance, is again the extreme
overhead necessary to set up (or check access permissions on) each data
reference. It is also not a general solution on a 286; you can run out of
segments too quickly. It was the overhead (and the general confusion about how
you programmed object-oriented hardware) that killed the 432 off.
-- 
Douglas H. Price
Analysts International Corp. @ AT&T Bell Laboratories
..!ihnp4!ihlpa!dhp
david@sun.uucp (David DiGiacomo) (11/19/86)
In article <13802@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>I'd like to propose something here. Have you ever spent a few days
>tracking down a bug caused by writing beyond the bounds of an array
>and trashing a vital data structure which only gets noticed many
>cycles later? Strings, of course, are arrays.
>
>Suppose every data structure were in its own segment...
>
>Would this be worth doing? Of course, it would complicate the OS's
>memory management duties. But think about it.

No, for four reasons:

- It is very expensive to expand pointers to hold a reasonably large
  (16 bit?) segment number field.

- Other things being equal, address translation is slowed considerably
  by segment bounds checking.

- Non-uniform pointers lead to additional software complexity and cause
  severe problems when porting code from traditional systems.

- You can easily accomplish what you want to do in a pure paged system.
  Just decide that you are going to use an arbitrary number of pointer
  bits for the "segment" number and load your page tables accordingly.
  The only difference is that segment granularity is one page, but that
  shouldn't matter for the debugging application you mention (see the
  sketch below).
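Electric-fence-style debugging allocators do exactly this with the paging
hardware. A minimal sketch in modern POSIX terms (mmap, mprotect, and
MAP_ANONYMOUS are assumptions about the host system, not anything proposed in
the thread): the object is pushed up against an inaccessible guard page, so an
overrun faults at the offending store, with one-page granularity as noted
above.

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *guarded_alloc(size_t size)
    {
        long   page       = sysconf(_SC_PAGESIZE);
        size_t data_pages = (size + page - 1) / page;
        size_t total      = (data_pages + 1) * page;  /* data + one guard page */

        /* MAP_ANONYMOUS is a common extension; strict POSIX would map /dev/zero. */
        char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            return NULL;

        /* The guard page plays the role of the segment limit. */
        mprotect(base + data_pages * page, page, PROT_NONE);

        /* Place the object so that it ends exactly at the guard page. */
        return base + data_pages * page - size;
    }

    int main(void)
    {
        char *buf = guarded_alloc(100);

        memset(buf, 0, 100);   /* in bounds: fine                          */
        buf[100] = 'x';        /* one byte past the end: faults right here */
        return 0;
    }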
kds@mipos3.UUCP (Ken Shoemaker ~) (11/19/86)
I'm sorry, but I have to take exception to some of the points raised here:

In article <9400@sun.uucp> david@sun.uUCp (David DiGiacomo) writes:
>
>No, for four reasons:
>
> - It is very expensive to expand pointers to hold a reasonably large
>   (16 bit?) segment number field.

very expensive? They are already at least 32 bits wide! Besides, I thought
that the whole idea of the 32-bit processors was that you got to talk to lots
of memory...

> - Other things being equal, address translation is slowed considerably
>   by segment bounds checking.

it really depends on how you do it. In the 386 the segment bounds checking is
integrated into the pipeline and happens in parallel with other things going
on, so it really doesn't add to the processing time of the instruction.

> - Non-uniform pointers lead to additional software complexity and cause
>   severe problems when porting code from traditional systems.

I'll agree with this one...

> - You can easily accomplish what you want to do in a pure paged
>   system. Just decide that you are going to use an arbitrary number of
>   pointer bits for the "segment" number and load your page tables
>   accordingly. The only difference is that segment granularity is one
>   page, but that shouldn't matter for the debugging application you
>   mention.

This implies some amount of control by the application program over what kinds
of accesses are allowed to various parts of its virtual memory space. Maybe
I'm wrong, but if you have a segmented system, I'd think it would be easier
for the OS to manage a request for a segment of some size than it would be for
the OS to assign the attributes for a certain page, and then to maintain them
for you. In addition, I'd think it would be easier from a programmer's
standpoint to keep track of a single segment number than to keep track of an
area of memory in a linear space. Imagine being able to have calls to set the
brk location in any one of a number of segments. Not very Unix-like, difficult
to do with C, and only applicable to processors that support segmentation,
but?

You've also got a problem with the concept of setting page table attributes
and porting tools from any generic architecture, in that page tables are not
at all consistent from one architecture to another, or even between systems
using the same architecture. And you can expect that page table sizes will
change as memory sizes increase. There are very good reasons to hide this kind
of memory management from the user, in that it is pretty closely tied to the
hardware of the system. With segments, you are dealing with a level of
abstraction that isn't quite as close to the hardware, and isn't as likely to
change as the system "ages." But then, what do I know...
-- 
The above views are personal.

I've seen the future, I can't afford it...

Ken Shoemaker, Microprocessor Design, Intel Corp., Santa Clara, California
uucp: ...{hplabs|decwrl|amdcad|qantel|pur-ee|scgvaxd|oliveb}!intelca!mipos3!kds
csnet/arpanet: kds@mipos3.intel.com
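A toy model of the per-segment brk Ken imagines, with the limit check done in
software purely to make the semantics concrete (seg_alloc, seg_brk, and
seg_store are invented names; on a real segmented machine the check would be
done by the hardware on every access, not by a helper function):

    #include <assert.h>
    #include <stdlib.h>

    typedef struct {
        char          *base;
        unsigned long  limit;    /* current size of the "segment" */
    } segment;

    static segment seg_alloc(unsigned long size)
    {
        segment s;
        s.base  = malloc(size);
        s.limit = size;
        return s;
    }

    static void seg_brk(segment *s, unsigned long new_size)
    {
        s->base  = realloc(s->base, new_size);   /* grow this object only */
        s->limit = new_size;
    }

    static void seg_store(segment *s, unsigned long off, char v)
    {
        assert(off < s->limit);   /* the trap at the time of dereference */
        s->base[off] = v;
    }

    int main(void)
    {
        segment a = seg_alloc(1000);
        segment b = seg_alloc(1000);

        seg_brk(&a, 2000);
        seg_store(&a, 1500, 1);   /* fine: a has been grown to 2000 bytes */
        seg_store(&b, 1500, 1);   /* trips the assert: past b's limit     */
        return 0;
    }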
mike@peregrine.UUCP (Mike Wexler) (11/19/86)
In article <13802@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>Suppose every data structure were in its own segment. And of course,
>that every segment were big enough to hold any data structure you
>needed so that you didn't need to manage multiple segments for one
>data structure.
>
> Phil Ngai +1 408 749 5720

Intel's iAPX 432 processor incorporated this idea in hardware. It was slow.
This is not necessarily because of the basic idea; it may have just been a bad
implementation. It would be worthwhile to get the databook for the 432: it is
quite interesting. The 432 went a little bit further. It basically "knows" the
type of every data structure and enforces certain rules as to what can be done
with things of each type.
-- 
Mike Wexler
(trwrb|scgvaxd)!felix!peregrine!mike
(714)855-3923
dan@prairie.UUCP (Daniel M. Frank) (11/19/86)
In article <13802@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>Suppose every data structure were in its own segment.
>
>Would this be worth doing? Of course, it would complicate the OS's
>memory management duties. But think about it.

I think this was called the iAPX 432. Anyone remember the 432?

Seriously, though, this is the old RISC-vs.-CISC argument, or the old
capabilities-vs.-addresses argument. Here are the problems, as succinctly as I
can put them:

1) Given a fast, simple instruction set, why not just let the compiler put in
   checks (in languages where such checks make sense)? Then, once you "trust"
   your program, you can turn them off (see the sketch below). Capabilities
   are forever.

2) If you have a segment (translate "capability") for every data object, you
   have to store the capability info (permissions, real address, size)
   somewhere and, because it is very inefficient to keep it in main memory,
   you'll want to cache it. This is effectively what the 286 does: every time
   you do a segment register load, a bunch of hidden information is associated
   with the register. With lots of capabilities, you will have to maintain a
   capability cache, and the size of that cache will have to be very large to
   efficiently support programs with lots of data objects.

3) The religious argument of the RISC folks is that, even if you can solve
   problem (2), which is difficult, you pay an extra performance cost for the
   complexity you have to build into the processor.

4) Virtual memory is a real problem for capabilities. They tend to be very
   small, so it's not clear what you want to swap. Do you want to swap
   individual capabilities in and out, or do you want to lay capabilities over
   a paged memory, and manage the memory with no knowledge of capabilities?

One compromise would be to use the 286 architecture, but build a compiler that
can be told to put objects above a given size in their own segments. You have
to be careful, though, not to exceed the size of the local descriptor table
for the process.
-- 
Dan Frank
uucp: ... uwvax!prairie!dan
arpa: dan%caseus@spool.wisc.edu
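A sketch of point (1): the checks live in the generated code (here a macro
stands in for what a compiler would emit; the BOUNDS and TRUSTED names are
invented), so once the program is trusted they can simply be compiled away. A
hardware capability, by contrast, is "forever".

    #include <assert.h>
    #include <stdio.h>

    #ifdef TRUSTED             /* once you "trust" the program, turn them off */
    #define BOUNDS(i, n)  (i)
    #else                      /* while debugging, check every single access  */
    #define BOUNDS(i, n)  (assert((unsigned)(i) < (unsigned)(n)), (i))
    #endif

    int main(void)
    {
        int table[10];
        int i;

        for (i = 0; i < 10; i++)
            table[BOUNDS(i, 10)] = i;     /* every access goes through the check */

        printf("%d\n", table[BOUNDS(3, 10)]);
        return 0;
    }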
tomk@intsc.UUCP (Tom Kohrs) (11/20/86)
> This is (if I understand correctly) exactly what happened in the object
> oriented environment of the Intel 432 processor. Each data (and code)
> object could only be accessed in the appropriate context. All objects not
> explicitly referenced in the current execution context were turned off,
> and would cause a detectable fault.

This is a correct assumption.

> The problem with doing this on a
> 286, for instance is again the extreme overhead necessary to set up
> (or check access permissions) on each data reference.

In the 286 the hardware takes care of the overhead associated with checking
the access rights. The time to do this is hidden in the pipeline.

> It is also not
> a general solution on a 286; you can run out of segments too quickly.

16K segments is a lot to run out of. Typical of what is done in a segmented
programming environment is to give each array its own segment and lump all of
the single-element variables together in one segment. From a C portability
standpoint it can be hidden from the programmer by having the compiler use the
14-bit segment selector as the base of the variable, with the offset as the
index. Of course, programmers who try to index into an array off of the base
address of an adjacent array will get in trouble (nobody writes code like
that, do they? (:-) ). What you run out of on the 286 is bytes in a segment;
the 386 should fix that, though I'm sure somebody is going to come back and
complain about crippling 4-Gbyte segments.

> It was the overhead (and the general confusion about how you programmed
> object oriented hardware) that killed the 432 off.

The confusion (fear) about object-oriented programming is probably what will
shy programmers away from segments in the 386 for a long time.
-- 
------
"Ever notice how your mental image of someone you've known only by phone turns
out to be wrong? And on a computer net you don't even have a voice..."

tomk@intsc.UUCP    Tom Kohrs
                   Regional Architecture Specialist
                   Intel - Santa Clara
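The "nobody writes code like that" pattern Tom mentions, spelled out (array
names invented): it only "works" because the two arrays happen to sit next to
each other in a flat address space, and it is exactly what per-array segments,
or any bounds checking, would trap.

    #include <stdio.h>

    int main(void)
    {
        int a[10];   /* with one segment per array, a[] has a hard limit of 10 */
        int b[10];
        int i;

        for (i = 0; i < 10; i++) {
            a[i] = i;
            b[i] = 100 + i;
        }

        /* Undefined behaviour in C, and a segment-limit violation under the
         * scheme being discussed: indexing past a[] in the hope of landing
         * in b[].  In a flat address space it merely reads whatever happens
         * to be there. */
        printf("%d\n", a[15]);
        return 0;
    }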
m5d@bobkat.UUCP (Mike McNally ) (11/21/86)
In article <407@intsc.UUCP> tomk@intsc.UUCP (Tom Kohrs) writes:
> [ ... ]
>> The problem with doing this on a
>> 286, for instance is again the extreme overhead necessary to set up
>> (or check access permissions) on each data reference.
>
>In the 286 the hardware takes care of the overhead associated with checking
>the access rights. The time to do this is hidden in the pipeline.
>
> tomk@intsc.UUCP    Tom Kohrs
>                    Regional Architecture Specialist
>                    Intel - Santa Clara

The protection checks which involve already-loaded descriptors are indeed
"free" (ignoring RISCy arguments along the lines that the chip real estate and
hardware sophistication could have been better used in making instructions
faster (and maybe curing my pet peeve, the affection for AX felt by IMUL) (but
I digress)). However, a scenario in which each individual object lies in its
own data segment would involve an LDS or LES or something before each
reference. Check your handy iAPX 286 reference guide and see how long these
instructions take. A long time, right? While you're at it, ask your local
Intel rep why it's a bad idea to have too many code segments in a protected
environment.

Don't get me wrong; I like the 286. I strongly believe that it could be used
to great advantage in a machine like the Macintosh; that is, a machine which
is to run its own custom-designed OS. Or, if you're so inclined, I suppose the
machine works just fine under iRMX (I'm not so inclined... I'd like to more
directly express my feelings here, but I can't spell the noise I make when I
think about RMX... sort of like coming into work in the morning, pouring some
coffee, taking a sip, then realizing that you grabbed the wrong pot and got
last night's cold mouldy scum-laden black death...).

Of course, as I look through my iAPX 386 Programmer's Reference Manual, I get
warm feelings when I think about stacks > 64K... BUT THE MULTIPLY INSTRUCTION
STILL SUCKS.
-- 
**** **** ****  At Digital Lynx, we're almost in Garland, but not quite  **** **** ****

Mike McNally                     Digital Lynx Inc.
Software (not hardware) Person   Dallas TX 75243
uucp: ...convex!ctvax!bobkat!m5  (214) 238-7474
mwm@eris.UUCP (11/22/86)
Just a few quick observations on segments:

1) Segments are not a new thing - Burroughs has been selling them on their
   large systems for over a decade now (two decades, maybe?).

2) The thing that everybody who works with Unix should think of when the word
   "segments" comes up (after eighty-eighty sux, of course :-) is "Multics,"
   followed by "slow." But Multics tried to support far more than is being
   discussed here.

3) Segments are a good thing, but only if you've got enough of them to be
   useful (enough to store arrays as Iliffe vectors - see the sketch below),
   and each one is big enough to be ditto. "Enough" varies with time, of
   course.

4) You don't have to have a time overhead for having segments. After all, a
   VAX has segments already.

5) You don't have to have broken pointer semantics if you use segments.
   Scattering them around in a large, sparse address space works fine.

6) 32 bits isn't a big enough address space. You can (maybe) make something
   useful out of it, but it probably won't be useful in a few years.

7) The memory cost for segments should be small, and may be zero, depending on
   what kind of architecture you're trying to cram them into.

8) Segments are coming to Unix. See either MACH or the Karels & McKusick paper
   on the new BSD virtual memory system.

	<mike
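For item 3, a small C illustration of an Iliffe vector (names invented): a
two-dimensional array stored as a vector of row pointers, so each row is an
independent object that could live in its own segment and be bounds-checked
separately. Here the rows are just ordinary heap allocations.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int rows = 4, cols = 5;
        int **m = malloc(rows * sizeof *m);     /* the Iliffe vector itself */
        int i, j;

        for (i = 0; i < rows; i++)
            m[i] = malloc(cols * sizeof **m);   /* one "segment" per row    */

        for (i = 0; i < rows; i++)
            for (j = 0; j < cols; j++)
                m[i][j] = i * cols + j;

        printf("%d\n", m[2][3]);                /* indexed through the vector */

        for (i = 0; i < rows; i++)
            free(m[i]);
        free(m);
        return 0;
    }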
mike@peregrine.UUCP (Mike Wexler) (11/25/86)
Reply-To: mike@peregrine.UUCP (Mike Wexler)
Organization: Peregrine Systems, Inc., Irvine, CA

In article <260@mipos3.UUCP> kds@mipos3.UUCP (Ken Shoemaker ~) writes:
->I'm sorry, but I have to take exception to some of the points raised here:
->In article <9400@sun.uucp> david@sun.uUCp (David DiGiacomo) writes:
->>No, for four reasons:
->> - It is very expensive to expand pointers to hold a reasonably large
->>   (16 bit?) segment number field.
->very expensive? They are already at least 32-bits wide! Besides, I
->thought that the whole idea of the 32-bit processors was that you got
->to talk to lots of memory...

The point is that to handle segments you would want to have 48-bit pointers: a
16-bit segment number and a 32-bit offset within the segment. Otherwise, you
would be limiting how big individual objects could be. The other problem is
that you might want to have more than 65536 objects.
-- 
Mike Wexler
(trwrb|scgvaxd)!felix!peregrine!mike
(714)855-3923
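Spelled out as a C struct (purely illustrative; real compilers for segmented
machines used built-in far-pointer types rather than a visible struct like
this), the 48-bit pointer Mike describes is a 16-bit selector plus a 32-bit
offset, which is also where the 65536-object ceiling comes from.

    #include <stdio.h>
    #include <stdint.h>

    struct far_ptr {
        uint16_t selector;   /* which segment: at most 65536 distinct objects */
        uint32_t offset;     /* where within it: objects of up to 4 Gbytes    */
    };

    int main(void)
    {
        struct far_ptr p = { 0x0017, 0x00001000u };

        /* 6 bytes of payload; alignment padding may make the struct larger. */
        printf("selector=%u offset=%lu payload=%lu bytes\n",
               (unsigned)p.selector, (unsigned long)p.offset,
               (unsigned long)(sizeof p.selector + sizeof p.offset));
        return 0;
    }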
shap@sfsup.UUCP (J.S.Shapiro) (11/25/86)
> Suppose every data structure were in its own segment.

Phil, it's a useful debugging technique, and it does provide protection and
sharing, but in real code the overhead of swapping the segment registers in
and out would be too high. On the other hand, it ought to be possible
(designing on horseback now) to generate the code that way with some
appropriate symbol table, and then go through and merge the segments you
trust, to eliminate the segment register swaps.

The problem I see is getting into some nontrivial expression and thrashing the
segment registers around, along with the consequent changes and invalidations
in the TLB. This would blow the MMU firmly out of the water, if I remember the
design right. It's been about 2 years since I looked.

Jon Shapiro