phil@amdcad.UUCP (Phil Ngai) (11/18/86)
I'd like to propose something here. Have you ever spent a few days tracking
down a bug caused by writing beyond the bounds of an array and trashing a
vital data structure which only gets noticed many cycles later? Strings, of
course, are arrays.

Suppose every data structure were in its own segment. And of course, that
every segment were big enough to hold any data structure you needed, so that
you didn't need to manage multiple segments for one data structure. Then when
a bug tries to access beyond the end of an array, the bad reference is trapped
at the time of dereference instead of invisibly (at the time) trashing an
innocent data structure that happened to be in the right (wrong) place.

Would this be worth doing? Of course, it would complicate the OS's memory
management duties. But think about it.
-- 
The distance from the North end of Vietnam to the South end is about the same
as the distance from New York to Florida.

Phil Ngai +1 408 749 5720
UUCP: {ucbvax,decwrl,hplabs,allegra}!amdcad!phil
ARPA: amdcad!phil@decwrl.dec.com
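A minimal C sketch of the failure mode Phil describes (the struct layout and
names are invented for illustration): the overrun itself is silent, and the
damage shows up only when the neighboring field is read much later. With each
object in its own segment, the bad store would fault on the spot instead.

    #include <stdio.h>
    #include <string.h>

    struct layout {
        char name[8];        /* too small for the string stored below */
        int  record_count;   /* the "innocent" data that gets trashed */
    };

    int main(void)
    {
        struct layout d;
        d.record_count = 42;

        /* The overrun happens here; nothing traps.  (Exact behaviour is
         * undefined in C -- the point is that the store goes unnoticed.) */
        strcpy(d.name, "this string is too long");

        /* ...and the damage is seen only "many cycles later". */
        printf("record_count = %d\n", d.record_count);
        return 0;
    }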
dhp@ihlpa.UUCP (Douglas H. Price) (11/19/86)
> Suppose every data structure were in its own segment.
>
> Would this be worth doing?

This is (if I understand correctly) exactly what happened in the
object-oriented environment of the Intel 432 processor. Each data (and code)
object could only be accessed in the appropriate context. All objects not
explicitly referenced in the current execution context were turned off, and
would cause a detectable fault.

The problem with doing this on a 286, for instance, is again the extreme
overhead necessary to set up (or check access permissions on) each data
reference. It is also not a general solution on a 286; you can run out of
segments too quickly. It was the overhead (and the general confusion about how
you programmed object-oriented hardware) that killed the 432 off.
-- 
Douglas H. Price
Analysts International Corp. @ AT&T Bell Laboratories
..!ihnp4!ihlpa!dhp
david@sun.uucp (David DiGiacomo) (11/19/86)
In article <13802@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>I'd like to propose something here. Have you ever spent a few days
>tracking down a bug caused by writing beyond the bounds of an array
>and trashing a vital data structure which only gets noticed many
>cycles later? Strings, of course, are arrays.
>
>Suppose every data structure were in its own segment...
>
>Would this be worth doing? Of course, it would complicate the OS's
>memory management duties. But think about it.

No, for four reasons:

- It is very expensive to expand pointers to hold a reasonably large
  (16 bit?) segment number field.

- Other things being equal, address translation is slowed considerably
  by segment bounds checking.

- Non-uniform pointers lead to additional software complexity and cause
  severe problems when porting code from traditional systems.

- You can easily accomplish what you want to do in a pure paged system.
  Just decide that you are going to use an arbitrary number of pointer
  bits for the "segment" number and load your page tables accordingly.
  The only difference is that segment granularity is one page, but that
  shouldn't matter for the debugging application you mention (see the
  sketch below).
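Electric-fence-style debugging allocators do exactly this with the paging
hardware. A minimal sketch in modern POSIX terms (mmap, mprotect, and
MAP_ANONYMOUS are assumptions about the host system, not anything proposed in
the thread): the object is pushed up against an inaccessible guard page, so an
overrun faults at the offending store, with one-page granularity as noted
above.

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *guarded_alloc(size_t size)
    {
        long   page       = sysconf(_SC_PAGESIZE);
        size_t data_pages = (size + page - 1) / page;
        size_t total      = (data_pages + 1) * page;  /* data + one guard page */

        /* MAP_ANONYMOUS is a common extension; strict POSIX would map /dev/zero. */
        char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            return NULL;

        /* The guard page plays the role of the segment limit. */
        mprotect(base + data_pages * page, page, PROT_NONE);

        /* Place the object so that it ends exactly at the guard page. */
        return base + data_pages * page - size;
    }

    int main(void)
    {
        char *buf = guarded_alloc(100);

        memset(buf, 0, 100);   /* in bounds: fine                          */
        buf[100] = 'x';        /* one byte past the end: faults right here */
        return 0;
    }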
kds@mipos3.UUCP (Ken Shoemaker ~) (11/19/86)
I'm sorry, but I have to take exception to some of the points raised here:

In article <9400@sun.uucp> david@sun.uUCp (David DiGiacomo) writes:
>
>No, for four reasons:
>
> - It is very expensive to expand pointers to hold a reasonably large
>   (16 bit?) segment number field.

very expensive? They are already at least 32 bits wide! Besides, I thought
that the whole idea of the 32-bit processors was that you got to talk to lots
of memory...

> - Other things being equal, address translation is slowed considerably
>   by segment bounds checking.

it really depends on how you do it. In the 386 the segment bounds checking is
integrated into the pipeline and happens in parallel with other things going
on, so it really doesn't add to the processing time of the instruction.

> - Non-uniform pointers lead to additional software complexity and cause
>   severe problems when porting code from traditional systems.

I'll agree with this one...

> - You can easily accomplish what you want to do in a pure paged
>   system. Just decide that you are going to use an arbitrary number of
>   pointer bits for the "segment" number and load your page tables
>   accordingly. The only difference is that segment granularity is one
>   page, but that shouldn't matter for the debugging application you
>   mention.

This implies some amount of control by the application program over what kinds
of accesses are allowed to various parts of its virtual memory space. Maybe
I'm wrong, but if you have a segmented system, I'd think it would be easier
for the OS to manage a request for a segment of some size than it would be for
the OS to assign the attributes for a certain page, and then to maintain them
for you. In addition, I'd think it would be easier from a programmer's
standpoint to keep track of a single segment number than to keep track of an
area of memory in a linear space. Imagine being able to have calls to set the
brk location in any one of a number of segments. Not very Unix-like, difficult
to do with C, and only applicable to processors that support segmentation,
but?

You've also got a problem with the concept of setting page table attributes
and porting tools from any generic architecture, in that page tables are not
at all consistent from one architecture to another, or even between systems
using the same architecture. And you can expect that page table sizes will
change as memory sizes increase. There are very good reasons to hide this kind
of memory management from the user, in that it is pretty closely tied to the
hardware of the system. With segments, you are dealing with a level of
abstraction that isn't quite as close to the hardware, and isn't as likely to
change as the system "ages." But then, what do I know...
-- 
The above views are personal.

I've seen the future, I can't afford it...

Ken Shoemaker, Microprocessor Design, Intel Corp., Santa Clara, California
uucp: ...{hplabs|decwrl|amdcad|qantel|pur-ee|scgvaxd|oliveb}!intelca!mipos3!kds
csnet/arpanet: kds@mipos3.intel.com
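A toy model of the per-segment brk Ken imagines, with the limit check done in
software purely to make the semantics concrete (seg_alloc, seg_brk, and
seg_store are invented names; on a real segmented machine the check would be
done by the hardware on every access, not by a helper function):

    #include <assert.h>
    #include <stdlib.h>

    typedef struct {
        char          *base;
        unsigned long  limit;    /* current size of the "segment" */
    } segment;

    static segment seg_alloc(unsigned long size)
    {
        segment s;
        s.base  = malloc(size);
        s.limit = size;
        return s;
    }

    static void seg_brk(segment *s, unsigned long new_size)
    {
        s->base  = realloc(s->base, new_size);   /* grow this object only */
        s->limit = new_size;
    }

    static void seg_store(segment *s, unsigned long off, char v)
    {
        assert(off < s->limit);   /* the trap at the time of dereference */
        s->base[off] = v;
    }

    int main(void)
    {
        segment a = seg_alloc(1000);
        segment b = seg_alloc(1000);

        seg_brk(&a, 2000);
        seg_store(&a, 1500, 1);   /* fine: a has been grown to 2000 bytes */
        seg_store(&b, 1500, 1);   /* trips the assert: past b's limit     */
        return 0;
    }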
mike@peregrine.UUCP (Mike Wexler) (11/19/86)
In article <13802@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>Suppose every data structure were in its own segment. And of course,
>that every segment were big enough to hold any data structure you
>needed so that you didn't need to manage multiple segments for one
>data structure.
>
> Phil Ngai +1 408 749 5720

Intel's iAPX 432 processor incorporated this idea in hardware. It was slow.
This is not necessarily because of the basic idea; it may have just been a bad
implementation. It would be worthwhile to get the databook for the 432: it is
quite interesting. The 432 went a little bit further. It basically "knows" the
type of every data structure and enforces certain rules as to what can be done
with things of each type.
-- 
Mike Wexler
(trwrb|scgvaxd)!felix!peregrine!mike
(714)855-3923
dan@prairie.UUCP (Daniel M. Frank) (11/19/86)
In article <13802@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>Suppose every data structure were in its own segment.
>
>Would this be worth doing? Of course, it would complicate the OS's
>memory management duties. But think about it.

I think this was called the iAPX 432. Anyone remember the 432?

Seriously, though, this is the old RISC-vs.-CISC argument, or the old
capabilities-vs.-addresses argument. Here are the problems, as succinctly as I
can put them:

1) Given a fast, simple instruction set, why not just let the compiler put in
   checks (in languages where such checks make sense)? Then, once you "trust"
   your program, you can turn them off (see the sketch below). Capabilities
   are forever.

2) If you have a segment (translate "capability") for every data object, you
   have to store the capability info (permissions, real address, size)
   somewhere and, because it is very inefficient to keep it in main memory,
   you'll want to cache it. This is effectively what the 286 does: every time
   you do a segment register load, a bunch of hidden information is associated
   with the register. With lots of capabilities, you will have to maintain a
   capability cache, and the size of that cache will have to be very large to
   efficiently support programs with lots of data objects.

3) The religious argument of the RISC folks is that, even if you can solve
   problem (2), which is difficult, you pay an extra performance cost for the
   complexity you have to build into the processor.

4) Virtual memory is a real problem for capabilities. They tend to be very
   small, so it's not clear what you want to swap. Do you want to swap
   individual capabilities in and out, or do you want to lay capabilities over
   a paged memory, and manage the memory with no knowledge of capabilities?

One compromise would be to use the 286 architecture, but build a compiler that
can be told to put objects above a given size in their own segments. You have
to be careful, though, not to exceed the size of the local descriptor table
for the process.
-- 
Dan Frank
uucp: ... uwvax!prairie!dan
arpa: dan%caseus@spool.wisc.edu
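A sketch of point (1): the checks live in the generated code (here a macro
stands in for what a compiler would emit; the BOUNDS and TRUSTED names are
invented), so once the program is trusted they can simply be compiled away. A
hardware capability, by contrast, is "forever".

    #include <assert.h>
    #include <stdio.h>

    #ifdef TRUSTED             /* once you "trust" the program, turn them off */
    #define BOUNDS(i, n)  (i)
    #else                      /* while debugging, check every single access  */
    #define BOUNDS(i, n)  (assert((unsigned)(i) < (unsigned)(n)), (i))
    #endif

    int main(void)
    {
        int table[10];
        int i;

        for (i = 0; i < 10; i++)
            table[BOUNDS(i, 10)] = i;     /* every access goes through the check */

        printf("%d\n", table[BOUNDS(3, 10)]);
        return 0;
    }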
tomk@intsc.UUCP (Tom Kohrs) (11/20/86)
> This is (if I understand correctly) exactly what happened in the object
> oriented environment of the Intel 432 processor. Each data (and code)
> object could only be accessed in the appropriate context. All objects not
> explicitly referenced in the current execution context were turned off,
> and would cause a detectable fault.

This is a correct assumption.

> The problem with doing this on a
> 286, for instance is again the extreme overhead necessary to set up
> (or check access permissions) on each data reference.

In the 286 the hardware takes care of the overhead associated with checking
the access rights. The time to do this is hidden in the pipeline.

> It is also not
> a general solution on a 286; you can run out of segments too quickly.

16K segments is a lot to run out of. Typical of what is done in a segmented
programming environment is to give each array its own segment and lump all of
the single-element variables together in one segment. From a C portability
standpoint it can be hidden from the programmer by having the compiler use the
14-bit segment selector as the base of the variable, with the offset as the
index. Of course, programmers who try to index into an array off of the base
address of an adjacent array will get in trouble (nobody writes code like
that, do they? (:-) ). What you run out of on the 286 is bytes in a segment;
the 386 should fix that, though I'm sure somebody is going to come back and
complain about crippling 4-Gbyte segments.

> It was the overhead (and the general confusion about how you programmed
> object oriented hardware) that killed the 432 off.

The confusion (fear) about object-oriented programming is probably what will
shy programmers away from segments in the 386 for a long time.
-- 
------
"Ever notice how your mental image of someone you've known only by phone turns
out to be wrong? And on a computer net you don't even have a voice..."

tomk@intsc.UUCP    Tom Kohrs
                   Regional Architecture Specialist
                   Intel - Santa Clara
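The "nobody writes code like that" pattern Tom mentions, spelled out (array
names invented): it only "works" because the two arrays happen to sit next to
each other in a flat address space, and it is exactly what per-array segments,
or any bounds checking, would trap.

    #include <stdio.h>

    int main(void)
    {
        int a[10];   /* with one segment per array, a[] has a hard limit of 10 */
        int b[10];
        int i;

        for (i = 0; i < 10; i++) {
            a[i] = i;
            b[i] = 100 + i;
        }

        /* Undefined behaviour in C, and a segment-limit violation under the
         * scheme being discussed: indexing past a[] in the hope of landing
         * in b[].  In a flat address space it merely reads whatever happens
         * to be there. */
        printf("%d\n", a[15]);
        return 0;
    }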
m5d@bobkat.UUCP (Mike McNally ) (11/21/86)
In article <407@intsc.UUCP> tomk@intsc.UUCP (Tom Kohrs) writes:
> [ ... ]
>> The problem with doing this on a
>> 286, for instance is again the extreme overhead necessary to set up
>> (or check access permissions) on each data reference.
>
>In the 286 the hardware takes care of the overhead associated with checking
>the access rights. The time to do this is hidden in the pipeline.
>
> tomk@intsc.UUCP    Tom Kohrs
>                    Regional Architecture Specialist
>                    Intel - Santa Clara

The protection checks which involve already-loaded descriptors are indeed
"free" (ignoring RISCy arguments along the lines that the chip real estate and
hardware sophistication could have been better used in making instructions
faster (and maybe curing my pet peeve, the affection for AX felt by IMUL) (but
I digress)). However, a scenario in which each individual object lies in its
own data segment would involve an LDS or LES or something before each
reference. Check your handy iAPX 286 reference guide and see how long these
instructions take. A long time, right? While you're at it, ask your local
Intel rep why it's a bad idea to have too many code segments in a protected
environment.

Don't get me wrong; I like the 286. I strongly believe that it could be used
to great advantage in a machine like the Macintosh; that is, a machine which
is to run its own custom-designed OS. Or, if you're so inclined, I suppose the
machine works just fine under iRMX (I'm not so inclined... I'd like to more
directly express my feelings here, but I can't spell the noise I make when I
think about RMX... sort of like coming into work in the morning, pouring some
coffee, taking a sip, then realizing that you grabbed the wrong pot and got
last night's cold mouldy scum-laden black death...).

Of course, as I look through my iAPX 386 Programmer's Reference Manual, I get
warm feelings when I think about stacks > 64K... BUT THE MULTIPLY INSTRUCTION
STILL SUCKS.
-- 
**** **** ****  At Digital Lynx, we're almost in Garland, but not quite  **** **** ****

Mike McNally                     Digital Lynx Inc.
Software (not hardware) Person   Dallas TX 75243
uucp: ...convex!ctvax!bobkat!m5  (214) 238-7474
mwm@eris.UUCP (11/22/86)
Just a few quick observations on segments:

1) Segments are not a new thing - Burroughs has been selling them on their
   large systems for over a decade now (two decades, maybe?).

2) The thing that everybody who works with Unix should think of when the word
   "segments" comes up (after eighty-eighty sux, of course :-) is "Multics,"
   followed by "slow." But Multics tried to support far more than is being
   discussed here.

3) Segments are a good thing, but only if you've got enough of them to be
   useful (enough to store arrays as Iliffe vectors - see the sketch below),
   and each one is big enough to be ditto. "Enough" varies with time, of
   course.

4) You don't have to have a time overhead for having segments. After all, a
   VAX has segments already.

5) You don't have to have broken pointer semantics if you use segments.
   Scattering them around in a large, sparse address space works fine.

6) 32 bits isn't a big enough address space. You can (maybe) make something
   useful out of it, but it probably won't be useful in a few years.

7) The memory cost for segments should be small, and may be zero, depending on
   what kind of architecture you're trying to cram them into.

8) Segments are coming to Unix. See either MACH or the Karels & McKusick paper
   on the new BSD virtual memory system.

	<mike
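For item 3, a small C illustration of an Iliffe vector (names invented): a
two-dimensional array stored as a vector of row pointers, so each row is an
independent object that could live in its own segment and be bounds-checked
separately. Here the rows are just ordinary heap allocations.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int rows = 4, cols = 5;
        int **m = malloc(rows * sizeof *m);     /* the Iliffe vector itself */
        int i, j;

        for (i = 0; i < rows; i++)
            m[i] = malloc(cols * sizeof **m);   /* one "segment" per row    */

        for (i = 0; i < rows; i++)
            for (j = 0; j < cols; j++)
                m[i][j] = i * cols + j;

        printf("%d\n", m[2][3]);                /* indexed through the vector */

        for (i = 0; i < rows; i++)
            free(m[i]);
        free(m);
        return 0;
    }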
mike@peregrine.UUCP (Mike Wexler) (11/25/86)
Reply-To: mike@peregrine.UUCP (Mike Wexler)
Organization: Peregrine Systems, Inc., Irvine, CA

In article <260@mipos3.UUCP> kds@mipos3.UUCP (Ken Shoemaker ~) writes:
->I'm sorry, but I have to take exception to some of the points raised here:
->In article <9400@sun.uucp> david@sun.uUCp (David DiGiacomo) writes:
->>No, for four reasons:
->> - It is very expensive to expand pointers to hold a reasonably large
->>   (16 bit?) segment number field.
->very expensive? They are already at least 32-bits wide! Besides, I
->thought that the whole idea of the 32-bit processors was that you got
->to talk to lots of memory...

The point is that to handle segments you would want to have 48-bit pointers: a
16-bit segment number and a 32-bit offset within the segment. Otherwise, you
would be limiting how big individual objects could be. The other problem is
that you might want to have more than 65536 objects.
-- 
Mike Wexler
(trwrb|scgvaxd)!felix!peregrine!mike
(714)855-3923
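Spelled out as a C struct (purely illustrative; real compilers for segmented
machines used built-in far-pointer types rather than a visible struct like
this), the 48-bit pointer Mike describes is a 16-bit selector plus a 32-bit
offset, which is also where the 65536-object ceiling comes from.

    #include <stdio.h>
    #include <stdint.h>

    struct far_ptr {
        uint16_t selector;   /* which segment: at most 65536 distinct objects */
        uint32_t offset;     /* where within it: objects of up to 4 Gbytes    */
    };

    int main(void)
    {
        struct far_ptr p = { 0x0017, 0x00001000u };

        /* 6 bytes of payload; alignment padding may make the struct larger. */
        printf("selector=%u offset=%lu payload=%lu bytes\n",
               (unsigned)p.selector, (unsigned long)p.offset,
               (unsigned long)(sizeof p.selector + sizeof p.offset));
        return 0;
    }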
shap@sfsup.UUCP (J.S.Shapiro) (11/25/86)
> Suppose every data structure were in its own segment.

Phil, it's a useful debugging technique, and it does provide protection and
sharing, but in real code the overhead of swapping the segment registers in
and out would be too high. On the other hand, it ought to be possible
(designing on horseback now) to generate the code that way with some
appropriate symbol table, and then go through and merge the segments you
trust, to eliminate the segment register swaps.

The problem I see is getting into some nontrivial expression and thrashing the
segment registers around, along with the consequent changes and invalidations
in the TLB. This would blow the MMU firmly out of the water, if I remember the
design right. It's been about 2 years since I looked.

Jon Shapiro