mch@computing-maths.cardiff.ac.uk (Major Kano) (05/04/88)
This is a partial reprint of an article that I posted in mid March. Please
read it carefully, as only one person replied last time; I'd like to comment
on this subject and to invite others to do the same.
--------------------------------------------------------------------------------
In article <2904@omepd> mcg@iwarpo3.UUCP (Steve McGeady) writes:
>
>It can't be elegance of design, for (e.g.) the 80386 and the MIPSco processor
>are each somewhat inelegant in their own ways (for those who don't wish to
>fill in the blanks, segmentation and register model in the former case,
                     ^^^^^^^^^^^^     ^^^^^^^^

** WHAT THE $@#@++%%& HELL ?!? **

Wow! And I thought Ray Coles (who writes for Practical Computing, a UK
magazine) had it in for Intel!

Agreeing, as I do, that the register model doesn't have enough registers, and
that even the '386 isn't regular enough, I thought that the feature of the
'386 that made it so TECHNICALLY advanced (IBM compatibility notwithstanding
:-) ** WAS ** its memory management and protection model. The 32-bit
within-segment addresses are what people have been waiting for for ages. I
would question the fact that only 16-bit selectors are available, but I defy
anyone to come up, in the near or intermediate future, with an Intel-style
memory model that is better than Intel's, without opening up a whole can of
voracious memory-eating killer-worms at the descriptor table level.

If you don't know what I mean by that, pretend for a moment that YOU were the
person who had to come up with the byte/page granularity kludge in order to
make 4GB segments fit in a DT entry. [ Perhaps that person (or team) would
like to comment themselves, if they use the net. I would be interested to
know what choices they started out with before deciding on this, and why
they decided on it. ]

What does everyone think?
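[For readers who haven't met the byte/page granularity kludge being alluded
to: a '386 descriptor stores only a 20-bit limit, and a granularity bit says
whether it counts bytes or 4KB pages, which is how a 4GB segment fits in a
DT entry. A minimal sketch in C -- the function name and layout are mine,
not Intel's:]

```c
#include <stdint.h>

/* Effective segment limit from a 386-style descriptor: a 20-bit raw
 * limit plus a granularity bit.  With G=0 the limit is in bytes
 * (max 1MB); with G=1 it is in 4KB pages, so 20 bits stretch to 4GB. */
uint32_t effective_limit(uint32_t raw_limit20, int g_bit)
{
    raw_limit20 &= 0xFFFFF;                  /* only 20 bits are stored */
    if (g_bit)
        return (raw_limit20 << 12) | 0xFFF;  /* page-granular: scale up */
    return raw_limit20;                      /* byte-granular */
}
```

[So the price of the trick is that a page-granular limit can only land on a
4KB boundary, minus one byte.]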
As a computer architecture junkie, I would be very interested, as the Intel
segments seem (to me) to come in for a lot of flak every so often; I just
didn't expect it to come from an Intel employee. [1]
--------------------------------------------------------------------------------
[1] The original poster has already replied (on the net) to this point.
--------------------------------------------------------------------------------
In e-mail to me, someone (whom, since it was e-mail, I will not name without
his permission -- thanks for writing, by the way) wrote the following:

>I happen to agree with you -- I LIKE the basic Intel memory architecture.
>It's more open to the future than the motorola 32-bit addresses. I think
>we're heading into the world of 32-bit physical address spaces soon, and I
>think that the segmented architecture is the right way to go. Apparently,
>nobody else (who matters) does, though.

I find the segmented architecture a natural way of protecting programs and
data, and although we often hear negative remarks, no-one seems able to give
any coherent justification for them. In particular, no-one has EVER in my
experience directly compared segmenting to straight linear/paging without
TOTALLY ignoring the advantages of segmenting, i.e., ease of doing
relocatable code, logical program design (code, data/heap, stack separation),
inter-task separation (LDTs) and a few other related features.

(As an aside, I've heard of 68000 routines doing all kinds of contortions to
check for/avoid overflow, because the 68K traps on (e.g.) zerodivide, and
traps into SUPERVISOR mode (believe it or not). Zerodivide should normally
be a USER (i.e., compiler's run-time system) problem. With Intel, the user
can handle it without compromising security (conforming code segments). With
the 68K you get NO choice. Any 68K programmers out there who can confirm or
deny this?
I am referring to articles in the "SUBSET" feature of Personal Computer
World about a year and a half ago. I can't remember exactly which issues.)

I would like, provided people can temporarily dispense with the apathy that
ensured that I only got one reply to my original posting, to hear a wide
range of views about this. I'm sure there must be people on the net who
would like to express opinions or read those of others. So how about it?
This issue has been a bone of contention among computerists for years;
surely comp.arch is the very place in which it should be discussed.

-mch
--
Martin C. Howe, University College Cardiff | "You actually program in 'C'
mch@vax1.computing-maths.cardiff.ac.uk.    | WITHOUT regular eye-tests ?!"
-------------------------------------------+------------------------------------
My cats know more about UCC's opinions than I do. | MOSH! In the name of ANTHRAX!
rminnich@udel.EDU (Ron Minnich) (05/06/88)
In article <353@cf-cm.UUCP> mch@computing-maths.cardiff.ac.uk (Major Kano) writes:
I would like, provided people can temporarily dispense with the apathy
that ensured that I only got one reply to my original posting, to hear a wide
range of views about this. I'm sure there must be people on the net who would
like to express opinions or read those of others.
---
well, speaking as someone who spent some time at Burroughs, i can
say that i think segmentation is a Good Thing. Unfortunately, Intel
has single-handedly (i think) managed to give segmentation a bad name,
associating segmentation with stupid 16-bit limits, 5 different memory
models, and so on. In fact the 386 is the first time Intel actually
got segmentation right, just 10 years late. Don't expect to see
anyone else use it anytime soon.
But having used segmented machines (done right, at Burroughs) and
non-segmented machines, i can say that i trust programs running
on segmented machines a whole lot more. And I trust C programs
running on non-segmented machines not-a-whit- just ask anybody
who knows about NULL pointers.
ron
--
ron (rminnich@udel.edu)
daveb@geac.UUCP (David Collier-Brown) (05/06/88)
In article <353@cf-cm.UUCP> mch@computing-maths.cardiff.ac.uk (Major Kano) writes:
> that made it so TECHNICALLY advanced [] ** WAS ** its memory management
> and protection model. The 32-bit within-segment addresses are what
> people have been waiting for for ages. I would question the fact that
> only 16 bit selectors are avaliable, but I defy anyone to come up, in the
> near or intermediate future, with an Intel-style memory model that is
> better than Intel's, without opening up a whole can of voracious
> memory-eating killer-worms at the descriptor table level.

The problem with the 80xxx memory models is visibility: the basic idea of a
segment as a grouping mechanism is very old, and should not be confused with
the idea of a segment as a sort of "big page". If the segments are invisible
to the HLL programmer (usually by being "big enough"), they're a win. The
only thing is...

	16 bits of addressability is visibly too little
	32 bits has been described as too small
	36 bits was **FOUND** to be too small, about 10 years ago

Programmer-visible segment limits are probably inadvisable, since the only
good sizes in computer science are zero, one and "as much as you'd like".

--
David Collier-Brown.                  | {mnetor yunexus utgpu}!geac!daveb
Geac Computers International Inc.,    | Computer Science loses its
350 Steelcase Road, Markham, Ontario, | memory (if not its mind)
CANADA, L3R 1B3 (416) 475-0525 x3279  | every 6 months.
eric@snark.UUCP (Eric S. Raymond) (05/07/88)
In article <2411@louie.udel.EDU>, rminnich@udel.EDU (Ron Minnich) writes:
> But having used segmented machines (done right, at Burroughs) and
> non-segmented machines, i can say that i trust programs running
> on segmented machines a whole lot more. And I trust C programs
> running on non-segmented machines not-a-whit- just ask anybody
> who knows about NULL pointers.

Waaaait a second, here. It sounds to me like two very different issues are
being confused. Let's have some definitions:

    Segmented architecture -- one in which the register width is not
    sufficient to address all of memory, so that full addresses must
    be base/offset or segment-descriptor/address pairs.

    Memory protection -- the ability to enforce memory addressing
    restrictions on execution threads so that references outside a 'legal'
    region are detected and trapped (in UNIX terms, raise a SIGSEGV).

These are very different concepts. To trap NULL pointers you want memory
protection. Segmentation implies a crude form of memory protection, with
fixed-sized regions defined by the address span of an offset. But the two
should not be confused.
--
Eric S. Raymond (the mad mastermind of TMN-Netnews)
UUCP: {{uunet,rutgers,ihnp4}!cbmvax,rutgers!vu-vlsi,att}!snark!eric
Post: 22 South Warren Avenue, Malvern, PA 19355    Phone: (215)-296-5718
guy@gorodish.Sun.COM (Guy Harris) (05/08/88)
The point that you need not have a segmented architecture to protect against
null-pointer dereferencing is 100% valid; there are many machine/OS pairs
that do this on non-segmented architectures (the UNIX port to the CCI Power
5/20, SunOS on all Sun machines, VAX/VMS). However, the definitions of
"segmented architecture" are bogus:

> Waaaait a second, here. It sounds to me like two very different issues are
> being confused. Let's have some definitions:
>
> Segmented architecture -- one in which the register width is not
> sufficient to address all of memory, so that full addresses must
> be base/offset or segment-descriptor/address pairs.

Wrong. One could imagine a segmented machine with *no* registers (did the
Burroughs machines have any registers that were visible to anyone or
anything generating machine code?). One could have a machine where the
registers were big enough to hold a segment-number/offset pair. (Zilog
Z8001: a register pair could hold 32-bit quantities, the machine had a
reasonably full set of 32-bit instructions - including 32-bit multiply and
divide - and a segmented address was 8 bits of segment number, 8 bits of
zero, and 16 bits of byte offset within the segment.)

A segmented architecture could better be defined as one where an address
consists of a segment number and an offset within the segment (although
there may very well be cases that this doesn't describe).

> These are very different concepts. To trap NULL pointers you want memory
> protection. Segmentation implies a crude form of memory protection, with
> fixed-sized regions defined by the address span of an offset. But the two
> should not be confused.

No, segmentation doesn't imply memory protection. You could imagine a system
with segments that permits you to read and write any address in the segment,
whether valid or not (i.e., one that doesn't even do bounds checking).
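[The Z8001 layout described above is easy to make concrete. A sketch in C,
following the post's description (8 bits of segment number, 8 bits of zero,
16 bits of offset in one 32-bit register pair); the type and function names
are mine, not Zilog's:]

```c
#include <stdint.h>

/* A segmented address per the Z8001 description above: the 32-bit value
 * holds <8-bit segment><8 zero bits><16-bit byte offset>. */
typedef struct { uint8_t segment; uint16_t offset; } seg_addr;

seg_addr decode(uint32_t reg)
{
    seg_addr a;
    a.segment = (uint8_t)(reg >> 24);      /* top byte: segment number */
    a.offset  = (uint16_t)(reg & 0xFFFF);  /* low 16 bits: byte offset */
    return a;
}
```

[Note the point this supports: the whole segmented address fits in one
register pair, so "register width too small" is not what defines
segmentation.]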
rroot@edm.UUCP (uucp) (05/08/88)
From article <353@cf-cm.UUCP>, by mch@computing-maths.cardiff.ac.uk (Major Kano):
> (As an aside, I've heard of 68000 routines doing all kinds of contortions to
> check for/avoid overflow because the 68K traps on (eg., zerodivide) and traps
> into SUPERVISOR mode (believe it or not). Zerodivide should normally be a USER
> (ie., compiler's run time system) problem. With Intel, the user can handle it
> without compromising security (conforming code segments). With the 68K you get
> NO choice. Any 68K programmers out there who can confirm or deny this ?

Yes, the divide instruction seems to unconditionally except on a zero
divide, but if you REALLY want to ignore zero divide, you can have the
interrupt vector point to an RTE instruction (or an OR/RTE pair if you want
to set the condition code). The cost on the 68K is about 60 cycles
(including both the exception and the return) - somewhat less than a full
divide. On the '020 the cost is about the same as the cost of a divide.

Considering the fact that a zero divide is almost ALWAYS a big boo-boo, I
think it is normally more cost-effective (in terms of cycles) to blow up on
zero divide than it is to force a program to check for this (should-be)
rarity before every division where a zero divisor is possible. The nice
thing is that it gives a programmer the ability to tell the difference
between a divide by zero (almost always a mistake) and a REAL overflow
(which often gets treated differently), with the possibility of a graceful
recovery (above) where you really DON'T care.

As far as I can tell, this is the only case where an otherwise innocuous 68K
instruction will cause an exception. In any other case, you have to ASK for
the interrupt (with a TRAPcc). Although you may not like the trade-off, I
see it as generally being a plus.
--
-------------
 Stephen Samuel
  {ihnp4,ubc-vision,vax135}!alberta!edm!steve
  or userzxcv@uqv-mts.bitnet
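[In C terms, the user-level check being traded off against the trap looks
something like this - a hypothetical helper, covering the zero divisor and
the one genuine signed-overflow case, INT_MIN / -1:]

```c
#include <limits.h>

/* The guard a program must write if it cannot (or will not) rely on
 * the hardware trap: reject the two inputs where integer division
 * is not defined, and let the caller decide how to recover. */
int safe_div(int num, int den, int *ok)
{
    if (den == 0 || (num == INT_MIN && den == -1)) {
        *ok = 0;     /* zero divisor, or the one true overflow */
        return 0;
    }
    *ok = 1;
    return num / den;
}
```

[The cycle-count argument above is exactly about whether this branch must
run before every division, or only in the rare trap handler.]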
rminnich@udel.EDU (Ron Minnich) (05/08/88)
In article <22830abd:a11@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
> Segmented architecture -- one in which the register width is not
> sufficient to address all of memory, so that full addresses must
> be base/offset or segment-descriptor/address pairs.

Nope. Wrong. Take a look at the Burroughs architecture of the early sixties,
with their 20-bit address, and tell me that people built machines with 6 MB
of memory around then (the 20-bit address was for 48-bit words). Your
definition demonstrates my thesis - that people think of Intel when they
think of segments, and they think that segments are a way of getting around
small addressing models. Intel did it wrong.

> Memory protection -- the ability to enforce memory addressing restrictions
> on execution threads so that references outside a 'legal' region are
> detected and trapped (in UNIX terms, raise a SIGSEGV).

In any real program, there are a large number of legal regions, which you
don't want to get confused. Flat address space machines do not help in this
regard. At most they provide two regions (code and data). The NULL pointer
joke was an aside. What you really want is to (e.g.) keep array references
from going to the wrong place, and that is what segments help enforce. In a
flat address space, two arrays of structures butted against each other can
(and do!) become confused; in a segmented machine done right, they can't; in
most Intel machines, they can (and do!) become confused (in the tiny, small,
medium, and large models, for example).
--
ron (rminnich@udel.edu)
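[The "two arrays butted against each other" hazard can be shown in a few
lines of C. Here the two logical arrays are deliberately carved out of one
flat buffer so the demonstration is well-defined; on flat-address hardware
an adjacent pair of real arrays behaves the same way - the stray store
simply lands in the neighbour, with no trap:]

```c
/* A flat "address space" holding two logical arrays back to back. */
int mem[8];                  /* zero-initialized backing store */
int *const a = &mem[0];      /* logical array a[0..3] */
int *const b = &mem[4];      /* logical array b[0..3], butted against a */

int clobber_demo(void)
{
    a[5] = 99;   /* index past a's logical end: no trap on a flat machine */
    return b[1]; /* ...and the neighbouring array has been clobbered */
}
```

[A bounds-checked segment per array would have faulted on the store instead
of silently corrupting b.]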
rminnich@udel.EDU (Ron Minnich) (05/08/88)
In article <52404@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
>No, segmentation doesn't imply memory protection. You could imagine a system
>with segments that permits you to read and write from any address in the
>segment, whether valid or not (i.e., one that doesn't even do bounds
>checking).

Hence my point about segmentation done right. The later Burroughs machines
really did things pretty well; i still miss the address checking every time
i try to debug some broken C code. Fact is, if i have an array of something,
i want it to be in its own legal region, and i want it to be bounds-checked
when i mess with it. Before anyone goes off the handle about cost, remember
the cost of all those programs that duplicate this stuff in C code.

Segmentation, despite having been associated with some pretty unpleasant
architectures in the last few years (8086, 80286), is not in itself such a
bad thing. I am getting to the point where I am willing to pay a performance
penalty if the plagued things would just run right ...
--
ron (rminnich@udel.edu)
daveb@geac.UUCP (David Collier-Brown) (05/09/88)
| In article <2411@louie.udel.EDU>, rminnich@udel.EDU (Ron Minnich) writes:
| But having used segmented machines (done right, at Burroughs) and
| non-segmented machines, i can say that i trust programs running
| on segmented machines a whole lot more. And I trust C programs
| running on non-segmented machines not-a-whit- just ask anybody
| who knows about NULL pointers.

In article <22830abd:a11@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
| Waaaait a second, here. It sounds to me like two very different issues are
| being confused.

Woah! The comment about NULL pointers probably has to do with either
separate I&D, or is just a snide comment...

| Let's have some definitions:
| Segmented architecture -- one in which the register width is not
| sufficient to address all of memory, so that full addresses must
| be base/offset or segment-descriptor/address pairs.

Nope. That's the Intel form that was being complained about. A better one is
from Organick [1]:

    1.2.3 Segments

    A segment of a process is a collection of information important enough
    to be given a name. A segment is a unit of sharing and has associated
    with it a collection of attributes, including a unique identification.
    Segments are, generally speaking, blocks of code (procedures) or blocks
    of data ranging in size from zero to 2**16 words [2]. Each segment can
    be allowed to grow or shrink during execution of the program. A record
    of its size is kept in the "descriptor word" associated with the
    segment.

If we use this (still flawed) definition, we get something of plausible
size, with a single name, addressable without explicit programmer action in
an HLL, and pageable/protectable at will.

Unfortunately, there are only three good sizes in computer science: zero,
one and "as big as you'd like". Providing the latter is still a research
problem, but we're getting better (as comments in this forum tend to point
out).

--dave (I hope that helps) c-b

[1] Organick, Elliott I., "The Multics System: An Examination of Its
    Structure", MIT Press, 1972.
[2] The segment in question was that of the GE/Honeywell 645, which had a
    4-byte word, giving 2**18 bytes/segment. This was perfectly reasonable
    for code, but was a **pain** for data. One could have lots of segments
    per process, and rarely wanted blocks of code larger than the limit.
    Data, on the other hand, was hard, since databases could easily exceed
    the size of a single segment. And for some reason, people insisted on
    calling the database by a single name, not "fred part one" and "fred
    part two". Not too many people wrote libraries big enough to exceed a
    single segment.
--
David Collier-Brown.                  | {mnetor yunexus utgpu}!geac!daveb
Geac Computers International Inc.,    | Computer Science loses its
350 Steelcase Road, Markham, Ontario, | memory (if not its mind)
CANADA, L3R 1B3 (416) 475-0525 x3279  | every 6 months.
radford@calgary.UUCP (Radford Neal) (05/09/88)
In article <2411@louie.udel.EDU>, rminnich@udel.EDU (Ron Minnich) writes:
> > But having used segmented machines (done right, at Burroughs) and
> > non-segmented machines, i can say that i trust programs running
> > on segmented machines a whole lot more. And I trust C programs
> > running on non-segmented machines not-a-whit- just ask anybody
> > who knows about NULL pointers.

In article <22830abd:a11@snark.UUCP>, eric@snark.UUCP (Eric S. Raymond) writes:
> Waaaait a second, here. It sounds to me like two very different issues are
> being confused. Let's have some definitions:
>
> Segmented architecture -- one in which the register width is not
> sufficient to address all of memory, so that full addresses must
> be base/offset or segment-descriptor/address pairs.

I think you're both wrong, here. My impression of the common (and most
useful) meaning of the word "segment", as used in MULTICS for instance, is
the following:

    A segment is a contiguous portion of virtual memory that is
    conceptually separate from other portions of virtual memory.

Typically, one gets to expand and shrink a segment independently of other
segments. For instance, one may get to map two files to two segments and be
able to extend or truncate the mapped files as desired. Maybe one also gets
to set permissions differently for different segments, but I'd consider
that a side issue.

I would expect to be able to address any word of any segment by simply
following a pointer (i.e., i = *p). If you can't, you've got some sort of
"bank switched" addressing kludge. This definition is really language-, not
machine-, dependent - maybe you can do it in C but not Pascal - though
obviously the underlying hardware support determines the cost of doing it.

Typically, segments would begin at fixed distances in virtual address space
from each other. This distance in turn determines the maximum size of a
segment. (Other factors might limit a segment to less than this, of course.)

The typical problem with segmented systems is that the maximum size of a
segment is too small - e.g. 64KB on an 8088, 1MB on MULTICS (I think).
Allowing too few segments also causes problems - e.g. 256 on MULTICS (I
think). If you don't have either of these problems, then segmentation is a
great idea. Of course, you're inevitably paying something for those extra
address bits necessitated by your sparse use of address space.

   Radford Neal
guy@gorodish.Sun.COM (Guy Harris) (05/09/88)
> Fact is, if i have an array of something, i want it to be in its own
> legal region, and i want it to be bounds-checked when i mess with it.
> Before any one goes off the handle about cost, remember the cost
> of all those programs that duplicate this stuff in C code.

Except that programs that duplicate that stuff in C (or whatever) code tend
to do something useful when the subscript is out of range. For a somewhat
trivial example, consider a program that reads a large array of numbers from
a file, and then prompts the user for an array index and prints out the
element of the array selected by that index. Even in a language and
implementation that does array-bounds checking, a program that just reads
the index and uses it without first checking whether it's in range is
wrong. Telling the user "try again, the valid indices are M through N" is
far better than giving them a "subscript range exceeded" error and a stack
trace.

Having the language and its implementation do this checking may be helpful
in detecting bugs; however, in many cases you still have to put in the check
yourself anyway if you want a reasonable program.
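[The explicit check being argued for, as a C sketch - the helper and its
message are invented for illustration: the program validates the index
itself and produces a user-level message, whether or not the language also
checks bounds:]

```c
#include <stdio.h>

/* Look up arr[i] with an application-level range check.  Returns NULL
 * on success (result in *out), or a message suitable for the user. */
const char *lookup(const double *arr, int n, int i, double *out)
{
    static char msg[64];
    if (i < 0 || i >= n) {
        sprintf(msg, "try again, the valid indices are 0 through %d", n - 1);
        return msg;          /* useful diagnostic, not a stack trace */
    }
    *out = arr[i];
    return NULL;
}
```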
lamaster@ames.arpa (Hugh LaMaster) (05/09/88)
In article <2411@louie.udel.EDU> rminnich@udel.EDU (Ron Minnich) writes:
> But having used segmented machines (done right, at Burroughs) and
>non-segmented machines, i can say that i trust programs running
>on segmented machines a whole lot more. And I trust C programs
>running on non-segmented machines not-a-whit- just ask anybody
>who knows about NULL pointers.

Several other postings defended segmentation as a Good Thing. I think
several of the posters are assuming that two different uses of
"segmentation" are synonyms, or at least must go together. As a
counterexample, consider the (infamous?) IBM 370. It has a linear 24 (now
31) bit address space. It also has segmented page tables, which permit, for
example, memory segments (or sections, if you are used to a different
terminology) to be shared (read-only if you like) and so on. If you had a
64-bit linear address space, you could have 2^31 (or 2^32) memory sections
of 2^32 bytes each, and put every data structure in its own section if you
wanted.

Anyway, what is wrong with the "Intel" style (also used in other machines)
of "segmentation"? Well, if I need more than 2^32 bytes of memory, I
probably need a Single Array of more than 2^32 bytes, and I probably need
to address it very efficiently (uh oh, it's those pesky scientific
programmers again...). Now, I probably can't afford to load a segment
register before every memory reference - especially if I am using vector
instructions and might cross a segment boundary in the middle of an
instruction.

The first kind of segmentation, segmented page tables, is a Good Thing. It
can also do everything that the second kind of segmentation, the Bad Kind,
can do, and without the drawbacks. Except for one - bigger addresses. Well,
nothing is for free.
--
  Hugh LaMaster, m/s 233-9,  UUCP {topaz,lll-crg,ucbvax}!
  NASA Ames Research Center                ames!lamaster
  Moffett Field, CA 94035    ARPA lamaster@ames.arpa
  Phone:  (415)694-6117      ARPA lamaster@ames.arc.nasa.gov
ok@quintus.UUCP (Richard A. O'Keefe) (05/09/88)
In article <22830abd:a11@snark.UUCP>, eric@snark.UUCP (Eric S. Raymond) writes:
> In article <2411@louie.udel.EDU>, rminnich@udel.EDU (Ron Minnich) writes:
> > But having used segmented machines (done right, at Burroughs) and
> > non-segmented machines, i can say that i trust programs running
> > on segmented machines a whole lot more. And I trust C programs
> > running on non-segmented machines not-a-whit- just ask anybody
> > who knows about NULL pointers.
>
> Waaaait a second, here. It sounds to me like two very different issues are
> being confused. Let's have some definitions:
>
> Segmented architecture -- one in which the register width is not
> sufficient to address all of memory, so that full addresses must
> be base/offset or segment-descriptor/address pairs.
>
> Memory protection -- the ability to enforce memory addressing restrictions
> on execution threads so that references outside a 'legal' region are
> detected and trapped (in UNIX terms, raise a SIGSEGV).

I don't see what segmented addressing has to do with register width. On the
Burroughs machines, the "registers" (well, there are some top-of-stack
registers) that hold descriptors are much wider than physical memory
addresses, and the virtual address space is a tree which doesn't have any
well-defined upper bound on its size. "Segments" on the Burroughs machines
are like "objects" in Smalltalk, and the original memory management scheme
on those machines was not unlike object swapping. There is no such thing as
a virtual address as such; you can only talk about a location within an
object. Since segments usually correspond to logical entities (such as
files, arrays, procedures, &c), this means that wild addressing out of an
array is simply impossible. On a Burroughs system, separate processes can
share objects (arrays, open files, &c) without having to share intervals of
their address space, worrying about page boundaries, &c.
"Segments" in the 8086 are indeed a kludge to buy you 20-bit addressing on a
16-bit machine. Just because two segment registers hold different values
doesn't mean that you can't address the same location through them. Segment
registers on the 80386 can be used for either purpose, depending on the
operating system, but there really aren't enough segment numbers to do much
in the way of segments-as-objects, and System V/386 simply ignores them.

The problem isn't 16-bit or 32-bit or N-bit address spaces; it's the
assumption that an address space is a one-dimensional array. Viva
tree-structured address spaces!
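[The aliasing mentioned above falls straight out of the 8086's real-mode
address arithmetic - physical = segment * 16 + offset - so many different
segment:offset pairs name the same byte:]

```c
#include <stdint.h>

/* Real-mode 8086 address formation: the 16-bit segment value is
 * shifted left 4 bits and added to the 16-bit offset, yielding a
 * 20-bit physical address.  Distinct pairs can alias the same byte. */
uint32_t phys(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

[For example, 1234:0005 and 1000:2345 both compute physical address
0x12345, which is why two segment registers holding different values can
still address the same location.]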
rminnich@udel.EDU (Ron Minnich) (05/09/88)
In article <52426@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
>Except that programs that duplicate that stuff in C (or whatever) code tend to
>do something useful when the subscript is out of range. For a somewhat
>trivial example, consider a program that reads a large array of numbers from a
>file, and then prompts the user for an array index and prints out the element
>of the array selected by that index.

i guess if you assume all these things are interactive, that is ok. What
about all the daemons that run in background and screw up? I would like
something, anything, better than a core file. Inferring just what happened
and where is often impossible. On the Burroughs I got a stack trace *when
the error happened*, not 15 hours after an index went awry and as a
side-effect clobbered something else, which some time later caused a core
dump. In addition, i got symbolic names and a real good summary of just
what went wrong.

In thinking about it, i guess i am arguing for more than two address spaces
per program. Right now we have code and data. Is there so much wrong with
having more than one data space?
--
ron (rminnich@udel.edu)
mcg@omepd (Steven McGeady) (05/09/88)
In article <353@cf-cm.UUCP> mch@computing-maths.cardiff.ac.uk (Major Kano) writes:
> This is a partial reprint of an article that I posted in mid March. ...
>
>In article <2904@omepd> mcg@iwarpo3.UUCP (Steve McGeady) writes:
>>
>>It can't be elegance of design, for (e.g.) the 80386 and the MIPSco processor
>>are each somewhat inelegant in their own ways (for those who don't wish to
>>fill in the blanks, segmentation and register model in the former case,
>                     ^^^^^^^^^^^^     ^^^^^^^^
> ** WHAT THE $@#@++%%& HELL ?!? **
>
> Wow ! And I thought Ray Coles (writes for Practical Computing, a UK
> Magazine) had it in for Intel !

After having taken the above quote completely out of context, Mr. Howe then
goes on to attempt to rekindle the justifiably extinct *86 memory model
discussion. To set the record straight, yet again:

1) I was not implying that there is anything wrong with the 80386 (or *86,
for that matter) memory or register models, simply that THERE MAY EXIST
PEOPLE WHO BELIEVE that there is something wrong. That particular rhetorical
distinction is apparently lost on Mr. Kano. The point, perhaps worth
repeating, is that there are some people who are as aesthetically offended
by the exposure of the pipeline in the MIPSco processors as others are
offended by segmentation, etc.

2) The 80386 butters my bread, and will continue to do so for some time.

3) Even if Mr. Howe's inference were correct, which it is not, I do not
speak for Intel Corporation. My views are my own, etc., and often they are
not even that, but temporarily adopted for didactic discourse.

[Mr. Howe then descends into discourse on the merits of segmentation, which
I have no interest in addressing.]

As a final plea, it would be awfully nice if we could have a discussion on
the network that did not devolve into a semiconductor race war: my chip is
{bigger,faster,longer,stronger,prettier} than yours. Yeesh.

S. McGeady
Intel Corp.
alexande@drivax.UUCP (Mark Alexander) (05/10/88)
In article <353@cf-cm.UUCP> mch@computing-maths.cardiff.ac.uk (Major Kano) writes about:
>... the advantages of segmenting, ie., ease of doing
>relocatable code, logical program design (code, data/heap stack separation),
>inter-task separation (LDT's) and a few other related features.

Maybe I'm slow, but I can't see how segmentation makes these things much
easier compared with a typical non-segmented paging system. We did a port of
FlexOS to both the 386 and the NEC V60, and the V60 was actually easier to
deal with, precisely because we didn't have to muck with all that segment
stuff. And on the V60 it was very easy to achieve all those Nice Things you
mentioned, like code/data separation, relocatability, inter-task protection,
etc.

Hoping someone can explain this to me in a followup article.
--
Mark Alexander  (UUCP: amdahl!drivax!alexande)
"Bob-ism: the Faith that changes to meet YOUR needs." --Bob (as heard on PHC)
guy@gorodish.Sun.COM (Guy Harris) (05/10/88)
> i guess if you assume all these things are interactive, that is ok.
> What about all the daemons that run in background and screw up?

If the problem is caused by, e.g., a control file being bad, or some network
input being bad, or something such as that, subscript-range checking should
not *need* to be present; again, the program should explicitly check for bad
input and report the error somehow.

> I would like something, anything, better than a core file.

I would like something better than a stack trace (the core file may be one
way to *get* this kind of information, BTW, especially if the
subscript-range-checking code causes a core dump when it detects an
out-of-range subscript). I would like to be told that I screwed up a
configuration file, or that host "xyzzy" sent me a bad packet, or something
like that.

I don't deny that subscript range checking can be useful for detecting bugs.
I would prefer, if this is feasible, a compiler that didn't let people put
those bugs into the code in the first place; e.g., one that would warn you
that a given array reference could have an index outside the bounds of the
array. One way to make sure it doesn't have an index outside the bounds of
the array is to put in explicit checking code which prints a useful message,
sends back an error packet, or whatever, if the index is invalid.

> In thinking about it, i guess i am arguing for more than two
> address spaces per program. Right now we have code and data.
> Is there so much wrong with having more than one data space?

No, I never said there was. However, segmentation isn't the *only* way to
get subscript range checking. It may be that other ways are more
cost-effective.
jk3k+@andrew.cmu.edu (Joe Keane) (05/10/88)
> In thinking about it, i guess i am arguing for more than two address spaces
> per program. Right now we have code and data. Is there so much wrong with
> having more than one data space?

This is what we really need, a number of separate areas in a flat address space. If you define segmentation as `addresses have a segment part and an offset part', it's really inferior to a flat address space; without different-sized segments, you waste address space. Of course, just like Intel did segmentation badly, Unix did flat address spaces badly. The idea of three segments is outdated.

--Joe
steckel@Alliant.COM (Geoff Steckel) (05/10/88)
Segmentation (as a method for putting data structures in separate address spaces) is a point on the vector towards capability machines, where the descriptor also includes access control on a more subtle level than the entire data structure. While interesting and potentially useful, it seems to be less useful than (say) a tagged architecture. Note that tags and capabilities are not mutually exclusive.

However, I assert that machines with segmented architectures which impose limits on segment size or addressability smaller than the machine global space will receive flames comparable to the 16-bit maximum 808x series. It is not the business of the machine designer to tell the programmer how s/he must use the address space. Programmers often must work at the far end of a machine's capabilities - artificial addressing limits or kluged-on address extensions cause either disastrous code or disastrous performance.

On the other hand, machines with implemented 30+ bit physical address memory exist now, and more are being made daily. Given the MIPS/megabyte ratios, a 100+MIP machine could usefully use a physical and virtual space greater than 32 bits. This subject has arisen in the past year on this newsgroup - my vote is for machines with 64 bit 'long's and 48+ bit pointers. Yes, a whole lot of C code would break if int and long aren't the same type. However, if some of the compiler work devoted to rearranging the code to run fast were devoted to a 'super lint' phase, it might not be so hard on the poor programmer...

geoff steckel (steckel@alliant.COM)
barmar@think.COM (Barry Margolin) (05/10/88)
In article <3095@edm.UUCP> rroot@edm.UUCP (uucp) writes:
>From article <353@cf-cm.UUCP>, by mch@computing-maths.cardiff.ac.uk (Major Kano):
>> (As an aside, I've heard of 68000 routines doing all kinds of contortions to
>> check for/avoid overflow because the 68K traps on (eg., zerodivide) and traps
>> into SUPERVISOR mode (believe it or not).
>Yes, the Divide instruction seems to unconditionally except on a zero
>divide, but if you REALLY want to ignore zero divide, you can have the
>interrupt vector point to an RTE instruction (or an OR/RTE pair if you
>want to set the condition code).

Read the original message again, more carefully. He wasn't complaining so much about the fact that divide by zero results in a trap, but that it traps into SUPERVISOR mode, even though the program that executed the divide instruction was running in USER mode. Why should a zero-divide need to be handled by the protected kernel, rather than simply trapping to a user handler?

Barry Margolin
Thinking Machines Corp.
barmar@think.com
uunet!think!barmar
daveb@geac.UUCP (David Collier-Brown) (05/10/88)
In article <2430@louie.udel.EDU> rminnich@udel.EDU (Ron Minnich) writes:
>Before any one goes off the handle about cost, remember the cost
>of all those programs that duplicate this stuff in C code.

Actually I prefer to deal with bounds-checking at the segment-descriptor level: shrinking a descriptor around my array makes it possible for the hardware (memory management unit) to do the checks in parallel with the fetch... All I get is a need for an exception handler (once), not a need to write or generate a test-and-branch after every reference.
--
David Collier-Brown. {mnetor yunexus utgpu}!geac!daveb
Geac Computers International Inc., | Computer Science loses its
350 Steelcase Road, Markham, Ontario, | memory (if not its mind)
CANADA, L3R 1B3 (416) 475-0525 x3279 | every 6 months.
sjs@spectral.ctt.bellcore.com (Stan Switzer) (05/11/88)
I agree with the consensus here that

1) Segments can be a "Good Thing"
2) but do not solve the problem of not enough addressing bits
3) and are not significantly superior to a partitioned (sparse, flat) addressing space
4) and unless we can come up with a better model
5) which Burroughs did rather nicely (in its day)
6) and which is reflected, for instance, in the Smalltalk Virtual Machine (See also Pleasy/80(?), et al.)
7) but which for lack of sufficient numbers of segments, is unimplementable with the '386 architecture (that is, making GOOD use of segments)
8) and in any event will do little good in today's environments which are dominated by UNIX, MVS, VMS, and other systems which believe memory is (basically) flat.

So, I conclude that until we have languages and environments that can make good use of segments (e.g. Smalltalk) and architectures that support enough segments (shades of 432), flat's where it's at.

BTW, I LIKE segments, but you'll never find me programming a '286 or lower.

Stan Switzer sjs@ctt.bellcore.com bellcore!ctt!sjs
asg@pyuxf.UUCP (alan geller) (05/11/88)
In article <353@cf-cm.UUCP>, mch@computing-maths.cardiff.ac.uk (Major Kano) writes:
> ...
> The 32-bit within-segment addresses are what
> people have been waiting for for ages. I would question the fact that
> only 16 bit selectors are avaliable, but I defy anyone to come up, in the
> near or intermediate future, with an Intel-style memory model that is
> better than Intel's, without opening up a whole can of voracious
> memory-eating killer-worms at the descriptor table level. If you don't
> know what I mean by that, pretend for a moment, that YOU were the
> person who had to come up with the byte/page granularity
> kludge in order to make 4GB segments fit in a DT entry.
> ...
> I find the segmented architecture is a natural way of protecting programs
> and data, and although we often hear negative remarks, no-one seems to be able
> to give any coherent justification of such remarks. In particular. no-one has
> EVER in my experience directly compared segmenting to straight linear/paging
> without TOTALLY ignoring the advantages of segmenting, ie., ease of doing
> relocatable code, logical program design (code, data/heap stack separation),
> inter-task separation (LDT's) and a few other related features.

I don't get it -- what is the advantage of the segmented architecture over the memory mapping model provided by, say, Motorola's PMMU (68851?)?

- Why is relocatable code easier with a segmented architecture? I presume you refer to the ability to reset the base of the code and data segments, so that in principle every procedure could think that it starts at (virtual) 0. Of course, this means reloading these segment pointers on every procedure call/return, plus having to use long pointers (64 bits!! including the extra segment register) to access some other procedure's local data. While it is true that this could be simulated on the PMMU, it would be such a kludge as to bring any system to a halt.
On the other hand, it's not going to do great things for your performance on the 386, either.

- The aforementioned PMMU allows you to separate the data and code address spaces. This does still allow the possibility of stack and heap contention (in a 4 GB arena), but does separate the code from the data, so that code 0 and data 0 are separate. It is also possible, with the 68020 and the PMMU, to still force a code or data access for any given data reference, as one can do with segment prefixes on the 386.

- The PMMU supports multiple translation tables. There are special coprocessor instructions that support saving and restoring the mapping state, and the translation cache supports multiple processes. What does the 386 have extra?

- What related features? You mean, like the ability to have variable page sizes? You mean, like the ability to provide the result of a logical-to-physical translation to the CPU? You mean, like the ability to see if an address is mapped (probe)?

Sure, the 386 model works, but I fail to see how it adds any new power over and above the Motorola PMMU (or the venerable VAX memory model, which is similar). It certainly adds complexity (data segment? extra segment? and what about these new segments? what do they all mean???).

Alan Geller
Bellcore
...!{rutgers,princeton}!bellcore!pyuxp!pyuxf!asg
I'm not responsible for what I said, so how can my employers be??
asg@pyuxf.UUCP (alan geller) (05/11/88)
In article <953@cresswell.quintus.UUCP>, ok@quintus.UUCP writes:
> ...
> "Segments" on the Burroughs machines are like "objects" in SmallTalk,
> and the original memory management scheme on those machines was not
> unlike object swapping. There is no such thing as a virtual address
> as such, you can only talk about a location within an object. Since
> segments usually correspond to logical entities (such as files,
> arrays, procedures, &c) this means that wild addressing out of an
> array is simply impossible. On a Burroughs system, separate processes
> can share objects (arrays, open files, &c) without having to share
> intervals of their address space, worrying about page boundaries &c.
>
> "Segments" in the 8086 are indeed a kludge to buy you 20-bit addressing
> on a 16-bit machine. Just because two segment registers hold different
> values doesn't mean that you can't address the same location through them.
>
> Segment registers on the 80386 can be used for either purpose, depending
> on the operating system, but there really aren't enough segment numbers
> to do much in the way of segments as objects, and system V/386 simply
> ignores them.
>
> The problem isn't 16-bit or 32-bit or N-bit address spaces,
> it's the assumption that an address space is a one-dimensional array.
> Viva tree-structured address spaces!

You mean, like the memory model provided by hierarchical page translation tables, such as the VAX or Motorola PMMU?

Alan Geller
Bellcore
...!{princeton,rutgers}!bellcore!pyuxp!pyuxf!asg
If I don't know what I'm saying, how can my employer?
elg@killer.UUCP (Eric Green) (05/12/88)
in article <3384@drivax.UUCP>, alexande@drivax.UUCP says:
> In article <353@cf-cm.UUCP> mch@computing-maths.cardiff.ac.uk (Major Kano)
> writes about:
>>... the advantages of segmenting, ie., ease of doing
>>relocatable code, logical program design (code, data/heap stack separation),
>>inter-task separation (LDT's) and a few other related features.
>
> Maybe I'm slow, but I can't see how segmentation makes these things
> much easier, compared with a typical non-segmented paging system. We
> did a port of FlexOS to both the 386 and the NEC V60, and the V60 was
> actually easier to deal with precisely because we didn't have to muck
> with all that segment stuff.

There's one case where segmentation (of code) is a Big Win (segmentation of data space is almost never a win): Shared libraries. I understand that Sys V.3 has a kludge that reserves various places in a (linear) 32-bit address space for certain libraries and types of libraries. This is necessary because the libraries must appear at the same address in every process (because otherwise they would need to be re-linked for each process, on most machines, which don't support relocatable code -- thus throwing away all the advantages of shared libraries).

On a segmented machine, all shared libraries could start at address 0, in different segments in different processes. It is a much more elegant solution than chopping your address space in such a manner that you cannot have two windowing libraries loaded at the same time.
--
Eric Lee Green {cuae2,ihnp4}!killer!elg
Snail Mail P.O. Box 92191 Lafayette, LA 70509
"Is a dream a lie that don't come true, or is it something worse?"
gregg@a.cs.okstate.edu (Gregg Wonderly) (05/12/88)
Instruction stream references are vastly different from those of data, and for that very reason, I would vote for a variable size segment capability. To clarify, typically you either need a 'functional unit of a program' or you don't, i.e., you are only executing one procedure at a time, so either the whole thing should be resident, or none of it (I don't write 4.2Gbyte functions, do you?). Even for the 64K segment processors, the compiler can discover how a local branch within a function > 64K should be handled, i.e., is it a NEAR or a FAR branch.

Data references on the other hand are highly erratic. For this reason, data memory management should be page oriented so that only the most necessary portions are present. Because of the typical way that data addresses are calculated using integer arithmetic, the data address space should be representable in a single general purpose register. If variable length segments are used, so that each data entity is contained within a segment, there should be some paging mechanism provided within the segmentation to make unreferenced address space available for allocation to other processes.

One of the major objections to the Intel segmentation is that when you finally escape the space limitations by using Large/Huge models, you are carrying around a lot more baggage in each instruction. Random array references can not really be done using integer arithmetic because some !@$#$$% engineer decided to place the RPL bits and the LDT/GDT selector bit in the lower order bits of the selector number. OS/2 claims to support Huge model on the 80286, which sends shivers up my spine. Would someone care to tell me how one might arrive at a linear address space (I could get real rich, fast)? My speculation is that the GDT is not used, so the 3 lower order bits are always set. Thus, normal integer arithmetic would cause these bits to be cleared when an overflow from the offset portion carried into the selector portion.
Using this selector would cause an addressing error. The exception code could then check those 3 bits, and re-set them to all ones, and restart the instruction stream. Presto, you have a linear address space.

THE PROGRAMMER SHOULD NEVER SEE A HARDWARE IMPOSED CONSTRAINT ON ADDRESS SPACE REFERENCES.

Gregg Wonderly
Department of Computing and Information Sciences
Oklahoma State University
UUCP: {cbosgd, ihnp4, rutgers}!okstate!gregg
Internet: gregg@A.CS.OKSTATE.EDU
rroot@edm.UUCP (uucp) (05/12/88)
From article <2429@louie.udel.EDU>, by rminnich@udel.EDU (Ron Minnich):
> In any real program, there are a large number of legal regions, which
> you don't want to get confused. Flat address space machines do not help
. . . .
> segments help enforce. In a flat address space, two arrays of structures
> butted against each other can (and do!) become confused; in a segmented
> machine done right, they can't; in most Intel machines, they can (and do!)
> become confused (in the tiny, small, medium, and large models, for example).

It can be done with either system, assuming that you have an MMU that can handle it. In either case, you just have to assign a separate page to each item. This costs in page descriptors. However with the way that the Intel '286 works, you're already paying that cost in a lot of cases, so it's a lot more worthwhile to take advantage of what's already been paid for. I think this is why the 'one segment per article' rule tends to be more available with the '286 (esp. once you go beyond the small memory model).
--
-------------
Stephen Samuel
{ihnp4,ubc-vision,vax135}!alberta!edm!steve
or userzxcv@uqv-mts.bitnet
rroot@edm.UUCP (uucp) (05/12/88)
From article <20618@think.UUCP>, by barmar@think.COM (Barry Margolin):
> In article <3095@edm.UUCP> rroot@edm.UUCP (uucp) writes:
>>From article <353@cf-cm.UUCP>, by mch@computing-maths.cardiff.ac.uk (Major Kano):
>>> (As an aside, I've heard of 68000 routines doing all kinds of contortions to
>>> check for/avoid overflow because the 68K traps on (eg., zerodivide) and traps
>>> into SUPERVISOR mode (believe it or not).
> Read the original message again, more carefully. He wasn't
> complaining so much about the fact that divide by zero results in a
> trap, but that it traps into SUPERVISOR mode, even though the program

He seemed to be complaining about all the contortions that he heard that programs go through to check for zero divide (and seemed to assume that it was also necessary for other overflow-type things). I was basically defending the existence of -- and presumed logic behind -- the trap.

The problem with having the zerodivide interrupt trap into USER state is that it would mess up the whole world. RTI would then have to become a non-privileged instruction, and user programs would have to set up for it whether they cared about recovering from zero divides or not (rather than telling the OS when they did). It's easier to emulate a vector into user state when you start in supervisor than it is to go the other way around without introducing some weird contortions on both the user and supervisor side of things.
--
-------------
Stephen Samuel
{ihnp4,ubc-vision,vax135}!alberta!edm!steve
or userzxcv@uqv-mts.bitnet
allan@didsgn.UUCP (didsgn) (05/12/88)
In article <20618@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
> In article <3095@edm.UUCP> rroot@edm.UUCP (uucp) writes:
> >From article <353@cf-cm.UUCP>, by mch@computing-maths.cardiff.ac.uk (Major Kano):
> >> (As an aside, I've heard of 68000 routines doing all kinds of contortions to
> >> check for/avoid overflow because the 68K traps on (eg., zerodivide) and traps
> >> into SUPERVISOR mode (believe it or not).
> >Yes, the Divide instruction seems to unconditionally except on a zero
> >divide, but if you REALLY want to ignore zero divide, you can have the
> >interrupt vector point to an RTE instruction (or an OR/RTE pair if you
> >want to set the condition code).
>
> Read the original message again, more carefully. He wasn't
> complaining so much about the fact that divide by zero results in a
> trap, but that it traps into SUPERVISOR mode, even though the program
> that executed the divide instruction was running in USER mode. Why
> should a zero-divide need to be handled by the protected kernel,
> rather than simply trapping to a user handler?

The reason that you cannot trap into a user handler is that ALL programs, both user and supervisor, use the same trap handling routine (located at the address pointed to at location 0x14). Since the same handler is used by all processes (including the kernel), the processor must go to supervisor mode in order to handle whatever it wishes to do. Also, since the location of the handler routine is stored in low memory, most MMUs will prevent a user mode process from accessing it. So, in this architecture, the processor must go into supervisor mode to be able to read the location of the handler as well as handle it. If it was the kernel that caused the trap, then appropriate recovery must be done; if it was a user process, then the process must be marked as experiencing a "divide by zero" trap.
A possible change to this architecture would be to let each process have its own "trap vectors" stored in the process's low memory and have the processor fetch the vector as normal, in user mode, and execute the appropriate routine as necessary. The problem with this is each type of trap must be set as either a user mode trap or a supervisor trap. The 68000 supports an instruction called "TRAP #x" where x ranges from 0 to 15. Under most systems, one of these TRAP instructions is used to communicate with the operating system. Thus, some of these TRAP instructions must go into supervisor mode and to a common routine (as it is above) and perform the necessary function. All of this implies a new (privileged) set of instructions for loading and modifying the "trap type mask" (user or supervisor trap) for each process.

Currently, the 68000 supports 48 different types of traps (note: approximately 14 are reserved, but must be considered for the future) while the 68020 uses 64 (same note applies). Thus, during a context switch, all this new information must be saved and restored each time. In order to avoid this, (I'll assume, and yes I know what that stands for :-) ) the designers said, "Since the processor, in supervisor mode, can simulate to each of the user mode processes exactly what happened in the trap, and can place that process in the correct position to perform the (local) trap handler, all traps shall be to supervisor mode."

This simplifies the hardware and does not overly complicate the operating system. The OS can easily control what type of response or action is required for each trap, including allowing a "user trap" instruction. (This is where a user program, issuing a 'TRAP #x' call, can have a local handler to perform whatever function is desired. An example of this capability was seen in Motorola's VERSAdos Operating System.) So, in conclusion, a trap to supervisor mode on a zero divide is not unreasonable.
Rather than complicate the hardware or second guess whether the user wishes the trap or not, Motorola made it the responsibility of each OS to determine what the proper response should be. If "no action" is desired, so be it. But, if a special routine must be executed to flag to the process that something is wrong, then that is also possible. This allows the 68000 to remain very flexible. Allan G. Schrum gatech!rebel!didsgn!allan
henry@utzoo.uucp (Henry Spencer) (05/12/88)
> ... it traps into SUPERVISOR mode, even though the program
> that executed the divide instruction was running in USER mode. Why
> should a zero-divide need to be handled by the protected kernel,
> rather than simply trapping to a user handler?

Probably because practically every machine in existence routes *all* traps and interrupts to the kernel, which can pass them on to the user if it pleases. I know of no machine, offhand, whose hardware has any notion of a "user handler".
--
NASA is to spaceflight as | Henry Spencer @ U of Toronto Zoology
the Post Office is to mail. | {ihnp4,decvax,uunet!mnetor}!utzoo!henry
johnl@ima.ISC.COM (John R. Levine) (05/13/88)
In article <4053@killer.UUCP> elg@killer.UUCP (Eric Green) writes:
>There's one case where segmentation (of code) is a Big Win (segmentation of
>data space is almost never a win): Shared libraries. ...
>
>On a segmented machine, all shared libraries could start at address 0, in
>different segments in different processes. It is a much more elegant solution
>than chopping your address space in such a manner that you cannot have two
>windowing libraries loaded at the same time.

You'd think so, wouldn't you? Unfortunately, on the 286, code segments invariably include segment:offset addresses for jump and call instructions in-line in the code. It's also quite common to have sequences like this:

	mov	ax,seg foop	; get segment number of something
	mov	es,ax		; put it in a segment register
	mov	dx,es:something	; get the thing out of its segment

The first mov instruction also has a segment number in-line in the code. The practical effect is to require that shared libraries be bound to fixed addresses when they are loaded, and that they be bound to the same segment numbers in each process in which they are used. Gordon Letwin's article on OS/2 in the current Byte goes into considerable detail describing the backflips he had to go through to make shared libraries work, including reserving in all of the segment tables a range of segment numbers when it loads a library, then making those segments point to the library in the tasks that are using it, and making them invalid in all other tasks.

The issue of binding shared libraries to address spaces is not exactly a new one. TSS/360 did a reasonable job of it in 1969, on a non-segmented architecture, taking the approach that code segments had to be 100% pure and contain no relocatable addresses at all, and that at each procedure call the caller passed to the callee the address of the callee's data segment.
Each routine kept the addresses of all of its callees' data segments in its own data segment, and there was some hack to pass the address of the main routine's data in the initial call. The 360 has no direct addressing, so almost all data addressing is done based on a pointer either loaded from memory or passed in somehow as a parameter; the extra effort to do stuff the TSS way was very low. (TSS had other problems, but shared libraries wasn't one of them.)

I suppose they could have enforced a rule like this in OS/2, since all of the OS/2 code is new or at least recompiled. But it would be a horrible hack. Where would you pass the segment number -- as an extra argument on the stack, in the DS or ES, or somewhere else? If it's an extra argument, it creates considerable excitement for the many programmers who use slightly non-standard calling sequences. If in a segment register, there's a serious performance hit because reloading a segment register is very slow, even if the new value is the same as the old.

The message here is that although the *86's segmentation scheme is somewhat less awful than the bank-switching kludges used on the Z80, it doesn't solve the problems that segmentation normally does, and so hardly deserves the same name as the addressing scheme in Multics or the B5000. (end of diatribe)
--
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn. -G. B. Shaw
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (05/13/88)
In article <1988May12.162207.16764@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Probably because practically every machine in existence routes *all*
>traps and interrupts to the kernel, which can pass them on to the user
>if it pleases. I know of no machine, offhand, whose hardware has any
>notion of a "user handler".

Well, here's your first: the Cyber 205 architecture has a "data flag branch register" which is under user control. Part of the processor context is this register which tells what to do with various conditions as well as what conditions exist. You can ignore conditions, trap to a user supplied error handler, or trap to a default data flag branch manager subroutine, all in the user's process address space, and without ever going to the kernel. The price paid for this is the cost of one extra dedicated context register and two regular registers (out of 256 on this machine).

It should be noted that historically, CDC and Cray have provided a larger set of user context registers (the "exchange package" or "invisible package") which help provide fast context switches and reduce the cost of handling interrupts. At a cost of some expensive real estate - high speed registers which are not available to the user.
--
Hugh LaMaster, m/s 233-9, UUCP {topaz,lll-crg,ucbvax}!ames!lamaster
NASA Ames Research Center ARPA lamaster@ames.arpa
Moffett Field, CA 94035 ARPA lamaster@ames.arc.nasa.gov
Phone: (415)694-6117
glennw@nsc.nsc.com (Glenn Weinberg) (05/13/88)
In article <4053@killer.UUCP> elg@killer.UUCP (Eric Green) writes:
>There's one case where segmentation (of code) is a Big Win (segmentation of
>data space is almost never a win): Shared libraries. I understand that Sys V.3
>has a kludge that reserves various places in a (linear) 32-bit address space
>for certain libraries and types of libraries. This is necessary because the
>libraries must appear at the same address in every process (because otherwise
>they would need to be re-linked for each process, on most machines, which
>don't support relocatable code -- thus throwing away all the advantages of
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>shared libraries).
>
>On a segmented machine, all shared libraries could start at address 0, in
>different segments in different processes. It is a much more elegant solution
>than chopping your address space in such a manner that you cannot have two
>windowing libraries loaded at the same time.

<<Mild flame on>> This is the kind of thing that really frosts me. "Oh, gee, no one wants to take the trouble to build relocatable code, so let's just come up with a kludgey solution rather than doing it right." There is absolutely no reason why language tools and software practices cannot be changed starting immediately to produce relocatable code. Most companies are already rebuilding their tools for RISC architectures anyway. It's just not that hard to write relocatable code from a programming point of view (I was doing it 10 years ago!). It's also not that hard to write code to support relocatable shared libraries.

It's yet another of those things that Multics supported over 20 years ago, but that we somehow seem to believe is "just too hard" to do now. When I was at Prime 8 years ago (I know all about the lousy reputation Prime has in the Un*x community) we had shared libraries and were developing relocatable shared libraries. We had dynamic linking for system calls, too. It's not like the technology doesn't exist!
Of course it takes more effort to write relocatable code. It takes more effort to write pure code, too, but somehow most of us manage to do it every day. And there's nothing forcing anyone to write relocatable code or relocatable shared libraries. I would just like to believe that people can see the advantages of these features and would want to take advantage of them. <<Flame off>> -- Glenn Weinberg Email: glennw@nsc.nsc.com National Semiconductor Corporation Phone: (408) 721-8102 (My opinions are strictly my own, but you can borrow them if you want.)
johnl@ima.ISC.COM (John R. Levine) (05/14/88)
In article <1988May12.162207.16764@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
> I know of no machine, offhand, whose hardware has any notion of a "user
> handler".

Funny you should mention that. On the 286 and 386, traps can go to "conforming" segments that run in the protection domain of the caller. You could imagine a generic zero-divide handler that looked around to see what mode it is in and jumps off accordingly. Considering the cost of even the microcode-assisted context switches on the 286, it probably wouldn't be a bad idea. I have no idea whether any actual 286 or 386 operating systems do that, though.
--
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn. -G. B. Shaw
earl@mips.COM (Earl Killian) (05/14/88)
From: henry@utzoo.uucp (Henry Spencer)
Newsgroups: comp.arch
Date: 12 May 88 16:22:07 GMT
Organization: U of Toronto Zoology

> ... it traps into SUPERVISOR mode, even though the program
> that executed the divide instruction was running in USER mode. Why
> should a zero-divide need to be handled by the protected kernel,
> rather than simply trapping to a user handler?

Probably because practically every machine in existence routes *all* traps and interrupts to the kernel, which can pass them on to the user if it pleases. I know of no machine, offhand, whose hardware has any notion of a "user handler".

The PDP-10's LUUO instructions trap to a handler in the user address space. They are commonly used in PDP-10 programs to extend the instruction set.
--
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086
neubauer@bsu-cs.UUCP (Paul Neubauer) (05/14/88)
In article <8722@ames.arc.nasa.gov> Hugh LaMaster writes:
>In article <1988May12.162207.16764@utzoo.uucp> Henry Spencer writes:
>>Probably because practically every machine in existence routes *all*
>>traps and interrupts to the kernel, which can pass them on to the user
>>if it pleases. I know of no machine, offhand, whose hardware has any
>>notion of a "user handler".
>
>Well, here's your first: the Cyber 205 architecture has a "data flag
>branch register" which is under user control. Part of the processor
>context is this register which tells what to do with various
>conditions as well as what conditions exist.

and in another article John R. Levine writes:
>Funny you should mention that. On the 286 and 386, traps can go to
>"conforming" segments that run in the protection domain of the caller.

A third machine (family) (not very exotic, in fact, downright mundane) that permits user-mode, user-written traps is the IBM 370 series. The Program Status (double) Word is a 64-bit double-word that contains the address of the next instruction, a condition code, and some other information on the status of the process. There are also 5 8-byte locations in low (virtual) memory where a programmer can put predefined PSW's for 5 classes of interrupts, so that when an interrupt makes that PSW current, the process will be placed into the appropriate error-handler for that interrupt class. For any given interrupt, the interrupt handler for that class can test an interruption code to see if it is an interrupt that it wants to handle, e.g., a handler for the class of "program" interrupts could be arranged to recover relatively gracefully from a divide-by-zero exception, but just allow the program to abend (crash) on an addressing exception for out-of-limits addresses, or vice-versa. This can all be done in user-mode with no special privileges.
--
Paul Neubauer
neubauer@bsu-cs.UUCP
<backbones>!{iuvax,pur-ee,uunet}!bsu-cs!neubauer
ok@quintus.UUCP (Richard A. O'Keefe) (05/14/88)
In article <316@pyuxf.UUCP>, asg@pyuxf.UUCP (alan geller) writes:
> In article <953@cresswell.quintus.UUCP>, ok@quintus.UUCP writes:
> > The problem isn't 16-bit or 32-bit or N-bit address spaces,
> > it's the assumption that an address space is a one-dimensional array.
> > Viva tree-structured address spaces!
>
> You mean, like the memory model provided by hierarchical page translation
> tables, such as the VAX or Motorola PMMU?

Each process on a VAX sees a LINEAR address space. Page tables are part of the implementation of that, and are not part of the applications programmer's view of the architecture. I don't know much about the Motorola PMMU, but what I recall suggests that it is again implementation cruft supporting a LINEAR address space.

People are complaining about 32-bit address spaces because they have their "an address space is an array of storage units, each of which is identified by a single integer" blinkers on. I explicitly said "viva tree-structured address spaces". {OK, I should *really* have said "graph-structured".} The kind of thing I have in mind is a system where the virtual memory space is pretty much like the UNIX file system: to get to a byte, you specify a sequence of integers, like a Dewey number. A very simple case would be that a segment is either an array of bytes or an array of segments. There might be more than one path to a segment, or there might not. On the B6700, physical memory was limited to 6 Mbytes, but the only limit (in principle) on the size of your virtual memory was the amount of disc you had for paging.
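[Editorial aside: O'Keefe's "segment is either an array of bytes or an array of segments" can be sketched directly as a data structure. All names below are invented for illustration; an address is a path of integers, and a bad step is the "segment fault":]

```c
/* Hypothetical sketch of a tree-structured address space: a segment
   is either a leaf (array of bytes) or a node (array of segments). */
#include <stddef.h>

typedef struct Segment Segment;
struct Segment {
    int is_leaf;
    size_t n;                 /* bytes in a leaf, children in a node */
    union {
        unsigned char *bytes; /* is_leaf: the storage itself */
        Segment **kids;       /* !is_leaf: sub-segments */
    } u;
};

/* Follow path[0..depth-2] through sub-segments, then path[depth-1]
   picks a byte.  Returns NULL on any out-of-bounds step. */
unsigned char *resolve(Segment *s, const size_t *path, size_t depth)
{
    for (size_t i = 0; i + 1 < depth; i++) {
        if (s->is_leaf || path[i] >= s->n) return NULL;
        s = s->u.kids[path[i]];
    }
    if (!s->is_leaf || depth == 0 || path[depth - 1] >= s->n) return NULL;
    return &s->u.bytes[path[depth - 1]];
}
```

Hardware would keep per-segment bounds in descriptors rather than walking pointers, but the addressing model -- a sequence of integers instead of one integer -- is the same.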
rgr@m10ux.UUCP (Duke Robillard) (05/15/88)
In article <2711@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
> If the segments are invisible to the HLL programmer (usually by
>being "big enough"), they're a win.  The only thing is...
>	16 bits of addressability is visibly too little
>	32 bits has been described as too small
>	36 bits was **FOUND** to be too small, about 10 years ago

Hang on a sec -- doesn't the fact that the VAX uses the first two address bits to determine which page table to use mean that it has Intel-ish segmentation? Seems to me that 30 bits has been big enough so far. Has anyone out there ever written a program that used up a VAX's virtual memory?
-- 
Duke Robillard    | AT&T Bell Labs    m10ux!rgr@att.UUCP
Murray Hill, NJ   | {backbone!}att!m10ux!rgr
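[Editorial aside: Duke's observation made concrete -- the VAX splits its 32-bit virtual address on the top two bits, each region getting its own page table and 2^30 bytes:]

```c
/* The top two bits of a VAX virtual address select the region:
   0 = P0 (program), 1 = P1 (control/stack), 2 = system, 3 = reserved.
   Each region spans 2^30 bytes -- hence "30 bits" per region. */
#include <stdint.h>

uint32_t vax_region(uint32_t va)
{
    return va >> 30;
}
```

Whether this counts as "Intel-ish segmentation" is exactly the question: the split is fixed by the architecture, not chosen per reference by a selector.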
billo@cmx.npac.syr.edu (Bill O) (05/16/88)
In article <3565@okstate.UUCP> gregg@a.cs.okstate.edu (Gregg Wonderly) writes:
>Instruction stream references are vastly different from those of data, and
>for that very reason, I would vote for a variable size segment capability.
>To clarify, typically you either need a 'functional unit of a program' or
>you don't i.e. you are only executing one procedure at a time, so either
>the whole thing should be resident, or none of it (I don't write 4.2Gbyte
>functions do you?).

Actually, this isn't necessarily true. Example: a main procedure reads in and manipulates data, calls a subprogram to do a lengthy computation on the data, then, upon completion of the subprogram, prints some results. In a segmented system, the first and last portions of this main procedure would be in the same segment, and yet the last portion isn't needed until after the completion of some other lengthy computation. In a paged system, the pages that comprise that last portion wouldn't reside in memory until needed.

The point of view that Gregg expresses was one of the stated premises for the design of the Burroughs B{5,7}6700, and yet I have never seen any studies (I may have missed them -- please tell me if you know of some) on the effect of (unpaged) procedural segmentation on effective working-set size. "Working set" is a term usually applied to paged systems, but one could still apply it to systems like the B6700 by studying segment-fault rates vs. the amount of real memory devoted to code. This could then be compared to the performance of paging in the same context (hmmm, I wonder what the terms of comparison would be -- maybe I/O rates). Unpaged segmentation may still be a win over paging; I just don't think it can be assumed to be so.

>Even for the 64K segment processors, the compiler
>can discover how a local branch within a function > 64K should be
>handled i.e. is it a NEAR or a FAR branch.

Yes, but most linear address space machines also have near and far branches.
Near branches allow code to be more compact than would be the case if full-size addresses were used exclusively. In fact, linear address space code may be more compact than code on a "real" segmentation machine (like Multics or the B6700), because when calling procedures in other segments, the latter typically has to provide a segment specifier as well as an offset into the segment. Even the 80(2)86, if I remember correctly, requires a 32-bit pointer for long jumps, even though the machine has only a 20-bit address space. Still, code compactness may not be a big issue in segmentation -- it could still be a performance winner overall.

>Data references on the other hand are highly erratic.  For this reason,
>data memory management should be page oriented so that only the most
>necessary portions are present.  Because of the typical way that data addresses
>are calculated using integer arithmetic, the data address space should be
>representable in a single general purpose register.  If variable length segments
>are used, so that each data entity is contained within a segment, there
>should be some paging mechanism provided within the segmentation ...

As in Multics...

>...to make
>unreferenced address space available for allocation to other processes.

I think by this you mean "to make the use of real memory more efficient by applying the principle of locality to data references via a paging mechanism". You may have a point there.

>Gregg Wonderly
>UUCP:     {cbosgd, ihnp4, rutgers}!okstate!gregg
>Internet: gregg@A.CS.OKSTATE.EDU

I am actually a fan of segmentation (if done right, as in Multics) for a lot of reasons, such as dynamic linking, increased security, and the blending of the concepts of "file" and "segment" into a unified whole. One way that Multics has been maligned was the choice of what we now regard (in this age of megabit DRAMs) as relatively small limits on segment sizes.
I am certain that that problem would have been addressed in the successor to Multics if it hadn't been killed by Honeywell. With gigabyte segment sizes (combined with paging), one would have the choice between a paged, monolithic address and data space, or logical segmentation. And it should be noted that even data space can sometimes benefit from logical segmentation -- consider how data-base record locking might be done with a good segmentation system (one segment per record, maybe?). Would anyone from Honeywell or Burroughs (or former employees familiar with these systems) care to comment on or add to the above? Paul S., are you out there?
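[Editorial aside: the 80(2)86 far-pointer cost mentioned above is easy to make concrete. A real-mode 8086 "far" pointer carries 32 bits (16-bit segment plus 16-bit offset) even though the resulting linear address is only 20 bits:]

```c
/* Real-mode 8086 address arithmetic: linear = segment * 16 + offset.
   A 32-bit far pointer thus names a 20-bit address -- the code-size
   cost of inter-segment references that Bill describes. */
#include <stdint.h>

uint32_t linear_8086(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

Note also that many segment:offset pairs alias the same linear address (e.g. 0x1000:0x0010 and 0x1001:0x0000), which is part of why these "segments" are not segments in the Multics sense.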
henry@utzoo.uucp (Henry Spencer) (05/16/88)
> There's one case where segmentation (of code) is a Big Win... Shared
> libraries. ...Sys V.3 has a kludge that reserves various places in a
> (linear) 32-bit address space for certain libraries and types of libraries.
> This is necessary because the libraries must appear at the same address in
> every process... On a segmented machine, all shared libraries could start
> at address 0, in different segments in different processes.  It is a much
> more elegant solution...

This depends an awful lot on the details of your MMU. In particular, the problem with shared libraries is usually not code, but data: it's not that big a deal to make the code position-independent on most machines, but doing that for data is usually a colossal pain. Segmentation is a win only if the segment numbers of shared-library code and data can be chosen at run time without performance penalty. Otherwise it's the same as taking a 32-bit linear address space and calling the top 8 bits "segment number" and the bottom 24 "offset". Contrariwise, if a 32-bit-linear-address machine has some careful support for shared libraries, as in National's 32000 series for example, it can do exactly the same things as a segmented machine.

The fundamental requirement for efficient relocatable shared libraries is the ability to choose the address of a shared library's data (and code) at run time without having to alter the code or use grossly slow address arithmetic everywhere. Support for this is quite independent of what sort of address space you have: some linear-address machines can do it, and some segmented ones can't.
-- 
NASA is to spaceflight as   | Henry Spencer @ U of Toronto Zoology
the Post Office is to mail. | {ihnp4,decvax,uunet!mnetor}!utzoo!henry
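[Editorial aside: Henry's requirement -- pick the data address at run time without patching the shared code -- is met by addressing globals through a per-process base rather than an absolute address. A hedged sketch, all names invented:]

```c
/* Sketch: the "shared code" (lib_tick) contains no absolute data
   address; each process hands it a base pointer chosen at load time,
   so one copy of the code serves data at different addresses. */
#include <stdlib.h>

struct lib_globals { int call_count; };

static void lib_tick(struct lib_globals *base)
{
    base->call_count++;
}

/* Two simulated "processes" sharing the code but not the data. */
int demo(void)
{
    struct lib_globals *p1 = calloc(1, sizeof *p1);
    struct lib_globals *p2 = calloc(1, sizeof *p2);
    lib_tick(p1); lib_tick(p1); lib_tick(p2);
    int ok = (p1->call_count == 2 && p2->call_count == 1);
    free(p1); free(p2);
    return ok;
}
```

In real systems the "base pointer" is a dedicated register or an indirection table filled in by the loader; segmentation is just one way of providing it, which is Henry's point.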
rroot@edm.UUCP (uucp) (05/16/88)
From article <3039@bsu-cs.UUCP>, by neubauer@bsu-cs.UUCP (Paul Neubauer):
> In article <8722@ames.arc.nasa.gov> Hugh LaMaster writes:
>>In article <1988May12.162207.16764@utzoo.uucp> Henry Spencer writes:
>>>Probably because practically every machine in existence routes *all*
>>>traps and interrupts to the kernel, which can pass them on to the user
>>>if it pleases.  I know of no machine, offhand, whose hardware has any
>>>notion of a "user handler".
> A third machine (family) (not very exotic, in fact, downright mundane) that
> permits user-mode, user-written traps is the IBM 370 series.  The Program
> Status (double) Word is a 64-bit double-word that contains the address of
> the next instruction, a condition code, and some other information on the
> status of the process.  There are also 5 8-byte locations in low (virtual)
> memory where a programmer can put predefined PSW's for 5 classes of
> interrupts, so that when an interrupt makes that PSW current, the process
> will be placed into the appropriate error-handler for that interrupt class.
> out-of-limits addresses, or vice-versa.  This can all be done in user-mode
> with no special privileges.

I beg to differ (sort of): the 370 DOES definitely allow an interrupt to go directly into user state, but I doubt that the user is ACTUALLY allowed to modify the interrupt PSWs directly, for one good reason: security. Since the PSW includes the supervisor-state bit, being able to change the interrupt PSW means that you could gain full system control in the following manner:

         MVC   INT_PSW,MEAN_PSW
         L     R0,=F'0'
         DR    R0,R0
MEAN_PSW PSW   SUPERVISOR,ZERO_KEY,=A(TROJAN_HORSE)

When you do the divide by zero, the OS jumps to your trojan horse in supervisor state.
What is more likely is that the OS catches your attempt to change the new PSW, checks to make sure that all is OK, and then either changes the new PSW commensurate with your wishes or leaves a pointer to a TRUSTED interrupt routine that then sets things up and jumps to your nominee.

On the '370, MMU protection is limited to 2K granularity (some newer systems have special provisions for the lowest 256 bytes, which are REAL sensitive), so you can't make just SOME of the real vectors writable by random users. It's either all or nothing -- but you can make it LOOK like a user has access via virtual-memory techniques.
-- 
Stephen Samuel
{ihnp4,ubc-vision,vax135}!alberta!edm!steve
or userzxcv@uqv-mts.bitnet
terry@wsccs.UUCP (Every system needs one) (05/17/88)
In article <1735@alliant.Alliant.COM>, Geoff Steckel writes:
> Yes, a whole lot of C code would break if INT and LONG aren't the same type.
         ^^^                     ^^^  ^^^^

What? Since when have 'int' and 'long' been the same "type", except in Lattice C on a 68000? My code won't break; it doesn't make silly assumptions based on implementation dependencies!

terry@wsccs
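[Editorial aside: the assumption Terry objects to, made explicit -- code written as if int and long are the same size truncates silently where int is 16 bits. Portable code checks the range or uses the wider type throughout:]

```c
/* C guarantees only that long can hold at least 32 bits and that
   int can hold at least 16; assigning a long to an int is safe
   only after a range check like this one. */
#include <limits.h>

/* Returns 1 if v survives a round trip through int. */
int fits_in_int(long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

On a 16-bit-int compiler, `fits_in_int(100000L)` is 0 and the naive `int i = big;` is exactly the breakage Geoff predicts.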
greg@xios.XIOS.UUCP (Greg Franks) (05/17/88)
In article <3384@drivax.UUCP> alexande@drivax.UUCP (Mark Alexander) writes:
>Maybe I'm slow, but I can't see how segmentation makes these things
>much easier, compared with a typical non-segmented paging system.  We
>did a port of FlexOS to both the 386 and the NEC V60, and the V60 was
>actually easier to deal with precisely because we didn't have to muck
>with all that segment stuff.  And on the V60 it was very easy to
>achieve all these Nice Things you mentioned, like code/data
>separation, relocatability, inter-task protection, etc.
>
>Hoping someone can explain this to me in a followup article.
>-- 
>Mark Alexander	(UUCP: amdahl!drivax!alexande)
>"Bob-ism: the Faith that changes to meet YOUR needs." --Bob (as heard on PHC)

Well, I hope this helps. First, the Intel way of doing segmenting is totally brain-damaged; I am sure that you will hear few arguments in this group to the contrary. The biggest win, when segmenting is implemented properly, is that it allows you to put individual items into separate segments and then use the hardware protection mechanisms to catch illegal references. Bounds checking can be done in parallel with program execution because explicit 'check' instructions need not be included in the program. Ideally, one would have an unlimited supply of segment registers so that every `object' in a program could be isolated from every other.

Trampling over data is worse than trampling over code, simply because you will more likely get the *wrong* answer instead of *no* answer. Of course, some dated-but-popular languages ignore things like checking bounds on arrays, so they merrily allow one to trample all over data with errant pointers and indices. In fact, I think that C makes array bounds checking almost impossible, due to the uncontrolled manner in which pointers can be used (is that the fire alarm I hear ringing?). However, please don't quote me on that thought, because I am *not* a compiler designer.
The *BIG BOTCH* in the Intel 80*86 scheme of things is the position of the protection and table-descriptor bits: they occupy the low three bits of the segment register. So address 0x00010000 in a flat address space is represented as 0x00080000. Consequently one gets all sorts of convoluted 'memory models' and performance hits to perform:

	real_address = ((virt_address & 0x1fff0000) << 3) + (virt_address & 0x0000ffff)

Had Intel put these bits at the high end of the address register, the 80286 could have had a *flat* address space, albeit _only_ (:-)) 29 bits wide. Personally, I find it interesting that each entry in the descriptor tables is 8 bytes long (1 << 3 == 8). The machine could also stand quite a few more segment registers, and a more convenient way of loading them. Oh well, too late now....

Segmenting can be faked on a paging system by putting each object in its own set of pages in memory. For example, each UNIX process has two segments, text and data. An 80286 would probably run Version 6 quite nicely! :-). See Structured Computer Organization by Andy Tanenbaum for more info. Personally, I suspect that segmenting (done right) may catch on again as object-oriented languages like C++ and (gasp) Ada catch on.

Good day eh!
-- 
Greg Franks                   XIOS Systems Corporation, 1600 Carling Avenue,
utzoo!dciem!nrcaer!xios!greg  Ottawa, Ontario, Canada, K1Z 8R8. (613) 725-5411.
"Those who stand in the middle of the road get hit by trucks
coming from both directions."  Evelyn C. Leeper.
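[Editorial aside: Greg's arithmetic with the parentheses made explicit. Because a 286 selector keeps RPL and the table-indicator bit in its low three bits, mapping a flat 29-bit address into selector:offset form means shifting the 13-bit segment-index field left by 3:]

```c
/* Map a "flat" address (13-bit segment index in bits 16-28, 16-bit
   offset below) into the 286's selector:offset representation.
   The low 3 selector bits are left clear for RPL and TI. */
#include <stdint.h>

uint32_t selector_form(uint32_t virt_address)
{
    uint32_t seg = virt_address & 0x1fff0000u;  /* 13-bit segment index */
    uint32_t off = virt_address & 0x0000ffffu;  /* 16-bit offset */
    return (seg << 3) | off;
}
```

This reproduces Greg's example: flat 0x00010000 (segment 1, offset 0) becomes 0x00080000, i.e. selector 0x0008 -- and that shift on every cross-segment pointer is the "performance hit" he describes.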
mch@computing-maths.cardiff.ac.uk (Major Kano) (05/18/88)
In article <3450@omepd> mcg@iwarpo3.UUCP (Steve McGeady) writes:
>
>In article <353@cf-cm.UUCP> mch@computing-maths.cardiff.ac.uk (Major Kano) writes:
>> This is a partial reprint of an article that I posted in mid March. ...
>>
>>In article <2904@omepd> mcg@iwarpo3.UUCP (Steve McGeady) writes:
>>>
>>>It can't be elegance of design, for (e.g.) the 80386 and the MIPSco processor
>>>are each somewhat inelegant in their own ways (for those who don't wish to
>>>fill in the blanks, segmentation and register model in the former case,
>>                     ^^^^^^^^^^^^     ^^^^^
>>	** WHAT THE $@#@++%%& HELL ?!? **
>>
>
>After having taken the above quote completely out of context, Mr. Howe then
>goes on to attempt to rekindle the justifiably extinct *86 memory model
>discussion.
>
>To set the record straight, yet again:
>
>	1) I was not implying that there is anything wrong with the 80386
>	   (or *86, for that matter) memory or register models, simply
>	   that THERE MAY EXIST PEOPLE WHO BELIEVE that there is something
>	   wrong.  That particular rhetorical distinction is apparently
>	   lost on Mr. Kano.

I no longer have a copy of the original article. Maybe I'm slow, but I can remember nothing in it to suggest that the remark about elegance was anything but Mr. McGeady's own opinion. I can see that it was not the point of his article, but this was the first criticism of the segmented model that I'd seen on the net, and so I replied to it. The reason I "took it out of context" was that I was only interested in the segmenting vs. linear/paging aspect, rather than in Mr. McGeady's article as a whole. I was not attempting to ascribe to Mr. McGeady any opinions that he had not expressed, and I apologise to you, Mr. McGeady, if that was the impression that you (or the net) got.
As for the "justifiably extinct" discussion: since I had never seen a *rational* comparison between segmenting and linear/paging, and since a recent "Byte" article on the 80386 and Unix made a "sort-of" comparison, I consider that the issue of segmenting *in general*, and its various implementations, is clearly NOT extinct. The replies to and postings about my article that I have seen on comp.arch and received (via e-mail) so far tend to support that view.

As for the *86 memory model *in particular*, I don't think anyone would disagree that the implementation of segmenting on the '286 is poor, and that the 8086/8/186/188 "segments" are not segments at all. Since in the micro world Intel are the main proponents of segmenting, it is inevitable that their processors will be the ones against which all others are compared, whether favourably or not. Also, one tends to mention the processors with which one is familiar when talking about something and quoting examples, and for a lot of us the 80x86 series is just that, so again, the subject still has SOME life left in it.

>	   The point, perhaps worth repeating, is that there are some people
>	   who are as aesthetically offended by the exposure of the pipeline
>	   in the MIPSco processors as others are offended by segmentation,
>	   etc.
>
>	2) The 80386 butters my bread, and will continue to do so for some
>	   time.

Yes, I got that point in Mr. McGeady's followup to his original article.

>	3) Even if Mr. Howe's inference were correct, which it is not, I do
	              ^^^^
>	   not speak for Intel Corporation.  My views are my own, etc, and
	   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>	   often they are not even that, but temporarily adopted for
>	   didactic discourse.

Yes, I got that point in Mr. McGeady's followup to his original article.

>[Mr. Howe then descends into discourse on the merits of segmentation, which
>I have no interest in addressing.]
>As a final plea, it would be awfully nice if we could have a discussion on
>the network that did not devolve into a semiconductor race war: my chip is
>{bigger,faster,longer,stronger,prettier} than yours.  Yeesh.

I'm not sure what is meant by this. Is it being suggested that the segmenting vs. paging/linear argument *on usenet* has become a "Motorola vs Intel vs The Rest Of The World" one (as I have seen happen in magazines)? It does not seem to have done so far, but in case it does, I should point out that this was not my intention. I too hope that the discussion stays architectural, rather than becoming confined to company-vs-company flames. (Of course people will have to mention specific examples in order to continue the discussion.)

Two people so far have mentioned "capabilities". What are they, please, so's I can continue to follow the discussion? Also, references have been made to a large linear address space, having (max segment length)*(no of segments) bytes in it. This occurred to me a couple of weeks ago as a possible idea, but surely paging such a thing into a *MUCH* smaller physical memory (say, 16 Mbytes) might be difficult. Anyone got any ideas on this?

>S. McGeady
>Intel Corp.

I have seen many postings, and had a few replies. Perhaps I should point out that I was not really interested in the Intel 80x86 with x<3, since these are becoming dead ducks. The 80386 is a nice machine, and should replace all other 86-series machines, in my opinion (tho' I bet it won't :-). If I see enough distinct followups and e-mail replies, I will post a summary. Thanks for the discussion so far.

Regards to you all,
 -mch
-- 
Martin C. Howe, University College Cardiff | "You actually program in 'C'
mch@vax1.computing-maths.cardiff.ac.uk.    | WITHOUT regular eye-tests ?!"
-------------------------------------------+---------------------------------
My cats know more about UCC's opinions than I do. | MOSH! In the name of ANTHRAX!