henry@utzoo.uucp (Henry Spencer) (05/07/89)
In article <89May6.165030edt.10782@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
>... we looked at a lot of old machines that had all kinds
>of really neat features that you just don't see any more.  My favorite
>was the Burroughs 6600 (?) and its segmentation scheme...
>... a lot of access bugs could be detected
>at runtime with no performance penalty...

Unless you count the performance penalties implicit in implementing the
rather complex design.  One can debate the size of said penalties, but it
isn't correct to say that the bug detection is free.

>Unfortunately, a machine like this can't run C very well, 'cause C presupposes
>a flat, uniform memory space...

Untrue.  (Consider all the C compilers for the revolting 8086 and its ugly
descendants, which most definitely don't have a flat uniform memory space.)
There would be problems in implementing C on one of those machines, but
segmentation per se isn't among them.

>... I see language-specific
>machines like the Symbolics LISP machines ultimately failing in the
>market because they are too specialized, and the non-specialized machines
>catch up to them in price-performance too quickly...

As folks like John Mashey tend to point out, unspecialized devices (static
RAMs rather than special cache chips, general-purpose CPUs rather than
specialized ones, etc.) have awesome amounts of money driving their further
development at top speed.  Often some of the benefits of the specialized
devices can be duplicated on the general-purpose ones if you put enough
brainpower into clever ways to do it; the result may not perform as well,
but the general-purpose devices get faster at a terrifying rate.  An
economically viable special-purpose machine either has to be a *whole lot*
better than current general-purpose ones, so that it retains its advantage
for a while, or it has to catch on quickly enough to build up momentum of
its own before it's overtaken.
--
Mars in 1980s: USSR, 2 tries,  |  Henry Spencer at U of Toronto Zoology
2 failures; USA, 0 tries.      |  uunet!attcan!utzoo!henry   henry@zoo.toronto.edu
bradb@ai.toronto.edu (Brad Brown) (05/07/89)
In article <1989May6.234007.23517@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>In article <89May6.165030edt.10782@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
>>Unfortunately, a machine like this can't run C very well, 'cause C presupposes
>>a flat, uniform memory space...
>
>Untrue.  (Consider all the C compilers for the revolting 8086 and its ugly
>descendants, which most definitely don't have a flat uniform memory space.)
>There would be problems in implementing C on one of those machines, but
>segmentation per se isn't among them.

But consider how C is implemented on the 8086 and its brethren:  you
simulate a flat address space and give the programmer a bunch of different
memory models so they can decide how big their flat space is.  The point is
that C wants to be able to point anywhere in memory, more or less, whereas
really segmented machines like the Burroughs I was talking about get their
protection ability from having a segment for each (more or less) object in
the system.  I'm sure that C *could* be implemented on that machine, but
the architecture would get in the way.

I'm sure that C does not literally require a perfectly flat address space,
but it sure encourages it!

(-:  Brad  :-)
dave@lethe.UUCP (Dave Collier-Brown) (05/08/89)
>In article <89May6.165030edt.10782@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
| >... we looked at a lot of old machines that had all kinds
| >of really neat features that you just don't see any more.  My favorite
| >was the Burroughs 6600 (?) and its segmentation scheme...
| >... a lot of access bugs could be detected
| >at runtime with no performance penalty...

In article <1989May6.234007.23517@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
| Unless you count the performance penalties implicit in implementing the
| rather complex design.  One can debate the size of said penalties, but it
| isn't correct to say that the bug detection is free.

Well, it doesn't seem to add any extra overhead to the memory address
computation on a Honeywell, but I can't say for sure if the time taken is
actually dominated by instruction decoding, etc.  I'd guess "yes" on the
older series 66 line, and a rather strong "no" on the DPS-90 (NEC-designed)
machines.

Designing a CISC is hard, but sometimes one finds that "wasted" time can be
usefully filled up.  The RISC people seem to have noticed this too (;-))

--dave
--
David Collier-Brown,  | {toronto area...}lethe!dave
72 Abitibi Ave.,      | Joyce C-B:
Willowdale, Ontario,  |   He's so smart he's dumb.
CANADA. 223-8968      |
pcg@aber-cs.UUCP (Piercarlo Grandi) (05/08/89)
In article <89May7.001514edt.39763@neat.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
> But consider how C is implemented on the 8086 and its brethren:  You
> simulate a flat address space and give the programmer a bunch of
> different memory models so they can decide how big their flat space is.
Well, the problem with the Intel architecture is that it was designed for
Pascal, whose pointers can only point to heap-allocated objects (and no
arithmetic on them is allowed).  As far as Pascal is concerned, the Intel
architecture has three/four flat address spaces, and the fact that they
are "distinct" is not a problem.  The 386 is the same, but the common trick
is to superimpose them (all three/four segment registers hold the same
value), which is practical because they are large.  You can do that also on
the smaller ones, getting a 64k flat address space, just like
non-separate-I/D PDP-11s (on which I used to run 2.9BSD with 5 users on a
248k machine...).  Or you can simulate separate I/D by overlapping only the
stack and data address spaces.
> The point is that C wants to be able to point anywhere in memory, more
> or less, whereas really segmented machines like the Burroughs I was
> talking about get their protection ability from having a segment for
> each (more or less) object in the system.
If I remember right, the average segment size on a Burroughs is on the
order of seven to ten words.  Moreover, addresses do not mean much in a
Burroughs-type architecture; segments are accessed via descriptors, not
addresses.  I find it hard to say that the Burroughs has a concept of
"address space" like, say, a 68000 or even an 80x86.
> I'm sure that C *could* be implemented on that machine, but the
> architecture would get in the way.
I remember seeing in Software: Practice & Experience a paper on this -- I
don't remember whether it was a BCPL or a C implementation (it does not
make much difference to our discussion here), but indeed they found that
there is a (huge) mismatch between what is almost a capability-based,
fragmented address-space architecture and C/BCPL.  It can be papered over,
but not easily.  Moreover, the protection scheme of the Burroughs makes
things even more difficult for C (see Organick, The B6700 Architecture,
MIT Press).
--
Piercarlo "Peter" Grandi | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
barmar@think.COM (Barry Margolin) (05/08/89)
In article <89May7.001514edt.39763@neat.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
>In article <1989May6.234007.23517@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>>In article <89May6.165030edt.10782@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
>>>Unfortunately, a machine like this can't run C very well, 'cause C presupposes
>>>a flat, uniform memory space...
>>Untrue.  (Consider all the C compilers for the revolting 8086 and its ugly
>>descendants, which most definitely don't have a flat uniform memory space.)
>But consider how C is implemented on the 8086 and its brethren:  You simulate
>a flat address space and give the programmer a bunch of different memory models
>so they can decide how big their flat space is.

Don't confuse Intel segmentation with segmentation in general.  On
8086-like processors, segments get in the way, and dealing with multiple
data segments and/or multiple code segments in one program results in
performance penalties.  The various memory models are compromises so that
programs that don't need the benefits of multiple segments can avoid the
performance problems.

Multics and Burroughs segments, however, don't get in the way in that
manner, unless you want to have a single object that doesn't fit in a
segment (in the case of Multics, the max size is 1MB).  All pointers
include a segment number; there aren't separate segment and offset
registers (i.e. a pointer register can address any location in memory).
Unless a program cares about segments, it can ignore them.  C has been
implemented on Multics, and the segmentation had no effect on the
implementation.

>                 The point is that C wants to
>be able to point anywhere in memory, more or less, whereas really segmented
>machines like the Burroughs I was talking about get their protection ability
>from having a segment for each (more or less) object in the system.

Why can't a segmented pointer point anywhere in memory?
I don't know about Burroughs software, but on Multics we don't assign a
segment to every object.  There's a segment for every command or
subroutine library, a stack segment for each ring, and some number of
heap segments for dynamic allocation.  In addition, when files are mapped
into the address space (which is the only way to access files) they are
assigned unique segments, and applications may allocate temporary segments
for data rather than allocating out of the heap segment (this is generally
done for performance reasons, as it can reduce page faults by separating
unrelated data).

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar
firth@sei.cmu.edu (Robert Firth) (05/08/89)
In article <912@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>Well, the problem with the Intel architecture is that it was designed for
>Pascal, whose pointers can only point to heap allocated objects (and no
>arithmetic on them is allowed).  As far as Pascal is concerned, the Intel
>architecture has (three/four) flat address spaces, and the fact that they
>are "distinct" is not a problem.

Leaving aside the question whether the Intel 8086 architecture was
"designed" for anything, let alone Pascal, I still have some difficulty
with the above claim.  True Pascal pointers, things of type ^T, indeed
point into a heap, and moreover the heap can be segmented by type, so
that, for instance,

	p1 : ^T1
	p2 : ^T2

can each be represented as a 16-bit offset into a segment bound to the
type.  There is some inefficiency, but it can be made to work, within the
64k limit per heap-allocated type.

However, Pascal also has implicit pointers by virtue of the VAR parameter
mode.  Consider for instance

	procedure Munge ( var X : T );

This can be called with any of

 (a) an actual of type T declared at the outermost level
 (b) an actual of type T declared locally
 (c) an actual expression p^, where p is of type ^T

If you want to implement all that and still pass only a 16-bit pointer as
the actual parameter, then you must map static, stack-based, and heap
spaces all on top of each other.  Otherwise, every VAR parameter is 32
bits and can be accessed only by first loading the segment base into ES.
Very expensive.
neubauer@bsu-cs.bsu.edu (Paul Neubauer) (05/08/89)
In article <40284@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>Multics and Burroughs segments, however, don't get in the way in that
>manner, unless you want to have a single object that doesn't fit in a
>segment (in the case of Multics, the max size is 1MB).  [...]
>
>[...]  In addition, when files are
>mapped into the address space (which is the only way to access files)
>they are assigned unique segments, and applications may allocate

Does this mean that the maximum file size is 1MB?  How difficult does this
make it to deal with, e.g., large databases?
--
Paul Neubauer         neubauer@bsu-cs.bsu.edu
                      neubauer@bsu-cs.UUCP
                      <backbones>!{iuvax,pur-ee}!bsu-cs!neubauer
barmar@think.COM (Barry Margolin) (05/09/89)
In article <7135@bsu-cs.bsu.edu> neubauer@bsu-cs.bsu.edu (Paul Neubauer) writes:
>In article <40284@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>>Multics and Burroughs segments, however, don't get in the way in that
>>manner, unless you want to have a single object that doesn't fit in a
>>segment (in the case of Multics, the max size is 1MB).  [...]
>>
>>[...]  In addition, when files are
>>mapped into the address space (which is the only way to access files)
>>they are assigned unique segments, and applications may allocate
>
>Does this mean that the maximum file size is 1MB?  How difficult does this
>make it to deal with, e.g., large databases?

Yes and no.  To handle larger needs, there is a library of routines for
handling Multi-Segment Files.  MSFs are specially marked and organized
directories which are treated by many programs as a single file.  The
stream- and record-oriented file I/O library will automatically create
and use MSFs, and standard file manipulation commands (copy, delete,
etc.) operate on them as if they were ordinary files.  Database managers
use MSFs, of course (in fact, keyed-sequential files, the basis of the
MRDS relational database, are ALWAYS implemented as MSFs, with one
segment holding the key B-tree and the others holding the data).

User programs that wish to do their own memory-mapped segment access
(rather than calling the file I/O routines) and permit operation on both
segments and MSFs can call msf_manager_ library routines (these make
ordinary segments look like single-component MSFs, automatically create
new MSF components, and automatically convert segments into MSFs when
necessary).

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar
dave@lethe.UUCP (Dave Collier-Brown) (05/09/89)
In article <40284@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>Multics and Burroughs segments, however, don't get in the way in that
>manner, unless you want to have a single object that doesn't fit in a
>segment (in the case of Multics, the max size is 1MB).  [...]
>
>[...]  In addition, when files are
>mapped into the address space (which is the only way to access files)

In article <7135@bsu-cs.bsu.edu> neubauer@bsu-cs.bsu.edu (Paul Neubauer) writes:
>Does this mean that the maximum file size is 1MB?  How difficult does this
>make it to deal with, e.g., large databases?

Well, both yes and no.  Multics files are either single-segment or
multi-segment.  A multi-segment file, not unexpectedly, can hold a fair
bit of data.  This allows for things like MRDS (one of the first
relational databases) and other "large" creatures.  It can even be turned
to advantage, since you can work with locality-of-reference and physical
database design tricks (clustering) to give good, inexpensive access
patterns.

The usual problem lies with implementation: a user can tell if a file is
single-segment versus multi-segment, and come to depend on one or the
other case being true.  (An MSF is a sort-of-directory, with contents
called 0, 1, 2, ...)  And one could run out of bits to represent the
length, which was a bit count stored in an integer (36-bit machine word).

About the time I was leaving (:-{), the experiments with active objects
(an early object-orientation scheme) promised to make the distinction
moot.  Alas, I don't know if that came true.

--dave (source: Organick) c-b
--
David Collier-Brown,  | {toronto area...}lethe!dave
72 Abitibi Ave.,      | Joyce C-B:
Willowdale, Ontario,  |   He's so smart he's dumb.
CANADA. 223-8968      |
tbray@watsol.waterloo.edu (Tim Bray) (05/10/89)
In article <40284@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
> Multics and Burroughs segments, however, don't get in the way in that
> manner, unless you want to have a single object that doesn't fit in a
> segment (in the case of Multics, the max size is 1MB). ...
> Unless a program cares about segments, it can ignore them.

Hear ye!  Segments are evil.  If my usable address space is N, it is a
priori *wrong* to assume that I won't mind dealing with segments of any
size < N.

A trivial example:  In processing the OED, it is often desired to take a
couple thousand matches to some search pattern and group them, for
example, by containing entry, of which there are 290 thousand.  If you
have the memory (we do, this is a 32M machine), the simplest and best way
to do this is to represent the entries by an array of 290K pointers and
do a Hwang & Lin style merge through them.  Now, assuming that my pointers
are about 32 bits in size, I've just blown the totally artificial 1-MB
restriction.

32 bits is barely liveable as it is; if an application wants to segment
its memory, that's hunky-dory, but let me use my 32 bits the way I want.

Cheers, Tim Bray, New OED Project, U of Waterloo, Ontario
dricejb@drilex.UUCP (Craig Jackson drilex1) (05/10/89)
In article <40284@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>In article <89May7.001514edt.39763@neat.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
>>In article <1989May6.234007.23517@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>>>In article <89May6.165030edt.10782@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
>>>>Unfortunately, a machine like this can't run C very well, 'cause C presupposes
>>>>a flat, uniform memory space...
>>>Untrue.  (Consider all the C compilers for the revolting 8086 and its ugly
>>>descendants, which most definitely don't have a flat uniform memory space.)
>>But consider how C is implemented on the 8086 and its brethren:  You simulate
>>a flat address space and give the programmer a bunch of different memory models

It really is true that a segmented machine which uses the segmentation to
control access rights (most do) must make some compromises to implement C.
Most of these compromises come under the heading of simulating a flat
address space.

>Multics and Burroughs segments, however, don't get in the way in that
>manner, unless you want to have a single object that doesn't fit in a
>segment (in the case of Multics, the max size is 1MB).  All pointers
>include a segment number, there aren't separate segment and offset
>registers (i.e. a pointer register can address any location in
>memory).  Unless a program cares about segments, it can ignore them.

>>                The point is that C wants to
>>be able to point anywhere in memory, more or less, whereas really segmented
>>machines like the Burroughs I was talking about get their protection ability
>>from having a segment for each (more or less) object in the system.

Generally true.  It wouldn't have to be, though.

>Why can't a segmented pointer point anywhere in memory?

It can.  However, some architectures (such as the Burroughs) place
restrictions on how pointers can be manipulated.  This is done to prevent
illegal pointers from being formed.
It is up to the compilers to enforce these restrictions.  Illegal pointers
must never be formed, because there is nothing stopping them from being
used -- the normal address space/protection mechanisms for keeping tasks
separated are not there.  (At one time, this was considered a Good Idea:
put your protection in the compilers, where it only needs to be done once
per compilation, rather than in the hardware, where it needs to be done on
every memory access.  See Per Brinch Hansen's books.  Nowadays, the
universality of C makes this idea impossible, and the cleverness of the
hardware architects seems to make it unnecessary.)

(Aside to the real architects in the audience: would it be possible to
shave any nanoseconds off the cycle time if it were unnecessary to perform
any protection checking on memory references?)

>I don't know about Burroughs software, but on Multics we don't assign
>a segment to every object.  There's a segment for every command or
>subroutine library, a stack segment for each ring, and some number of
>heap segments for dynamic allocation.  In addition, when files are
>mapped into the address space (which is the only way to access files)
>they are assigned unique segments, and applications may allocate
>temporary segments for data rather than allocating out of the heap
>segment (this is generally done for performance reasons, as it can
>reduce page faults by separating unrelated data).

Burroughs machines (now called the Unisys A-Series) really do allocate a
segment for every object.  Segments may vary in size from 6 bytes to 360k
bytes.  There is a mechanism for doing hidden fixed-size segmentation on
very large segments -- this is close to paging.  However, each segment has
its own set of pages and page tables.  On these machines, it isn't really
meaningful to talk about a virtual address space size -- every new object
(array, subroutine code area, etc.) is its own address space.
C has been announced for the A-Series, for delivery in June-July 1989.
It simulates a flat address space.  It has four memory models, which
allow the programmer to choose the nature of the flat-address-space
simulation.  The choice is basically a tradeoff between the size of the
simulated address space and efficiency of access.

Note: this compiler has a number of other interesting aspects; I hope to
post a description in comp.lang.c when I get around to it (and after I
make sure that I'm not violating non-disclosures).
--
Craig Jackson
{bbn,ll-xn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}
blarson@skat.usc.edu (Bob Larson) (05/10/89)
In article <13688@watdragon.waterloo.edu> tbray@watsol.waterloo.edu (Tim Bray) writes:
>Hear ye!  Segments are evil.  If my usable address space is N, it is a priori
>*wrong* to assume that I won't mind dealing with segments of any size < N.

Segments are NOT evil, you just haven't seen an implementation of them you
like.  For a segmented pointer format to be used for the next decade or
so, I would propose:

	bit offset from segment base:	64 bits
	segment number:			32 bits
	protection ring:		16 bits
	misc flags:			16 bits
	(includes info like non-snapped pointer for dynamic linking, etc.)

If the offset field of the pointer is big enough to address all your
virtual address space, segmentation is NOT a limit.  (On the other hand,
poorly designed segments (Intel) are horrid.)
--
Bob Larson	Arpa: blarson@skat.usc.edu
		Uucp: {uunet,cit-vax}!usc!skat!blarson
Prime mailing list: info-prime-request%ais1@ecla.usc.edu
		    usc!ais1!info-prime-request
smryan@garth.UUCP (s m ryan) (05/13/89)
>Hear ye!  Segments are evil.  If my usable address space is N, it is a priori
>*wrong* to assume that I won't mind dealing with segments of any size < N.

Segments are a way to manage the disc image when the virtual image is
sparse.  Consider a procedure which adds properties P to some graph, and a
second procedure which uses property P to add properties Q to the graph.
If the storage used to contain P can be released in the middle of the
second procedure, you have two strategies in a Unix-style non-sparse
address space:

 - free P while allocating Q, thus requiring an intermingling of frees
   and allocates, which can expand the working set and increase overhead
   in allocation procedures.

 - use a mark/release, which ties up the space for P long after it is no
   longer necessary.

In a sparse address space, P and Q can be allocated in two separate
regions simply by having a current base and incrementing it for each
allocate.  A region is freed by simply removing it from the virtual
address space.

In a nonsegmented system with sparse addressing, you have the problem of
mapping the virtual address to the address of the disc image.  For small
machines, it is possible to have the entire virtual space mapped onto the
disc, but if it is sparse, a better technique would exploit that
sparseness and reduce the size of the disc image.  How?  Try looking at
VSOS for the Cyber 20x.  It implemented this, but the map was of limited
size, so that many programs aborted because the map became too
disorganised and ran out of space.

In a segmented system, the programmer can be required to start each
region in a segment, and each segment has a separate disc image.  If the
region is itself fairly compact, it is sufficient to have the entire
segment, from base to highest used address, mapped to the disc, thus
simplifying the virtual memory <--> disc image mapping.  This is used in
NOS/VE, which I am told is derived from Multics.

An additional advantage of segments is the segment=disc-file philosophy,
so that instead of copying a disc file to virtual memory which may be
paged in and out of another disc file, as on Unix, the paging is directly
between virtual memory and the original disc.

Just as a parallel machine can be used in fundamentally different ways
than a sequential machine, a sparse virtual address space can be used in
fundamentally different ways than a PDP-11's physical address space.
--
17. The Aesir filled the otter's skin   | Steven Ryan: ingr!garth!smryan
    and heaped the gold above its chin. | 2400 Geng Road, Palo Alto, CA
    But Hreithmar saw a whisker bare    | holy corn oil, that is oil from the corns
    and bid them cover every hair.      | of holy men, or as it is known in your country
davecb@yunexus.UUCP (David Collier-Brown) (05/16/89)
In article <2844@garth.UUCP> smryan@garth.UUCP (s m ryan) writes:
[quoting someone from watdragon, who says:]
>>Hear ye!  Segments are evil.  If my usable address space is N, it is a priori
>>*wrong* to assume that I won't mind dealing with segments of any size < N.

No, Sir Dragon, silly restrictions are evil, and tend to disguise the
elegance of good ideas, as well as of merely fair ones.  If my address
space with current processor speeds/densities/etc. is N, where N is about
half a terabyte, then I am likely enough to accept Intel-like segments of
about 4 gigabytes.  As we are supposedly using up address space at the
rate of 1 bit every eight months, this will hold us to the end of the
century.  I wouldn't bet on it doing much better (:-)).

I'd really like a multi-level naming scheme, so I can say fred->mary[j]
and not be terribly concerned whether fred is outrageously large or not.
If it's not, it should be in the "data" or "bss" segment.  If it is, it
should be in a segment all its own.  I certainly would **not** want to
have to worry about its size in the general case.  If it's in some way
special (e.g., if it's a file), I'll accept the fact that it's special
and declare/initiate/open it in the required special way.

I hope to post a series of articles (a sort of mini-review) on different
ways of approaching the multiple -vs- sparse address space issues this
weekend, if the weather doesn't stay nice.

--dave (only four more days to monomania!) c-b