[comp.arch] Intel memory model

radford@calgary.UUCP (Radford Neal) (05/23/88)

In article <524@xios.XIOS.UUCP>, greg@xios.XIOS.UUCP (Greg Franks) writes:

> The biggest win when segmenting is implemented properly, is that it
> allows you to put individual items into separate segments and then use
> the hardware protection mechanisms to catch illegal references.  Bounds
> checking can be done in parallel to program execution because explicit
> 'check' instructions need not be included in the program...

I don't think one necessarily gets this advantage, at least not from
segments as I understand the term.

One would like to get a run-time error on execution of the following
program:

    int a[10], b[10]; 
    int i;
    i = 15;
    ...
    a[i] = 0;

One wouldn't want to have this just modify b[5] instead of the
non-existent a[15]. 

I presume that you are proposing allocating the arrays a and b
to different segments, and protecting all but the first ten
words of each. This may work with i==15, but how about i==12345678?
The address of a[i] is likely to be computed as &a[0] + i. If 
the segment number for a is the high-order part of &a[0], then
adding i can change this to the segment number for b, and it's 
possible that the segment offset will also end up being valid.
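
A minimal sketch of that carry, assuming a hypothetical 32-bit pointer with a 16-bit segment number in the high half and a 16-bit byte offset in the low half, 4-byte ints, and a 40-byte limit on each array's segment. The layout and all the names here (packed_ptr, naive_index, offset_ok) are made up for illustration and do not describe any particular machine:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: segment number in bits 31..16, byte offset in 15..0. */
typedef uint32_t packed_ptr;

static packed_ptr make_ptr(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 16) | off;
}

static uint16_t seg_of(packed_ptr p) { return (uint16_t)(p >> 16); }
static uint16_t off_of(packed_ptr p) { return (uint16_t)(p & 0xFFFFu); }

/* &a[0] + i done as ordinary integer addition on the whole pointer,
   so a large i carries out of the offset into the segment number. */
static packed_ptr naive_index(packed_ptr base, uint32_t i, uint32_t elem_size)
{
    assert(elem_size > 0);
    return base + i * elem_size;
}

/* The only check the segment hardware can make: is the offset that
   finally arrives below the limit of whichever segment was selected? */
static bool offset_ok(packed_ptr p, uint16_t limit)
{
    return off_of(p) < limit;
}
```

With a in segment 5 and b in segment 6, i == 15 gives offset 60 in segment 5, which the 40-byte limit catches. But i == 16386 gives 16386 * 4 = 65536 + 8, so the result is segment 6, offset 8, which passes the limit check and quietly refers to b[2].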

A scheme like this might work if the machine has addition instructions
that carefully work on just the offset part of a combined segment
number/offset pointer. My impression is that this is not typically
considered part and parcel of "segmented memory". It may be a good
idea of course.
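
A sketch of what such an offset-only addition might look like, again under assumed conventions: a 16-bit segment number, a 16-bit byte offset, and a per-segment limit in bytes. The names seg_ptr and offset_add are hypothetical, and a false return stands in for a hardware trap:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t seg;   /* segment number; indexing never touches it */
    uint16_t off;   /* byte offset within the segment */
} seg_ptr;

/* Add i * elem_size to the offset part only.  The sum is formed in
   64 bits so it cannot wrap, and the whole element must lie below
   `limit` (the segment's size in bytes).  Returns false, leaving *out
   untouched, where real hardware would trap. */
static bool offset_add(seg_ptr p, uint32_t i, uint32_t elem_size,
                       uint32_t limit, seg_ptr *out)
{
    assert(out != NULL);
    assert(limit <= 65536u);          /* offsets are 16 bits wide */

    uint64_t start = (uint64_t)p.off + (uint64_t)i * elem_size;
    if (start + elem_size > limit)
        return false;

    out->seg = p.seg;                 /* no carry can ever reach this */
    out->off = (uint16_t)start;
    return true;
}
```

With a 40-byte limit and 4-byte elements, i == 9 succeeds, while i == 15 and i == 12345678 both fail, since nothing the index does can alter the segment number.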

Regardless of any of this, you will need explicit check instructions
if you want to detect the following Pascal error:

   var a: array [1..10,1..10] of integer;
       i, j: integer;

   begin
       i := 1; j := 11;
       ...
       a[i,j] := 0;
   end;

The final address computed from the two subscripts is valid, but
one of them is out of range, and therefore in error. You could allocate
each row of the array a different segment, but that is getting
to be a bit ridiculous.
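
To make the arithmetic concrete, here is a small sketch assuming the usual row-major layout for the 10 by 10 array. The function names are hypothetical, and assert stands in for an explicit check instruction:

```c
#include <assert.h>
#include <stdbool.h>

enum { LO = 1, HI = 10, N = HI - LO + 1 };

/* Element offset of a[i,j] as an unchecked row-major computation. */
static int flat_offset(int i, int j)
{
    return (i - LO) * N + (j - LO);
}

/* All a single segment limit can test: is the final offset inside
   the 100-element array? */
static bool whole_array_ok(int i, int j)
{
    int k = flat_offset(i, j);
    return k >= 0 && k < N * N;
}

/* What the language actually requires: each subscript in its own range. */
static bool subscripts_ok(int i, int j)
{
    return i >= LO && i <= HI && j >= LO && j <= HI;
}

/* Indexing with the explicit check compiled in. */
static int checked_offset(int i, int j)
{
    assert(subscripts_ok(i, j));
    return flat_offset(i, j);
}
```

For i = 1, j = 11 the flat offset is 10, the same element as a[2,1], so whole_array_ok accepts it while subscripts_ok rejects it.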

   Radford Neal

ok@quintus.UUCP (Richard A. O'Keefe) (05/24/88)

In article <1623@vaxb.calgary.UUCP>, radford@calgary.UUCP (Radford Neal) writes:
> I don't think one necessarily gets this advantage, at least not from
> segments as I understand the term.
Neal is thinking of "segments" as Intel (ab)use the term.

> I presume that you are proposing allocating the arrays a and b
> to different segments, and protecting all but the first ten
> words of each.
No, the point is that each array is a segment and NOTHING BUT the data of
each array EXISTS: in a Multics/B6700 segmented architecture there simply
isn't anything else to protect.

> This may work with i==15, but how about i==12345678?
> The address of a[i] is likely to be computed as &a[0] + i. If 
> the segment number for a is the high-order part of &a[0], then
> adding i can change this to the segment number for b, and it's 
> possible that the segment offset will also end up being valid.
I don't know about Multics, but on a B6700 the idea of calculating the
address of a[i] doesn't make sense.  To reference an element of an
array (alias "segment") you present TWO things to the appropriate
instruction: (the value of) the index and (a pointer to) the descriptor.
This isn't how the B6700 does it, but you might imagine two instructions:
	IXLD	<register>,<descriptor>,<index>
	IXST	<register>,<descriptor>,<index>
so a[i] := b[i] might turn into
	LD	r1,i
	IXLD	r2,b,r1
	IXST	r2,a,r1
The "address of" a[i] need never be computed as such.
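
The two imagined instructions can be modelled in C roughly as follows. This is only a sketch of the idea, not of how the B6700 works: the descriptor layout and the names descriptor, ixld, ixst and copy_elem are assumptions, and a false return stands in for a bounds trap:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: the program's only handle on an array.
   User code supplies a descriptor and an index, never an address. */
typedef struct {
    int    *base;     /* where the data lives; private to the "hardware" */
    size_t  length;   /* number of elements that exist */
} descriptor;

/* IXLD reg,desc,index: load element `index` into *reg, or trap. */
static bool ixld(int *reg, const descriptor *d, size_t index)
{
    assert(reg != NULL && d != NULL);
    if (index >= d->length)
        return false;
    *reg = d->base[index];
    return true;
}

/* IXST reg,desc,index: store reg into element `index`, or trap. */
static bool ixst(int reg, const descriptor *d, size_t index)
{
    assert(d != NULL);
    if (index >= d->length)
        return false;
    d->base[index] = reg;
    return true;
}

/* dst[i] := src[i] as one indexed load followed by one indexed store. */
static bool copy_elem(const descriptor *dst, const descriptor *src, size_t i)
{
    int r2;
    return ixld(&r2, src, i) && ixst(r2, dst, i);
}
```

Because the index is compared against the length before any address is formed, i == 15 and i == 12345678 fail in exactly the same way, however the arrays happen to be laid out in memory.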

Obviously, this sort of scheme isn't going to do wonders for C, and in fact
getting BCPL (more-or-less typeless C) to run on B6700s was extremely hard.
But it can be argued that the reason for that is that BCPL and C import a
large chunk of the _problem_ into something which purports to be part of
the _solution_.

> A scheme like this might work if the machine has addition instructions
> that carefully work on just the offset part of a combined segment
> number/offset pointer.
Again, the problem is the assumption that a pointer is really just a thinly
disguised number (there is an implicit assumption that there is a fixed
smallish number of segments and that all segments are the same size), and
that array indexing is a sort of addition.

> Regardless of any of this, you will need explicit check instructions
> if you want to detect the following Pascal error:
> 
>    var a: array [1..10,1..10] of integer;
>        i, j: integer;
> 
>    begin
>        i := 1; j := 11;
>        ...
>        a[i,j] := 0;
>    end;
> 
> The final address computed from the two subscripts is valid, but
> one of them is out of range, and therefore in error. You could allocate
> each row of the array a different segment, but that is getting
> to be a bit ridiculous.

It is far from ridiculous.  It's exactly what the B6700 normally does.
A 10-word segment?  Why not?  The overhead for a segment was 2 words.
It is also extremely useful: you can have a triangular (or even a ragged)
array where each slice is properly bounds-checked.  Even on flat address
space machines ("antiques") it can be useful to implement multidimensional
arrays as vectors of pointers as that way you can do without multiplication
(especially useful on the several RISCs which do not have a multiply
instruction).
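
On a flat machine the same layout can be sketched as a vector of row descriptors. The version below builds a triangular array in which row r has r+1 elements. The names (row_desc, ragged, tri_init, tri_elem, tri_free) are hypothetical, and a NULL return stands in for a bounds trap:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int *base; size_t length; } row_desc;    /* one row */
typedef struct { row_desc *rows; size_t nrows; } ragged;  /* vector of rows */

static void tri_free(ragged *t)
{
    assert(t != NULL);
    for (size_t r = 0; r < t->nrows; r++)
        free(t->rows[r].base);
    free(t->rows);
    t->rows = NULL;
    t->nrows = 0;
}

/* Build an n-row lower-triangular array of zeros.
   Returns false (and frees everything) if an allocation fails. */
static bool tri_init(ragged *t, size_t n)
{
    assert(t != NULL);
    t->rows = NULL;
    t->nrows = 0;
    if (n == 0)
        return true;
    t->rows = calloc(n, sizeof *t->rows);
    if (t->rows == NULL)
        return false;
    for (size_t r = 0; r < n; r++) {
        t->rows[r].base = calloc(r + 1, sizeof(int));
        if (t->rows[r].base == NULL) {
            tri_free(t);              /* frees rows 0..r-1 */
            return false;
        }
        t->rows[r].length = r + 1;
        t->nrows = r + 1;             /* rows built so far */
    }
    return true;
}

/* Two bounds-checked lookups and no multiplication by a row length. */
static int *tri_elem(const ragged *t, size_t i, size_t j)
{
    assert(t != NULL);
    if (i >= t->nrows)
        return NULL;
    const row_desc *row = &t->rows[i];
    if (j >= row->length)
        return NULL;
    return &row->base[j];
}
```

Each row carries its own length, so an out-of-range column subscript is caught even when the element it would have reached in a flat layout exists.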

The trouble with C is that it is thinly disguised assembly code for a
certain class of machines: admittedly a very popular and very useful
class, but if we make "can it run C?" our criterion for computer
architectures we'll be unable to make much progress.  Even COBOL and
Fortran 77 aren't that limiting!  (PL/I _is_.)

daveb@geac.UUCP (David Collier-Brown) (05/24/88)

| In article <524@xios.XIOS.UUCP>, greg@xios.XIOS.UUCP (Greg Franks) writes:
| The biggest win when segmenting is implemented properly, is that it
| allows you to put individual items into separate segments and then use
| the hardware protection mechanisms to catch illegal references.  Bounds
| checking can be done in parallel to program execution because explicit
| 'check' instructions need not be included in the program...

In article <1623@vaxb.calgary.UUCP> radford@calgary.UUCP (Radford Neal) writes:
| I don't think one necessarily gets this advantage, at least not from
| segments as I understand the term.
...
| I presume that you are proposing allocating the arrays a and b
| to different segments, and protecting all but the first ten
| words of each. This may work with i==15, but how about i==12345678?
| The address of a[i] is likely to be computed as &a[0] + i. If 
| the segment number for a is the high-order part of &a[0], then
| adding i can change this to the segment number for b, and it's 
| possible that the segment offset will also end up being valid.

  The example (elided) would produce "correct" errors on the large
segmented Honeywells, since the address is really (segment.$a[0]+i)
(pardon my lisp), and an overflow of the second part does not show
up in the first part. 
   The behavior you describe does occur on the Intel (huge memory
model), which is why people who've used segmented machines get so
annoyed at it.  User-@!%$#&$*#$@!!!! visible segment registers!

As to the second example:
| 
|    var a: array [1..10,1..10] of integer;
|        i, j: integer;
| 
|    begin
|        i := 1; j := 11;
|        ...
|        a[i,j] := 0;
|    end;
| 
   The only machine I know which gets that one right is the ICL, which
has "array descriptors" (there may be others: I just don't know
about them).

   On the third hand, neither Honeywell nor ICL typically **use** their
segment registers/descriptors to trap either of these array funnies.

 --dave ((:-<) c-b
-- 
 David Collier-Brown.  {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,  | "His Majesty made you a major 
 350 Steelcase Road,   |  because he believed you would 
 Markham, Ontario.     |  know when not to obey his orders"