[net.arch] Orthogonal addressing doesn't help multis.

gnu@sun.uucp (John Gilmore) (06/15/85)

> And furthermore, the orthogonal sequence is normally atomic...
> 	Mark Wittenberg

This is untrue on machines I know of.  Typical memory-to-memory
instructions do NOT lock out other memory cycles in between; this
requires a specific "test-and-set" or "compare-and-swap" instruction.
This is because often other cycles can be interspersed between the
reads of the operands and the write of the result -- e.g. an
instruction prefetch, or another data fetch.  I have not checked
this in National data books (I'm at home) but I'd be surprised if
they lock the bus for the duration.

Assuming the above is true, it means that even a single instruction
e.g. add  _a,_b  cannot assume that 'b' will not be accessed by another
processor (or dma I/O device) in mid-instruction.  This is equivalent
to the load, add, store case.  On some machines the window where you
can get hurt is smaller for the 1-instruction case, but it still exists.
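
A minimal sketch of that window, in C.  It models a memory-to-memory add
as a separate read cycle and write cycle and interleaves two processors by
hand.  The names (add_insn, add_read_cycle, add_write_cycle,
lost_update_demo) are invented for illustration and do not describe any
particular CPU's bus protocol.

```c
#include <assert.h>

/* Hypothetical model of one "add src,dst" instruction in flight:
 * the CPU latches dst in a read cycle, then writes the sum back later. */
struct add_insn {
    int operand;   /* value being added (the "src") */
    int latched;   /* copy of dst taken during the read cycle */
    int has_read;  /* nonzero once the read cycle has happened */
};

static void add_read_cycle(struct add_insn *insn, const int *dst)
{
    insn->latched = *dst;
    insn->has_read = 1;
}

static void add_write_cycle(const struct add_insn *insn, int *dst)
{
    assert(insn->has_read);  /* a write cycle only follows a read cycle */
    *dst = insn->latched + insn->operand;
}

/* Two processors each execute "add 1 to b" with no bus lock.
 * Cycle order on the shared bus: read0, read1, write0, write1. */
int lost_update_demo(void)
{
    int b = 0;
    struct add_insn cpu0 = { 1, 0, 0 };
    struct add_insn cpu1 = { 1, 0, 0 };

    add_read_cycle(&cpu0, &b);   /* cpu0 sees b == 0 */
    add_read_cycle(&cpu1, &b);   /* cpu1 also sees b == 0 */
    add_write_cycle(&cpu0, &b);  /* b becomes 1 */
    add_write_cycle(&cpu1, &b);  /* b is written as 1 again */
    return b;                    /* 1, although two adds were issued */
}
```

One of the two increments is lost, exactly as it would be with separate
load, add, and store instructions.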

I once proposed adding a mode to the 68000 that would cause bus locking
for memory read/write instructions (e.g., add 1 to counter), but they never
implemented it -- probably for good reason.  The 68020 includes compare
and swap, which allows arbitrary multi-instruction sequences of operations
and verifies that nobody has snuck in in the middle, with very minimal
penalty (an extra read) in the common case.  The IBM 370 principles
of operation manual used to have excellent examples of how to use it.
The 68020 manual does too, I believe.
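
The usual compare-and-swap pattern can be sketched with C11 atomics rather
than 68020 or 370 assembly.  This is only an illustration: bump_capped and
its "limit" parameter are hypothetical, chosen to show an update that is
more than a single add.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical example: add 1 to a shared counter, clamped to `limit`.
 * Returns the value this call stored. */
int bump_capped(atomic_int *counter, int limit)
{
    assert(limit > 0);
    int seen = atomic_load(counter);        /* the "extra read" */
    for (;;) {
        /* Arbitrary computation on the private copy. */
        int next = (seen < limit) ? seen + 1 : limit;

        /* Store `next` only if the counter still holds `seen`. */
        if (atomic_compare_exchange_weak(counter, &seen, next))
            return next;
        /* Somebody got in first (or the weak form failed spuriously);
         * `seen` now holds the current value, so recompute and retry. */
    }
}
```

In the common, uncontended case the loop body runs once.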

doug@terak.UUCP (Doug Pardee) (06/18/85)

> > And furthermore, the orthogonal sequence is normally atomic...
> 
> This is untrue on machines I know of.  Typical memory-to-memory
> instructions do NOT lock out other memory cycles in between; this
> requires a specific "test-and-set" or "compare-and-swap" instruction.
> This is because often other cycles can be interspersed between the
> reads of the operands and the write of the result -- e.g. an
> instruction prefetch, or another data fetch.  I have not checked
> this in National data books (I'm at home) but I'd be surprised if
> they lock the bus for the duration.

The NS320xx provides the special instructions SBITI, CBITI, and IBITI
(Set/Clear/Invert Bit Interlocked) which activate the "ILO*" pin on the
CPU for the duration of the instruction.
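
For readers without an NS320xx at hand, a rough portable analogue of
interlocked set/clear/invert can be written with C11 atomics.  This is a
sketch, not NS320xx code: the function names are made up, and returning
the previous value of the bit is a choice made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Build a one-bit mask, checking that the bit fits in an unsigned int. */
static unsigned mask_for(unsigned bit)
{
    assert(bit < sizeof(unsigned) * CHAR_BIT);
    return 1u << bit;
}

/* Each function performs one indivisible read-modify-write on *word
 * and returns the bit's previous value (0 or 1). */
int set_bit_interlocked(atomic_uint *word, unsigned bit)
{
    unsigned m = mask_for(bit);
    return (atomic_fetch_or(word, m) & m) != 0;
}

int clear_bit_interlocked(atomic_uint *word, unsigned bit)
{
    unsigned m = mask_for(bit);
    return (atomic_fetch_and(word, ~m) & m) != 0;
}

int invert_bit_interlocked(atomic_uint *word, unsigned bit)
{
    unsigned m = mask_for(bit);
    return (atomic_fetch_xor(word, m) & m) != 0;
}
```

How the compiler gets the indivisibility (a locked bus cycle, a
compare-and-swap loop, or something else) depends on the target.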
-- 
Doug Pardee -- Terak Corp. -- !{ihnp4,seismo,decvax}!noao!terak!doug
               ^^^^^--- soon to be CalComp

edler@cmcl2.UUCP (Jan Edler) (06/20/85)

I was also going to post something along these lines, but didn't get around
to it.  But there was another point I wanted to make, so I'll make it now:
Even if you have a uniprocessor with instructions like add-to-memory,
and those instructions are atomic with respect to interrupts,
it is probably not possible to take advantage of this property when
writing in a language other than assembly.  The programmer would not generally
know when the "atomic" instructions would be used, and when the compiler
might optimize them away, so he/she wouldn't be able to depend on them.


	Jan Edler
	New York University
	cmcl2!edler
	edler@nyu

mark@rtech.UUCP (Mark Wittenberg) (06/20/85)

> > And furthermore, the orthogonal sequence is normally atomic...
> > 	Mark Wittenberg
> 
> This is untrue on machines I know of.  Typical memory-to-memory
> instructions do NOT lock out other memory cycles in between; this
> requires a specific "test-and-set" or "compare-and-swap" instruction.

I was thinking of the PDP-11 (e.g., inc used the DATIO cycle) on which it
is true (unless my brain has fogged beyond repair, which is possible --
it's been several years now).  Can't think of any micros that do so, though.
Thanks for pointing this out.
-- 

Mark Wittenberg
Relational Technology
zehntel!rtech!mark
ucbvax!mtxinu!rtech!mark

jans@mako.UUCP (Jan Steinman) (06/25/85)

In article <2305@sun.uucp> gnu@sun.uucp (John Gilmore) writes, quotes:
>>Mark Wittenberg:
>>And furthermore, the orthogonal sequence is normally atomic...
>
>This is untrue on machines I know of.  Typical memory-to-memory instructions
>do NOT lock out other memory cycles in between; this requires a specific
>"test-and-set" or "compare-and-swap" instruction...  I have not checked this
>in National data books but I'd be surprised if they lock the bus...

It appears Nati has done this properly.  Operands which are read and written
are of the special access class "rmw", which in itself is nothing special,
because it is desirable to allow

>... other cycles... [to] be interspersed between the reads of the operands
>and the write of the result...

and therefore the CPU is not the proper chip to decide to lock the bus.  The
Nati processors signal that the read cycle of an "rmw" access is about
to begin *one clock before it actually begins*, allowing a user-supplied
interlock on multi-processor systems plenty of time to keep other processors
out without penalizing uni-processor designs with excessively long rmw cycles.
It would be nice if you could throw a switch in the Nati MMU to do this for
you, but I guess they only had so much silicon...

Asking designers to handle this is not unusual and is not at all bad.  The
"interlock" operations on Nati bit instructions simply wag the ILO line,
leaving the designer with the task of determining the most efficient manner to
utilize this information, while the 68000 goes to the trouble of providing a
special, 20 clock memory cycle for a single instruction, TAS, which simply
tests and sets the MSB of the operand byte.  With the proper hardware looking
at the CPU status, it should be easy to ensure atomic access during any "rmw"
cycle, which is used orthogonally throughout the Nati instruction set.
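
What a test-and-set primitive buys you in software is typically a spin
lock.  A minimal sketch using C11's atomic_flag, which is that language's
test-and-set object; the tas_lock type and function names are hypothetical,
and nothing here is specific to the 68000 TAS or to the ILO line.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock type wrapping a single test-and-set flag. */
typedef struct { atomic_flag held; } tas_lock;
#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* One test-and-set: true if we took the lock, false if it was held. */
bool tas_try_lock(tas_lock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set(&l->held);
}

/* Spin until the test-and-set finds the flag clear. */
void tas_acquire(tas_lock *l)
{
    assert(l != NULL);
    while (atomic_flag_test_and_set(&l->held))
        ;  /* busy-wait; the flag was already set by someone else */
}

void tas_unlock(tas_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}
```

A lock is declared as "tas_lock l = TAS_LOCK_INIT;" and the critical
section goes between tas_acquire(&l) and tas_unlock(&l).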

It is interesting to note that all three of the papers presented in the Usenix
Conference Multi-Processing session were on Nati processors, although neither
machine evidently implemented an atomic "rmw" access class.
-- 
:::::: Jan Steinman		Box 1000, MS 61-161	(w)503/685-2843 ::::::
:::::: tektronix!tekecs!jans	Wilsonville, OR 97070	(h)503/657-7703 ::::::

jqj@cornell.UUCP (J Q Johnson) (06/25/85)

In article <482@cmcl2.UUCP> edler@cmcl2.UUCP (Jan Edler) writes:
>Even if you have a uniprocessor with instructions like add-to-memory,
>and those instructions are atomic with respect to interrupts,
>it is probably not possible to take advantage of this property when
>writing in a language other than assembly.  The programmer would not generally
>know when the "atomic" instructions would be used, and when the compiler
>might optimize them away, so he/she wouldn't be able to depend on them.

I don't understand.  Presumably such instructions would be generated by
corresponding high-level constructs in your favorite concurrent programming
language (e.g. all operations on variables declared as semaphores use
the atomic instructions).  If the parser can figure out when you intend
to take advantage of such a feature, it can certainly tell the optimizer!
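
A sketch of the idea in C11 terms, where the declaration carries the
intent: operations on an _Atomic object must be compiled as indivisible
read-modify-writes, so the optimizer is not free to split them.  The
variable and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared count; _Atomic tells the compiler every access
 * to it must be indivisible. */
static _Atomic int free_slots = 0;

/* Compound assignment on an _Atomic object is a single atomic
 * read-modify-write, not a separate load, add, and store. */
void release_slots(int n)
{
    assert(n >= 0);
    free_slots += n;
}

/* Postfix ++ on an _Atomic object is likewise atomic. */
void release_one_slot(void)
{
    free_slots++;
}

int current_slots(void)
{
    return free_slots;  /* atomic load */
}
```

With a plain "int free_slots" the same source would compile, but with no
such guarantee.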

edler@cmcl2.UUCP (Jan Edler) (07/08/85)

In article <2693@cornell.UUCP> jqj@gvax.UUCP (J Q Johnson) writes:
>In article <482@cmcl2.UUCP> edler@cmcl2.UUCP (Jan Edler) writes:
>>...The programmer would not generally
>>know when the "atomic" instructions would be used, and when the compiler
>>might optimize them away, so he/she wouldn't be able to depend on them.
>...
>Presumably such instructions would be generated by
>corresponding high-level constructs in your favorite concurrent programming
>language (e.g. all operations on variables declared as semaphores use
>the atomic instructions)....

Correct.  I have gotten too used to doing parallel programming with
serial programming languages.  On the other hand, concurrent programming
languages are not yet commonly used for OS implementation, especially
in the uniprocessor world.

	Jan Edler			cmcl2!edler
	New York University		edler@nyu.arpa