brooks@lll-crg.ARPA (Eugene D. Brooks III) (06/09/85)
> And furthermore, the orthogonal sequence is normally atomic;
> in an OS kernel the non-orthogonal sequence might easily have to
> be protected by a "disable/enable interrupt" sequence around it,
> or "test-and-set" or some such in a multi-processor system
> (e.g., "a" and "b" might be global vars).
> Multi-process user-programs would need "enter/exit monitor" or
> "block-on-semaphore" sequences. Besides being a pain (sometimes
> a royal pain) this has the potential for eating a lot of CPU time.
> --

Considerations for multiprocessing are one of the strongest arguments
in favor of a load/store type of instruction set. The fundamental
problem to be overcome in a multiprocessor is memory latency. You
increase efficiency in an environment with high memory latency by
using a load/store type of instruction set in conjunction with a
processor composed of pipelined functional units and careful
instruction ordering. For example:

	a += b;

	load	r0,_a
	load	r1,_b
	add	r0,r1
	store	r0,_a

The performance gain is achieved when there is more work to do.
For example:

	a += b;
	c += d;

	load	r0,_a
	load	r1,_b
	load	r2,_c
	load	r3,_d
	add	r0,r1
	add	r2,r3
	store	r0,_a
	store	r2,_c

The loads overlap their latencies, resulting in higher performance
than is possible with the sequence

	add	_a,_b
	add	_c,_d
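
The overlap argument can be put into rough numbers with a toy cycle
count. What follows is only a sketch under assumptions that are not in
the post: memory is pipelined and accepts one request per cycle, a load
result arrives `lat` cycles after issue, an add and a store issue each
take one cycle, and the memory-to-memory machine finishes each
instruction before starting the next. The function names
mem_to_mem_cycles and load_store_cycles are made up for illustration.

```c
#include <assert.h>

/* Toy model, not a description of any real machine.
 * n   = number of independent "x += y" statements
 * lat = cycles between issuing a load and its data arriving (>= 1)
 */

/* Memory-to-memory "add _a,_b": both operand fetches are issued back
 * to back (cycles 0 and 1), the second arrives at cycle 1 + lat, then
 * one cycle for the add and one to issue the store. Instructions are
 * assumed not to overlap each other. */
long mem_to_mem_cycles(int n, int lat)
{
    assert(n >= 0 && lat >= 1);
    return (long)n * (lat + 3);
}

/* Load/store sequence with all 2n loads hoisted to the front, then the
 * n adds in order, then the n stores. */
long load_store_cycles(int n, int lat)
{
    assert(n >= 0 && lat >= 1);
    long t = 2L * n;                      /* all loads issued by cycle 2n */
    for (int i = 0; i < n; i++) {
        long ready = (2L * i + 1) + lat;  /* second operand of add i arrives */
        long start = (t > ready) ? t : ready;
        t = start + 1;                    /* the add itself takes one cycle */
    }
    return t + n;                         /* one issue cycle per store */
}
```

Under these assumptions, with lat = 10, a single statement costs 13
cycles either way, which matches the remark that the gain only shows up
when there is more work to do. For the two-statement case the model
gives 26 cycles for the memory-to-memory pair and 16 for the reordered
load/store sequence. The absolute numbers mean little; the point is
that the load/store count grows by about 3 cycles per extra statement,
while the memory-to-memory count grows by lat + 3.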