[comp.arch] More von N

pjd@demon.siemens.com (dr. funk) (03/29/91)

While we are sharing course notes on vonNeumann machines....
Here's mine from an undergrad course that I taught in '84.
[My thinking has shifted some since then so save the flames.]

drongowski/siemens corporate research

**************************************************************

von Neumann Computer Model
ECMP 338 Spring 1984
P.J. Drongowski

References.

  Preliminary discussion of the logical design of an
  electronic computing instrument (1946)
  Arthur W. Banks, Herman H. Goldstine, John von Neumann
  Reprinted in "Computer Structures: Readings and Examples.   

  The Computer from Pascal to von Neumann, Herman H. Goldstine,
  Princeton University Press, 1972

  The Origins of Digital Computers, Brian Randall (editor),
  Springer-Verlag, 1975

Computer architecture checklist -- Or what questions to ask
when evaluating a machine for the first time.

  * What is the overall structure of the machine (from the
    programmer's viewpoint?)
  * What primitive datatypes are supported by the machine?
    + Examples: Integer, unsigned integer, character, etc.
    + How are these datatypes represented in the machine?
  * How many addresses per instruction are there? This will
    have a substantial effect on the style of programming.
  * What are the addressing modes?
  * What are the instructions?
  * How is I/O performed? Are there special I/O instructions?
    How are interrupts performed?
  * What exception conditions are detected and how are they
    reported to the operating system or user programs?

A really hot computer!

  * 40 bits per data word.
  * 20 bits per instruction.
  * 4096 word memory.
  * 5-50 microsecond memory access time.
  * 0.5 MHz clock rate.
  * 20,000 operations per second.

The technology of 1946.

  * Vacuum tube logic circuits.
  * Memory.
    + Vacuum tube flip/flops.
    + RCA Selectron storage tubes.
    + Mercury delay lines.

Number system.

  * Binary.
    + Simplicity.
    + Speed.
    + Decision making tends to be logical.
  * Whay not decimal?
    + Can convert from decimal to binary.
    + Decimal hardware is more complicated.

von Neumann computer is composed of four major units.

  + Arithmetic unit.
  + Memory.
  + Control unit.
  + Input and output devices.

Arithmetic organ.
  + Performs primitive arithmetic and logical operations.
  + Trade-off: Native vs. coded operations.
    - Native: Fast, requires more hardware.
    - Coded: Slow, less hardware in implementation.
  + Integer arithemetic only.
  + Floating point was considered too expensive.

Memory.

  + A flip/flop memory was regarded as impractical.
  + Delay lines were rejected (dynamic memory.)
  + RCA Selectron tubes were the chosen technology.
    - Can store 4096 bits.
    - Put 40 tubes in parallel.
    - Use "function tables" to address the memory.
    - Buffer the address and data in flip/flop registers.
  + Memory characteristics are parallel, random access.
    - Serial memory presents data one bit at a time and is slower.
    - EDVAC was a serial machine, consistent with serial
      delay line memory.

Control unit.

  + Compromise between instruction set and hardware complexity.
  + Instructions consist of:
    - Operation code.
    - Memory addresses and mode bits.
  + Sequential instruction execution.
  + Unconditional and conditional branches.
    - Needed for decision making.
    - Loops (iterative execution.)

Characteristics of the BGN computer (1946)

  * Instructions & data are stored together in memory.
  * Instructions & data are binary codes (indistinguishable.)
  * A linearly addressable memory of fixed length words.
  * Internal registers for counting and intermediate results.
  * A program counter to sequence instructions.
  * Unconditional and conditional goto's.
  * Arithmetic and logical instructions that operate on
    parallel words.
  * All system timing is derived from a single clock
    (including I/O transfers.)

Instruction set.

  * Single address per instruction.
  * Two instruction stored per 40-bit memory word.
    - Provided one instruction lookahead.
    - Reduced number of instruction read operations.
  * Engineering principle for instruction set design -- Incorporate
    only those features that
    - are necessary to have a complete system, or
    - are highly convenient because of the frequency with which
      they occur in application or mathematical practice.

Addresses per instruction -- an important machine characteristic.

  0-address stack machines (or pure register) machines.
  1-address machine.
    - Only a small amount of memory is used for address information.
    - Most operations read a value from memory, combine the value
      with an AC value, and leave the result in the AC.
  2-address machine.
    - Essentially a compromise between 1 and 2 address machines.
    - Memory to memory MOVE instructions are possible.
    - PDP-11 is an example of a 2-address machine.
  3-address machine.
    - Three addresses are need for dyadic operations such as addition.
    - A <- B + C where A, B, and C are in memory.
    - Easy to compile 3-address code.
    - Results usually remain in a register, so results are rarily
      written back into memory.
  4-address machine.
    - EDVAC is an example.
    - Has the usual three address operation.
    - 4th address was next instruction pointer.
    - Chaotic ordering of instructions in memory is usually
      not required.
    - Many horizontally programmed micro-engines have an
      explicit next address pointer.

IAS instruction set.

  register AC<39:0>,     ! Accumulator
           R<39:0>,      ! Arithmetic register
           CC<11:0>;     ! Control counter

  memory   M[4095:0]<39:0>;

  AC <- M[x]                 Load AC from memory
  AC <- -M[x]                Load complement of value from memory
  AC <- abs M[x]             Load absoluate value from memory
  AC <- - abs M[x]
  AC <- AC + M[x]            Add
  AC <- AC - M[x]            Subtract
  AC <- AC + abs M[x]
  AC <- AC - abs M[x]
  R <- M[x]                  Load R from memory

  AC <- R * M[x]<78:40>;     Multiply
  R <- R * M[x]<39:0>; next

  R <- AC / M[x];            Divide
  AC <- AC mod M[x]; next

  CC <- M[x]<39:20>          Jump (left-hand)
  CC <- M[x]<19:0>           Jump (right-hand)

  if AC >= 0 then CC <- CC <- M[x]<39:20>
  if AC >= 0 then CC <- CC <- M[x]<19:0>

  M[x] <- AC                 Store to memory
  M[x]<39:26> <- AC<39:26>   Store address (left-hand)
  M[x]<19:6> <- AC<39:26>    Store address (right-hand)

  AC <- AC * 2               Left shift
  AC <- AC / 2               Right shift

Self-modifying code.

  + Instructions and data are stored in same memory and both can
    be manipulated.
  + Instructions can be changed to point to new operands (arguments.)
  + Used in IAS to pass subroutine arguments.
  + Eventually replaced by index registers (which first appeared
    in the Manchester MARK I machine.)
  + Arguments against self-modifying code.
    - Difficult to debug.
    - Cannot have pure or reentrant procedures.

A self consistent set of machine principles --
characteristics that reinforce each other.

  * Linear memory, synchronous control, sequential program
    flow, goto's, look step input/output.
  * Stored program concept, fixed length words, parallel
    operations.
  * The "double channel" organization:

          Mp --- Pc --- I/O devices

    centralized synchronous control, sequential programming,
    lock step I/O.

Programming.

  * For efficiency, programming languages were designed with
    a BGN bias.
  * However, our view of programming and problem solving has
    changed.
    + Recursion is an extremely useful device.
    + Datatypes are richer. (Sets, bags, strings, sequences, etc.)
    + The solutions of some problems decompose nicely into
      parallel programs.
    + Flexible control structures (e.g., goal directed control,
      dataflow, etc.) also simply problem solving.
    + Verification and formal program properties are important.
    + Infinite precision arithmetic (without round off error) is
      desirable.
  * Our programming style is becoming incompatible with the BGN!
    + The linear memory is incompatible with the new datatypes.
    + The machine operations are too elemental (too weak.)

What features are missing from this discussion of IAS?

  + IAS used special instructions for controlling I/O devices.
  + Direct memory access I/O was not invented yet.
  + Concurrent I/O and interrupts were not implemented either.

**************************************************
* These comments should precede dataflow section *
**************************************************

Improvements in technology (Optional.)

  * BGN tried to maintain a balance between processor and
    memory speed.
  * When technology advanced, the rate imbalance became worse.

       Year  Machine       Clock     Access time  Technology
       ----  -------       -----     -----------  ----------
       1950  Manchester              8   usec     Storage tube
             EDSAC         0.5 MHz   40  usec     Delay line
             Whirlwind I   1-2 MHz   64  usec     Drum
       1955  IBM 704       1   MHz   12  usec     Core
       1960  IBM 7090      4.5 MHz   2.2 usec     Core
       1965  CDC 6600      40  MHz   1   usec     Core
       1968  CDC 7600      60  MHz   .25 usec     Core
  * Processors were "kludged up" to work around the
    "channel bottleneck."
      + General register set machines are the rage.
      + Instruction look ahead.
        - Single instruction buffer.   IBM 7094 (1964.) 
        - Multiple instruction buffer. CDC 6600 (1965.) 
      + Instruction and data cache memory. IBM 360/85 (1968.)
      + Pipelining. IBM 360/91 (1965.)

Concurrency (optional.)

  * Unfortunately, engineers began to electrively change some of
    these principles. These changes caused unavoidable conflicts.
  * Pipelining and concurrent I/O and interrupts were invented to
    increase parallelism (dissatisfaction with synchronous,
    sequential control.)
  * But, a theory dealing with concurrent computing had not yet
    been developed!
    + Operating systems and hardware began to fail mysteriously
      due to synchronization failure.
    + One pipelined machine was constructed which could not
      determine where an interrupt occured within the instruction
      stream!

Technological constraints of VLSI systems (Optional.)

  * Space, speed and power can be traded against each other.
  * Active circuitry is cheap; wires are expensive.
    + Off-chip communications are very slow.
    + Wiring on-chip is a painfully expensive process.
    + Silicon wires are subject to "diffusion delays."
  * Concurrency is easily bought.
  * Global signals (power and ground?) are not very easily
    distributed.
  * A global clock is difficult to distribute. Synchronous timing
    is difficult to verify.

New machine principles (Optional.)

  * New self consistent principles must be found that fit the
    modern view of programming and VLSI technology.
  * Glushkov, et al.
    + Machines should be extensible for new datatypes and
      control disciplines.
    + If the operands for an operation are available, the
      operation should be performed.
    + Memory should be restructurable to support new datatypes.
    + There are no limits on the number of machine elements.
    + The computer organization is reprogrammable.
  * Davis (among others.)
    + Only asynchronous, distributed control machines can be
      easily expanded.
    + No module can determine the total system state.
    + Simultaneity cannot be enforced.

********************************************************************


--
paul j. drongowski
siemens corporate research inc    pjd@demon.siemens.com
princeton, new jersey 08540       (617) 734-6547