pjd@demon.siemens.com (dr. funk) (03/29/91)
While we are sharing course notes on vonNeumann machines....
Here's mine from an undergrad course that I taught in '84.
[My thinking has shifted some since then so save the flames.]
drongowski/siemens corporate research
**************************************************************
von Neumann Computer Model
ECMP 338 Spring 1984
P.J. Drongowski
References.
Preliminary discussion of the logical design of an
electronic computing instrument (1946)
Arthur W. Banks, Herman H. Goldstine, John von Neumann
Reprinted in "Computer Structures: Readings and Examples.
The Computer from Pascal to von Neumann, Herman H. Goldstine,
Princeton University Press, 1972
The Origins of Digital Computers, Brian Randall (editor),
Springer-Verlag, 1975
Computer architecture checklist -- Or what questions to ask
when evaluating a machine for the first time.
* What is the overall structure of the machine (from the
programmer's viewpoint?)
* What primitive datatypes are supported by the machine?
+ Examples: Integer, unsigned integer, character, etc.
+ How are these datatypes represented in the machine?
* How many addresses per instruction are there? This will
have a substantial effect on the style of programming.
* What are the addressing modes?
* What are the instructions?
* How is I/O performed? Are there special I/O instructions?
How are interrupts performed?
* What exception conditions are detected and how are they
reported to the operating system or user programs?
A really hot computer!
* 40 bits per data word.
* 20 bits per instruction.
* 4096 word memory.
* 5-50 microsecond memory access time.
* 0.5 MHz clock rate.
* 20,000 operations per second.
The technology of 1946.
* Vacuum tube logic circuits.
* Memory.
+ Vacuum tube flip/flops.
+ RCA Selectron storage tubes.
+ Mercury delay lines.
Number system.
* Binary.
+ Simplicity.
+ Speed.
+ Decision making tends to be logical.
* Whay not decimal?
+ Can convert from decimal to binary.
+ Decimal hardware is more complicated.
von Neumann computer is composed of four major units.
+ Arithmetic unit.
+ Memory.
+ Control unit.
+ Input and output devices.
Arithmetic organ.
+ Performs primitive arithmetic and logical operations.
+ Trade-off: Native vs. coded operations.
- Native: Fast, requires more hardware.
- Coded: Slow, less hardware in implementation.
+ Integer arithemetic only.
+ Floating point was considered too expensive.
Memory.
+ A flip/flop memory was regarded as impractical.
+ Delay lines were rejected (dynamic memory.)
+ RCA Selectron tubes were the chosen technology.
- Can store 4096 bits.
- Put 40 tubes in parallel.
- Use "function tables" to address the memory.
- Buffer the address and data in flip/flop registers.
+ Memory characteristics are parallel, random access.
- Serial memory presents data one bit at a time and is slower.
- EDVAC was a serial machine, consistent with serial
delay line memory.
Control unit.
+ Compromise between instruction set and hardware complexity.
+ Instructions consist of:
- Operation code.
- Memory addresses and mode bits.
+ Sequential instruction execution.
+ Unconditional and conditional branches.
- Needed for decision making.
- Loops (iterative execution.)
Characteristics of the BGN computer (1946)
* Instructions & data are stored together in memory.
* Instructions & data are binary codes (indistinguishable.)
* A linearly addressable memory of fixed length words.
* Internal registers for counting and intermediate results.
* A program counter to sequence instructions.
* Unconditional and conditional goto's.
* Arithmetic and logical instructions that operate on
parallel words.
* All system timing is derived from a single clock
(including I/O transfers.)
Instruction set.
* Single address per instruction.
* Two instruction stored per 40-bit memory word.
- Provided one instruction lookahead.
- Reduced number of instruction read operations.
* Engineering principle for instruction set design -- Incorporate
only those features that
- are necessary to have a complete system, or
- are highly convenient because of the frequency with which
they occur in application or mathematical practice.
Addresses per instruction -- an important machine characteristic.
0-address stack machines (or pure register) machines.
1-address machine.
- Only a small amount of memory is used for address information.
- Most operations read a value from memory, combine the value
with an AC value, and leave the result in the AC.
2-address machine.
- Essentially a compromise between 1 and 2 address machines.
- Memory to memory MOVE instructions are possible.
- PDP-11 is an example of a 2-address machine.
3-address machine.
- Three addresses are need for dyadic operations such as addition.
- A <- B + C where A, B, and C are in memory.
- Easy to compile 3-address code.
- Results usually remain in a register, so results are rarily
written back into memory.
4-address machine.
- EDVAC is an example.
- Has the usual three address operation.
- 4th address was next instruction pointer.
- Chaotic ordering of instructions in memory is usually
not required.
- Many horizontally programmed micro-engines have an
explicit next address pointer.
IAS instruction set.
register AC<39:0>, ! Accumulator
R<39:0>, ! Arithmetic register
CC<11:0>; ! Control counter
memory M[4095:0]<39:0>;
AC <- M[x] Load AC from memory
AC <- -M[x] Load complement of value from memory
AC <- abs M[x] Load absoluate value from memory
AC <- - abs M[x]
AC <- AC + M[x] Add
AC <- AC - M[x] Subtract
AC <- AC + abs M[x]
AC <- AC - abs M[x]
R <- M[x] Load R from memory
AC <- R * M[x]<78:40>; Multiply
R <- R * M[x]<39:0>; next
R <- AC / M[x]; Divide
AC <- AC mod M[x]; next
CC <- M[x]<39:20> Jump (left-hand)
CC <- M[x]<19:0> Jump (right-hand)
if AC >= 0 then CC <- CC <- M[x]<39:20>
if AC >= 0 then CC <- CC <- M[x]<19:0>
M[x] <- AC Store to memory
M[x]<39:26> <- AC<39:26> Store address (left-hand)
M[x]<19:6> <- AC<39:26> Store address (right-hand)
AC <- AC * 2 Left shift
AC <- AC / 2 Right shift
Self-modifying code.
+ Instructions and data are stored in same memory and both can
be manipulated.
+ Instructions can be changed to point to new operands (arguments.)
+ Used in IAS to pass subroutine arguments.
+ Eventually replaced by index registers (which first appeared
in the Manchester MARK I machine.)
+ Arguments against self-modifying code.
- Difficult to debug.
- Cannot have pure or reentrant procedures.
A self consistent set of machine principles --
characteristics that reinforce each other.
* Linear memory, synchronous control, sequential program
flow, goto's, look step input/output.
* Stored program concept, fixed length words, parallel
operations.
* The "double channel" organization:
Mp --- Pc --- I/O devices
centralized synchronous control, sequential programming,
lock step I/O.
Programming.
* For efficiency, programming languages were designed with
a BGN bias.
* However, our view of programming and problem solving has
changed.
+ Recursion is an extremely useful device.
+ Datatypes are richer. (Sets, bags, strings, sequences, etc.)
+ The solutions of some problems decompose nicely into
parallel programs.
+ Flexible control structures (e.g., goal directed control,
dataflow, etc.) also simply problem solving.
+ Verification and formal program properties are important.
+ Infinite precision arithmetic (without round off error) is
desirable.
* Our programming style is becoming incompatible with the BGN!
+ The linear memory is incompatible with the new datatypes.
+ The machine operations are too elemental (too weak.)
What features are missing from this discussion of IAS?
+ IAS used special instructions for controlling I/O devices.
+ Direct memory access I/O was not invented yet.
+ Concurrent I/O and interrupts were not implemented either.
**************************************************
* These comments should precede dataflow section *
**************************************************
Improvements in technology (Optional.)
* BGN tried to maintain a balance between processor and
memory speed.
* When technology advanced, the rate imbalance became worse.
Year Machine Clock Access time Technology
---- ------- ----- ----------- ----------
1950 Manchester 8 usec Storage tube
EDSAC 0.5 MHz 40 usec Delay line
Whirlwind I 1-2 MHz 64 usec Drum
1955 IBM 704 1 MHz 12 usec Core
1960 IBM 7090 4.5 MHz 2.2 usec Core
1965 CDC 6600 40 MHz 1 usec Core
1968 CDC 7600 60 MHz .25 usec Core
* Processors were "kludged up" to work around the
"channel bottleneck."
+ General register set machines are the rage.
+ Instruction look ahead.
- Single instruction buffer. IBM 7094 (1964.)
- Multiple instruction buffer. CDC 6600 (1965.)
+ Instruction and data cache memory. IBM 360/85 (1968.)
+ Pipelining. IBM 360/91 (1965.)
Concurrency (optional.)
* Unfortunately, engineers began to electrively change some of
these principles. These changes caused unavoidable conflicts.
* Pipelining and concurrent I/O and interrupts were invented to
increase parallelism (dissatisfaction with synchronous,
sequential control.)
* But, a theory dealing with concurrent computing had not yet
been developed!
+ Operating systems and hardware began to fail mysteriously
due to synchronization failure.
+ One pipelined machine was constructed which could not
determine where an interrupt occured within the instruction
stream!
Technological constraints of VLSI systems (Optional.)
* Space, speed and power can be traded against each other.
* Active circuitry is cheap; wires are expensive.
+ Off-chip communications are very slow.
+ Wiring on-chip is a painfully expensive process.
+ Silicon wires are subject to "diffusion delays."
* Concurrency is easily bought.
* Global signals (power and ground?) are not very easily
distributed.
* A global clock is difficult to distribute. Synchronous timing
is difficult to verify.
New machine principles (Optional.)
* New self consistent principles must be found that fit the
modern view of programming and VLSI technology.
* Glushkov, et al.
+ Machines should be extensible for new datatypes and
control disciplines.
+ If the operands for an operation are available, the
operation should be performed.
+ Memory should be restructurable to support new datatypes.
+ There are no limits on the number of machine elements.
+ The computer organization is reprogrammable.
* Davis (among others.)
+ Only asynchronous, distributed control machines can be
easily expanded.
+ No module can determine the total system state.
+ Simultaneity cannot be enforced.
********************************************************************
--
paul j. drongowski
siemens corporate research inc pjd@demon.siemens.com
princeton, new jersey 08540 (617) 734-6547