zik@bruce.cs.monash.OZ.AU (Michael Saleeba) (01/02/91)
When I first started learning about computer architecture, one thing struck me as an incredibly obvious improvement: asynchronism. At that time I was messing around with 6800s and such things, where one clock cycle was the same as one bus cycle. This seemed pretty silly, since some operations didn't even use the bus, yet they still had to hang around for an entire microsecond. It seemed that a sensible architecture would wait only as long as was warranted, not some arbitrary time.

When the 68k came along I was pleased to see that it had provision for asynchronous bus timing. Essentially this was due to the processor clock being so much faster than the usual bus cycle. Even so, most designs used synchronous circuits to simplify design. And these days asynchronism has basically gone by the wayside in favour of things like burst modes.

Still, the basic concept applies. Why not design a processor without any sort of clock at all? A processor based on the idea that you shouldn't have to wait any longer than absolutely necessary for _anything_. So your ALU would have x inputs, y outputs, and also timing inputs and outputs which wait until all operands become available, and then impose a delay of only the minimum length of time necessary to complete that particular operation. An entire CPU based on this system would be pretty complex, but surely today's >1 million transistor pipelined, cached, etc. devices already greatly exceed that complexity.

Take this a step further and you find yourself in weird territory. What if you wait only as long as it takes for outputs to settle, rather than waiting for the rated delay of the device? In this way you could have chips rated on their individual ability, rather than lumped into x-MHz categories. If your 100ns RAM responded in an average of 85ns, you'd reap the benefit! And your machine wouldn't crash on the odd occasion when things took longer. Of course this idea has quite a few problems... but it'd be exciting!
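The clockless ALU described above — wait until all operands arrive, then signal completion as soon as the result is ready — can be mimicked in software. This is a toy Python sketch of the idea (the class name and delays are my own invention, purely for illustration, not hardware):

```python
import threading
import time

class SelfTimedStage:
    """Toy model of a self-timed functional unit: it fires once all
    operands have arrived, and raises 'done' the moment the result is
    ready -- there is no clock anywhere in the model."""

    def __init__(self, func, delay):
        self.func = func              # the operation (e.g. addition)
        self.delay = delay            # this unit's actual settle time
        self.done = threading.Event() # completion signal, not a clock edge
        self.result = None

    def fire(self, *operands):
        self.done.clear()
        def work():
            time.sleep(self.delay)    # simulate propagation delay
            self.result = self.func(*operands)
            self.done.set()           # announce completion immediately
        threading.Thread(target=work).start()

    def wait(self):
        self.done.wait()              # consumer blocks only as long as needed
        return self.result

adder = SelfTimedStage(lambda a, b: a + b, delay=0.01)
adder.fire(2, 3)
print(adder.wait())   # -> 5
```

A faster unit (smaller delay) simply raises `done` sooner and the whole computation speeds up, with no retiming anywhere else — which is exactly the appeal of the scheme.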
It'd be really nice to be able to accelerate your machine by just popping in a faster processor or faster RAM and watching things just happen faster without any extra twiddles. Now I'm aware of quite a few reasons why totally asynchronous machines haven't been made much, but I can think of work-arounds to nearly all of them. Would anyone like to offer a concrete reason why this system is so little used? Or mention some machines that have used similar systems?

------------------------------------------------------------------------------
Michael Saleeba - sortof postgraduate student - Monash University, Australia
zik@bruce.cs.monash.edu.au
rauletta@gmuvax2.gmu.edu (R. J. Auletta) (01/03/91)
In article <3523@bruce.cs.monash.OZ.AU> zik@bruce.cs.monash.OZ.AU (Michael Saleeba) writes:
>When I first started learning about computer architecture one thing struck
>me as an incredibly obvious improvement - asynchronism. At that time I

You may wish to look at the paper "The Design of an Asynchronous Microprocessor" by A. Martin, S. Burns, et al., in Advanced Research in VLSI, Proceedings of the Decennial Caltech Conference on VLSI, March 1989, or "The Design of a Delay-Insensitive uP: An Example of Circuit Synthesis by Program Transformation" by A. Martin, in Hardware Specification, Verification and Synthesis: Mathematical Aspects, Lecture Notes in Computer Science Vol. 408.

--R. J. Auletta
kyriazis@iear.arts.rpi.edu (George Kyriazis) (01/03/91)
In article <3523@bruce.cs.monash.OZ.AU> zik@bruce.cs.monash.OZ.AU (Michael Saleeba) writes:
>Still, the basic concept still applies. Why not design a processor without
>any sort of clock at all?

This is already done at Caltech. There is a team (I don't remember names, but I can give the reference later) that produced a microprocessor made out of asynchronous logic. It worked quite fast, and when you cooled it down, it worked even faster. It was built from the top down, using communicating processes to simulate the different design blocks and the message exchange between them. They claim it was a successful design.

>Now I'm aware of quite a few reasons why totally asynchronous machines haven't
>been made much, but I can think of work-arounds to nearly all of them. Would
>anyone like to offer a concrete reason why this system is so little used? Or
>mention some machines that have used similar systems?

Such systems are a pain to design and eat up a lot of wire area. You not only have to say what the value of a signal is, but also whether the signal has arrived or not. This is usually done with 2-rail logic and/or some additional 2-cycle or 4-cycle handshaking protocols. There is a chapter in Mead & Conway's book "Introduction to VLSI Systems" that is devoted to asynchronous VLSI circuits. You might want to take a look.

Disclaimer: I don't know too much about asynchronous circuits, but I know the above was true a while back (1 to 2 years ago). Things might have changed by now.

----------------------------------------------------------------------
George Kyriazis
kyriazis@rdrc.rpi.edu  kyriazis@iear.arts.rpi.edu
petera@chook.adelaide.edu.au (Peter Ashenden) (01/03/91)
Another interesting reference on the topic is Ivan Sutherland's Turing Award Lecture: "Micropipelines", Communications of the ACM, Vol 32 No 6 (June 1989), pp 720-738.

Peter A
naumann@autarch.acsu.buffalo.edu (Dirk Naumann) (01/04/91)
There are a number of people working on this subject right now. There is a paper from the University of Utah at Salt Lake City which does a pretty good survey of available techniques, including A. Martin's approach to self-timed circuits. If you are interested I can send you the complete references.

--
Dirk Naumann  naumann@eng.buffalo.edu, ECE Department, SUNY at Buffalo
naumann@autarch.acsu.buffalo.edu (Dirk Naumann) (01/04/91)
Although it is right to say that totally asynchronous systems are hard to design, this cannot be said about self-timed circuits. In conventional asynchronous design you have to deal with races and hazards, which make designing quite an experience. This is different in self-timed designs like the ones designed by Martin et al. Races and hazards are eliminated by design, since, thanks to the four-phase handshaking, two signals never change their value at the same time. Another advantage which comes to mind is that this method of designing self-timed circuits can be automated.

Ada is also based on message passing to communicate with parallel processes. Couldn't self-timed circuits be described in Ada? Something like this has been done by Sutherland and ???. They were using INMOS occam to describe an asynchronous (self-timed?) circuit.

--
Dirk Naumann  naumann@eng.buffalo.edu, ECE Department, SUNY at Buffalo
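The race-freedom claim above comes from the strict ordering of the four-phase (return-to-zero) handshake: req rises, ack rises, req falls, ack falls, and at each step exactly one signal changes. A small Python sketch of one transfer, tracing the phases (the channel representation is my own simplification, not Martin's formalism):

```python
# Four-phase (return-to-zero) handshake between a sender and a receiver.
# Phases: req rises -> ack rises -> req falls -> ack falls.
# At each step exactly one signal changes, which is why req/ack races
# cannot occur by construction.

def four_phase_transfer(channel, value, trace):
    channel["data"] = value
    channel["req"] = 1; trace.append(("req", 1))  # 1: sender asserts request
    received = channel["data"]                    #    receiver latches data
    channel["ack"] = 1; trace.append(("ack", 1))  # 2: receiver acknowledges
    channel["req"] = 0; trace.append(("req", 0))  # 3: sender withdraws request
    channel["ack"] = 0; trace.append(("ack", 0))  # 4: receiver resets; channel idle
    return received

channel = {"req": 0, "ack": 0, "data": None}
trace = []
got = four_phase_transfer(channel, 0xA5, trace)
print(hex(got))   # 0xa5
print(trace)      # [('req', 1), ('ack', 1), ('req', 0), ('ack', 0)]
```

Note that the channel returns to the all-zero state after every transfer, so the next transfer starts from the same idle condition — that return-to-zero step is what distinguishes four-phase from the two-phase (transition-signalling) protocols.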
daveh@cbmvax.commodore.com (Dave Haynie) (01/08/91)
In article <3523@bruce.cs.monash.OZ.AU> zik@bruce.cs.monash.OZ.AU (Michael Saleeba) writes:
>When I first started learning about computer architecture one thing struck
>me as an incredibly obvious improvement - asynchronism. At that time I
>was messing around with 6800s and such things where one clock cycle was
>the same as one bus cycle. This seemed pretty silly since some operations
>didn't even use the bus, yet they still had to hang around for an entire
>microsecond. It seemed that a sensible architecture would only wait for as
>long as was warranted, not some arbitrary time.

Well, in lots of systems, these kinds of chips were driven in a somewhat asynchronous manner. In several 6502 family systems I worked on (same bus interface as a 6800 for the most part), we played games with the CPU clock. Basically, it would run ordinary cycles at the fastest clock speeds around, stretching part of the cycle as necessary to deal with slower devices. On such systems, real wait states were both hard to add and inefficient, so this was always the best approach. Modern CPUs are generally too dynamic to deal with clock stretching, and it's a bad idea anyway, since with big pipelines you may have six things happening at once, only one of which is an external bus cycle.

>When the 68k came along I was pleased to see that it had provision for
>asynchronous bus timing. Essentially this was due to the processor clock being
>so much faster than the usual bus cycle. Even so, most designs used
>synchronous circuits to simplify design. And these days asynchronism has
>basically gone by the wayside in favour of things like burst modes.
>Still, the basic concept still applies. Why not design a processor without
>any sort of clock at all? A processor which is based on the thought that
>you shouldn't have to wait any longer than absolutely necessary for _anything_.

It might be kind of difficult for a processor to work this way.
The hardest part of an asynchronous system is generating the timings -- delay lines are easy to buy for a board-level design, but did you ever try to build a really tight one on a chip? A clock, on the other hand, is easy to build timers, state machines, etc. with.

I think asynchronous systems certainly have their place, though. In fact, the Amiga 3000 expansion bus protocol (called the "Zorro III" bus) I designed is completely asynchronous. A bus master and bus slave negotiate the various phases of a bus cycle with different strobe lines. So a bus cycle need only take as long as the fastest bus master/slave combination requires, but if slower devices are on the bus, the cycles are naturally extended. Of course, reality sets in when you have to put something synchronous, like a 68030, on as bus master. The theoretical maximum speed of the bus is 50 MB/s (excluding burst cycles); the maximum speed the current 68030 bus-master implementation can hit is around 20 MB/s. Still, many I/O and memory devices work very naturally as asynchronous slave devices.

>It'd be really nice to be able to accelerate your machine by just popping in
>a faster processor or faster RAM and watching things just happen faster without
>any extra twiddles.

In theory, that's what happens on a Zorro III bus. If you add a faster bus master, any memory boards capable of going faster do so. Then you add faster memory, and jump up again. Of course, if the bus master is a 68030 or similar synchronous device, things get quantized into 68030 cycles -- you'll run in 40ns quanta on the current 25MHz system, though that's just an implementation detail, not part of the bus specification, and a different bus master might work much closer to the asynchronous ideal.

> Michael Saleeba - sortof postgraduate student - Monash University, Australia
> zik@bruce.cs.monash.edu.au

--
Dave Haynie  Commodore-Amiga (Amiga 3000)  "The Crew That Never Rests"
{uunet|pyramid|rutgers}!cbmvax!daveh  PLINK: hazy  BIX: hazy
"Don't worry, 'bout a thing. 'Cause every little thing, gonna be alright" -Bob Marley
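The bus behaviour Dave describes — each cycle lasting exactly as long as that particular slave needs, with slow devices stretching the cycle simply by acknowledging later — can be mimicked with a toy model. The device names and delays below are invented for illustration; this is not the Zorro III protocol itself:

```python
import time

class Slave:
    """Toy asynchronous bus slave: it 'acknowledges' (returns) only
    after its own access time has elapsed, so the cycle length is set
    by the device, not by a clock."""

    def __init__(self, name, access_time, memory):
        self.name = name
        self.access_time = access_time
        self.memory = memory

    def read(self, addr):
        time.sleep(self.access_time)  # slower devices stretch the cycle...
        return self.memory[addr]      # ...simply by answering later

def bus_read(slave, addr):
    """Master starts a cycle and waits for the slave's acknowledge."""
    t0 = time.perf_counter()
    data = slave.read(addr)           # cycle ends whenever the slave answers
    return data, time.perf_counter() - t0

fast = Slave("fast RAM", 0.001, {0: 111})
slow = Slave("slow I/O", 0.010, {0: 222})
for dev in (fast, slow):
    data, elapsed = bus_read(dev, 0)
    print(f"{dev.name}: data={data}, cycle took {elapsed * 1000:.1f} ms")
```

Swap in a faster `Slave` and every read to it completes sooner with no other change — the software analogue of "popping in faster RAM and watching things just happen faster".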
henry@zoo.toronto.edu (Henry Spencer) (01/09/91)
In article <17214@cbmvax.commodore.com> daveh@cbmvax.commodore.com (Dave Haynie) writes:
>... In several 6502 family systems I worked on (same bus
>interface as a 6800 for the most part), we played games with the CPU clock.
>Basically, it would run ordinary cycles at the fastest clock speeds around,
>stretching part of the system as necessary...

Not unknown even in (somewhat) more modern machines. The early Sun 3's did 68020 memory accesses with 1.5, rather than 2, wait states by a similar trick. (Folks who were there have said, roughly, "we could have gotten it down to 1 if we'd really tried, but after years of building machines that were on the ragged edge of timing specs, there was considerable interest in a more robust design that would be easier to build in quantity".)

--
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 | henry@zoo.toronto.edu   utzoo!henry