koopman@A.GP.CS.CMU.EDU (Philip Koopman) (04/26/88)
One aspect of RISC processors for real time control that I have not
seen discussed is the conflict between deadline scheduling and the
statistical nature of RISC performance figures.

Real-time control programs often have a situation where only X
microseconds are available to perform a task.  Therefore, the code to
perform the task must be GUARANTEED to complete within X microseconds.
In real-time control, a late answer is a wrong answer.

The problem with RISC designs is that they promise a performance of Y
MIPS in the average case, over large sections of code and relatively
long periods of time.  It seems to me that this is not an applicable
performance measure for real-time control.  What is more important is
worst-case performance (maximum possible cache misses for that
program, branch-target buffer misses, etc.).  It may be the case that
a slower processor with uniform performance can be rated at a higher
usable MIPS rate than a RISC processor with inconsistent instantaneous
performance.

So, what is a real-time control designer to do?

-- De-rate the RISC MIPS ratings to assume 100% cache misses?

-- Use (probably) non-existent tools to compute worst-case
   program execution time under all possible conditions?

-- Not use RISC in an environment with short deadline events?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ Phil Koopman                5551 Beacon St.           ~
~                             Pittsburgh, PA 15217      ~
~ koopman@faraday.ece.cmu.edu  (preferred address)      ~
~ koopman@a.gp.cs.cmu.edu                               ~
~                                                       ~
~ Disclaimer: I'm a PhD student at CMU, and I do some   ~
~ work for WISC Technologies.                           ~
~ My opinions are my own, etc.                          ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
larry@mips.COM (Larry Weber) (04/26/88)
In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>
>One aspect of RISC processors for real time control that I
>have not seen discussed is the conflict between
>deadline scheduling and the statistical nature of
>RISC performance figures.
>
> ...
>So, what is a real-time control designer to do?
>
>-- De-rate the RISC MIPS ratings to assume 100% cache misses?
>
>-- Use (probably) non-existent tools to compute worst-case
>   program execution time under all possible conditions?
>
>-- Not use RISC in an environment with short deadline events?
>
Cache effects can be present in any machine that has a cache: CISC or RISC.

Answer 1 will provide a general guideline of the effect only if you
know how YOUR application maps onto the MIPS rating.  Even if your
program followed the MIPS rating in a number of trials, you still have
to know how the time is allocated between memory references and other
operations which do not have a statistical nature.

Answer 2 will give a worst-case bound on the performance.  The MIPS
compilers have tools that will inform you of the number of cycles,
instructions, and memory references for a given run of the program.
Computing worst-case times is really a matter of multiplication.  This
answer is really overkill, because not all applications require
worst-case times to be used for every part of the problem.  For
example, assume you had to accept a piece of data and queue it for
processing while interrupts were disabled.  The critical time is how
long interrupts are disabled, because data could be lost in that
period.

Answer 3 is like throwing out the baby with the bath water - this
solution should be generalized to any hardware that has a statistical
nature.  That leaves out the 68020 and '030 too.
--
-Larry Weber
DISCLAIMER: I speak only for myself, and I even deny that.
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!larry, DDD: 408-720-1700, x214
USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
henry@utzoo.uucp (Henry Spencer) (04/26/88)
> So, what is a real-time control designer to do?
The same thing he does with a high-powered CISC: swear loudly, try to
estimate worst-case performance, and contemplate going back to the Z80.
At least RISC instruction times are more or less predictable, unlike those
of, say, the 68020.
More generally, there is a fundamental clash between trying to make the
performance simple and predictable and trying to maximize it by exploiting
regularities in the workload. If you want absolutely predictable speed,
then (for example) you will either have to live without caches or else
manage them very carefully so you know what they're doing. The same applies
to optimizing compilers, buffered I/O devices, asynchronous buses, etc etc.
--
"Noalias must go. This is | Henry Spencer @ U of Toronto Zoology
non-negotiable." --DMR | {ihnp4,decvax,uunet!mnetor}!utzoo!henry
koopman@A.GP.CS.CMU.EDU (Philip Koopman) (04/26/88)
In article <1521@pt.cs.cmu.edu>, koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
> One aspect of RISC processors for real time control that I
> have not seen discussed is the conflict between
> deadline scheduling and the statistical nature of
> RISC performance figures.
> [stuff deleted]

Thanks for the responses so far.  I have received several replies of
the form that any machine with cache has problems with predictability
of performance.  I agree, but that isn't the whole question/answer.  I
thought that RISCs had a higher cache miss rate (in misses per second,
not miss ratio) since they need more instructions, or is this solved
with increased line size/prefetching?

A better question is: is it appropriate to be using a RISC on embedded
applications?  What if you can't afford off-chip cache memory --
doesn't the increased instruction bandwidth required for a RISC cause
problems?  I get the feeling that cache helps a CISC somewhat, but
that a RISC simply dies without a lot of cache -- is that really the
case?

Another concern has to do with program size.  Everything I've seen
says that RISCs have programs about twice as big as CISCs.  What does
that do in an embedded environment?  NO, memory is NOT cheap when it
costs power/weight/cooling/volume/dollars/chip count in a highly
constrained application!

Thanks for the feedback,

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ Phil Koopman                5551 Beacon St.           ~
~                             Pittsburgh, PA 15217      ~
~ koopman@faraday.ece.cmu.edu  (preferred address)      ~
~ koopman@a.gp.cs.cmu.edu                               ~
~                                                       ~
~ Disclaimer: I'm a PhD student at CMU, and I do some   ~
~ work for WISC Technologies.                           ~
~ (No one listens to me anyway!)                        ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
aglew@urbsdc.Urbana.Gould.COM (04/26/88)
>So, what is a real-time control designer to do?
>
>-- De-rate the RISC MIPS ratings to assume 100% cache misses?

You have to do this for CISCs with caches, not just RISCs.

>-- Use (probably) non-existent tools to compute worst-case
>   program execution time under all possible conditions?

In a hard real-time environment you have to do this for CISCs as well
as RISCs.  I don't know of any tools to do this *well* in either camp,
but building them should be considerably easier for a RISC than a
CISC, given the preponderance of short, single-cycle instructions and
the explicitness of timing constraints.  On a CISC you never know what
interlock is going to bite you.

In fact, wasn't this one of the original reasons for RISC - simple
instructions make the performance of code sequences easier to
calculate, and hence easier to choose between in optimization?

>-- Not use RISC in an environment with short deadline events?

I rather think that the GE RPM-40 guys will disagree with you about that...

aglew@gould.com
schmitz@FAS.RI.CMU.EDU (Donald Schmitz) (04/26/88)
In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>Real-time control programs often have a situation where only
>X microseconds are available to perform a task.  Therefore,
>the code to perform the task must be GUARANTEED to complete
>within X microseconds.  In real-time control, a late answer
>is a wrong answer.

This may be straying somewhat from the original point, but what sort
of applications really have such exact timing deadlines?  I have done
a little real-time motion control, using a CPU to implement a discrete
position control law for robot axes, and in general a few percent
deviation in cycle time has next to no effect.  As long as the
deviation is small and well distributed, i.e. delays of no more than
20% and occurring less than 10 sample periods in a row, I can't
imagine a mechanical system reacting to the error.

Don Schmitz (schmitz@fas.ri.cmu.edu)
petolino%joe@Sun.COM (Joe Petolino) (04/26/88)
>One aspect of RISC processors for real time control that I
>have not seen discussed is the conflict between
>deadline scheduling and the statistical nature of
>RISC performance figures.
> . . .
>So, what is a real-time control designer to do?

First (as others have pointed out), this problem has more to do with
having a cache than with using any particular type of processor.  RISC
processors complicate this a little by providing opportunities for
varying levels of optimization for a given piece of code.  However,
once it's cast into machine code, execution time (barring memory
system effects) is quite predictable for most processors (either CISC
or RISC), and could be determined with a good simulator.

You could attack the cache problem by clever system design.  A former
employer of mine at one point contemplated building a RISC-based
system aimed at real-time applications.  Our plan was to use a
set-associative instruction cache, and include a control bit in each
cache set (writable by the operating system) which could 'lock' one of
the elements of the set into the cache: if the bit was set, that cache
block would never get swapped out of the cache (the rest of the set
was still available for 'non-critical' stuff, which would suffer a
higher miss rate due to the reduced cache size).  If you loaded your
response-critical code into the cache, then locked it in, one big
variable went away.  Unfortunately, this system never was built.  Has
anyone else done something like this?

-Joe
bcase@Apple.COM (Brian Case) (04/27/88)
In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>One aspect of RISC processors for real time control that I
>have not seen discussed is the conflict between
>deadline scheduling and the statistical nature of
>RISC performance figures.

??????  And CISC (or whatever you consider an alternative to RISC)
doesn't have the so-called "statistical nature" of performance?!?!

>The problem with RISC designs is that they promise a performance
>of Y MIPS in the average case over large sections of code and
>relatively long periods of time.

??????  How do alternatives to RISC differ?

>What is more important is worst-case performance (maximum
>possible cache misses for that program, branch-target buffer
>misses, etc.)

Worst-case performance is always *most* important for real-time
systems.  Because of fundamental limitations of technology (big DRAMs
are slower than small SRAMs), any processor that runs as fast as the
technology will allow will rely on caching to some degree (I claim).
To the extent that your real-time code can't depend on the cache(s)
containing your working set (probably can't depend on it at all), you
may be better off, in terms of cost, designing the hardware without
caches.  If the caches are on-chip, then you have no choice, of
course.

Now, it *is* possible that, in an environment where the cache(s)
is(are) always missing, cache(s) will actually make the system run
slower.  However, it will be more and more difficult to find any fast
processor, CISC, RISC, or whatever-ISC, without on-chip caches.  In
fact, many CISCs will soon be implemented with a very RISC-like core.

Oops, I guess I could have summarized this whole spiel by simply
saying "your problem isn't RISC, it's statistical techniques in
general.  These techniques are universally used."  Maybe a good old
68000 is your best bet?
bob@pedsga.UUCP (04/27/88)
In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU.UUCP writes:
> { questioning the suitability of RISC processors for Real-Time use }
> ...

It seems to me that it is much *easier* to predict worst-case
performance for RISC processors, because:

1) Most execute one instruction/clock.  You don't have to figure out
   how many cycles each instruction actually takes.
2) Most don't have interruptible instructions.  Who knows how long
   those take?

If you are really concerned about cache misses, you would design your
system so that all the memory was fast enough for the processor.  And
you wouldn't do demand paging either.

Just my opinion.  Bob Weiler.
fdr@joy.ksr.com (Franklin Reynolds) (04/27/88)
Another similar question about RISC vs. real-time is whether the
philosophy of optimizing for the general case instead of the exception
is appropriate.  As I understand it, optimizing for the general case
is fundamental to most RISC designs.  Modern, sophisticated real-time
systems that have to deal with hard time constraints and overload
conditions might be better served by architectures that are optimized
for various exceptional conditions.  You could imagine an architecture
optimized for speedy interrupt handling, context switching, process
ordering, IPC, etc.  This architecture might have advantages for
certain types of real-time applications over designs that optimize for
throughput in the general case.

Franklin Reynolds                   Kendall Square Research Corporation
fdr@ksr.uucp                        Building 300 / Hampshire Street
ksr!fdr@harvard.harvard.edu         One Kendall Square
harvard!ksr!fdr                     Cambridge, MA 02139
jmd@granite.dec.com (John Danskin) (04/28/88)
We have a leetle teeny ucode engine (read: RISC by Weitek) that needs
some things locked into cache (a real-time constraint that involves
the bus hanging if we slip by even one cycle (our fault, not
Weitek's)).

Fortunately, our system uses direct-mapped caches, so we changed the
linker so that modules which should be locked into cache get unique
addresses (modulo the cache size).  This works just fine, and since we
have hardly any of this critical code, it caused only a 2% overall
code growth (because of all of the little holes).
--
John Danskin                  | decwrl!jmd
DEC Technology Development    | (415) 853-6724
100 Hamilton Avenue           | My comments are my own.
Palo Alto, CA 94306           | I do not speak for DEC.
david@daisy.UUCP (David Schachter) (04/28/88)
In article <1534@pt.cs.cmu.edu> schmitz@FAS.RI.CMU.EDU (Donald Schmitz) writes:
>In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>>Real-time control programs often have a situation where only
>>X microseconds are available to perform a task.  Therefore,
>>the code to perform the task must be GUARANTEED to complete
>>within X microseconds.  In real-time control, a late answer
>>is a wrong answer.
>
>This may be straying somewhat from the original point, but what sort of
>applications really have such exact timing deadlines?...
>[I]n general a few percent deviation in cycle
>time has next to no effect.  As long as the deviation is small and well
>distributed, i.e. delays of no more than 20% and occurring less than 10
>sample periods in a row, I can't imagine a mechanical system reacting to
>the error.

Not all real-time systems control mechanical objects.  I wrote code
for a radio-controlled clock.  The microcontroller takes a
non-maskable interrupt every millisecond.  If the interrupt service
routine ever takes more than a millisecond to execute, the results
are:

1) The stack may get trashed, or it may not.
2) The clock will lose a millisecond.
3) Certain I/O ports may not be completely updated.
4) The clock may lose an output character (sending time to the host).
5) The clock may lose input characters (receiving commands from the host).

Depending on the customer's usage of the clock, the result could be as
simple as a traffic light "slipping" a millisecond or as bad as a
wide-area network losing packets and not being able to restart after a
network crash.

I put in code to reset the clock if nested NMIs occur, and I spent a
lot of time counting clocks and doing measurements with an
oscilloscope to ensure the interrupt service routine always takes less
than a millisecond.  Worst-case time: 900 microseconds.  Usual case:
100 microseconds.  Before the work, the clock would often crash for no
apparent reason.  It turned out the previous programmer (this is two
years ago) was allowing the ISR to take more than ten milliseconds
(i.e. nesting NMIs ten levels deep!)

Disclaimer: this article was written by Schroedinger's cat, Bill.
peter@athena.mit.edu (Peter J Desnoyers) (04/28/88)
Problems like this have already cropped up in the modem field, where
you have RISC-like processors (e.g. TMS32020) which require very fast
memory, running code which has to run every sample time, plus a lot of
random code to control the front panel, RS-232, MNP, and other
piddling stuff.  The solution until now was to use an 8-bit micro
(sometimes a 68000) to do the piddling stuff that took up 80-90% of
the code volume, and a signal-processing micro to do the fast stuff,
and give them each their own slow and fast memory, respectively.

Things have changed.  It is now possible to get at least one of these
chips (I think it's the 32020) to do wait states on memory, and
someone (I don't remember who) has now put their MNP implementation
and a few other things on this processor, in slow ROM, while their
signal-processing code runs in fast (20ns?) RAM.  It takes a lot more
ROM space than an eight-bit micro (simple, fixed-length (32-bit?)
instructions, poor handling of anything but integer multiplies and
accumulates), but you still end up with fewer chips, lower cost, and a
negligible load added to the signal processor.

The interesting thing to notice is that there is no need for fast
memory to be used as a cache in an embedded application.  Just load
your time-critical code into fast memory, and your random stuff into
slow memory.  If the time-critical part of the code is huge, then a
cache wouldn't help anyway.

Peter Desnoyers
peter@athena.mit.edu
pardo@june.cs.washington.edu (David Keppel) (04/29/88)
I talked with our local real-time guru, Alan Shaw, who said something
to the effect of (not an exact quote, but I'll try to get the message
across):

  Doing any kind of timing analysis is very hard.  You can't assume in
  your analysis that there's going to be bus contention every memory
  cycle, or your estimated performance is going to look much worse
  than it ever will in practice.  What people really do is come up
  with reasonable figures based on the probability of there being N
  consecutive bus-contention cycles, and make the timing analysis
  based on some number of contention cycles that will happen with a
  probability smaller than the chance of other catastrophic failure.
  Note that this analysis is independent of RISC/CISC or almost
  anything else.

The key point here is that you can measure and estimate
probabilistically, and in practice the failure rate from other sources
(e.g., hardware failures) will be the dominant mode of failure.

;-D on ( Well it looked good when I closed my eyes )  Pardo
rick@pcrat.UUCP (Rick Richardson) (05/01/88)
In article <1532@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>
>A better question is: is it appropriate to be using a RISC
>on embedded applications?  What if you can't afford off-chip cache
>memory -- doesn't the increased instruction bandwidth required
>for a RISC cause problems?  I get the feeling that cache helps a CISC
>somewhat, but that a RISC simply dies without a lot of cache -- is
>that really the case?
>
I'm still looking for the RISC that does ~4K (C language) Dhrystones,
has no cache, clocks around 4 MHz, has a 16-bit bus, can address maybe
1MB, is a power miser, can't do floating point, and costs no more than
$15.  In HUGE quantities.  Just think of the millions and millions of
next-generation consumer products that could use the extra
performance, while still meeting EMI, power consumption, and cost
requirements.

Come on guys, I know that there's a lot of prestige in having the
fastest micro-* around, but there's a LOT of HIGH VOLUME applications
out there that just can't use all that power.  You might sell 10K-100K
of these super-high-performance chips.  Wouldn't you rather sell *tens
of millions*?
--
Rick Richardson, President, PC Research, Inc.
(201) 542-3734 (voice, nights)  OR  (201) 834-1378 (voice, days)
uunet!pcrat!rick (UUCP)    rick%pcrat.uucp@uunet.uu.net (INTERNET)
aglew@urbsdc.Urbana.Gould.COM (05/01/88)
>As far as I know, no one has solved the virtual cache coherency
>problem yet...

There sure are a lot of folk who think they have, though not
commercially (yet).  The virtual cache consistency problem is just
like the physical cache consistency problem, except that you need a
physical index for bus snooping.

[Knowing I'm gonna get flamed :-) ]: of course, Alliant doesn't have
too much to do with cache consistency - after all, the CEs talk to the
same cache, don't they, so they don't have any consistency problems?
But how far can this scale?  I suppose that the IPs have to be kept
coherent, and I believe that's writeback, but the duty cycle doesn't
have to be very high.

aglew@gould.com
bcase@Apple.COM (Brian Case) (05/03/88)
In article <476@pcrat.UUCP> rick@pcrat.UUCP (Rick Richardson) writes:
>I'm still looking for the RISC that does ~4K (C language) Dhrystones,
>has no cache, clocks around 4 Mhz, has a 16 bit bus, can address maybe 1MB,
>is a power miser, can't do floating point, and costs no more than $15.

Oh, that's easy!  The Acorn RISC Machine (ARM).  Yes, I know it has a
32-bit bus now, but just talk to VTI (they have the ARM and use it as
a cell, I think): if you are right about volumes, they'll make a mod
to give it a 16-bit bus.  On every other account, the ARM is what you
want.  I think you could even get it for around $10 instead of $15 (I
think that price is currently available for large quantities).

On second thought, with a 16-bit bus, it might slow down a lot.  It
seems worth looking into, though.
jesup@pawl18.pawl.rpi.edu (Randell E. Jesup) (05/03/88)
In article <476@pcrat.UUCP> rick@pcrat.UUCP (Rick Richardson) writes:
>I'm still looking for the RISC that does ~4K (C language) Dhrystones,
>has no cache, clocks around 4 Mhz, has a 16 bit bus, can address maybe 1MB,
>is a power miser, can't do floating point, and costs no more than $15.

Yeah, and what technology is this wonder-chip implemented in???
Whatever it is, I can think of dozens of Si companies that would give
away all their current facilities for that process.  Oh, and I'm not
even worrying about cost.

Back to reality, it just can't be done, except MAYBE with a
state-of-the-art chip optimized to NOTHING but fast dhrystones (which,
by the way, are a pretty poor predictor for most applications, due to
string handling).  4 MHz is REAL slow.  A 4 MHz RPM-40 would be
equivalent to maybe a 14 MHz 68000 (note: not '020).  At such slow
speeds, CISC chips may well show superiority due to wanting to
maximize the usefulness of every bus cycle.

//    Randell Jesup                 Lunge Software Development
//    Dedicated Amiga Programmer    13 Frear Ave, Troy, NY 12180
\\//  beowulf!lunge!jesup@steinmetz.UUCP     (518) 272-2942
\/    (uunet!steinmetz!beowulf!lunge!jesup)  BIX: rjesup
(-:  The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro  :-)
bcase@Apple.COM (Brian Case) (05/04/88)
In article <833@imagine.PAWL.RPI.EDU> jesup@pawl18.pawl.rpi.edu (Randell E. Jesup) writes:
= Yeah, and what technology is this wonder-chip implemented in???
=Whatever it is, I can think of dozens of Si companies that would give away
=all their current facilites for that process. Oh, and I'm not even worrying
=about cost.
=
= Back to reality, it just can't be done, except MAYBE with a state of
=the art chip optimized to NOTHING but fast dhrystones (which, by the way,
=are a pretty poor predicter for most applications, due to string handling.)
=4 Mhz is REAL slow. A 4Mhz rpm-40 would be equivalent to maybe a 14Mhz
=68000 (note: not '020). At such slow speeds, CISC chips may well show
=superiority due to wanting to maximize the usefulness of every bus cycle.
On the contrary. Let me say it again: the ARM from VTI and ACORN. At low
clock rates (so that memory access time isn't an issue), the ARM gets about
1K dhrystones per MHz (using the rather decent ACORN C compiler). The
process is (was) junky 2 or 3 micron CMOS. Current price for the ARM
(VTI 86000 I think is the part number) is very low in quantity, < $15 I
think. The only problem for meeting the original poster's requirements is
the 32-bit bus of the ARM.
baum@apple.UUCP (Allen J. Baum) (05/04/88)
In article <476@pcrat.UUCP> rick@pcrat.UUCP (Rick Richardson) writes:
>
>I'm still looking for the RISC that does ~4K (C language) Dhrystones,
>has no cache, clocks around 4 Mhz, has a 16 bit bus, can address maybe 1MB,
>is a power miser, can't do floating point, and costs no more than $15.
>
Except for the 16-bit bus, the ARM chip seems to meet your
qualifications.  It looks very good for controller kinds of
applications.  It's simple, small (die size), and therefore cheap.  It
does not require a cache, and knows how to talk to DRAMs with
page-mode access cycles to get good performance with no cache.
--
{decwrl,hplabs,ihnp4}!nsc!apple!baum  (408) 973-3385
steckel@Alliant.COM (Geoff Steckel) (05/04/88)
In article <4921@bloom-beacon.MIT.EDU> peter@athena.mit.edu (Peter J Desnoyers) writes:
>Things have changed.  It is now possible to get at least one of these
>chips (I think it's the 32020) to do wait states on memory, and
>someone (I don't remember who) has now put their MNP implementation
>and a few other things on this processor, in slow ROM, while their
>signal processing code runs in fast (20ns?) RAM.

The scheme mentioned is very close to one with which I am currently
working.  I recently surveyed all the DSP chips for which I could get
documentation.  Only the TI 320xx series have a 'memory access done'
pin.  All the other chips (Moto, AD, NEC, OKI, ...) either have a
programmable number of wait states or assume external program or data
memory is sufficiently fast to work synchronously.  This makes ganging
of DSP chips using shared (peer-to-peer) global memory difficult, and
makes using mixed slow and fast program memory impossible.

The designers seem to assume:
1) All parts of the application must run equally fast.
2) Programs will be small.
3) Data will be small or only accessed a little at a time.
4) The DSP chip will own all resources to which it is connected.
5) Any resources the DSP chip does not own are:
   a) connected via a serial port (a la Transputer, etc.), or
   b) sufficiently unimportant that polling a ready line is good
      enough, or
   c) very fast, or
   d) nonexistent

Can any of the DSP mavens comment on DSP architectures which
1) Can be connected to large (> 64K) shared memories, which the DSP
   may use, but does not own (i.e. must request and be granted access)
   and whose access time has an upper bound but is not deterministic
   below that bound.
2) Can run 'background' tasks (servicing panels, SCSI, etc.) which
   require serious processing but much less than the 'foreground' task
   does, preferably with the code in slow (> 70 ns, cheap!) memory,
   while doing 'foreground' classic DSP?

Right now only TI's 320xx chips seem to have some of the hardware
support, with the large advantage of an extremely narrow program
memory path (16 bits!).  The corresponding disadvantage is an
extremely baroque and asymmetrical instruction set.

The chip described is very close to a general-purpose RISC chip, but
with the following differences:
1) Onboard multiply must be very, very fast (for convolutions, etc.).
2) Sub-wordsize (byte, etc.) performance is not very important.
   DSP almost (ha) never does divides, but millions of multiplies.
3) A barrel shifter is very useful, verging on required.
4) An extended-precision adder for multiply-and-accumulate is vital
   (e.g. if a * b yields 32 bits, at least 34 bits in the sum,
   preferably more like 40!).  You don't have time to check for
   overflow.
5) Floating point is **really** nice, but many applications can be
   bludgeoned into fixed point.  Painfully.  If you do put in floating
   point, make it FAST.  Like 2-3 cycles.
6) Cheaper than the RISC chips are running.  $100/ea in moderate
   quantity.

geoff steckel (steckel@alliant.COM)
sedwards@esunix.UUCP (Scott Edwards) (05/05/88)
From article <1534@pt.cs.cmu.edu>, by schmitz@FAS.RI.CMU.EDU (Donald Schmitz):
> In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>
>>Real-time control programs often have a situation where only
>>X microseconds are available to perform a task. .....
>
> This may be straying somewhat from the original point, but what sort of
> applications really have such exact timing deadlines?  I have done a little
> real-time motion control, ....

I worked on a project a while back that implemented a motion-control
servo loop with a microprocessor, and every time the uP didn't make
the deadline the loop would go unstable and lose all control.  It was
fun to watch!  We finally had to change the time period so that the
processor always completed its job on time, even though in other modes
it was idle 60% of the time.
--
Scott
peter@sugar.UUCP (Peter da Silva) (05/08/88)
In article <1534@pt.cs.cmu.edu>, schmitz@FAS.RI.CMU.EDU.UUCP writes:
> In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip
> Koopman) talks about hard realtime when he writes:
> >Real-time control programs often have a situation where only
> >X microseconds are available to perform a task.
>
> This may be straying somewhat from the original point, but what sort of
> applications really have such exact timing deadlines?

How about jet engine control systems in fighters?  Or the software
that lands the space shuttle?
--
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- "Have you hugged your U wolf today?"  ...!bellcore!tness1!sugar!peter
-- Disclaimer: These aren't mere opinions, these are *values*.
jack@swlabs.UUCP (Jack Bonn) (05/08/88)
From article <1534@pt.cs.cmu.edu>, by schmitz@FAS.RI.CMU.EDU (Donald Schmitz):
> In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>
>>Real-time control programs often have a situation where only
>>X microseconds are available to perform a task. .....
>
> This may be straying somewhat from the original point, but what sort of
> applications really have such exact timing deadlines?  I have done a little
> real-time motion control, ....

The worst system for real-time deadlines I ever worked on was one that
implemented the control functions for a bottle-making machine.  This
wasn't a bottler; it took molten glass and formed it into bottles.  We
had a 2.5 MHz Z-80 and a periodic interrupt whose period was 1 msec.
Doesn't leave much time for background processing.

The worst case was if an output to the scoop was delayed.  Rather than
catching the molten gob of glass in flight, it would fling it across
the plant floor.  If it hit anyone, it would stick to their skin and
most likely result in an amputation.

Since I had previously worked on central office software, this gave me
a much clearer view of real time.  I used to worry about what would
happen if a dial tone or compelled signaling tone was delayed.  Ah,
the good old days.

-Jack
--
Jack Bonn, <> Software Labs, Ltd, Box 451, Easton CT 06612
uunet!swlabs!jack
nather@ut-sally.UUCP (Ed Nather) (05/09/88)
In article <832@swlabs.UUCP>, jack@swlabs.UUCP (Jack Bonn) writes:
> From article <1534@pt.cs.cmu.edu>, by schmitz@FAS.RI.CMU.EDU (Donald Schmitz):
> > In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
> >
> > This may be straying somewhat from the original point, but what sort of
> > applications really have such exact timing deadlines?
>
> We had a 2.5 MHz Z-80 and a periodic interrupt whose period was 1 msec.
> Doesn't leave much time for background processing.

Our data acquisition system for time-series analysis of variable stars
also had 1 msec interrupts, imposed on a Nova minicomputer, ca. 5 usec
add time reg-to-reg.  If your interrupt routine chews up 100 usec, you
still have 90% of the CPU left to do "background" processing (I always
thought of it as "foreground," because it's what the user sees --
keyboard response, display, etc.).  That meant keeping the interrupt
routine short in the worst case, and allowing ONLY the timing
interrupt -- all other I/O was polled or DMA.  That allowed us to
specify the worst-case condition -- when everything was active all at
once -- and verify we'd never lose an interrupt.  It was a disaster if
we did: we'd get data that looked fine but was actually wrong.  Not as
dramatic as slinging molten glass at someone, of course, but still
awful.

I suspect time-critical software design will become more and more
common as computers get faster, just because you can consider software
control where only hardware was fast enough before.
--
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.AS.UTEXAS.EDU
mcdonald@uxe.cso.uiuc.edu (05/10/88)
>In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>>Real-time control programs often have a situation where only
>>X microseconds are available to perform a task.  Therefore,
>>the code to perform the task must be GUARANTEED to complete
>>within X microseconds.  In real-time control, a late answer
>>is a wrong answer.
>
>This may be straying somewhat from the original point, but what sort of
>applications really have such exact timing deadlines?...
>[I]n general a few percent deviation in cycle
>time has next to no effect.  As long as the deviation is small and well
>distributed, ie. delays of no more than 20% and occurring less than 10
>sample periods in a row, I can't imagine a mechanical system reacting
>to the error.

Sometimes microseconds can matter.  Our most complicated real-time
system runs a scanning interferometer and a laser.  The interferometer
is a mechanical plunger riding in a sleeve 0.00025 inch larger in
diameter than the moving part, at a temperature of -196 C (77 Kelvin),
on a cushion of pressurized helium.  The "wiggle" tolerance on the
motion is +- 0.000005 inch.  This can only be achieved if the motion is
smooth; this part is taken care of by servo hardware.  This hardware
detects the position of the mirror mounted on the plunger by counting
interference fringes of a laser.  It sends signals to the computer
every 100 microseconds.

The computer converts several error signals from the hardware and
decides if they are within tolerance.  If not, it skips a data point.
If they are OK, it starts the complicated process of firing the various
parts of the laser so that the sixth anticipated trigger signal will
occur just at the time the laser is really ready to go; the actual
firing is done by hardware.  The computer again checks to see if the
collected data is OK or garbage.  Then it can start over again.
The computer also checks on the "quality" of the servo-loop inputs; if
they get weak, the moving parts have been known to self-destruct
($5000).  There are hardware "stops" to prevent destruction, but using
them ruins the alignment, and we have to warm up to room temperature to
fix it, a three-day process.

We are using a PDP-11/73, with ALL interrupts disabled.  The program
was written in assembler, checking the timing of every instruction.
We can see by its outputs on a scope how much time we have to spare,
and of course there are variations due to the cache hit/miss
probability, but we know FOR SURE that it won't overrun, because we
give it 25% to spare in the worst case.  The code was an absolute
nightmare to write, but it is actually rather simple, in fact only
about 3000 lines.

I would consider this to be "real-time".

Doug McDonald
phil@osiris.UUCP (Philip Kos) (05/13/88)
>In article <1521@pt.cs.cmu.edu> koopman@A.GP.CS.CMU.EDU (Philip Koopman) writes:
>This may be straying somewhat from the original point, but what sort of
>applications really have such exact timing deadlines?...

I worked on some real-time data acquisition applications at the
University of Illinois between 1980 and 1984.  If my program wasn't
ready to read a data word and put it someplace appropriate when it was
ready to be read (a condition affectionately known as "overrun"), we
had to throw out the whole trial and do it over again.  Some of the
experiments I assisted were simple enough, but most were not easily
reproducible (particularly the ones dealing with muscle fatigue), and
I never again want to suffer the wrath of a grad student facing a
grant or thesis deadline.

Like the original article said, if it's late, it might as well be wrong.

Phil Kos                                  ...!uunet!pyrdc!osiris!phil
Information Systems
The Johns Hopkins Hospital
Baltimore, MD
mark@hubcap.UUCP (Mark Smotherman) (05/14/88)
What type of work has been done on benchmarks for real-time systems?
The applications seem so specialized as to make most comparisons into
apples versus oranges.  Are there any standard, "representative" tasks
that could be used to indicate the relative merit of a machine/OS?  In
evaluating a machine, do you rely mainly on interrupt latency measures,
or on what?

Please email responses and I will post a summary.  Thanks.
-- 
Mark Smotherman, Comp. Sci. Dept., Clemson University, Clemson, SC 29634
INTERNET: mark@hubcap.clemson.edu    UUCP: gatech!hubcap!mark
aronsson@sics.se (Lars Aronsson) (05/16/88)
>>>the code to perform the task must be GUARANTEED to complete
>>>within X microseconds.  In real-time control, a late answer
>>>is a wrong answer.
>>
>>This may be straying somewhat from the original point, but what sort of
>>applications really have such exact timing deadlines?...
>
>Sometimes microseconds can matter.  Our most complicated real-time
>system runs a scanning interferometer and a laser.  The interferometer

Enough!  Obviously, real-time applications do exist.  No more
interferometers in this newsgroup, please.

A few years ago, there was a discussion on why you wouldn't use UNIX
for real-time applications.  The reason was the virtual memory system.
Today, we have UNIX clones which allow you to lock a process in main
memory, just like the UNIX kernel.  Since a virtual memory system is
but a cache mechanism for the disk, the following thoughts come
naturally to me.

Before I start: this might turn out to be Today's Dumb Suggestion.
Maybe my ideas are already implemented on lots of systems, or totally
useless.  Please, let me know!

As far as I know, RISC instruction caches are a gain only when the
processor runs through loops.  What about the ability to declare
cache-resident functions (procedures/subroutines)?  This might not be
the solution for real-time applications, but it seems potentially
useful in many other cases.

Things normally managed by super-CISC instructions (decimal arithmetic,
string instructions and the like) in such machines would then be done
with neat library functions declared as "register".  The CISC
equivalent of this would be to allow users to define new machine
instructions at run-time.

Of course, you would have to decide what to do on a context switch.
Maybe the register functions should belong to a shared library and be
more or less permanently in the cache.

Perhaps this kind of register function would make the RISC vs. CISC
debate fade a little.
billo@cmx.npac.syr.edu (Bill O) (05/18/88)
In article <1924@sics.se> aronsson@sics.se (Lars Aronsson) writes:
>Before I start: This might turn out to be Todays Dumb Suggestion.
>Maybe my ideas are already implemented on lots of systems or totally
>useless. Please, let me know!

Yes, I think they have been, to a certain extent.  More in a bit...

>As far as I know, RISC instruction caches are a gain only when the
>processor runs through loops. What about the ability to declare
>cache-resident functions (procedures/subroutines)? This might not be
>the solution to real-time applications, but seems potentially useful
>in many other cases.
>
>Things normally managed by super-CISC instructions (decimal
>arithmetics, string instructions and the like) in such machines, would
>then be done with neat library functions declared as "register". The
>CISC equivalent to this would be to allow users to define new machine
>instructions at run-time.
>
>Of course, you would have to decide on what to do on a context switch.
>Maybe the register functions should belong to a shared library and
>be more or less permanently in the cache.

Actually, there is no need to use an *associative* cache for this
purpose, because the "associative" part is really just a mechanism
that lets the computer keep in fast memory a portion of the code which
it predicts will be referenced in the near future (the prediction is
usually based on past use).  For functions declared as being "fast"
or, as suggested, "register", all you really need is good
old-fashioned fast memory.

What follows are excerpts from a couple of recent (past few months)
postings relating to the way this sort of thing was done on the PDP-10
and PDP-11 (the second excerpt gives new meaning to the declaration
"register"):

[Dean W. Anneser, Pratt & Whitney Aircraft]
-We have 7 of these beasties [pdp-11/55], and they're still running
-strong.  The memory configuration is 0-32kw bipolar, and 32-124kw MOS.
-We keep the time-critical code in the bipolar.  DEC has never produced
-a faster PDP-11.  We have benchmarked and are currently using the
-11/73, 11/83, and 11/84, and the 11/55 will still run circles around
-them...

[Brian Utterback, Cray Research Inc.]
-Another advantage the PDP-10 had by mapping the registers to the
-memory space, other than indexing, was in execution.  You could load
-a short loop into the registers and jump to them!  The loop would run
-much faster, executing out of the registers.

Bill O'Farrell, Northeast Parallel Architectures Center at Syracuse
University (billo@cmx.npac.syr.edu)