gnu@hoptoad.uucp (John Gilmore) (02/16/86)
In article <156@motatl.UUCP>, wayne@motatl.UUCP (R.W.McGee) writes:
> The use of software timing loops on an asynchronous
> microprocessor should be discouraged... Public floggings would provide
> a cure, but would be hard to implement.

People who design microprocessors, who don't want software to depend
on the timings of individual instructions in particular systems,
should provide a system-independent way to delay for a specified
amount of time.  We use whatever you give us, guys!

E.g. in meeting the recovery time of a particularly good USART chip
with a horrible bus interface, the Z8530, you need to wait 2.2us
between writes to it.  Give me a good way to wait 2.2us *without*
depending on instruction timing, and I'll consider your request.

PS: if your answer is "add more chips", a lot of people will cheap
out and use "free" software timing loops.
-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu  jgilmore@lll-crg.arpa
mat@amdahl.UUCP (Mike Taylor) (02/17/86)
In article <530@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
> E.g. in meeting the recovery time of a particularly good USART chip
> with a horrible bus interface, the Z8530, you need to wait 2.2us
> between writes to it.  Give me a good way to wait 2.2us *without*
> depending on instruction timing, and I'll consider your request.

Well, on a *real* computer, you just set the TOD clock comparator for
now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
resist.
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat
[ This may not reflect my opinion, let alone anyone else's. ]
phil@amdcad.UUCP (Phil Ngai) (02/17/86)
In article <530@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>E.g. in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it. Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.

In a design I did with the 8530, the device selection logic made all
8530 cycles about 3 uS long with wait states.  For the first 2.2 uS of
the cycle, the 8530 was actually not being accessed.  This guaranteed
the cycle recovery time needed.  I had to use a PAL state machine to
assure another parameter (address set-up time), so this cycle recovery
time didn't cost anything extra, except the time it took me to think
it up.

I must admit part of my motivation for doing this was nightmares I had
of obscure bugs showing up because the programmer didn't bother to read
the specs carefully and violated the cycle recovery time (or didn't
even understand what cycle recovery time was).

In this example, it was possible to idiot-proof the hardware at no
incremental cost.  I imagine it is possible to come up with cases where
it does cost more, but in my experience a sufficiently innovative
design engineer can do it at no or very low cost (an extra pin on a
PAL).
-- 
Real men don't have answering machines.

Phil Ngai +1 408 749 5720
UUCP: {ucbvax,decwrl,ihnp4,allegra}!amdcad!phil
ARPA: amdcad!phil@decwrl.dec.com
davet@oakhill.UUCP (Dave Trissel) (02/17/86)
In article <530@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>
>People who design microprocessors, who don't want software to depend
>on the timings of individual instructions in particular systems,
>should provide a system-independent way to delay for a specified
>amount of time.  We use whatever you give us, guys!
>

I grew up with the early IBM 360 and its built-in interval timer.
Later models and the 370's had a time-of-day clock as well.  I have
often yearned to have the same common system-independent timing
facilities in micros as well.  But how do you accomplish that without
forcing every system designer to hook up a constant-frequency clock to
every microprocessor in the family?

Of course, the problem is that the basic clock frequency driving the
chip is variable depending on the system.  If we implemented an
on-chip clock or timer register, from where would it derive its
frequency?  Having an "adjust divisor" register set up by the system
to factor the system clock would just push the problem right back into
the hands of the O.S. coders, where it is now, since code somewhere
would have to set up the proper divisor.

We currently have one customer running the MC68020 at 15 megahertz, so
we can't assume that the "standard" test frequencies of 12.5 and
16.6666 will be used.  Customers are already preparing for the 20 and
25 megahertz versions, but there is no way to know now what their
exact frequencies will end up being.

If you or anyone else has any suggestions on how to do this, give a
yell.
-- 
Dave Trissel
Motorola Semiconductor, Austin, Texas
{ihnp4,seismo}!ut-sally!oakhill!davet
[Sorry, BITNET, ARPANET etc. will not work as destinations from our mailer.]
bmw@aesat.UUCP (Bruce Walker) (02/17/86)
>.... in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it. Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.
>
>PS: if your answer is "add more chips", a lot of people will cheap
>out and use "free" software timing loops.
>-- 
>John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu  jgilmore@lll-crg.arpa

You must be clocking your 8530 at 3 MHz.  The spec for Valid Access
Recovery Time is 6TcPC+200 nS (130 nS for the 'A' part), where TcPC is
the bus clock cycle time.  At 4 MHz you should wait a minimum of 1.7uS,
and at 6 MHz you only need to wait 1.2uS.

The kind of people that "cheap out" are the kind of people that cripple
their machines in a multitude of other subtle ways which are only
appropriate for closed-architecture "games machines".  Designers who
are creating machines with a future growth path would put in the extra
hardware (which only amounts to a small, registered PAL anyway).

Bruce Walker {allegra,ihnp4,linus,decvax}!utzoo!aesat!bmw

"I'd feel a lot worse if I wasn't so heavily sedated." -- Spinal Tap
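[For those who want to check the arithmetic, the recovery-time formula quoted above reduces to a one-liner.  This is only a sketch of the formula as stated in the post, for the standard (200 nS) part; the function name is my own.]

```c
/* Z8530 Valid Access Recovery Time, per the formula quoted above:
 * 6*TcPC + 200 ns (standard part), where TcPC is the bus clock
 * cycle time.  Clock rate given in MHz, result in nanoseconds. */
double z8530_recovery_ns(double pclk_mhz)
{
    double tcpc_ns = 1000.0 / pclk_mhz;   /* one bus clock cycle, in ns */
    return 6.0 * tcpc_ns + 200.0;
}
```

[At 3 MHz this gives 2200 ns, which is where the 2.2us figure in the earlier posts comes from; at 4 and 6 MHz it gives the 1.7uS and 1.2uS quoted above.]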
jack@boring.UUCP (02/17/86)
>E.g. in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it. Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.

Sorry, but this makes it a particularly *bad* USART chip, regardless
of any other features.

Imagine writing a device driver for it, finding out that the C
compiler generates such code that there's far more than 2.2us between
writes, and leaving the place.  Then, two years later, the site gets a
new C compiler with a much better optimizer..........
-- 
	Jack Jansen, jack@mcvax.UUCP
	The shell is my oyster.
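[Jack's scenario is easy to reproduce.  The C below is hypothetical, not from any driver discussed here: the first routine is the kind of "free" delay an optimizer may legally delete outright, the second uses volatile to force every iteration to happen -- though the elapsed time still depends on the CPU, the clock rate, and the compiler's code.]

```c
/* A "free" timing loop.  The counter is dead code, so an optimizing
 * compiler is entitled to remove the whole loop -- the failure mode
 * described in the post above. */
void delay_fragile(long n)
{
    long i;
    for (i = 0; i < n; i++)
        ;   /* may compile to nothing under optimization */
}

/* Declaring the counter volatile forces the compiler to perform
 * every iteration.  The loop survives a better optimizer, but its
 * duration is still a property of the machine, not the program. */
long delay_stubborn(long n)
{
    volatile long i;
    for (i = 0; i < n; i++)
        ;
    return i;   /* equals n: every iteration really happened */
}
```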
nather@utastro.UUCP (Ed Nather) (02/17/86)
In article <647@oakhill.UUCP>, davet@oakhill.UUCP (Dave Trissel) writes:
> Of course, the problem is that the basic clock frequency driving the chip
> is variable depending on the system.  If we implemented on-chip a clock
> or timer register from where would it derive its frequency?  Having an
> "adjust divisor" register setup by the system to factor the system clock
> would just push the problem right back into the hands of the O.S. coders
> where it is now since code somewhere would have to then setup the proper
> divisor.
> 
> We currently have one customer running the MC68020 at 15 Meghertz, so we
> can't assume that the "standard" test frequencies of 12.5 and 16.6666 will
> be used.  Customers are already preparing for the 20 and 25 Megahertz
> versions but there is no way to know now what their exact frequencies will
> end up.
> 
> If you or anyone else has any suggestions on how to do this give a yell.

Some years ago I was faced with the problem of "upgrading" to a faster
mini and wanted to use the same program for the "old" and "new" ones.
They were different enough internally to require code to identify which
was which, and adapt accordingly.  I used a counting loop (once, on
program start-up) to see whether the program was running in the fast or
slow machine, by checking to see how far it got in a known amount of
time.  In that case, I used an attached teletype machine as a timer,
since it took about 0.1 sec to print a character, and I watched its
"busy" flag in the counting loop.

I'm not proposing to put a TTY on a chip alongside the CPU (I doubt you
can do that ...) but rather a simple, independent (and not very
accurate) timer whose sole job would be to find out how fast the CPU
clock is running.  Simple software could then set the proper value into
an adjustable count-down divider so a built-in timer, running off the
divided CPU frequency, would be practical.
The built-in timer need only be accurate enough to choose among a set of (quantized) clock frequencies. -- Ed Nather Astronomy Dept, U of Texas @ Austin {allegra,ihnp4}!{noao,ut-sally}!utastro!nather nather@astro.UTEXAS.EDU
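[Nather's start-up calibration translates directly into C.  A minimal sketch, using the standard library clock() as a stand-in for his teletype busy flag; the 0.1-second window and the function name are my own inventions.]

```c
#include <time.h>

/* Spin for roughly 0.1 s against a coarse external timer and count
 * how many loop iterations fit -- a one-time measurement of how fast
 * this CPU/compiler combination runs the loop.  The count can then
 * be scaled down to produce short calibrated delays. */
long calibrate_loops_per_tenth_second(void)
{
    volatile long count = 0;              /* volatile: don't optimize away */
    clock_t limit = clock() + CLOCKS_PER_SEC / 10;
    while (clock() < limit)
        count++;
    return count;
}
```

[As Dick Karpinski notes later in the thread, this only works if the clock rate stays constant between reboots; a machine that changes its CPU clock on the fly defeats it.]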
campbell@sauron.UUCP (Mark Campbell) (02/17/86)
In article <530@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>In article <156@motatl.UUCP>, wayne@motatl.UUCP (R.W.McGee) writes:
>> The use of software timing loops on an asyncronous
>> microprocessor should be discouraged...Public floggings would provide
>> a cure, but would be hard to implement.
>
>People who design microprocessors, who don't want software to depend
>on the timings of individual instructions in particular systems,
>should provide a system-independent way to delay for a specified
>amount of time.  We use whatever you give us, guys!
>
>E.g. in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it.  Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.
>
>PS: if your answer is "add more chips", a lot of people will cheap
>out and use "free" software timing loops.
>-- 
>John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu  jgilmore@lll-crg.arpa

I was going to leave this one alone until I heard the H/W developers on
the other side of the wall giggling about it.

NCR has some damned good H/W engineers, and some of the best of these
work in my division.  These guys are so good that they actually let the
software drive the architecture of a machine (what I consider the
theoretical ideal, which is seldom obtained in the real world).  Of
course, the term "drive" implies a high level of cooperation; however,
these guys really listen and make an effort to support those hardware
features that we want.  Unfortunately, there are often constraints that
cause us to miss seeing eye to eye.  As an example...

Recently, we began work on a new machine.  The very first thing the H/W
guys did was obtain a copy of "All the Chips that Fit" (by Lyon and
Skudlarek, of Sun) and proclaim that we wouldn't make the mistakes that
Sun made.
The major premise of the paper was that there were many chips that were
on unfriendly terms with Unix, and that these chips caused a great deal
of pain to a Unix implementation.  Unfortunately, management then
stepped in and gave us a unit price that was terrifyingly low.  At the
next review of the H/W, we suddenly found that we were getting many of
those chips, or clones of those chips, that were specifically mentioned
in the paper.  We screamed, they screamed, etc.  After digging through
the manuals, however, we found that there was very little that could be
done given the stringent constraints.  A great example was our
specification of what we now call "the mythical 32-bit, low-powered
CMOS, battery-backed binary counter".  I keep getting told that a
certain BCD TOD chip is really fast.  That doesn't do me a whole hell
of a lot of good when I have to use 2 or 3 pages of conversion code to
support it.

The one thing we did insist upon, though, was glue logic to support
those chips that had timing problems.  Mr. Gilmore stated that
microprocessor designers should provide system-independent ways of
dealing with delays.  What should really happen is that the chip
manufacturers incorporate the delay logic within those chips.  The use
of the term "particularly good" when referring to a device with a
major flaw such as this is a non sequitur.  Software delay loops not
only cause poor performance (due to race conditions, interrupt latency,
etc.) during porting but usually come back to haunt you a year or two
later, when you switch to cheaper alternative sources for the devices.
-- 
Mark Campbell    Phone: (803)-791-6697     E-Mail: !ncsu!ncrcae!sauron!campbell
cmt@myrias.UUCP (Chris Thomson) (02/18/86)
In article <2795@amdahl.uucp> Mike Taylor writes:
> Well, on a *real* computer, you just set the TOD clock comparator for
> now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
> resist.

C'mon Mike!  Even a 5860 takes >5 us to context switch (twice).
jimb@amdcad.UUCP (Jim Budler) (02/19/86)
In article <6780@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes:
>...
>Sorry, but this bad makes it a particularly *bad* USART chip, regardless
>of any other features.
>Imagine writing a device driver for it, finding out that the C compiler
>generates such code that there's far more than 2.2us between writes, and
>leaving the place. Then, two years later, the site gets a new C compiler
>with a much better optimizer..........

Sorry, but this sounds like a bad device driver to me, not a bad
device.  Depending on a poor C compiler for timing is just as bad as
depending on any other non-portable 'feature' of a C compiler.  The
device driver could be broken by a new C compiler in any of a few
thousand other ways.
-- 
Jim Budler
Advanced Micro Devices, Inc.
(408) 749-5806
Usenet: {ucbvax,decwrl,ihnp4,allegra,intelca}!amdcad!jimb
Compuserve: 72415,1200
gnu@hoptoad.uucp (John Gilmore) (02/19/86)
In article <2795@amdahl.UUCP>, mat@amdahl.UUCP (Mike Taylor) writes:
> In article <530@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
> > Give me a good way to wait 2.2us *without*
> > depending on instruction timing, and I'll consider your request.
> 
> Well, on a *real* computer, you just set the TOD clock comparator for
> now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
> resist.

I didn't think you could do anything useful in 100 System/370
instructions anyway.  In fact, it probably takes more than that to set
the clock comparator (timer queues, ya know).  Sorry, didn't resist.

Something like the System/370 TOD clock and comparator is the kind of
facility I was talking about, though: a standard, high-precision clock
that doesn't change regardless of what system model you have.
-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu  jgilmore@lll-crg.arpa
rshepherd@euroies.UUCP (Roger Shepherd INMOS) (02/20/86)
Have a look at the transputer.  Apart from a bug in REV A devices, all
transputers (no matter what speed selection) run off a standard (5 or
25 MHz) clock frequency.  This is used to derive the standard comms
link speed and the processor clock.

The transputer's real-time clock/alarm runs at 1 tick per microsecond
at high priority; this means that the occam program below will look at
(e.g.) a uart once every 5 uS.  The accuracy achievable is quite good,
as the interrupt latency (time to go from low to high priority) is
about 58 cycles worst case (2.9 uS for a T414-20 - currently available
parts are -12, so they have 4.64 uS latency).

  PRI PAR
    SEQ
      ...  initialisation
      WHILE polling
        SEQ
          TIME ? AFTER nextinstance
          ...  poll uart or whatever
          nextinstance := nextinstance + 5   -- 1 tick per uS
    ...  -- rest of system at low priority
-- 
Roger Shepherd, INMOS Ltd, Whitefriars, Lewins Mead, Bristol, BS1 2NP, UK
Tel: +44 272 290861
UUCP: ...!mcvax!euroies!rshepherd
mat@amdahl.UUCP (Mike Taylor) (02/20/86)
In article <221@myrias.UUCP>, cmt@myrias.UUCP (Chris Thomson) writes:
> In article <2795@amdahl.uucp> Mike Taylor writes:
> > Well, on a *real* computer, you just set the TOD clock comparator for
> > now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
> > resist.
> 
> C'mon Mike!  Even a 5860 takes >5 us to context switch (twice).

Yes, but context switching isn't useful!  Actually, I was just trying
to point out that for timing like 2.2 us., regardless of how good your
timing facility is (S/370 architecturally has 244 picosecond
resolution), you can't ignore instruction timing.  Fielding the
external interrupt when the clock comparator "hits," even without a
full context switch, will take quite a few cycles to save registers,
etc. before being able to do any useful work.

What is even worse is the more or less unpredictable timing delays due
to cache effects (consistency, misses), to say nothing of EC level
changes (a cycle here or there...).  It is probably true to say that
you can't usefully time anything to a resolution better than plus or
minus 20 cycles (300 ns.) even on a machine which has good timing
facilities.
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat
[ This may not reflect my opinion, let alone anyone else's. ]
andrew@aimmi.UUCP (Andrew Stewart) (02/21/86)
In article <2795@amdahl.UUCP> mat@amdahl.UUCP writes:
>In article <530@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
>> E.g. in meeting the recovery time of a particularly good USART chip
>> with a horrible bus interface, the Z8530, you need to wait 2.2us
>> between writes to it. Give me a good way to wait 2.2us *without*
>> depending on instruction timing, and I'll consider your request.
>
>Well, on a *real* computer, you just set the TOD clock comparator for
>now+2.2 us. and go do something useful while you wait. Sorry, couldn't
>resist.
>

On a *real* computer, you use the front panel keys.  USARTs???  TOD
clock???  Ha!  (Remember the PDP-8, only bigger and better...)

Andrew Stewart
-- 
-------------------------------------------
Andrew Stewart		USENET: ...!mcvax!ukc!aimmi!andrew
"My axioms just fell into a Klein bottle"
farren@well.UUCP (Mike Farren) (02/22/86)
In article <2817@amdahl.UUCP> mat@amdahl.UUCP (Mike Taylor) writes:
>(S/370 architecturally has 244 picosecond resolution)

I admit to knowing little about the S/370, but a 4 GHz clock rate?
Can someone verify this, please?  I don't remember seeing any microwave
plumbing in a 370... :-)
-- 
Mike Farren
uucp: {your favorite backbone site}!hplabs!well!farren
Fido: Sci-Fido, Fidonode 125/84, (415)655-0667
ka@hropus.UUCP (Kenneth Almquist) (02/22/86)
>> Sorry, but this bad makes it a particularly *bad* USART chip, regardless
>> of any other features.
>> Imagine writing a device driver for it, finding out that the C compiler
>> generates such code that there's far more than 2.2us between writes, and
>> leaving the place. Then, two years later, the site gets a new C compiler
>> with a much better optimizer..........  [JACK JANSEN]
>
> Sorry, but this sounds like a bad device driver to me, not a bad device.
> Depending on a poor C compiler for timing is just as bad as depending
> on any other non-portable 'feature' of a C compiler. The device driver
> could be broken by a new C compiler in any of a few thousand other
> ways.  [JIM BUDLER]

I don't like the idea of depending upon the C compiler for timing, but
what is the alternative?  Write the timing-specific parts of the device
driver in assembly language?  This introduces maintenance problems of
its own.

You frequently have to depend upon non-portable features of C when
writing device driver code, but of course device drivers are
non-portable anyway.
				Kenneth Almquist
				ihnp4!houxm!hropus!ka	(official name)
				ihnp4!opus!ka		(shorter path)
jack@boring.uucp (Jack Jansen) (02/22/86)
In article <9645@amdcad.UUCP> jimb@amdcad.UUCP (Jim Budler) writes:
>In article <6780@boring.UUCP> I wrote:
>>...
>>Sorry, but this bad makes it a particularly *bad* USART chip, regardless
>>of any other features.
>>Imagine writing a device driver for it, finding out that the C compiler
>>generates such code that there's far more than 2.2us between writes, and
>>leaving the place. Then, two years later, the site gets a new C compiler
>>with a much better optimizer..........
>
>Sorry, but this sounds like a bad device driver to me, not a bad device.
>Depending on a poor C compiler for timing is just as bad as depending
>on any other non-portable 'feature' of a C compiler. The device driver
>could be broken by a new C compiler in any of a few thousand other
>ways.

The point here is that, even if you notice the small print in the
datasheet, you look at your driver and say "oh, there's more than
enough time in between writes", and forget about the whole timing
constraint in five minutes.

You're right that the driver could be broken in thousands of ways by a
new compiler, but this is *not* due to the driver, it is due to the
*device*.  There is no way you'll write a device driver that is
guaranteed to work with any C compiler, if you have to take care of
timing considerations (unless you're willing to pay the penalty of
using a *real* timer, of course).
-- 
	Jack Jansen, jack@mcvax.UUCP
	The shell is my oyster.
jimb@amdcad.UUCP (Jim Budler) (02/22/86)
In article <295@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
>>> Sorry, but this bad makes it a particularly *bad* USART chip, regardless
>>> of any other features.
>>> Imagine writing a device driver for it, finding out that the C compiler
>>> generates such code that there's far more than 2.2us between writes, and
>>> leaving the place. Then, two years later, the site gets a new C compiler
>>> with a much better optimizer..........  [JACK JANSEN]
>>
>> Sorry, but this sounds like a bad device driver to me, not a bad device.
>> Depending on a poor C compiler for timing is just as bad as depending
>> on any other non-portable 'feature' of a C compiler. The device driver
>> could be broken by a new C compiler in any of a few thousand other
>> ways.  [JIM BUDLER]
>
>I don't like the idea of depending upon the C compiler for timing, but
>what is the alternative?  Write the specific parts of the device driver
>in assembly language?  This introduces maintenance problems of its own.
>
>You frequently have to depend upon non-portable features of C when writing
>device driver code, but of course device drivers are non-portable anyway.

I guess I wasn't quite clear.  If whatever code you generated cannot
guarantee 2.2uS when run through the optimizer, then you cannot say
that you have written a good device driver.

I got another flame from somewhere asking me how I thought it should be
done without timing loops or additional hardware.  I didn't say not to
use a timing loop, but if you are going to do software timing, DO
software timing; i.e. put some real code in there to GUARANTEE whatever
time you want.  And yes, in a situation like this, trying to guarantee
2.2uS, I think a short piece of assembly code to wait 3uS IS the
answer.  And how much of a maintenance problem can 5 or 6 lines of
assembly code be?  I've seen many cases like the Vax _doprint in the
Berkeley code and a few other pieces of code with a couple of lines of
in-line assembly code in them.
-- Jim Budler Advanced Micro Devices, Inc. (408) 749-5806 Usenet: {ucbvax,decwrl,ihnp4,allegra,intelca}!amdcad!jimb Compuserve: 72415,1200
davet@oakhill.UUCP (Dave Trissel) (02/23/86)
In article <689@well.UUCP> farren@well.UUCP (Mike Farren) writes:
>>(S/370 architecturally has 244 picosecond resolution)
>
> I admit to knowing little about the S/370, but a 4 GHz clock rate?
>Can someone verify this, please?  I don't remember seeing any microwave
>plumbing in a 370... :-)

Back when I was working with 370's as a systems programmer, the
time-of-day (TOD) clock specification guaranteed that the resolution
was finer than the shortest possible instruction time.  In other words,
you would always get a unique value from the TOD clock even if you read
it with back-to-back store clock instructions.

What this indicated was that the clock resolution depended on the
machine model of 370.  The bottom-of-the-line 370s (the 370/25, if I
remember correctly) were so slow that a clock resolution of several
microseconds would have sufficed.
-- 
Dave Trissel  Motorola Austin
{seismo,ihnp4}!ut-sally!im4u!oakhill!davet
cmt@myrias.UUCP (Chris Thomson) (02/24/86)
> >(S/370 architecturally has 244 picosecond resolution)
> I admit to knowing little about the S/370, but a 4 GHz clock rate?

The 370 architecture has timers that are 64 bits wide, with the 12th
bit from the right (low-order) end carrying a weight of 1 microsecond.
It is model-dependent how many of the low-order 12 bits actually count,
as opposed to holding zero values.  However, the timer resolution
should be similar to instruction execution time, since the store-clock
instruction is required to give a different answer each time it is
used, even on a multiple-CPU configuration.  Current high-end 370
models have resolutions of a few nanoseconds.
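[The 244-picosecond figure follows from the format Chris describes: if a bit 12 places up from the low-order end weighs 1 microsecond, the low-order bit weighs 2^-12 microseconds.  A quick arithmetic check; nothing here is S/370-specific, and the function name is mine.]

```c
/* Weight of the low-order TOD bit: 1 microsecond divided by 2^12,
 * expressed in picoseconds. */
double tod_lsb_picoseconds(void)
{
    return 1.0e6 / 4096.0;   /* 1 us = 1e6 ps; 2^12 = 4096 */
}
```

[This comes out to 244.140625 ps, the resolution quoted (rounded) earlier in the thread.]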
aglew@ccvaxa.UUCP (02/24/86)
>/* Written 11:14 pm Feb 22, 1986 by mjs@sfsup.UUCP in ccvaxa:net.arch */
>As a kernel hacker, I would maintain that a device that requires a
>certain latency and neither rejects further commands nor signals an
>interrupt until it's ready is a botch.  Why patch software when the
>hardware CAN do it right?  Software is not the answer to hardware
>designer ineptitude.  Even if it has to be done at the board level,
>the proper choice is to add the hardware to disable access to the
>device until its latency period is over.

As an apprentice kernel hacker (well, not quite apprentice - I'm
learning by doing) and an aspiring hardware designer, I respond that
those nice features you want in your devices are probably provided by
firmware, which is a lot cheaper than extra hardware, and that this
firmware has to be programmed in some language.  You don't want to
condemn firmware programmers to always working in assembly, do you?

I agree that devices interfacing to a large, multiuser, UNIX system
should be well behaved, but you don't necessarily want to pay that
price in small systems - hell, on small systems you can't talk so
blithely about the board level; boards are damned expensive.  And even
on large systems, there are devices that respond quickly enough, and
for which you cannot afford the extra delay imposed by hardware
lockouts, so that direct control is necessary.

Andy "Krazy" Glew.  Gould CSD-Urbana.
USEnet:  ...!ihnp4!uiucdcs!ccvaxa!aglew
ARPAnet: aglew@gswd-vms
jer@peora.UUCP (J. Eric Roskos) (02/24/86)
> I admit to knowing little about the S/370, but a 4 GHz clock rate?
> Can someone verify this, please?  I don't remember seeing any microwave
> plumbing in a 370... :-)

Actually this involves a really interesting aspect of the nature of
"time" on a computer.

Suppose you have (hopefully without loss of generality :-)) a machine
all of whose instructions take the same amount of time to execute.
Suppose it can execute 32,768 instructions per second.  Now, suppose
you have a clock that counts in 1/65536ths of a second.  Then, as far
as you are concerned, it's impossible to tell the clock is running
that fast... depending on when you began execution relative to the
counter in the clock, the low-order bit will always be a zero, or
always a one, every time you look at it.

Although it's possibly not as obvious, the same thing happens if you
don't have such "round" numbers... if the timer is counting faster
than your CPU's basic cycle time (and if the CPU runs with a
fixed-rate clock) then the timer's counter will appear to be
incremented by some constant value, and there's no way to tell it's
going faster than that.  So, you can replace the faster timer with a
much slower one that increments its counter by this integer, and no
one will be able to tell the difference.  Of course, this assumes that
the timer doesn't do anything else, e.g., control some external
devices which rely on the faster clock rate.

Actually this generalizes to an even more interesting idea, viz., that
if a CPU doesn't have any kind of reliable external clock to measure
time against, then if you stop the CPU's clock occasionally, or make
it run irregularly, the CPU's "idea" of time will be such that events
external to it that are happening at a constant rate will appear to
the CPU to be occurring irregularly.  So you get to experiment a
little with the relativistic nature of time this way.
-- 
UUCP: Ofc:  jer@peora.UUCP  Home: jer@jerpc.CCUR.UUCP  CCUR DNS: peora, pesnta
US Mail: MS 795; CONCURRENT Computer Corp.
SDC; (A Perkin-Elmer Company) 2486 Sand Lake Road, Orlando, FL 32809-7642
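[Roskos's first example can be simulated in a few lines.  A sketch, using his numbers: 32,768 instructions per second against a 1/65536-second clock means exactly two ticks elapse per instruction, so the low-order bit of every reading is the same.  The function name is hypothetical.]

```c
/* Read the clock once per instruction.  With 65536 ticks/s and 32768
 * instructions/s, each reading advances by exactly 2 ticks, so the
 * low-order bit never changes: the extra resolution is invisible to
 * the program, whatever the starting phase. */
int low_bit_is_constant(long first_reading, int reads)
{
    int i;
    for (i = 1; i < reads; i++) {
        long reading = first_reading + 2L * i;   /* +2 ticks per instruction */
        if ((reading & 1L) != (first_reading & 1L))
            return 0;   /* low bit changed: the fast rate would be visible */
    }
    return 1;
}
```

[A clock half as fast, incrementing its counter by 2 per tick, would produce exactly the same sequence of readings -- which is Roskos's point about replacing the fast timer with a slower one.]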
mat@amdahl.UUCP (Mike Taylor) (02/24/86)
In article <689@well.UUCP>, farren@well.UUCP (Mike Farren) writes:
> In article <2817@amdahl.UUCP> mat@amdahl.UUCP (Mike Taylor) writes:
> >(S/370 architecturally has 244 picosecond resolution)
> 
> I admit to knowing little about the S/370, but a 4 GHz clock rate?
> Can someone verify this, please?  I don't remember seeing any microwave
> plumbing in a 370... :-)

Architecture, not necessarily implementation.  See S/370 XA Principles
of Operation, IBM pub# SA-22-7085, pp. 4-20, 4-21.
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat
[ This may not reflect my opinion, let alone anyone else's. ]
sher@rochester.UUCP (David Sher) (02/24/86)
Just to introduce a theoretical note: wouldn't an entirely self-timed
architecture avoid the issue of software timing loops?  I would think
that this would put the problem where it belongs, in the hardware.  Of
course, you pay a certain factor for self-timing (I think it depends
on the size of the chunks of hardware that are self-timed).  Probably
this has no relevance to current real machines, but I'm an academic
anyway.
-- 
-David Sher
sher@rochester
seismo!rochester!sher
dick@ucsfcca.UUCP (Dick Karpinski) (02/26/86)
In article <6790@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes:
>
>There is no way you'll write a device driver that is guaranteed
>to work with any C compiler, if you have to take care of timing
>considerations (unless you're willing to pay the penalty of using
>a *real* timer, of course).

I thought someone suggested a solution: build a timing loop, but set
its constant (how many cycles) using some other, possibly low
precision, timer to see how fast the loop is _today_ with this
compiler/clock/cpu-chip/whatever.  I _think_ that one can usually
count on those things remaining constant _during_ this run of the
program, i.e. between reboots of the OS.  I have heard of systems
which change their cpu clock on the fly, but you probably know that
when you write the device driver.  Is that enough?

Dick
-- 
Dick Karpinski    Manager of Unix Services, UCSF Computer Center
UUCP: ...!ucbvax!ucsfcgl!cca.ucsf!dick   (415) 476-4529 (12-7)
BITNET: dick@ucsfcca   Compuserve: 70215,1277   Telemail: RKarpinski
USPS: U-76 UCSF, San Francisco, CA 94143
rb@ccivax.UUCP (rex ballard) (02/27/86)
What is really needed is a standard "time base" signal that is
independent of "chip clock" speed.

When chip makers create a "wait X" instruction, where X is the time in
microseconds, and put the necessary loop/wait in the microcode, I will
be GLAD to stop using loops.  I have always hated them (bursty DMA can
blow timing too).  Something is needed for that little window that is
smaller than the RTC interrupt and bigger than a NOP or wait state.
Even a constant-period NOP would be nice.

With the 68020, even the old trick of running a big loop and timing it
against the RTC chip doesn't work (cache misses in shorter loops).

There are still a few places where very short timing intervals are
needed and have to be reasonably accurate: things like hand-shaking,
line turnaround, and device locking (where TAS won't do, or needs to
be done more than once).  The original point about using them for
copy-protection schemes and long trivial loops is valid: don't do it.
ron@brl-smoke.ARPA (Ron Natalie <ron>) (02/27/86)
> In article <2817@amdahl.UUCP> mat@amdahl.UUCP (Mike Taylor) writes:
> >(S/370 architecturally has 244 picosecond resolution)
> 
> I admit to knowing little about the S/370, but a 4 GHz clock rate?
> Can someone verify this, please?  I don't remember seeing any microwave
> plumbing in a 370... :-)

Who said anything about a 4 GHz clock rate?  All he said was that the
architecture supports that resolution.  If the 370 time register were
incremented by one, it would be updated every 244 picoseconds.
However, each machine in the line increments it by a somewhat larger
number corresponding to the speed of the clock on that processor.
Hence, the higher-order bits have been consistent across 10 years of
processors and will continue to be so for quite a few years to come, I
would expect.

-Ron
gnu@hoptoad.uucp (John Gilmore) (02/28/86)
In article <613@sauron.UUCP>, campbell@sauron.UUCP (Mark Campbell) writes: > Recently, we began work on a new machine. The very first thing the H/W > guys did was obtain a copy of "All the Chips that Fit" (by Lyon and > Skudlarek, of Sun) and proclaim that we wouldn't make the mistakes that > Sun made. > > Unfortunately, management then stepped in... > > The one thing we did insist upon, though, was glue logic to support > those chips that had timing problems. Mr. Gilmore stated that > microprocessor designers should design system-independent ways of > dealing with delays. What should really happen is that the chip > manufacturers incorporate the delay logic within those chips. Peripheral chips are driven by strobes from the outside world. In almost all cases (except tightly coupled coprocessor-style chips), the peripheral chip does not tell the CPU when it is done with a request; the system designer is expected to have read the data sheet and set up the right number of wait states and such. This makes the peripheral chip usable in many systems no matter whose CPU or bus you are using. Given that piece of reality, how should chip manufacturers "incorporate the delay logic within those chips"? What should the chip do if it gets a strobe and isn't ready for one? I currently prefer the approach of having a status pin saying "ready" or "not ready", which external hardware (or software) can use to avoid strobing the chip when it is not ready. However, pins are expensive, so this only happens when there's a spare pin around. If they're going to add a few pins to make the chip nicer, I vote for a few other uses first (like multiple address lines to avoid the write-the-address-then-write-the-data approach, which falls down if you take an interrupt in the middle). System designers can nowadays use a PAL to generate the right number of wait states, when a few years ago there was just a decoder and you were stuck with whatever timing it could provide. This does cost quite a bit more than a few extra bytes of software, though. > The use of the term "particularly good" when referring to a device with > a major flaw such as this is a non sequitur. The Z8530 *is* a particularly good chip. I would rate it as one of the best chips Zilog has designed. Its bus interface has clearly shown up as the weakest part of the chip, though -- especially the vectored interrupt "support". If one of the seventy-leven companies which are second-sourcing it and putting their own part numbers on it would instead fix the bus interface, I bet they'd get some sales. -- John Gilmore {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu jgilmore@lll-crg.arpa
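The software side of the "ready" status-pin approach described above is just a poll-before-strobe loop. A hedged C sketch (UART_READY and the register layout are invented; the registers are passed in as pointers rather than hard-wired to a real chip address):

```c
#define UART_READY 0x01   /* assumed position of the ready bit */

/* Strobe the data register only after the status register says the
 * chip has recovered from the previous access -- no dependence on
 * instruction timing at all. */
void uart_write(volatile unsigned char *status,
                volatile unsigned char *data,
                unsigned char c)
{
    while (!(*status & UART_READY))
        ;                          /* spin until the chip reports ready */
    *data = c;
}
```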
marty@fritz.UUCP (Marty McFadden) (03/02/86)
I must agree that even well-documented code is still burdensome if timing loops are needed. I recently ported Unix* System V from the 68000 to the 68020 (16 2/3 MHz); unfortunately, there were a few timing loops in the kernel that had to be changed. (Talk about finding a needle in a haystack!!) *Unix is a trademark of Bell Laboratories Martin J. McFadden FileNet Corp trwrb!fritz!marty
campbell@sauron.UUCP (Mark Campbell) (03/03/86)
In article <566@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: >In article <613@sauron.UUCP>, campbell@sauron.UUCP (Mark Campbell) writes: >> ... What should really happen is that the chip >> manufacturers incorporate the delay logic within those chips. > > [Fundamental H/W realities...] > >Given that piece of reality, how should chip manufacturers "incorporate >the delay logic within those chips"? What should the chip do if it >gets a strobe and isn't ready for one? Obviously you don't let the chip get a strobe for which it's not ready; you delay DSACK on the first access to the chip until the delay time has expired (i.e., until the device may legally be accessed again). What you've done is implement the registered-PAL approach (state machine) at the device level. >I currently prefer the approach of having a status pin saying "ready" >or "not ready", which external hardware (or software) can use to avoid >strobing the chip when it is not ready. Just use the status pin to indicate when it is safe to continue, and you've solved the problem. In essence, you've constructed the same delay mechanism which you proposed to the uP guys (more on that later). > However, pins are expensive, >so this only happens when there's a spare pin around. If they're going >to add a few pins to make the chip nicer, I vote for a few other uses >first (like multiple address lines to avoid the write-the-address-then- >write-the-data approach, which falls down if you take an interrupt in >the middle). Spare pins aren't usually an all-or-nothing proposition...a single delay-status pin is a lot cheaper than demultiplexing the address/data lines. Besides, I'd rather execute an extra two instructions (raising the processor's priority before writing the address/data, and lowering it afterward) than worry about ALL of the problems associated with delays.
>System designers can nowadays use a PAL to generate the right number of >wait states, when a few years ago there was just a decoder and you were >stuck with whatever timing it could provide. This does cost quite a bit >more than a few extra bytes of software, though. The cost is three to five dollars per registered PAL...and you're right, it ain't cheap. I realize that this translates to at least nine to fifteen dollars to the customer. But how much do you think it will cost the customer when he gets an intermittent lost character somewhere down the road? And how much S/W development time do you think it will cost each time the fault is "discovered"? You can't use a single-point model of cost in this situation; it just doesn't apply. >> The use of the term "particularly good" when referring to a device with >> a major flaw such as this is a non sequitur. > >The Z8530 *is* a particularly good chip. [...] And the 432 *was* a particularly good chip (set), it was just a little slow. (:-) Seriously, it very well might be an *excellent* chip, just like the TOD chip I previously mentioned. However, both are *extremely bad* chips in the context of supporting Unix, because both contain major flaws with respect to Unix. H/W exists only to execute S/W (just like an OS exists only to execute applications), and if the H/W does it poorly, then it just isn't good H/W...regardless of the cost. I believe the last sentence will cause a lot of flamage. Before you H/W designers go on a crazed rampage, let me say that I consider this subject somewhat moot. The original posting came from a guy at Motorola warning against timing loops. This was followed by John Gilmore suggesting uP modifications that would solve this problem. Well, I believe that John, in his last posting (not the message that this is in response to) gave a solution which is adequate. I hope it is implemented; however, I also hope that peripheral chip manufacturers will clean up their bus interface problems. 
-- Mark Campbell Phone: (803)-791-6697 E-Mail: !ncsu!ncrcae!sauron!campbell
jer@peora.UUCP (J. Eric Roskos) (03/04/86)
John Gilmore (gnu@hoptoad.UUCP) writes: > Peripheral chips are driven by strobes from the outside world. In > almost all cases (except tightly coupled coprocessor style chips), the > peripheral chip does not tell the CPU when it is done with a request; > the system designer is expected to have read the data sheet and set up > the right number of wait states and such. I must disagree with this! Though my confusion (and perhaps part of the debate itself) may arise because we are thinking of different types of "peripheral chips". Peripheral chips that perform I/O operations -- UARTs, disk controllers, DMA controllers, etc. -- should certainly tell the CPU when they are done with a request. That is what interrupts are for! I think that is what the original poster was referring to. It would be bad to convince people who may design new peripheral parts that they should do away with the "ready" pins on their devices; if they did that, the ability of I/O software to work in a reasonable manner would rapidly diminish, especially for "multitasking" systems. On the other hand, wait states for slow parts are a different matter, and I think maybe that was what the above poster was referring to. Still, it would be far better to handle this in hardware -- for example, by suspending the processor's execution if it tries to access the part again before it has completed the previous operation -- than to expect a software timing loop to do it. Of course, this may have an adverse effect in the case of real-time applications, where you need to know how long each operation will take; you could suspend execution as soon as the original access is done, until the operation is completed, which would give a constant delay regardless of the time between accesses. But for many applications, you could do something useful during that time. -- UUCP: Ofc: jer@peora.UUCP Home: jer@jerpc.CCUR.UUCP CCUR DNS: peora, pesnta US Mail: MS 795; CONCURRENT Computer Corp. 
SDC; (A Perkin-Elmer Company) 2486 Sand Lake Road, Orlando, FL 32809-7642 LOTD(5)=O ---------------------- Amusing error message explaining reason for some returned mail recently: > 554 xxxxxx.xxxxxx.ATT.UUCP!xxx... Unknown domain address: Not a typewriter (The above message is true... only the names have been changed...)
grr@cbm.UUCP (George Robbins) (03/04/86)
In article <566@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: >In article <613@sauron.UUCP>, campbell@sauron.UUCP (Mark Campbell) writes: > > [Fundamental H/W realities...] > >> The use of the term "particularly good" when referring to a device with >> a major flaw such as this is a non sequitur. > >The Z8530 *is* a particularly good chip. [...] Speaking of moot problems, the recovery-time specification for the Z8530 is a non-problem in most applications. The data sheet basically specifies that so many PCLK cycles must elapse between accesses. Unless you are using an unusually slow PCLK, the overhead of the C-style inb()/outb() subroutine calls will eat up the requisite cycles. Assembly code may need a nop or two to guarantee the cycles. To avoid interrupt hassles, you can define C routines outoutb() and outinb() that save the interrupt status, turn off interrupts, write the pointer, nop, read/write the data, and restore interrupts. Also, the DMA status pins can be used to generate hardware wait states or be sensed by software (through some other chip) to indicate when the chip is ready to accept another operation. The real hardware world is full of these little kludges, in much the same sense as unix is blessed with features and warts. The 8530 does just about anything you would want a serial interface to do, short of ethernet, and does it for two channels. Neither is perfect, but they let you get the job done. -- George Robbins - now working with, uucp: {ihnp4|seismo|caip}!cbm!grr but no way officially representing arpa: cbm!grr@seismo.css.GOV Commodore, Engineering Department fone: 215-431-9255 (only by moonlite)
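The outoutb() recipe above (save interrupt state, disable, write the pointer, burn a cycle, write the data, restore) might look like this in C. Everything below is a sketch: splhigh()/splx() are no-op stand-ins for the kernel's real interrupt-priority primitives, and the chip is faked with an ordinary byte so the sequence can be followed without hardware.

```c
/* Stand-ins: a real driver would use the kernel's spl primitives and
 * a memory-mapped Z8530 control register instead of these fakes. */
static int  splhigh(void) { return 0; }   /* would raise IPL, return old */
static void splx(int s)   { (void)s; }    /* would restore the old IPL   */

static unsigned char fake_ctrl;           /* pretend Z8530 control port  */
static volatile unsigned char *scc_ctrl = &fake_ctrl;

/* Write Z8530 register 'reg' with 'val' as one uninterruptible
 * pointer-write / data-write pair, with a short spin in between to
 * cover the chip's recovery time. */
void outoutb(unsigned char reg, unsigned char val)
{
    volatile int i;
    int s = splhigh();        /* no interrupt may split the two accesses */

    *scc_ctrl = reg;          /* first access sets the register pointer  */
    for (i = 0; i < 4; i++)
        ;                     /* "a nop or two" worth of recovery delay  */
    *scc_ctrl = val;          /* second access writes the data           */

    splx(s);                  /* restore previous interrupt priority     */
}
```

The spl bracket is the important part: an interrupt handler that touched the 8530 between the pointer write and the data write would leave the chip pointing at the wrong register.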
franka@mmintl.UUCP (Frank Adams) (03/04/86)
In article <6780@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes: >>E.g. in meeting the recovery time of a particularly good USART chip >>with a horrible bus interface, the Z8530, you need to wait 2.2us >>between writes to it. > >Sorry, but this bug makes it a particularly *bad* USART chip, regardless >of any other features. It seems to me that, from theoretical considerations, there will always be *some* time dependencies in any device. If you run it with a fast enough processor, it will stop working. Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Multimate International 52 Oakland Ave North E. Hartford, CT 06108
rb@ccivax.UUCP (rex ballard) (03/06/86)
In article <443@ucsfcca.UUCP> dick@ucsfcca.UUCP (Dick Karpinski) writes: >In article <6790@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes: >> >I thought someone suggested a solution: Build a timing loop, but >set its constant (how many cycles) using some other, possibly low >precision, timer to see how fast the loop is _today_ with this >compiler/clock/cpu-chip/whatever. I _think_ that one can usually >count on those things remaining constant _during_ this run of the >program, i.e. between reboots of the OS. I have heard of systems >which change their cpu clock on the fly, but you probably know >that when you write the device driver. Is that enough? > >Dick >-- This works ok UNLESS you have a high speed cache (like the 68020) AND are servicing non-maskable interrupts. The timing in a tight loop can wander all over the place, depending on how often the interrupts scramble your cache. Problems can ALSO occur when co-processors or multiple DMA devices are sharing the bus and your processor has low priority.
greg@utcsri.UUCP (Gregory Smith) (03/10/86)
In article <1162@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes: >It seems to me that from theoretical considerations, there will always be >*some* time dependencies in any device. If you run it with a fast enough >processor, it will stop working. False. A processor-to-device interface can be designed in such a way that an access to a slow device will cause the processor to be 'stopped' until the device is ready. This can be done in a port-dependent way, i.e. if there is only one slow device on the bus, the processor will only be slowed when that device is accessed. The 'stopped' state of the processor is sometimes called a 'wait' state. On many systems, this technique is the rule rather than the exception - I think UNIBUS is an example. -- "So this is it. We're going to die." - Arthur Dent ---------------------------------------------------------------------- Greg Smith University of Toronto ..!decvax!utzoo!utcsri!greg
campbell@sauron.UUCP (Mark Campbell) (03/14/86)
In article <25@cbm.UUCP> grr@cbm.UUCP (George Robbins) writes: > >Speaking of moot problems, the recovery time specification for the Z8530 is >a non-problem in most applications. The data sheet basically specifies that >so many PCLK cycles must elapse between accesses. Unless you are using an >unusually slow PCLK, the overhead of the C style inb()/outb() subroutine calls >will eat up the requisite cycles. Assembly code may need a nop or two to >guarantee cycles. Absolute last word on the subject (we clock at 4MHz, but I'll assume 6MHz):

8530 Set-Up Delay = 6 x t(PCLK) + 200ns
                  = 6 x 165ns + 200ns = 1.19us  (6MHz PCLK, best case)
                  = 6 x 2us + 200ns   = 12.2us  (worst case)

MC68020 NOP Time  = 2 x 60ns = 120ns  (16.67MHz, in a loop)
                  = 2 x 80ns = 160ns  (12.5MHz, in a loop)

1.19us / 120ns = 10 NOPs;  1.19us / 160ns = 8 NOPs   (best case)
12.2us / 120ns = 102 NOPs; 12.2us / 160ns = 77 NOPs  (worst case)

I'm not familiar with the "C style inb()/outb()" routines you mention. However, I would respectfully suggest that if these routines can guarantee a 12.2us delay through normal code execution, you fire the guy who wrote them. You'll probably notice that the number of NOPs required is a bit more than the "nop or two" you predicted. This is compounded by the fact that interrupts are disabled at this time, increasing latency. *This* is compounded by the fact that we have to assume the worst case with no chip-level H/W support. *This* is compounded by the fact that this software solution still does not solve the problems associated with 8530-related DMA accesses. I included the best-case timings to illustrate what the proper H/W can do. I suggest that anyone out there having problems with the 8530 see the April 4 issue of EDN, pages 274-275, for a nice H/W solution to the 8530's recovery-time problems. -- Mark Campbell Phone: (803)-791-6697 E-Mail: !ncsu!ncrcae!sauron!campbell
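The NOP counts in the posting above are straightforward ceiling division; this throwaway C check reproduces them (its inputs are the posting's own figures: recovery = 6 PCLK periods + 200 ns, and a 68020 NOP = 2 CPU cycles in a loop).

```c
/* NOPs needed = ceil((6 * pclk_ns + 200) / (2 * cycle_ns)).
 * pclk_ns is the 8530 PCLK period, cycle_ns the CPU clock period. */
unsigned nops_needed(double pclk_ns, double cycle_ns)
{
    double recovery = 6.0 * pclk_ns + 200.0;   /* ns the chip needs    */
    double nop      = 2.0 * cycle_ns;          /* ns one NOP burns     */
    unsigned q = (unsigned)(recovery / nop);

    if ((double)q * nop < recovery)
        q++;                        /* round up: partial NOPs don't exist */
    return q;
}
```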
gnu@hoptoad.uucp (John Gilmore) (03/16/86)
In article <25@cbm.UUCP>, grr@cbm.UUCP (George Robbins) writes: > Speaking of moot problems, the recovery time specification for the Z8530 is > a non-problem in most applications. The data sheet basically specifies that > so many PCLK cycles must elapse between accesses. Unless you are using an > unusually slow PCLK, the overhead of the C style inb()/outb() subroutine calls > will eat up the requisite cycles. Assembly code may need a nop or two to > guarantee cycles. This is only true if you have an unusually slow CPU. Ours overruns the chip without trouble. Maybe Commodore's doesn't. -- John Gilmore {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu jgilmore@lll-crg.arpa