[net.micro.68k] timing loops

wayne@motatl.UUCP (R.W.McGee) (02/09/86)

The use of software timing loops on an asynchronous
microprocessor should be discouraged. On the 68k family,
these loops give different results depending on whether DTACK
timing is the same for ROM and RAM, and they differ among
the 68000, 68010, and 68020 because of the loop mode of the
68010 and the cache of the 68020. Public floggings would provide
a cure, but would be hard to implement.

Wayne McGee  mot!motatl!wayne

Standard Disclaimer #10

gnu@hoptoad.uucp (John Gilmore) (02/16/86)

In article <156@motatl.UUCP>, wayne@motatl.UUCP (R.W.McGee) writes:
> The use of software timing loops on an asynchronous
> microprocessor should be discouraged...Public floggings would provide
> a cure, but would be hard to implement.

People who design microprocessors, who don't want software to depend
on the timings of individual instructions in particular systems,
should provide a system-independent way to delay for a specified
amount of time.  We use whatever you give us, guys!

E.g. in meeting the recovery time of a particularly good USART chip
with a horrible bus interface, the Z8530, you need to wait 2.2us
between writes to it.  Give me a good way to wait 2.2us *without*
depending on instruction timing, and I'll consider your request.

PS:  if your answer is "add more chips", a lot of people will cheap
out and use "free" software timing loops.
-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa

mat@amdahl.UUCP (Mike Taylor) (02/17/86)

In article <530@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
> E.g. in meeting the recovery time of a particularly good USART chip
> with a horrible bus interface, the Z8530, you need to wait 2.2us
> between writes to it.  Give me a good way to wait 2.2us *without*
> depending on instruction timing, and I'll consider your request.

Well, on a *real* computer, you just set the TOD clock comparator for
now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
resist.

-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]

phil@amdcad.UUCP (Phil Ngai) (02/17/86)

In article <530@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>E.g. in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it.  Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.

In a design I did with the 8530, the device selection logic made all
8530 cycles about 3 uS long with wait states. For the first 2.2 uS of
the cycle, the 8530 was actually not being accessed.  This guaranteed
the cycle recovery time needed. I had to use a PAL state machine to
assure another parameter (address set up time) and so this cycle
recovery time didn't cost anything extra, except the time it took me
to think it up.

I must admit part of my motivation for doing this was nightmares I had
of obscure bugs showing up because the programmer didn't bother to
read the specs carefully and violated the cycle recovery time (or
didn't even understand what cycle recovery time was).  In this example,
it was possible to idiot proof the hardware at no incremental cost.  I
imagine it is possible to come up with cases where it does cost more
but in my experience a sufficiently innovative design engineer can do
it at no or very low cost (extra pin on PAL).
-- 
 Real men don't have answering machines.

 Phil Ngai +1 408 749 5720
 UUCP: {ucbvax,decwrl,ihnp4,allegra}!amdcad!phil
 ARPA: amdcad!phil@decwrl.dec.com

davet@oakhill.UUCP (Dave Trissel) (02/17/86)

In article <530@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>
>People who design microprocessors, who don't want software to depend
>on the timings of individual instructions in particular systems,
>should provide a system-independent way to delay for a specified
>amount of time.  We use whatever you give us, guys!
>

I grew up with the early IBM 360 and its built-in interval timer.  Later
models and the 370's had a time of day clock as well.  I have often
yearned to have the same common system-independent timing facilities
in micros as well.  But how do you accomplish that without forcing
every system designer to hook up a constant frequency clock to every
microprocessor in the family?

Of course, the problem is that the basic clock frequency driving the chip
is variable depending on the system.  If we implemented on-chip a clock
or timer register from where would it derive its frequency?  Having an
"adjust divisor" register setup by the system to factor the system clock
would just push the problem right back into the hands of the O.S. coders
where it is now since code somewhere would have to then setup the proper
divisor.

We currently have one customer running the MC68020 at 15 Megahertz, so we can't
assume that the "standard" test frequencies of 12.5 and 16.6666 will be used.
Customers are already preparing for the 20 and 25 Megahertz versions but
there is no way to know now what their exact frequencies will end up.

If you or anyone else has any suggestions on how to do this give a yell.

  --  Dave Trissel  Motorola Semiconductor, Austin, Texas
	{ihnp4,seismo}!ut-sally!oakhill!davet
[Sorry, BITNET, ARPANET etc. will not work as destinations from our mailer.]

bmw@aesat.UUCP (Bruce Walker) (02/17/86)

>.... in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it.  Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.
>
>PS:  if your answer is "add more chips", a lot of people will cheap
>out and use "free" software timing loops.
>-- 
>John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa

You must be clocking your 8530 at 3 MHz.  The spec for Valid Access
Recovery Time is 6TcPC+200 nS (6TcPC+130 nS for the 'A' part), where TcPC is
the bus clock cycle time.  At 4MHz you should wait a minimum of 1.7uS
and at 6MHz you only need to wait 1.2uS.
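
As a rough check of the figures above, the formula can be evaluated
directly.  This is only a sketch: the function name recovery_ns is made
up for illustration, and the 200 nS and 130 nS constants are the ones
quoted in this post, not values checked against a datasheet.

```c
#include <assert.h>

/* Valid access recovery time in nanoseconds, using the formula quoted
 * above: 6 * TcPC + overhead, where TcPC is one clock period.
 * overhead_ns is 200 for the plain part and 130 for the 'A' part
 * (both figures taken from the post, not from a datasheet). */
static double recovery_ns(double pclk_hz, double overhead_ns)
{
    assert(pclk_hz > 0.0);
    double tcpc_ns = 1e9 / pclk_hz;      /* one clock period in ns */
    return 6.0 * tcpc_ns + overhead_ns;
}
```

With a 3 MHz clock this gives about 2200 nS, which matches the 2.2us
figure that started the discussion; 4 MHz gives 1700 nS and 6 MHz about
1200 nS.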

The kind of people that "cheap out" are the kind of people that cripple
their machines in a multitude of other subtle ways which are only
appropriate for closed-architecture "games machines".  Designers who
are creating machines with a future growth path would put in the extra
hardware (which only amounts to a small, registered PAL anyway).

Bruce Walker     {allegra,ihnp4,linus,decvax}!utzoo!aesat!bmw

"I'd feel a lot worse if I wasn't so heavily sedated." -- Spinal Tap

jack@boring.UUCP (02/17/86)

>
>E.g. in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it.  Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.

Sorry, but this makes it a particularly *bad* USART chip, regardless
of any other features.
Imagine writing a device driver for it, finding out that the C compiler
generates such code that there's far more than 2.2us between writes, and
leaving the place. Then, two years later, the site gets a new C compiler
with a much better optimizer..........
-- 
	Jack Jansen, jack@mcvax.UUCP
	The shell is my oyster.

nather@utastro.UUCP (Ed Nather) (02/17/86)

In article <647@oakhill.UUCP>, davet@oakhill.UUCP (Dave Trissel) writes:
> Of course, the problem is that the basic clock frequency driving the chip
> is variable depending on the system.  If we implemented on-chip a clock
> or timer register from where would it derive its frequency?  Having an
> "adjust divisor" register setup by the system to factor the system clock
> would just push the problem right back into the hands of the O.S. coders
> where it is now since code somewhere would have to then setup the proper
> divisor.
> 
> We currently have one customer running the MC68020 at 15 Megahertz, so we can't
> assume that the "standard" test frequencies of 12.5 and 16.6666 will be used.
> Customers are already preparing for the 20 and 25 Megahertz versions but
> there is no way to know now what their exact frequencies will end up.
> 
> If you or anyone else has any suggestions on how to do this give a yell.

Some years ago I was faced with the problem of "upgrading" to a faster mini
and wanted to use the same program for the "old" and "new" ones.  They were
enough different internally to require code to identify which was which, and
adapt accordingly.

I used a counting loop (once, on program start-up) to see whether the program
was running in the fast or slow machine, by checking to see how far it got in
a known amount of time.  In that case, I used an attached teletype machine
as a timer, since it took about 0.1 sec to print a character, and I watched
its "busy" flag in the counting loop.

I'm not proposing to put a TTY on a chip alongside the CPU (I doubt you can
do that ...) but rather a simple, independent (and not very accurate) timer
whose sole job would be to find out how fast the CPU clock is running.
Simple software could then set the proper value into an adjustable count-
down divider so a built-in timer, running off the divided CPU frequency,
would be practical.  The built-in timer need only be accurate enough to
choose among a set of (quantized) clock frequencies.


-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.UTEXAS.EDU

campbell@sauron.UUCP (Mark Campbell) (02/17/86)

In article <530@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>In article <156@motatl.UUCP>, wayne@motatl.UUCP (R.W.McGee) writes:
>> The use of software timing loops on an asynchronous
>> microprocessor should be discouraged...Public floggings would provide
>> a cure, but would be hard to implement.
>
>People who design microprocessors, who don't want software to depend
>on the timings of individual instructions in particular systems,
>should provide a system-independent way to delay for a specified
>amount of time.  We use whatever you give us, guys!
>
>E.g. in meeting the recovery time of a particularly good USART chip
>with a horrible bus interface, the Z8530, you need to wait 2.2us
>between writes to it.  Give me a good way to wait 2.2us *without*
>depending on instruction timing, and I'll consider your request.
>
>PS:  if your answer is "add more chips", a lot of people will cheap
>out and use "free" software timing loops.
>-- 
>John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa

I was going to leave this one alone until I heard the H/W developers on the
other side of the wall giggling about it.

NCR has some damned good H/W engineers; and some of the best of these work in
my division.  These guys are so good that they actually let the software drive
the architecture of a machine (what I consider the theoretical ideal, which is
seldom obtained in the "real-world").  Of course, the term "drive" implies a
high level of cooperation; however these guys really listen and make an
effort to support those hardware features that we want.  Unfortunately, there
are often constraints that cause us to miss seeing eye to eye.  As an example...

Recently, we began work on a new machine.  The very first thing the H/W guys
did was obtain a copy of "All the Chips that Fit" (by Lyon and Skudlarek, of Sun)
and proclaim that we wouldn't make the mistakes that Sun made.  The major premise
of the paper was that there were many chips that were on unfriendly terms with
Unix; and that these chips caused a great deal of pain to a Unix implementation.

Unfortunately, management then stepped in and gave us a unit price that was
terrifyingly low.  At the next review of the H/W, we suddenly found that we
were getting many of those chips, or clones of those chips, that were specifically
mentioned in the paper.  We screamed, they screamed, etc.  After digging through
the manuals, however, we found that there was very little that could be done given
their stringent constraints.  A great example was our specification of what we
now call "the mythical 32-bit, low-powered CMOS, battery-backed binary counter".
I keep getting told that a certain BCD TOD chip is really fast.  That doesn't do
me a whole hell of a lot of good when I have to use 2 or 3 pages of conversion code
to support it.

The one thing we did insist upon, though, was glue logic to support those chips
that had timing problems.  Mr. Gilmore stated that microprocessor designers should
design system-independent ways of dealing with delays.  What should really happen
is that the chip manufacturers incorporate the delay logic within those chips.
The use of the term "particularly good" when referring to a device with a major
flaw such as this is a non sequitur.  Software delay loops not only cause poor
performance (due to race conditions, interrupt latency, etc.) during porting but
usually come back to haunt you a year or two later, when you switch to cheaper
alternative sources for the devices.
-- 

Mark Campbell    Phone: (803)-791-6697     E-Mail: !ncsu!ncrcae!sauron!campbell

cmt@myrias.UUCP (Chris Thomson) (02/18/86)

In article <2795@amdahl.uucp> Mike Taylor writes:
> Well, on a *real* computer, you just set the TOD clock comparator for
> now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
> resist.

C'mon Mike!  Even a 5860 takes >5 us to context switch (twice).

jimb@amdcad.UUCP (Jim Budler) (02/19/86)

In article <6780@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes:
>...
>Sorry, but this makes it a particularly *bad* USART chip, regardless
>of any other features.
>Imagine writing a device driver for it, finding out that the C compiler
>generates such code that there's far more than 2.2us between writes, and
>leaving the place. Then, two years later, the site gets a new C compiler
>with a much better optimizer..........

Sorry, but this sounds like a bad device driver to me, not a bad device.
Depending on a poor C compiler for timing is just as bad as depending
on any other non-portable 'feature' of a C compiler.  The device driver 
could be broken by a new C compiler in any of a few thousand other
ways.
-- 
 Jim Budler
 Advanced Micro Devices, Inc.
 (408) 749-5806
 Usenet: {ucbvax,decwrl,ihnp4,allegra,intelca}!amdcad!jimb
 Compuserve:	72415,1200

rshepherd@euroies.UUCP (Roger Shepherd INMOS) (02/20/86)

Have a look at the transputer. Apart from a bug in REV A devices, all
transputers (no matter what speed selection) run off a standard (5 or 25 MHz)
clock frequency. This is used to derive the standard comms link
speed and the processor clock. The transputer's real-time
clock/alarm runs at 1 tick per microsecond at high priority;
this means that the occam program below will look at (e.g.) a UART
once every 5 uS. The accuracy achievable is quite good, as the interrupt
latency (time to go from low to high priority) is about 58 cycles worst
case (2.9 uS for a T414-20; currently available parts are -12, so they
have 4.64 uS latency).

PRI PAR
  SEQ 
    ... initialisation
    WHILE polling
      SEQ
        TIME ? AFTER nextinstance
        ... poll uart or whatever
        nextinstance := nextinstance + 5 -- 1 tick per uS
  ... -- rest of system at low priority
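
For readers without an occam compiler, the two ideas in the loop above
(advance an absolute deadline by a fixed step so that errors do not
accumulate, and compare times so that counter wraparound is harmless)
might look like this in C.  This is a sketch, not transputer code:
tick_after and next_deadline are hypothetical names, and a free-running
32-bit microsecond counter is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* True if tick count a is later than b on a free-running 32-bit counter.
 * The difference is taken modulo 2^32, so the comparison stays correct
 * across wraparound as long as the two times are less than 2^31 ticks
 * apart. */
static int tick_after(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(a - b) < 0x80000000u;
}

/* Advance an absolute deadline by a fixed period.  Stepping from the
 * previous deadline, rather than from "now", keeps the polling rate from
 * drifting when one pass through the loop runs late. */
static uint32_t next_deadline(uint32_t deadline, uint32_t period_ticks)
{
    assert(period_ticks > 0 && period_ticks < 0x80000000u);
    return deadline + period_ticks;   /* wraps modulo 2^32 by design */
}
```

A polling loop would wait until tick_after(now, deadline) is true, poll
the device, and then set deadline = next_deadline(deadline, 5).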
-- 
Roger Shepherd, INMOS Ltd, Whitefriars, Lewins Mead, Bristol, BS1 2NP, UK
Tel: +44 272 290861
UUCP: ...!mcvax!euroies!rshepherd

mat@amdahl.UUCP (Mike Taylor) (02/20/86)

In article <221@myrias.UUCP>, cmt@myrias.UUCP (Chris Thomson) writes:
> In article <2795@amdahl.uucp> Mike Taylor writes:
> > Well, on a *real* computer, you just set the TOD clock comparator for
> > now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
> > resist.
> 
> C'mon Mike!  Even a 5860 takes >5 us to context switch (twice).

Yes, but context switching isn't useful! Actually, just trying to point
out that for timing like 2.2 us., regardless of how good your timing
facility is (S/370 architecturally has 244 picosecond resolution), you can't
ignore instruction timing.  Fielding the external interrupt when the
clock comparator "hits," even without a full context switch, will take quite
a few cycles to save registers, etc. before being able to do any useful
work.  What is even worse is the more or less unpredictable timing
delays due to cache effects, (consistency, misses) to say nothing of
EC level changes (a cycle here or there...)  It is probably true to
say that you can't usefully time anything to a resolution better
than plus or minus 20 cycles (300 ns.) even on a machine which has good
timing facilities.
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]

andrew@aimmi.UUCP (Andrew Stewart) (02/21/86)

In article <2795@amdahl.UUCP> mat@amdahl.UUCP writes:

>In article <530@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
>> E.g. in meeting the recovery time of a particularly good USART chip
>> with a horrible bus interface, the Z8530, you need to wait 2.2us
>> between writes to it.  Give me a good way to wait 2.2us *without*
>> depending on instruction timing, and I'll consider your request.
>
>Well, on a *real* computer, you just set the TOD clock comparator for
>now+2.2 us. and go do something useful while you wait.  Sorry, couldn't
>resist.
>

On a *real* computer, you use the front panel keys. USARTS??? TOD clock???
Ha! (Remember the PDP-8, only bigger and better...)

	Andrew Stewart
-- 
-------------------------------------------
Andrew Stewart		 USENET:   ...!mcvax!ukc!aimmi!andrew

"My axioms just fell into a Klein bottle"

farren@well.UUCP (Mike Farren) (02/22/86)

In article <2817@amdahl.UUCP> mat@amdahl.UUCP (Mike Taylor) writes:
>(S/370 architecturally has 244 picosecond resolution)

   I admit to knowing little about the S/370, but a 4 GHz clock rate?
Can someone verify this, please?   I don't remember seeing any microwave
plumbing in a 370... :-)

-- 
           Mike Farren
           uucp: {your favorite backbone site}!hplabs!well!farren
           Fido: Sci-Fido, Fidonode 125/84, (415)655-0667

ka@hropus.UUCP (Kenneth Almquist) (02/22/86)

>> Sorry, but this makes it a particularly *bad* USART chip, regardless
>> of any other features.
>> Imagine writing a device driver for it, finding out that the C compiler
>> generates such code that there's far more than 2.2us between writes, and
>> leaving the place. Then, two years later, the site gets a new C compiler
>> with a much better optimizer..........	[JACK JANSEN]
>
> Sorry, but this sounds like a bad device driver to me, not a bad device.
> Depending on a poor C compiler for timing is just as bad as depending
> on any other non-portable 'feature' of a C compiler.  The device driver 
> could be broken by a new C compiler in any of a few thousand other
> ways.						[JIM BUDLER]

I don't like the idea of depending upon the C compiler for timing, but
what is the alternative?  Write the specific parts of the device driver
in assembly language?  This introduces maintenance problems of its own.

You frequently have to depend upon non-portable features of C when writing
device driver code, but of course device drivers are non-portable anyway.
				Kenneth Almquist
				ihnp4!houxm!hropus!ka	(official name)
				ihnp4!opus!ka		(shorter path)

jack@boring.uucp (Jack Jansen) (02/22/86)

In article <9645@amdcad.UUCP> jimb@amdcad.UUCP (Jim Budler) writes:
>In article <6780@boring.UUCP> I wrote:
>>...
>>Sorry, but this makes it a particularly *bad* USART chip, regardless
>>of any other features.
>>Imagine writing a device driver for it, finding out that the C compiler
>>generates such code that there's far more than 2.2us between writes, and
>>leaving the place. Then, two years later, the site gets a new C compiler
>>with a much better optimizer..........
>
>Sorry, but this sounds like a bad device driver to me, not a bad device.
>Depending on a poor C compiler for timing is just as bad as depending
>on any other non-portable 'feature' of a C compiler.  The device driver 
>could be broken by a new C compiler in any of a few thousand other
>ways.

The point here is that, even if you notice the small print in
the datasheet, you look at your driver and say "oh, there's
more than enough time in between writes", and forget about the
whole timing constraint in five minutes.

You're right that the driver could be broken in thousands of ways
by a new compiler, but this is *not* due to the driver, it is
due to the *device*.

There is no way you'll write a device driver that is guaranteed
to work with any C compiler, if you have to take care of timing
considerations (unless you're willing to pay the penalty of using
a *real* timer, of course).
-- 
	Jack Jansen, jack@mcvax.UUCP
	The shell is my oyster.

jimb@amdcad.UUCP (Jim Budler) (02/22/86)

In article <295@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
>>> Sorry, but this makes it a particularly *bad* USART chip, regardless
>>> of any other features.
>>> Imagine writing a device driver for it, finding out that the C compiler
>>> generates such code that there's far more than 2.2us between writes, and
>>> leaving the place. Then, two years later, the site gets a new C compiler
>>> with a much better optimizer..........	[JACK JANSEN]
>>
>> Sorry, but this sounds like a bad device driver to me, not a bad device.
>> Depending on a poor C compiler for timing is just as bad as depending
>> on any other non-portable 'feature' of a C compiler.  The device driver 
>> could be broken by a new C compiler in any of a few thousand other
>> ways.						[JIM BUDLER]
>
>I don't like the idea of depending upon the C compiler for timing, but
>what is the alternative?  Write the specific parts of the device driver
>in assembly language?  This introduces maintenance problems of its own.
>
>You frequently have to depend upon non-portable features of C when writing
>device driver code, but of course device drivers are non-portable anyway.

I guess I wasn't quite clear.  If whatever code you generated cannot
guarantee 2.2uS when run through the optimizer then you cannot say that
you have written a good device driver. I got another flame from somewhere
asking me how I thought it should be done without timing loops or
additional hardware. I didn't say not to use a timing loop, but if
you are going to do software timing DO software timing. i.e. put some real
code in there to GUARANTEE whatever time you want.

And yes, in a situation like this, trying to guarantee 2.2uS, I think
a short piece of assembly code to wait 3uS IS the answer. And how much
of a maintenance problem can 5 or 6 lines of assembly code be.  I've
seen many cases like the Vax _doprint in the Berkeley code and a few other
pieces of code with a couple of lines of in line assembly code in them.
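
Where in-line assembly is not wanted, a rough C approximation of "real
code that the optimizer cannot remove" is a loop on a volatile counter.
This is a sketch under stated assumptions, not a replacement for the
assembly: spin and delay_passes are hypothetical names, and the cost of
one pass (ns_per_pass) is a number that has to be measured on the actual
CPU, clock and memory.  volatile only guarantees that the loop is
executed; it says nothing about how long each pass takes.

```c
#include <assert.h>

/* Busy-wait for the given number of passes.  The counter is volatile,
 * so every read and write of it must actually be performed and the
 * compiler may not delete the loop as dead code.  Returns the number of
 * passes executed. */
static unsigned long spin(unsigned long passes)
{
    volatile unsigned long i;
    for (i = 0; i < passes; i++) {
        /* empty body: the volatile accesses are the delay */
    }
    return i;
}

/* Number of passes needed to cover at least delay_ns, rounded up so the
 * wait is never shorter than requested.  ns_per_pass is a measured,
 * machine-specific figure (an assumption of this sketch). */
static unsigned long delay_passes(unsigned long delay_ns,
                                  unsigned long ns_per_pass)
{
    assert(ns_per_pass > 0);
    return (delay_ns + ns_per_pass - 1) / ns_per_pass;
}
```

For example, with a hypothetical 400 ns per pass, delay_passes(2200, 400)
is 6, so spin(6) waits at least 2.4 uS on that machine.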
-- 
 Jim Budler
 Advanced Micro Devices, Inc.
 (408) 749-5806
 Usenet: {ucbvax,decwrl,ihnp4,allegra,intelca}!amdcad!jimb
 Compuserve:	72415,1200

davet@oakhill.UUCP (Dave Trissel) (02/23/86)

In article <689@well.UUCP> farren@well.UUCP (Mike Farren) writes:

>>(S/370 architecturally has 244 picosecond resolution)
>
>   I admit to knowing little about the S/370, but a 4 GHz clock rate?
>Can someone verify this, please?   I don't remember seeing any microwave
>plumbing in a 370... :-)

Back when I was working with 370's as a systems programmer, the time of day
(TOD) clock guaranteed that its resolution was finer than the
shortest possible instruction time.  In other words, you would always
get a unique value from the TOD clock even if you read it with back-to-back
store clock instructions.

What this indicated was that the clock resolution depended on the machine
model of 370.  The bottom-of-the-line 370s (370/25 if I remember correctly) were
so slow that a clock period of several microseconds would have sufficed.

  --  Dave Trissel  Motorola Austin
  {seismo,ihnp4}!ut-sally!im4u!oakhill!davet

cmt@myrias.UUCP (Chris Thomson) (02/24/86)

> >(S/370 architecturally has 244 picosecond resolution)
>    I admit to knowing little about the S/370, but a 4 GHz clock rate?

The 370 architecture has timers that are 64 bits wide, with the 12th bit
from the right (low order) end being 1 microsecond.  It is model-dependent
how many of the low-order 12 bits actually count, as opposed to holding zero
values.  However, the timer resolution should be similar to instruction
execution time, since the instruction store time of day is required to give
a different answer each time it is used, even on a multiple-CPU
configuration.  Current high-end 370 models have resolutions of a few
nanoseconds.
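
The 244 picosecond figure falls out of this layout: 1 microsecond
divided by 2^12 is about 244.14 ps.  A small C sketch of the arithmetic
follows.  It assumes 12 fraction bits below the 1-microsecond position,
as described above; tod_to_us and tod_resolution_ps are hypothetical
names, not part of any IBM interface.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits below the 1-microsecond position (assumed layout). */
#define TOD_FRACTION_BITS 12u

/* Whole microseconds in a 64-bit timer value: drop the fraction bits. */
static uint64_t tod_to_us(uint64_t tod)
{
    return tod >> TOD_FRACTION_BITS;
}

/* Step size in picoseconds when only `implemented` of the 12 fraction
 * bits actually count and the rest are held at zero.  1 us = 1e6 ps,
 * and each implemented fraction bit halves the step. */
static double tod_resolution_ps(unsigned implemented)
{
    assert(implemented <= TOD_FRACTION_BITS);
    return 1e6 / (double)(1u << implemented);
}
```

tod_resolution_ps(12) is 244.140625, the architectural limit, while a
model that implements none of the fraction bits has a 1 microsecond
step.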

jer@peora.UUCP (J. Eric Roskos) (02/24/86)

>    I admit to knowing little about the S/370, but a 4 GHz clock rate?
> Can someone verify this, please?   I don't remember seeing any microwave
> plumbing in a 370... :-)

Actually this involves a really interesting aspect of the nature of "time"
on a computer.  Suppose you have (hopefully without loss of generality :-))
a machine all of whose instructions take the same amount of time to execute.
Suppose it can execute 32,768 instructions per second.  Now, suppose you
have a clock that counts in 1/65536ths of a second.  Then, as far as you
are concerned, it's impossible to tell the clock is running that fast...
depending on when you began execution relative to the counter in the clock,
the low-order bit will always be a zero or one every time you look at it.

Although it's possibly not as obvious, the same thing happens if you don't
have such "round" numbers... if the timer is counting faster than your
CPU's basic cycle time (and if the CPU runs with a fixed-rate clock) then
the timer's counter will appear to be being incremented by some constant
value, and there's no way to tell it's going faster than that.

So, you can replace the faster timer with a much slower one that increments
its counter by this integer, and no one will be able to tell the difference.
Of course, this assumes that the timer doesn't do anything else, e.g.,
control some external devices which rely on the faster clock rate.
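
The 32,768 versus 65,536 case can be tried in a few lines of C.  This is
only a toy model of the argument: every instruction is assumed to take
exactly the same time, so the clock advances by a fixed number of ticks
between reads, and low_bit_is_constant is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Simulate a program reading a clock once per instruction.  The clock
 * advances ticks_per_instr counts between consecutive reads.  Returns 1
 * if the low-order bit reads the same on every sample, 0 otherwise. */
static int low_bit_is_constant(uint32_t start, uint32_t ticks_per_instr,
                               unsigned samples)
{
    assert(samples > 0);
    uint32_t first = start & 1u;      /* low bit at the first read */
    uint32_t counter = start;
    for (unsigned n = 0; n < samples; n++) {
        if ((counter & 1u) != first)
            return 0;                 /* the program saw the bit change */
        counter += ticks_per_instr;
    }
    return 1;
}
```

With a clock at 65,536 counts per second and 32,768 instructions per
second, ticks_per_instr is 2 and the low bit never changes, whatever the
starting value; with 3 ticks per instruction the bit does toggle.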

Actually this generalizes to an even more interesting idea, viz., that if
a CPU doesn't have any kind of reliable external clock to measure time
against, then if you stop the CPU's clock occasionally, or make it run
irregularly, the CPU's "idea" of time will be such that events external to
it that are happening at a constant rate will appear to the CPU to be
occurring irregularly.  So you get to experiment a little with the
relativistic nature of time this way.
-- 
UUCP: Ofc:  jer@peora.UUCP  Home: jer@jerpc.CCUR.UUCP  CCUR DNS: peora, pesnta
  US Mail:  MS 795; CONCURRENT Computer Corp. SDC; (A Perkin-Elmer Company)
	    2486 Sand Lake Road, Orlando, FL 32809-7642

mat@amdahl.UUCP (Mike Taylor) (02/24/86)

In article <689@well.UUCP>, farren@well.UUCP (Mike Farren) writes:
> In article <2817@amdahl.UUCP> mat@amdahl.UUCP (Mike Taylor) writes:
> >(S/370 architecturally has 244 picosecond resolution)
> 
>    I admit to knowing little about the S/370, but a 4 GHz clock rate?
> Can someone verify this, please?   I don't remember seeing any microwave
> plumbing in a 370... :-)
> 

Architecture, not necessarily implemented. See S/370 XA Principles of
Operation, IBM pub# SA-22-7085 pp.4-20,4-21
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]

sher@rochester.UUCP (David Sher) (02/24/86)

Just to introduce a theoretical note:
Wouldn't an entirely self timed architecture avoid the issue of
software timing loops?  I would think that this would put the problem
where it belongs, in the hardware.  Of course you pay a certain factor
for self timing (I think it depends on the size of the chunks of hardware
that are self timed).  

Probably this has no relevance to current real machines but I'm an academic
anyway.
-- 
-David Sher
sher@rochester
seismo!rochester!sher

dick@ucsfcca.UUCP (Dick Karpinski) (02/26/86)

In article <6790@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes:
>
>There is no way you'll write a device driver that is guaranteed
>to work with any C compiler, if you have to take care of timing
>considerations (unless you're willing to pay the penalty of using
>a *real* timer, of course).

I thought someone suggested a solution:  Build a timing loop, but
set its constant (how many cycles) using some other, possibly low
precision, timer to see how fast the loop is _today_ with this
compiler/clock/cpu-chip/whatever.  I _think_ that one can usually
count on those things remaining constant _during_ this run of the
program, i.e. between reboots of the OS.  I have heard of systems
which change their cpu clock on the fly, but you probably know
that when you write the device driver.  Is that enough?
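
A minimal sketch of that calibration in C, under some loud assumptions:
the standard clock() function stands in for the "other, possibly low
precision, timer" (a real driver would read whatever hardware timer the
board has), the names are hypothetical, and the result is only valid
while the CPU clock, caches and compiler output stay as they were when
it was measured.

```c
#include <assert.h>
#include <limits.h>
#include <time.h>

/* Busy-wait loop; the volatile counter keeps the optimizer from
 * deleting it. */
static void spin(unsigned long passes)
{
    volatile unsigned long i;
    for (i = 0; i < passes; i++) {
        /* empty */
    }
}

/* Measure how many loop passes run per microsecond on this machine,
 * today.  The pass count is doubled until one run lasts at least 10 ms,
 * so that the coarse timer's granularity is a small part of the total. */
static double calibrate_passes_per_us(void)
{
    unsigned long n = 1000;
    for (;;) {
        clock_t t0 = clock();
        spin(n);
        clock_t t1 = clock();
        double us = (double)(t1 - t0) * 1e6 / (double)CLOCKS_PER_SEC;
        if (us >= 10000.0)
            return (double)n / us;
        assert(n <= ULONG_MAX / 2);   /* give up rather than overflow */
        n *= 2;
    }
}

/* Passes needed for at least delay_ns: truncate, then add one pass as a
 * safety margin so the wait errs on the long side. */
static unsigned long passes_for_delay_ns(double passes_per_us,
                                         unsigned long delay_ns)
{
    assert(passes_per_us > 0.0);
    return (unsigned long)(passes_per_us * (double)delay_ns / 1000.0) + 1;
}
```

The driver would call calibrate_passes_per_us() once at start-up and
then use spin(passes_for_delay_ns(rate, 2200)) between writes.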

Dick
-- 

Dick Karpinski    Manager of Unix Services, UCSF Computer Center
UUCP: ...!ucbvax!ucsfcgl!cca.ucsf!dick   (415) 476-4529 (12-7)
BITNET: dick@ucsfcca   Compuserve: 70215,1277  Telemail: RKarpinski
USPS: U-76 UCSF, San Francisco, CA 94143

kevin@sun.uucp (Kevin Sheehan) (02/26/86)

Ok, anybody object to the last word on the subject? :-)

Actually, I am a little concerned about the reaction on two counts.

1) if you have a box where everything but this chip cycles ok, and
you are looking for that sexy point in the curve between cost
and performance - KNOW that the choice will (and according to religion,
should :-) be to do it in software. bienvenidos al mundo..

2) ok, now if it's in software, and assuming you checked the point, or it
was made known to you as the driver writer, (seems to be assumed so far
that you didn't get lucky) then you write whatever it takes to work
(timing loop, simple assignment) and DOCUMENT THAT ASSUMPTION!!!
right there, right then, at that line, at the top - but make
it known to the next schmuck.

I don't mind someone pulling stunts if it's NECESSARY, or sometimes just
the way it is.  (some are amusing, no?) I DO mind not knowing when it's
been done, and getting bit. 'nuff said?
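
One possible shape for "document that assumption", sketched in C: keep
the numbers the delay depends on as named constants next to the loop,
tag the comment so it can be found with grep, and let the compiler check
the arithmetic.  All the values here are hypothetical except the 2.2us
recovery time from this thread, and static_assert needs a C11 compiler.

```c
#include <assert.h>

/* TIMING ASSUMPTION: the delay below is a software loop.  It is only
 * correct for the CPU clock and per-pass cycle count named here.
 * Re-measure CYCLES_PER_PASS after any change of CPU, clock, memory
 * speed or compiler. */
#define CPU_HZ           10000000UL  /* hypothetical 10 MHz board clock   */
#define CYCLES_PER_PASS  10UL        /* hypothetical measured loop cost   */
#define SCC_RECOVERY_NS  2200UL      /* 2.2 uS between writes to the 8530 */

/* Passes needed to cover the recovery time, rounded up. */
#define SCC_DELAY_PASSES \
    ((SCC_RECOVERY_NS * (CPU_HZ / 1000000UL) + CYCLES_PER_PASS * 1000UL - 1) \
     / (CYCLES_PER_PASS * 1000UL))

static_assert(CPU_HZ % 1000000UL == 0,
              "CPU_HZ must be a whole number of MHz for this arithmetic");
static_assert(SCC_DELAY_PASSES * CYCLES_PER_PASS * 1000UL
                  >= SCC_RECOVERY_NS * (CPU_HZ / 1000000UL),
              "delay loop is shorter than the chip's recovery time");

/* Wait out the recovery time; returns the number of passes executed. */
static unsigned long scc_recovery_delay(void)
{
    volatile unsigned long pass;
    for (pass = 0; pass < SCC_DELAY_PASSES; pass++) {
        /* empty */
    }
    return pass;
}
```

With these made-up numbers the loop runs 3 passes (30 cycles, 3 uS).
Changing CPU_HZ recomputes the count, but CYCLES_PER_PASS still has to
be re-measured by hand, which is the part no comment can do for you.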

				l & h,
				kev

rb@ccivax.UUCP (rex ballard) (02/27/86)

What is really needed is a standard "time base" signal that is
independent of "Chip Clock" speed.  When Chip makers create a "wait X"
instruction where X is the time in microseconds, and put the necessary
loop/wait in the micro-code, I will be GLAD to stop using loops.  I have
always hated them (bursty DMA can blow timing too).  Something is
needed for that little window that is smaller than the RTC interrupt,
and bigger than a NOP or wait state.  Even a constant period NOP would
be nice.

With the 68020, even the old trick of running a big loop and timing
it against the RTC chip doesn't work (cache misses in shorter loops).

There are still a few places where very short timing intervals are
still needed, and have to be reasonably accurate.  Things like
hand-shaking, line-turnaround, device locking (where TAS won't do,
or needs to be done more than once).

The original point of using them for copy-protection schemes and
long trivial loops is valid.  Don't do it.

ron@brl-smoke.ARPA (Ron Natalie <ron>) (02/27/86)

> In article <2817@amdahl.UUCP> mat@amdahl.UUCP (Mike Taylor) writes:
> >(S/370 architecturally has 244 picosecond resolution)
> 
>    I admit to knowing little about the S/370, but a 4 GHz clock rate?
> Can someone verify this, please?   I don't remember seeing any microwave
> plumbing in a 370... :-)
> 

Who said anything about a 4 GHz clock rate?  All he said was the architecture
supported that resolution.  If the 370 time register were incremented by
one, it would be updated every 244 picoseconds.  However, each machine
in the line increments it by a somewhat larger number corresponding to the
speed of the clock on that processor.  Hence, the higher order bits have
been consistent across 10 years of processors and will continue to be
so for quite a few years to come, I would expect.
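
The arithmetic behind the 244 ps figure can be sketched as follows. It
relies on the S/370 TOD clock format, in which bit 51 of the 64-bit
register represents one microsecond, so the low-order bit (bit 63) is
worth 1/4096 of a microsecond. The function names and the step sizes
used below are illustrative assumptions, not values for any particular
model.

```c
#include <stdint.h>

/* Bit 51 = 1 us, so there are 2^12 low-order units per microsecond. */
#define TOD_UNITS_PER_US 4096u

/* Picoseconds represented by the low-order bit, truncated: 244. */
static uint32_t tod_lsb_ps(void)
{
    return 1000000u / TOD_UNITS_PER_US;
}

/* Whole microseconds in a difference of two TOD values. */
static uint64_t tod_to_us(uint64_t tod_delta)
{
    return tod_delta >> 12;
}

/* Amount a model adds to the register per tick if its clock steps
   every step_ns nanoseconds (step_ns is a hypothetical parameter). */
static uint64_t tod_increment(uint32_t step_ns)
{
    return ((uint64_t)step_ns * TOD_UNITS_PER_US) / 1000u;
}
```

A machine that steps its clock once per microsecond would add 4096 per
tick, and one that steps every 250 ns would add 1024. In both cases the
bits from position 51 upward read the same, which is the consistency
described above.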

-Ron

rcd@nbires.UUCP (Dick Dunn) (02/28/86)

Talking about the nasties of correcting funky hardware by putting in
software timing loops...
>...then you write whatever it takes to work
> (timing loop, simple assignment) and DOCUMENT THAT ASSUMPTION!!!
> right there, right then, at that line, at the top - but make
> it known to the next smuck.
> 
> I don't mind someone pulling stunts if it's NECESSARY, or sometimes just
> the way it is.  (some are amusing, no?) I DO mind not knowing when it's
> been done, and getting bit. 'nuff said?

The sentiment is right, but that doesn't solve the problem.  As whoever it
was (from SUN) said, suppose that you switch to a processor that's
functionally the same but has different timing.  Sure--you comment the
gotchas as best you can--but who's going to go looking for the comments
that indicate, say, specific gotchas between a 16.67 and 20 MHz 68020???
And how do you find them?  Do you have a working ELCgrep (that's English-
Language, Conceptual grep)?
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Worst-case analysis must never begin with "No one will ever want..."

marty@fritz.UUCP (Marty McFadden) (03/02/86)

I must agree that even well documented code is still burdensome if
timing loops are needed. I recently ported Unix* System V from the
68000 to the 68020 (16 2/3 MHz). Unfortunately, a few timing loops
had been put into the kernel and had to be changed. (Talk about
finding a needle in a haystack!!)

*Unix is a trademark of Bell Laboratories

					Martin J. McFadden
					FileNet Corp
					trwrb!fritz!marty

franka@mmintl.UUCP (Frank Adams) (03/04/86)

In article <6780@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes:
>>E.g. in meeting the recovery time of a particularly good USART chip
>>with a horrible bus interface, the Z8530, you need to wait 2.2us
>>between writes to it.
>
>Sorry, but this makes it a particularly *bad* USART chip, regardless
>of any other features.

It seems to me that from theoretical considerations, there will always be
*some* time dependencies in any device.  If you run it with a fast enough
processor, it will stop working.

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108

rb@ccivax.UUCP (rex ballard) (03/06/86)

In article <443@ucsfcca.UUCP> dick@ucsfcca.UUCP (Dick Karpinski) writes:
>In article <6790@boring.UUCP> jack@mcvax.UUCP (Jack Jansen) writes:
>>
>I thought someone suggested a solution:  Build a timing loop, but
>set its constant (how many cycles) using some other, possibly low
>precision, timer to see how fast the loop is _today_ with this
>compiler/clock/cpu-chip/whatever.  I _think_ that one can usually
>count on those things remaining constant _during_ this run of the
>program, i.e. between reboots of the OS.  I have heard of systems
>which change their cpu clock on the fly, but you probably know
>that when you write the device driver.  Is that enough?
>
>Dick
>-- 
This works ok UNLESS you have a high speed cache (like the 68020)
AND are servicing non-maskable interrupts.  The timing in a tight
loop can wander all over the place, depending on how often the
interrupts scramble your cache.  Problems can ALSO occur when
co-processors or multiple DMA devices are sharing the bus and your
processor has low priority.
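
The calibration scheme quoted above can be sketched roughly as below.
This is an illustration under stated assumptions, not anyone's actual
driver code: clock() stands in for the "low precision" timer, and spin,
time_spin_ns and loops_for_delay are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* The loop being calibrated.  volatile keeps the compiler from
   optimizing the empty loop away. */
static void spin(uint64_t n)
{
    volatile uint64_t i;

    for (i = 0; i < n; i++)
        ;
}

/* Time one trial run of spin() against the coarse timer.  Meant to
   be done once at start-up, with a trial long enough to register. */
static uint64_t time_spin_ns(uint64_t trial_loops)
{
    clock_t t0 = clock();
    spin(trial_loops);
    clock_t t1 = clock();

    return (uint64_t)(t1 - t0) * (1000000000u / (uint64_t)CLOCKS_PER_SEC);
}

/* Iterations needed for delay_ns, given that calib_loops iterations
   took calib_ns on this run.  Rounds up so the delay is never short
   of the calibrated rate.  The product can overflow for very large
   arguments. */
static uint64_t loops_for_delay(uint64_t calib_loops, uint64_t calib_ns,
                                uint64_t delay_ns)
{
    assert(calib_ns > 0);
    return (calib_loops * delay_ns + calib_ns - 1) / calib_ns;
}
```

For example, if a million iterations measured 2 ms, a 2.2 us delay
works out to 1100 iterations. The objection above still applies: the
figure is only as good as the assumption that cache behavior and bus
contention during the real delay match the calibration run.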

root@ucsfcca.UUCP (Computer Center) (03/08/86)

Although I haven't seen all the messages in this discussion I have
seen what appears to be a lumping together of two things which are
really different.

In the first case (and I believe this was the origin of the series)
there is the necessity of guaranteeing a minimum time between
critical events due to a latency in a hardware device. In this case,
just providing the instructions which ensure this delay in the
fastest case (e.g. all caches, pipelines, etc. functioning) is
good enough. You don't care about interrupts etc. since you
certainly aren't going to allow the interrupt routine access to
the device in such a case.
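
A small worked example of sizing for the fastest case. The 2.2 us
recovery time is the Z8530 figure quoted earlier in the thread; the
per-instruction times are hypothetical, as is the name pad_count.

```c
#include <assert.h>
#include <stdint.h>

/* Number of padding instructions needed so that the gap between two
   device accesses is at least min_gap_ns, assuming each padding
   instruction takes no less than fastest_insn_ns.  Rounds up. */
static uint32_t pad_count(uint32_t min_gap_ns, uint32_t fastest_insn_ns)
{
    assert(fastest_insn_ns > 0);
    return (min_gap_ns + fastest_insn_ns - 1) / fastest_insn_ns;
}

/* Gap actually produced when each padding instruction takes insn_ns. */
static uint32_t gap_ns(uint32_t count, uint32_t insn_ns)
{
    return count * insn_ns;
}
```

With an assumed fastest instruction time of 120 ns, 19 padding
instructions cover 2.2 us. Anything that slows execution (a cache miss,
a wait state, an interrupt) only makes the gap longer, which is why the
minimum-time case is the easy one.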

The other case is opposite, i.e. guaranteeing a maximum time between
critical events. This can never be satisfied if unconstrained
interrupt processing is allowed.

The real issue worth discussing then appears to be what constraints
are sufficient and minimal to allow the requirement to be satisfied.

Thos Sumner  (...ucbvax!ucsfcgl!ucsfcca!thos)

greg@utcsri.UUCP (Gregory Smith) (03/10/86)

In article <1162@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>It seems to me that from theoretical considerations, there will always be
>*some* time dependencies in any device.  If you run it with a fast enough
>processor, it will stop working.

False. A processor-to-device interface can be designed in such a way that
an access to a slow device will cause the processor to be 'stopped' until
the device is ready. This can be done in a port-dependent way, i.e. if
there is only one slow device on the bus, the processor will only be
slowed when that device is accessed. The 'stopped' state of the processor
is sometimes called a 'wait' state.
On many systems, this technique is the rule rather than the exception -
I think UNIBUS is an example.
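
A simplified model of the idea, assuming a synchronous bus where the
interface stretches an access by whole clock cycles. Real handshakes
such as DTACK on the 68k are asynchronous, so this is only an
approximation, and all the names and numbers are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Wait states the interface inserts so that an access of base_cycles
   clocks lasts at least device_ns. */
static uint32_t wait_states(uint32_t device_ns, uint32_t clock_ns,
                            uint32_t base_cycles)
{
    uint32_t needed;

    assert(clock_ns > 0);
    needed = (device_ns + clock_ns - 1) / clock_ns; /* cycles, rounded up */
    return needed > base_cycles ? needed - base_cycles : 0;
}

/* Total duration of the stretched access in nanoseconds. */
static uint32_t access_ns(uint32_t device_ns, uint32_t clock_ns,
                          uint32_t base_cycles)
{
    return (base_cycles + wait_states(device_ns, clock_ns, base_cycles))
           * clock_ns;
}
```

For a hypothetical device needing 500 ns and a 4-cycle access, a 125 ns
clock needs no wait states, a 60 ns clock gets 5, and a 40 ns clock
gets 9. In every case the access still lasts at least 500 ns, so a
faster processor does not break the device.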

-- 
"So this is it. We're going to die."	- Arthur Dent
----------------------------------------------------------------------
Greg Smith     University of Toronto       ..!decvax!utzoo!utcsri!greg