[comp.sys.ibm.pc] .84 usec Timer resolution on PC

norsk@sequent.UUCP (Doug Thompson) (11/22/89)

Timing resolution of .84 microseconds is possible on a PC.

In the April 1984 issue of the late PC TECH JOURNAL, there was an article
titled "Life in the Fast Lane", by Bob Smith and Tom Puckett, that 
describes the mechanism for obtaining very good timing resolution from the
PC 8253-5 Programmable Interval Timer (PIT).

The clock interrupt is generated by Timer 0 of the PIT. The input to the
timer is 1.193182 MHz, with a period of 0.838095 microseconds, or .84 us.
The technique is to change the programming mode of Timer 0 of the PIT from
Mode 3 (which is a square wave output where the count down counter decrements
by 2 during the first half of the cycle, then again during the last half) to
Mode 2 (which has the output normally high, but when the count down reaches 0,
the output goes low for 1 timer-clock cycle, then goes high again).

In this mode, the current value of the count down register can be queried
with a Timer command. The command causes the PIT to latch the current value
and make it available for reading. Because the counter counts down, negating
the latched value gives the count of timer-clock cycles that have occurred
since the last Timer 0 interrupt. Coupled with the interrupt count in the
BIOS data area, this lets one measure the length of short events with very
fine resolution.
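
As a rough check of the figures above, the arithmetic can be sketched in a
few lines of Python. This is only an illustration, not code from the article;
the function names are made up here, and the full count of 65536 is the
reload value used in the programming sequence given in this post.

```python
# Worked arithmetic for the Timer 0 figures (illustrative sketch only).
PIT_INPUT_HZ = 1_193_182   # Timer 0 input clock quoted in the post
FULL_COUNT = 65_536        # a reload value of 0 is treated as 65536 counts


def pit_period_us(input_hz=PIT_INPUT_HZ):
    """Length of one timer-clock cycle in microseconds."""
    return 1_000_000 / input_hz


def interrupt_interval_ms(count=FULL_COUNT, input_hz=PIT_INPUT_HZ):
    """Time between Timer 0 interrupts for a given reload count, in ms."""
    return count * 1_000 / input_hz


def interrupts_per_second(count=FULL_COUNT, input_hz=PIT_INPUT_HZ):
    """Timer 0 interrupt rate for a given reload count."""
    return input_hz / count
```

With the defaults, one cycle comes out at about 0.838 us, and a full count
of 65536 cycles at about 54.9 ms, or roughly 18.2 interrupts per second.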


To program Timer 0 to Mode 2 do the following:

	mov	al, 34h		; Timer 0 - Mode 2
	out	43h, al
	xor	al, al		; Get ZERO to give full count of 65536
	jmp 	$+2
	out	40h, al		; load low order byte
	jmp	$+2
	out	40h, al		; hi order byte

To get a current timer value do the following:

	...
	mov	ax, 40h		; BIOS data segment
	mov	es, ax
	mov	al, 00h		; Latch Timer 0 counter cmd
	cli			; critical section hold off ints
	out	43h, al		; issue latch cmd
	mov	bx, es:[6ch]	; get BIOS timer low word
	mov	cx, es:[6eh]	; get BIOS timer hi word
	in	al, 40h		; fetch latched counter
	mov	ah, al
	jmp	$+1
	in	al, 40h		; fetch hi order latched counter
	sti
	xchg	ah, al		; get in proper byte order
	neg	ax		; get up count from down count
	xchg	ax, cx		; return in common register order

	AX has MSWord of timer interrupt count
	BX has LSWord of timer interrupt count
	CX has count of timer-clock cycles since last timer interrupt (.84us)
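
	To turn two such samples into an elapsed time, the three registers
	can be combined as in the Python sketch below. This is an
	illustration only: the function names are hypothetical, it assumes
	Timer 0 was loaded with 0 (a full count of 65536) as shown above,
	and it ignores the BIOS count being reset at midnight.

```python
# Combining the returned registers into an elapsed time (illustrative sketch).
PIT_INPUT_HZ = 1_193_182   # Timer 0 input clock quoted in the post
FULL_COUNT = 65_536        # assumes Timer 0 was loaded with 0, as above


def up_count(latched):
    """Mirror of 'neg ax': turn the 16-bit down count into cycles elapsed."""
    return (-latched) & 0xFFFF


def total_cycles(ax, bx, cx):
    """Combine the interrupt count (AX:BX) with the sub-tick count (CX)."""
    ticks = (ax << 16) | bx
    return ticks * FULL_COUNT + cx


def elapsed_us(start, end):
    """Microseconds between two (AX, BX, CX) samples."""
    cycles = total_cycles(*end) - total_cycles(*start)
    return cycles * 1_000_000 / PIT_INPUT_HZ
```

	For example, elapsed_us((0, 0, 0), (0, 18, 13534)) spans 1193182
	timer-clock cycles, which is one second.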


	
-- 
Douglas Thompson		UUCP: ..{tektronix,ogcvax,uunet}!sequent!norsk
Sequent Computer Systems	Phone: (503) 526-5727
15450 SW Koll Parkway	!"The scientist builds to learn;the engineer learns in
Beaverton OR 97006	!order to build."  Fred Brooks

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (11/22/89)

In article <25235@sequent.UUCP> norsk@sequent.UUCP (Doug Thompson) writes:

| To program Timer 0 to Mode 2 do the following:
| 
| 	mov	al, 34h		; Timer 0 - Mode 2
| 	out	43h, al
| 	xor	al, al		; Get ZERO to give full count of 65536
| 	jmp 	$+2
| 	out	40h, al		; load low order byte
| 	jmp	$+2
| 	out	40h, al		; hi order byte

  Could you explain the "jmp $+2" instructions? I mean, I know what they
do, but not why you're doing it. If this is supposed to be a software
delay (a) comments are nice and (b) sad experience tells me that fast
cpus will need more delay than slow cpus.

  A little clarification, please, as to the intent?  Also the "jmp $+1"
in the second part?
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

norsk@sequent.UUCP (Doug Thompson) (11/23/89)

In article <1757@crdos1.crd.ge.COM> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <25235@sequent.UUCP> norsk@sequent.UUCP (Doug Thompson) writes:
>
>| To program Timer 0 to Mode 2 do the following:
>| 
>| 	mov	al, 34h		; Timer 0 - Mode 2
>| 	out	43h, al
>| 	xor	al, al		; Get ZERO to give full count of 65536
>| 	jmp 	$+2
>| 	out	40h, al		; load low order byte
>| 	jmp	$+2
>| 	out	40h, al		; hi order byte
>
>  Could you explain the "jmp $+2" instructions? I mean, I know what they
>do, but not why you're doing it. If this is supposed to be a software
>delay (a) comments are nice and (b) sad experience tells me that fast
>cpus will need more delay than slow cpus.
>
>  A little clarification, please, as to the intent?  Also the "jmp $+1"
>in the second part?
>-- 

Sorry if this caused some confusion. What I did was copy the "main" parts
of the code from the article cited to give everyone an idea of what I
was talking about. The 'jmp $+1' obviously was a typo. It should be a
jmp $+2 like the others. They are a delay tactic: the jump flushes the
instruction queue of the chip so that the IO devices have a chance to
recover. Faster CPUs would have to adjust for more delay. (BTW the code I
copied used a single 'nop', which provides almost no delay. And this delay
tactic won't work on the 486, since jmps don't flush the queue. The 486
decodes BOTH paths of all jumps to speed up the jump when taken - sure is a
good use of silicon in my opinion.)

saify@cbnewsl.ATT.COM (saify.lanewala) (11/29/89)

In article <1757@crdos1.crd.ge.COM>, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
> In article <25235@sequent.UUCP> norsk@sequent.UUCP (Doug Thompson) writes:
> 
> | To program Timer 0 to Mode 2 do the following:
> | 
> | 	jmp 	$+2
> 
>   Could you explain the "jmp $+2" instructions? I mean, I know what they
> do, but not why you're doing it. If this is supposed to be a software
> 

That instruction is intended to flush the pre-fetch queue.  I read that
somewhere, but cannot give more insight.  Perhaps some hardware guru .....

There's an article in Programmer's Journal of last month (I believe) that
describes in detail an implementation of a fast timer for use in doing
performance measurements of small sections of graphics code.

There's also information on the jmp $+2 instruction in that article,
if you're interested.

Good luck.

Saify Lanewala
....attunix!stl

brad@optilink.UUCP (Brad Yearwood) (11/30/89)

From article <3056@cbnewsl.ATT.COM>, by saify@cbnewsl.ATT.COM (saify.lanewala):
> In article <1757@crdos1.crd.ge.COM>, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
>> In article <25235@sequent.UUCP> norsk@sequent.UUCP (Doug Thompson) writes:
>> 
>> | To program Timer 0 to Mode 2 do the following:
>> | 
>> | 	jmp 	$+2
>> 
>>   Could you explain the "jmp $+2" instructions? I mean, I know what they
> 
> That instruction is intended to flush the pre-fetch queue.  I read that
> somewhere, but cannot give more insight.  Perhaps some hardware guru .....

The immediate consequence of jmp $+2 is indeed to flush the prefetch queue.
The actual intent is to prevent I/O read or write cycles from proximate IN or
OUT instructions from being presented to a peripheral chip in too-rapid
succession.  Many of the PC peripheral chips (interrupt controllers etc.) date
back to the days of slow processors such as the 8085.  Frequently, they
require a command recovery time of a couple of microseconds or so after
accepting a command write or status read before they can handle another
operation.

The efficient instruction prefetch, decode, and execution queue of the '286
and '386 can result in multiple I/O read or write cycles occurring in immediate
succession.  The jmp $+2 flushes the pipeline, thereby forcing at least one
intervening memory cycle (an instruction fetch for the target instruction of
the jump) to some address other than the peripheral chip.  With the classic
6MHz AT, the resulting delay was sufficient to meet the peripheral chip's
command recovery time.  Of course, with a 25MHz 386 and perhaps a cache in
today's machines, one must take care that jmp $+2 is really sufficient for
the particular peripheral chip in the fastest possible case.  It might be
necessary to do several jmp $+2 (or some other delay maneuver).
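
To see why a delay that was adequate on a 6MHz AT may fall short on a 25MHz
386, it helps to express the recovery time as a budget of CPU clocks. The
Python sketch below is only an illustration: the function name is made up,
and the 2 microsecond figure stands in for "a couple of microseconds" and is
not taken from any particular chip's data sheet.

```python
# CPU clocks that must pass to cover a peripheral's recovery time (sketch).
import math


def cpu_clocks_for_recovery(recovery_us, cpu_mhz):
    """Whole CPU clock cycles needed to span recovery_us microseconds."""
    return math.ceil(recovery_us * cpu_mhz)
```

For an assumed 2 microsecond recovery time, cpu_clocks_for_recovery(2, 6)
gives 12 clocks, while cpu_clocks_for_recovery(2, 25) gives 50, so the same
instruction sequence has to burn about four times as many clocks on the
faster machine.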

Brad Yearwood
Optilink Corp.  {pyramid, tekbspa, pixar}!optilink!brad
Petaluma, CA

pipkins@qmsseq.imagen.com (Jeff Pipkins) (12/05/89)

The IBM AT Tech. Ref. manual says that successive IN and OUT
instructions, which "lock the bus" can starve out DMA transfers.
Do any of you hardware guys know more about this?

It seems to me that if I were making an AT compatible machine that
was faster, it would have to run AT software WITHOUT MODIFICATION
in order to be anything but worthless.  This means that the designers
should compensate for only getting one Jmp short +2 instruction
between INs and OUTs.  Someone please tell me this is true.  While
you're at it, maybe you can tell me that it was all a mistake and
that IBM never really screwed up this bad in the first place... @;-)

Jeff Pipkins
pipkins@imagen.COM
I am not authorized to speak for anyone but me.  (That includes
my wife).

norsk@sequent.UUCP (Doug Thompson) (12/06/89)

In article <57@qmsseq.imagen.com> pipkins@qmsseq.UUCP (Jeff Pipkins) writes:
>The IBM AT Tech. Ref. manual says that successive IN and OUT
>instructions, which "lock the bus" can starve out DMA transfers.
>Do any of you hardware guys know more about this?
>
>It seems to me that if I were making an AT compatible machine that
>was faster, it would have to run AT software WITHOUT MODIFICATION
>in order to be anything but worthless.  This means that the designers
>should compensate for only getting one Jmp short +2 instruction
>between INs and OUTs.  Someone please tell me this is true.  While
>you're at it, maybe you can tell me that it was all a mistake and
>that IBM never really screwed up this bad in the first place... @;-)
>

I believe Compaq has the necessary hardware that detects such IO
accesses and automatically does the required waiting (at least on
my 20/e). And IBM did screw it up - just look at the BIOS source code
in your Tech-o-Ref manual and you'll see it peppered with 'em.
