[comp.arch] 64 bit clocks

mo@messy.bellcore.com (Michael O'Dell) (08/23/90)

Precise and repeatable accounting gets more and more important
as machines get faster, since it is easy to lose a large number
of cycles. Market experience of some of our folks indicated
that this was an area requiring serious attention.
The time system in the ill-fated Prisma P1 was built to allow
very accurate user-mode measurement of short loops, etc.
The clocks were all machine cycle-counters, since with
a 4-6 nanosecond clock, enough D-flops to synchronize
with anything else would be as large as the rest of the machine!
Both 64-bit registers latched all 64 bits synchronously,
even though the path to them was only 32 bits (register/ALU data width).
(Brian Berliner, now of Sun but most recently of CVS-II fame, did the kernel
time-management code and participated heavily in the shaping
of these facilities.)

TOE - Time Of Eternity clock

	64-bit up-counter, zeroed at machine reset but not randomly
	loadable, incrementing continuously at the machine cycle clock
	rate, readable with one alternate-address-space-load-double.

PVT - Process Virtual Time

	64-bit up-counter, randomly loadable from Supervisor Mode,
	increments at the machine cycle clock rate only when the
	processor is in User Mode, readable from User Mode with one
	alternate-address-space-load-double. 

QCT - Quantum Count-down Timer

	32-bit down-counter which generates an interrupt when the counter
	goes to zero.

The TOE clock formed the system global timebase used instead of
incrementing a counter inside "hardclock()". The precision
allowed us to keep track of interrupt time independently of
other kernel time and other nice things.
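
As a rough illustration of keeping interrupt time apart from other kernel time with a free-running cycle counter, here is a minimal sketch in C. It is not the P1 kernel code: the struct, the function names, and the idea of passing the TOE reading in as a parameter are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical accounting state; none of these names come from the P1 kernel. */
struct time_acct {
    uint64_t kernel_cycles; /* cycles charged to ordinary kernel time */
    uint64_t intr_cycles;   /* cycles charged to interrupt handling */
    uint64_t mark;          /* TOE reading at the last boundary */
};

/* Called on interrupt entry: charge the elapsed cycles to kernel time.
 * toe_now stands in for a read of the 64-bit TOE counter. */
void intr_enter(struct time_acct *a, uint64_t toe_now)
{
    a->kernel_cycles += toe_now - a->mark;
    a->mark = toe_now;
}

/* Called on interrupt exit: charge the elapsed cycles to interrupt time. */
void intr_exit(struct time_acct *a, uint64_t toe_now)
{
    a->intr_cycles += toe_now - a->mark;
    a->mark = toe_now;
}
```

Because the counter never stops and is never reloaded, each charge is just a subtraction of two readings, with no tick granularity involved.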

The PVT clock was exchanged on context-switches so programs
would see very repeatable accounting info and they could
measure small loops themselves (or rather, the C library
could be enhanced to get process timing stuff quite accurately).
Alas, the PVT was also accurate enough that you would certainly
see cache reloads and such, making for a certain amount of 
"phase noise," but that was the physical reality of the machine.
For reasonable length loops, though, the data should have been
pretty good. (Integrate, integrate, integrate!!!)
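
The exchange of the PVT on a context switch can be sketched as below. This is only a model under stated assumptions: `pvt_reg` is a plain variable standing in for the hardware register, `pvt_run_user()` simulates the counter advancing in User Mode, and `struct proc` and all function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-process state: the PVT value saved while not running. */
struct proc {
    uint64_t pvt_saved;
};

/* Stand-in for the hardware PVT register (assumption: modelled in software). */
static uint64_t pvt_reg;

/* Simulate the counter advancing while the processor is in User Mode. */
void pvt_run_user(uint64_t cycles)
{
    pvt_reg += cycles;
}

/* What a User Mode read of the PVT would return in this model. */
uint64_t pvt_read(void)
{
    return pvt_reg;
}

/* Context switch: save the outgoing process's count, load the incoming one. */
void pvt_switch(struct proc *from, struct proc *to)
{
    from->pvt_saved = pvt_reg;
    pvt_reg = to->pvt_saved;
}
```

A process timing a small loop would read the counter before and after and subtract; since the count is swapped out with the process, cycles spent in other processes never show up in the difference.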

The QCT was used in the new scheduler to do the variable-quantum
scheduling.  The P1 scheduler played with quantum size as well
as dispatching order to accomplish all its tricks.  When a process
was running, in theory, there shouldn't be any timer interrupts
going on.  In fact, you still needed the 1ms ticker to get
tty output going again, etc., so there was a flywheel which
resubmitted a 1ms interval to the timeout queue.  But the
quantum expiry processing was done from the timeout queue as well!

There was also a real-time clock chip which was used primarily
to remember the time when the machine was turned off. (It seems
obvious that a million-dollar computer would be expected to provide
at least the functionality of a $10 digital watch, but people
took some serious convincing of that...)

It wasn't perfect, but it seemed to accomplish what we wanted.

	-Mike O'Dell
	ex-Prisma

PS - the Load Alternate and Store Alternate instructions in 
the SPARC made all this MUCH easier than it would have been
if more traditional "memory mapping" had been the only
alternative.

aglew@dual.crhc.uiuc.edu (Andy Glew) (08/24/90)

As I have noted in several previous postings, I have seen hardware
designs where large (64 bit) timers could not be provided, because the
timer could not increment within the clock cycle of the machine
(without an expensive carry lookahead network).

I suggested providing the time in carry-save format, and letting
software postprocess such a timestamp if desired.

It should be obvious that a normal binary encoding and the carry save
encoding of the time can be combined.
    E.g., build a cheap ripple timer with a large number of bits (say 61
bits).  If the greatest length that a ripple can complete in a machine
cycle is 16 bits, latch the carries at the 16th, 32nd, and 48th bit
position.  Conveniently, those three bits can fit in the lower part of
a 64 bit word returned.  You can use this directly as a timestamp;
true time is obtained as
    	
    true_time = (timestamp >> 3) + ((timestamp&1)<<16)
    	    	    	    	 + ((timestamp&2)<<(32-1))
    	    	    	    	 + ((timestamp&4)<<(48-2))
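
The formula above can be written as a small C function. This is a sketch assuming the layout described in the text: the 61-bit ripple count in the upper bits and the three latched carries in bits 0, 1, and 2. `make_timestamp()` is a hypothetical helper that only models what the hardware would return, so the decoder can be exercised.

```c
#include <stdint.h>

/* Decode a mixed binary/carry-save timestamp.
 * Assumed layout: bits 63..3 = ripple counter, bit 0 = pending carry into
 * bit 16, bit 1 = pending carry into bit 32, bit 2 = pending carry into
 * bit 48. */
uint64_t true_time(uint64_t timestamp)
{
    uint64_t count = timestamp >> 3;
    uint64_t c16 = timestamp & 1u;
    uint64_t c32 = (timestamp >> 1) & 1u;
    uint64_t c48 = (timestamp >> 2) & 1u;

    /* Add each pending carry at the bit position it was latched from. */
    return count + (c16 << 16) + (c32 << 32) + (c48 << 48);
}

/* Hypothetical model of the value the hardware would return. */
uint64_t make_timestamp(uint64_t count, unsigned c16, unsigned c32, unsigned c48)
{
    return (count << 3)
         | (uint64_t)(c16 & 1u)
         | ((uint64_t)(c32 & 1u) << 1)
         | ((uint64_t)(c48 & 1u) << 2);
}
```

Extracting each carry as a 0/1 value and shifting it to its own position gives the same result as the `(32-1)` and `(48-2)` shifts in the formula, just with the bit positions spelled out.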

--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]