mo@messy.bellcore.com (Michael O'Dell) (08/10/90)
One of the nice things about using 64 bits for the time is that you can then put it in nanoseconds - which you *almost* really need on really fast machines. (100ns might be ok, but the difference is still contained in 64 bits, so just do it!!) -Mike
dhoyt@vw.acs.umn.edu (08/11/90)
In article <26012@bellcore.bellcore.com>, mo@messy.bellcore.com (Michael O'Dell) writes...
>One of the nice things about using 64 bits for the time is that
>you can then put it in nanoseconds - which you *almost* really
>need on really fast machines. (100ns might be ok, but the
>difference is still contained in 64 bits, so just do it!!)

Actually you will still want two quantities: date and time.  A date, measured in milliseconds or microseconds, to handle dates and universal time.  That would handle most people, with the exception of the bang/whimper types.  You would also want an interval time.  This ideally would be in sub-picosecond units, perhaps even as a distance, in this day and age.  That would allow your OS to run (and report on) the test equipment for new SSDs, particle accelerators, 50-meter dashes and other transient experiments in a consistent, even manner.

david paul hoyt | dhoyt@vx.acs.umn.edu | dhoyt@umnacvx.bitnet
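For concreteness, here is a rough C sketch of that two-quantity split. The type names, field names and units are hypothetical, chosen only to illustrate keeping calendar dates and fine-grained intervals as separate things with different ranges:

    /* Hypothetical illustration only - not an existing API. */
    #include <stdint.h>

    typedef struct {
        int64_t usec;   /* absolute date: microseconds from an agreed epoch;
                           64 bits of microseconds spans roughly 292,000 years */
    } abs_date_t;

    typedef struct {
        int64_t fsec;   /* interval: femtoseconds (sub-picosecond);
                           64 bits of femtoseconds spans only about 2.5 hours,
                           fine for transient experiments, useless for dates */
    } interval_t;

The range comments are the point: the unit that makes sense for a calendar date is far too coarse for a transient experiment, and vice versa, which is why one format does not comfortably serve both.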
colin@array.UUCP (Colin Plumb) (08/12/90)
Well, then, it's a good thing that Unix times are unsigned, and will last until 2106.

For accuracy, use NTP timestamps, which are 32.32-bit fixed-point integers, giving around 0.2 ns resolution (232830643 attoseconds, if you're fussy).  It will run out shortly after 06:20 GMT 31 Jan 2106.  I can't tell you the exact time, because it depends on the number of leap seconds used until then - it's based on atomic time, while GMT is astronomical, and the offset from atomic time is periodically diddled.
--
	-Colin
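Picking apart a 32.32 fixed-point timestamp is just a shift and a scale; a minimal C sketch (the helper name is mine, not anything from NTP's reference code):

    #include <stdint.h>

    /* Split a 32.32 fixed-point timestamp (seconds.fraction) into whole
     * seconds and nanoseconds.  One fraction LSB is 2^-32 s, about 0.23 ns.
     */
    static void split_fixed3232(uint64_t ts, uint32_t *sec, uint32_t *nsec)
    {
        *sec  = (uint32_t)(ts >> 32);
        /* scale the 32-bit fraction to nanoseconds: frac * 10^9 / 2^32 */
        *nsec = (uint32_t)(((uint64_t)(uint32_t)ts * 1000000000ULL) >> 32);
    }

The intermediate product fits in 64 bits (the fraction is below 2^32 and 10^9 is below 2^30), so no extra precision is needed.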
andrew@alice.UUCP (Andrew Hume) (08/15/90)
In article <26012@bellcore.bellcore.com>, mo@messy.bellcore.com (Michael O'Dell) writes:
> One of the nice things about using 64 bits for the time is that
> you can then put it in nanoseconds - which you *almost* really
> need on really fast machines. (100ns might be ok, but the
> difference is still contained in 64 bits, so just do it!!)
>
> -Mike

this is nearly true. it is clear that support for high resolution time is needed and a quantum of around 1-10 ns is about right. however, the problem is that all the work around ISO (and there is a LOT of it) on date/time formats varies considerably on the number of bits required, but is tending towards 128 bits (high resolution + all dates (including BC)).

i note in passing that VMS (i think) has some funny date like 1858 as its epoch, the so-called smithsonian time. i also note that in the third edition of unix, when the time was measured in clock ticks (and thus wrapped around every 2.? yrs), ken proposed to deal with wraparound by changing the epoch in the manual and running a special fsck-like program that subtracted a year from every inode.
aglew@dwarfs.crhc.uiuc.edu (Andy Glew) (08/22/90)
..> Time formats

While my benchmarks chug away, mind if I have my say about time?  Time formats are (1) something that I have strong opinions about, and (2) something that I believe I know quite a bit about, since I have spent most of my career making exact time measurements on computers in one form or another, wrestling with the various time formats.

A 64 bit, 1 bit per nanosecond (or 128 bit, 1 bit per picosecond??), time format is great FOR PORTABLE USES OF TIME.  Like timestamping files, maybe even a little bit of low accuracy performance measurement.

Such a fixed format, though, is EVIL and MISLEADING for high accuracy time measurement.  Here are some reasons:

Very few machines have a cycle time that is really commensurate, in integer multiples, with exact units like nanoseconds.  Typically, such machines have cycle times that are, say, 1.0012 ns long.  Now, very few hardware designers are going to put a divider in to account for this small deviation - instead, they'll just use a counter, and assume that the deviation is negligible, or can be handled some other way.  In other words, a hardware designer who says that he is giving you an exact nanosecond clock is probably *lying*.  The best he can do is give you a clock accurate to several ppm (or parts in 10^13 - you just move the point around).  There is a science/engineering discipline, called metrology (or horology), that specializes in how to make really accurate measurements.  Most real designers are not (and do not need to be) trained metrologists.

Who cares about parts in 1E13?  The guy who is trying to make really accurate measurements.  I have been able to make measurements where I could see the effect of a single cache miss in a long section of code - after I had gone through the process of converting "nanoseconds according to the processor" into "the closest thing I can get to real nanoseconds".  Occasionally one reads papers where it is obvious that the researchers did not go through this process, i.e., where they assumed that the hardware-reported nanoseconds were real nanoseconds.

It would be better if we were just honest and admitted that machines don't run in nanoseconds - rather, they run in whatever is the fastest convenient time for the part.  Even if vendor A says "I can build a nanosecond clock more regular than any constraint you put on it", vendor B may not be able to, due to differences in technology or format.

Very few machines have a cycle time that is really perfectly regular.  I.e., that supposedly nanosecond clock might be 0.999999867 now, and 1.0000001234 sometime later, according to changes in temperature, humidity, the local EM environment, whatever.  I have observed such effects on several real systems, both CMOS and NMOS.

Such effects are well known, which is why most timer hardware has a "timer correction" facility.  E.g., when you compare your clock to another clock, and find that you are 0.00000001256 out of synch, you tell your clock to shorten its cycle by 0.00000000001 every tick - i.e., to tick faster.  So, in addition to the physical/environmental variations in timer cycle, you have a designed-in time warp.  You can't make measurements of a higher resolution than this without standing on your head.  To use a software example that is probably more familiar to the readers of this group: the BSD "adjtime()" facility warps the meaning of time, so that nanoseconds are not constant.  Instead of warping the clock rate, which is difficult to do, some timers drop a tick every N ticks or so.
So, therefore, your timer is not accurate to its least significant bit - a delta of 2 LSBs might mean a 1 ns difference instead of 2 ns, if you were unlucky.

Provide your "high resolution" 1 ns per bit timer if you must.  But, for the people who really need high resolution timers, provide a "RAW" timer that ticks at whatever is the most convenient tick rate for your machine.  Try to make the ticks as regular as possible.  Don't play any tricks like warping the tick rate or dropping ticks.  Characterize the ticks as well as you can.  Provide a software library to convert from this RAW timer format to the portable timer format.  Ensure that the software library does not just linearly scale ticks to real time, but instead can inter/extrapolate on a curve fitted between RAW timer values and certain calibration points in real time.

Ironically enough, less hardware is required for this RAW timer than for a canonical bit-per-ns timer - and what hardware there is should all be devoted towards making the timer as accurate and regular as possible, rather than scaling into some "portable" format.  Let software do the mapping.
--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]
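A conversion library of the kind described might look roughly like this in C - a piecewise-linear map from raw ticks to nanoseconds through measured calibration points rather than one fixed scale factor. All names and types here are a sketch of the idea, not a real interface:

    #include <stdint.h>

    /* One calibration point: a RAW counter value paired with the "real"
     * time (in ns) observed from a trusted reference at that moment.
     */
    struct calpoint {
        uint64_t raw_ticks;
        uint64_t real_ns;
    };

    /* Interpolate (or extrapolate at the ends) between the calibration
     * points bracketing `raw`.  `cal` is sorted by raw_ticks, ncal >= 2.
     */
    uint64_t raw_to_ns(uint64_t raw, const struct calpoint *cal, int ncal)
    {
        int i = 1;
        while (i < ncal - 1 && cal[i].raw_ticks < raw)
            i++;
        const struct calpoint *lo = &cal[i - 1], *hi = &cal[i];

        double ns_per_tick = (double)(hi->real_ns - lo->real_ns) /
                             (double)(hi->raw_ticks - lo->raw_ticks);
        double dt = (double)raw - (double)lo->raw_ticks;   /* may be negative */
        return (uint64_t)((double)lo->real_ns + dt * ns_per_tick);
    }

A real library would fit a better curve than straight segments and worry about the double-precision limits, but the shape is the same: calibration data in software, a plain counter in hardware.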
hascall@cs.iastate.edu (John Hascall) (08/22/90)
In article <11187@alice.UUCP> andrew@alice.UUCP (Andrew Hume) writes:
}In article <26012@bellcore.bellcore.com>, mo@messy.bellcore.com (Michael O'Dell) writes:
}> One of the nice things about using 64 bits for the time is that
}> you can then put it in nanoseconds - which you *almost* really
}> need on really fast machines. (100ns might be ok, but the
}> difference is still contained in 64 bits, so just do it!!)

}that VMS (i think) has some funny date like 1858 as its epoch, the so-called
}smithsonian time.

VMS keeps time in 64 bits (really 63, negative times are "delta times"), in 100 ns units, since 17 Nov 1858 (when the calendar jumped 11 days?).

} i also note that in the third edition of unix, when the time was measured
}in clock ticks (and thus wrapped around every 2.? yrs), ken proposed to deal
}with wraparound by changing the epoch in the manual and running a special
}fsck-like program that subtracted a year from every inode.

Boy, this really loses today now that clocks tick faster than 60 Hz (60 Hz = 828 days, 256 Hz = 194 days, 1000 Hz = 50 days).

John Hascall / Project Vincent / Iowa State University Comp Ctr
john@iastate.edu / hascall@atanasoff.cs.iastate.edu
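The arithmetic behind those figures is just the counter range divided by the tick rate; a throwaway C check, assuming (as the figures above do) an unsigned 32-bit tick counter:

    #include <stdio.h>

    /* days to wrap = 2^32 ticks / rate / 86400 seconds per day */
    int main(void)
    {
        const double hz[] = { 60.0, 256.0, 1000.0 };
        for (int i = 0; i < 3; i++)
            printf("%6.0f Hz -> %4.0f days to wrap\n",
                   hz[i], 4294967296.0 / hz[i] / 86400.0);
        return 0;
    }

This prints roughly 828, 194 and 50 days, matching the numbers quoted.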
rpw3@rigden.wpd.sgi.com (Rob Warnock) (08/23/90)
In article <2506@dino.cs.iastate.edu> hascall@cs.iastate.edu (John Hascall) writes:
+---------------
| }that VMS (i think) has some funny date like 1858 as its epoch, the so-called
| }smithsonian time.
| VMS keeps time in 64 bits (really 63, negative times are "delta times"),
| in 100 ns units, since 17 Nov 1858 (when the calendar jumped 11 days?).
+---------------

Nice try, but no go.  17 Nov 1858 was the date of the first (recorded) high-quality astronomical photograph.  It is used as Day Zero for quite a few systems.  The DEC PDP-10 also used that as Day Zero, b.t.w.  (Didn't CDC, too?)

-Rob

-----
Rob Warnock, MS-9U/510		rpw3@sgi.com		rpw3@pei.com
Silicon Graphics, Inc.		(415)335-1673		Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA  94039-7311
seanf@sco.COM (Sean Fagan) (08/23/90)
In article <26012@bellcore.bellcore.com>, mo@messy.bellcore.com (Michael O'Dell) writes:
> One of the nice things about using 64 bits for the time is that
> you can then put it in nanoseconds - which you *almost* really
> need on really fast machines. (100ns might be ok, but the
> difference is still contained in 64 bits, so just do it!!)

The Elxsi, a rather nice machine (it has no supervisor mode!), has a 50 ns clock, and a 64-bit clock register.  If you want to find out *exactly* how many clock-ticks an instruction takes, you do something like:

	ld.l	r1, CLOCK
	<instr>
	ld.l	r2, CLOCK
	sub.l	r3, r1, r2

I'm guessing at the syntax, and I'm *sure* it's wrong, but you get the general idea.  Rather useful, actually.

--
Sean Eric Fagan  | "let's face it, finding yourself dead is one
seanf@sco.COM    |  of life's more difficult moments."
uunet!sco!seanf  |          -- Mark Leeper, reviewing _Ghost_
(408) 458-1422   | Any opinions expressed are my own, not my employers'.
jkenton@pinocchio.encore.com (Jeff Kenton) (08/23/90)
From article <67535@sgi.sgi.com>, by rpw3@rigden.wpd.sgi.com (Rob Warnock):
> In article <2506@dino.cs.iastate.edu> hascall@cs.iastate.edu
> (John Hascall) writes:
> +---------------
> | }that VMS (i think) has some funny date like 1858 as its epoch, the so-called
> | }smithsonian time.
> | VMS keeps time in 64 bits (really 63, negative times are "delta times"),
> | in 100 ns units, since 17 Nov 1858 (when the calendar jumped 11 days?).
> +---------------
>
> Nice try, but no go.  17 Nov 1858 was the date of the first (recorded) high-
> quality astronomical photograph.  It is used as Day Zero for quite a few
> systems.  The DEC PDP-10 also used that as Day Zero, b.t.w.  (Didn't CDC, too?)
>

The magic date John is thinking of is September 1752 (try 'man cal'):

	cal 9 1752

	   September 1752
	 S  M Tu  W Th  F  S
	       1  2 14 15 16
	17 18 19 20 21 22 23
	24 25 26 27 28 29 30

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
jeff kenton	---	temporarily at jkenton@pinocchio.encore.com
		---	always at (617) 894-4508
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
cet1@cl.cam.ac.uk (C.E. Thompson) (08/23/90)
In article <2506@dino.cs.iastate.edu> hascall@cs.iastate.edu (John Hascall) writes:
>In article <11187@alice.UUCP> andrew@alice.UUCP (Andrew Hume) writes:
>}that VMS (i think) has some funny date like 1858 as its epoch, the so-called
>}smithsonian time.
>
> VMS keeps time in 64 bits (really 63, negative times are "delta times"),
> in 100 ns units, since 17 Nov 1858 (when the calendar jumped 11 days?).
>
The VMS time base is Julian day 2,400,000.5.  Julian day numbers have been (maybe they still are) popular with astronomers.

The change from the Julian (different Jules, of course) calendar to the Gregorian calendar happened in England and the American colonies from 2-14 September 1752.  I thought everyone knew that :-)

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk
rtrauben@cortex.Eng.Sun.COM (Richard Trauben) (08/24/90)
>seanf@sco.COM writes:
>If you want to find out *exactly* how
>many clock-ticks an instruction takes, you do something like:
>
>	ld.l	r1, CLOCK
>	<instr>
>	ld.l	r2, CLOCK
>	sub.l	r3, r1, r2
>

Nope.  (close but no cigar...)  You just measured the execution time sum of TWO instructions: <instr> PLUS <ld> execution time, where the <ld> includes bus arbitration and memory access time to the TOD clock resource.  In most systems the latter term dominates.  Unless <instr> is the kind of instruction you want to drop anyway. -:)

-Richard
karsh@trifolium.esd.sgi.com (Bruce Karsh) (08/24/90)
seanf@sco.COM writes:
>If you want to find out *exactly* how
>many clock-ticks an instruction takes, you do something like:
>
>	ld.l	r1, CLOCK
>	<instr>
>	ld.l	r2, CLOCK
>	sub.l	r3, r1, r2

In article <703@exodus.Eng.Sun.COM> rtrauben@cortex.Eng.Sun.COM (Richard Trauben) writes:
>Nope. (close but no cigar...)
>You just measured the execution time sum of TWO instructions:
><instr> PLUS <ld> execution time where the <ld> includes bus
>arbitration and memory access time to the TOD clock resource.

How about:

	ld.l	r1, CLOCK
	<instr>
	ld.l	r2, CLOCK
	sub.l	r3, r2, r1	; r3 = Tinstr + Tclockfetch

	ld.l	r1, CLOCK
	<instr>
	<instr>
	ld.l	r2, CLOCK
	sub.l	r4, r2, r1	; r4 = 2*Tinstr + Tclockfetch

	sub.l	r5, r4, r3	; r5 = r4 - r3 = Tinstr

Of course, executing <instr> more than once may have some timing side effects with respect to the cache.  Hence, you should probably ensure that this code and all the arguments to <instr> are already in the cache.

	Bruce Karsh
	karsh@sgi.com
rminnich@udel.edu (Ronald G Minnich) (08/24/90)
In article <703@exodus.Eng.Sun.COM>, rtrauben@cortex.Eng.Sun.COM (Richard Trauben) writes:
|> >	ld.l	r1, CLOCK
|> >	<instr>
|> >	ld.l	r2, CLOCK
|> You just measured the execution time sum of TWO instructions:
|> <instr> PLUS <ld> execution time where the <ld> includes bus
|> arbitration and memory access time to the TOD clock resource.

huh?  CLOCK is a fast register right there on the processor in most cases.  I can't imagine anyone in their right mind putting that high-res clock at the other end of a memory bus if it has any kind of resolution.  Say it ain't so, sean!

ron

1987: We set standards, not Them. Your standard windowing system is NeUWS.
1989: We set standards, not Them. You can have X, but the UI is OpenLock.
1990: Why are you buying all those workstations from Them running Motif?
przemek@liszt.helios.nd.edu (Przemek Klosowski) (08/24/90)
In article <67633@sgi.sgi.com> karsh@trifolium.sgi.com (Bruce Karsh) writes:
>seanf@sco.COM writes:
>>If you want to find out *exactly* how
>>many clock-ticks an instruction takes, you do something like:
>>
>>	ld.l	r1, CLOCK
>>	<instr>
>>	ld.l	r2, CLOCK
>>	sub.l	r3, r1, r2
>
>In article <703@exodus.Eng.Sun.COM> rtrauben@cortex.Eng.Sun.COM (Richard Trauben) writes:
>>Nope. (close but no cigar...)
< .. Bruce has the idea of executing instr twice ...>

How about:

	ld.l	r1, CLOCK
	<instr>
	ld.l	r2, CLOCK
	ld.l	r3, CLOCK
	sub.l	r4, r2, r1	; r4 = Tinstr + Tclockfetch
	sub.l	r5, r3, r2	; r5 = Tclockfetch
	sub.l	r6, r4, r5	; r6 = Tinstr

No side effects are involved here.  Of course CLOCK cannot be cached, or else :^)
--
			przemek klosowski (przemek@ndcva.cc.nd.edu)
			Physics Dept
			University of Notre Dame IN 46556
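The same back-to-back-read trick carries over directly to C against any uncached, low-latency counter. In the sketch below, read_counter() is a stand-in for whatever register or instruction the machine actually provides, not a real API; the cache and pipeline caveats raised later in the thread still apply, and the call overhead of fragment() is not subtracted:

    #include <stdint.h>

    extern uint64_t read_counter(void);      /* assumed free-running hardware counter */

    /* Time one run of a code fragment and subtract the cost of a counter
     * read, exactly as in the assembly above.
     */
    uint64_t time_fragment(void (*fragment)(void))
    {
        uint64_t t1 = read_counter();
        fragment();                          /* the code under test */
        uint64_t t2 = read_counter();
        uint64_t t3 = read_counter();

        return (t2 - t1) - (t3 - t2);        /* (Tfrag + Tread) - Tread */
    }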
danh@halley.UUCP (Dan Hendrickson) (08/24/90)
In article <67633@sgi.sgi.com> karsh@trifolium.sgi.com (Bruce Karsh) writes:
>seanf@sco.COM writes:
>>If you want to find out *exactly* how
>>many clock-ticks an instruction takes, you do something like:
>>
>>	ld.l	r1, CLOCK
>>	<instr>
>>	ld.l	r2, CLOCK
>>	sub.l	r3, r1, r2
>
>In article <703@exodus.Eng.Sun.COM> rtrauben@cortex.Eng.Sun.COM (Richard Trauben) writes:
>>Nope. (close but no cigar...)
>>You just measured the execution time sum of TWO instructions:
>><instr> PLUS <ld> execution time where the <ld> includes bus
>>arbitration and memory access time to the TOD clock resource.

[stuff deleted]

>	Bruce Karsh
>	karsh@sgi.com

I believe that the point of the discussion was that if you put a cycle timer "very close" to the CPU - that is, if reading it took a small number of cycles and always took the same number of cycles (i.e., the access did not go across some bus which various parts of the machine were trying to use at the same time) - then you have a method of accurately measuring the number of cycles to execute an instruction.  The only caveat on the approach is that all of the instructions must be in the instruction cache (inst. buffers in Cray terminology).  The key is to have the "ld.l r1,CLOCK" instruction be a register transfer, not a memory reference.

Dan Hendrickson
Tandem Computers
Austin, TX
hawkes@mips.COM (John Hawkes) (08/25/90)
In article <32015@super.ORG> rminnich@udel.edu (Ronald G Minnich) writes:
>In article <703@exodus.Eng.Sun.COM>, rtrauben@cortex.Eng.Sun.COM
>(Richard Trauben) writes:
>|> >	ld.l	r1, CLOCK
>|> >	<instr>
>|> >	ld.l	r2, CLOCK
>|> You just measured the execution time sum of TWO instructions:
>|> <instr> PLUS <ld> execution time where the <ld> includes bus
>|> arbitration and memory access time to the TOD clock resource.
>
>huh?
>CLOCK is a fast register right there on the processor in most cases.
>I can't imagine anyone in their right mind putting that high-res
>clock at the other
>end of a memory bus if it has any kind of resolution.

Actually, there is a specific Elxsi instruction to read the clock register, and the register lives on the ALU board (the principal board of the three boards comprising the CPU).  I don't recall the latency, but I doubt it requires more than one or two cycles.  The Elxsi CPI is something on the order of two or three if instructions and data are cached.

--
John Hawkes
{ames,decwrl}!mips!hawkes  OR  hawkes@mips.com
preston@gefion.rice.edu (Preston Briggs) (08/25/90)
In article <957@halley.UUCP> danh@halley.UUCP (Dan Hendrickson) writes:
>>seanf@sco.COM writes:
>>>If you want to find out *exactly* how
>>>many clock-ticks an instruction takes, you do something like:
>>>
>>>	ld.l	r1, CLOCK
>>>	<instr>
>>>	ld.l	r2, CLOCK
>>>	sub.l	r3, r1, r2

>I believe that the point of the discussion was that if you put a cycle timer "very
>close" to the CPU, that is if reading it took a small number of cycles and always
>took the same number of cycles (that is, the access did not go across some bus
>which various parts of the machine were trying to use at the same time), then you
>had a method of accurately measuring the number of cycles to execute an
>instruction.  The only caveat on the approach is that all of the instructions must
>be in the instruction cache (inst. buffers in Cray terminology).  The key

I'd guess that on modern machines, the Heisenberg uncertainty principle comes into play.  You can't measure the time of a single instruction usefully because the measurement code interferes with the cache and various pipelines.  We can certainly measure how long it takes to issue the instruction, but when does it complete?  In a particular context, it'll sometimes depend on the progress of earlier instructions, etc.  Inserting measurement code changes the context.

--
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu
karsh@trifolium.esd.sgi.com (Bruce Karsh) (08/25/90)
>How about:
>
>	ld.l	r1, CLOCK
>	<instr>
>	ld.l	r2, CLOCK
>	ld.l	r3, CLOCK
>	sub.l	r4, r2, r1	; r4 = Tinstr + Tclockfetch
>	sub.l	r5, r3, r2	; r5 = Tclockfetch
>	sub.l	r6, r4, r5	; r6 = Tinstr
>
>No side effects are involved here. Of course CLOCK cannot be cached or else :^)

Very nice!

	Bruce Karsh
	karsh@sgi.com
mash@mips.COM (John Mashey) (08/25/90)
In article <1990Aug24.181208.29581@rice.edu> preston@gefion.rice.edu (Preston Briggs) writes:
...
>I'd guess that on modern machines, the Heisenberg uncertainty
>principle comes into play.  You can't measure the time of a single
>instruction usefully because the measurement code interferes
>with the cache and various pipelines.  We can certainly measure how
>long it takes to issue the instruction, but when does it complete?
>In a particular context, it'll sometimes depend on the progress
>of earlier instructions, etc.  Inserting measurement code changes the
>context.

Preston is right on, and I'd say it even stronger: Not only does the measurement code change the context, but even if it didn't, it's ALMOST USELESS to be trying to measure the speed of individual instructions on current machines, and not just from cache and pipeline effects.  Let's add (at least):
	conflicts with functional units (such as write ports
		into a register file)
	memory stalls (either from back-to-back stores, or load/store,
		or store/load)
	memory stalls (from running into DRAM refresh)
	memory stalls (write-buffers, or write-back caches)
	cache misses
	and maybe location of the instruction within the cache line
Amongst the current machines, any one of these can have some effect, and there are plenty of others as pipelines get more complex.
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS:	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
aglew@dwarfs.crhc.uiuc.edu (Andy Glew) (08/26/90)
>[Preston Briggs]
>I'd guess that on modern machines, the Heisenberg uncertainty
>principle comes into play.  You can't measure the time of a single
>instruction usefully because the measurement code interferes
>with the cache and various pipelines.

>[John Mashey]
>Not only does the measurement code change the context, but even if it
>didn't, it's ALMOST USELESS to be trying to measure the speed of
>individual instructions on current machines.

Please note that the guys above said that it is useless (1) to try to measure the speed of an individual instruction, not (2) that it is useless to try to measure the speeds of instruction aggregates to reveal individual instruction effects.

Or do you want to extend your statements to cover (2), John and Preston?  If so, then I disagree.  Experimental physics hasn't stopped since Heisenberg.  We just know a bit more about what we can and cannot measure.

I would never suggest timing individual instructions.  You can, however, time sequences of instructions - basic blocks may be too small, but critical paths through a function may be large enough (functions like syscall() come to mind), and you can time precisely enough to show the effects of individual instructions on these aggregates.  Just make sure the ramp-up and ramp-down effects can be accounted for or averaged out.

Even then, measurement changes the context - but you account for that by minimizing the measurement distortion, and designing your measurement code so that the distortion is unidirectional - so that you get an upper or lower bound from your measurement.  If what you are trying to do is, e.g., set a strict upper bound on context switch time (hello hard RT - and, yes, I know about the probabilistic effects of caches), a bound is all you need.
--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]
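In practice that means timing a whole code path many times and reasoning about bounds. A rough C harness in that spirit, using the same assumed read_counter() stand-in as before; keeping the minimum over many runs is one way (not the only way) of getting a one-sided bound and discarding ramp-up noise:

    #include <stdint.h>

    extern uint64_t read_counter(void);      /* assumed free-running counter */

    /* Time `reps` runs of a whole code path (say, a critical path through
     * a function) and keep the smallest observation.
     */
    uint64_t time_path_min(void (*path)(void), int reps)
    {
        uint64_t best = UINT64_MAX;
        for (int i = 0; i < reps; i++) {
            uint64_t t0 = read_counter();
            path();
            uint64_t t1 = read_counter();
            if (t1 - t0 < best)
                best = t1 - t0;
        }
        return best;    /* still includes two counter reads and the call overhead */
    }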
aglew@dwarfs.crhc.uiuc.edu (Andy Glew) (08/26/90)
>>How about:
>>
>>	ld.l	r1, CLOCK
>>	<instr>
>>	ld.l	r2, CLOCK
>>	ld.l	r3, CLOCK
>>	sub.l	r4, r2, r1	; r4 = Tinstr + Tclockfetch
>>	sub.l	r5, r3, r2	; r5 = Tclockfetch
>>	sub.l	r6, r4, r5	; r6 = Tinstr
>>No side effects are involved here. Of course CLOCK cannot be cached or else :^)
>
>Very nice!

Ummmmmm.......

If CLOCK is accessed across a bus, then Tclockfetch may be affected by bus traffic.  At least you are reading the correction term right there, so you are likely to, but not guaranteed to, get the same bus traffic.

In general, of course, on modern machines you cannot measure individual instruction times.  But you might be able to measure the timing effects of individual instructions on larger code sequences, with care.
--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]
mash@mips.COM (John Mashey) (08/26/90)
In article <AGLEW.90Aug25130251@dwarfs.crhc.uiuc.edu> aglew@dwarfs.crhc.uiuc.edu (Andy Glew) writes:
>>[Preston Briggs]
>>I'd guess that on modern machines, the Heisenberg uncertainty
>>principle comes into play.  You can't measure the time of a single
>>instruction usefully because the measurement code interferes
>>with the cache and various pipelines.
>>[John Mashey]
>>Not only does the measurement code change the context, but even if it
>>didn't, it's ALMOST USELESS to be trying to measure the speed of
>>individual instructions on current machines.
>Please note that the guys above said that it is useless (1) to try to
>measure the speed of an individual instruction, not (2) that it is
>useless to try to measure the speeds of instruction aggregates to
>reveal individual instruction effects.
>	Or do you want to extend your statements to cover (2), John and
>Preston?  If so, then I disagree.

No, of course not.  It is perfectly reasonable to measure aggregates, subject to all of the caveats that have been mentioned in this discussion so far.  The bigger & more realistic the aggregates, the better.

In addition, it will get worse.  Hopefully, people now understand the uselessness of single-instruction measurements on current machines.  If you agree to this, for the kinds of pipelines that most current machines use, consider how much worse it gets with:
	vector units
	superscalar
	superpipelined
	superscalar-superpipelined
	out-of-order execution
	speculative execution
	multi-level cache hierarchies, with various inter-level buffering
since all of these are either here already, or possibly coming soon, in microprocessors.
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:	mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS:	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
seanf@sco.COM (Sean Fagan) (08/27/90)
In article <41090@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>it's ALMOST USELESS to be trying to measure the speed of
>individual instructions on current machines
>	conflicts with functional units (such as write ports
>		into a register file)

Well, the Elxsi doesn't have pipelines or functional units (serial execution only), so those don't come into play.  Doesn't mean John's points are invalid, I just wanted to point that out 8-).

--
Sean Eric Fagan  | "let's face it, finding yourself dead is one
seanf@sco.COM    |  of life's more difficult moments."
uunet!sco!seanf  |          -- Mark Leeper, reviewing _Ghost_
(408) 458-1422   | Any opinions expressed are my own, not my employers'.
gillies@m.cs.uiuc.edu (08/28/90)
I think it is rather ridiculous for the ISO to support timing accuracy in the nanoseconds for pre-history.  Until very recently, we couldn't measure time in hundredths of a second -- why would we want to measure time in nanoseconds back into prehistory?  What an idiotic idea.

Also, tell me when it will be possible to synchronize all the computer clocks with a nanosecond-accuracy atomic clock.  How will such a clock be reset later?

My conclusion: ISO should specify a nanosecond relative timer, and a much coarser absolute timer (like milliseconds).

Don W. Gillies, Dept. of Computer Science, University of Illinois
1304 W. Springfield, Urbana, Ill 61801
ARPA: gillies@cs.uiuc.edu   UUCP: {uunet,harvard}!uiucdcs!gillies
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (08/29/90)
In article <3300165@m.cs.uiuc.edu> gillies@m.cs.uiuc.edu writes:
  [ strong statement on stupidity of ns dating ]
| Also, tell me when it will be possible to synchronize all the computer
| clocks with a nano-second accuracy atomic clock.  How will such a
| clock be reset later?

I guess the first question is when will there be a benefit from doing so?  And how long will it stay in sync?

| My conclusion: ISO should specify a nanosecond relative timer, and a
| much coarser absolute timer (like milliseconds).

The timer should not be more accurate than the accuracy of the setting.  Unless there's a good way to set such a timer within a ms *repeatably*, why worry about how accurately you can measure it?  The relative timer is important; the absolute timer leads people to believe they have accuracy they don't.

Yes, I know about using phone lines and radio to distribute time, with and without hardware ping and delay compensation.  It is still hard to be sure you're within a ms.  Fortunately getting within 50 ms seems to be adequate for most things, which is easy to do.
--
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.
davecb@yunexus.YorkU.CA (David Collier-Brown) (08/29/90)
>In article <3300165@m.cs.uiuc.edu> gillies@m.cs.uiuc.edu writes:
>  [ strong statement on stupidity of ns dating ]
>| Also, tell me when it will be possible to synchronize all the computer
>| clocks with a nano-second accuracy atomic clock.  How will such a
>| clock be reset later?
> I guess the first question is when will there be a benefit from doing
>so?  And how long will it stay in sync?
>|
>| My conclusion: ISO should specify a nanosecond relative timer, and a
>| much coarser absolute timer (like milliseconds).

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
> The timer should not be more accurate than the accuracy of the
>setting.  Unless there's a good way to set such a timer within a ms
>*repeatably* then why worry about how accurately you can measure it?  The
>relative timer is important, the absolute timer leads people to believe
>they have accuracy they don't.

Er, this is a solved problem in software engineering...  You have an architecture-specific constant that tells you how many bits are significant to the ``right'' of the decimal point, and a function that returns only those bits non-zero.  The application can use a constant-size time variable, and discover how much of it is significant when necessary.

If I were writing this in an object-oriented language (:-)), I'd define it thusly:

	declare
	    clock_$absolute_machine_time entry() fixed decimal (72,36),
	    clock_$accuracy              entry() fixed binary (6);

--dave (pardon me if I got the PL/1 wrong, but I couldn't
	resist bringing up the 1970s ``state of practice'' solution) c-b
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |  thunder across the bay" --david kipling
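For what it's worth, the POSIX real-time interfaces that eventually grew out of the 1003.4 work (mentioned elsewhere in this thread) took roughly this shape: clock_getres() reports a clock's resolution separately from reading its value. A minimal modern-C analogue of the declarations above:

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec res, now;

        clock_getres(CLOCK_REALTIME, &res);    /* how fine-grained is the clock? */
        clock_gettime(CLOCK_REALTIME, &now);   /* read it, knowing how much to trust */

        printf("resolution: %ld s + %ld ns\n", (long)res.tv_sec, res.tv_nsec);
        printf("now:        %ld s + %ld ns\n", (long)now.tv_sec, now.tv_nsec);
        return 0;
    }

The application stores a fixed-size timespec either way; the resolution call tells it how many of those nanosecond digits actually mean anything.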
bdg@tetons.UUCP (Blaine Gaither) (08/29/90)
I must agree with aglew.  You need high frequency timers (= cpu clock).  Even if you are timing small routines, the extra precision is needed to help you determine whether or not what you are observing is indeed what you wish to observe.  I have seen countless situations where analysts have assumed some small anomaly was "handling timer interrupts, ..." only to find out it was an indication of a major source of error.

A final important reason for high precision timers is to help architects manage the software implementation.  The cavalier way in which OS types often treat timing facility implementation is only exacerbated by letting them hide behind a coarse grain clock.
aglew@dwarfs.crhc.uiuc.edu (Andy Glew) (08/30/90)
>A final important reason for high precision timers is to help
>architects manage the software implementation.  The cavalier way in
>which OS types often treat timing facility implementation is only
>exacerbated by letting them hide behind a coarse grain clock.

If only the architects would give us a high precision timer, OSers would not treat timing (and accounting) cavalierly.  But, when you create the high precision timer, you also have to budget for the OS development time necessary to undo years of reliance on low-precision timers, because they were the only thing going.

(Hi, Blaine!  Just ribbing you...)
--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]
alex@vmars.tuwien.ac.at (Alexander Vrchoticky) (08/30/90)
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:

[about synchronizing clocks to nanosecond accuracy]

> I guess the first question is when will there be a benefit from doing
>so?  And how long will it stay in sync?

A global sense of time is a powerful concept in distributed real-time systems.  The synchronization accuracy achievable depends on a lot of factors, most notably the variability of the communication delay and the drift rates of the local clocks.  On local area networks a synchronization accuracy on the order of a few microseconds can be achieved with just a little hardware support and with very reasonable overhead.  Given the advances of computer architecture in the past, I don't dare say that synchronization accuracy on the order of nanoseconds will not be achieved.

> The timer should not be more accurate than the accuracy of the
>setting.  Unless there's a good way to set such a timer within a ms
>*repeatably* then why worry about how accurately you can measure it?  The
>relative timer is important, the absolute timer leads people to believe
>they have accuracy they don't.

System calls to set timers and clocks to absolute values are of course nonsensical when the variability of the execution time of the system call itself is on the order of the granularity of the clock or timer, or greater.  For clocks there is a solution to the problem: adjust the *rate* of the clock until the correct value is reached, and maintain the correct value by corrections of the rate.  Unfortunately 1003.4 does not specify an interface for this (ok, this does not really belong in comp.arch ...).  Given such a clock, the variability of the system call does *not* matter for absolute timers (putting aside pathological cases).  I don't see a satisfactory solution for relative timers.  The variability of the notification is of course a problem for both types of timers.

Note that the accuracy of *any* timer used for interval measurements also depends on the fact that all corrections of the clock setting are gradual: otherwise short durations might be measured as if they had taken a negative amount of time.

I agree that for networks of workstations and other non-real-time applications a few milliseconds of accuracy are probably plenty.  But other systems are more demanding: Real real-time systems need Real Time :-)  And they need software mechanisms to access it.
--
Alexander Vrchoticky
Technical University Vienna, Dept. for Real-Time Systems
Voice:  +43/222/58801-8168    Fax: +43/222/569149
e-mail: alex@vmars.tuwien.ac.at or vmars!alex@relay.eu.net (Don't use 'r'!)
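BSD's adjtime(), mentioned earlier in the thread, is one existing interface in this spirit: you hand the kernel an offset and it slews the clock toward it gradually instead of stepping it, so intervals never appear to run backwards. A minimal sketch (needs appropriate privilege to succeed):

    #include <stdio.h>
    #include <sys/time.h>

    int main(void)
    {
        struct timeval delta = { 0, 2500 };    /* absorb a +2.5 ms correction gradually */
        struct timeval pending;

        if (adjtime(&delta, &pending) < 0) {
            perror("adjtime");
            return 1;
        }
        printf("correction still pending from a previous call: %ld us\n",
               (long)(pending.tv_sec * 1000000L + pending.tv_usec));
        return 0;
    }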
aglew@dwarfs.crhc.uiuc.edu (Andy Glew) (08/31/90)
>For clocks there is a solution to the problem:
>Adjust the *rate* of the clock until the correct value is reached
>and maintain the correct value by corrections of the rate.
>Unfortunately 1003.4 does not specify an interface for this
>(ok, this does not really belong in comp.arch ...).

Which is exactly the sort of thing I was complaining about.  Adjust the rate for your absolute time clock, the one that you never measure intervals on - the clock that you basically only use for timestamping files.  But leave your hands off the clock that I'm using to time loops and program execution, i.e. where I am differencing timer values to get an interval time.

Or at least log all of your rate adjustment times, so that I can know that a difference of 10 ticks NOW is 10.2 us, while a difference of 10 ticks yesterday at 9pm was 9.8 us.  Doing those adjustments is a pain, but it's doable.
--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]