rdh@sli.com (Robert D. Houk) (08/11/89)
In article <5818@pt.cs.cmu.edu> lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) writes:
>In article <559@halley.UUCP> tjd@foghorn.mpd.tandem.com (Tom Davidson) writes:
>>>Not that it makes much difference, but the ETA-10 has several extra registers
>>>to keep track of cycle counts for the vector and scalar units.
>>
>>As John mentions, some "registers" kept such goodies as a clock counter (in
>>whatever periods the particular cpu was running: 7, 10.5, 19ns etc), vector
>>unit busy. It also had 5 programmable counters which could be set to track
>>such things as
>>   . number of in stack branches
>>   . number of branches NOT taken
>>   . number of times opcode xx was executed
>>and a whole host of other neat things. All this could be accessed from a
>>fortran program.
>
>One thing that the ETA lacks is a count of the page table traffic
>generated by the memory management unit.
>
>When a programmer suspects thrashing, the average OS can help by
>reporting paging rates, task switch counts, interrupt load, ethernet
>packets, and so on. The OS typically is unable to report on cache
>traffic or on TLB traffic. To the serious performance tuner, this
>is a flaw. On rare occasions, it's even a serious flaw.
>--
>Don   D.C.Lindsay   Carnegie Mellon School of Computer Science

One of the things I really *HATE* about most "modern" computer systems is
their total black-box'edness. Some don't even have power lights.

I really miss "lights". They enable one to know "at a glance" amazingly
well what a machine is up to. They enable one to perform "miracles" of
online diagnosis of ailing systems. If the cpu doesn't work, lights won't
help all that much, but for misbehaving systems they were godsends.

Fer instance, I remember a time when the KI-10 (ye olde second-generation
PDP-10) I was on just hung. Stopped. Did not respond. A friend and I
simultaneously converged on the machine room to check it out. Not a light
flashing. Hmm, looks like it halted.
Glance at console terminal to see halt reason. Odd, no crash info. Oh, the
RUN light is on. Odd. No PI channel active, and they're all enabled. OH -
look at that, PI system is not on. This is wrong! Hey, how's about we
toggle in a CONO PI,PION instruction and XCT it from the switch register.
Poof! Timesharing resumes. Disks resume chattering. Terminals resume
terminaling. And all that. No user lost any data (except terminal typein,
of course - not a single typeout character was lost, no disk files were
lost or corrupted, and only a 3-minute pause in service was noticed).

What is the point, all you single-user workstation users ask? Just reboot
the machine if you have a problem? Well, the point is that in less time
than it takes my Sun3/Unix to reboot (and it only has one little ole WrenV
disk), we were able to walk down the hall, look at a hung system, diagnose
it, and cure it. NO DATA/FILES LOST, OR CORRUPTED. (Ha - just try that, you
UNIX workstation users, see how many files you lose when you forcibly
reboot!)

Now I'm told that the reason for the demise of the lights was that they
accounted for 10% of the cost of the system (not an insignificant amount,
that). Further, I guess that with a front-panel "footprint" of about 2
inches by 20 inches for modern workstations (e.g., my Sun3/60), there's not
a lot of room for lights. But boy do I get annoyed when I come in, the
screen is blanked out, and it won't unblank. How helpless! I can't even
tell if the damn power is on or not! (Well, OK, I can feel how hot it is
and surmise that it is on if it is hotter than ambient, or I can kill the
AC/Stereo/Office chatter and probably hear the little boxer fan whirring
away, and I even gotta admit that there is this little LED array in back
beside/underneath the Ethernet/keyboard connectors that, if flashing, would
tell me that the power is at least on - assuming I feel like clearing a
path to the backside of my machine.)

The point is that today's systems simply do not seem to pay much attention
to providing system diagnosis or tuning info. I consider this to be a
serious flaw in their architecture. (Don't get me wrong, I am not a total
curmudgeon - I am truly impressed with the advances H/W technology has
provided: a multiple-MIPS many-MB system in a box smaller than ONE power
supply (out of several) in ONE memory box/rack (out of many) for a
less-powerful (raw-MIPS-wise) 10-year-old cpu, all at a cost less (in
adjusted dollars) than a 2400 baud 24-line x 80-character Video Display
Terminal of 20 years ago... It's just that in the race for H/W miracles I
think one important aspect of total system design has been lost in the name
of saving a few up-front dollars - at an untold cost of lost and reduced
manhours at some later point in the product life.)

As to "counters": the KL-10 (another of ye olde PDP-10 processors, although
of a more recent vintage; it's only 10 years old) had no lights to speak
of. (Not strictly true: it had a power light and a fault light, and the
PDP-11 front-end had, let's see, 16 data lights, 18 address lights, and 4
(6?) cpu status lights, all of which were more-or-less useless, not even
being entertaining to watch.) It did, however, have a "PERF" board. This
board was a "peripheral" device that just sorta sat on the cpu's most
intimate inner busii and watched stuff. You could program it to count in
either event mode (number of times "X" happened) or in duration mode (count
of clock cycles "X" was happening), for "X"es of PI activity, I/O activity,
and half a dozen other useful tidbit-wise things. You could use the PERF
board to measure the cache hit ratio. Etc. And so on. A very useful tool
for measuring just what your system was doing. System performing poorly?
Just glance at the display, ask an obvious question like "Why is the system
spending 40% of its time at PI 4 (the network PI channel)?", and proceed
from there to investigate who is flooding the network with bogus packets.
Etc.

This sort of functionality is readily adaptable to today's silicon miracles
(and in fact pretty much has to be tightly integrated into the cpu chips to
even exist), but I don't see it happening. At least, not in the mainstream
cpus (Intel - anyone home? Motorola? National? Anyone? Anywhere? Sigh.)

-RDH

P.S. I will admit that the H/W people have a good gripe with the S/W
people about not using the nifty H/W provided. The only reason that TOPS-10
used the PERF board provided in every KL-10 system is that I "discovered"
it one day whilst idly perusing the print set (Hey, what's this thingie?
Looks neat, how do I use it? Wowie, neato stuff. You mean no one uses it?
Well, can't have this, it's too neat not to use!), thought it was neat, and
"slipped" it into the system one weekend when no one much was looking.
Shipped it as an unsupported tool. H/W people get annoyed when S/W people
ignore their nifty toys, so all you S/W types need to encourage your H/W
types...
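
For anyone who hasn't met counters like these, the difference between the
PERF board's event mode and duration mode can be shown with a small Python
sketch. This is only an illustration under stated assumptions: the names
PerfCounter, tick, and measure are hypothetical, and the per-cycle trace of
booleans is an assumed stand-in for whatever signal "X" the board was
watching. Nothing here reflects the real KL-10 programming interface.

```python
from dataclasses import dataclass


@dataclass
class PerfCounter:
    """Hypothetical software model of one programmable counter."""
    mode: str                 # "event" or "duration"
    count: int = 0
    was_active: bool = False  # state of "X" on the previous clock cycle

    def tick(self, active: bool) -> None:
        """Advance one clock cycle; `active` is True while "X" is happening."""
        if self.mode == "event":
            # Event mode: count each time "X" starts (inactive -> active).
            if active and not self.was_active:
                self.count += 1
        elif self.mode == "duration":
            # Duration mode: count every clock cycle during which "X" holds.
            if active:
                self.count += 1
        else:
            raise ValueError(f"unknown mode: {self.mode!r}")
        self.was_active = active


def measure(trace, mode):
    """Run a counter in the given mode over a per-cycle trace of booleans."""
    counter = PerfCounter(mode)
    for active in trace:
        counter.tick(active)
    return counter.count


# Assumed example trace: one entry per clock cycle, True while "X" is active.
trace = [False, True, True, True, False, True, False, False]
events = measure(trace, "event")
busy_cycles = measure(trace, "duration")
print(events, busy_cycles, busy_cycles / len(trace))
```

For this made-up trace, event mode counts 2 occurrences of "X" and duration
mode counts 4 busy cycles out of 8, i.e. "X" was happening 50% of the time.
A figure like "40% of its time at PI 4" is the duration-mode count divided
by the total cycles observed.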