djbailey@skyler.mavd.honeywell.com (04/26/91)
Someone must have thought of this before. What are the references for, problems with, and current state of the art for the following concept?

Suppose we describe a computer's performance as an n-dimensional vector. What would the dimensions be? Some possibilities are simple computation speed, complex computation speed (floating point and special functions), internal information transfer speed, in/out information transfer speed, and degree of parallelism in the architecture. Each dimension would be measured at a low level and scaled to create the vector.

We also want to map applications into this same n-dimensional space so that appropriate computer systems can be identified. Perhaps the main problem with a performance vector is in figuring out how to map recognizable problem attributes into the vector space. Comments?

-- Don J. Bailey
karplus@ararat.ucsc.edu (Kevin Karplus) (04/27/91)
The idea of trying to reduce a huge set of numbers to a smaller set that characterizes them fairly accurately is not a new one. The technique is called factor analysis, and is widely used in the social sciences to try to make sense out of large, noisy data sets.

To apply factor analysis to computer benchmarks you would need:
1) many different benchmarks, measuring the same or different aspects of performance
2) many different machines on which the benchmarks have been run.

A factor analysis would try to reduce the number of dimensions from the number of benchmarks down to a smaller number (probably two or three dimensions, but this will depend on how many are needed to adequately explain the results).

The hard part is to come up with an explanation of what the "factors" mean. Sometimes they correlate particularly well with one of the original dimensions, and so can be roughly equated with whatever that dimension measured. Having a large number of "pure" tests in the benchmark set will make explaining factors somewhat easier.

Kevin Karplus
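As a rough illustration of the dimension reduction described above, here is a minimal sketch using principal component analysis, a simpler relative of factor analysis and not the same technique. The benchmark names and scores are invented for the example, and the function names are hypothetical. It standardizes each benchmark, builds the correlation matrix, and extracts the dominant component by power iteration.

```python
import math

def standardize(rows):
    """Scale each benchmark column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n)
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def correlation_matrix(z):
    """Correlation between every pair of standardized benchmark columns."""
    n, k = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(k)]
            for i in range(k)]

def first_component(matrix, iterations=200):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return v, eigenvalue

# Invented scores: one row per machine, one column per benchmark.
benchmarks = ["int1", "int2", "fp1", "fp2"]
scores = [
    [10, 12, 5, 6],
    [20, 22, 8, 9],
    [15, 14, 30, 28],
    [40, 38, 35, 40],
    [25, 27, 12, 10],
]

loadings, eigenvalue = first_component(correlation_matrix(standardize(scores)))
# The eigenvalues of a correlation matrix sum to the number of columns.
explained = eigenvalue / len(benchmarks)

for name, weight in zip(benchmarks, loadings):
    print(f"{name}: {weight:+.2f}")
print(f"share of variance explained by first component: {explained:.0%}")
```

The loadings show how strongly each benchmark contributes to the component. Interpreting what that component "means" is still the hard part noted in the post.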
mark@hubcap.clemson.edu (Mark Smotherman) (04/27/91)
djbailey@skyler.mavd.honeywell.com writes:
> Suppose we describe a computer's performance as an n-dimensional
> vector. What would the dimensions be?

You might want to look at how Dan Siewiorek used Kiviat graphs rather than vectors; see pp. 46-48 in D.P. Siewiorek, C.G. Bell, and A. Newell, Computer Structures: Principles and Examples, McGraw-Hill, 1982. These graphs are visual representations of performance measures for, in Siewiorek's case, CPU processing (memory accesses/sec), main and secondary memory capacities and speeds, and three communication speeds (communication, external, human). The advantages that he sees for Kiviat graphs include a summary of major performance parameters and a graphical representation of system balance.

Certainly the CPU performance metric is up for grabs. You can reject the usual suspects (e.g., MIPS, MOPS, MFLOPS). The reciprocal of CPI might be somewhat better but has obvious disadvantages in dealing with instructions like fused-multiply-add and HP's branch-and-add. Something like SPECmarks looks the best to me.

Siewiorek also recently stated that for balanced contemporary systems Case's ratio (1 Mbyte of memory per CPU MIPS) and Amdahl's ratio (1 Mbit of I/O bandwidth per CPU MIPS) should both be adjusted upward by a factor of 8. These ratios probably shouldn't be expressed in terms of SPECmarks since SPEC, as of yet, only stresses the CPU and cache.

However, looking at a scatter(!!) plot of SPECmarks vs. (1st+2nd)-level cache sizes, how about this ratio for a balanced machine: 6.4 Kbytes of cache per SPECmark:

     5 SPECmarks ~  32 Kbytes
    10 SPECmarks ~  64 Kbytes
    20 SPECmarks ~ 128 Kbytes
    40 SPECmarks ~ 256 Kbytes

--
Mark Smotherman, CS Dept., Clemson University, Clemson, SC 29634-1906
mark@cs.clemson.edu or mark@hubcap.clemson.edu
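The balance ratios quoted above reduce to simple arithmetic. The sketch below just encodes them. The function names are hypothetical, and treating the I/O figure as Mbit per second is an assumption, since the post gives no time unit.

```python
def cache_kbytes(specmarks, kbytes_per_specmark=6.4):
    """Suggested (1st+2nd)-level cache size for a balanced machine."""
    return kbytes_per_specmark * specmarks

def memory_mbytes(mips, adjustment=8):
    """Case's ratio (1 Mbyte per MIPS), scaled by the suggested factor of 8."""
    return 1 * mips * adjustment

def io_mbits(mips, adjustment=8):
    """Amdahl's ratio (1 Mbit of I/O bandwidth per MIPS), scaled by 8.

    Assumption: the bandwidth is per second; the post does not say.
    """
    return 1 * mips * adjustment

for s in (5, 10, 20, 40):
    print(f"{s:3d} SPECmarks ~ {cache_kbytes(s):5.0f} Kbytes of cache")

print(f"10 MIPS ~ {memory_mbytes(10)} Mbytes of memory, "
      f"{io_mbits(10)} Mbit of I/O bandwidth")
```

Setting `adjustment=1` gives the original, unadjusted ratios.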
apm@vipunen.hut.fi (Antti Miettinen) (04/28/91)
In article <1991Apr25.174542.100@skyler.mavd.honeywell.com> djbailey@skyler.mavd.honeywell.com writes:
>Someone must have thought of this before.

I haven't... so if you have, skip this now and never flame me ;) Really, I'm inserting a page character here which should stop every pager, so skip now...

>Suppose we describe a computer's performance as an n-dimensional
>vector.

I think this is a great idea. The problem is that everybody would like to get only one number to characterize the performance of a computer. In my opinion the concept of a vector representation is ideal for computer performance. Computers are different and perform well in different areas.

>What would the dimensions be?

Good question ;)

>Some possibilities are simple
>computation speed, complex computation speed (floating point and
>special functions), internal information transfer speed, in/out
>information transfer speed, and degree of parallelism in the
>architecture. Each dimension would be measured at a low level and
>scaled to create the vector.

And then there is the problem that everybody would only look at the absolute value (the magnitude) of the performance vector. But a vector as a performance figure would be quite useful. Consider this: computer X has a performance vector whose magnitude is HUGE, but it is severely bent in the direction of floating point performance. It would be very useful if we had a well-established set of basis axes, I think.

>We also want to map applications into this same n-dimensional space so
>that appropriate computer systems can be identified. Perhaps the main
>problem with a performance vector is in figuring out how to map
>recognizable problem attributes into the vector space. Comments?

Think of the Real World (what's that?). What do we want the computers to do? My suggestions for the dimensions:

- floating point performance (whetstones?)
- integer performance (2.x dhrystones?)
- graphics performance (something I don't know too much about)
- I/O performance (average seek, kB/sec or something..?)
-- Corrections on spelling etc. directly to: apm@kata.hut.fi (I would appreciate any, really)
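The distinction drawn in this post, between the magnitude of a performance vector and the direction it points in, can be made concrete with a small sketch. The two machines and their scores are invented, the axes follow the four suggested dimensions, and the scores are assumed to be already scaled to comparable units.

```python
import math

AXES = ["floating point", "integer", "graphics", "I/O"]

def magnitude(v):
    """Euclidean length: the single number everybody would quote."""
    return math.sqrt(sum(x * x for x in v))

def direction(v):
    """Unit vector: shows where the performance is concentrated."""
    m = magnitude(v)
    return [x / m for x in v]

# Hypothetical, already-scaled scores on the four axes above.
machine_x = [900.0, 50.0, 40.0, 30.0]     # huge, but bent toward floating point
machine_y = [300.0, 300.0, 300.0, 300.0]  # smaller, but balanced

for name, v in (("X", machine_x), ("Y", machine_y)):
    shape = ", ".join(f"{axis} {d:.2f}" for axis, d in zip(AXES, direction(v)))
    print(f"machine {name}: magnitude {magnitude(v):.0f}; direction: {shape}")
```

Machine X wins on magnitude alone, yet nearly all of its length lies along the floating point axis, which is the information a single number hides.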
djbailey@skyler.mavd.honeywell.com (04/29/91)
In article <APM.91Apr28021454@vipunen.hut.fi>, apm@vipunen.hut.fi (Antti Miettinen) writes:
>>Suppose we describe a computer's performance as an n-dimensional
>>vector.
>
> I think this is a great idea. The problem is that everybody would like
> to get only one number to characterize the performance of a computer.

Ideally, computer performance should be measured relative to the intended use. If we could really map computer performance and application needs into the same vector space, then we could measure the distance between the application vector and various computer vectors.

I'm certainly not an expert at computer performance evaluation. I'm looking forward to the comments on this chain.

--
Donald J. Bailey (djbailey@skyler.mavd.honeywell.com)
2600 Ridgway Parkway
Minneapolis, MN 55413
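The distance idea in this post can be sketched in a few lines. Everything here is illustrative: the machine names, the scores, and the choice of plain Euclidean distance are assumptions, and the vectors are taken to be already scaled to a common 0-to-1 range on the axes (floating point, integer, graphics, I/O).

```python
import math

def distance(a, b):
    """Euclidean distance between two performance vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_machine(application, machines):
    """Name of the machine whose vector lies nearest the application's needs."""
    return min(machines, key=lambda name: distance(application, machines[name]))

# Hypothetical machines, scored on (floating point, integer, graphics, I/O).
machines = {
    "fp_box":    [0.9, 0.3, 0.2, 0.2],
    "balanced":  [0.5, 0.5, 0.5, 0.5],
    "io_server": [0.2, 0.4, 0.1, 0.9],
}

# Hypothetical application profile: modest compute, heavy I/O.
application = [0.3, 0.5, 0.1, 0.8]

for name, vector in machines.items():
    print(f"{name}: distance {distance(application, vector):.3f}")
print("closest:", closest_machine(application, machines))
```

One caveat with a symmetric distance is that it penalizes a machine for exceeding the application's needs as much as for falling short, so a one-sided shortfall measure may be a better fit in practice.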
meissner@osf.org (Michael Meissner) (04/29/91)
In article <APM.91Apr28021454@vipunen.hut.fi> apm@vipunen.hut.fi (Antti Miettinen) writes:
| Think of the Real World (what's that?). What do we want the computers
| to do? My suggestions for the dimensions:
|
| - floating point performance (whetstones?)
| - integer performance (2.x dhrystones?)
| - graphics performance (something I don't know too much about)
| - I/O performance (average seek, kB/sec or something..?)

I've thought for some time that these are too low level. I think it needs to be raised to a higher level, for example:

    compiles/links/executes per hour
    vector codes per day
    etc.

--
Michael Meissner email: meissner@osf.org phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142
Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?
apm@vipunen.hut.fi (Antti Miettinen) (04/30/91)
In article <1991Apr29.125222.103@skyler.mavd.honeywell.com> djbailey@skyler.mavd.honeywell.com writes:
>Ideally, computer performance should be measured relative to the intended
>use.

Yes. As a matter of fact I now remember that a laboratory here at Helsinki University of Technology at least once used some funny method in choosing workstations. If I remember correctly they had a lot of numbers describing some characteristics of the machines. They then mapped these into a two-dimensional space. They also mapped an imaginary ideal machine to this representation and chose the machine which came nearest to the ideal machine. The method they used in reducing the dimensions is used in shape/pattern/whatsthecorrectword? recognition. I know I should know what the method was.

>I'm certainly not an expert at computer performance evaluation.

Neither am I. Just saw this lonely post with an IDEA and felt it was nice ;)
conte@crest.crhc.uiuc.edu (Tom Conte) (05/07/91)
In article <1991Apr29.125222.103@skyler.mavd.honeywell.com>, djbailey@skyler.mavd.honeywell.com writes:
> In article <APM.91Apr28021454@vipunen.hut.fi>, apm@vipunen.hut.fi (Antti Miettinen) writes:
> >>Suppose we describe a computer's performance as an n-dimensional
> >>vector.
> >
> > I think this is a great idea. The problem is that everybody would like
> > to get only one number to characterize the performance of a computer.
>
> Ideally, computer performance should be measured relative to the intended
> use. If we could really map computer performance and application
> needs into the same vector space, then we could measure the distance
> between the application vector and various computer vectors.
>
> I'm certainly not an expert at computer performance evaluation. I'm
> looking forward to the comments on this chain.
>

Well, indeed this isn't a new idea:

@techreport{Saav88,
  author="R. H Saavedra-Barrera",
  title="Machine Characterization and Benchmark Performance Prediction",
  month=jun,
  year=1988,
  institution="Computer Science Division, University of California",
  address="Berkeley, CA",
  number="UCB/CSD 88/437"}

@techreport{Pond90,
  author="C. G. Ponder",
  title="An Analytical Look at Linear Performance Models",
  institution="Lawrence Livermore National Laboratory",
  address="Livermore, CA",
  month=sep,
  year=1990,
  number="UCRL-JC-106105"}

@article{SaSM89,
  author="R. H Saavedra-Barrera and A. J. Smith and E. Miya",
  title="Machine characterization based on an abstract high-level language machine",
  month=dec,
  year=1989,
  journal=ieeetc,
  number=12,
  volume=38,
  pages="1659-1679"}

This topic is related to my Ph.D. work. You might also like to read:

@article{CoHw91,
  author="T. M. Conte and W. W. Hwu",
  title="Benchmark characterization",
  journal="IEEE Computer",
  month=jan,
  pages="48-56",
  year=1991}

------
Tom Conte          Center for Reliable and High-Performance Computing
conte@uiuc.edu     University of Illinois, Urbana-Champaign, Illinois
Fast cars, fast women, fast computers
eugene@nas.nasa.gov (Eugene N. Miya) (05/08/91)
Performance vectors (done before) [do you need more references?]. This is a nice idea in theory, but very difficult to do in practice. It is subject to many non-linear, hard-to-control effects: operating system paging and swapping, compiler optimizations, cache effects, etc.

I think we will inevitably be forced to do this, but it would require some stabilization in the industry. One might be able to make allowances for these effects. But we are talking major work. It requires a "bottom-up" approach, and it won't be as portable as the "bc/useless-benchmark." So until companies are willing to make public their compiler technology, users take the time to analyze their codes, and hardware stabilizes, I doubt it will happen.

Oh, I will say that I think it will happen on supercomputers first. Some consensus on theory is required (we think far too linearly: we believe in the "mythical-MFLOPS/MIPS:" 1000 CPUs == a CPU which is 1000* as fast. Not so.). The theory will include "semi-groups" and vector-valued measures, and we need more metrics to characterize software (I've thought about Halstead measures and I know the people at the SRC are thinking about McCabe measures). We must fight useless measures.

I think it will happen, just not quickly because of market and political considerations. It will happen because some applications need the performance. I don't think workstations will drive this work.

I'm just trying to catch up on this news group.

--eugene miya, NASA Ames Research Center, eugene@orville.nas.nasa.gov
Resident Cynic, Rock of Ages Home for Retired Hackers
{uunet,mailrus,other gateways}!ames!eugene
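The point that 1000 CPUs are not one CPU running 1000 times as fast can be illustrated with Amdahl's law. The post does not name this model; it is used here as one standard textbook illustration, and it ignores the paging, compiler, and cache effects the post lists. The parallel fractions below are arbitrary example values.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Amdahl's law: speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)

# Example fractions only; real codes have to be measured.
for fraction in (1.0, 0.99, 0.95, 0.50):
    speedup = amdahl_speedup(fraction, 1000)
    print(f"{fraction:.0%} parallel on 1000 CPUs: {speedup:7.1f}x")
```

Only a perfectly parallel code reaches the linear figure. With 5% serial work, the speedup on 1000 CPUs stays below 20.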