saghir@eecg.toronto.edu (Mazen Saghir) (01/16/91)
In a previous post I asked about information on the dixie simulator, and as
it turns out, I meant the pixie (with a "p") instruction simulator. Does
anyone know where I could get some information on this simulator? Thanks in
advance.

Mazen Saghir
--
Computer Group                        | e-mail: saghir@eecg.toronto.edu
Department of Electrical Engineering  |
University of Toronto                 | ** CCE '89 **
Toronto, Ontario M5S 1A4 CANADA       |
mash@mips.COM (John Mashey) (01/18/91)
In article <1991Jan15.162526.2139@jarvis.csri.toronto.edu> saghir@eecg.toronto.edu (Mazen Saghir) writes:
>
>In a previous post I asked about information on the dixie simulator, and as
>it turns out, I meant the pixie (with a "p") instruction simulator. Does
>anyone know where I could get some information on this simulator? Thanks in
>advance.

"pixie" is a program shipped as part of a standard binary distribution for
MIPS systems; I believe DEC ships it standard on DEC{server,station} as well.
It is not a simulator; here's what it does and how you use it:

1) Assume you have an executable X, and that running it with a given input
   takes Y seconds.

2) Type pixie X, which creates a new executable X.pixie that contains code
   to count the number of times each basic block is executed and, if you
   invoke the right options, also generates complete address traces. This
   takes a few seconds.

3) Run X.pixie, which creates a counts file that records the numbers of
   executions.

4) Say prof X, which gives you:
        number of instruction cycles spent by subroutine, sorted by frequency
        number of calls per subroutine
        number of cycles spent per line of source code, showing the top N
        lines of code
   These numbers do NOT give you complete times, in that they do not show
   cache & memory system overhead. However, by comparing the cycle count to
   Y*clock rate, you can get an idea of the percentage of time occupied by
   instruction cycles versus memory cycle lossage. Typical would be 60-80%
   going to instructions.

   This information gets used by programmers, as it gives you accurate
   profiling without profiling libraries, special compiler flags, etc.
   (Note that many vendors do not ship profiled versions of system
   libraries, making it difficult to see the time going there.)
5) Say pixstats X, which gives you numerous gory details of instruction
   usage, number of registers saved per subroutine call, # loads,
   instruction concentration (i.e., what percentage of instruction cycles
   would fit in a perfect associative cache of 1 word, 2 words, 4 words,
   etc.). Most of this is of interest only to computer architects.

6) There is an additional cache simulator that comes as part of a separate
   software package (System Programmer's Package), which takes the address
   traces above, plus parameters to specify refill size, write-buffering,
   memory latency, etc., and computes the number of cycles lost to memory
   latency, write-buffer stalls, etc.; adding those cycles to the
   instruction cycles should give a pretty good approximation of the actual
   time. This tends to be of use to people designing hardware systems, who
   want to see the tradeoffs between memory speed, cache size, refill size,
   performance, and cost.

So, if you have a MIPS system, or (usually) a MIPS-based system, you should
be able to use pixie, pixstats, and prof. There are manual pages on these
things on the systems.
--
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:   mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:    408-524-7015, 524-8253 or (main number) 408-720-1700
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
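
The arithmetic behind steps 4 and 6 of the post above can be sketched in a few lines of Python. This is only an illustration, not part of pixie, prof, or the cache simulator: the function names and every number below (run time, clock rate, cycle counts) are hypothetical, chosen so that the result falls in the 60-80% range the post calls typical.

```python
def instruction_fraction(instr_cycles, wall_seconds, clock_hz):
    """Share of all elapsed machine cycles that were instruction cycles.

    instr_cycles : instruction cycle count as reported by the profile
    wall_seconds : measured run time Y of the original executable
    clock_hz     : processor clock rate in cycles per second
    """
    total_cycles = wall_seconds * clock_hz  # "Y * clock rate" from step 4
    return instr_cycles / total_cycles


def estimated_seconds(instr_cycles, memory_stall_cycles, clock_hz):
    """Step 6: instruction cycles plus simulated memory stalls, as time."""
    return (instr_cycles + memory_stall_cycles) / clock_hz


# Hypothetical measurements (assumptions, not data from the thread):
Y = 10.0                    # seconds for the original run
CLOCK_HZ = 25_000_000       # a 25 MHz clock
INSTR_CYCLES = 175_000_000  # instruction cycles from the profile
STALL_CYCLES = 75_000_000   # stall cycles from a cache simulation

frac = instruction_fraction(INSTR_CYCLES, Y, CLOCK_HZ)
print(f"instruction cycles: {frac:.0%} of run time")
print(f"estimated time: {estimated_seconds(INSTR_CYCLES, STALL_CYCLES, CLOCK_HZ)} s")
```

The remaining share (here 30%) is what the post calls memory cycle lossage. If the estimate from the second function comes out close to the measured Y, the cache model is probably accounting for most of it.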
tom@ssd.csd.harris.com (Tom Horsley) (01/18/91)
The 88k based Night Hawk 4000 series machines that Harris Computer Systems
make come with a tool similar in many ways to the PIXIE tool that comes with
MIPS systems. This is the analyze88 tool. It can read an existing executable
and generate a new executable patched to generate basic block counts and
dump the information to a file when the program exits. The report88 program
can then be used to generate various reports from the basic block
statistics.
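
To illustrate how basic block counts turn into a report, here is a minimal Python sketch. It does not reflect the actual file format or output of analyze88, report88, or pixie; the tuple layout (routine name, instructions in the block, execution count) and the sample data are assumptions made up for the example. It also counts instructions rather than cycles, which ignores per-instruction timing.

```python
from collections import defaultdict


def per_routine_instructions(blocks):
    """Sum executed instructions per routine from basic block counts.

    blocks: iterable of (routine, instructions_in_block, times_executed).
    Returns (routine, total) pairs, busiest routine first.
    """
    totals = defaultdict(int)
    for routine, block_len, count in blocks:
        # Every execution of a basic block runs all of its instructions.
        totals[routine] += block_len * count
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counts for a tiny program (not real tool output):
blocks = [
    ("main", 4, 1),
    ("main", 6, 100),
    ("copy", 5, 1000),
    ("copy", 3, 10),
]

report = per_routine_instructions(blocks)
for routine, total in report:
    print(f"{routine:10s} {total:8d}")
```

Because a basic block has a single entry and a single exit, one counter per block is enough to recover exact instruction counts, which is why this style of profiling needs no sampling.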
In addition to the profiling capability, analyze88 also serves as a
post-linker code optimizer (reducing many of the two-instruction sequences
compilers generate for memory references to one instruction, by putting the
most commonly referenced high 16-bit address values in reserved registers
that act as program-wide common sub-expressions).
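
The selection step in that optimization can be sketched as follows. This is a guess at the general idea, not a description of how analyze88 is implemented. It assumes that each reference to an absolute address normally costs two instructions (one to form the high 16 bits, one to access memory with the low 16 bits as an unsigned offset), that "most commonly referenced" means static reference counts, and that the reference list and the number of reserved registers are given. All names and addresses are hypothetical.

```python
from collections import Counter


def pick_reserved_highs(addresses, num_reserved):
    """Choose the high 16-bit halves that occur most often.

    Assumes the low half is an unsigned offset, so high = address >> 16.
    """
    counts = Counter(addr >> 16 for addr in addresses)
    return [high for high, _ in counts.most_common(num_reserved)]


def instructions_saved(addresses, reserved_highs):
    """Each reference whose high half sits in a reserved register
    shrinks from two instructions to one, saving one instruction."""
    reserved = set(reserved_highs)
    return sum(1 for addr in addresses if (addr >> 16) in reserved)


# Hypothetical static memory references found in an executable:
refs = [0x00401000, 0x00401FFC, 0x0040ABCD,
        0x00520010, 0x00520020, 0x7FFF0000]

highs = pick_reserved_highs(refs, num_reserved=2)
print([hex(h) for h in highs], instructions_saved(refs, highs))
```

Doing this after linking is what makes it possible: only then are the final addresses known, so the tool can see which high halves are shared across the whole program.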
It can also generate annotated disassembly listings showing the static
instruction timing within each basic block, where instructions are blocked
due to resources that are not available yet, etc.

Like PIXIE, it has the limitation that it cannot tell you anything about the
cache; the timings it generates assume the 88k will never have to wait on
memory.

P.S. I gratefully acknowledge that many of the ideas for analyze88 came from
things I heard about PIXIE. They are all great ideas, and the tool has been
enormously beneficial to us in-house at Harris in helping to make our 88k
compilers generate unsurpassed quality code for the 88k (shameless plug :-).
--
======================================================================
domain: tahorsley@csd.harris.com USMail: Tom Horsley
uucp: ...!uunet!hcx1!tahorsley 511 Kingbird Circle
Delray Beach, FL 33444
+==== Censorship is the only form of Obscenity ======================+
| (Wait, I forgot government tobacco subsidies...) |
+====================================================================+