aglew@ccvaxa.UUCP (02/13/88)
I've been trying to go beyond the obvious uses of profil(2), and have certain questions and wonderings:

(1) profil(buff,...) char *buff;

On the systems I've looked at, buff is treated as an array of shorts. Shouldn't UNIX be honest and say short *buff? Most systems I know have 16 bit shorts. Now, it occurs to me that I might like to profile some really long running programs - ones that run for more than, say, 3 days, easily long enough to overflow 16 bit profiling bins. As a start, shouldn't we say typedef short *profilbinT?

(2) Exactly what is the correspondence between profil bins, code locations, and actual instructions, particularly when scaling?

The man page says something about scale = 0x10000 implying a one-to-one correspondence between words of code and words (I assume counting bins) in the buffer. Now, on some systems an instruction can begin at an arbitrary byte location. Does this mean that I should use a scale of 0x20000 to make sure that I get a counting bin for the beginning of every possible instruction on such a machine (e.g. a VAX)?

The man page says that scale = 0x8000 maps each pair of words of code to a word in the buffer. Again, I assume that these are 16 bit words, and that "words in the buffer" means short counting bins. I have observed the following correspondence of byte offsets from the base code location to bin numbers, using scale 0x8000, "2 to 1":

	W 0 -	Byte 0 - Bin 0
		Byte 1 - Bin 0
	W 1 -	Byte 2 - Bin 1
		Byte 3 - Bin 1
	W 2 -	Byte 4 - Bin 1
		Byte 5 - Bin 1
	W 3 -	Byte 6 - Bin 2

What is the rationale here? It makes sense if instructions are 16 or 32 bits and the sampled PC points to the *next* instruction, on a machine that requires 32 bit instructions to begin on a 32 bit boundary: in a sequence I16.I16.I32, the two adjacent 16 bit instructions then get counted in the same bin. But it doesn't seem to make sense on a machine like the VAX. Now, the same code is present on the 3B2 - does it go all the way back to a PDP-11?

(3) How do the System V scale factors relate to the BSD scale factors? I.e. SV 0177777 <-> BSD 0x10000 - just add 1?

(4) What's all this scale garbage anyway?

I'm sure it was a lot cheaper on a small machine, but 16 bits just isn't capable of expressing some of the fractions that might be appropriate on a large machine. Say I have a 256M text program that I want to divide into 4 counting bins - can I do that with profil? Maybe I can't give it 64K of counters.

Maybe the scale argument should be made a floating point number. But single precision floating point may only give you 6-8 decimal digits of accuracy, not enough to scale properly on *really* large programs. Maybe the scale argument should be a shift factor, specifying the power of two to divide by? Or maybe there should be no scale argument at all, just a (CodeBottom, CodeTop) pair, and the system decides on an appropriate representation. After all, you are guaranteed that the addresses are representable, in as portable a form as a C pointer provides.

Andy "Krazy" Glew. Gould CSD-Urbana.    1101 E. University, Urbana, IL 61801
    aglew@gould.com              - preferred, if you have nameserver
    aglew@gswd-vms.gould.com     - if you don't
    aglew@gswd-vms.arpa          - if you use DoD hosttable
    aglew%mycroft@gswd-vms.arpa  - domains are supposed to make things easier?

My opinions are my own, and are not the opinions of my employer, or any
other organisation. I indicate my company only so that the reader may
account for any possible bias I may have towards our products.
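P.S. For concreteness, here is roughly the shape of the call I have been
experimenting with - a minimal sketch only, assuming 16 bit short bins, the
usual four-argument profil(buff, bufsiz, offset, scale), text loaded at
address 0, and the loader-defined etext symbol (see end(3)) marking the top
of the text segment:

	#include <stdio.h>

	extern char *calloc();
	extern char  etext;	/* loader-defined: first address above text */

	main()
	{
		unsigned textlen, nbins, i;
		short *bins;

		textlen = (unsigned) &etext;	/* assumes text starts at 0 */
		nbins   = textlen / 4;		/* scale 0x8000: one short bin
						 * per pair of 16 bit words */
		bins    = (short *) calloc(nbins, sizeof(short));
		if (bins == 0) {
			fprintf(stderr, "no room for profil buffer\n");
			exit(1);
		}

		profil((char *) bins, (int) (nbins * sizeof(short)), 0, 0x8000);

		/* ... the code being profiled runs here ... */

		profil((char *) 0, 0, 0, 0);	/* scale 0 (or 1) turns
						 * profiling off */

		for (i = 0; i < nbins; i++)
			if (bins[i])
				printf("bin %u (text 0x%x): %d ticks\n",
				    i, i * 4, bins[i]);
		exit(0);
	}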
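And while I'm at it, here is the arithmetic I *expected* the kernel to do,
expressed as code - scale treated as a binary fraction with the point 16 bits
in, the product selecting a byte in the buffer, low bit ignored because bins
are shorts. This is inferred from the man page wording, not read out of any
kernel source, and note that it does not reproduce my table above: it puts
bytes 0-3 in bin 0 at scale 0x8000, where I observed bytes 2-5 landing in bin
1, as if the actual code rounds to the nearest bin instead of truncating. The
second function is question (4) in numbers: the scale needed to spread a 256M
text over 4 bins comes out 0.

	/* Expected PC-to-bin mapping: a sketch, not gospel. */
	unsigned
	pc_to_bin(pc, offset, scale)
		unsigned pc, offset, scale;
	{
		unsigned long bufbyte;

		/* scale/0x10000 of the way into the text ->
		 * the same fraction of the way into the buffer */
		bufbyte = ((unsigned long) (pc - offset) * scale) >> 16;
		return ((unsigned) (bufbyte >> 1));	/* 2 bytes per bin */
	}

	/* The 16 bit scale needed to spread textlen bytes of text
	 * over nbins short bins; 0 means "not expressible". */
	unsigned
	scale_for(textlen, nbins)
		unsigned long textlen, nbins;
	{
		return ((unsigned) ((nbins * 2 * 65536L) / textlen));
	}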