roskos@csed-1.UUCP (Eric Roskos) (02/24/88)
In article <10422@ut-sally.UUCP>, nather@ut-sally.UUCP (Ed Nather) writes:
> After writing the "clear" routine in assembler, using the repeated store
> instruction available on the 80x8x chips, I found the execution time to
> be 13 times faster [for a clear-memory instruction], by direct measurement.

Notice, though, that again this is a case where the language doesn't really express what you are doing very well. It's true that a really "smart" optimizer that recognized a lot of special cases might recognize something like

    for (p = &mem[low]; p < &mem[high]; p++)
        *p = 0;

as what a human reader recognizes it as -- a shorthand for "clear mem array from offset `low' to one less than offset `high'" -- but it is not very reasonable to expect it to recognize it; after all, it is something of a dialect.

Actually to do the above using an 80x86 instruction the optimizer would have to undo part of the programmer's attempt to optimize, since the instruction needs a count of words or bytes, whereas the programmer (me) was thinking having a single counter (the address counter) would be more efficient than having two counters (the address and the count of words), whereas on the 80x86, at least, it isn't.

What you really need is something like

    memset(&mem[low], '\0', high-low);

Now, this notation is in the form of a procedure call, but, if you have a really clean model of language semantics, procedure calls are just a notation for an operator which may or may not be compiled as a subroutine, just like any other operator (floating point divide, for instance) -- so the above might generate a call to a routine that uses the 80x86 family instruction you mentioned; or, it might generate the code inline, as an inline procedure; as, in fact, some compilers can do.
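To make the comparison concrete, here is a minimal sketch of the two forms side by side. The array size, the function names, and the checking helper are all made up for illustration; none of them come from the original article. It assumes `mem' is an array of bytes; the size passed to memset is scaled by sizeof mem[0] so the call would stay correct if the element type were wider.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical array and size, chosen only for this illustration. */
#define MEM_SIZE 64
static unsigned char mem[MEM_SIZE];

/* The pointer-loop form: clears mem[low] through mem[high-1]. */
void clear_loop(size_t low, size_t high)
{
    unsigned char *p;

    assert(low <= high && high <= MEM_SIZE);
    for (p = &mem[low]; p < &mem[high]; p++)
        *p = 0;
}

/* The memset form: says directly "set this many bytes to zero". */
void clear_memset(size_t low, size_t high)
{
    assert(low <= high && high <= MEM_SIZE);
    memset(&mem[low], '\0', (high - low) * sizeof mem[0]);
}

/*
 * Hypothetical checking helper: assuming mem was filled with 0xFF
 * before the clear, returns 1 if exactly mem[low..high-1] is zero
 * and everything else is untouched, and 0 otherwise.
 */
int only_range_cleared(size_t low, size_t high)
{
    size_t i;

    for (i = 0; i < MEM_SIZE; i++) {
        unsigned char expected = (i >= low && i < high) ? 0 : 0xFF;
        if (mem[i] != expected)
            return 0;
    }
    return 1;
}
```

Both functions leave memory in the same state; the difference is only in how much the compiler has to guess. Whether the memset call ends up as a real subroutine call or as an inline repeated store is then up to the compiler.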
(I think the Microsoft C 5.0 Optimizing Compiler does this for some types of procedures, actually, though I can't remember if it does it for arbitrary ones or not (there's a "pragma" that lets you specify some of these sorts of things). It does do it for things like "intrinsic" functions if you use the -Oi switch.)

And, that is fine; it is a good way to solve the problem. "memset" is an operation that sets memory to a particular value; it tells what you really want to do; and it is in a form that even a simple compiler can recognize. Having this sort of thing is much cleaner architecturally than expecting an optimizer to figure out what you "really meant" in a piece of code, even though it may be possible for it to do so...

-- 
Eric Roskos, IDA (...dgis!csed-1!roskos or csed-1!roskos@HC.DSPO.GOV)