urip@orcisi.UUCP (02/16/87)
As one of the designers of the new optimizing compiler for the NS32000, I enjoyed reading the articles posted on the above subject. I am going to describe the way that our implementation deals with some of the issues that were mentioned (see title). It is needless to say (but I'll say it anyway...) that I completely agree with all those who condemn using pointers to local variables, and assumming that incrementing the pointer will point to the next declared variable. Not only that any of these variables may end up in a register, but even those that stayed in memory (because there was no free register for them) are reallocated to fill the "holes" created by the lucky ones (i.e. variables that won a register). This reduces the size of frames (sometimes down to zero). Someone from Sun (don't remember the name) has suggested to declare all variables that are not affected by pointers or by other functions as 'register' in order to help the optimizer. As far as I understand this is completely unnecessary. According to the "white book" 'register' storage class is applicable only for automatic variables or function parameters, and it is not possible to take the address of a register variable. This leaves us with local variables whose address is never taken, which are considered safe by the optimizer since they cannot be affected by pointers or by other functions (unless we play dangerous games with pointers, which we already promised not to!!). The conventions for saving registers by the caller vs. the callee were mentioned too, but no one mentioned the posibility of a compromise between the two - combined caller and callee save. This convention is used by good old Portable C Compiler (pcc). By this convention, the registers are divided into two groups: volatile and non-volatile registers (on the VAX and NS32000 the volatile registers are R0..R2 and the rest are non-volatile). Volatile register means that a procedure call may change its contents, so it has to be saved by the caller (or not used across procedure calls). Non-volatile register means that a procedure call is guaranteed to retain its old value, so it has to be saved by the callee (if used by the callee). The pcc uses non-volatile registers for 'register' declarations and volatile registers for expression evaluation. This means that every 'register' declaration causes save and restore of that register in the beginning and end of the procedure that contains the declarations. Declaring a variable that is used only once or twice as 'register' may therefore cause perfomance degradation (depending on the exact instruction timings). Since most expressions do not contain function calls, using non-volatile registers for expression evaluation is very efficient. Our optimizer uses the same convention (so modules compiled by either compiler are compatible) but squeezes a little more performance out of it by using the following strategy: a variable that spans across procedure calls (in optimizers' jargon: a variable whose live range contains procedure calls) is allocated a non-volatile register (if still profitable after taking into account the cost of save/restore). A variable that doesn't live across procedure calls is allocated a volatile register (if available), which is cheaper to use because it does not require save/restore (because, by convention, the caller does not expect it to retain its value...). I hope this answers the original question: what is the available number of registers for 'register' declaration. Uri Postavsky (...utgpu!syntron!orcisi!urip) Currently with O.R.C. Toronto, formerly with National Semiconductor Tel Aviv.
faustus@ucbcad.UUCP (02/22/87)
WRT the discussion about whether the caller or the callee should save registers -- wouldn't the most efficient strategy be for the caller to pass the callee a mask of which registers are currently in use (by itself and previous functions)? That way, the callee could AND its mask with the caller's mask and save those registers, and OR its mask and the caller's and pass that to any functions that it calls. I think this should prevent any unnecessary saving of registers. Wayne
lodman@ncr-sd.UUCP (02/23/87)
In article <1283@ucbcad.berkeley.edu> faustus@ucbcad.berkeley.edu (Wayne A. Christopher) writes: >WRT the discussion about whether the caller or the callee should save >registers -- wouldn't the most efficient strategy be for the caller to >pass the callee a mask of which registers are currently in use (by >itself and previous functions)? That way, the callee could AND its >mask with the caller's mask and save those registers, and OR its mask >and the caller's and pass that to any functions that it calls. I think >this should prevent any unnecessary saving of registers. Wouldn't this force the linker to insert code to save and restore registers? It doesn't sound like a workable solution. -- Michael Lodman mike.lodman@SanDiego.NCR.COM {sdcsvax,cbatt,dcdwest,nosc.ARPA,ihnp4}!ncr-sd!lodman
greg@utcsri.UUCP (Gregory Smith) (03/03/87)
In article <1283@ucbcad.berkeley.edu> faustus@ucbcad.berkeley.edu (Wayne A. Christopher) writes: >WRT the discussion about whether the caller or the callee should save >registers -- wouldn't the most efficient strategy be for the caller to >pass the callee a mask of which registers are currently in use (by >itself and previous functions)? That way, the callee could AND its >mask with the caller's mask and save those registers, and OR its mask >and the caller's and pass that to any functions that it calls. I think >this should prevent any unnecessary saving of registers. In article <1379@ncr-sd.SanDiego.NCR.COM> lodman@ncr-sd.UUCP (0000-Mike Lodman) writes: >Wouldn't this force the linker to insert code to save and restore >registers? No. It would require support in the architecture itself - a 'register-in-use' word, with a bit for each register. * At the top of each function would be an instruction with a register mask field indicating which registers would be needed within that function. This would be anded with the current mask to produce the set of registers which must be pushed. The 'register-in-use' mask would be then be 'or-ed' with this value to produce the new mask. The old mask would also have to be pushed. ( can you say 'CISC'? ) The same instruction would of course perform the usual stacrobatics (and make coffee if anybody wanted it). On function exit there would be an instruction which would undo this. This instruction would also contain the same mask that appeared in the entry instruction. The problem is, you wouldn't gain much. As you go deeper into the stack, bits in the 'reg-in-use' word would become set, but would never be cleared by anything but a function return. Once all of them become set, you have gained nothing over a conventional (vax/68k/NS32K) 'push-specified-set' instruction, and you are pushing and popping an extra word to boot. Presumably, you could have a 'release-all-registers' instruction which would push all the regs-in-use, and the regs-in-use word, and clear the regs-in-use word. A complementary instruction would pop them. If a function A were to make many calls to a function B, and B needed a lot of regs, it would then be advantageous for A to 'release-all-regs' before the loop, so that the save/restore need only be done once. But how to determine this? * you could do this on an arbitrary machine with run-time code, but I don't know of a machine with a push-specified-set op that allows a non-constant set. -- ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg Have vAX, will hack...
kary@hplsdla.UUCP (03/09/87)
> The problem is, you wouldn't gain much. As you go deeper into the stack, > bits in the 'reg-in-use' word would become set, but would never be > cleared by anything but a function return. The 'reg-in-use' word could be cleared at function entry. Since the contents of the registers are on the stack, they can be considered no longer in use. Another thing that could be done is the 'reg-in-use' word could set the appropriate bit whenever a register is loaded, (the 'reg-in-use' word should be a processor register, though not necessarily addressable) then there would be no compiler support required for the register mask feature. The register mask would then reflect only the registers actually in use at the point of the function call, registers that are first used *after* the call wouldn't have to be saved. One thing missing from this scheme is a way to clear a bit when a register is no longer in use. Perhaps clearing that bit could be tacked onto every register instruction. Every register instruction now has two modes, one standard mode and one to clear the 'reg-in-use' bit. The compiler could then generate the second version the last time a register is used in a function. This machine is now a MISC (Machiavellian Instruction Set Computer :-) Dan Kary hp-lsd!kary
hrubin@osupyr.UUCP (Herman Rubin) (03/09/87)
It is not that difficult for hardware to avoid register loss in subroutine calls while minimizing register saving. The callee saves those registers which the caller tells it are in use and which the callee needs, and keeps a bit pattern of those the caller sends which it does not use. It would then send this list or-ed with the list of its active registers when it makes another call. Thus it is not merely building up a long list by or's. Having written programs with unavoidable rare conditional subroutine calls, I can unequivocally state that having the caller save registers can be so costly as to require kludges. -- Herman Rubin Until mid May: Department of Statistics, Cockins Hall, 1958 Neil Avenue Ohio State University, Columbus OH43210 hrubin@osupyr or cbosgd!osu-eddie!osupyr!hrubin or ts0474@ohstvma.bitnet "Permanent": Department of Statistics, Purdue University, West Lafayette IN47907 hrubin@l.cc.purdue.edu or ihnp4!pur-ee!stat-l!hrubin or hrubin@purccvm.bitnet