ccplumb@watnot.UUCP (04/15/87)
I've been dreaming up a RISCy architecture in my spare time, and in the course of trying to minimize the number of memory accesses per instruction, I ran into the problem of handling JSR's. A branch already requires an extra fetch to fill the pipeline, and adding a stack push would make things ugly.

What if JSR moved the return address into another register? If the register was R0 (a register hardwired with the constant 0), you'd have a JMP. To nest JSR's, the called procedure would need to save this register, but it needs to save registers for locals anyway, so it shouldn't be too much of a hassle. As far as a compiler is concerned, the return register is just another reg that's trashed by all function calls and needs to be restored before exit.

Also, if you were really tricky, you could nest a few levels without saving any registers at all. Caller and callee need to agree on which register is used to pass the return address, but if the calling sequence is known, you can set up a different convention for every level of nesting.

I suppose if register windows were used, this idea would correspond to putting the PC into the movable window.

I'm sending this out for comment. Are there any really serious bugs in it? Would it be really wonderful?

Thanks for any advice.
--
-Colin Plumb (watmath!watnot!ccplumb)

Silly quote: There's a flaw in the ointment.
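The proposal can be tried out on a toy model. What follows is a minimal sketch in Python, not any real instruction set: the `Machine` class, its eight registers, and the one-word instruction size are all assumptions made for illustration. It shows the three cases the post describes: an ordinary call, a nested call in which the callee saves the link register first, and a call that names R0 as the link register and so degenerates into a plain jump.

```python
# Toy model of a call that writes the return address into a register
# instead of pushing it. Every name here is an illustrative assumption.

class Machine:
    def __init__(self):
        self.pc = 0
        self.regs = [0] * 8      # regs[0] is hardwired to zero

    def write(self, r, value):
        if r != 0:               # writes to R0 are silently discarded
            self.regs[r] = value

    def jsr(self, link, target):
        """Put the address of the next instruction in `link`, then jump."""
        self.write(link, self.pc + 1)
        self.pc = target

    def ret(self, link):
        """Return by jumping to the address held in `link`."""
        self.pc = self.regs[link]


m = Machine()
m.pc = 10
m.jsr(1, 100)                    # call: R1 <- 11, PC <- 100
after_call = (m.pc, m.regs[1])

m.pc = 105                       # somewhere inside the callee
saved = m.regs[1]                # a non-leaf callee saves its link first
m.jsr(1, 200)                    # nested call reuses R1: R1 <- 106
m.ret(1)                         # inner return
inner_return = m.pc
m.write(1, saved)                # restore the outer link
m.ret(1)                         # outer return
outer_return = m.pc

m.jsr(0, 300)                    # link register R0: nothing is kept, so a JMP
plain_jump = (m.pc, m.regs[0])
```

The save and restore around the nested call are the only extra work a non-leaf routine pays, and a leaf routine pays nothing.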
mason@tmsoft.UUCP (04/16/87)
In article <12884@watnot.UUCP> watmath!watnot!ccplumb (Colin Plumb) writes:
>
> I've been dreaming up A RISCy architecture in my spare time, and in the
>course of trying to minimize the number of memory accesses per instruction,
>I ran into the problem of handling JSR's. A branch already requires an
>extra fetch to fill the pipeline, and adding a stack push would make things
>ugly.
>
>What if JSR moved the return address into another register? If the register

I'm not sure if this is currently in use by anyone, but I've also thought of it in a RISCy stack machine (no, I don't think that's a conflict in terms) that I've been doing thought experiments with. I don't think you want the generality of being able to save the return in ANY register (like you can do with the PDP11 (although with a stack push)). I foresaw 2 different save places. This would allow the compiler to not have to save the return register if the routine didn't call anything else (with 2 return registers, compiler-generated calls (like structure copy) could use the second call/return pair).

This may also be advantageous for the stack machine I was thinking of, because the result on the TOS doesn't interfere with the return address (you can get some of the advantage of a data stack + return stack with only one real stack).
--
../Dave Mason, TM Software Associates (Compilers & System Consulting)
..!{utzoo seismo!mnetor utcsri utgpu lsuc}!tmsoft!mason
henry@utzoo.UUCP (Henry Spencer) (04/16/87)
> What if JSR moved the return address into another register? ... As I recall, this is exactly what the call instruction on the original Berkeley RISC designs does. > ... Are there any really serious bugs in it? None that are obvious. > Would it be really wonderful? ... I don't know about "really wonderful", but it seems a sensible thing to do. -- "We must choose: the stars or Henry Spencer @ U of Toronto Zoology the dust. Which shall it be?" {allegra,ihnp4,decvax,pyramid}!utzoo!henry
rwa@auvax.UUCP (Ross Alexander) (04/17/87)
In article <12884@watnot.UUCP>, ccplumb@watnot.UUCP writes:
> I've been dreaming up A RISCy architecture in my spare time, and [...]
> I ran into the problem of handling JSR's.
> [ ... ]
> What if JSR moved the return address into another register?
> -Colin Plumb (watmath!watnot!ccplumb)

As all old Waterloo MFCF hackers know ( :-) please! ) that's the way that Honeywell 6000's do (did?) things - the JSR analogue was an instruction called Transfer-Set-indeX (tsx) which jumped and dropped the return address out into an index register of your choice. So if you called subr A via 'tsx 1,a' and A called B via 'tsx 2,b' then the returns would be 'tra 0,2' and 'tra 0,1' in that order without any stack operations.

The fly was that there were only 8 index registers, so one ended up using one (the B compiler used index reg 7 (?)) to act as a stack pointer, and doing loads and stores to fake pushing and popping the return addresses. Grotty. Of course, in hand-written assembler (William Ince's APL interpreter comes to mind) this trick worked very well. But I wouldn't care to maintain that code.

Anyway, I think life is simpler with a conventional stack and JSR/RTS instructions.

...!alberta!auvax!aubade!rwa
Ross Alexander, Athabasca University
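The "one link register per nesting level" trick, and the point where it runs out, can be sketched as a small allocation problem. The Python below is only an illustration under stated assumptions: `assign_link_registers`, the call-graph dictionary, and the register pool are hypothetical and not taken from any Honeywell tool. Each routine gets the register for its deepest position in the call tree, so a caller and its callee never share one; recursion or a chain deeper than the pool is reported as needing a real stack.

```python
def assign_link_registers(calls, root, available):
    """Pick a link register for each routine by its nesting depth under `root`.

    `calls` maps a routine name to the routines it calls; `available`
    lists the index registers free to hold return addresses.
    """
    depth_of = {}

    def visit(name, depth, path):
        if name in path:
            raise ValueError(f"{name} is recursive; a stack is needed")
        if depth >= len(available):
            raise ValueError(f"call chain to {name} is deeper than the register pool")
        # Use the deepest position seen, so every caller sits at a
        # strictly shallower depth than the routines it calls.
        depth_of[name] = max(depth_of.get(name, 0), depth)
        for callee in calls.get(name, ()):
            visit(callee, depth + 1, path | {name})

    visit(root, 0, frozenset())
    return {name: available[d] for name, d in depth_of.items()}


# The example from the post: A is entered via 'tsx 1,a' and calls B via 'tsx 2,b'.
links = assign_link_registers({"a": ["b"], "b": []}, "a", available=[1, 2, 3])
```

With a fixed pool the scheme only works when the whole call tree is known in advance, which fits the remark that it suited hand-written assembler better than compiled code.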
louis@auvax.UUCP (Louis Schmittroth) (04/17/87)
The TSX was _the_ way to call a subroutine on the IBM-704, as I recall, but the last time I coded one was about 27 years ago. The 704 surely belongs in the hall of fame as one of the very successful vacuum tube scientific and engineering computers in the 1950's.
bcase@amdcad.AMD.COM (Brian Case) (04/18/87)
In article <137@auvax.UUCP> rwa@auvax.UUCP (Ross Alexander) writes:
>In article <12884@watnot.UUCP>, ccplumb@watnot.UUCP writes:
>> What if JSR moved the return address into another register?
>
>As all old Waterloo MFCF hackers know ( :-) please! ) that's the
>way that Honeywell 6000's do (did?) things - the JSR analogue
>was an instruction called Transfer-Set-indeX (tsx) which jumped
>and dropped the return address out into an index register of
>your choice.
>
>Anyway, I think life is simpler with a conventional stack and
>JSR/RTS instructions.

Just as a data point, the Am29000 call instruction (and call-indirect; these are the only two "call" instructions) has an 8-bit destination field for specifying the general register into which the return address is placed. This is good in that it allows the user to define the most appropriate procedure-call mechanism for his particular environment. (Actually, not many people will design unique procedure-call mechanisms, but the flexibility is important for some special applications.)

A conventional stack and jsr/rts instruction set is great as long as it does what you want. As soon as the match isn't right, the trouble begins: you're stuck with what the machine gives you. The good ol' calls instruction on the VAX is nice, but it is too slow because it does too much.

bcase
mash@mips.UUCP (John Mashey) (04/19/87)
In article <12884@watnot.UUCP> watmath!watnot!ccplumb (Colin Plumb) writes: > > > I've been dreaming up A RISCy architecture in my spare time, and in the >course of trying to minimize the number of memory accesses per instruction, >I ran into the problem of handling JSR's. A branch already requires an >extra fetch to fill the pipeline, and adding a stack push would make things >ugly. > >What if JSR moved the return address into another register? .... MIPS R2000's do this. It works fine. -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD: 408-720-1700, x253 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
robison@uiucdcsb.cs.uiuc.edu (04/20/87)
In article <12884@watnot.UUCP>, ccplumb@watnot.UUCP writes: > I've been dreaming up A RISCy architecture in my spare time, and [...] > I ran into the problem of handling JSR's. > [ ... ] > What if JSR moved the return address into another register? > -Colin Plumb (watmath!watnot!ccplumb) The PC/RT processor chip stores the return address in a register. This method is quite simple, since upon entry to most subroutines, the processor saves a block of registers on the stack anyway. In the case of a leaf procedure (a procedure with no embedded procedure calls), the processor can avoid the save. Arch D. Robison University of Illinois at Urbana-Champaign CSNET: robison@UIUC.CSNET UUCP: {ihnp4,pur-ee,convex}!uiucdcs!robison ARPA: robison@B.CS.UIUC.EDU (robison@UIUC.ARPA)
gerryg@laidbak.UUCP (Gerry Gleason) (04/20/87)
In article <136@tmsoft.UUCP> mason@tmsoft.UUCP (Dave Mason) writes:
>I'm not sure if this is currently in use by anyone, but I've also thought of
>it in a RISCy stack machine (no, I don't think a conflict in terms) that I've
>been doing thought experiments with.

There is one I know of, CRISP (C Reduced Instruction Set Processor) at AT&T. I did some work on the kernel for an experimental prototype of this machine. It's basically a RISC processor with a top-of-stack cache. It seemed like a good architecture, but I don't know what they are doing with it.

gerry gleason
firth@sei.cmu.edu (Robert Firth) (04/20/87)
In article <12884@watnot.UUCP> watmath!watnot!ccplumb (Colin Plumb) writes:
> I've been dreaming up A RISCy architecture in my spare time...
>What if JSR moved the return address into another register? If the register
>was R0 (A register hardwired with the constant 0), you'd have a JMP. To nest
>JSR's, the called procedure would need to save this register, but it needs to
>save registers for locals anyway, so it shouldn't be too much of a hassle.
>As far as a compiler is concerned, the return register is just another reg
>that's trashed by all function calls and needs to be restored before exit...

As someone who has implemented several languages on several machines, perhaps my thoughts might be helpful.

In the majority of the code generators I've written, the first instruction of a procedure retrieves the return link from the place where the hardware put it. For example, the CA LSI-2 stores the return link inline before the called routine, so if you want recursion or reentrancy you've got to move it. The PDP-11 puts it on the SP stack, so if you want to allocate local variables towards high addresses you have to pop it off again or grow two stacks. And so on.

The machines I've liked best stored the return link in a register. Not just for that reason; in addition they have both been very clean pieces of hardware (thanks, Perkin-Elmer, for the PE3200; thanks, MIPS, for MIPS), but one aspect of that cleanliness is that they don't try to tell language implementors how to think.

You definitely have my vote for using a register.

Another issue is the right operand of the JSR. Most machines seem to use the "effective address" as the operand, so whereas LOAD F will fetch the VALUE in F, JSR F will jump to the ADDRESS of F. I have never liked this. You lose nothing, and gain a lot, by evaluating the operand in Rmode, so that JSR #F calls F, JSR F calls the thing pointed to by F, and JSR Reg calls the thing whose value has been computed in the register.

But this is an eccentric view.
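The difference between the two operand conventions is easy to lose in prose, so here is a small Python sketch of it. Everything in it is assumed for illustration: the `Imm`, `Mem` and `Reg` operand classes, the function names, and the addresses. It computes the call target under the proposed R-mode rule for the three forms named above and, for comparison, under the usual effective-address rule for a plain `JSR F`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Imm:
    """Written #F: the operand is the literal F."""
    value: int


@dataclass(frozen=True)
class Mem:
    """Written F: the operand names memory location F."""
    address: int


@dataclass(frozen=True)
class Reg:
    """Written Rn: the operand names a register."""
    number: int


def rvalue(operand, memory, regs):
    """Evaluate an operand the way LOAD would, yielding its value."""
    if isinstance(operand, Imm):
        return operand.value
    if isinstance(operand, Mem):
        return memory[operand.address]
    if isinstance(operand, Reg):
        return regs[operand.number]
    raise TypeError(f"unknown operand: {operand!r}")


def jsr_target_rmode(operand, memory, regs):
    """Proposed rule: JSR jumps to the operand's value."""
    return rvalue(operand, memory, regs)


def jsr_target_effective_address(operand):
    """Usual rule: JSR F jumps to the address F, not to its contents."""
    if isinstance(operand, Mem):
        return operand.address
    raise ValueError("only the plain 'JSR F' form is modeled here")


F = 0x400
memory = {F: 0x900}      # location F holds a pointer to another routine
regs = {3: 0x700}        # R3 holds a computed routine address

direct = jsr_target_rmode(Imm(F), memory, regs)      # JSR #F
indirect = jsr_target_rmode(Mem(F), memory, regs)    # JSR F
computed = jsr_target_rmode(Reg(3), memory, regs)    # JSR R3
usual = jsr_target_effective_address(Mem(F))         # conventional JSR F
```

Under the R-mode rule the same evaluator serves LOAD and JSR, which is the uniformity being argued for; the cost is that the common direct call has to be written with an immediate.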
baum@apple.UUCP (Allen J. Baum) (04/21/87)
The Bell Labs CRISP call instruction saves the return address in the stack, which happens to be in the registers because of their stack cache. The HP Spectrum call instructions put the return address into any of the GPRs. I think even the old PDP-6/10/20 had instructions that did exactly that. {decwrl,hplabs,ihnp4}!nsc!apple!baum (408)973-3385
paul@unisoft.UUCP (Paul Campbell) (04/21/87)
In article <136@tmsoft.UUCP> mason@tmsoft.UUCP (Dave Mason) writes:
>I'm not sure if this is currently in use by anyone, but I've also thought of
>it in a RISCy stack machine (no, I don't think a conflict in terms) that I've
>been doing thought experiments with.

The INMOS Transputer is also a RISCy stack machine (it only has 3 stack registers, but they claim that that is all you need for most expressions; it also has 2-4k of 50ns on-chip RAM ...).

Paul Campbell ..!ucbvax!unisoft!paul
kenny@uiucdcsb.UUCP (04/21/87)
/* Written 12:52 pm Apr 20, 1987 by firth@sei.cmu.edu in uiucdcsb:comp.arch */
In article <12884@watnot.UUCP> watmath!watnot!ccplumb (Colin Plumb) writes:
> I've been dreaming up A RISCy architecture in my spare time...
>What if JSR moved the return address into another register?

Then firth@sei.cmu.edu replies:
[arguments for storing return address for branches in register]
<Another issue is the right operand of the JSR. Most machines seem to
<use the "effective address" as the operand, so whereas LOAD F will fetch
<the VALUE in F, JSR F will jump to the ADDRESS of F. I have never liked
<this. You lose nothing, and gain a lot, by evaluating the operand in
<Rmode, so that JSR #F calls F, JSR F calls the thing pointed to by F,
<and JSR Reg calls the thing whose value has been computed in the register.
<But this is an eccentric view.

Not really; if you look at the PDP-11 architecture, it appears that a jump is in fact a move-immediate to the program counter. But why not just treat the program counter as another register, that happens to be used in auto-increment mode by the hardware? The subroutine linkage operations would expand to two-instruction pairs, with a RISC-y flavor:

    Call                            Return
    MOVE PC, Rn
    ADD #<size of jump>, Rn
    MOVE #<subr>, PC                MOVE Rn, PC

If you have three-address operations, it's simpler:

    Call                            Return
    ADD #<size of jump>, PC, Rn     MOVE Rn, PC
    MOVE #<subr>, PC

It's probably worthwhile combining the operations in the hardware, because procedure linkage is so expensive, *provided*, of course, that the hardware designer can do it cheaply.

It's really nice, though, being able to use the PC as a general register -- I can't think of applications where I'd want to multiply or divide by it, but a load-direct from memory is useful for branch tables and the like, while having it available as an operand really cleans up position-independent coding.
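The three-address call sequence can be stepped through with a few lines of Python. This is a sketch under assumptions, not a model of any real machine: the `run` function, the opcode names `MOVE_IMM` and `HALT`, the addresses, and the instruction sizes are all invented for the example. The one assumption that matters is that the PC, when read as an operand, already points past the instruction being executed; that is what makes adding the size of the jump produce the right return address.

```python
def run(program, start, max_steps=100):
    """Interpret `program`, a dict of address -> (size, op, args).

    The program counter is an ordinary entry in the register file, so
    jumps and returns are just moves into regs["PC"].
    """
    regs = {"PC": start, "R1": 0, "R2": 0}
    trace = []
    for _ in range(max_steps):
        pc = regs["PC"]
        if pc not in program:
            break
        size, op, args = program[pc]
        trace.append(pc)
        regs["PC"] = pc + size          # PC now points past this instruction
        if op == "ADD":                 # ADD #imm, src, dst
            imm, src, dst = args
            regs[dst] = imm + regs[src]
        elif op == "MOVE_IMM":          # MOVE #imm, dst
            imm, dst = args
            regs[dst] = imm
        elif op == "MOVE":              # MOVE src, dst
            src, dst = args
            regs[dst] = regs[src]
        elif op == "HALT":              # hypothetical stop instruction
            break
    return regs, trace


JUMP_SIZE = 2                           # assumed size of the MOVE #<subr>, PC
program = {
    0: (1, "ADD", (JUMP_SIZE, "PC", "R1")),     # R1 <- address after the jump
    1: (JUMP_SIZE, "MOVE_IMM", (10, "PC")),     # jump to the subroutine at 10
    3: (1, "HALT", ()),                         # execution resumes here
    10: (1, "MOVE_IMM", (42, "R2")),            # subroutine body
    11: (1, "MOVE", ("R1", "PC")),              # return
}

regs, trace = run(program, start=0)
```

The trace visits the call pair, the subroutine, and then the instruction after the jump, with no dedicated call or return opcode involved.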
phil@osiris.UUCP (Philip Kos) (04/21/87)
In article <1061@aw.sei.cmu.edu>, firth@sei.cmu.edu (Robert Firth) writes: > In article <12884@watnot.UUCP> watmath!watnot!ccplumb (Colin Plumb) writes: > > >What if JSR moved the return address into another register? > > .... The PDP-11 puts it on > the SP stack, .... Yeah, well, not if you tell it not to. As I recall, the PDP-11 JSR instruction allows you to specify *any* of the eight "GPRs" to hold the current PC contents, after pushing the old value of the specified register onto the stack. (Among other purposes, this was used [with R5] for passing the address of an in-line FORTRAN parameter address block - the scheme was referred to in RT-11 manuals as "subroutine linkage".) And to think I was feeling left out back when everyone was discussing the old IBM machines in comp.misc, just because I never wrote assembly language on anything more primitive than a 370! I guess I'm not as young as I thought I was... :-) -- ...!decvax!decuac - Phil Kos \ The Johns Hopkins Hospital ...!seismo!mimsy - -> !aplcen!osiris!phil Baltimore, MD / ...!allegra!mimsy - "And you'll be my duchess, my duchess of prunes!" - F. Zappa
utterback@husc4.HARVARD.EDU (Brian Utterback) (04/21/87)
The original poster was wondering whether anyone still used the method of storing the return address of a jmp in a general register. I can answer yes. The Cray-2 has as its equivalent of "JSR" the instruction "r,Ai Ak", which branches to the address held in register Ak and stores the return address in Ai.

Brian Utterback
The above opinions are not really held by anyone, especially my employer.
watson@convexs.UUCP (04/22/87)
I believe the Berkeley RISC I and RISC II chips did just this. See the excellent doctoral thesis by Katavenis (spelling?) from UCB. This thesis won the ACM doctoral award a year or two ago.
jfh@killer.UUCP (04/22/87)
Changing JSR/BSR/CALL etc to a save-pc-and-jump instruction is OLD. Probably older than me. The first time I saw it was in IBM land. Correct me if I am wrong about the System 360, but calls with it used a Branch And Link instruction that did exactly what was described.

The PDP-11 family did weird things with JSR. The format was JSR func,reg. The current value of REG was stacked (voila - recursion is born), the current PC is saved in REG, and the address of FUNC is loaded into the PC. Note that JSR FUNC, PC does exactly what (well, similar to what) JSR FUNC does in Vax/MC68000/J-Random Chip. You return with RTS reg, where (unless you are into co-routines or weird results) reg is the same one used in the JSR. The microcode for the RTS instruction moves REG into PC, to restore the return address (note that this is a no-op if REG is PC), and then pops the stack into REG (this is where the return address gets loaded if REG == PC).

My favorite uPC of all time is the RCA CDP1802. It was popular (?) long before the 8088, had 16 registers, all 16 bits, and was CMOS. The best part was that it was a truly general purpose register machine. Any one register could be made into the PC by some instruction (whose name I forgot), and the SP could also be changed.

- John. (jfh@killer.UUCP)
Disclaimer: No disclaimer. Whatcha gonna do, sue me?
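The register transfers described for the PDP-11 JSR and RTS can be written out directly. The Python below is a sketch that follows the three steps as stated in the post; the class name `PDP11ish`, the method names, and the sample addresses are assumptions, and nothing else about the machine (addressing modes, word size, the real stack in memory) is modeled. Note that DEC's assembler writes the register first, as in `JSR PC, FUNC` and `RTS PC`.

```python
class PDP11ish:
    """Only the JSR/RTS register transfers; the rest of the machine is omitted."""

    PC = 7                                    # R7 is the program counter

    def __init__(self):
        self.regs = [0] * 8
        self.stack = []                       # stands in for the SP stack

    def jsr(self, reg, func):
        self.stack.append(self.regs[reg])     # old REG is stacked
        self.regs[reg] = self.regs[self.PC]   # current PC is saved in REG
        self.regs[self.PC] = func             # address of FUNC goes into PC

    def rts(self, reg):
        self.regs[self.PC] = self.regs[reg]   # REG goes back into PC
        self.regs[reg] = self.stack.pop()     # old REG is popped


# Linkage through R5: the return address travels in the register.
m = PDP11ish()
m.regs[5] = 1234                              # caller's own R5 contents
m.regs[PDP11ish.PC] = 1000                    # PC already past the JSR
m.jsr(5, 2000)
in_callee = (m.regs[PDP11ish.PC], m.regs[5], list(m.stack))
m.rts(5)
after_return = (m.regs[PDP11ish.PC], m.regs[5], list(m.stack))

# Linkage through PC: the same steps collapse into a plain push and pop.
n = PDP11ish()
n.regs[PDP11ish.PC] = 1000
n.jsr(PDP11ish.PC, 2000)
pushed = list(n.stack)
n.rts(PDP11ish.PC)
back_at = n.regs[PDP11ish.PC]
```

When the link register is the PC itself, the middle step of each instruction copies the PC onto itself, which is the no-op the post points out.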
jfh@killer.UUCP (04/22/87)
In article <1061@aw.sei.cmu.edu>, firth@sei.cmu.edu (Robert Firth) writes:
> The machines I've liked best stored the return link in a register. Not
> just for that reason; in addition they have both been very clean pieces
> of hardware (thanks, Perkin-Elmer, for the PE3200; thanks, MIPS, for MIPS),
> but one aspect of that cleanliness is that they don't try to tell
> language implementors how to think.
>
> You definitely have my vote for using a register.
>
> Another issue is the right operand of the JSR. Most machines seem to
> use the "effective address" as the operand, so whereas LOAD F will fetch
> the VALUE in F, JSR F will jump to the ADDRESS of F. I have never liked
> this. You lose nothing, and gain a lot, by evaluating the operand in
> Rmode, so that JSR #F calls F, JSR F calls the thing pointed to by F,
> and JSR Reg calls the thing whose value has been computed in the register.
> But this is an eccentric view.

I read someone's complaint about the vax CALLS/CALLG instructions in an article I read after posting my reply to the original article, so I apologize for posting again.

DEC has been doing things the *right* way since the early 70's with the PDP-11 and later with the Vax. JSR on a PDP-11 saves the return address in a register. JSR on a Vax (same for a PDP-11, but the Vax has more and harder-to-figure-out modes :-) takes a dozen or so addressing modes (the illegal ones are the most fun: what does JSR (R15) _really_ do?). Trivia question - what does TSTW -(R15) do and is it legal?

Other thoughts I thought of... Motorola does what DEC does (kinda) with the M68000 family. Except for the abundance of modes. JMP (A0,D0) is my favorite code to generate for switches, and JSR (A0,D0) probably has some equally handy usages (like device or file system switches (see 'Unlinking "."' in comp.bugs.sys5)).

Yes, I am all for saving the return address in a register; it makes interpreters and such much easier - co-routines and other non-standard constructions work much better also. I am also for having a way to stack the return address automatically. (And Guy Harris thought the PDP-11 was really dead ... Here is yet another not-yet-dead feature ...)

The only problem with the DEC approach is that it can't be pipelined too much, since the branch address and the old PC may both want to be loaded into the PC at the same time. Of course, this is a problem with the uCode coder.

- John. (jfh@killer.UUCP)
Disclaimer: No disclaimer. Whatcha gonna do, sue me?
greg@utcsri.UUCP (04/28/87)
In article <1061@aw.sei.cmu.edu> firth@bd.sei.cmu.edu.UUCP (Robert Firth) writes: >Another issue is the right operand of the JSR. Most machines seem to >use the "effective address" as the operand, so whereas LOAD F will fetch >the VALUE in F, JSR F will jump to the ADDRESS of F. I have never liked >this. You lose nothing, and gain a lot, by evaluating the operand in >Rmode, so that JSR #F calls F, JSR F calls the thing pointed to by F, >and JSR Reg calls the thing whose value has been computed in the register. >But this is an eccentric view. You do lose something. If the 68000 JSR worked this way, e.g., there would be no way to JSR relative to the PC. JMP is effectively LEA <address>,PC and JSR is the same thing with a push. So do you think LEA is useless? Even if you don't need a PC-relative address, it is often more efficient than an absolute one. So what do you gain? Of course you can indirect one level further, but does that gain you a lot? It doesn't with CISC addressing modes, anyway. The PDP-11 provides other jump options: MOV ...,PC which is an Rmode jump, and ADD ...,PC which is a relative jump with general-address offset. As for 'JSR Reg', this is illegal on the 68K and PDP, where it is replaced for free by 'JSR (Reg)'. The analogous op on an NS32k is JSR 0(reg), with one byte for the 0, since non-indexed register indirection is not provided. You can say JSR reg, which is the same thing for different reasons: a register in an address context is assumed to contain the address. -- ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg Have vAX, will hack...
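The point about PC-relative targets can be made concrete with a little arithmetic. The Python below is a sketch with invented names and addresses (`resolve`, `encode_relative`, the load offset); it also assumes, for simplicity, that the displacement is measured from the address of the calling instruction, whereas real machines each pick their own base. It compares an absolute and a PC-relative call after the whole program image has been moved in memory.

```python
def resolve(mode, operand, pc):
    """Turn a jump operand into a target address under the given mode."""
    if mode == "absolute":
        return operand
    if mode == "pc_relative":
        return pc + operand
    raise ValueError(f"unknown mode: {mode}")


def encode_relative(target, pc):
    """Displacement an assembler would store for a PC-relative jump."""
    return target - pc


CALL_SITE = 0x1010       # where the JSR sits when assembled
ROUTINE = 0x1040         # where the called routine sits when assembled
displacement = encode_relative(ROUTINE, CALL_SITE)

LOAD_OFFSET = 0x5000     # the whole image is later loaded this much higher
moved_site = CALL_SITE + LOAD_OFFSET
moved_routine = ROUTINE + LOAD_OFFSET

relative_target = resolve("pc_relative", displacement, moved_site)
absolute_target = resolve("absolute", ROUTINE, moved_site)
```

After the move the relative form still reaches the routine, while the absolute form points at the old location and would need relocation by a loader. That is the capability an operand evaluated purely as a value would give up unless the PC itself were available as an operand.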