ericmcg@pnet91.cts.com (Eric Mcgillicuddy) (12/21/90)
The WDC spec sheets do not go into detail on which instructions fail to work properly with /ABORT; however, there are some provisos listed:

- The processor status will be modified if /ABORT is asserted after cycle 3 of an RTI.
- The processor status will be modified if /ABORT is asserted after a modify cycle.
- The PBR will be set to 00 if /ABORT is asserted after cycle 2 of any hardware or software interrupt, and the DBR is also set to 00 if in emulation mode.
- /ABORT should not be held low for more than 1 cycle. After 1 cycle, the ABORT latch is reset and will thus ABORT the ABORT interrupt.

As far as I can tell, /ABORT will only cause problems if used as an asynchronous interrupt. Ensuring that it is not triggered too far away from the positive transition of phi2 (within tpcs nS) should prevent any badness from occurring, regardless of instruction. Leventhal merely suggests not using the ABORT vector and goes into no detail on what problems could be created.

I vote that you modify GSOS rather than writing UNIX GS. I would certainly buy the card if it were compatible.

BTW, I suggest you contact Tony Faddell; the ASIC '816 apparently 'corrects' the problems with /ABORT, so he would know the most about this particular interrupt.

Also BTW, what about using the newer 68852? I believe that it is a bit faster and might have a larger table cache on chip.

UUCP: bkj386!pnet91!ericmcg
INET: ericmcg@pnet91.cts.com
rhyde@ucrmath.ucr.edu (randy hyde) (12/21/90)
The following is a somewhat out of date article (since there are technical specs on the 65c832 apparently available) about the 65c832 chip. This originally appeared as an article in // Technical a year or two ago. Interesting reading for assembly and hardware types.

A Report on the 65c832

There have been lots of rumors concerning possible upgrades to the Apple //GS system. One central theme to an upgrade is an enhanced 65xxx microprocessor. Ever since the introduction of the 65c816 microprocessor, stories concerning its eventual upgrade, the 65c832, have flourished. Like all tales, the tale of the 65c832 chip has included fact intertwined with fiction. Someone would say "This is what I wish the chip would do." People repeating the story translated it to "This is what the chip will do." and wishes became design goals.

For many of us, the 65c832 is going to be a big disappointment. We've been led to believe that the 65c832 will do all kinds of wonderful things. When it finally appears, it will represent a natural evolution of the '816, not the radical departure that everyone wants. The kind of stuff that big letdowns are made of. However, keep one thing in mind: the '832 will be completely upwards compatible with the '816, so no matter what it does differently, it certainly cannot be worse than the 65c816.

A Little History

The 65c832 microprocessor's origins began over 12 years ago at Motorola. Motorola's microprocessor division had just released their 6800 microprocessor and they were working on designs for the next generation chip. A rift developed and a maverick band of engineers, led by Chuck Peddle, splintered off to form MOS Technologies. Their first product, the 6501 microprocessor, was greeted by a flurry of lawsuits from Motorola. The 6501 was pin-compatible with the 6800 and Motorola didn't like the fact that MOS Technologies' parts could be substituted into a board designed for Motorola parts.
So MOS Technologies introduced an improved version of the 6501, the 6502, which had improved hardware and a different pin-out. Several commercially available computer systems employed the 6500 family microprocessors including the KIM, SYM, AIM, OSI, PET, Atari, Commodore, and, of course, the Apple I and Apple II. There were actually a dozen or so different 6500 microprocessors. They all shared the same instruction set (indeed, the same silicon chip); they differed mainly in the packaging used to hold the chip. For example, the Atari 2600 VCS system used a 28 pin version of the 6502 called the 6507.

Around August, 1978, one of MOS Technologies' second sources, Synertek, began circulating specifications for a new 6500 microprocessor called the 6516. This chip was a pseudo-sixteen bit processor designed to compete with the new Motorola 6809 microprocessor. This chip introduced a few new addressing modes and several new instructions. Probably the most unusual thing about it was that it used a set of processor status register bits to control whether or not the A, X, and Y registers, or memory operands, operated in eight or sixteen bit mode. The (previously) unused bit in the P register became a user flag in the 6516. The 6516 sported sixteen-bit accumulator, X, Y, PC, and SP registers. It also incorporated an eight-bit "Z" register which controlled the location of the zero page. The 6516 supported the following addressing modes:

- immediate
- implied
- register
- direct page
- direct page indirect
- direct page indexed by X
- direct page indexed by Y
- direct page indexed by X indirect
- direct page indirect indexed by Y
- absolute
- absolute indirect
- absolute indexed by X
- absolute indexed by Y
- 8 and 16 bit relative

The instruction set included all of the 6502's instructions plus LDZ (STZ), LDS (load SP), LHA (load H.O. A byte), LHX (load H.O. X byte), LHY (load H.O.
Y byte), LAX (load A from location pointed at by X), SAX (store A at (X)), LAY/SAY (load/store A at (Y)), ADD (no need to clear carry), SUB (no need to set carry), INC/DEC accumulator, TAZ (init Z register), TZA (get current Z register value), YPC (transfer Y to PC -- JMP (Y)), PCY (copy current PC into Y), XHA/XHX/XHY (swap A, X, and Y halves), XXY (exchange values in X/Y registers), SEF/CLF (set/clear user flag), LDQ (load "Q" processor register with an immediate value), SEV (set overflow flag), AXA/AYA (add X/Y to A), AAX/AAY (add A to X or Y), AMX/AMY (add memory to X or Y), NEG (negate accumulator), several new shift and rotate instructions including RLT, RRT, ASR, RHL, RHR, RXL, RXR, RYL, and RYR, BFS/BFC (branch if user flag set/clear), JNE/JEQ (jump long if not equal/equal), PHD/PLD (push/pop 16-bit A), PHX/PHY/PLX/PLY/PHZ/PLZ (push/pop X, Y, and Z registers), PHR/PLR (push/pop all registers), BR1..BR5 (five new BRK/software interrupt instructions).

In addition to the new instructions, Synertek enhanced several old instructions by adding new addressing modes. They also reduced the number of cycles needed to execute various instructions; for example, many implied addressing mode instructions took only one cycle (rather than two) on the 6516.

After reading over the Synertek technical notes, I immediately wrote an article for Micro, the 6502 Journal discussing the 6516 microprocessor. The very next month after publication one of Synertek's representatives wrote a letter to Micro swearing up and down that there was no such project, never was such a project, and that I'd made the whole thing up. Funny, I still have in my possession, on Synertek letterhead, technical notes #34 and #40 which describe the features of the SY6516 microprocessor.

The SY6516 never saw the light of day. Synertek's representatives who had come around and shown me the specs for the SY6516 were simply gauging people's reactions to the chip.
Apparently, the reactions weren't strong enough to forge ahead with the product. An advanced 65xx processor was not forthcoming from Synertek.

Around 1980, MOS Technologies (having long since been bought out by Commodore) began making noises about a 16-bit upgrade to the 6502 designed to compete with the 68000, Z8000, and 8086 microprocessors. In the true one-upmanship style common to semiconductor houses, MOS Technologies called their chip the 650,000. I glanced over the extremely tentative specs for the chip. It reminded me of Intel's iAPX-432 processor so I immediately wrote the chip off. Fortunately for Commodore's sake, MOS Technologies completely abandoned work on the chip before it got out of the wish list stage.

It seemed as though the 6502 was destined to be left behind by semiconductor houses. The introduction of the IBM PC cemented the 8086's future and killed off any hopes for Zilog's Z8000. The Motorola 68000 was hanging in there due to its superior architecture, and the introduction of the Lisa and Macintosh computers guaranteed success for the 68K. Unfortunately, the success of the 68K was almost the final nail in the 6502's coffin. Almost everyone producing 65xx machines in any quantity (Apple, Commodore, and Atari) had switched over to the 68K and were waiting for their 6502 machines to die off.

The 65xx family might have truly died off were it not for one man. Bill Mensch, one of the original 6502 designers, loved his chip. If he couldn't get the big companies to design his "6502 dream machine", he'd start his own company and do it himself. With some layout tape, a couple of sheets of mylar, and help from his family, Bill laid out a chip that was a modest improvement over the 6502: the 65c02. Bill's company, The Western Design Center, licensed the 65c02 chip to several large companies including Rockwell (who made some modifications to the instruction set), GTE, NCR, and VLSI Technologies.
Eventually the 65c02 found its way into the Apple //c and the Apple //e, ensuring its success.

The 65c02, however, was not what Bill had in mind. It was a springboard. A revenue producing commodity product a development company could use to finance more ambitious products. Those ambitious products were the 65c802 and 65c816 chips.

At the time the Western Design Center (WDC) was designing the 65c816, Zilog was working on a comparable 16-bit upgrade to the Z80, the Z800. Somewhere around 1984 (I can't remember the exact date), EDN published an article comparing the work on the 65c816 with the work on the Z800. It was a David and Goliath story. Tiny WDC vs. the giant Zilog. Both companies were having problems with their chip designs. But it was WDC, laying out their chip with layout tape and mylar sheets on the kitchen table, who beat Zilog, with their fancy CAD/CAE systems, to market. Perhaps you can actually buy a Z800 today, I'm not really sure. One thing's for sure, with the demise of CP/M there's no market for such a chip.

The 65c816's design was not without its problems. Bill Mensch "improved" the bus interface on the 65c816 (over that used by the 6502). Unfortunately, the Apple's disk drive controller relied on some of the old kludges in the 6502 chip. With those problems removed, the 65c802 and 65c816 chips worked fine on an Apple computer, but the disk drives didn't work. Of course Apple immediately began lamenting about the stupidity of the designers at WDC and WDC's designers immediately began complaining about the stupidity of Apple's design. In the long run, money won out. If WDC wanted Apple to use the '816, WDC would have to redesign the chip. They did. This exchange, combined with the fact that the 65c816 was two years late coming to market, was the beginning (if not the cause) of an adversarial relationship between Apple and WDC.
There are those at Apple who feel that the schedule of the 65c816 was one of the major reasons Apple cancelled the ill-fated Apple //x project.

Eventually, the 65c816 functioned properly and Apple incorporated it into the Apple //GS. This guaranteed a modicum of success for the '816 part. There's nothing like a high visibility personal computer to guarantee a chip's success. This has worked for the Z80 (TRS-80), 6502 (Apple, Atari, Commodore, and numerous others), 8088 (IBM), and 68000 (Apple, Atari, Commodore, Sun, and others). Chips that have not lived up to their makers' expectations, like the Z8000 and 32000, were never adopted by a major personal computer manufacturer. So the 65c816 seems to have everything going for it.

That brings us up to date. Bill Mensch and the Western Design Center are not resting on their laurels. They've been busy designing several new microprocessors including the 65c265, a single chip microcomputer incorporating a 65c816, built-in RAM and ROM, parallel I/O ports, counter/timers, serial ports, a built-in LAN, and lots of other goodies. This chip isn't destined for a personal computer; it will find its way into controller applications like microwaves, stereos, telephones, and other sophisticated electronic products. These are jellybean type devices that produce a constant income for their designer. So it only makes sense that WDC finish the design of these parts.

Unfortunately, the design of parts like the 65c265 takes certain resources. Since WDC is not a gigantic conglomerate, it has limited resources. If all your manpower, time, and money are going towards the development of the 65c265, you don't have any left for the '832. That's exactly what was happening with the 65c832 as of June, 1988. It's a concept that WDC employees kick around all the time, but on which active work has yet to begin. On the positive side, there's still time to influence WDC's design.
On the down side, it will be a couple of years, at least, before the 65c832 is real.

Some 65c832 Features and Design Philosophy

Since the '832 is still at the earliest design phases, there's not a lot of solid information I can give you concerning the chip. There are some comments Bill Mensch made at the 65c832 standards conference last June in Phoenix, Arizona that might shed some light on what you can expect to see.

First of all, don't expect a 16 or 32 bit bus on the 65c832. One of Bill Mensch's design goals is to produce a chip that is pin compatible with the 65c816. He wants you to be able to unplug your 65c816 in your Apple //GS and pop in a 65c832 and continue running old software on your Apple //GS. This guarantees some compatibility with existing hardware, but it definitely limits performance due to bus bandwidth limitations. Bill mentioned the possibility of a 65032 chip which supports a full 32-bit data and address bus, but he'd have to be convinced there is a need for such a part before he would commit to it.

You can also expect to find integer multiply and divide instructions and probably a set of floating point instructions on the '832. I don't know a whole lot about chip design, but I do know that floating point instructions take a lot of effort and silicon to implement. Why do you think all of the other major manufacturers have gone to separate floating point coprocessor chips? Indeed, originally the 65c832 was going to be a floating point coprocessor for the 65c816. Placing the floating point processor on the chip may cause major design problems (and their attendant delays) for the 65c832. Hopefully the folks at WDC know what they're getting into and can handle this in stride.

By the way, you can blame/thank Mike Westerfield for WDC's insistence that the floating point instructions will be on chip. Mike told Bill that he would only support the floating point instructions if they were on-chip.
He wouldn't support them in his compilers if the 65c832 used a separate FPU chip. This convinced the folks at WDC that floating point had to be on-chip. This, probably off-hand, remark from ByteWorks may end up killing the whole project. Everyone else has had trouble building coprocessors, much less putting floating point right on the chip. Perhaps WDC can pull another David and Goliath off and put everything on one chip. However, I'd rather they played it safe and actually built a 65c832, sans floating point, rather than go for the gold and give up on the project or go out of business in the process.

Naturally the 65c832 is going to support full 32-bit registers everywhere. This includes the A, X, Y, Z, PC, S, DBR, PBR, and D registers. This means you can place the direct page or stack anywhere in memory. Furthermore, you will be able to align the program and data banks on any arbitrary byte boundary. This will greatly enhance Apple's memory management and segmentation techniques. Of course, the PBR and DBR registers won't be absolutely necessary (since all addresses are now 32-bits), but they'll still be around for compatibility with the '816 chip.

WDC will upgrade the 65c816's instruction set using the currently undefined WDM (William D. Mensch) opcode. Bill Mensch hinted that the '832 will use this opcode as a prefix byte to other instructions to change their meaning. The Z80 and 6809 chips successfully used this technique to expand their instruction sets over the 8080 and 6800 microprocessors. This technique has one major drawback: it lengthens each instruction employing it which, in turn, increases the amount of execution time necessary for such instructions (by at least one cycle to fetch the opcode prefix byte). Therefore, native '832 instructions will run slower than comparable '816 native mode instructions.

Pure relative addressing is another topic Bill has espoused. On the '832 you'll be able to write truly relocatable programs.
This feature alone will dramatically affect the loading time and size of application programs running on an Apple //GS. It will also improve memory management facilities on the GS since the loader and memory manager will be able to move relocatable blocks of code around in memory at execution time. This will dramatically improve the GS' memory manager garbage collection abilities.

Beyond this, there isn't much I can say about the 65c832 chip. Addressing modes, instruction types, data types, and other new features are all up in the air at this point. Of course, if you've got some ideas about your own 65c816 "dream machine" WDC would love to hear from you. Jot your ideas down and mail them to:

    William D. Mensch
    c/o The Western Design Center
    2166 East Brown Road
    Mesa, Arizona, 85203
    (602) 962-4545

While you're at it, put in a plug for the 65032. I'd dearly love to see a true 32-bit Apple II using the NuBus. The 65032 is just the ticket for such an item.

Keep in mind that WDC isn't the only possible source for an upgraded 65c816 chip. Although unlikely, rumors have it that Apple is designing an upgrade using gate array technology. Perhaps WDC will have some competition, who knows? Whatever the case, there's a definite upgrade path for the Apple II family in the works.
rhyde@ucrmath.ucr.edu (randy hyde) (12/21/90)
This is a very long article (36k) of interest to hardware
hackers, assembly language programmers, and machine
architects. It is a description of how I feel the
65xxx family should evolve. Don't count on anything
you read here. Nonetheless, you might find it interesting.
If you're not one of the aforementioned types, you may
want to skip the noise which follows...
The 65C816 Dream Machine
This essay is an attempt to vent my frustrations.
While the 65C816 chip is, without question, better than
the 6502 and 65c02 chips that preceded it, the 65c816
leaves a lot to be desired. Unless you count
microcontrollers like the 8048, F8, or 8051, I've never
encountered a chip as difficult to program in assembly
language as the 65c816. Those stupid M and X bits cause
so much trouble I wonder if they're worth the trouble of
using them. Attempting to use the 65c816 in native mode
while attempting to coexist with other 6502 routines
(requiring emulation mode) such as ProDOS 8 can really
push one's patience. But wait! There's a small chance
things can be improved. The WDM (William D. Mensch)
instruction is reserved by the Western Design Center for
instruction set expansion. While I'm sure Mr. Mensch has
other plans for this opcode, the following treatise
provides my views on how this single opcode should be
used.
The WDM opcode should be used in the next version of
the 65c816 (let's call it the 65c820, just to be amusing)
to change the instruction set. When the 65c820 resets,
it should come up in the 6502 emulation mode, just like
the 65c816 does now. The XCE instruction could be used
to switch to 65c816 mode just like the existing 65c816
part. The WDM opcode, which I'll call NAT (for NATive
mode) will be used to switch the processor to 65c820
native mode. Once in the 65c820 mode, the 65c820 takes
on a completely different character. The only bounds
I've placed on the new instruction set is that if you can
perform an operation with a single instruction on the
65c816, you can perform the same thing on the 65c820 with
a single instruction. All other aspects (including
timing and instruction size) can vary. I've also taken
some liberties with the way certain instructions affect
the flags. For the most part however, 65c816
instructions have an identical counterpart on the 65c820.
Design Issues: There are lots of reasons for
designing a new instruction set. My criteria are as
follows:
1) The instruction set must mirror the philosophy of the
6500 family. A programmer experienced with the 6502
instruction set must feel comfortable with the 65c820
instruction set.
2) The new instruction set must support high level
language constructs better than the 6502 and 65c816
processors.
3) The new instruction set must be easy to learn and fun
to use.
4) We must remember that fancy instructions are very
difficult to implement in silicon. Hence super fancy
instructions which provide limited functionality must
be left out. For example, the 65c820 doesn't support
floating point instructions (although they could be
added via a coprocessor).
5) The only (commercially popular) computer system that
would ever use the 65c820 is an upgrade of the Apple
IIGS. Hence the instruction set should contain
instructions that enhance the operation of an Apple II
family machine.
6) The original 6502 instruction set was designed with a
small set of basic instructions complemented with a
large set of addressing modes. The 65c816 strayed
from this philosophy, the 65c820 returns to it.
Based on these design issues, I offer the following
machine; the 65c820:
_________________________________________________________
Programmer's Model:
The 65c820 will contain several additional registers,
above and beyond those available on the 65c816. All
registers are 16 bits. The register bank includes:
A, AX -- Accumulator and accumulator extension
X -- X index register
Y -- Y index register
F -- Stack frame pointer
S -- Stack pointer
D -- Direct page register
P -- Program status word
ABR -- Auxiliary bank register
SBR -- Stack bank register
DBR -- Data bank register
PBR -- Program bank register
PC -- Program counter
LBound -- Low bounds register
HBound -- High bounds register
A, X, Y, S, D, & PC are mostly identical to their 65c816
counterparts. AX is the accumulator extension used by the
multiply and divide instructions. F is a special index
register, useful for accessing local variables and
parameters. P differs from the 65c816 version in that it
is 16-bits wide. Accessing the upper byte of this
register is a privileged operation (more on that later
on). DBR and PBR are similar to their 65c816 cousins,
except they are now 16-bits long and allow you to
position the program and data banks on any PAGE boundary
(rather than a bank [64K] boundary). ABR is an auxiliary
data bank register. SBR lets you locate the stack
anywhere in the 16Mbyte address space. LBound and HBound
provide some rudimentary memory management functions.
All memory addresses are added to LBound to produce the
true physical address. If the resulting address is
greater than HBound, an ABORT trap will be issued. This
allows you to load multiple programs into memory and
protect them from being walked on by other programs.
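The LBound/HBound check is simple enough to model in a few lines. What follows is only a sketch in Python, not part of any real design: the names `translate` and `AbortTrap` are made up for illustration, the registers are treated as plain integers, and I'm assuming the comparison is made on the sum (logical address plus LBound), as the paragraph above describes.

```python
class AbortTrap(Exception):
    """Stands in for the ABORT trap (hypothetical name)."""


def translate(logical, lbound, hbound):
    """Map a logical address to a physical one using LBound/HBound.

    Assumption: the trap fires when the *physical* address (logical + LBound)
    is greater than HBound, which is how the text reads.
    """
    physical = logical + lbound
    if physical > hbound:
        raise AbortTrap(f"${physical:06X} is above HBound ${hbound:06X}")
    return physical


# A program loaded at $2000 that owns memory up to $2FFF:
print(hex(translate(0x0100, 0x2000, 0x2FFF)))
```

A reference to logical $1000 in the same program would land on $3000, which is above HBound, so `translate` raises `AbortTrap` instead of letting the access through.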
As I alluded to earlier, certain operations are
PRIVILEGED. The 65c820's program status word takes the
following form:
   15   14   13   12   11   10    9    8  |    7    6    5    4    3    2    1    0
  U/S    I    M  fpc    *    *    *    *  |    N    V    *    *    D  dir    Z    C
The low order 8 bits are identical to the 6502's P
register except the B bit isn't present (it's not
required) and the I bit has been moved to bit 14. The
dir bit controls the direction of various string
instructions (ala 8086). The low order 8 bits are called
the USRPSW (user program status word). The upper 8 bits
are called the SYSPSW (system program status word) and
can only be accessed while in the system mode. Bit 15
(U/S) is the user/supervisor bit. This bit determines
whether or not you are in the user or system (supervisor)
mode. Bit 14 is the interrupt disable bit. For
protection reasons, a user mode program cannot have
access to the interrupt disable bit (by turning off all
interrupts and not turning them back on, a user mode
program can cause all kinds of havoc). Bit 13 is the
memory management bit. If set, the LBound and HBound
registers determine the location and extent of the
logical address space. If clear, then the logical
address space and physical address space are the same.
The fpc bit determines if a floating point coprocessor is
installed. If not, the floating point expansion
instructions will cause an illegal instruction trap,
otherwise, the FP instructions will be routed to the
floating point coprocessor. The remaining bits in the P
register are reserved for future use.
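To make the bit assignments concrete, here is a small Python sketch that pulls the named flags out of a 16-bit status word. The bit positions are read off the layout and description above (U/S=15, I=14, M=13, fpc=12, then N, V, D, dir, Z, C in the low byte); the function names are mine, not part of the proposal.

```python
# Bit positions taken from the status word layout described in the text.
PSW_BITS = {
    "U/S": 15, "I": 14, "M": 13, "fpc": 12,
    "N": 7, "V": 6, "D": 3, "dir": 2, "Z": 1, "C": 0,
}


def decode_psw(p):
    """Return a dict mapping each named flag to 0 or 1."""
    return {name: (p >> bit) & 1 for name, bit in PSW_BITS.items()}


def usrpsw(p):
    """Low byte: the part a user-mode program may touch."""
    return p & 0x00FF


def syspsw(p):
    """High byte: accessible only in system mode."""
    return (p >> 8) & 0xFF


flags = decode_psw(0x8003)
print(flags["U/S"], flags["Z"], flags["C"])
```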
Opcode Format:
The 65c820's instruction set is broken down into 32
classes. They are
 0-MOV    1-LEA    2-LEAA   3-LEAD   4-LEAS   5-XCHG   6-ADD    7-ADC
 8-SUB    9-SBC    A-CMP    B-AND    C-OR     D-XOR    E-ASH    F-LSH
10-ROT   11-BIT   12-ADDQ  13-CMPQ  14-exp   15-exp   16-exp   17-exp
18-exp   19-Scc   1A-Ccc   1B-Icc   1C-Brnch 1D-Brnch 1E-exp   1F-exp
"exp" refers to expansion.
The "typical" instruction format (for opcodes $00..$11)
is
   15   14   13   12   11   10    9    8  |    7    6    5    4    3    2    1    0
    a    a    a    a    a    a    s    d  |    r    r    r    o    o    o    o    o
where
   a = addressing mode bits
   s = size (0=byte, 1=word)
   d = direction (0=to addressing mode loc, 1=from addressing mode loc)
   r = register
   o = opcode (one of the group values above).
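For anyone who wants to play with the encoding, here is a rough Python sketch that packs and unpacks the 16-bit instruction word using the field positions given above (a = bits 15..10, s = bit 9, d = bit 8, r = bits 7..5, o = bits 4..0). The `Fields`, `encode`, and `decode` names are invented for the example, and nothing here says how the two bytes would be ordered in memory, since the text doesn't specify that.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    mode: int       # six "a" bits: addressing mode, $00..$3F
    size: int       # "s" bit: 0 = byte, 1 = word
    direction: int  # "d" bit: 0 = to addressing mode loc, 1 = from it
    reg: int        # three "r" bits: register addressing mode, 0..7
    opcode: int     # five "o" bits: instruction class, $00..$1F


def encode(f):
    """Pack the fields into one 16-bit instruction word."""
    return ((f.mode << 10) | (f.size << 9) | (f.direction << 8)
            | (f.reg << 5) | f.opcode)


def decode(word):
    """Split a 16-bit instruction word back into its fields."""
    return Fields(
        mode=(word >> 10) & 0x3F,
        size=(word >> 9) & 0x1,
        direction=(word >> 8) & 0x1,
        reg=(word >> 5) & 0x7,
        opcode=word & 0x1F,
    )


# Class 0 (MOV), word size, register 0 (A) as source, mode $20 as destination.
word = encode(Fields(mode=0x20, size=1, direction=0, reg=0, opcode=0x00))
print(hex(word), decode(word))
```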
There are 64 possible addressing modes (since there are
six "a" bits). The register bits refer to the first
eight of these addressing modes (0..7).
 0- A          10- d,F       20- d,X       30- d,Y
 1- X          11- a,F       21- a,X       31- a,Y
 2- Y          12- n(d,F)    22- a,FX      32- a,FY
 3- S          13- n(a,F)    23- l,X       33- l,Y
 4- F          14- n[d,F]    24- (X)       34- (Y)
 5- TOS        15- n[a,F]    25- (d,X)     35- (d),Y
 6- Imm        16- (d,F)     26- n(d,X)    36- n(d),Y
 7- d          17- [d,F]     27- [d,X]     37- [d],Y
 8- a          18- (a,F)     28- n[d,X]    38- n[d],Y
 9- l          19- [a,F]     29- n(d,FX)   39- n(d,F),Y
 A- d,S        1A- P         2A- n[a,FX]   3A- n[a,F],Y
 B- (d,S),Y    1B- D         2B- [a,FX]    3B- [a,F],Y
 C- (d)        1C- ABR       2C- (a,X)     3C- (d),Y+
 D- [d]        1D- SBR       2D- n(a,X)    3D- (d),-Y
 E- n(d)       1E- DBR       2E- [a,X]     3E- [d],Y+
 F- n[d]       1F- PBR       2F- n[a,X]    3F- [d],-Y
where:
A, X, Y, S, F, P, D, ABR, SBR, DBR, and PBR are the
corresponding 65c820 registers.
Imm refers to an immediate operand.
d refers to an eight-bit value, usually (but not always)
a direct page address.
a refers to a 16-bit absolute address.
l refers to a 24-bit long address.
n is a displacement of the form
   one byte, +/- 64 if the H.O. bit is zero.
   two bytes, H.O. byte first, +/- 16383 if the H.O. bit
   is one.
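The displacement format leaves the sign representation unstated, so the following Python sketch has to guess. I'm assuming the H.O. bit of the first byte only selects the length, and that the remaining 7 or 15 bits are an ordinary two's complement number (which gives -64..+63 and -16384..+16383, close to the ranges quoted). The function name is hypothetical.

```python
def decode_displacement(data):
    """Decode a displacement at the start of `data`.

    Returns (value, length_in_bytes).
    Assumption (not stated in the text): the bits after the H.O. size
    bit are two's complement.
    """
    first = data[0]
    if first & 0x80 == 0:
        raw, bits, length = first & 0x7F, 7, 1
    else:
        # Two-byte form, H.O. byte first.
        raw, bits, length = ((first & 0x7F) << 8) | data[1], 15, 2
    if raw & (1 << (bits - 1)):   # sign bit of the 7- or 15-bit field
        raw -= 1 << bits
    return raw, length


print(decode_displacement(bytes([0x04])))        # short form
print(decode_displacement(bytes([0x81, 0x00])))  # long form
```

If the real encoding were sign-magnitude or biased instead, only the last three lines of the function would change; the length selection would stay the same.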
All addressing modes containing F, FX, or FY are relative
to the SBR register. Any "d" address appearing in such an
addressing mode is simply an 8-bit displacement relative
to the frame pointer. FX means add F and X and use the
result as the frame pointer. FY is the same, but using
the Y register.
Y+ and -Y are autoincrement and autodecrement addressing
modes. For autoincrement, the Y register is incremented
after the value contained in Y is used. For
autodecrement, the Y register is decremented before the value
is used.
Addressing modes of the form n[---]-- compute the
effective address specified by the indirect operation and
then add the specified offset to the effective address to
obtain the true effective address. For example, if Y
contains 5 and location $00 points at $1000 in the DBR,
then 4(0),y refers to location $1009.
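The arithmetic in that example can be checked with a few lines of Python. This is a simplified sketch: memory is a plain dict, the pointer is assumed to be stored low byte first (as on the 6502), and the D and DBR registers are taken to be zero so they drop out of the sum. The function name is mine.

```python
def ea_offset_indirect_y(mem, d, y, n, direct_page=0):
    """Effective address for the n(d),Y form.

    Steps, following the text: fetch the 16-bit pointer at direct page
    location d, apply the Y index, then add the displacement n.
    Assumption: bank registers are zero and are ignored here.
    """
    ptr_addr = direct_page + d
    pointer = mem[ptr_addr] | (mem[ptr_addr + 1] << 8)  # low byte first
    return pointer + y + n


# Location $00 points at $1000, Y contains 5, displacement is 4.
mem = {0x00: 0x00, 0x01: 0x10}
print(hex(ea_offset_indirect_y(mem, d=0x00, y=5, n=4)))
```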
The TOS addressing mode refers to the Top Of Stack, more
on this later.
General Instructions:
The general instructions (opcodes $00..$11) all take the
form:
Instr Source, Dest
Where Instr is the instruction mnemonic, Source is the
address of a source operand, and Dest is the address of a
destination operand. At least one of the two operands
must be a "register" addressing mode. The register
addressing modes are the first eight addressing modes
listed above. If the source operand is a register
addressing mode, then the direction bit in the
instruction is zero, otherwise it is one. If the source
addressing mode is the immediate addressing mode, the
flags are set by the result of the operation, but nothing
else is changed. For example, MOVB #0,#n sets the zero
flag since a zero bit is moved, but the zero isn't
actually moved anywhere. Note that a "B" or "W" suffix
is used on the mnemonics to specify the instruction size.
Three important register addressing modes greatly enhance
the capabilities of the 65c820 processor: the TOS, Imm,
and d register addressing modes. Since d is a register
addressing mode, any direct page memory location can be
used as a "register". This greatly enhances the
flexibility of the 65c820. This effectively gives you
256 registers to play around with.
The Imm addressing mode, since it is a register
addressing mode, lets you perform operations between any
register or memory location in the machine (addressable
by a single instruction) with an immediate operand. For
example, CMPB #5,2[3,D],Y is perfectly legal. For a
few instructions, immediate operands don't make much
sense, such instructions will cause an illegal
instruction trap (for example, you cannot load the
effective address of an immediate operand into a
register).
The TOS addressing mode is extremely powerful. If
you've looked ahead at the expansion instructions, you'd
have noticed that there aren't any specific push or pop
instructions (unless you count ENTER, EXIT, SAVE, and
RESTORE). The TOS addressing mode handles all of this
for you. You want to push the accumulator onto the
stack? No problem, MOVW A,TOS will do the job. You
want to pop the X register off of the stack? Use MOVW
TOS,X. You want to add the item on the top of stack to
the item below it on the stack (a VERY common operation
performed by compilers), just use ADDW TOS,TOS. This
instruction will pop two words off of the stack, add
them, and push their sum back onto the stack (leaving two
bytes on the stack rather than the original four). With
the TOS addressing mode, you can push (or pop) any value
anywhere in addressable memory onto the stack with a
single instruction.
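Here is a toy Python model of the TOS behaviour, mainly to show why no separate push and pop instructions are needed. It is a sketch under simplifying assumptions: the stack is a Python list of 16-bit words, only A, X, and Y are modelled, flags are ignored, and the class and method names are invented for the example.

```python
class Machine:
    """Toy model: reading TOS pops a word, writing TOS pushes one."""

    def __init__(self):
        self.regs = {"A": 0, "X": 0, "Y": 0}
        self.stack = []

    def read(self, operand):
        if operand == "TOS":
            return self.stack.pop()
        return self.regs[operand]

    def write(self, operand, value):
        value &= 0xFFFF               # word-sized operations only
        if operand == "TOS":
            self.stack.append(value)
        else:
            self.regs[operand] = value

    def movw(self, src, dst):
        self.write(dst, self.read(src))

    def addw(self, src, dst):
        a = self.read(src)            # for TOS,TOS this pops the top word
        b = self.read(dst)            # ...and this pops the one below it
        self.write(dst, a + b)        # the sum is pushed back


m = Machine()
m.regs["A"] = 3
m.movw("A", "TOS")      # push A
m.regs["A"] = 4
m.movw("A", "TOS")      # push A again
m.addw("TOS", "TOS")    # two words become one
m.movw("TOS", "X")      # pop into X
print(m.regs["X"], m.stack)
```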
Special (but not expansion) Instructions:
There are seven groups of instructions in this category:
ADDQ, CMPQ, Scc, Ccc, Icc, and the two branch groups.
ADDQ (add quick) appears in place of the ubiquitous INC
and DEC instructions. ADDQ lets you add a four-bit signed
value to any addressable item. The register bits, along
with the direction bit, let you specify a signed four-bit
value. This value is added to the specified address. The
immediate operand MUST be the source operand.
The CMPQ (compare quick) is similar except a compare
operation is performed rather than an addition.
Furthermore, the immediate operand is the destination
operand rather than the source operand.
The Scc (set on condition), Ccc (clear on condition), and
Icc (invert on condition) instructions are used to set
boolean values based on the condition codes. These go
hand in hand with the branch instructions so I'll
describe them all at once.
There are 16 possible conditions, the register and
direction bits are used to specify the condition. These
conditions are
0- RA/A    4- HI    8- GT    C- PL
1- CC/LO   5- LT    9- EQ    D- VC
2- CS/HS   6- GE    A- NE    E- VS
3- LS      7- LE    B- MI    F- SR/N
LO (lower)       = unsigned less than
HS (higher/same) = unsigned greater than or equal
LS (lower/same)  = unsigned less than or equal
HI (higher)      = unsigned greater than
LT               = signed less than
GE               = signed greater than or equal
LE               = signed less than or equal
GT               = signed greater than
The Scc, Ccc, and Icc instructions take the form:
   Scc{b|w} #Imm, Dest
   Ccc{b|w} #Imm, Dest
   Icc{b|w} #Imm, Dest
If the immediate operand is zero, then Scc will store a
one into the specified location if the condition code is
met, otherwise a zero will be stored. Ccc does just the
opposite, it stores a zero if the condition is met, one
otherwise. The Icc instruction will complement the
specified location (logical NOT) if the condition code is
met. If the immediate operand is not zero, the the Scc
in- struction will set the specified bits in the
destination operand if the condition code is met, the Scc
instruction will have no effect if the condition is not
met. The Ccc instruction clears the specified bits in the
destination operand. The Icc instruction inverts the
specified bits. The destination bits are specified with
ones in the immediate operand. For example, SCS #$88,
$00 will set bits three and seven in memory location zero
if the carry flag is set; location $00 will be
unaffected if the carry flag is clear.
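The Scc/Ccc/Icc rules above can be sketched in Python as follows. This is only an illustration of the byte-sized forms, not a definitive model. It assumes that Ccc and Icc, like Scc, leave the destination alone when a non-zero immediate is given and the condition is not met, and that "logical NOT" for Icc with a zero immediate means 0 becomes 1 and anything else becomes 0. The function names are hypothetical.

```python
def scc(dest, imm, cond):
    """Set on condition (byte form). Returns the new destination value."""
    if imm == 0:
        return 1 if cond else 0          # boolean result
    return (dest | imm) & 0xFF if cond else dest

def ccc(dest, imm, cond):
    """Clear on condition (byte form)."""
    if imm == 0:
        return 0 if cond else 1          # opposite of Scc
    return dest & ~imm & 0xFF if cond else dest   # assumption: no effect if not met

def icc(dest, imm, cond):
    """Invert on condition (byte form)."""
    if not cond:
        return dest
    if imm == 0:
        return 0 if dest else 1          # assumption: logical NOT of a boolean
    return (dest ^ imm) & 0xFF

# SCS #$88,$00 with the carry set turns on bits three and seven.
carry = True
print(hex(scc(0x00, 0x88, carry)))
```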
The SA/CA/IA (Set always, Clear always, Invert always)
instructions always perform the specified operation. The
SN/CN/IN (set never, clear never, invert never) behave as
though the condition code was not met.
The branch instructions are unusual compared to those
encountered thus far. The instruction is only one byte
long. It takes the form:
     7  6  5  4  3  2  1  0
     o  o  o  --1C or 1D--
If the opcode is $1C, then the three "o" bits represent
condition codes 0..7 above. Note that the BRA instruction
uses opcode bits %000.
If the opcode is $1D, then the three "o" bits represent
condition codes 8..$F above. There is no BN instruction;
opcode %111 is the BSR (branch to subroutine)
instruction.
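As a rough illustration of the one-byte branch encoding just described, the Python sketch below maps an opcode byte to a mnemonic. The mnemonic strings are formed by prefixing "B" to the first name of each condition in the table above; `decode_branch` is a hypothetical helper name.

```python
# Condition names indexed by condition code 0..$F (first name of each pair).
CONDITIONS = ["RA", "CC", "CS", "LS", "HI", "LT", "GE", "LE",
              "GT", "EQ", "NE", "MI", "PL", "VC", "VS", "SR"]

def decode_branch(opcode_byte):
    """Return the branch mnemonic for an opcode byte, or None if not a branch."""
    low5 = opcode_byte & 0x1F        # base opcode, $1C or $1D
    ooo = (opcode_byte >> 5) & 0x07  # the three "o" bits
    if low5 == 0x1C:
        cond = ooo                   # condition codes 0..7
    elif low5 == 0x1D:
        cond = 8 + ooo               # condition codes 8..$F
    else:
        return None
    return "B" + CONDITIONS[cond]

print(decode_branch(0x1C))  # %000 with $1C is BRA
print(decode_branch(0xFD))  # %111 with $1D is BSR
```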
Unlike the 65c816, branches are not limited to +/- 128
bytes. A displacement value, similar to that used by the
general addressing modes, allows a one-byte displacement
of +/- 64 bytes or a two-byte displacement of +/- 16383
bytes, more than enough for most cases.
Math expansion instructions:
The math expansion instructions (opcode $14) use the
three register bits as an opcode expansion, yielding eight
additional instructions. The instruction format is
     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
      a  a  a  a  a  a  s  d  o  o  o  1  0  1  0  0
where "aaaaaa" is a general addressing mode, "s" is the
size (B/W), "d" is the direction (load/store), and "ooo"
is the sub-opcode, decoded as follows:
0- MUL 1- DIV 2- MOD 3- REM 4- INDX 5- CHK
6- DIVS 7- FPexp
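To make the field layout concrete, here is a small Python sketch that splits a 16-bit math expansion instruction word into its fields. It works on the word as a number and makes no assumption about the order in which the two bytes sit in memory; `decode_math` is a hypothetical name.

```python
MATH_OPS = ["MUL", "DIV", "MOD", "REM", "INDX", "CHK", "DIVS", "FPexp"]

def decode_math(word):
    """Split a 16-bit math expansion instruction word into its fields."""
    if word & 0x1F != 0x14:
        raise ValueError("not a math expansion instruction (opcode $14)")
    return {
        "mode": (word >> 10) & 0x3F,       # aaaaaa: general addressing mode
        "size": (word >> 9) & 0x01,        # s: B/W
        "dir":  (word >> 8) & 0x01,        # d: direction
        "op":   MATH_OPS[(word >> 5) & 7], # ooo: sub-opcode
    }

# Addressing mode 5, size bit set, sub-opcode 4 (INDX).
word = (5 << 10) | (1 << 9) | (4 << 5) | 0x14
print(decode_math(word))
```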
Sub-opcode 7 is reserved for floating point expansion via
a coprocessor. If the FPC bit in the SYSPSW is not set,
then executing this opcode will cause an illegal
instruction trap. If the FPC bit is set, then an
additional eight bit opcode follows this instruction.
This opcode value plus the physical address provided by
the addressing mode, bounds registers, and applicable
prefix(es) are passed along to the coprocessor.
All of these instructions use the 65c820 accumulator
as the register operand. MULW performs an unsigned 16x16
multiply, leaving the 32-bit result in A, AX. MULB
performs an unsigned 8x8 multiply, leaving the result in
A. DIVW performs an unsigned 32/16 division. The value
in (A,AX) is divided by the specified operand and the
quotient is left in (A,AX). DIVB divides the 16-bit
accumulator by an eight bit value, leaving the result in
A. DIVS{W|B} perform signed divisions. These two
instructions operate on the 16-bit accumulator or 8-bit
accumulator ONLY. The AX register is not used. MOD and
REM compute the modulo and remainder functions (MOD is
unsigned, REM is signed). Their register usage is
identical to DIV/DIVS. There is no need for a signed
multiply instruction since signed and unsigned
multiplication produce the same result, assuming you
ignore the value in AX.
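The claim about signed multiplication is easy to check with a short Python sketch. It assumes A receives the low word of the product and AX the high word, which is what the remark about ignoring AX implies; the helper names are hypothetical.

```python
def mulw(a, operand):
    """Unsigned 16x16 multiply. Returns (A, AX) = (low word, high word)."""
    product = (a & 0xFFFF) * (operand & 0xFFFF)
    return product & 0xFFFF, product >> 16

def to_signed16(value):
    """Reinterpret a 16-bit pattern as a two's complement integer."""
    return value - 0x10000 if value & 0x8000 else value

# -2 * 3 done as an unsigned multiply: the low word is still -6.
a, ax = mulw(-2 & 0xFFFF, 3)
print(hex(a), to_signed16(a), hex(ax))
```

The high word differs between the signed and unsigned interpretations, so the equivalence holds only as long as the result fits in 16 bits and AX is discarded.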
The INDX and CHK instructions are used to perform
array computations. The operand of these two
instructions points at a pair of bytes or words. The
INDX instruction multiplies the accumulator by the first
value and then adds the second value to the accumulator.
The direction bit in the opcode is ignored. The INDX
instruction takes two forms: INDXB and INDXW.
The CHK instruction compares the value in the
accumulator against the first and second values. If the
accumulator lies within these two values (inclusive) then
the overflow flag is cleared. If the accumulator is
outside the range of these two values, then the overflow
flag is set. The direction flag in the opcode is used to
determine whether a signed or unsigned comparison is
used. The CHK instruction takes four forms: CHKSB, CHKSW,
CHKUB, and CHKUW. The "U" and "S" specify unsigned or
signed.
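A minimal Python sketch of INDX and CHK follows, assuming the pair of values is ordered (multiplier, addend) for INDX and (low bound, high bound) for CHK. The bound order is an assumption, since the text only says "first and second values". The function names are hypothetical.

```python
def indx(acc, first, second, bits=16):
    """INDX: multiply the accumulator by the first value, add the second."""
    return (acc * first + second) & ((1 << bits) - 1)

def chk(acc, low, high):
    """CHK: return the overflow flag, 0 if low <= acc <= high, else 1."""
    return 0 if low <= acc <= high else 1

# Row-major offset of element [3][4] in an array with 10 columns:
# start with the row index in the accumulator, then apply (10, 4).
offset = indx(3, 10, 4)
print(offset, chk(offset, 0, 99))
```

For the signed forms (CHKSB, CHKSW) the caller would pass values already interpreted as signed integers; for the unsigned forms, as unsigned.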
String expansion instructions:
Opcode $15 is used for string operations. The 65c820
processor provides four basic string operations: MOVS
(move), CMPS (compare), XLATS (translate), and FILLS
(fill). The instruction format is as follows:
     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
      s  s  s  d  d  d  l  l  l  o  o  1  0  1  0  1
where "sss" is the source address, "ddd" is the
destination address, "lll" is the length address, and
"oo" is the opcode. Since all of the addresses are three
bits, they must be register addresses. The source
address is a sixteen bit value taken from one of the
register addressing modes. The sixteen-bit value
obtained at said address is the start of the string
within the data bank (i.e., relative to the DBR
register). The destination address is also a sixteen-bit
register addressing mode value, specifying the start of
the destination address within the auxiliary bank (i.e.,
relative to the ABR register). The length value is a
sixteen-bit quantity obtained directly from the register
addressing mode location. Prefix bytes (described later
on) are not allowed in front of a string instruction.
Opcode assignments:
0- MOVS 1- CMPS 2- XLATS 3- FILLS
The direction of the string operation is specified by
the "dir" bit in the USRPSW register. If the bit is
clear, then the source and destination operands are
incremented after each string operation. If the "dir"
bit is set, then these operands are decremented after
each string operation.
The string instructions take the form:
MNEMONIC src, dest, len
where src, dest, and len are any of A, X, Y, S, F, TOS,
#value, or a direct page address. For these
operands, the sixteen bit value specified by one of these
addresses is used, relative to the DBR, as the address
(or length) of the specified block. An absolute address
can be specified by an immediate operand. The direct
page address is the address of the 16-bit value within
the direct page, it does not mean that the address of the
block is that address in the direct page. Same with the
TOS, the value on the top of stack contains the address,
the top of stack is not the block itself. The len
operand is always a byte count. Unless an immediate
operand is specified, the operands are always updated to
reflect their new value at the termination of the block
operation.
The MOVS instruction is used to move a string of
bytes from one location to another. A block of "len"
bytes specified by DBR/src is moved to ABR/dest.
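The MOVS behavior can be sketched in Python as below. This is a simplified model and rests on a few assumptions: a bank register and a 16-bit offset combine into a 24-bit address as on the 65c816, offsets wrap within the bank, and the length operand counts down to zero. `movs` and its parameter names are hypothetical.

```python
def movs(mem, dbr, abr, src, dest, length, dir_bit=0):
    """Move `length` bytes from DBR/src to ABR/dest.

    Returns the updated (src, dest, length) operands.
    """
    step = -1 if dir_bit else 1          # "dir" clear: increment, set: decrement
    while length:
        mem[(abr << 16) | dest] = mem[(dbr << 16) | src]
        src = (src + step) & 0xFFFF      # assumption: offsets wrap within the bank
        dest = (dest + step) & 0xFFFF
        length -= 1
    return src, dest, length

mem = bytearray(0x20000)                 # banks 0 and 1
mem[0x1000:0x1004] = b"ABCD"
print(movs(mem, 0, 1, 0x1000, 0x2000, 4))
print(bytes(mem[0x12000:0x12004]))
```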
The MOVS operation is an example of an instruction
that does not exactly mirror its 65c816 counterpart. It
may take two (or more) instructions to perform the same
operation as the 65c816 MVN and MVP instructions, since
the direction flag may require adjustment before
performing a MOVS instruction. Furthermore, the ABR and
DBR registers may need adjustment before and after the
MOVS instruction to simulate the MVN and MVP
instructions. Finally, the actual count is specified by
length, not count-1 (as on the 65c816), so this may
require some adjustment if you are translating 65c816
code instruction by instruction.
Example:
     MVN 0,1
can be simulated by
     MOVW #0,DBR
     MOVW #1,ABR
     ADDQ #1,A      ;Since MVN assumes A contains count-1
     MOVS X,Y,A
     MOVW #1,DBR
The CMPS operation compares the two specified
strings. It does a byte by byte comparison until length
bytes are compared or a character in the source string is
not equal to the corresponding character in the
destination string. The condition codes are set to
reflect the ordinality of the two strings (so you can use
any of the branch, Scc, Ccc, or Icc instructions to test
the results). If the z flag is returned set, then the two
strings are equal (through the specified length),
otherwise the source and destination operands are updated
to point at the differing chars and the length operand is
updated to show the number of characters processed thus
far (assuming, of course, that these operands weren't
immediate, in which case they would be ignored).
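A rough Python sketch of CMPS, under the same memory assumptions as a 65c816-style 24-bit bank:offset address. Only the Z flag is modeled here; the other condition codes are omitted. It also assumes that "number of characters processed" counts the matching characters before the first mismatch. `cmps` is a hypothetical name.

```python
def cmps(mem, dbr, abr, src, dest, length, dir_bit=0):
    """Compare DBR/src with ABR/dest for up to `length` bytes.

    Returns (z_flag, src, dest, processed).
    """
    step = -1 if dir_bit else 1
    processed = 0
    while processed < length:
        if mem[(dbr << 16) | src] != mem[(abr << 16) | dest]:
            # src and dest are left pointing at the differing characters.
            return False, src, dest, processed
        src = (src + step) & 0xFFFF
        dest = (dest + step) & 0xFFFF
        processed += 1
    return True, src, dest, processed

mem = bytearray(0x20000)
mem[0x0100:0x0105] = b"HELLO"
mem[0x10200:0x10205] = b"HELPS"
print(cmps(mem, 0, 1, 0x0100, 0x0200, 5))
```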
The XLATS instruction is used to translate values in
a string. The source operand points at a table in the
DBR. Each character in the dest string is used as an
index into this table and the value fetched from the
table is stored over the original character in the
destination string.
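XLATS amounts to a table lookup applied in place. The Python sketch below assumes forward direction and the same 24-bit bank:offset addressing as the 65c816; `xlats` is a hypothetical name.

```python
def xlats(mem, dbr, abr, src, dest, length):
    """Translate `length` bytes at ABR/dest through the table at DBR/src."""
    for i in range(length):
        addr = (abr << 16) | ((dest + i) & 0xFFFF)
        char = mem[addr]
        # Each character indexes the table; the result overwrites it.
        mem[addr] = mem[(dbr << 16) | ((src + char) & 0xFFFF)]

mem = bytearray(0x20000)
for i in range(256):                     # identity table at bank 0, $0300
    mem[0x0300 + i] = i
for c in range(ord("a"), ord("z") + 1):  # except lower case maps to upper case
    mem[0x0300 + c] = c - 32
mem[0x10010:0x10013] = b"abc"
xlats(mem, 0, 1, 0x0300, 0x0010, 3)
print(bytes(mem[0x10010:0x10013]))
```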
The FILLS instruction is used to initialize a string
with a fixed value. The source operand is an eight-bit
value. It is stored in successive locations at ABR/dest
for len bytes. If an immediate value is specified, a
sixteen-bit value is encoded into the instruction, but
only the L.O. eight bits are used.
Single byte expansion instructions:
These instructions take the form:
     7  6  5  4  3  2  1  0
     o  o  o  1  0  1  1  0
Where "ooo" is decoded as:
0- NOP 1- COP 2- BRK 3- SVC 4- RTS 5- RTL 6- RTI 7- EXIT
SVC is the "supervisor call" instruction. Its
intended use is for making operating system calls. It is
similar in function to the COP instruction.
EXIT is used to deallocate local variables in a
procedure. It undoes the actions of the ENTER
instruction. Basically it performs the following
operations:
     MOV F,S
     MOV TOS,F
The remaining instructions in this group are
identical to their 65c816 counterparts, so they don't
require any further elaboration.
Single byte w/displacement expansion instructions:
These instructions take the form:
     7  6  5  4  3  2  1  0
     o  o  o  1  0  1  1  1
Where "ooo" is decoded as:
     0- SAVE n       4- RTS n
     1- RESTORE n    5- RTL n
     2- reserved     6- ADJSP n
     3- reserved     7- ENTER n
The "n" value immediately following these
instructions is a displacement value. If bit seven of
the first byte following the opcode is zero, then the
remaining seven bits are used to specify a signed value in
the range +/- 64. If bit seven is one, then the
following 15 bits are used to specify a value in the
range +/-16383. Except possibly for ADJSP, none of these
instructions should ever require more than a single byte
displacement.
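The displacement format can be sketched in Python as follows. It assumes two's complement fields: seven bits below the flag in the short form (which gives the stated range of roughly +/- 64) and fifteen bits in the long form. It further assumes the first byte carries the high-order bits of the long form. Neither the signed representation nor the byte order is spelled out above, so treat both as guesses. `decode_displacement` is a hypothetical name.

```python
def decode_displacement(code, pos=0):
    """Decode a displacement starting at code[pos].

    Returns (value, number_of_bytes_consumed).
    """
    first = code[pos]
    if first & 0x80 == 0:
        value = first & 0x7F             # short form: 7-bit signed field
        if value & 0x40:
            value -= 0x80
        return value, 1
    # Long form: 15-bit signed field, high bits assumed to come first.
    value = ((first & 0x7F) << 8) | code[pos + 1]
    if value & 0x4000:
        value -= 0x8000
    return value, 2

print(decode_displacement(bytes([0x05])))
print(decode_displacement(bytes([0x80, 0x64])))
```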
SAVE is used to quickly push registers from the set
[A,AX,X,Y,F,D,P] onto the stack. The instruction is
followed by a single byte with bits 0..6 corresponding
to these registers. Bit seven must always be zero.
RESTORE does just the opposite of SAVE, it pops the
specified registers off of the stack.
RTS n and RTL n perform the specified return from
subroutine operations and then add the specified
displacement to the stack pointer after the return
address has been popped. This provides a convenient
mechanism whereby parameters can be removed from the
stack.
The ADJSP n instruction adds the displacement value
to the stack pointer. This is a shorter version of the
ADD #value,S instruction. A special case was created for
this instruction because it gets used all the time in
languages like "C" or "SDL/65" which allow a variable
number of parameters.
The ENTER n instruction is used to set up an
activation record when a procedure is initially entered.
It performs the following operations:
     MOVW F,TOS
     MOVW S,F
     ADJSP n
The EXIT instruction can be used to undo the effects of
this instruction.
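The ENTER/EXIT pairing can be illustrated with a toy Python model. It is deliberately simplified: the stack grows downward, S is taken to point at the top word rather than the next free byte, and locals are reserved with a negative displacement. The `Cpu` class and its method names are hypothetical.

```python
class Cpu:
    """Toy model holding only the stack pointer S and frame pointer F."""

    def __init__(self, s=0x0200, f=0x0000):
        self.s, self.f = s, f
        self.mem = {}                    # sparse word storage keyed by address

    def push_word(self, value):
        self.s -= 2                      # assumption: stack grows downward
        self.mem[self.s] = value & 0xFFFF

    def pop_word(self):
        value = self.mem[self.s]
        self.s += 2
        return value

    def enter(self, n):
        self.push_word(self.f)           # MOVW F,TOS
        self.f = self.s                  # MOVW S,F
        self.s += n                      # ADJSP n (negative n reserves locals)

    def exit(self):
        self.s = self.f                  # MOV F,S
        self.f = self.pop_word()         # MOV TOS,F

cpu = Cpu(s=0x0200, f=0x1234)
cpu.enter(-8)                            # reserve eight bytes of locals
print(hex(cpu.s), hex(cpu.f))
cpu.exit()
print(hex(cpu.s), hex(cpu.f))
```

After EXIT both registers are back to their values before ENTER, which is the property that lets the pair bracket a procedure body.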
Prefix expansion instructions:
These instructions take the form:
     7  6  5  4  3  2  1  0
     o  o  o  1  1  0  0  0
where "ooo" is decoded as:
     0- ABR prefix
     1- SBR prefix
     2- PBR prefix
     3- word index prefix
     4- dword index prefix
     5- qword index prefix
     6- XBA/SWA
     7- EMU
XBA and EMU aren't true prefix bytes, they're just
single byte instructions that didn't conveniently fit
anywhere else. So I'll describe them first. XBA is
identical to its 65c816 counterpart, it swaps the bytes
in the accumulator. EMU switches from 65c820 native mode
to 65c816 emulation mode. EMU is a privileged
instruction and will cause a privileged instruction trap
if executed from the user mode.
The first three prefix bytes are used to modify the
bank used for data accesses. Addressing modes that
normally access memory through the data bank register
(which are all memory references except direct, long,
TOS, and those involving F) can be "tweaked" to access
memory through the auxiliary, stack, or program bank
registers by prefixing the address with the appropriate
prefix. For example,
MOVW #275, ABR:$1000
stores 275 into location $1000 in the bank selected by the
auxiliary bank register rather than the one selected by
the data bank register. Indirect
addresses of the form (a,X) and n(a,X) present a minor
problem. Does the prefix specify the bank address of the
absolute operand or the effective address? I've opted
for requiring that the absolute operand reside in the
data bank and the prefix byte determines the effective
address bank.
Any addressing mode utilizing the frame pointer
register (F) is always relative to the stack bank
register. Prefixes are only allowed for the following
frame-based addressing modes: n(d,F), n(a,F), (d,F),
(a,F), n(d,FX), and n(d,F),Y. The indirect address
always comes out of the stack bank, the prefix applies to
the computed effective address.
Although the ABR:/SBR:/PBR: lexemes immediately
precede the address expression to which they apply (on
the source line), in the object code, the prefix byte
always precedes the instruction to which the prefix
applies. If more than one prefix byte precedes an
instruction, only the last one is used. If a prefix byte
precedes an instruction to which the prefix doesn't make
sense (a branch, for example), then the prefix byte is
ignored. Finally, the prefix byte will be ignored if
there isn't an applicable addressing mode in the current
instruction. E.g.:
     byt  $18       ;ABR prefix byte
     MOVW A,X       ;ABR prefix has no meaning here.
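The byte values of the prefix group follow directly from the bit layout (three "o" bits above the fixed pattern %11000). A short Python sketch, with hypothetical names for the table entries:

```python
PREFIXES = ["ABR", "SBR", "PBR", "WORD", "DWORD", "QWORD", "XBA", "EMU"]

def prefix_byte(name):
    """Return the opcode byte for an entry of the prefix group."""
    return (PREFIXES.index(name) << 5) | 0x18

for name in PREFIXES:
    print(name, hex(prefix_byte(name)))
```

The ABR prefix comes out as $18, matching the "byt $18" line in the example above.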
Three additional prefix bytes apply to the X and Y
index registers. These are the word index prefix, dword
index prefix, and qword index prefix. These prefix bytes
provide scaled indexed addressing modes for the 65c820.
Without one of these prefixes, the X and Y registers are
always byte offsets. That is, when used as an index
register, the contents of X or Y is added directly to the
effective address being computed. When accessing words,
pointers (double words), or eight byte values (e.g.,
floating point) you have to manually adjust the index
registers by a factor of 2, 4, or 8. The scaled index
addressing prefix bytes let you avoid this problem. The
word prefix multiplies the X or Y register value by two
before using it in the effective address computation.
Likewise, the dword and qword prefixes multiply X or Y by
4 or 8 before using the value. In the source code, these
prefix bytes are specified by the ":W", ":D", and ":Q"
suffixes:
     MOVW A,LBL,X:W
     MOVW $0,(PTR),Y:D
     MOVW $2,2(PTR),Y:D
     MOVW F,(TBL,X:W)
     MOVW A,LBL,Y:Q
If multiple prefixes appear, only the last one is used.
If the prefix doesn't apply to the next instruction, it
is ignored.
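The effect of the scaling suffixes on the effective address can be sketched in Python. The sketch assumes a simple base-plus-index mode and that the result wraps at 16 bits within the bank; `indexed_address` is a hypothetical name.

```python
SCALE = {None: 1, "W": 2, "D": 4, "Q": 8}   # no suffix, :W, :D, :Q

def indexed_address(base, index, suffix=None):
    """Offset of base,X (or base,Y) with an optional scaling suffix."""
    return (base + index * SCALE[suffix]) & 0xFFFF

# Element 3 of a table at $1000, as bytes, words, dwords, and qwords.
for suffix in (None, "W", "D", "Q"):
    print(suffix, hex(indexed_address(0x1000, 3, suffix)))
```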
Single operand expansion instructions:
The $1E expansion instructions are dedicated to
instructions which require a single operand. The format
for the opcodes is as follows:
     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
      a  a  a  a  a  a  s  o  o  o  o  1  1  1  1  0
where "aaaaaa" is a general addressing mode, "s" is the
size (B/W), and "oooo" is one of the following opcodes:
     0- NOT                    8- LAX (load AX register)
     1- NEG                    9- SAX (store AX register)
     2- ABS                    A- XAX (exchange AX register)
     3- BOOL (0->0 else->1)    B- LLB (load LBound register)
     4- SEX                    C- LHB (load HBound register)
     5- ZEX                    D- SLB (store LBound register)
     6- JMP                    E- SHB (store HBound register)
     7- JSR                    F- VAL (validate memory location)
All of these instructions are followed by a single
general address expression. Immediate operands are not
allowed for any of these instructions.
NOT- logically complements the specified value.
NEG- takes the two's complement of the specified value.
ABS- takes the absolute value of the specified location.
BOOL- If the specified location is not zero, a one is stored
into it.
SEX- (that's sign extension, not what you think). SEXB
checks the high order bit of the specified byte and
copies it into the H.O. byte of the corresponding
address. For example, if X contains $0082 then SEXB X
will store $FF82 into X. If X contains $0002, then SEXB X
will store $0002 into X. SEXW sign extends the
specified location into the AX register.
ZEX- zero extends the specified value. ZEXW simply
stores a zero into AX. ZEXB stores a zero into the H.O.
byte of the specified word.
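A quick Python sketch of the extension instructions, using the examples above as a check. The function names mirror the mnemonics but are otherwise hypothetical; the word forms return the operand together with the value that would land in AX.

```python
def sexb(word):
    """SEXB: copy bit 7 of the low byte through the high-order byte."""
    low = word & 0x00FF
    return low | 0xFF00 if low & 0x80 else low

def sexw(word):
    """SEXW: return (word, AX) with AX filled from bit 15."""
    word &= 0xFFFF
    return word, 0xFFFF if word & 0x8000 else 0x0000

def zexb(word):
    """ZEXB: zero the high-order byte of the word."""
    return word & 0x00FF

def zexw(word):
    """ZEXW: return (word, AX) with AX cleared."""
    return word & 0xFFFF, 0x0000

print(hex(sexb(0x0082)), hex(sexb(0x0002)))
```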
JMP and JSR are like their 65c816 counterparts except any
valid addressing mode can be used. Note that, unlike
most other instructions, the result is assumed to be in
the current program bank unless a long addressing mode is
specified.
LAX, SAX, and XAX allow you to load, store, and exchange
the contents of the AX register. Note that these three
instructions plus SEX, ZEX, MUL, DIV, and MOD are the
only instructions that deal with the AX register.
LLB, LHB, SLB, and SHB let you load and save the contents
of the bounds registers. These are privileged
instructions which will cause a privilege trap if
executed from the user mode.
VAL- This instruction is used to validate a memory
location. That is, it tests the specified memory
location to see if it lies within the range specified by
the bounds register. The address is a physical address,
not a translated address. The overflow flag is set if a
bounds violation would occur. Note that the M bit in the
SYSPSW need not contain a particular value when using
this instruction. This is a privileged instruction which
will cause a privilege violation if executed in the user
mode.
BIT expansion instructions:
These instructions take the form:
     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
      a  a  a  a  a  a  s  o  o  o  o  1  1  1  1  1
"aaaaaa" is the destination addressing mode. "s" is the
size (applicable only to MAND, MOR, and MXOR). "oooo" is
the sub-opcode, decoded as:
     0- INS dest, start, len
     1- EXT dest, start, len
     2- FFS dest, start, len
     3- FFC dest, start, len
     4- MAND dest, mask
     5- MOR dest, mask
     6- MXOR dest, mask
     7..F- reserved.
INS is used to insert a value into a bit field. The
value in the accumulator is shifted to the left "start"
bits and the "len" following bits are stored into
the specified memory location. For example, if memory
location $00 contains $F0 and the accumulator contains
$3, then INS $00,2,4 would leave location $00 containing
$CC. Note that you needn't specify byte or word size as
this is intrinsic from the length.
EXT- extracts a bit field from some location and stores
the right justified value into the accumulator (zeroing
out any unused bits). For example, if memory location
$00 contains $CC and the accumulator contains $FFFF, then
EXT $0,2,4 would leave the accumulator containing 3 and
location $00 containing $CC.
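The two worked examples can be reproduced with a few lines of Python. This sketch models INS and EXT as plain shift-and-mask operations on integers; the helper names are hypothetical.

```python
def field_mask(start, length):
    """Mask with `length` one bits beginning at bit `start`."""
    return ((1 << length) - 1) << start

def ins(dest, acc, start, length):
    """INS: insert the low `length` bits of acc into dest at bit `start`."""
    mask = field_mask(start, length)
    return (dest & ~mask) | ((acc << start) & mask)

def ext(src, start, length):
    """EXT: extract a right-justified bit field from src."""
    return (src >> start) & ((1 << length) - 1)

print(hex(ins(0xF0, 0x3, 2, 4)))   # INS $00,2,4 with $00 = $F0, A = $3
print(ext(0xCC, 2, 4))             # EXT $0,2,4 with $00 = $CC
```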
FFS finds the first set bit in the specified location.
The bit position is returned in the accumulator. If
there were no set bits, the accumulator contains "len"+1.
FFC finds the first clear bit in a manner identical to
FFS.
Some notes: These four instructions are followed by a
single byte. The low order four bits contain the start
value, the high order four bits contain the length-1.
"start" + "len" must always be less than or equal to 15.
FFS and FFC use the direction bit in the USRPSW to
determine which way to progress in the bit field when
searching for the set or clear bit.
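The byte that follows INS, EXT, FFS, and FFC packs both parameters. Here is a minimal Python sketch of that packing, enforcing the stated limit on "start" + "len"; `encode_field` and `decode_field` are hypothetical names.

```python
def encode_field(start, length):
    """Pack start (low four bits) and length-1 (high four bits) into a byte."""
    if start < 0 or length < 1 or start + length > 15:
        raise ValueError("start + len must be less than or equal to 15")
    return ((length - 1) << 4) | start

def decode_field(byte):
    """Unpack the byte into (start, length)."""
    return byte & 0x0F, (byte >> 4) + 1

print(hex(encode_field(2, 4)), decode_field(0x32))
```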
The MAND, MOR, and MXOR (masked AND, OR, and XOR) will
AND, OR, or XOR the accumulator into the specified memory
location. The difference between these three
instructions and the standard AND, OR, and XOR
instructions is that they are followed by a byte or word
(depending on the instruction size) which contains a mask
for the operation. Wherever a one bit appears in the
mask, the logical operation will take place, wherever a
zero bit appears, the destination's bits will be
unaffected.
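The masked operations can be modeled in Python as an ordinary logical operation whose result is merged back into the destination only where the mask has ones. The names below are hypothetical.

```python
import operator

def masked(op, dest, acc, mask):
    """Apply op(dest, acc) only in bit positions where mask is one."""
    return (dest & ~mask) | (op(dest, acc) & mask)

def mand(dest, acc, mask):
    return masked(operator.and_, dest, acc, mask)

def mor(dest, acc, mask):
    return masked(operator.or_, dest, acc, mask)

def mxor(dest, acc, mask):
    return masked(operator.xor, dest, acc, mask)

# Clear only the low nibble of $FF by ANDing with zero under mask $0F.
print(hex(mand(0xFF, 0x00, 0x0F)))
```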
___________________________________________________________________________
That wraps up my proposed instruction set for the 65c820.
I'll be happy to discuss my design decisions with anyone
who's interested. The next step is to try and convince
someone to actually build this thing! In the mean time,
I might try writing an interpreter and assembler for it.
By the way, many of you have probably recognized certain
instructions from this processor or that processor
sprinkled throughout. To set the record straight, most
of my ideas have come from my own frustrations with the
65c816, the 8086 family, and the National Semiconductor
32000 family. Despite the fact that a lot of you think
that Intel's parts stink because they're used by IBM,
don't let that prejudice you against many of the design
issues here. The 8086 does have a reasonable
architecture, given the compromises it had to face. It's
certainly better than the 65c816. I've incorporated a
lot of the better ideas (like segment prefixes) into the
design of the 65c820. Once again, don't downplay these
powerful features just because you don't like IBM.
*** Randy Hyde