[comp.sys.apple] Apple Technical note 2

delaney@wnre.aecl.CDN.UUCP (06/11/87)

This note covers program conversion to the GS

Apple //GS Technical Note #2
      Transforming Input/Output Subroutines for use in "Native" Mode.
 Written by:      Pete Mc Donald              10/86
 ________________________________________________________________________
 This note outlines a number of techniques, useful in the transformation 
 of Apple // I/O subroutines for use in the "Native" Apple //GS 
 environment. Note: This note is only of interest to those who are 
 "converting" applications. If you intend to let your application remain 
 a "classic" Apple // application, then this information can be ignored. 
 ________________________________________________________________________
 The Apple //GS execution environment represents quite a departure from 
 the environment to which the average Apple // developer is accustomed.  
 This fact results in a number of unique problems when one attempts to 
 convert existing Apple // applications for use in the "native" Apple 
 //GS environment. 
 Some of the biggest conversion problems are with I/O subroutines that 
 contain critically timed code. The problems stem from two major issues. 
 Number one is that in the "Native" GS environment, one cannot guarantee 
 that there will be memory available in a given bank.  Number two is 
 that I/O is not available in every bank.
 There are actually a number of possible solutions to this problem. 
 Which ones you should use will depend on what exactly the program in 
 question is doing. In this note I will attempt to describe some of the 
 problem situations, and possible solutions.
 Examine the following "6502" code segment. It serves no useful purpose, 
 other than to illustrate a simple manifestation of the problem.  Assume 
 IoLoc is a location in the CXXX range of memory.
      Loop        Lda        IoLoc
                  Dey
                  Bpl        Loop
 This fragment, if placed anywhere in bank 2 through the highest 
 available bank, will have no effect on the I/O device it is intended 
 to control, because in those banks CXXX contains RAM, not I/O 
 circuitry. 
 There are two possible solutions in this case. One is to change the 
 instruction Lda IoLoc so that it uses "Long" addressing, thereby 
 forcing the CPU to reference the proper bank. The problem with this 
 is that the "long" version of Lda requires an extra CPU cycle to 
 execute. If the code segment is timing critical, then this is likely to 
 be unacceptable.
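 As a rough sketch, the long form of the loop might look like the 
 listing below. It assumes an APW/ORCA-style assembler, in which a 
 leading ">" on the operand forces long (24-bit) addressing; other 
 assemblers spell this differently. It also assumes an 8-bit 
 accumulator and that IoLoc has been equated to a full 24-bit address 
 in a bank that does contain I/O, such as bank $E0.

      Loop        Lda        >IoLoc      ; long: 5 cycles (absolute is 4)
                  Dey
                  Bpl        Loop

 The instruction also grows from three bytes to four, which matters if 
 the surrounding code depends on its own length.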
 Alternatively, and ideally in the timing-critical case, we could 
 instead set the data bank register before we enter the loop. The 
 effect of this is that the instruction Lda IoLoc would take the same 
 number of cycles as it did before, and the loop timing would not 
 change.
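 A minimal sketch of the data bank approach follows. It assumes the 
 code runs with an 8-bit accumulator (and that the assembler has been 
 told so), that bank $E0 is the I/O bank wanted, and that IoLoc is the 
 ordinary 16-bit CXXX address. Saving and restoring the caller's data 
 bank is an addition for illustration, not something the loop itself 
 requires.

                  Phb                    ; save the caller's data bank
                  Sep        #$20        ; 8-bit accumulator (assumed)
                  Lda        #$E0        ; a bank that contains I/O
                  Pha
                  Plb                    ; data bank register = $E0
      Loop        Lda        IoLoc       ; absolute: 4 cycles, unchanged
                  Dey
                  Bpl        Loop
                  Plb                    ; restore the caller's data bank

 All of the setup cost sits outside the loop, so the timed portion is 
 byte-for-byte and cycle-for-cycle the same as the 6502 original.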
 Well, this seems pretty easy! Unfortunately, most code is not isolated 
 like that in the example. Specifically, code will also commonly try to 
 load or store to some location in memory other than the I/O location at 
 the same time that it is trying to access the I/O location.
 Take for example the following fragment:
      Loop        Lda        Data,y
                  Sta        IoLoc
                  Dey
                  Bpl        Loop
 In this case, we assume that the label "Data" refers to some kind of 
 table that normally resides in the same bank as the program. Now the 
 problem is that if the databank register is set to access I/O 
 locations, then obviously the reference to "Data" will end up 
 referencing the same bank as the I/O. This is not likely to be 
 acceptable. One thing that can be done is to move the data table to 
 the direct page (zero page for 6502 programmers). Then the problem is 
 that the instruction Lda Data,y will take one less cycle than before.  
 There is a solution, although it is a little complicated. If we set the 
 direct page register to a "non page-aligned" location, then a 1 cycle 
 penalty is applied to all direct page references, and at least for this 
 example, we have solved our problem. 
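 The listing below sketches one way to set up such a direct page. 
 DpBase and Value are hypothetical names: DpBase is assumed to be a 
 page-aligned address in bank 0, and Value a nonzero offset (1-255) 
 within that page. Native mode is assumed, and the assembler 
 directives that track register width are omitted. The "<" operand 
 prefix, which forces direct page addressing, is APW/ORCA-style 
 syntax.

                  Phd                    ; save the caller's direct page
                  Rep        #$20        ; 16-bit accumulator
                  Lda        #DpBase+1   ; one byte past a page boundary
                  Tcd                    ; direct page now starts at XX01
                  Sep        #$20        ; back to an 8-bit accumulator
                  Lda        <Value-1    ; 3 cycles + 1 penalty = 4
                  Pld                    ; restore the caller's direct page

 Because the direct page now begins one byte higher, the operand is 
 written as Value-1 so that it still reaches the same byte in memory.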
 However, nothing is ever that simple. What happens to references to 
 other direct page locations that expect to operate without the one 
 cycle penalty? To really address this question would take a lot more 
 space than I have here. So, in lieu of further examples, I will just 
 give some general information. 
 As an aside, I have used these techniques to transform the old "Apple 
 // Disk // formatter module,"  for use in any bank of memory in the 
 native GS environment. This was accomplished using, almost exclusively, 
 editor find and replace commands, and was done in a matter of hours 
 instead of the days that would have been required for a complete 
 rewrite of the program.
 In addition to the techniques already mentioned, there were a couple of 
 other things that ended up being necessary to complete the 
 transformation. 
 As I mentioned above, one problem that comes up is what to do when you 
 have a program that references I/O, local "program-bank" data, and 
 zero page. In this case, there are a couple of possible situations 
 that may require significant rewrites, though not necessarily. 
 (Diagram missing)
 In the case of the disk formatter, it turned out that some modules used 
 both normal zp addressing, and normal 16 bit absolute indexed. Since 
 the transformation process dictates that we change 16 bit absolute 
 addressing to direct page addressing with a non page aligned direct 
 page, there could have been a problem, had both uses of the direct page 
 been timing critical.  Fortunately, by treating each module of the 
 program separately, it worked out that when I needed both types, only 
 one was timing critical. The solution was to set the direct page to a 
 non-page-aligned value in some modules and to a page-aligned value in 
 others. There are also some minor logistical issues about having a 
 direct page whose base address can be at either XX00 or XX01, the 
 biggest of which is keeping track of which is in effect at a given 
 point, and knowing to reference the label as either label, label +1, 
 or label -1, depending on the particular case.
 One last note. In the case of the formatter conversion, there was one 
 other fairly major issue. The problem is that there are not direct page 
 versions of all the modes of 16 bit absolute. For example, one cannot 
 convert 16bitaddress,Y to 8bitaddress,Y. In the case of the formatter, 
 I was able to deal with it by reversing all the register use (i.e. all 
 ldy become ldx, and all sty become stx, etc.).
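 Applied to the earlier Lda Data,y loop, the register reversal might 
 look like the sketch below. It assumes Data has been moved into the 
 direct page, that the data bank register already selects an I/O bank, 
 and that the code which loads the loop count has likewise been 
 changed from ldy to ldx. The "<" prefix is APW/ORCA-style syntax for 
 forcing direct page addressing.

      Loop        Lda        <Data,x     ; was Lda Data,y
                  Sta        IoLoc       ; absolute, via the data bank
                  Dex                    ; was Dey
                  Bpl        Loop

 The swap is needed because Lda has a direct page form indexed by X 
 but none indexed by Y.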
 There are still a number of other ways that one can approach these 
 issues. In fact one that comes to mind would be to use some form of the 
 new stack-relative addressing modes to yield yet another range of 
 semi-independently accessible addresses. 
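 As a purely illustrative sketch of that idea, a byte of program data 
 could be parked on the stack and then read back with stack-relative 
 addressing, which always references bank 0 and depends on neither the 
 data bank nor the direct page register. Value is a hypothetical 
 label, an 8-bit accumulator is assumed, and the code that switches 
 the data bank is omitted.

                  Lda        Value       ; data bank = program bank here
                  Pha                    ; keep a copy on the stack
                                         ; (data bank switched to I/O here)
                  Lda        1,s         ; stack relative: the copy just pushed
                  Sta        IoLoc       ; absolute, via the data bank
                  Pla                    ; remove the copy from the stack

 With an 8-bit accumulator, Lda 1,s takes four cycles, the same as a 
 16-bit absolute load, so it may slot into timed code without 
 adjustment.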
 The real point of this note is that, with a little bit of thought and 
 effort, one can successfully convert a large subset of likely 
 configurations for use in the native environment without major 
 rewrites. The bottom line is, be creative!
 For more information on the various new modes and features of the 
 65816, I heartily recommend Programming the 65816 Including the 6502, 
 65C02, and 65802. (Eyes/Lichty) [Prentice Hall Press].