u9050728@cs.uow.edu.au (Shane Kelvin Richards) (04/17/91)
Hello. I am hoping someone can help me. I am developing a set of
animation routines for the //GS. So far it's okay, BUT things are becoming
a tad slow when many objects are on the screen. Here's how I am doing it.

If I want to put a shape on the screen, I go through a loop pulling
off each background byte (that the shape will be using), ANDing it with the
shape's mask (using a mask makes it really easy to have "see-through" holes
in the shapes, and so that the background moves around the shape if the
background and shape happen to fall in the same byte), then I ORA the shape
byte and place this calculated byte back on the screen (and yes, I am
using the shadow screen in bank $01 for faster reads). So this is the
basic process. It slows down quite considerably, but I can't think of any
other way of doing what I am trying to do. This is to place the shape
down on top of the background nicely (so that it doesn't get jaggies, as it
would if I had, for instance, just stored the shape directly on the screen).

My question is, is my basic idea/technique correct? Am I using
the wrong method for fast shape manipulation? OR am I using the correct
method and I should just try to improve upon my code and optimise where
I can?

For simplicity I only let shapes move by 2 pixels so that they
always fall on a byte boundary. Also, I am using the 320x200 resolution.

Any ideas? What other techniques are there?

--
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
Shane Richards
u9050728@cs.uow.edu.au
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
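For readers who want the masking step spelled out, here is a minimal C sketch of the AND/OR compositing described in the post above. It is a model of the byte arithmetic only, not IIGS code: the function name draw_masked, the constants ROW_BYTES and ROWS, and the flat screen array are assumptions made for illustration. A mask bit of 1 keeps the background; a mask bit of 0 lets the shape through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ROW_BYTES = 160, ROWS = 200 };  /* 320x200 mode: 2 pixels per byte */

/* Composite a w_bytes x h shape onto `screen` at byte column xb, row y.
   shape and mask are row-major arrays of w_bytes * h bytes each. */
void draw_masked(uint8_t *screen, int xb, int y,
                 const uint8_t *shape, const uint8_t *mask,
                 int w_bytes, int h)
{
    assert(xb >= 0 && y >= 0 && xb + w_bytes <= ROW_BYTES && y + h <= ROWS);
    for (int row = 0; row < h; row++) {
        uint8_t *dst = screen + (size_t)(y + row) * ROW_BYTES + xb;
        for (int col = 0; col < w_bytes; col++) {
            size_t i = (size_t)row * w_bytes + col;
            /* LDA screen / AND mask / ORA shape / STA screen */
            dst[col] = (uint8_t)((dst[col] & mask[i]) | shape[i]);
        }
    }
}
```

With a mask byte of 0xF0 and a shape byte of 0x07, a background byte of 0xAB becomes 0xA7: the left pixel is preserved and the right pixel is replaced.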
toddpw@nntp-server.caltech.edu (Todd P. Whitesel) (04/17/91)
u9050728@cs.uow.edu.au (Shane Kelvin Richards) writes:

[ stuff deleted ]

> My question is, is my basic idea/technique correct? Am I using
>the wrong method for fast shape manipulation? OR am I using the correct
>method and I should just try to improve upon my code and optimise where
>I can?

Your method is reasonable, but the time-wasters are pretty obvious. Read on.

> For simplicity I only let shapes move by 2 pixels so that they
>always fall on a byte boundary. Also, I am using the 320x200 resolution.

Time-waster #1: loops. If you are looping through the picture data and the
mask then you are spending a non-trivial amount of time in the loop overhead.
Unrolling consists of coding a long string of instructions with the offsets
hardcoded as the addresses; the index register(s) are used to hold the low
word of the data address. You can do truly evil things this way if you map
the SHR buffer to the stack (better disable interrupts temporarily though!):

        lda 0,x         ;dp points to object location on screen
        and |0,y        ;DBR/Y points to mask
        ora |$1000,y    ;suppose the image is 4K past the mask
        sta 0,x
        lda 1,x         ;next byte (8-bit accumulator assumed)
        and |1,y        ;mask and image offsets advance with it
        ora |$1001,y
        sta 1,x
        ...

Note that the above example does assume the mask and image start at a fixed
distance from each other. It is a speed vs. memory tradeoff.

Time-waster #2: rectangular objects. Depending on the types of objects you
want to animate, it may actually help to pack the image and its mask so that
dead space in the object rectangle is replaced by offset/length values for
each line of the object. This is almost always a win.

Time-waster #3: the mask itself. If you can afford to let the mask be per
byte and not per pixel, you can get even more speed but at real memory
expense -- you hardcompile each object into code that draws it by simply
storing it (using the index w/ hardcoded offset technique from above).
If you want EVEN MORE speed you can use the stack to push bytes directly onto
the picture (this looks sick but is actually pretty easy to do once you know
what's involved). What's cool about stack-romping is that you can push
arbitrary words with PEA's, repeat values and one-byte values with
pha/phx/phy, and skip bytes with a sbc #xxxx; tcs; sequence (if you let A
accumulate the hops that is -- a simple way to do this would be to pass the
location of the object as the byte address of its last byte, so the object
draw code can start with a tcs).

The major drawback here is that you have hardcompiled code PER OBJECT -- I
haven't tried to do this yet but I suspect that the code is about as large as
the image & mask data so you are losing a bit of mask resolution but not much
else.

Time-waster #4: the shadowing itself. If you are going to be drawing over
objects a lot then you should turn off shadowing while you are drawing the
scene and then turn it back on and do a single romp copy of the bank 1 SHR
buffer onto itself -- this can be done by remapping memory, the stack & dp,
and issuing a series of

        pei $fe
        pei $fc
        ...
        pei $2
        pei $0

and hopping the dp register after each page.

I am not positive but I strongly suspect that both #3 and #4 are used by the
FTA Space Harrier demo.

Todd Whitesel
toddpw @ tybalt.caltech.edu
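The offset/length packing from time-waster #2 can be sketched in C as follows. The record layout here (one offset byte, one length byte, then that many mask/image byte pairs per scanline) is a hypothetical format chosen for illustration, not one given in the post; draw_packed and ROW_BYTES are likewise assumed names. The point is only that bytes in the dead space of the object rectangle are never read or written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ROW_BYTES = 160 };  /* bytes per scanline in 320x200 mode */

/* Hypothetical packed format, one record per scanline:
     offset, length, then `length` (mask, image) byte pairs.
   dst points at the screen byte for the object's top-left corner.
   Returns a pointer just past the last record consumed. */
const uint8_t *draw_packed(uint8_t *dst, const uint8_t *packed, int rows)
{
    assert(dst != NULL && packed != NULL && rows >= 0);
    for (int row = 0; row < rows; row++) {
        uint8_t offset = *packed++;
        uint8_t length = *packed++;
        uint8_t *p = dst + (size_t)row * ROW_BYTES + offset;
        for (uint8_t i = 0; i < length; i++) {
            uint8_t mask  = *packed++;
            uint8_t image = *packed++;
            p[i] = (uint8_t)((p[i] & mask) | image);
        }
    }
    return packed;
}
```

A scanline with nothing on it costs two bytes of data and no screen accesses at all, which is where the saving over a full rectangle comes from.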
meekins@tortoise.cis.ohio-state.edu (timothy lee meekins) (04/17/91)
In article <1991Apr17.061057.22357@cs.uow.edu.au> u9050728@cs.uow.edu.au (Shane Kelvin Richards) writes:
>
> Hello. I am hoping someone can help me. I am developing a set of
>animation routines for the //GS. So far it's okay, BUT things are becoming
>a tad slow when many objects are on the screen. Here's how I am doing it.
>
> If I want to put a shape on the screen, I go through a loop pulling
>off each background byte (that the shape will be using), ANDing it with the
>shape's mask (using a mask makes it really easy to have "see-through" holes
>in the shapes, and so that the background moves around the shape if the
>background and shape happen to fall in the same byte), then I ORA the shape
>byte and place this calculated byte back on the screen (and yes, I am
>using the shadow screen in bank $01 for faster reads). So this is the
>basic process. It slows down quite considerably, but I can't think of any
>other way of doing what I am trying to do. This is to place the shape
>down on top of the background nicely (so that it doesn't get jaggies, as it
>would if I had, for instance, just stored the shape directly on the screen).
>
> My question is, is my basic idea/technique correct? Am I using
>the wrong method for fast shape manipulation? OR am I using the correct
>method and I should just try to improve upon my code and optimise where
>I can?
>
> For simplicity I only let shapes move by 2 pixels so that they
>always fall on a byte boundary. Also, I am using the 320x200 resolution.
>
> Any ideas? What other techniques are there?
>

If all your shapes are the same size, then try unrolling all your loops in
the shape drawing functions. Something like:

;
; On entry, X points to shape, Y points to screen
; (X = offset into masktbl/shapetbl, Y = offset from $2000)
; Assume shape (for this example) is 8 bytes by 8 pixels.
;
DrawShape   phb                     ;Set to bank 1
            pea $0101
            plb
            plb

            lda $2000+0*160+0,y
            and >masktbl+0*8+0,x
            ora >shapetbl+0*8+0,x
            sta $2000+0*160+0,y

            lda $2000+0*160+2,y
            and >masktbl+0*8+2,x
            ora >shapetbl+0*8+2,x
            sta $2000+0*160+2,y
            ...
            lda $2000+7*160+6,y     ;last word of the last scanline
            and >masktbl+7*8+6,x
            ora >shapetbl+7*8+6,x
            sta $2000+7*160+6,y

            plb                     ;restore caller's data bank
            rts

It will take slightly more memory, but the speedup will be very noticeable.
If you don't re-map the bank, and use long addressing for the screen instead,
you'll have to interchange the functions of the X & Y registers.

--
+---------------------------S-U-P-P-O-R-T-----------------------------------+
|/ Tim Meekins                  <<>> Snail Mail:           <<>>  Apple II  \|
|>   meekins@cis.ohio-state.edu <<>>   8372 Morris Rd.     <<>>  Forever!  <|
|\   timm@pro-tcc.cts.com       <<>>   Hilliard, OH 43026  <<>>            /|
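Typing out an unrolled routine by hand invites off-by-one slips in the row and column offsets, so one option is to generate the text. The following C sketch emits the lda/and/ora/sta groups in the same notation as the post above; it assumes 16-bit accumulator accesses (one group per two bytes) and the $2000 base and 160-byte row stride used there. The function name emit_unrolled is hypothetical, and the output still needs the prologue and epilogue around it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the unrolled body for a w_bytes x rows shape into buf.
   Returns the number of 16-bit words emitted, or -1 if buf is too small. */
int emit_unrolled(char *buf, size_t cap, int w_bytes, int rows)
{
    size_t used = 0;
    int words = 0;
    assert(buf != NULL && w_bytes > 0 && w_bytes % 2 == 0 && rows > 0);
    for (int row = 0; row < rows; row++) {
        for (int col = 0; col < w_bytes; col += 2) {
            int n = snprintf(buf + used, cap - used,
                " lda $2000+%d*160+%d,y\n"
                " and >masktbl+%d*%d+%d,x\n"
                " ora >shapetbl+%d*%d+%d,x\n"
                " sta $2000+%d*160+%d,y\n",
                row, col, row, w_bytes, col, row, w_bytes, col, row, col);
            if (n < 0 || (size_t)n >= cap - used)
                return -1;          /* output would be truncated */
            used += (size_t)n;
            words++;
        }
    }
    return words;
}
```

For the 8-byte by 8-scanline example this produces 32 groups, running from row 0, column 0 through row 7, column 6.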
meekins@tortoise.cis.ohio-state.edu (timothy lee meekins) (04/17/91)
In article <108928@tut.cis.ohio-state.edu> meekins@tortoise.cis.ohio-state.edu (timothy lee meekins) writes:
>;
>; On entry, X points to shape, Y points to screen
>; Assume shape (for this example) is 8 bytes by 8 pixels.
>;

Whoops! That should have read 8 bytes wide, 8 SCANLINEs high.
stephens@slc4.lat.oz.au (Philip J Stephens) (04/22/91)
Shane Kelvin Richards writes:
>
> If I want to put a shape on the screen, I go through a loop pulling
>off each background byte...anding it with the shapes mask...then I ora the
>shape byte and place this calculated byte back on the screen.

The technique is correct; it's just the implementation that might be
slowing you down. That is, if you're using indirect indexing to
load and store bytes to and from the screen, you may wish to change
this to absolute indexing and self-modify the base address for each
new row. The same applies if you're using indirect indexing for
accessing the shape tables. Self-modifying the code is always faster
when dealing with large amounts of data.

Also, use a table of row and shape addresses to get the new base from,
rather than using some form of computation. That should shave off a few
more cycles per row.

If you're really stuck, I could e-mail you an example of a faster routine.

<\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/><\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\>
<  Philip J. Stephens                ><   "Many views yield the truth."        >
<  Hons. student, Computer Science   ><   "Therefore, be not alone."           >
<  La Trobe University, Melbourne    ><   - Prime Song of the viggies, from    >
<  AUSTRALIA                         ><     THE ENIGMA SCORE by Sheri S Tepper >
</\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\></\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/>
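The row-address table suggested above can be modelled in C like this. It assumes the Super Hi-Res pixel data starts at $2000 with 160 bytes per scanline, as in the other posts in this thread; row_base, init_row_table and shape_address are names invented for the sketch. The table is built once, after which finding a row's base address is a lookup rather than a multiply.

```c
#include <assert.h>
#include <stdint.h>

enum { SHR_BASE = 0x2000, ROW_BYTES = 160, ROWS = 200 };

static uint16_t row_base[ROWS];   /* start address of each scanline */

/* Fill the table once at startup. */
void init_row_table(void)
{
    for (int row = 0; row < ROWS; row++)
        row_base[row] = (uint16_t)(SHR_BASE + row * ROW_BYTES);
}

/* Address of byte column xb on scanline row: one lookup plus one add. */
uint16_t shape_address(int xb, int row)
{
    assert(xb >= 0 && xb < ROW_BYTES && row >= 0 && row < ROWS);
    return (uint16_t)(row_base[row] + xb);
}
```

On the real machine the table would be 200 two-byte entries indexed by row times two, which is the usual price of avoiding the multiplication.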
johnmac@fawlty.towers.oz (John MacLean) (04/26/91)
In article <1991Apr17.061057.22357@cs.uow.edu.au> u9050728@cs.uow.edu.au (Shane Kelvin Richards) writes:
> Hello. I am hoping someone can help me. I am developing a set of
>animation routines for the //GS. So far it's okay, BUT things are becoming
>a tad slow when many objects are on the screen. Here's how I am doing it.
>
[LDA / AND / ORA / STA instruction sequence deleted]
> My question is, is my basic idea/technique correct? Am I using
>the wrong method for fast shape manipulation? OR am I using the correct
>method and I should just try to improve upon my code and optimise where
>I can?
> For simplicity I only let shapes move by 2 pixels so that they
>always fall on a byte boundary. Also, I am using the 320x200 resolution.
> Any ideas? What other techniques are there?

At this level there is nothing wrong with the technique you are using.

One thing you can do is use the //ec softswitches and move zero page
up to the super hires screen. Of course, this can be a bit nasty if you're
using GS/OS, interrupts, etc.

Make sure you're using the fastest addressing modes, that you have the
data bank set to bank 1, etc.

It is also a good idea to base your animation on VBL interrupts, so that
your code does not slow down as you add more and more shapes. Just set a
limit that has acceptable speed, and update every 3, 4 or 5 VBLs, whatever
seems reasonable.

John MacLean.
--
This net: johnmac@fawlty.towers.oz          Phone: +61 2 427 2999
That net: uunet!fawlty.towers.oz!johnmac    Fax:   +61 2 427 7072
Snail:    Tower Technology, Unit D 31-33 Sirius Rd,   Home: +61 2 960 1453
          Lane Cove, NSW 2066, Australia.
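The "update every N VBLs" idea can be sketched in C as a small pacing check. This assumes some interrupt handler elsewhere increments a VBL counter; the Pacer type and pacer_due function are hypothetical names, and nothing here touches real IIGS interrupt hardware. The animation step runs only when the chosen number of VBLs has passed, so the frame rate stays fixed as long as the drawing fits inside that budget.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned interval;   /* VBLs per animation step, e.g. 3 */
    unsigned last;       /* VBL count at the previous step */
} Pacer;

/* Returns 1 when at least `interval` VBLs have elapsed since the last
   step, and records the new step time; returns 0 otherwise.
   Unsigned subtraction keeps this correct when the counter wraps. */
int pacer_due(Pacer *p, unsigned vbl_count)
{
    assert(p != NULL && p->interval > 0);
    if (vbl_count - p->last < p->interval)
        return 0;
    p->last = vbl_count;
    return 1;
}
```

With an interval of 3, polling on VBLs 1 through 9 triggers a step on VBLs 3, 6 and 9.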
ericmcg@pnet91.cts.com (Eric Mcgillicuddy) (04/27/91)
>slowing you down. That is, if you're using indirect indexing to
>load and store bytes to and from the screen, you may wish to change
>this to absolute indexing and self-modify the base address for each
>new row. The same applies if you're using indirect indexing for
>accessing the shape tables. Self-modifying the code is always faster
>when dealing with large amounts of data.

You do not need to use indirect indexing; absolute indexing from a base of
0000 is more than adequate. Setup time is about as long, and the data bank
register must be the same for the shape table (although I suppose absolute
long addressing could be used for the shape table instead). Using the X
register you can access any byte in a given bank, and this is currently
adequate to support the GS SHR (but throws out my earlier comment about the
maintenance in the future). The cost of adjusting the X register for the
next line may be as simple as INX or as complex as TXA, CLC,
ADC #onerow-size_of_sprite, TAX. Regardless, it is less than or equal to
self-modifying code and more "standard".

Eric

UUCP: bkj386!pnet91!ericmcg
INET: ericmcg@pnet91.cts.com
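The single-index approach above can be modelled in C as follows. The 64K array stands in for one memory bank and the 16-bit variable x for the X register; draw_indexed and ROW_BYTES are names assumed for the sketch. After each row, x is advanced by the row stride minus the sprite width, which is the "onerow-size_of_sprite" constant from the post.

```c
#include <assert.h>
#include <stdint.h>

enum { ROW_BYTES = 160 };   /* "onerow" in the post above */

/* Draw a w_bytes x h shape using one running index into a 64K bank.
   x is the bank offset of the shape's top-left byte. */
void draw_indexed(uint8_t *bank, uint16_t x,
                  const uint8_t *shape, const uint8_t *mask,
                  int w_bytes, int h)
{
    assert(bank != 0 && w_bytes > 0 && w_bytes <= ROW_BYTES && h > 0);
    const uint16_t next_row = (uint16_t)(ROW_BYTES - w_bytes);
    for (int row = 0; row < h; row++) {
        for (int col = 0; col < w_bytes; col++) {
            bank[x] = (uint8_t)((bank[x] & *mask++) | *shape++);
            x++;                          /* INX */
        }
        x = (uint16_t)(x + next_row);     /* TXA / CLC / ADC #... / TAX */
    }
}
```

No base address is patched into the code, so the routine stays the same for every row and every sprite position.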