[comp.sys.apple2] Animation

u9050728@cs.uow.edu.au (Shane Kelvin Richards) (04/17/91)

	Hello. I am hoping someone can help me. I am developing a set of
animation routines for the //GS. So far it's okay, BUT things are becoming
a tad slow when many objects are on the screen. Here's how I am doing it..

	If I want to put a shape on the screen, I go through a loop pulling
off each background byte (that the shape will be using) and ANDing it with the
shape's mask. Using a mask makes it really easy to have "see-through" holes
in the shapes, and lets the background show around the shape if the
background and shape happen to fall in the same byte. Then I ORA the shape
byte and place this calculated byte back on the screen (and yes, I am
using the shadow screen in bank $01 for faster reads). So this is the
basic process. It slows down quite considerably, but I can't think of any
other way of doing what I am trying to do, which is to place the shape
down on top of the background nicely (so that it doesn't get jaggies, as it
would if I had, for instance, just stored the shape directly on the screen).
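
A minimal sketch of the AND-mask / OR-shape step described above, written in
Python purely as a model (the real routine would be 65816 assembly). The names
blit_masked, screen, shape and mask are made up for illustration. It assumes a
linear screen buffer 160 bytes wide, two 4-bit pixels per byte, and mask bits
that are 1 where the background should show through.

```python
BYTES_PER_ROW = 160  # 320 pixels * 4 bits per pixel / 8 bits per byte

def blit_masked(screen, offset, shape, mask, width, height):
    """AND the background with the mask, OR in the shape, store it back.

    screen is a mutable bytearray; shape and mask are bytes of length
    width * height, stored row by row. Mask bits are 1 where the
    background shows through and 0 where the shape covers it.
    """
    for row in range(height):
        base = offset + row * BYTES_PER_ROW
        for col in range(width):
            i = row * width + col
            screen[base + col] = (screen[base + col] & mask[i]) | shape[i]

# Two-byte-wide, one-row shape drawn over a background of colour 7.
screen = bytearray([0x77] * (BYTES_PER_ROW * 2))
shape = bytes([0x0A, 0xB0])  # zero nibbles are the transparent pixels
mask = bytes([0xF0, 0x0F])   # keep left pixel of byte 0, right pixel of byte 1
blit_masked(screen, 0, shape, mask, width=2, height=1)
```

In each touched byte one pixel comes from the shape and the other is left as
background, which is the "background and shape in the same byte" case.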

       My question is, is my basic idea/technique correct? Am I using
the wrong method for fast shape manipulation? OR am I using the correct
method and I should just try to improve upon my code and optimise where
I can?

      For simplicity I only let shapes move by 2 pixels so that they
always fall on a byte boundary. Also, I am using the 320x200 resolution.

	Any ideas, what other techniques are there?


-- 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
Shane Richards
u9050728@cs.uow.edu.au
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

toddpw@nntp-server.caltech.edu (Todd P. Whitesel) (04/17/91)

u9050728@cs.uow.edu.au (Shane Kelvin Richards) writes:

[ stuff deleted ]

>       My question is, is my basic idea/technique correct? Am I using
>the wrong method for fast shape manipulation? OR am I using the correct
>method and I should just try to improve upon my code and optimise where
>I can?

Your method is reasonable, but the time-wasters are pretty obvious. Read on.

>      For simplicity I only let shapes move by 2 pixels so that they
>always fall on a byte boundary. Also, I am using the 320x200 resolution.

Time-waster #1: loops. If you are looping through the picture data and the
mask then you are spending a non-trivial amount of time in the loop overhead.
The cure is unrolling: you code a long string of instructions with the offsets
hardcoded as the addresses, and use the index register(s) to hold the low
word of the data address. You can do truly evil things this way if you map
the SHR buffer to the stack (better disable interrupts temporarily though!):

	lda	0,x	;dp points to object location on screen
	and	|0,y	;DBR/Y points to mask
	ora	|$1000,y	;suppose the image is 4K past the mask
	sta	0,x
	lda	1,x	;next byte: every offset advances (8-bit A assumed)
	and	|1,y
	ora	|$1001,y
	sta	1,x
	...

Note that the above example does assume the mask and image start at a fixed
distance from each other. It is a speed vs. memory tradeoff.
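
Nobody wants to type an unrolled routine by hand, so one option is to have a
small script print it. The following is only a sketch: the function name
unrolled_blit and its parameters are invented here, it emits one
lda/and/ora/sta group per byte with every offset advancing by one, and it
assumes an 8-bit accumulator and an image stored a fixed distance past the
mask (4K by default, as in the example).

```python
def unrolled_blit(nbytes, image_offset=0x1000):
    """Return the unrolled lda/and/ora/sta lines for nbytes consecutive bytes."""
    lines = []
    for i in range(nbytes):
        lines.append(f"\tlda\t{i},x")                        # background byte
        lines.append(f"\tand\t|{i},y")                       # mask byte
        lines.append(f"\tora\t|${image_offset + i:04X},y")   # image byte
        lines.append(f"\tsta\t{i},x")                        # back to the screen
    return lines

print("\n".join(unrolled_blit(2)))
```

The output can be pasted into the assembler source, or the script can be made
part of the build.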

Time-waster #2: rectangular objects. Depending on the types of objects you want
to animate, it may actually help to pack the image and its mask so that dead
space in the object rectangle is replaced by offset/length values for each
line of the object. This is almost always a win, because the draw code no
longer touches bytes that are entirely transparent.
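
One possible way to find those offset/length values is sketched below in
Python. The function name pack_row and the (offset, length) tuple layout are
assumptions made for illustration, not a format anyone in this thread has
specified. A mask byte of $FF is taken to mean "fully transparent".

```python
def pack_row(mask_row):
    """Turn one row of mask bytes into (offset, length) runs of bytes to draw.

    A mask byte of 0xFF means the whole byte is transparent, so it is
    left out of every run.
    """
    runs = []
    start = None
    for i, m in enumerate(mask_row):
        if m != 0xFF and start is None:
            start = i                        # a run of drawable bytes begins
        elif m == 0xFF and start is not None:
            runs.append((start, i - start))  # the run ended just before i
            start = None
    if start is not None:
        runs.append((start, len(mask_row) - start))
    return runs

row = bytes([0xFF, 0xFF, 0xF0, 0x00, 0x0F, 0xFF, 0x00, 0xFF])
runs = pack_row(row)
```

For this 8-byte row only 4 bytes need to be read, masked and written; the
other 4 are skipped outright.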

Time-waster #3: the mask itself. If you can afford to let the mask be per
byte and not per pixel, you can get even more speed but at real memory
expense -- you hardcompile each object into code that draws it by simply
storing it (using the index w/ hardcoded offset technique from above).
If you want EVEN MORE speed you can use the stack to push bytes directly
onto the picture (this looks sick but is actually pretty easy to do once
you know what's involved). What's cool about stack-romping is that you
can push arbitrary words with PEA's, repeat values and one-byte values
with pha/phx/phy, and skip bytes with a sbc #xxxx; tcs; sequence (if you
let A accumulate the hops that is -- a simple way to do this would be to
pass the location of the object as the byte address of its last byte, so
the object draw code can start with a tcs). The major drawback here is
that you have hardcompiled code PER OBJECT -- I haven't tried to do this
yet but I suspect that the code is about as large as the image & mask data
so you are losing a bit of mask resolution but not much else.
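
A rough sketch of what "hardcompiling" could look like, using the plain
indexed-store variant rather than the stack. This Python function
(compile_sprite is a made-up name) turns sprite data into assembly text. It
assumes an 8-bit accumulator, DBR pointing at the screen bank, X holding the
object's top-left screen address, and None marking a transparent byte.

```python
BYTES_PER_ROW = 160  # SHR scanline width in bytes

def compile_sprite(rows):
    """rows: list of rows, each a list of byte values or None (transparent).

    Returns assembly text that stores each opaque byte at a hardcoded
    offset from X, reloading A only when the value changes.
    """
    lines = []
    current = None
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is None:
                continue                  # transparent: emit no code at all
            if value != current:
                lines.append(f"\tlda\t#${value:02X}")
                current = value
            lines.append(f"\tsta\t|{r * BYTES_PER_ROW + c},x")
    lines.append("\trts")
    return lines

sprite = [
    [None, 0x11, 0x11],
    [0x22, None, 0x11],
]
code = compile_sprite(sprite)
```

Transparent bytes cost nothing at draw time and repeated values share one
load, which is where the speed comes from; the price is one such routine per
object.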

Time-waster #4: the shadowing itself. If you are going to be drawing over
objects a lot then you should turn off shadowing while you are drawing the
scene and then turn it back on and do a single romp copy of the bank 1 SHR
buffer onto itself -- this can be done by remapping memory, the stack & dp,
and issuing a series of
	pei $fe
	pei $fc
	...
	pei $2
	pei $0
and hopping the dp register after each page.
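
To get a feel for the size of that PEI series, here is a small Python sketch
that generates one page of it and counts the pages. The names pei_page and
pages_needed are invented for this example, and it only counts the 200 x 160
bytes of pixel data, not the scanline control bytes or palettes.

```python
PAGE = 256
SHR_PIXEL_BYTES = 200 * 160      # 32000 bytes of pixel data

def pei_page():
    """One page worth of PEIs, from the top word ($fe) down to $0."""
    return [f"\tpei\t${off:x}" for off in range(PAGE - 2, -1, -2)]

page = pei_page()
pages_needed = SHR_PIXEL_BYTES // PAGE   # dp hops needed for the pixel area
```

Each page takes 128 PEIs, so the whole pixel area is 125 pages of them.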

I am not positive but I strongly suspect that both #3 and #4 are used by
the FTA Space Harrier demo.

Todd Whitesel
toddpw @ tybalt.caltech.edu

meekins@tortoise.cis.ohio-state.edu (timothy lee meekins) (04/17/91)

In article <1991Apr17.061057.22357@cs.uow.edu.au> u9050728@cs.uow.edu.au (Shane Kelvin Richards) writes:
>
>	Hello. I am hoping someone can help me. I am developing a set of
>animation routines for the //GS. So far it's okay, BUT things are becoming
>a tad slow when many objects are on the screen. Here's how I am doing it..
>
>	If I want to put a shape on the screen, I go through a loop pulling
>off each background byte (that the shape will be using) and ANDing it with the
>shape's mask. Using a mask makes it really easy to have "see-through" holes
>in the shapes, and lets the background show around the shape if the
>background and shape happen to fall in the same byte. Then I ORA the shape
>byte and place this calculated byte back on the screen (and yes, I am
>using the shadow screen in bank $01 for faster reads). So this is the
>basic process. It slows down quite considerably, but I can't think of any
>other way of doing what I am trying to do, which is to place the shape
>down on top of the background nicely (so that it doesn't get jaggies, as it
>would if I had, for instance, just stored the shape directly on the screen).
>
>       My question is, is my basic idea/technique correct? Am I using
>the wrong method for fast shape manipulation? OR am I using the correct
>method and I should just try to improve upon my code and optimise where
>I can?
>
>      For simplicity I only let shapes move by 2 pixels so that they
>always fall on a byte boundary. Also, I am using the 320x200 resolution.
>
>	Any ideas, what other techniques are there?
>

If all your shapes are the same size, then try unrolling all your loops
in the shape drawing functions. Something like:

;
; On entry, X points to shape, Y points to screen
; Assume shape (for this example) is 8 bytes by 8 pixels.
;
DrawShape	phb		;Set to bank 1
		pea 	$0101
		plb
		plb

		lda	$2000+0*160+0,y
		and	>masktbl+0*8+0,x
		ora	>shapetbl+0*8+0,x
		sta	$2000+0*160+0,y

		lda	$2000+0*160+2,y
		and	>masktbl+0*8+2,x
		ora	>shapetbl+0*8+2,x
		sta	$2000+0*160+2,y

		...

		lda	$2000+7*160+6,y	;last word of the last (eighth) row
		and	>masktbl+7*8+6,x
		ora	>shapetbl+7*8+6,x
		sta	$2000+7*160+6,y

		plb
		rts


It will take slightly more memory, but the speedup will be very noticeable.


If you don't re-map the bank, and use long address for the screen instead,
you'll have to interchange the functions of the X & Y registers.

--
+---------------------------S-U-P-P-O-R-T-----------------------------------+
|/ Tim Meekins                  <<>> Snail Mail:           <<>>  Apple II  \|
|>   meekins@cis.ohio-state.edu <<>>   8372 Morris Rd.     <<>>  Forever!  <|
|\   timm@pro-tcc.cts.com       <<>>   Hilliard, OH 43026  <<>>            /|

meekins@tortoise.cis.ohio-state.edu (timothy lee meekins) (04/17/91)

In article <108928@tut.cis.ohio-state.edu> meekins@tortoise.cis.ohio-state.edu (timothy lee meekins) writes:
>;
>; On entry, X points to shape, Y points to screen
>; Assume shape (for this example) is 8 bytes by 8 pixels.
>;

whoops! That should have read 8 bytes wide, 8 SCANLINEs high.

--
+---------------------------S-U-P-P-O-R-T-----------------------------------+
|/ Tim Meekins                  <<>> Snail Mail:           <<>>  Apple II  \|
|>   meekins@cis.ohio-state.edu <<>>   8372 Morris Rd.     <<>>  Forever!  <|
|\   timm@pro-tcc.cts.com       <<>>   Hilliard, OH 43026  <<>>            /|

stephens@slc4.lat.oz.au (Philip J Stephens) (04/22/91)

Shane Kelvin Richards writes:
>
>	If I want to put a shape on the screen, I go through a loop pulling
>off each background byte...anding it with the shape's mask...then I ora the
>shape byte and place this calculated byte back on the screen.

  The technique is correct; it's just the implementation that might be
slowing you down.  That is, if you're using indirect indexing to
load and store bytes to and from the screen, you may wish to change
this to absolute indexing and self-modify the base address for each
new row.  The same applies if you're using indirect indexing for
accessing the shape tables.  Self-modifying the code is always faster
when dealing with large amounts of data.
  Also, use a table of row and shape addresses to get the new base from
rather than using some form of computation.  That should shave off a
few more cycles per row.
  If you're really stuck, I could e-mail you an example of a faster
routine.
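
  The table-of-row-addresses idea above can be sketched like this, in Python
for brevity. The names row_table and screen_address are mine, and the sketch
assumes the SHR pixel area starts at $2000 with 160 bytes per scanline and
200 scanlines.

```python
SHR_BASE = 0x2000
BYTES_PER_ROW = 160
ROWS = 200

# Built once at startup by repeated addition, so no multiply is needed later.
row_table = []
addr = SHR_BASE
for _ in range(ROWS):
    row_table.append(addr)
    addr += BYTES_PER_ROW

def screen_address(x_pixel, y):
    """Address of the byte holding pixel (x_pixel, y) in 320 mode."""
    return row_table[y] + x_pixel // 2   # two pixels per byte

first = screen_address(0, 0)
example = screen_address(10, 1)
```

  In the real routine the table would hold 200 words and the lookup would be
a single indexed load.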

<\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/><\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\>
<  Philip J. Stephens                ><   "Many views yield the truth."       >
<  Hons. student, Computer Science   ><   "Therefore, be not alone."          >
<  La Trobe University, Melbourne    ><   - Prime Song of the viggies, from   >
<  AUSTRALIA                         ><   THE ENGIMA SCORE by Sheri S Tepper  >
</\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\></\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/>

johnmac@fawlty.towers.oz (John MacLean) (04/26/91)

In article <1991Apr17.061057.22357@cs.uow.edu.au> u9050728@cs.uow.edu.au (Shane Kelvin Richards) writes:
>	Hello. I am hoping someone can help me. I am developing a set of
>animation routines for the //GS. So far it's okay, BUT things are becoming
>a tad slow when many objects are on the screen. Here's how I am doing it..
> [LDA / AND / ORA / STA instruction sequence deleted]
>       My question is, is my basic idea/technique correct? Am I using
>the wrong method for fast shape manipulation? OR am I using the correct
>method and I should just try to improve upon my code and optimise where
>I can?
>      For simplicity I only let shapes move by 2 pixels so that they
>always fall on a byte boundary. Also, I am using the 320x200 resolution.
>	Any ideas, what other techniques are there?

At this level there is nothing wrong with the technique you are using.
One thing you can do is use the //ec softswitches to move zero page
up to the super hires screen.
Of course, this can be a bit nasty if you're using GS/OS / interrupts / etc.
Make sure you're using the fastest addressing modes, that you have
the data bank set to bank 1, and so on.

It is also a good idea to base your animation on VBL interrupts, so that
your code does not slow down as you add more and more shapes - just set
a limit that has acceptable speed, and update every 3, 4 or 5 VBLs -
whatever seems reasonable.
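
A small model of that pacing idea, in Python. The function name run and the
tick-counting scheme are only an illustration of "update every N VBLs", not
actual interrupt-handler code.

```python
def run(ticks, vbls_per_update):
    """Simulate VBL interrupts; return the tick numbers on which we redraw."""
    updates = []
    countdown = vbls_per_update
    for tick in range(1, ticks + 1):   # each iteration models one VBL interrupt
        countdown -= 1
        if countdown == 0:
            updates.append(tick)       # all shapes would be redrawn here
            countdown = vbls_per_update
    return updates

frames = run(ticks=12, vbls_per_update=3)
```

As long as drawing every shape fits inside the chosen number of VBLs, the
animation rate stays the same no matter how many shapes are active.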

John MacLean.
-- 
This net: johnmac@fawlty.towers.oz                      Phone: +61 2 427 2999
That net: uunet!fawlty.towers.oz!johnmac                Fax:   +61 2 427 7072
Snail:    Tower Technology, Unit D 31-33 Sirius Rd,     Home:  +61 2 960 1453
          Lane Cove, NSW 2066, Australia.

ericmcg@pnet91.cts.com (Eric Mcgillicuddy) (04/27/91)

>slowing you down.  That is, if you're using indirect indexing to
>load and store bytes to and from the screen, you may wish to change
>this to absolute indexing and self-modify the base address for each
>new row.  The same applies if you're using indirect indexing for
>accessing the shape tables.  Self-modifying the code is always faster
>when dealing with large amounts of data.

You do not need to use indirect indexing; absolute indexing from a base of
0000 is more than adequate. Setup time is about as long, and the data bank
register must be the same for the screen and the shape table (although I
suppose absolute long addressing could be used for the shape table). Using
the X register you can access any byte in a given bank, and this is currently
adequate to support the GS SHR (but throws out my earlier comment about the
maintenance in the future). The cost of adjusting the X register for the next
line may be as simple as INX or as complex as TXA, CLC, ADC
#onerow-size_of_sprite, TAX. Regardless, it is less than or equal to the cost
of self-modifying code, and more "standard".
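
To show that the X adjustment lands where it should, here is a Python model
of the stepping. walk_offsets is a hypothetical name, and a 160-byte SHR
scanline is assumed for "onerow".

```python
BYTES_PER_ROW = 160

def walk_offsets(start, width, height):
    """Return the screen offsets visited when X is stepped as described:
    INX across a row, then add (row size - sprite width) for the next row."""
    x = start
    visited = []
    for _ in range(height):
        for _ in range(width):
            visited.append(x)
            x += 1                      # INX
        x += BYTES_PER_ROW - width      # TXA / CLC / ADC #onerow-size_of_sprite / TAX
    return visited

offsets = walk_offsets(start=0x2000, width=4, height=3)
```

The offsets visited are the same ones a row*160+column calculation would
give, without any multiply.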

Eric

UUCP: bkj386!pnet91!ericmcg
INET: ericmcg@pnet91.cts.com