[comp.os.msdos.programmer] Fast EGA sprites library.

jeroenk@cnps.PHILIPS.nl (Jeroen Kessels) (10/23/90)

I had an idea for a high-speed game, and I wanted to program it in the
best video mode possible for me: EGA 640x350x16colors.  So I needed a
programmer's library for sprites. Nowhere to be found!  I even
disassembled some commercial games (very educational :-), but I wasn't
satisfied with what I found. It looked like the EGA card high-res mode
simply wasn't suited for sprites.

So I experimented (I have almost no EGA documentation) and eventually
developed a fast sprite algorithm.  I have also developed a library
around the algorithm (Turbo Pascal v5.5), including a demo, a sprite-draw
program and a game. Before I release the package, I'd like to be sure
that the algorithm is the fastest possible.

I have the following questions, before I explain the algorithm:

	1. Is there a faster way to create sprites in EGA
	640x350x16colors?

	2. Does anyone have EGA documentation (electronically or
	otherwise) I can have? I have no ftp-access.

	3. Does anyone have a sprite picture library I can have? Any
	video mode? I'd like some examples of nice sprites.

	4. Is anyone interested in my EGA-sprite programmer's library?
	(I won't email it, but if there is enough interest, I'll post
	the lot to comp.binaries.ibm.pc.)

A "sprite" is a small picture that is "moved" across the screen. While
moving, the sprite may change (pac-man's mouth opens and closes while he
moves around the screen).  In terms of programming primitives, this
means placing and removing the picture on the screen (HA! :-).

My algorithm XORs an image onto the screen. XORing the same image at the
same position once more "removes" it and restores the background. By
changing the function-select bits in Graphics Controller register 3, the
image can also be AND'ed or OR'ed. The algorithm has an important
disadvantage: the image can only be placed at x=0, 8, 16, 24, etc. This
is because every write covers a whole video byte, which is 8 pixels.  It
can be remedied by storing 8 copies of the image, each one shifted 0..7
pixels to the right, and by selecting the proper copy depending on X.
This requires HUGE amounts of memory, though.
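
The placement rule and the memory cost of the shifted copies can be
sketched in C. This is only an illustration, not code from my library:
the function names are made up, and I assume that every shifted copy is
stored one byte wider than the original image (room for the pixels that
are shifted out on the right) and that there are 80 video bytes per row
(640 pixels / 8).

```c
#include <assert.h>

#define SCREEN_BYTES_PER_ROW 80u   /* 640 pixels / 8 pixels per video byte */

/* Offset of the video byte that contains pixel (x, y). */
unsigned video_offset(unsigned x, unsigned y)
{
    assert(x < 640u && y < 350u);
    return y * SCREEN_BYTES_PER_ROW + (x >> 3);
}

/* Which of the 8 pre-shifted copies (0..7) to draw for this x. */
unsigned shift_index(unsigned x)
{
    return x & 7u;
}

/* Bytes needed for all 8 pre-shifted copies of a sprite, 4 planes each.
 * Assumption: each copy is stored one byte wider than the unshifted
 * image, and the width is a multiple of 8 pixels. */
unsigned long preshifted_bytes(unsigned width, unsigned height)
{
    unsigned long bytes_per_row;

    assert(width % 8u == 0u);
    bytes_per_row = width / 8u + 1u;
    return 8UL * 4UL * bytes_per_row * height;
}
```

Under these assumptions a 16x16 sprite needs 1536 bytes for its 8 copies,
against 128 bytes for a single unshifted image.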

Compared to the "putimage" routine in the Graph library of Turbo Pascal
(Borland), this algorithm proved to be about 5 times faster. The exact
factor depends on image size, because "putimage" has considerable
overhead per image.  My algorithm uses less than 20 clock cycles per
pixel. I have written a demo that can keep 50 sprites (16x16 pixels)
moving around at a very acceptable speed.

The following code will put 8 pixels onto the screen. I have stripped
all the code around it (address calculation, X/Y looping, clipping) to
make the algorithm stand out more clearly.  The image is stored in a
number of 4-byte blocks. Every block contains all the information for 8
pixels. The first byte stores the information for the first color-plane,
the second byte for the second plane, etc.
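
To make the storage format concrete, here is a small C sketch that packs
8 pixels (color values 0..15) into one such 4-byte block. It is an
illustration only: the name pack_block is made up, and I assume the usual
EGA layout in which bit 7 of a byte is the leftmost pixel and bit n of
the color value goes to plane n.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 8 pixels (colors 0..15, leftmost pixel first) into a 4-byte
 * block: block[0] is the byte for plane 0, block[1] for plane 1, etc. */
void pack_block(const uint8_t colors[8], uint8_t block[4])
{
    int plane, px;

    for (plane = 0; plane < 4; plane++) {
        uint8_t bits = 0;
        for (px = 0; px < 8; px++) {
            assert(colors[px] < 16);
            /* Bit `plane` of the color decides this pixel's bit in
             * this plane; the leftmost pixel lands in bit 7. */
            if (colors[px] & (1u << plane))
                bits |= (uint8_t)(0x80u >> px);
        }
        block[plane] = bits;
    }
}
```

For example, a row with color 15 in the leftmost pixel, color 1 in the
rightmost pixel and color 0 in between packs to the bytes $81, $80, $80,
$80.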

The code assumes the following settings:
DX    = $03C5 (Sequencer data port), with the Sequencer index port
        $03C4 already set to 2 (Map Mask).
DS:SI = pointer to the stored image.
ES:DI = pointer to the video RAM (a function of X and Y).
EGA card registers (Graphics Controller, index port $03CE):
register 3 = $18      XOR function, no rotate count.
register 5 = 0        Write mode 0.
register 8 = $FF      Bit mask: all 8 pixel bits enabled.



; Select first plane with 03C4h, register 2 (Map Mask).
mov     al,1
out     dx,al
; Fetch pattern from stored image (using pointer DS:SI).
; Notice that we fetch 2 bytes here. The second byte (AH) is used later on.
lodsw
; Store in video RAM. This is done with XCHG, because the video RAM must be
; read before it is written, so that the EGA latches are loaded. The XCHG
; combines this read and write.
es: xchg al,[di]

; Select second plane with 03C4h, register 2 (Map Mask).
mov     al,2
out     dx,al
; Store pattern. Now a normal MOV can be used, because the read done by the
; XCHG has already loaded the latches of all four planes.
es: mov [di],ah

; Select third plane with 03C4h, register 2 (Map Mask).
mov     al,4
out     dx,al
; Fetch pattern from stored image (using pointer DS:SI).
; Notice that we fetch 2 bytes here. The second byte (AH) is used later on.
lodsw
; Store pattern.
es: mov [di],al

; Select fourth plane with 03C4h, register 2 (Map Mask).
mov     al,8
out     dx,al
; Store pattern, and increment the video offset DI (next video byte).
mov     al,ah
stosb

; Done writing 8 pixels!
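
For readers without an EGA card at hand, the effect of the sequence above
can be imitated in C. This is a simplified model and not a real driver:
the Ega struct and the function names are made up for this sketch, and it
only covers the settings assumed above (write mode 0, XOR function, no
rotate, bit mask $FF, set/reset not used).

```c
#include <assert.h>
#include <stdint.h>

#define VRAM_SIZE 28000u   /* 80 bytes per row * 350 rows, per plane */

typedef struct {
    uint8_t plane[4][VRAM_SIZE];  /* the four color planes */
    uint8_t latch[4];             /* one latch byte per plane */
    uint8_t map_mask;             /* Sequencer register 2 */
} Ega;

/* A CPU read loads the latches of ALL four planes at once. */
uint8_t ega_read(Ega *e, unsigned off)
{
    int p;
    for (p = 0; p < 4; p++)
        e->latch[p] = e->plane[p][off];
    return e->plane[0][off];      /* assumes read map select = plane 0 */
}

/* A CPU write in write mode 0 with the XOR function: every plane that
 * is enabled in the map mask receives data XOR its own latch. */
void ega_write_xor(Ega *e, unsigned off, uint8_t data)
{
    int p;
    for (p = 0; p < 4; p++)
        if (e->map_mask & (1u << p))
            e->plane[p][off] = (uint8_t)(data ^ e->latch[p]);
}

/* Same steps as the assembly: one read (the read half of the XCHG),
 * then four writes, each with a single plane selected. Returns the
 * offset of the next video byte, like DI after the STOSB. */
unsigned put_block(Ega *e, unsigned off, const uint8_t block[4])
{
    int p;

    assert(off < VRAM_SIZE);
    (void)ega_read(e, off);
    for (p = 0; p < 4; p++) {
        e->map_mask = (uint8_t)(1u << p);
        ega_write_xor(e, off, block[p]);
    }
    return off + 1u;
}
```

Calling put_block twice with the same block and offset leaves the planes
exactly as they were, which is the "remove" step described earlier.
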
-- 
Jeroen C. Kessels
Software Engineer, Philips C&P-LSS
VA-25, P.O. Box 218, 5600 MD Eindhoven, The Netherlands
Usenet = jeroenk@cnps.philips.nl