[comp.sys.amiga] Split the c.s.a group more? & Re: Bank switched CHIP RAM?

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (10/04/90)

(If you follow this up, split the subject to show which thread
you want to discuss; sorry for mixing them together here, but
it's late and I'm lazy.)

hazy> = daveh@cbmvax.commodore.com (Dave Haynie) writes:
kpd > = xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:

kpd > [The Apple ][e used bank switched memory to get past the 64K limit;
kpd > can Agnus do the same?]

hazy> [This has been done lots of times; can you say "kludge" Kent?]

kpd > [Agnus would never know we'd switched her bank.]

hazy> That's true. However, you would then have two completely
hazy> disjoint chunks of Chip memory. No piece from one bank could
hazy> have any effect on any piece of the other bank, or disaster
hazy> strikes. How would you use the second bank if, for instance,
hazy> Workbench and all the system gadgets are in the first bank.
hazy> You can't even blit between the two banks.

Not sure about workbench, but it was my goal that what was
being gained was support for deep pixels, not more CHIP ram
per se, so I envisioned that memory allocations in each bank
would be identical. Bank to bank blits would not be
meaningful -- the same addresses of chip ram in different
banks would hold the same picture, just different bit planes
of it.
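
To make that layout concrete, here is a toy model in Python. It is only a sketch under stated assumptions: each bank is a plain byte array, each bank carries exactly one bit plane, and the names (`BANK_SIZE`, `NUM_BANKS`, `write_pixel`, `read_pixel`) are invented for illustration. Nothing here reflects how Agnus or any real bank switching hardware is wired.

```python
# Toy model only: every bank is a bytearray, and bank k holds bit plane k
# of the picture at the SAME byte offsets as every other bank.
# BANK_SIZE and NUM_BANKS are arbitrary, hypothetical values.
BANK_SIZE = 1024
NUM_BANKS = 4             # one plane per bank, so 4 bits per pixel

banks = [bytearray(BANK_SIZE) for _ in range(NUM_BANKS)]

def write_pixel(base, x, value):
    """Store `value` at column x of a plane starting at `base` in every bank."""
    offset = base + x // 8
    mask = 0x80 >> (x % 8)            # leftmost pixel is the high bit
    for plane, bank in enumerate(banks):
        if (value >> plane) & 1:
            bank[offset] |= mask
        else:
            bank[offset] &= ~mask & 0xFF

def read_pixel(base, x):
    """Reassemble a pixel from the same bit position in every bank."""
    offset = base + x // 8
    mask = 0x80 >> (x % 8)
    value = 0
    for plane, bank in enumerate(banks):
        if bank[offset] & mask:
            value |= 1 << plane
    return value

write_pixel(0x100, 10, 0b1011)
print(read_pixel(0x100, 10))
```

Because one (base, x) pair addresses every bank, a bank to bank copy has nothing to do in this model: the banks never hold the same kind of data at the same address.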

This would get around memory management problems (sort of;
you might well have to special case the allocate and free
steps to loop on banks); but would probably require either
a) isolating other allocations to FAST ram (nicest), b)
isolating ones in CHIP to the first bank (gag) or, c) making
the same kinds of bank switching stuff work for other heavy
chip ram users.

Choice "c" would be nice for sampled sound, if it could be
played from the video DAC side of the dual ported memory,
and knew to play the banks in order, or pull the sample
bytes from multiple banks like a multiheaded disk drive with
the data arranged with a byte's bits on different cylinders
for speed; you'd have lots more sound space as well as
picture space.

Choice "c" would not be so nice for other data that could
rapidly fragment various banks and make bitplane allocations
impossible. It would be especially ugly for disk DMA
buffers.

Well, I said I'd only half baked this one. A mix of the
strategies in "a", "b", and "c" would probably be necessary,
which heightens the impact on existing software (instead of
__chip, __fast, __don't_care, or whatever they're called,
__chip would become several types).

kpd > [How badly would this break existing software?]

hazy> [Problem areas are other DMA's (memory refresh, floppy
hazy> disk), loss of Denise capabilities, no multiple screens...]

Try this again in the light of having the pixel plane data
for pictures at identical addresses in the various banks; I
think a lot of the whizzy window/screen stuff comes back
without that much effort. The worst case for other DMA is
that it is all done to the "front" bank; and that isn't much
worse than at present, except that only "bank capable"
(video, maybe sound data handlers) or "bank mandatory" (DMA
refresh; would each bank require a separate circuit?
Probably.) signals could be gated to banks other than the
first; the rest would all apply (for case "b", above) to the
"front" bank only, and corresponding addresses on the other
banks would be (temporarily) wasted.

hazy> [The problem was a lot easier with programs written with
hazy> bank switching in mind and with only monoprogramming and
hazy> monoprocessing.]

Yes.

hazy> Now, the idea of supporting multiple Agnus chips could work a
hazy> bit better, though still not optimally.

Well, you'd certainly have an impressive amount of horsepower.

hazy> Any program that doesn't know about the extra Agnus chips
hazy> could go about its merry way; all disk DMA probably happens
hazy> only in the main Agnus.

I guess we both came to this conclusion as simplest.  I want to
add that you lock out the same addresses on the other banks as
well, (in fact, there really isn't any need for memory management
to touch the other banks, as long as they stay synched with the
main bank; just assume the "front" bank data applies to all alike).
This removes the possibility of contending against the addresses
you need for video data.  This lets you address screens, window
parts and so on with one set of addressing hardware for all banks.

hazy> Each bank has its own blitter, refresh, etc. Programs that
hazy> know about the extra Agnus systems would run the same
hazy> routines, only with an offset to pick the extra banks.

Yes.

hazy> What you would really like as well is some kind of
hazy> Denise-without-colormapping mode to be supported for the
hazy> custom multi-Agnus screens.

That's one approach; last time I looked (1978), color maps
above about 12 bits deep weren't practical anyway; you
couldn't get the data out fast enough. There's a big range
from 6 to 12 bits, though, where color mapping could still
be useful; and it needs further work to see if a multibank
color look up table could be made to work. For now,
though, a good start would be a system that either used the
current color table capabilities or, as you described, fed
the appropriate bitplanes in without color translation, from
the MSB down as far as pixel bits for each color were
available, to three eight-bit-input video gun control DACs.
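
The "MSB down" feed without color translation can be sketched in a few lines of Python. This is an illustration under assumptions, not a description of Denise: the packing order (red in the high bits, then green, then blue), the equal bit count per gun, and the names `to_dac` and `pixel_to_rgb` are all hypothetical.

```python
def to_dac(value, bits):
    """Left-justify a `bits`-wide gun value in an 8-bit DAC input.

    The available bits land at the MSB end; the unused low bits stay 0.
    """
    if not 0 < bits <= 8:
        raise ValueError("bits per gun must be between 1 and 8")
    return (value << (8 - bits)) & 0xFF

def pixel_to_rgb(pixel, bits_per_gun):
    """Split a direct-color pixel into three 8-bit DAC inputs.

    Assumed packing: red in the high bits, then green, then blue.
    """
    mask = (1 << bits_per_gun) - 1
    red = (pixel >> (2 * bits_per_gun)) & mask
    green = (pixel >> bits_per_gun) & mask
    blue = pixel & mask
    return tuple(to_dac(gun, bits_per_gun) for gun in (red, green, blue))

# A 12-bit pixel, 4 bits per gun.
print(pixel_to_rgb(0xF80, 4))
```

One consequence of plain left-justification is that full intensity on a 4-bit gun reaches 0xF0 rather than 0xFF; replicating the high bits into the low ones would fix that, at the cost of more logic.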

hazy> System software wouldn't have a clue about which
hazy> Agnus/chipram bank to deal with; plain memory writes,
hazy> allocations, etc would work fine, but any addressing of
hazy> Agnus would have to change.

Some. Putting the data out there would require bank
switching software or new hardware that could write a pixel
to all banks at once; blitting it around and so on could be
done in parallel with identical instruction addresses (and
different "masks"/blit equation data) sent to your multiple
blitters, if synchronization of the addresses between banks
were maintained.
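
A rough Python sketch of the parallel case, assuming the banks stay address-synchronized: every bank's blitter gets the same source, destination, and length, and only the per-bank logic operation differs. The function name `blit_all_banks` and the use of Python callables in place of real blitter minterms are assumptions made for illustration; overlapping source and destination ranges are not handled.

```python
def blit_all_banks(banks, src, dst, length, ops):
    """Apply the same src/dst/length in every bank, with a per-bank byte op.

    `ops[k]` takes (source_byte, dest_byte) and returns the new dest byte
    for bank k. This stands in for per-bank blit equation data.
    """
    for bank, op in zip(banks, ops):
        for i in range(length):
            bank[dst + i] = op(bank[src + i], bank[dst + i]) & 0xFF

# Two hypothetical 16-byte banks with different contents.
banks = [bytearray(16), bytearray(16)]
banks[0][0:2] = b"\xAA\x55"
banks[1][0:2] = b"\x0F\xF0"
banks[1][8:10] = b"\xFF\xFF"

ops = [
    lambda s, d: s,        # bank 0: straight copy
    lambda s, d: s ^ d,    # bank 1: XOR source into destination
]
blit_all_banks(banks, 0, 8, 2, ops)
print(banks[0][8:10].hex(), banks[1][8:10].hex())
```

Note that each operation only ever sees bytes from its own bank, which is exactly the restriction that makes this simple.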

This implies some more not terribly nice stuff, though.
Suppose you have a picture big enough to only allow three
planes per bank, with working storage for off screen blocks.
If the user is delighted with 32 colors, then the second
bank only has two pixel bit planes; its third plane slot
goes to waste, to keep the addresses synchronized, and your
processing has to take into account that the data in the
sixth plane is invalid.  Easier to explicitly zero it and
lie about the colors for that plane.

There would also be an allocation strategy for bit planes
versus banks.  If your application needs eight bit planes
and you have four banks, you take a lesser hit per bank
if you put two bit planes in each than if you put five in
bank one, three in bank two, and leave banks three and four
unused.
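
The arithmetic behind that can be sketched in Python. The key assumption, taken from the synchronized-address scheme above, is that every bank must reserve as much space as the fullest bank; the function names and the 640 x 400 plane size are hypothetical choices for the example.

```python
def spread_planes(num_planes, num_banks):
    """Deal bit planes out as evenly as possible; returns planes per bank."""
    base, extra = divmod(num_planes, num_banks)
    return [base + (1 if i < extra else 0) for i in range(num_banks)]

def per_bank_footprint(layout, plane_bytes):
    """Bytes each bank gives up, assuming synchronized addresses force
    every bank to reserve as much as the fullest bank."""
    return max(layout) * plane_bytes

def wasted_slots(layout):
    """Plane-sized slots reserved but left empty across all banks."""
    return max(layout) * len(layout) - sum(layout)

PLANE_BYTES = 640 * 400 // 8     # hypothetical 640 x 400 plane, 32000 bytes

even = spread_planes(8, 4)       # two planes in each of four banks
lopsided = [5, 3, 0, 0]          # the uneven split from the text

print(even, per_bank_footprint(even, PLANE_BYTES), wasted_slots(even))
print(lopsided, per_bank_footprint(lopsided, PLANE_BYTES), wasted_slots(lopsided))
```

The same helper reproduces the 32-color case: five planes over two banks come out as three and two, with one slot wasted.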

hazy> And you would have to special case blits in-bank vs. blits
hazy> between banks.

I wanted to say you wouldn't have any between bank blits,
but then the case of blitting the front bit plane against
the back one of the same picture occurred to me as a
possible exception; if there is one, there are likely
hundreds, so the problem has to be solved in general. Damn.
This probably invalidates the stuff I said above about easy
blitter control.

I suppose you end up with some message passing architecture
like the systolic array processors or transputer chips use.
Needs more work.

Worse, with more bits per pixel, you have given the user the
ability to want much more complex within-pixel manipulations
to occur. Think of color cycling without a color lookup
table for one horrible possibility. Think of a typical image
processing convolution for another (not that the latter
couldn't be done in fast ram with the CPU, but how much
nicer if the blitters could help).

hazy> In other words, a hairy kludge, but still something that
hazy> could, with enough imagination, work.

I can imagine problems faster than I can imagine solutions,
but it is kind of fun, and if I throttle my urge to general
purpose things every step of the way, it doesn't grow
NP-hard quite so fast. ;-)

hazy> I would estimate the software effort to put this kind of
hazy> support into graphics.library just a little easier than what
hazy> it would take to support arbitrary video display devices.

With the simplification to synchronized addresses, does that
help any, or does the inherently greater complexity of
blitting overwhelm that?

hazy> Then again, I would really dig playing around with a blitter
hazy> per color, or better yet, a blitter per bitplane.

Not until you start sleeping at home nights. The word is out
on you, Dave. Trust me, I've been there, it's a _bad_
mistake. Ask the nearest shrink about caffeinosis. Damn near
killed me in 1982.

Still, the siren call of ever more powerful hardware and
software, grown in an evolutionary way from a strong base,
is hard to resist.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
--
Ready for the next barrage, and holding at 300 articles behind
in comp.sys.amiga.  Is it time to start the campaign to further
subdivide the group, now that the comp.sys.amiga.games vote is
done?  I'll do the organizing if any large number of people are
interested/in favor.  If we start right away we can be done before
Christmas break.  Email suggestions, I really am having trouble
keeping up here, since I follow dozens of other groups too.