[comp.sys.amiga] Bank switched CHIP RAM?

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (10/01/90)

Part of the problem with trying for higher resolution/more colors
is not invalidating the existing OS/application software base.
Working (in this area) from a knowledge base of nearly zero, let
me blunder on ahead with Yet Another Proposed Improved Graphics
(YAPIG) for the Amiga family.

When Apple upgraded the Apple ][+ (64K box) to the Apple ][e (128K
box), they exceeded the addressing capability of the 6502 chip,
sort of like going past 2M would exceed the addressing capability
of 2M Agnus. They chose to make the memory addressable by the chip
in banks, with some sort of register where the memory could be
flipped (a low page was mapped common, and some other details not
too interesting in the spitballing part of a product's
evolutionary planning).

Agnus, I suspect, would be Fat, Dumb and Happy addressing _any_
two meg, as long as we didn't bother to let the old girl know the
rug had been pulled out from under her and a new one run in.

So here's my question: since we can already work with bitmaps at
least 1K x 1K, we can promote pretty good resolution. With a
(possibly third party) add on board that gave bank switched CHIP
RAM in two meg chunks, dual ported with a really zingy set of
video DAC's on the other side to pull up to (24-48, you pick a
number) bits of color off the other side with or without a color
look up table, and defaulting to act just like the current 2M CHIP
RAM, how big a hit would existing software take to shoehorn in
support for the bank switching commands?

The goal, of course, is to switch from bank to bank writing the
six or so bitplanes currently supportable until all the banks
needed to support (12, 18, 24, 30, 36, 42, 48) bitplanes to do one
modification of the picture had been done. Each bank would have
the usual slop left around all over for offscreen copies of
information hidden onscreen, and so forth.

The immediate difficulty I see is that you'd probably want to cut
overhead by writing everything for one bank before switching, but
current software probably would be more comfortable going pixel
deep by pixel deep.
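
That ordering trade-off can be put in numbers with a small count of bank
switches. This is only a sketch: the function names are made up for
illustration, the six-planes-per-bank figure is taken from the post, and
treating a bank switch as one countable event is an assumption, not a
description of any real hardware.

```python
# Hypothetical model: count bank switches for two write orders.
PLANES_PER_BANK = 6  # "six or so bitplanes" per bank, as in the post


def banks_needed(bitplanes: int) -> int:
    """Banks required to hold the given number of bitplanes (ceiling division)."""
    return -(-bitplanes // PLANES_PER_BANK)


def switches_bank_major(bitplanes: int, pixels: int) -> int:
    """Finish all writes for one bank before moving to the next bank."""
    if pixels == 0:
        return 0
    return banks_needed(bitplanes) - 1


def switches_pixel_major(bitplanes: int, pixels: int) -> int:
    """Write each pixel to its full depth before moving to the next pixel."""
    banks = banks_needed(bitplanes)
    if pixels == 0 or banks == 1:
        return 0
    # Every pixel visits every bank in turn, then the next pixel starts
    # again at the first bank; the very last return is not needed.
    return banks * pixels - 1


if __name__ == "__main__":
    for depth in (12, 24, 48):
        print(depth, banks_needed(depth),
              switches_bank_major(depth, 1000),
              switches_pixel_major(depth, 1000))
```

Under these assumptions the bank-major order needs a handful of switches
per update, while the pixel-by-pixel order needs several per pixel.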

On the hardware side, I'm guessing you'd have to pretty much
hijack the mother board's memory access; would an eight inch
ribbon cable and the additional add in board trace lengths kill
the CHIP memory timing, or is there some slack in there that
corresponds to the time to get to and from FAST RAM?

This isn't the world's prettiest solution (can you say "segmented
architectures reek?") but if it's feasible at all it could move
us off the dime without blowing away the chance to be a little
more clever further down the line.  If it looks doable, could the
good folks at "CBU" (really? I always found them under "Comdre")
set a standard way of addressing a bank flipping mechanism so that
the third parties, and Commodore if they like, could go crazy with
new hardware?  I'd love to see a solution that gates NTSC 24 bit
color plus 8 bits of overlay, or maybe 48 bits for double buffering
(actually you can fit two 1K x 1K x 6 maps per bank, I think, so
it wouldn't have to be _real_ 48 bit), today, with an easy upgrade
path to whatever HDTV standard finally emerges tomorrow.
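
The "two 1K x 1K x 6 maps per bank" guess can be checked with a few lines
of arithmetic. A minimal sketch, assuming planar bitmaps with one bit per
pixel per plane and no padding; the names are illustrative only.

```python
BANK_BYTES = 2 * 1024 * 1024  # one 2M bank of CHIP RAM


def bitmap_bytes(width: int, height: int, planes: int) -> int:
    """Bytes for a planar bitmap: one bit per pixel in each plane."""
    return width * height * planes // 8


six_plane_map = bitmap_bytes(1024, 1024, 6)      # 786432 bytes (768K)
maps_per_bank = BANK_BYTES // six_plane_map      # whole maps that fit
leftover = BANK_BYTES - maps_per_bank * six_plane_map

if __name__ == "__main__":
    print(six_plane_map, maps_per_bank, leftover)
```

So two such maps do fit in a 2M bank, with 512K left over for offscreen
scratch space.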

If the video side of the port were widely controllable, this tool
might even be able to do most of HDTV just by downloading some
numbers describing the new scan dimensions and rates.

I know this is either three bricks shy of a load, or a little
under half baked, but what do you expect for free?

Fire when ready, Gridley!  ;-)

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

daveh@cbmvax.commodore.com (Dave Haynie) (10/03/90)

In article <1990Sep30.233751.3244@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:

>When Apple upgraded the Apple ][+ (64K box) to the Apple ][e (128K
>box), they exceeded the addressing capability of the 6502 chip,
>sort of like going past 2M would exceed the addressing capability
>of 2M Agnus. They chose to make the memory addressable by the chip
>in banks, with some sort of register where the memory could be
>flipped (a low page was mapped common, and some other details not
>too interesting in the spitballing part of a product's
>evolutionary planning).

This was hardly a new trick at Apple -- CP/M machines did the same kind of
thing for CP/M 3.0, Commodore B128s did this early on, and the Commodore 128 
did it a few years after the larger Apples came around.  You see the latest
incarnation of this in the LIM (Lotus, Intel, Microsoft) banked memory 
scheme for MS-DOS machines.  The word that correctly describes all of these
schemes is:

		K   K	L	U    U	DDDDD	 GGGG	EEEEEE
		K  K	L	U    U	D    D	G    G	E
		K K	L	U    U	D    D	G	E
		KK	L	U    U	D    D	G	EEE
		K K	L	U    U	D    D	G  GGG	E
		K  K	L	U    U  D    D	G    G	E
		K   K	LLLLL	 UUUU	DDDDD	 GGGG	EEEEEE

>Agnus, I suspect, would be Fat, Dumb and Happy addressing _any_
>two meg, as long as we didn't bother to let the old girl know the
>rug had been pulled out from under her and a new one run in.

That's true.  However, you would then have two completely disjoint chunks of
Chip memory.  No piece from one bank could have any effect on any piece of
the other bank, or disaster strikes.  How would you use the second bank if,
for instance, Workbench and all the system gadgets are in the first bank?
You can't even blit between the two banks.

>So here's my question: since we can already work with bitmaps at
>least 1K x 1K, we can promote pretty good resolution. With a
>(possibly third party) add on board that gave bank switched CHIP
>RAM in two meg chunks, dual ported with a really zingy set of
>video DAC's on the other side to pull up to (24-48, you pick a
>number) bits of color off the other side with or without a color
>look up table, and defaulting to act just like the current 2M CHIP
>RAM, how big a hit would existing software take to shoehorn in
>support for the bank switching commands?

Well, even if you're attempting to organize this as "deeper" rather than
"more" banked memory, you're going to get into trouble.  For example, video
fetch DMA isn't the only DMA going on -- you have memory refresh DMA, which
would have to be applied to every bank, every time, to work.  You have floppy
disk DMA, which would wind up dumping to whatever bank was in context unless
you found a way to ensure the alternate banks are in context only during
video fetch.  And without a Denise doing the display, I don't see any
advantage to this scheme vs. a plain old VRAM based visual RAM display.  You
couldn't support multiple screens; the area of memory that's displayed would
be fixed.  Etc. and so forth.  Even setting up a system with more Agnus,
Denise, and memory subsystems would be easier to deal with than this banking
idea, which is just about impossible to consider in the context of what Agnus
does.

You have to realize that the CPU bank switching was under very controlled
conditions.  Those systems only have one processor playing with memory; we
have two.  All the software that supports their bank switching is written
with that switching in mind, and all software that doesn't support the
banking doesn't see it happen.  And of course, that's much easier to control
when you only have to deal with a single program running.

Now, the idea of supporting multiple Agnus chips could work a bit better, though
still not optimally.  Any program that doesn't know about the extra Agnus
chips could go about its merry way; all disk DMA probably happens only in
the main Agnus.  Each bank has its own blitter, refresh, etc.  Programs
that know about the extra Agnus systems would run the same routines, only
with an offset to pick the extra banks.  What you would really like as well
is some kind of Denise-without-colormapping mode to be supported for the
custom multi-Agnus screens.  System software wouldn't have a clue about which
Agnus/chipram bank to deal with; plain memory writes, allocations, etc. would
work fine, but any addressing of Agnus would have to change.  And you would
have to special case blits in-bank vs. blits between banks.  In other words,
a hairy kludge, but still something that could, with enough imagination, 
work.  I would estimate the software effort to put this kind of support into
graphics.library just a little easier than what it would take to support
arbitrary video display devices.  Then again, I would really dig playing
around with a blitter per color, or better yet, a blitter per bitplane.
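
The in-bank versus between-bank special case might look something like
the sketch below. Everything here is hypothetical: the flat address
layout with one 2M window per Agnus, the names, and the CPU-copy fallback
for cross-bank transfers are assumptions for illustration, and real blits
are rectangular rather than the simple linear ranges modeled here.

```python
BANK_SIZE = 2 * 1024 * 1024  # assumed: each Agnus owns one 2M window


def bank_of(addr: int) -> int:
    """Index of the (hypothetical) Agnus bank that holds this address."""
    return addr // BANK_SIZE


def local_offset(addr: int) -> int:
    """Offset within the bank, i.e. the address that bank's Agnus would see."""
    return addr % BANK_SIZE


def plan_blit(src: int, dst: int, length: int):
    """Decide how a linear copy of `length` bytes would have to be done."""
    if length <= 0:
        raise ValueError("length must be positive")
    src_bank, dst_bank = bank_of(src), bank_of(dst)
    # A range that straddles a bank boundary has to be split up first.
    if bank_of(src + length - 1) != src_bank or bank_of(dst + length - 1) != dst_bank:
        return ("split", None)
    if src_bank == dst_bank:
        # Same bank: that bank's own blitter can do the whole job.
        return ("blitter", src_bank)
    # Different banks: no single blitter sees both ends.
    return ("cpu_copy", None)
```

For example, plan_blit(0x1000, 0x2000, 256) stays inside bank 0, while a
destination of BANK_SIZE + 0x2000 forces the special case.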

>Kent, the man from xanth.
><xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	Standing on the shoulders of giants leaves me cold	-REM

d87sg@efd.lth.se (Svante Gellerstam) (10/04/90)

In article <1990Sep30.233751.3244@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>
> < lots of inspiring ideas on bank switched CHIPRAM deleted :->
>

Ok, this is a way to get more bits per EACH color. It would not give
many more colors than presently available. And there's the thing with
the speed of the current chipset. Today it works just fine with our
normal resolutions, but try to run the WB on a 16-color HiRes +
Interlace screen on an A3000. Even on a 25MHz '030 a measly
(professionally speaking) 640 x 400 screen becomes zippy as old
chewing gum.

The big step would be to break away from the current CHIP-mem +
generic chipset architecture (not replace, just add to, that is). The
really hairy problem of getting a hires display (say 1280 x 1024 x 8
plus) is mainly getting the bits onto the screen and then giving the CPU
or graphics coprocessor time to do its stuff.

Let's assume we have a graphics adapter that can display 1280 x 1024 x
(whatever) at 50 (ok 60) frames per sec. Some calculations show that
we cannot meet that bandwidth even if we only allow video access to
CHIP. Clearly we need some other memory running at full speed to be
able to meet the screen DMA bandwidth. 
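
A rough version of that calculation, as a sketch only. The chip bus
figure is an assumption (one 16-bit transfer per roughly 280 ns slot,
with every slot given to video), and blanking intervals are ignored, so
the real display requirement would be somewhat different.

```python
def display_bytes_per_second(width: int, height: int,
                             bits_per_pixel: int, frames_per_second: int) -> int:
    """Raw pixel data that must reach the screen each second."""
    return width * height * bits_per_pixel * frames_per_second // 8


# Assumption: 2 bytes per ~280 ns chip bus slot, all slots used for video.
CHIP_BUS_BYTES_PER_SECOND = 2 * 1_000_000_000 // 280

needed_60hz = display_bytes_per_second(1280, 1024, 8, 60)
needed_50hz = display_bytes_per_second(1280, 1024, 8, 50)

if __name__ == "__main__":
    print(needed_60hz, needed_50hz, CHIP_BUS_BYTES_PER_SECOND)
    print(needed_60hz / CHIP_BUS_BYTES_PER_SECOND)
```

Even at 8 bits per pixel the display alone would need roughly ten times
what the assumed chip bus could deliver.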

One hardware event that would make the bank switching idea
semi-interesting is an async chipset running at lots of nice little
MHz's. Then they would be able to more fully use an area of high speed
video RAM. But that would also mean that every special graphics
adapter would have to be specially designed for the Amiga. This
implies high R&D costs and a high end price. 

Compatibility? Ok, as the only thing I am suggesting is a way for the
system to use other video hardware transparently, the current modes
and compatibility will remain 100% intact. The device.driver method
seems able to accommodate all wishes graphics-wise. Just as the
dos.library uses a device.driver to talk to its disk-volumes (through
filesystems) the graphics.library should use a device(screen ?).driver
to talk to its output device. To use the normal CHIP/generic chipset
setup the driver would simply call on the current graphics.library.
Existing software would just hit the firmware d:-) as usual. I'm
interested in discussing details. 

>I know this is either three bricks shy of a load, or a little
>under half baked, but what do you expect for free?

All ideas have some use :-)

>Kent, the man from xanth.
><xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

-- 
        2:200/107.4 Svante Gellerstam (Fido) d87sg@efd.lth.se (InterNet)
	     It's the african anteater ritual! -- Can't Buy Me Love

mueller@hatteras.cs.unc.edu (Carl Mueller) (10/05/90)

In article <1990Oct3.194556.7031@lth.se> d87sg@efd.lth.se (Svante Gellerstam) writes:
>In article <1990Sep30.233751.3244@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>> < lots of inspiring ideas on bank switched CHIPRAM deleted :->
>
>Ok, this is a way to get more bits per EACH color. [...]
>Even on a 25MHz '030 a measly (professionally speaking) 640 x 400
>screen becomes zippy as old chewing gum.

Remember, it's still the slow graphics chips doing a lot of the work
here, not the '030.

>The big step would be to break away from the current CHIP-mem +
>generic chipset architecture (not replace, just add to, that is).

Anything done to increase the capabilities must definitely be an
extension of some sort, to maintain compatibility.

> The really hairy problem of getting a hires display (say 1280 x 1024 x
>8 plus) is mainly getting the bits onto the screen and then giving the
>CPU or graphics coprocessor time to do its stuff.
[...]
>One hardware event that would make the bank switching idea
>semi-interesting is an async chipset running at lots of nice little
>MHz's. Then they would be able to more fully use an area of high speed
>video RAM. But that would also mean that every special graphics
>adapter would have to be specially designed for the Amiga. This
>implies high R&D costs and a high end price. 

Whoa!  You don't need all that.  That's what video RAMs were designed
to get around:  they have two access ports.  The CPU (or whatever) can
access that V-RAM on the main port as usual.  The V-RAM has a second
serial access read-port which the display refresh hardware would use
to update the screen.  The two ports can be active at the same time.

Integrating these devices into the Amiga would still require a lot of
custom design.  At this point, nobody should expect a color megapixel
display to come cheap.  But I'm sure we'd all like to see SOMETHING
available soon, even if it cost as much as a Mac-type video adapter.

It's also definitely agreed that there will have to be a defined
standard for the OS to use an add-on display.  This I'm sure will be
a subject for a lot of heated debate.  Some features of the current
chip-set just might not be available in all add-on boards, such as
pull-down screens with different display modes.

This is an area where I'm hoping things will start happening NOW!
Currently, the Mac seems to be the only computer where this is
handled well (i.e. OS & programs working well with different display
adapters).  Have you ever worked on a Mac with a color and a black
and white display attached and noticed how you can smoothly drag a
window from one to the other?  True, it's a simple thing, and some
people will have no use for it, but many people WILL!

In summary, the OS support should be developed ASAP, and then the
hardware can follow.  Remember, I'm talking about add-on hardware,
not necessarily included with a future Amiga.  That way, people
who don't need it won't have to shell out the $$.

Well, I think I've rambled enough for now!

>All ideas have some use :-)

>>Kent, the man from xanth.
>><xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

>        2:200/107.4 Svante Gellerstam (Fido) d87sg@efd.lth.se (InterNet)

-Carl Mueller (mueller@cs.unc.edu)

swarren@convex.com (Steve Warren) (10/05/90)

In article <1990Oct3.194556.7031@lth.se> d87sg@efd.lth.se (Svante Gellerstam) writes:
                               [...]
>Interlace screen on an A3000. Even on a 25MHz '030 a measly
>(professionally speaking) 640 x 400 screen becomes zippy as old
>chewing gum.
                               [...]
The solution to the bandwidth problem is banks.  We already have two
separate banks of chip mem in the 3000 with 2 Megs of chip.  The trick
is to set it up so that as long as the video DMA is accessing one of the two
banks, the CPU can have transparent access to the other bank simultaneously
(ie no cycle-stealing necessary).  The best way to utilize this is by using
an 8- or 16-byte interleave, so that consecutive 16-byte blocks will be
in alternating banks.

The CPU could then write at full speed in a 16-byte space until it ran
into the back end of the next 16 bytes in the other bank (which the video
DMA would be accessing).  Then it would be wait-stated until the video DMA
moved into the first bank again, at which time the CPU could continue
writing at the next locations.  This would give the CPU greater bandwidth to
chip mem at the highest available screen resolution than it currently gets at
the lowest resolution.  This setup would lower chip-mem contention
significantly, probably to the point that it would no longer be noticeable.
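
The address mapping behind this can be sketched in a few lines. This is
a toy model, not a description of the 3000's actual memory controller:
two banks, a 16-byte stripe, and the rule that the CPU is wait-stated
only when it targets the bank the video DMA is currently using are all
taken as assumptions from the description above.

```python
INTERLEAVE = 16  # bytes per stripe; 8 would work the same way


def bank_of(addr: int, interleave: int = INTERLEAVE) -> int:
    """Which of the two banks an address falls in under the interleave."""
    return (addr // interleave) % 2


def cpu_must_wait(cpu_addr: int, dma_addr: int,
                  interleave: int = INTERLEAVE) -> bool:
    """True when the CPU and the video DMA want the same bank this cycle."""
    return bank_of(cpu_addr, interleave) == bank_of(dma_addr, interleave)


if __name__ == "__main__":
    # The CPU writes the first stripe while DMA reads the second: no stall.
    print(cpu_must_wait(0, 16))
    # The CPU runs into the stripe the DMA is still reading: it stalls.
    print(cpu_must_wait(16, 16))
```

With the two accessors marching through memory a stripe apart, most CPU
accesses in this model land in the bank the video DMA is not using.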

This would be fairly simple to implement.  The chip-mem system is already dual
ported.  The dual ports need to be extended into the two banks so that
they may be separately addressed on the same cycle, and the address mapping
of the 2 banks relative to each other would change.

The bandwidth requirements at 1280 x 1024 are another matter.  Standard DRAMs
can be used for that but you have to go to a tighter interleave or a wider
data bus to the video section.  A wider data bus would probably be the
cheapest solution.  Now that there are 1 Mbit DRAM chips with 16-bit data
busses this solution is not unreasonably expensive.  With four chips you can
get a 512 Kbyte bank with a 64-bit data bus.
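
The chip arithmetic works out as follows. A small sketch, assuming 1 Mbit
parts organized 64K x 16 and all chips wired side by side onto one wide
bus; the access-rate figure at the end reuses the 1280 x 1024 x 8 at 60
frames per second example from earlier in the thread and ignores
blanking and refresh.

```python
CHIP_BITS = 1024 * 1024   # one 1 Mbit DRAM
CHIP_DATA_WIDTH = 16      # data pins per chip


def bank_geometry(chips: int):
    """Return (bus width in bits, capacity in bytes) for chips in parallel."""
    return chips * CHIP_DATA_WIDTH, chips * CHIP_BITS // 8


def accesses_per_second(bytes_per_second: int, bus_bits: int) -> int:
    """Full-width bus accesses needed to move the given byte rate."""
    return bytes_per_second // (bus_bits // 8)


bus_bits, bank_bytes = bank_geometry(4)
video_rate = 1280 * 1024 * 8 * 60 // 8   # bytes per second of pixel data

if __name__ == "__main__":
    print(bus_bits, bank_bytes)
    print(accesses_per_second(video_rate, bus_bits))
```

Four chips do give a 64-bit bus and a 512 Kbyte bank, and at that width
the example display would need on the order of ten million accesses per
second.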

Note the follow-up line.

--
            _.
--Steve   ._||__      DISCLAIMER: All opinions are my own.
  Warren   v\ *|     ----------------------------------------------
             V       {uunet,sun}!convex!swarren; swarren@convex.COM