[comp.sys.amiga.hardware] Bank switched CHIP RAM?

swarren@convex.com (Steve Warren) (10/05/90)

In article <1990Oct3.194556.7031@lth.se> d87sg@efd.lth.se (Svante Gellerstam) writes:
                               [...]
>Interlace screen on a A3000. Even on a 25MHz '030 a measly
>(professionally speaking) 640 x 400 screen becomes zippy as old
>chewing gum.
                               [...]
The solution to the bandwidth problem is banks.  We already have two
separate banks of chip mem in the 3000 with 2 Megs of chip.  The trick
is to set it up so that as long as the video DMA is accessing one of the two
banks, the CPU can have transparent access to the other bank simultaneously
(i.e., no cycle-stealing necessary).  The best way to utilize this is an
8- or 16-byte interleave, so that every 16 consecutive bytes land in
alternating banks.

The CPU could then write at full speed in a 16-byte space until it ran
into the back end of the next 16 bytes in the other bank (which the video
DMA would be accessing).  Then it would be wait-stated until the video DMA
moved into the first bank again, at which time the CPU could continue
writing at the next locations.  This would give the CPU greater bandwidth to
chip mem at the highest available screen resolution than it currently gets at
the lowest resolution.  This setup would lower chip-mem contention
significantly, probably to the point that it would no longer be noticeable.
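The scheme is easy to model.  Here is a toy simulation (in modern Python,
purely illustrative - the addresses, the one-access-per-cycle timing, and the
2-bytes-per-cycle DMA rate are made-up parameters, not real Amiga numbers):

```python
# Toy model of the 16-byte bank interleave described above (all
# parameters are illustrative assumptions, not real Amiga hardware).

INTERLEAVE = 16  # bytes per bank before switching to the other bank

def bank(addr):
    """Which of the two chip-RAM banks holds this byte address."""
    return (addr // INTERLEAVE) % 2

def simulate(cycles=32):
    """Per cycle: video DMA scans display memory sequentially (2 bytes
    per cycle); the CPU tries to write its own region and is wait-stated
    only when both requests land in the same bank."""
    cpu_addr, granted, stalled = 0x10000, 0, 0
    for t in range(cycles):
        video_addr = t * 2                 # DMA walks display memory
        if bank(cpu_addr) != bank(video_addr):
            cpu_addr += 2                  # CPU write proceeds
            granted += 1
        else:
            stalled += 1                   # CPU wait-stated this cycle
    return granted, stalled

print(simulate())  # -> (24, 8)
```

In this toy run the CPU eats a burst of wait states until its writes fall into
step with the video DMA, and from then on it is granted every cycle - which is
the point of the interleave.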

This would be fairly simple to implement.  The chip-mem system is already dual
ported.  The dual ports need to be extended into the two banks so that
they may be separately addressed on the same cycle, and the address mapping
of the 2 banks relative to each other would change.

The bandwidth requirements at 1280 x 1024 are another matter.  Standard DRAMs
can still be used, but you have to go to a tighter interleave or a wider
data bus to the video section.  A wider data bus would probably be the
cheapest solution.  Now that there are 1 Mbit DRAM chips with 16-bit data
buses, this solution is not unreasonably expensive.  With four chips you can
get a 512 Kbyte bank with a 64-bit data bus.
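As a rough sanity check of those numbers (assuming 8 bits per pixel and a
60 Hz refresh, neither of which is specified above):

```python
# Back-of-the-envelope check of the 1280 x 1024 video bandwidth claim.
# The 60 Hz refresh and 8 bits/pixel are assumptions for illustration.

width, height = 1280, 1024
bits_per_pixel = 8
refresh_hz = 60

bytes_per_frame = width * height * bits_per_pixel // 8
video_bw = bytes_per_frame * refresh_hz        # bytes/second, ignoring blanking

bus_bytes = 8                                  # 64-bit bus from four x16 DRAMs
fetches_per_sec = video_bw / bus_bytes
ns_per_fetch = 1e9 / fetches_per_sec

print(f"video bandwidth: {video_bw / 1e6:.1f} MB/s")
print(f"time budget per 64-bit fetch: {ns_per_fetch:.0f} ns")
```

That works out to roughly 79 MB/s, or about 100 ns per 64-bit fetch - which is
why a 64-bit bus of ordinary-speed DRAMs is in the right ballpark.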

Note the follow-up line.

--
            _.
--Steve   ._||__      DISCLAIMER: All opinions are my own.
  Warren   v\ *|     ----------------------------------------------
             V       {uunet,sun}!convex!swarren; swarren@convex.COM

d87sg@efd.lth.se (Svante Gellerstam) (10/09/90)

In article <106878@convex.convex.com> swarren@convex.com (Steve Warren) writes:
>In article <1990Oct3.194556.7031@lth.se> d87sg@efd.lth.se (Svante Gellerstam) writes:
>                               [...]
>>Interlace screen on a A3000. Even on a 25MHz '030 a measly
>>(professionally speaking) 640 x 400 screen becomes zippy as old
>>chewing gum.
>                               [...]
>The solution to the bandwidth problem is banks.  We already have two
>separate banks of chip mem in the 3000 with 2 Megs of chip.  The trick
>is to set it up so that as long as the video DMA is accessing one of the two
>banks, the CPU can have transparent access to the other bank simultaneously
>(ie no cycle-stealing necessary).  The best way to utilize this is by using
>an 8- or 16- byte interleave, so that every 16 consecutive bytes will be
>in alternating banks.
>
>   < stuff deleted >...
>

This is a method I have not considered - interesting!  But I really
think a more device-oriented solution is better - you send commands
to a video device and get results on the screen - this enables all
types of implementations.  As it is now (and in the line of thought of
most people), all video adapters have to positively hammer the machine to
be both compatible and able to meet, say, 1280 x 1024 x 8 bandwidth.

It is of course possible to build some kind of video DMA that will
allow whatever is desired with the current setup - the blitter (and
friends) does the graphics housekeeping.  The thing is, they are too
slow to cope with hi-res and many-colored screens.

Thus we need some agreed-upon (sanctioned by Commodore) protocol that
allows any type of video adapter (anything from 80 x 25 text-only to 1600 x
1280 x 24) to be connected and usable.  That is - the Workbench should
run fine on it, and applications should have the option of interrogating
the system for available resolutions and color modes.

The main point here is not to build the damn thing to be compatible
with present standards, but to define a way to connect every
conceivable type of video adapter.

Building something to be locked into the existing structure is an
indication of regression in development d;^).  We already have all
these beautiful dynamically loaded libraries and device drivers and
separate filesystems - why not device-independent graphics?

Apart from being a way to control pee wee's do-it-yourself
megapixel display, it would be a way to control other types of raster
displays, like printers and so on...


(Sorry Steve - MG messed up your footer :-(

-- 
        2:200/107.4 Svante Gellerstam (Fido) d87sg@efd.lth.se (InterNet)
	     It's the african anteater ritual! -- Can't Buy Me Love

lron@easy.HIAM (Dwight Hubbard) (10/09/90)

[stuff deleted]
>
>think a more device-oriented solution is better - you send commands
>to a video device and get results on the screen - this enables all
>types of implementations.  As it is now (and in the line of thought of
>most people), all video adapters have to positively hammer the machine to
>be both compatible and able to meet, say, 1280 x 1024 x 8 bandwidth.

Yes, it would also make the system more flexible, since it would
be possible to send commands to the device directly.  Can you
see drawing a picture on the screen by typing: copy xxxpic to gfx:
It would also open up the possibility of taking commands for the
graphics device from another device and piping them to the
local graphics device.  (It would be nice to run Amiga apps on
one machine and have the window for it on another.)

It would seem to me that a graphics device could be built on top
of the current graphics library.  And while I'm talking about
a graphics device, why not an Intuition device as well...

>Building something to be locked into the existing structure is an
>indication of regression in  development d;^). We already have all

Yes, staying locked into an outdated hardware structure is the
primary reason messydos machines are so difficult to use.

>Apart from being a way to control pee wee's do-it-yourself
>megapixeldisplay, it would be a way to control other types of raster
>displays like printers and so on...

Yes, and how about any PC graphics board in one of the PC slots
for those bridgeboard owners.
>
[removed old footers]
--
-Dwight Hubbard,                      |-Kaneohe, HI
-USENET:   uunet.uu.net!easy!lron     |-Genie:    D.Hubbard1
           lron@easy.hiam             |-GT-Power: 029/004

peterk@cbmger.UUCP (Peter Kittel GERMANY) (10/09/90)

In article <106878@convex.convex.com> swarren@convex.com (Steve Warren) writes:
>The solution to the bandwidth problem is banks.  We already have two
>separate banks of chip mem in the 3000 with 2 Megs of chip.  The trick
>is to set it up so that as long as the video DMA is accessing one of the two
>banks, the CPU can have transparent access to the other bank simultaneously
>(ie no cycle-stealing necessary).  

Crap. (Sorry.)  But PLEASE understand that Chip RAM in an Amiga is not solely
used for video!  Much other important data is held there as well.  The second
caveat is that you would lose the big advantage of arbitrarily placeable
bitmaps in memory; you would be forced to arrange them properly across
your banks.  And what happens to our system or sound data during a bank
switch?

-- 
Best regards, Dr. Peter Kittel  // E-Mail to  \\  Only my personal opinions... 
Commodore Frankfurt, Germany  \X/ {uunet|pyramid|rutgers}!cbmvax!cbmger!peterk

jms@tardis.Tymnet.COM (Joe Smith) (10/10/90)

In article <106878@convex.convex.com> swarren@convex.com (Steve Warren) writes:
>The solution to the bandwidth problem is banks.  We already have two
>separate banks of chip mem in the 3000 with 2 Megs of chip.  The trick
>is to set it up so that as long as the video DMA is accessing one of the two
>banks, the CPU can have transparent access to the other bank simultaneously
>(ie no cycle-stealing necessary). This would be fairly simple to implement. 
>The chip-mem system is already dual ported.

One major problem.  The Chip RAM is NOT dual ported.  There is not one set
of address & data lines coming from the CPU and another set from the graphics
chips.  None of the CPU address lines go to the RAM that is Chip memory.
Although the memory system does have separate RAS and CAS strobes, it has
only a single set of row/column address lines.

All accesses to Chip RAM go through Agnus.  The "dual porting" is done inside
Agnus.  Requests to access Chip memory can come from the CPU bus or from
Agnus DMA registers.  It is Agnus that decides whether to honor the external
request or an internal one.  Without Agnus' cooperation, the CPU gets nothing.

True dual porting needs RAM chips twice as fast as the current ones, to
ensure that Agnus gets its data on time in all cases (including when the CPU
gets there first).  Remember, these graphics chips cannot tolerate any wait
states whatsoever.
-- 
Joe Smith (408)922-6220 | SMTP: jms@tardis.tymnet.com or jms@gemini.tymnet.com
BT Tymnet Tech Services | UUCP: ...!{ames,pyramid}!oliveb!tymix!tardis!jms
PO Box 49019, MS-C41    | BIX: smithjoe | 12 PDP-10s still running! "POPJ P,"
San Jose, CA 95161-9019 | humorous disclaimer: "My Amiga 3000 speaks for me."

swarren@convex.com (Steve Warren) (10/10/90)

In article <1280@tardis.Tymnet.COM> jms@tardis.Tymnet.COM (Joe Smith) writes:
>In article <106878@convex.convex.com> swarren@convex.com (Steve Warren) writes:
>>The solution to the bandwidth problem is banks.  We already have two
>>separate banks of chip mem in the 3000 with 2 Megs of chip.  The trick
>>is to set it up so that as long as the video DMA is accessing one of the two
>>banks, the CPU can have transparent access to the other bank simultaneously
>>(ie no cycle-stealing necessary). This would be fairly simple to implement. 
>>The chip-mem system is already dual ported.
>
>One major problem.  The Chip RAM is NOT dual ported.  There is not a set
>of address & data lines coming from the CPU and another set from the graphic
                                    [...]
You need to reread my post, Joe.  I never said the chip *RAM* was dual-ported.
I said the *chip-mem system* was dual ported.  The chip-mem system obviously
includes Agnus.  I stated in my post that the ports would have to be extended
down into the RAM organization by providing separate switchable ports into
each of the two banks.

>All access to Chip RAM go through Agnus.  The "dual porting" is done inside
>Agnus.  Requests to access Chip memory can come from the CPU bus or from
>Agnus DMA registers.  It is Agnus who decides whether to honor the external
>request or an internal one.  Without Agnus' cooperation, the CPU gets nothing.

Of course.  This is not a problem.

>True dual porting needs RAM chips twice as fast as the current ones, to
>ensure that Agnus gets its data on time in all cases (including when the CPU
>gets there first).  Remember, these graphics chips cannot tolerate any wait
>states whatsoever.

This is not true (that the ram chips have to be twice as fast).  I work in
the development of multi-ported memory systems.  We use interleaved banks
of 100 ns DRAMs.  The memory systems we produce are used in CONVEX
supercomputers.  It is the *system* that has to be faster, not the individual
chips.

By properly designing the controller it is possible to use the 2-bank/2-port
system I described, although I did not go into any details.  You would have to
preemptively lock the processor out of the next bank when the graphics
processor is on the last location in the first bank.  These are details, but
I assure you that it is possible, with normal-speed DRAMs, to present the
graphics processor with unrestrained access to the chip mem while providing
the CPU with greatly enhanced bandwidth to the same memory space.  All waits
that were necessary would of course be applied to the CPU, not the graphics
processor.

In any case, I don't think it matters that much, since any decisions of
this nature have most likely already been made for some time.  The solution
I described is quite doable, though, and not especially more complex than
the system already implemented on the 3000 (but it is a little more complex).
More I/O pins would be required to double the number of DRAM
control/address/data ports.  Agnus or something new would be required to do
the muxing of the two ports.  Etc.

--
            _.
--Steve   ._||__      DISCLAIMER: All opinions are my own.
  Warren   v\ *|     ----------------------------------------------
             V       {uunet,sun}!convex!swarren; swarren@convex.COM

daveh@cbmvax.commodore.com (Dave Haynie) (10/11/90)

In article <lron.1045107@easy.HIAM> lron@easy.HIAM (Dwight Hubbard) writes:

>Yes, it would also make the system more flexible since it would
>be possible to send commands to the device directly.  Can you
>see drawing a picture on the screen by typing: copy xxxpic to gfx:

That might be kind of cool.  And in fact, you could write such a GFX:
filesystem, though of course that's not the same thing as pure device
independent graphics -- you don't want the overhead of a filesystem type
server to do all your graphic commands (then again, the X Window System
does it kind of similarly).  I suppose you would want the GFX: device to
support multiple devices.  A preference editor might set up the default
for GFX:, maybe as a window on Workbench.  You could pick a display card
simply by name; maybe "GFX:BuiltIn" for the standard Amiga graphics,
"GFX:A2410" for that ULowell card, etc.  I suppose the best thing for such
a device to do would be for it to speak in a high level graphics language
that's byte stream rather than function call based.  Just like with disks,
there would be a device driver under the GraphicsSystem, and for higher
performance you could look up the particular graphics.device based on the
filing system name for the unit.  Kind of weird, but interesting.  Based on
the fact that it's all done in byte streams, you could do things with filters,
like:

	type pic.ilbm | ilbm_2_gfx | GFX:Builtin/640/400/4

Or somesuch.  Similar filters could handle PHIGS, PostScript, whatever, with
enough effort.  You don't want GFX: to speak ILBM as a native language, but
something much higher level, so you can send it "draw a circle", "fill this
rectangle", etc. type commands.
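As a sketch of what such a byte-stream protocol might look like on the
receiving end (in modern Python; the command names "circle" and "rect" are
invented for illustration, not any real GFX: syntax):

```python
# Sketch of a GFX: handler front end that reads line-oriented drawing
# commands from a byte stream instead of raw bitmaps.  Command names
# and syntax are illustrative assumptions only.

def parse_gfx_stream(stream):
    """Turn a byte stream of drawing commands into (op, args) tuples,
    the way a GFX: handler might before dispatching to a driver."""
    ops = []
    for line in stream.decode("ascii").splitlines():
        fields = line.split()
        if not fields:
            continue                       # skip blank lines
        op, args = fields[0], [int(a) for a in fields[1:]]
        ops.append((op, args))
    return ops

commands = b"circle 320 200 50\nrect 10 10 100 80\n"
print(parse_gfx_stream(commands))
# -> [('circle', [320, 200, 50]), ('rect', [10, 10, 100, 80])]
```

Because the protocol is just bytes, anything that can write a stream - a
filter, a pipe, a network connection - can drive the display, which is
exactly what makes the filter pipeline above work.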


-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	Standing on the shoulders of giants leaves me cold	-REM

swarren@convex.com (Steve Warren) (10/11/90)

In article <489@cbmger.UUCP> peterk@cbmger.UUCP (Peter Kittel GERMANY) writes:
>In article <106878@convex.convex.com> swarren@convex.com (Steve Warren) writes:
>>The solution to the bandwidth problem is banks.  We already have two
>>separate banks of chip mem in the 3000 with 2 Megs of chip.  The trick
>>is to set it up so that as long as the video DMA is accessing one of the two
>>banks, the CPU can have transparent access to the other bank simultaneously
>>(ie no cycle-stealing necessary).  
>
>Crap. (Sorry.)  But PLEASE understand that Chip RAM in an Amiga is not solely
>used for video!  Much other important data is held there as well.  The second
>caveat is that you would lose the big advantage of arbitrarily placeable
>bitmaps in memory; you would be forced to arrange them properly across
>your banks.  And what happens to our system or sound data during a bank
>switch?

Peter, I have had my Amiga since 1985  ;^).

You have missed the point.  I am not talking about "bank-swapping"
a la Apple ][.  All the memory is always there.  These banks do not share
the same physical addresses.  On either side of the memory system there
is no appearance of any difference in the location of data.  You write
data to the same location as always, and any time you read that data, it
is always available at the same address - same data, unless the other
processor has modified the data.  At no time does one bank supersede
the other or present alternate data in the same location.  All chip
memory continues to appear as contiguous addresses, as before.  The devices
accessing it will have no way of knowing that some of the data came from
bank A, while some of the data came from bank B.  It is only a function
of the address mapping.

The only purpose of the banks is to allow you to double your bandwidth
without increasing the speed of your memory chips, by allowing simultaneous
access to the memory by two processors, as long as they do not both make
requests to the same bank.  This is a standard technique utilized in
multiprocessor systems like CONVEXen (see header), of which the Amiga is
an example (although not a general multiprocessor machine).  The technique
can be applied and will improve bandwidth much more cheaply than trying
to make the memory chips go faster.

In order to implement this you must have a 2-way crossbar switch that
would be controlled by Agnus or a relative of Agnus.  On one processor
port of the crossbar you have the CPU and related devices (harddrives, etc).
On the second processor port you have the address-limited devices (all
custom-chip accesses that are limited to *only* chip mem).  I have
referred to all these devices collectively (and erroneously - but I was
just trying to be succinct) as "the graphics processor".  Understand that
by this term I actually mean all the chip-mem limited devices.  On the
other side of the crossbar switch you have bank A chip-mem and bank B
chip-mem connected to the two memory ports of the crossbar switch.

Now the graphics processor has to have preemptive access to chip-mem
(no waiting allowed).  Therefore the CPU must synchronize with the graphics
processor so that arbitration will always give a bank in contention to
the graphics processor.  The CPU may not be granted access to bank B
while the graphics processor is in the middle of a bus cycle in bank A.
But the CPU may be allowed access to bank B at the *start* of a bus
cycle in which the graphics processor is accessing bank A.

In addition, the CPU of course could access either bank at the start
of a cycle in which the graphics processor makes no requests.  Of course
the assumption is that the graphics processor is now so hungry that this
will rarely be the case.  That is why the CPU needs its own independent
port into chip-mem.  Otherwise the CPU rarely gets access (in this
hypothetical system with higher resolution).
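The arbitration rules above boil down to a small predicate.  A sketch (in
modern Python; the function and its arguments are illustrative, not any
real controller interface):

```python
# Minimal encoding of the arbitration rules described above: the
# graphics side is never wait-stated; the CPU is granted a bank only at
# the start of a cycle, and only if the graphics side is idle or is
# starting a cycle in the *other* bank.  Illustrative sketch only.

def grant_cpu(cpu_bank, gfx_bank, gfx_mid_cycle):
    """Return True if the CPU may start an access to cpu_bank now."""
    if gfx_mid_cycle:
        return False          # never break into an in-flight gfx cycle
    if gfx_bank is None:
        return True           # graphics side idle: either bank is free
    return cpu_bank != gfx_bank

# The CPU gets bank B while graphics starts a cycle in bank A...
assert grant_cpu(cpu_bank=1, gfx_bank=0, gfx_mid_cycle=False)
# ...but is wait-stated on a same-bank request, or mid-cycle.
assert not grant_cpu(cpu_bank=0, gfx_bank=0, gfx_mid_cycle=False)
assert not grant_cpu(cpu_bank=1, gfx_bank=0, gfx_mid_cycle=True)
```

All three waits fall on the CPU side of the crossbar, which is what keeps
the graphics processor's zero-wait-state requirement intact.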

I am sorry if my original post was confusing because I did not include
enough details.  My own job is involved in the development of multi-bank/
multi-port interleaved memory systems, so I sometimes take these ideas
for granted.

Feel free to poke holes in this concept; just make sure that it is really
what I am talking about  ;^).

Regards,
--
            _.
--Steve   ._||__      DISCLAIMER: All opinions are my own.
  Warren   v\ *|     ----------------------------------------------
             V       {uunet,sun}!convex!swarren; swarren@convex.COM

d87sg@efd.lth.se (Svante Gellerstam) (10/14/90)

In article <lron.1045107@easy.HIAM> lron@easy.HIAM (Dwight Hubbard) writes:
>[stuff deleted]
>>
>>think a more 'device' oriented solution is better - you send commands
>
>Yes, it would also make the system more flexible since it would
>be possible to send commands to the device directly.  Can you
>see drawing a picture on the screen by typing: copy xxxpic to gfx:
>It would also open up the possibility of taking commands for the
>graphics device from another device and piping them to the
>local graphics device (It would be nice to run Amiga apps on
>one machine and have the window for it on another)

Just to clarify - a device in my text is primarily the 'gfx.device'
part.  But of course you could extend the usability by adding a GFX:
DOS-level device.

If the gfx.device also provided output for the position of the
pointer or some such, the window-on-another-Amiga would be a definite
possibility.  And since Intuition by then could have many screens
connected to one system, you could play the 'move objects across
screens' trick that Mac people talk about.

As an extension, one could imagine a graphic clip device.  Say you had a
network with three stations, two of them being A*000s (latest model :-)
running the latest version of PPage (17.0 or some such :-), using station
#3 as a graphical clipboard.  Then one could put pages that needed
graphics on the third machine for collection from the second.  That
would greatly enhance productivity for many types of applications.

Windowed operations over a modem link would also be possible.  The mind
boggles...  We need more ideas.  I hope someone at Commodore is
working on this.  If someone is - make it an open-ended system.


Regards, Svante

-- 
        2:200/107.4 Svante Gellerstam (Fido) d87sg@efd.lth.se (InterNet)
	     It's the african anteater ritual! -- Can't Buy Me Love

lron@easy.hiam (Dwight Hubbard) (10/16/90)

In article <15057@cbmvax.commodore.com> daveh@cbmvax.commodore.com (Dave Haynie) writes:
>In article <lron.1045107@easy.HIAM> lron@easy.HIAM (Dwight Hubbard) writes:
>
[previous junk deleted]
>That might be kind of cool.  And in fact, you could write such a GFX:
>filesystem, though of course that's not the same thing as pure device
>independent graphics -- you don't want the overhead of a filesystem type
>server to do all your graphic commands (then again, the X Windowing system
>does it kind of similarly).  I suppose you would want the GFX: device to
>support multiple devices.  A preference editor might set up the default
>for GFX:, maybe as a window on Workbench.  You could pick a display card
>simply by name; maybe "GFX:BuiltIn" for the standard Amiga graphics,
>"GFX:A2410" for that ULowell card, etc.  I suppose the best thing for such
>a device to do would be for it to speak in a high level graphics language
>that's byte stream rather than function call based.  Just like with disks,
>there would be a device driver under the GraphicsSystem, and for higher
>performance you could look up the particular graphics.device based on the
>filing system name for the unit.  Kind of weird, but interesting.  Based on
>the fact that it's all done in byte streams, you could do things with filters,
>like:
>
>       type pic.ilbm | ilbm_2_gfx | GFX:Builtin/640/400/4
>
>Or somesuch.  Similar filters could handle PHIGS, PostScript, whatever, with
>enough effort.  You don't want GFX: to speak ILBM as a native language, but
>something much higher level, so you can send it "draw a circle", "fill this
>rectangle", etc. type commands.
>

Very close to what I was thinking.  The graphics.device itself ideally should
be in a ROM on the display card and autoconfig at power-on.  Also, it should
be able to handle the same low-level functions that the current graphics
library handles.  I like the idea of a DOS-level handler because it would
make it possible to access the graphics display as a device, possibly
allowing more than one system to use the same display adapter over a network,
as well as making it more flexible.  I think, however, it would be better
to make a DOS-level Intuition device which would access all the display
adapters and filters installed in the system.  For example:

        type postscriptfile.eps > INTUI:Postpic/640/400/2/2024/EPS

could be used to display an EPS file in a window with 2 colors on a 2024
monitor.
(Suggestion only - your turn to poke it full of holes.)
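For what it's worth, a handler might unpack that path spec something like
this (modern Python sketch; the field order name/width/height/colors/
monitor/filter is just read off the example above and is purely an
assumption):

```python
# Sketch of how an INTUI: handler might split its path specification
# into display parameters.  The field order is an assumption read off
# the example "INTUI:Postpic/640/400/2/2024/EPS", not a real spec.

def parse_intui_path(path):
    """Split 'INTUI:Postpic/640/400/2/2024/EPS' into its display fields."""
    device, rest = path.split(":", 1)
    name, w, h, colors, monitor, filt = rest.split("/")
    return {"device": device, "window": name,
            "width": int(w), "height": int(h),
            "colors": int(colors), "monitor": monitor, "filter": filt}

print(parse_intui_path("INTUI:Postpic/640/400/2/2024/EPS"))
```

The handler would then hand the "filter" field off to the matching
translator (EPS, ILBM, whatever) before any drawing commands reach the
display driver.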
--
-Dwight Hubbard,                      |-Kaneohe, HI
-USENET:   uunet.uu.net!easy!lron     |-Genie:    D.Hubbard1
           lron@easy.hiam             |-GT-Power: 029/004