c162-fe@zooey.Berkeley.EDU (Jonathan Dubman) (04/02/88)
A little while back (in the Mac II vs. Amiga article) I mentioned that I had an idea that would make the blitter obsolete. Well, here it is.

* How would you like to drag windows as smoothly as you drag screens, not just dragging the outline, but dragging the whole window and its contents?

* How would you like arbitrarily fast animation of large regions?

The principle is so simple that I am sure it has been thought of before: EVERY WINDOW IS A HUGE SPRITE. The graphics chip can handle multiple overlapping rectangular regions with differing priorities and (maybe) palettes. In essence, it is more like the copper than the blitter. To move a window, merely change the x and y positions. The chip could handle off-screen movement and clipping very easily. To move a window to the front, merely shuffle the priorities. No refreshing. No software clipping. Can you say, "blindingly fast"?

This ain't no pipe dream. I have a decent idea of how to design it, and I'm not even a "hardware person". The problem with the copper is that it is trying to reload the palette and so forth WHILE it is extremely busy sending bits to the video generation circuitry. It takes one blank line to change resolutions and so forth. A better solution is to separate the functions (like two small parallel processors in a master-slave relationship). One part, the "bottom" part, looks in memory and sends bits to the video generation circuitry (which handles horizontal and vertical sync, D-A conversion, etc.) The "top" part, every once in a while, resets the counter to point to a different bitmap in memory. Jay Miner et al. did the gruntwork during the horizontal retrace, when the copper wouldn't be busy spewing out bits. Well, why not use very simple parallel processing and get around the whole problem? The whole thing is not difficult even in concept, much less in execution. Low memory bandwidth, incredible performance by today's standards. And the windowing idea is not going away.

You do need higher bandwidth if you worry about transparency, which I would insist upon, but parallel processing can lower the bandwidth with very little overhead. And each of the processors is extremely simple- I'm talking the equivalent of a dozen counters and latches, etc. Naturally, there will have to be some finite limit to the "see-through", like 8 deep or something.

THE KEY IDEA

The key is to distinguish that which must change memory from that which need not. A graphics coprocessor that does 3D wire-frame drawing needs to change memory. A windowing chip doesn't NEED to change some big bitmap in memory. Think of the overhead- we spend a lot of time moving bits into a huge bitmap (the screen bitmap) which later we spend a lot of time moving bits out of! Why the silly intermediary step?

Have you ever used a Mac II? The window motion is SOOOOO SLOW. It is very annoying to use. The slowness of the graphics and the inability to distinguish conceptual contexts (you can't have different programs on different screens- you just get everything running on one cluttered screen that each application probably completely fills) are the major reasons preventing me from getting one. (Oh, and also the price.)

Actually, this whole idea started when I was using a Mac II and thinking about how much software complication there is just in moving the mouse pointer around on the screen. The mouse pointer flickered wildly as it was constantly overwritten and updated. Then I thought of sprites and how handy they were, but they were a departure from the elegant (but slow) standard bitmapped screen. And then a friend of mine, Ofer Licht, made the critical step: why not link sprites together, across the screen, to make a huge sprite, and use that for moving small windows? Well, why stop at small windows?

OK- I'm ready for the bad news. Somebody respond and tell me that Fairchild or Texas Instruments or somebody has had the GFX1152 for three years that already does everything I'm saying.
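[A toy software model of the window-as-sprite scheme described above. The descriptor fields (x, y, priority, pixels) and the per-scanline resolution are illustrative assumptions, not any real chip's design: the point is that no big screen bitmap is ever composed; each video line is resolved on the fly from the window descriptors, so moving a window is just changing x and y, and raising one is just renumbering priorities.]

```python
# Toy model of "every window is a huge sprite": no screen bitmap
# exists; each scanline is resolved directly from the descriptors.
# Field names are hypothetical, not from actual hardware.

class Window:
    def __init__(self, x, y, width, height, priority, pixels):
        self.x, self.y = x, y                # move window = change these
        self.width, self.height = width, height
        self.priority = priority             # 0 = front; raise = renumber
        self.pixels = pixels                 # pixels[row][col], 0 = transparent

def scanline(windows, y, screen_width, background=0):
    """Resolve one video line straight from the window descriptors."""
    line = [background] * screen_width
    # walk windows front to back; the first opaque pixel per column wins
    for w in sorted(windows, key=lambda w: w.priority):
        if not (w.y <= y < w.y + w.height):
            continue                         # window not on this line
        row = w.pixels[y - w.y]
        for col in range(w.width):
            sx = w.x + col
            if 0 <= sx < screen_width and line[sx] == background and row[col]:
                line[sx] = row[col]
    return line
```

Dragging a window is then `w.x += dx`, with no bits copied anywhere, which is exactly the claimed win over a blitter.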
cu, *&(Jonathan Dubman)
jesup@pawl8.pawl.rpi.edu (Randell E. Jesup) (04/03/88)
In article <2007@pasteur.Berkeley.Edu> c162-fe@zooey.Berkeley.EDU (Jonathan Dubman) writes:
>A little while back (in the Mac II vs. Amiga article) I mentioned that I had
>an idea that would make the blitter obsolete. Well, here it is.
>The principle is so simple that I am sure it has been thought of before:
>EVERY WINDOW IS A HUGE SPRITE. The graphics chip can handle multiple
>overlapping rectangular regions with differing priorities and (maybe) palette.
>In essence, it is more like the copper than the blitter. To move a window,
>merely change the x and y positions. The chip could handle off-screen movement
>and clipping very easily. To move a window to the front, merely shuffle the
>priorities. No refreshing. No software clipping. Can you say,
>"blindingly fast"?

Yes, it has been thought of before; and yes, there are chips that do things like this. The problem is that you are going to be limited in the number of these on screen at a time. The memory bandwidth problems will impose limitations, as will the circuitry to decide priorities. If you want transparency, you must fetch the bitmaps for EVERY window in their entirety, THEN decide how to handle overlaps.

Sprites work because they are small. The fetching is done early on the line, and 8 sprites is a small enough number that the priority logic is fairly simple. The proper way to think of it is that there are about 800 pixels by 4 bitplanes worth of memory bandwidth per horizontal line. To do windows as sprites, including transparency, that would limit you to about 4 windows in monochrome. You can have more if they aren't on the same horizontal line, or aren't full width, but that really gets to be a pain (look at the problems dealing with VSprites).

>Low memory bandwidth, incredible performance by today's standards. And the
>windowing idea is not going away. You get higher bandwidth if you worry
>about transparency, which I would insist upon, but parallel processing can
>lower the bandwidth with very little overhead. And each of the processors
>is extremely simple- I'm talking equivalent of a dozen counters and latches,
>etc. Naturally, there will have to be some finite limit to the "see-through",
>like 8 deep or something.

The ONLY way you could get your transparency is to have separate memory spaces for each "window processor", and that would be very complicated/expensive. The performance would be nice, but the limitations wouldn't be. Look how annoyed people are by GEM only allowing (4? 8?) windows open at a time. It also would be much more expensive if you want either more windows or more colors.

>Have you ever used a Mac II? The window motion is SOOOOO SLOW. It is very
>annoying to use. The slowness of the graphics and the inability to distinguish
>conceptual contexts (can't have different programs on different screens- you
>just get everything running on one cluttered screen that each application
>probably completely fills) are the major reasons preventing me from getting
>one. (Oh, and also the price.)

Notice how much faster the Amiga is at graphics due to the blitter? Remember that the Amiga is using a 16-bit bus at 7 MHz, and an old-technology chip (the blitter). Think about the possibility of a 32-bit blitter running at 14 MHz (or better), with newer (faster) technology, and if you really want all-out performance, use a blitter per bitplane.

>OK- I'm ready for the bad news. Somebody respond and tell me that Fairchild
>or Texas Instruments or somebody has had the GFX1152 for three years that
>already does everything I'm saying.

I know they exist; I just don't remember who makes them.

//	Randell Jesup				Lunge Software Development
//	Dedicated Amiga Programmer		13 Frear Ave, Troy, NY 12180
\\//	beowulf!lunge!jesup@steinmetz.UUCP	(518) 272-2942
 \/	(uunet!steinmetz!beowulf!lunge!jesup)	BIX: rjesup
(-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)
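[Jesup's back-of-envelope limit can be rechecked. The 800-pixels-by-4-bitplanes budget per horizontal line is his figure; the rest below is plain arithmetic under the stated assumption that transparency forces a full fetch of every overlapping window's line.]

```python
# Recheck of the bandwidth estimate: ~800 pixels x 4 bitplanes of
# memory fetches available per horizontal line (Jesup's figures).
BUDGET = 800 * 4                     # bitplane-pixel fetches per line

def windows_affordable(width_pixels, planes, budget=BUDGET):
    """With transparency, every overlapping window's line must be
    fetched in full, so each window costs width * depth fetches."""
    return budget // (width_pixels * planes)

mono = windows_affordable(800, 1)    # full-width, 1-plane (monochrome)
color = windows_affordable(800, 4)   # full-width, 4-plane (16 colors)
```

With full-width monochrome windows this lands on 4, matching the "about 4 windows in monochrome" figure in the post; a single 4-plane window already eats the whole line budget.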
cmcmanis%pepper@Sun.COM (Chuck McManis) (04/04/88)
[Note I reordered some of the lines here ... ]

In article <2007@pasteur.Berkeley.Edu> (Jonathan Dubman) writes:
> OK- I'm ready for the bad news. Somebody respond and tell me that Fairchild
> or Texas Instruments or somebody has had the GFX1152 for three years that
> already does everything I'm saying.

Intel makes it, and 'it' is called the 82786; I helped design it. The 82786 is divided into three basic parts: the memory/processor interface, the graphics engine, and the display processor. The display processor does what you describe. Basically you set it up with a bunch of 'descriptors' for the screen, which is made up of tiles; overlapping windows can be handled by breaking them up into the required tiles like so ...

	+-----------------------------+
	|            Tile 1           |
	|........+----------------+...|
	|        |                | T |
	| Tile 2 |     Tile 3     | i |
	|        |                | l |
	|        |                | e |
	|        |                | 4 |
	+--------+----------------+---+

Window 1 is composed of Tiles 1, 2, and 4, and Window 2 is composed of Tile 3. The algorithm for converting from window space into tile space was interesting but not especially difficult. The limitations are 8 tiles to a horizontal line, and tiles are a minimum of one line high. This in turn limited you to 8 totally generic windows, or more 'virtual' windows where window position could be restricted. (Why 8 tiles? Well, that had to do with how fast the display processor could fetch descriptors from memory.) The other nice thing about this chip is/was that each window could be any 'depth' (1, 2, 4, or 8 bits), and characters could be blitted rather easily, although they were limited to 64 x 64 pixels if I recall.

> This ain't no pipe dream. I have a decent idea of how to design it, and
> I'm not even a "hardware person". ...
> ... Naturally, there will have to be some finite limit to the "see-through",
> like 8 deep or something.

The key thing is memory access time. If you keep the sprite descriptions 'on chip' then you eat lots of real estate; if you keep them in RAM then you need really fast RAMs.
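[The window-to-tile conversion described above amounts to rectangle subtraction: carve the visible part of a back window into rectangular strips around the front window that overlaps it. This is a toy sketch of that geometry only; the rectangle representation is mine and none of this reflects the 82786's actual descriptor format.]

```python
# Toy window->tile decomposition: the visible part of `back` is
# carved into rectangular tiles around an overlapping `front`
# window. Rectangles are (x, y, width, height) tuples; this shows
# the geometry, not the 82786's real descriptor format.

def subtract(back, front):
    """Return the tiles of `back` left visible under `front`."""
    bx, by, bw, bh = back
    fx, fy, fw, fh = front
    # clip the front window to the back window's extent
    x0, y0 = max(bx, fx), max(by, fy)
    x1 = min(bx + bw, fx + fw)
    y1 = min(by + bh, fy + fh)
    if x0 >= x1 or y0 >= y1:
        return [back]                        # no overlap: one tile
    tiles = []
    if y0 > by:                              # strip above  ("Tile 1")
        tiles.append((bx, by, bw, y0 - by))
    if x0 > bx:                              # strip left   ("Tile 2")
        tiles.append((bx, y0, x0 - bx, y1 - y0))
    if x1 < bx + bw:                         # strip right  ("Tile 4")
        tiles.append((x1, y0, bx + bw - x1, y1 - y0))
    if y1 < by + bh:                         # strip below
        tiles.append((bx, y1, bw, by + bh - y1))
    return tiles
```

For the configuration in the diagram (a front window overhanging the back window's bottom edge), this yields exactly three tiles, matching Tiles 1, 2, and 4.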
The problems you encounter are very similar to those encountered when designing a memory management unit. Basically, the 'video' processor puts out a request for the pixel at [x,y], which your mapper has to map into a pixel of some sort. This involves mapping the virtual x,y address to a physical address, and if it is transparent, then checking the next mapping that fits the requirements. Doing this 8 times is pushing it for 100 ns DRAMs. (Note that video RAMs are of absolutely no use here.) For a given display (say 640 x 480) and a 32 kHz scan rate, one pixel time is 49 nanoseconds. You can fudge a couple of nanoseconds by starting right at the horizontal retrace time and slowly getting behind during the line. As you can see, your 'counters and gates' had better be gallium arsenide! :-)

--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis   ARPAnet: cmcmanis@sun.com
These opinions are my own and no one else's, but you knew that didn't you.
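[The 49 ns pixel-time figure above follows directly from the stated numbers: a 32 kHz horizontal scan rate gives 31.25 microseconds per line, and dividing the whole line among 640 visible pixels (ignoring blanking, which is what lets you "fudge a couple of nanoseconds") gives just under 49 ns each.]

```python
# Recheck of the pixel-time figure: 640x480 display at a 32 kHz
# horizontal scan rate (the numbers stated in the post).
H_SCAN_HZ = 32_000
VISIBLE_PIXELS = 640

line_time_ns = 1e9 / H_SCAN_HZ             # 31250 ns per horizontal line
pixel_time_ns = line_time_ns / VISIBLE_PIXELS   # ~48.8 ns per pixel
```

Eight transparency lookups per pixel against 100 ns DRAM cycles clearly cannot fit in a ~49 ns pixel slot, which is the point of the gallium arsenide quip.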
stever@videovax.Tek.COM (Steven E. Rice) (04/04/88)
In article <603@imagine.PAWL.RPI.EDU>, Randell Jesup (beowulf!lunge!jesup@steinmetz.UUCP) writes:
> . . .
> Remember that the Amiga is using a 16-bit bus at 7 MHz, and an old-technology
> chip (the blitter). Think about the possibility of a 32-bit blitter running
> at 14 MHz (or better), with newer (faster) technology, and if you really want
> all-out performance, use a blitter per bitplane.

I'm thinking, I'm thinking! Drool, drool, drool. . .

					Steve Rice

-----------------------------------------------------------------------------
* Every knee shall bow, and every tongue confess that Jesus Christ is Lord! *
new: stever@videovax.tv.Tek.com
old: {decvax | hplabs | ihnp4 | uw-beaver}!tektronix!videovax!stever