botton@laidbak.UUCP (Brian D. Botton) (08/06/89)
A week or so ago I posted an article describing how I gained access to the video ram on my 3B1. To tell the truth, I've been a little underwhelmed by the response. I did receive a few letters, one from John Bly Milton IV asking why I went to such extreme measures. I hope John doesn't mind if I answer his questions/comments publicly so that others will understand just why it is worth the trouble.

JOHN: Seems a bit brute force.

It is, but because of the design of the 3B1 there is little alternative. If you want to access an area of protected memory you have basically three choices: write a device driver (see below), use the virtual memory system to map that memory into your process's address space, or allow direct access. Personally I would have preferred the second option, but the page table rams do not allow access to memory greater than 4Meg, so it wouldn't work. It's too bad, too, because there are three unused bits in the page table rams that could have been used to allow this :-(.

JOHN: Is mgr really that good?

I'll admit that I don't get my jollies from window managers, but it is public domain, and for those who have used sunview on a Sun 3/50 under 4.0 (painful, isn't it ;-), Mgr is easily an order of magnitude faster. It is fairly small, ~200k, so it shouldn't eat up gobs of memory. It sure is nice when your window manager doesn't have to be paged in or out.

JOHN: Why didn't you try a software (loadable device driver) approach?

A very good question and one that bears answering. Let's take a look at what happens when you do a system call, such as a write. Assuming you have already opened the device, the sequence of events is:

  1. The user process puts data into a buffer. It doesn't matter what kind of buffer: variable, array, malloced memory, etc.

  2. The user process calls write() with the proper parameters, which causes several bytes to be pushed onto the process's stack.

  3. That write() routine is actually a stub routine, probably written in assembly, that further manipulates several bytes on the stack. The stub then executes a trap instruction that forces the processor into supervisor mode and transfers execution to the kernel. To do this, several more bytes are written onto the stack; take a look at the Motorola documents on the 680x0 family for details.

  4. The kernel figures out which system call was desired from a lookup table and then jumps to that routine.

  5. The device driver retrieves the address of the buffer and transfers the data, in this case to video memory. If this had been a block device instead of a character device, the data would first have been transferred to a buffer after it was allocated, but for video ram we would have a character device and thus no extra buffer.

  6. Steps 4, 3, and 2 are reversed, undoing all of those stack manipulations. Also, when the system call returns the kernel takes the opportunity to check if another process should run, so you may lose the processor until the next context switch.

Now let's take a look at my solution, assuming that you have already set the video pointer like so:

    unsigned short data, *video = (unsigned short *)0x420000;

And let's assume you want to write to the 23rd u_short. You would do:

    video[22] = data;

I'm sorry folks, but this seems like a heck of a lot easier. There are some additional benefits to this approach, such as:

  1. You don't have to spend who knows how many hours writing and testing a device driver. I wrote a device driver for a Ramtek graphics device on a BSD 4.3 VAX when I was in college and I know how hard it can be to find subtle bugs. But I must admit, a device driver for access to the video ram is fairly trivial; just look at the vidram device driver that has been posted on the net by Mike "Ditto" Ford.

  2. The special window functions are now in user level code, which is far easier to debug. When Brad and I were working on the portable bit blit code, this made it a lot easier than if we had had to keep reloading a device driver. And who knows how many times we would have crashed our machines getting it right.

  3. Because you now have one screen's worth of ram available where it belongs, you can allocate one less buffer. Plus you don't have to make expensive system calls to update the buffer. For those people who are sleeping, this works too:

         data = video[22];

  4. Many window managers expect to see the video memory mapped into user space. Mgr does, and I suspect that X does also, even though I haven't seen any code. Having this access makes porting a whole lot easier. In fact, Brad and I weren't going to do the port until we came up with an easy way to get to the video ram.

  5. It is fast, as fast as the 3B1 with a 10 MHz clock will ever get. My method requires two operations, an offset added to the base and one word of data transferred to that address, i.e., a few machine instructions with < 10 memory references. The device driver method requires what, 10 - 30 instructions, 10 - 30 instruction fetches, and all those stack writes and reads. Plus the device driver has to have a way to calculate the offset, possibly requiring an address to be sent in the data stream.

  6. Admittedly, this is a security hole. If the page table could have been modified, then the MMU pal would take care of this for us. But since it can't, we have a hardware mod. This really isn't that big of a deal on a small system, though. It isn't like there are a hundred users and you have to protect the screen from peepers. Security is one of the reasons I went to the trouble of using all those address lines in my pal.

  7. Window manager code doesn't belong in the kernel anyway. When we get Mgr working all the way we're going to remove the wind.o driver, which will give us better than 40k of precious kernel space back.

  8. I don't know if I should mention this, but I don't see any reason to hide it. The displayable portion of the video does not use up all of the video ram. So we also have an automatic shared memory segment at the end of video ram. BUT, it is wide open and you're probably a fool to use it and an idiot to rely on it.

I hope this answers some questions and piques some curiosity about what we, the 3B1/7300 user community, can do with our machines. Personally, I think the ability to get away from ua and use a "real" window system is worth the afternoon it takes to make a daughter board. We also get to have source for a major part of the system; that alone is enough for me to want to change to Mgr, X, or whatever.

Again, I welcome comments, good and bad. And if you too need a pal and don't have access to a programmer, let me know and we'll see what happens.

--
     ...     ___
   _][_n_n___i_i ________          Brian D. Botton
  (____________I I______I          laidbak!botton
  /ooOOOO OOOOoo  oo oooo
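
The direct-access path described in the post above can be sketched in C roughly as follows. This is only an illustration, not code from the Mgr port: the array fake_vram and the helpers video_base, video_put, and video_get are hypothetical names, and the array stands in for the frame buffer so the sketch runs on any machine. On a modified 3B1 the base would be the address given in the post, (unsigned short *)0x420000. The 32768-byte size is the figure quoted in John's follow-up.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 16-bit words in 32768 bytes of video RAM. */
#define VIDEO_WORDS (32768 / sizeof(unsigned short))

/* Hypothetical stand-in for the frame buffer (assumption for portability).
 * On the modified 3B1 the post uses (unsigned short *)0x420000 instead. */
static unsigned short fake_vram[VIDEO_WORDS];

/* Return the base pointer. The volatile qualifier tells the compiler not
 * to cache or drop accesses to what would be hardware memory. */
static volatile unsigned short *video_base(void)
{
    return fake_vram;
}

/* Direct path: add an offset to the base and store one word.
 * No system call, no trap, no kernel stack frames. */
static void video_put(volatile unsigned short *video, size_t word,
                      unsigned short data)
{
    assert(word < VIDEO_WORDS);   /* stay inside video RAM */
    video[word] = data;
}

/* Reading back is the same single indexed access. */
static unsigned short video_get(const volatile unsigned short *video,
                                size_t word)
{
    assert(word < VIDEO_WORDS);
    return video[word];
}
```

With video = video_base(), the call video_put(video, 22, data) is the same store as video[22] = data in the post, and video_get(video, 22) is the matching read. A driver-based version would instead go through write() and the trap sequence listed in steps 1 to 6 for every update.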
jbm@uncle.UUCP (John B. Milton) (08/09/89)
In article <2575@laidbak.UUCP> botton@laidbak.UUCP (Brian D. Botton) writes:
...
>JOHN: Why didn't you try a software (loadable device driver) approach?

I was referring to an everything-in-the-driver approach, where access would still be fast, but your point is well taken. It is wonderful to be able to work on drivers the way we can on this machine, but it's still a bitch next to regular user level programs. The idea is beginning to grow on me, which does bring up a bit of a dilemma as far as how I'm going to implement the screen part of the X server.

> 8. I don't know if I should mention this, but I don't see any reason
>    to hide it. The displayable portion of the video does not use up
>    all of the video ram. So we also have an automatic shared memory
>    segment at the end of video ram. BUT, it is wide open and you're
>    probably a fool to use it and an idiot to rely on it.

I WAS hoping you wouldn't mention it, but it is 32768-31320=1448 bytes...

John
--
John Bly Milton IV, jbm@uncle.UUCP, n8emr!uncle!jbm@osu-cis.cis.ohio-state.edu
(614) h:294-4823, w:785-1110; N8KSN, AMPR: 44.70.0.52; Don't FLAME, inform!
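
John's figure of 1448 spare bytes can be checked with a little arithmetic. The sketch below assumes a 720 x 348 display at one bit per pixel. That geometry is not stated anywhere in this thread, but it does reproduce the 31320 bytes he subtracts. The function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>

enum {
    SCREEN_WIDTH    = 720,    /* assumed pixels per row, not from the thread */
    SCREEN_HEIGHT   = 348,    /* assumed number of rows, not from the thread */
    VIDEO_RAM_BYTES = 32768   /* total video RAM, from John's post */
};

/* Bytes actually displayed, at one bit per pixel. */
static long displayed_bytes(void)
{
    return (long)SCREEN_WIDTH * SCREEN_HEIGHT / 8;
}

/* Bytes left over at the end of video RAM. */
static long spare_bytes(void)
{
    return VIDEO_RAM_BYTES - displayed_bytes();
}

/* Index of the first u_short past the displayed area. */
static long first_spare_word(void)
{
    return displayed_bytes() / 2;
}
```

Under these assumptions the spare region would begin at word index first_spare_word() from the video base and span spare_bytes() / 2 words. As Brian's item 8 warns, nothing protects that region, so anything stored there can be overwritten by any other process with the same access.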