mash@mips.com (John Mashey) (03/18/91)
In article <ZU=9R=8@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>Here we're talking about putting file systems in smart processors. How about
>putting other stuff there?
>
> Erase and kill processing. (some PC smart cards do this,
>                             as did the old Berkeley Bussiplexer)
> Window management.         (all the way from NeWS servers
>                             with Postscript in the terminal,
>                             down through X terminals and Blits,
>                             to the 82786 graphics chip)
> Network processing.        (Intel, at least, is big on doing
>                             lots of this in cards, to the point
>                             where the small memory on the cards
>                             becomes a problem... they do tend
>                             to handle high network loads nicely)

These have been done, in various ways, and have at least fairly often
been good ideas, although some of the requirements keep changing on you.
In general, these OFTEN have the characteristics that make it reasonable
to distribute the processing, although people argue about the window
management case. The relevant characteristics are:

1: The ratio of # of device interactions to bandwidth or processing
   is high, i.e., there are many interactions, but most of them neither
   move much data, nor require much processing.

2: There is relatively little global data needed to process the
   device interactions.

3: There is a reasonable protocol, such that the work can be split
   between I/O processor and main CPU(s), such that the main CPU(s)
   normally interact with the I/O processor much less frequently than
   the IOP interacts with the devices. I.e., if the IOP does NOT lessen
   the frequency of interaction, it may well get in the road more
   than not.

4: The devices demand lower response latency than can cost-effectively
   be done by the main CPU.

[Note that my argument against doing this for disk file systems was that
disk I/O was the opposite of 1 (much data, few requests), and disobeyed 2
(much global data).]

Consider some of the paths that asynch serial I/O has taken
(for concreteness, on UNIX):

1. Simple UARTs for serial ports, with no special queueing.
   CPU polls for input or gets 1 interrupt per char.
   CPU emits output on a scheduled basis, or gets 1 interrupt per char.
   Cheap, but needs a main CPU that can stand high interrupt rates
   if supporting many and/or fast lines.
   CPU does echoing.
   EX: DEC DL11 (yes, that goes a way back) (in this case just 1 line)

2. #1 plus input silo
   EX: DEC DJ11: 64-entry silo, handling 16 lines.
   CPU can either:
      poll the device regularly to see if there is any input
      available, gather 1-64 entries, and parcel them out to the
      processes they belong to,
   OR
      ask for an interrupt whenever a character appears, thus
      possibly trading away overhead to minimize latency.
   CPU does echoing.

3. #2 plus more control
   EX: DEC DH11: like DJ11, but for example, you could ask for an
   interrupt only if there were N characters in the silo.
   Also, it could do auto-echo.

4. #3 plus DMA output
   EX: (I think): DEC DZ11 + DQS11 com processor
   (My memory gets vague on this one. The DQS was a com processor
   with good speed, albeit "interesting" to code for.)

5. Input silos with settable parameters, output silos; maybe DMA
   There have been lots of these: no special processing, but:
   a) On input, poll, or ask for an interrupt if the silo has
      N characters, or possibly, ask for an interrupt if even 1
      character is there and M milliseconds have elapsed.
      If something is set up as a comm processor geared for
      non-terminal use, it may well do DMA, to support fast lines.
   b) On output, either have deep silos that you can stuff many
      characters into in bunches, or else do DMA out.

6. Move echo, erase, and line kill into the IOP.
   This has been done in various commercial UNIXes (maybe people can
   post examples). For line-oriented input, this is not a bad idea,
   and it became much more reasonable as cheap micros became available.
   (Note that in the DQS/DZ era, anything like a micro was NOT cheap.)
   The CPU:
      needs a protocol to hand to the IOP the definitions of the
      erase & kill characters (if the user is allowed to change
      them), and in fact, any other parameters of relevance.
   The IOP:
      needs more local storage (to hold parameters).
      needs even more local storage, because the natural unit of
      buffering is now a complete input line.
      Must deal with all of the UNIX escape sequences and
      conversions, as well as interrupts. (For example, if an
      interrupt comes, it may want to terminate output in progress
      to that line.)
   BENEFITS:
      CPU gets one interrupt for an entire line of input.
      CPU need not do echoing.
   DRAWBACKS:
      Often, for cost reasons, the IOP could just barely do what
      was needed, and then along would come additional requirements.
      For example, suppose you had a definite idea of "Interrupt"
      and that was wired into the IOP, and then UNIX starts letting
      people change that, etc.
      In general, to make this work, the IOP must almost always be
      able to deal with every input character, without bothering
      the main CPU.

   Of course, historically, about the same time such things became
   widely practical, they became almost useless in some environments,
   because people were using CRTs not in line mode, but as visually
   oriented terminals. At that point, everybody started running the
   terminals in "raw" mode, thus bypassing all of the above processing!
   So, the next step:

7. Make the IOP smart enough to distinguish between "ordinary"
   characters, which should be echoed and stuck in an input queue,
   and other characters and/or sequences, which demand a response from
   the CPU. The theory here is that, for example, you sit typing text
   into vi or emacs fairly efficiently, and still get the various
   control sequences to do something, without making the IOP huge.
   Of course, since every program might well use different special
   sequences, this may require more interaction with the main CPU.
   I think this one has been done.

8. Put termcap in the IOP.
   Implications of this left as an exercise :-)

One clear lesson:
   When doing an IOP, be careful that the interface requirements don't
   change on you, especially if they often force you into a transparent
   pass-thru mode where the IOP really is doing nothing.
Another one:
   Watch out if you are trying to make the same IOP handle both
   terminal I/O and fast serial lines to connect systems.

I believe that most systems tend to use devices in groups 1-5 these days,
because the processing requirements have gotten complex enough that you
might as well just have the main CPU do it, irritating though it may be.

Anyway, those are a few samples. Maybe people would post examples of
these various approaches, or others, with comments on:
   a) How well they worked
   b) How long they lasted
   c) What the weirder bugs and problems were
--
-john mashey   DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94086
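The "N characters or M milliseconds" input policy from group 5 can be sketched as a small simulation. This is only an illustration under stated assumptions, not a model of any particular DEC board: the function name `count_interrupts` is made up here, the timer is assumed to start when the oldest waiting character arrives, and the CPU is assumed to drain the whole silo on every interrupt.

```python
def count_interrupts(arrivals_ms, threshold, timeout_ms):
    """Count CPU interrupts generated by one input silo.

    arrivals_ms: sorted arrival times (in ms) of input characters.
    Assumed policy: interrupt when the silo holds `threshold` characters,
    or when the oldest waiting character has sat for `timeout_ms`.
    Assumed CPU behaviour: each interrupt drains the whole silo.
    """
    interrupts = 0
    pending = 0      # characters currently waiting in the silo
    oldest = None    # arrival time of the oldest waiting character
    for t in arrivals_ms:
        # A timeout that expired before this arrival drains the silo first.
        if pending and t - oldest >= timeout_ms:
            interrupts += 1
            pending = 0
        if pending == 0:
            oldest = t
        pending += 1
        if pending >= threshold:
            interrupts += 1
            pending = 0
    if pending:
        # The final timeout eventually drains whatever is left.
        interrupts += 1
    return interrupts


# A 64-character burst, one character per ms (roughly a 9600 baud line).
burst = list(range(64))
# A slow typist: 10 characters, one every 200 ms.
typist = [200 * i for i in range(10)]

per_char = count_interrupts(burst, threshold=1, timeout_ms=50)       # 64
silo_burst = count_interrupts(burst, threshold=16, timeout_ms=50)    # 4
silo_typist = count_interrupts(typist, threshold=16, timeout_ms=50)  # 10
```

Under these assumptions the silo cuts the burst from 64 interrupts to 4, while the slow typist still costs one interrupt per character, with added latency bounded by the timeout. That matches characteristic 3 above: the IOP only pays off when it actually lowers the interaction rate seen by the main CPU.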
cs450a03@uc780.umd.edu (03/18/91)
For the case of an IOP.... line-orientation is an approximation of
what you want, but not exactly.

Essentially, what you need is a way of saying "just stuff these
characters in the buffer, but when you get one of those characters,
send the buffer up to me for handling." Implicitly, if the buffer
gets nearly full, you'll have to read it anyway.

Information of this sort could be communicated by sending short lists
of characters (turn on pass-through for these, turn off pass-through
for these), sending blanket commands (turn off pass-through for all),
or sending bitmaps (set pass-through as indicated, for all characters).

As always, when you introduce a new class of feature, there will be
lots of programs that cannot take advantage of the feature. Most
notably, programs where you have no source code access (nor a support
contract).

On the other hand, this mechanism is pretty general, and could be
implemented with a few kernel mods (in the case of unix), and a little
out-board hardware.

Note that mostly I'm thinking about buffering printing characters, and
sending control characters (including delete) through to the 'main
processor' .. so perhaps you could get away with a simple switch to
turn such a feature on or off.

How useful? How many users? What is the communication overhead?

I think this would integrate pretty well with something like emacs..
though it is arguable how much cpu overhead you'd "save" in that case.

Which, I suppose, is the reasoning behind X-terminals.

Raul Rockwell
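A minimal sketch of the pass-through scheme described above, assuming a 256-bit bitmap (one bit per character code) and a "nearly full" high-water mark. The class name `PassThroughBuffer`, its method names, and the default sizes are all hypothetical, chosen for illustration; nothing here comes from an existing driver.

```python
class PassThroughBuffer:
    """Hypothetical IOP-side input buffer with a pass-through bitmap."""

    def __init__(self, high_water=56):
        self.bitmap = bytearray(32)    # 256 bits, one per character code
        self.high_water = high_water   # assumed "nearly full" mark
        self.buf = bytearray()

    def set_pass_through(self, chars, on=True):
        """Short-list command: switch pass-through on or off for these codes."""
        for c in chars:
            if on:
                self.bitmap[c >> 3] |= 1 << (c & 7)
            else:
                self.bitmap[c >> 3] &= ~(1 << (c & 7)) & 0xFF

    def set_all(self, on):
        """Blanket command: switch pass-through on or off for every code."""
        self.bitmap = bytearray([0xFF if on else 0x00] * 32)

    def load_bitmap(self, bitmap):
        """Bitmap command: replace the whole table with 32 bytes from the host."""
        if len(bitmap) != 32:
            raise ValueError("bitmap must be exactly 32 bytes")
        self.bitmap = bytearray(bitmap)

    def is_pass_through(self, c):
        return bool(self.bitmap[c >> 3] & (1 << (c & 7)))

    def receive(self, c):
        """Buffer one input byte; return bytes to send to the host, or None."""
        self.buf.append(c)
        if self.is_pass_through(c) or len(self.buf) >= self.high_water:
            out = bytes(self.buf)
            self.buf.clear()
            return out
        return None


# Buffer printing characters; send the buffer up on any control character or DEL.
iop = PassThroughBuffer()
iop.set_pass_through(range(32))
iop.set_pass_through([127])
results = [iop.receive(c) for c in b"hello\x1b"]
# results holds five None entries, then b"hello\x1b"
```

In this sketch the triggering character travels up with the buffered text, so the host sees input in the order it was typed. Whether echo also happens on the board is left out on purpose, since the post does not say.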
sef@kithrup.COM (Sean Eric Fagan) (03/19/91)
In article <18MAR91.09033243@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>line-orientation is an approximation of what you want, but not exactly.

This kind of thing is done all the time for PCs running unix. The ones I
have most experience with are the anvil/stallion boards. There, they
literally download code onto the board, which has its own processor, to
handle the line discipline. Then, the "real" line discipline and device
driver in the kernel cooperate to do things correctly. Basically,
everything gets buffered, until one of the "special" characters comes
across (intr, swtch, or quit; the others are handled on board [no pun
intended]).

Using a smart board, I've seen 20MHz '386s running with a couple of
telebits and 5 users. Not bad.

However, there is a problem with this. Specifically, SCO added job
control to their version of unix, but the onboard card doesn't recognize
it (yet). As a result, if I want to use job control, I cannot use the
neato nifty-keen device driver (since all it does is echo the control-Z
back at me, which, on a wyse-60, clears the screen).

--
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.
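The job-control problem above can be reproduced in a toy model of an on-board line discipline. This is a sketch under loud assumptions, not the anvil/stallion firmware: the function `board_input`, the control-character values (^H erase, ^U kill, ^C intr, ^\ quit, ^Z susp), and the rule that a newline delivers the line are all illustrative choices.

```python
ERASE, KILL, NEWLINE = 0x08, 0x15, 0x0A   # assumed: ^H, ^U, LF (handled on board)
INTR, QUIT, SUSP = 0x03, 0x1C, 0x1A       # assumed: ^C, ^\, ^Z


def board_input(data, host_specials=frozenset({INTR, QUIT})):
    """Toy on-board line discipline.

    Returns (echoed, to_host): the bytes echoed to the terminal, and a list
    of (buffered_bytes, trigger) events handed to the host. The trigger is
    None when a completed line is delivered on newline.
    """
    echoed = bytearray()
    to_host = []
    line = bytearray()
    for c in data:
        if c in host_specials:
            # Known special: hand the buffer and the character to the host.
            to_host.append((bytes(line), c))
            line.clear()
        elif c == ERASE:
            if line:
                line.pop()
                echoed += b"\b \b"
        elif c == KILL:
            line.clear()
            echoed += b"\r\n"
        else:
            # Everything else, including specials the board was never told
            # about, is treated as ordinary data: buffered and echoed.
            line.append(c)
            echoed.append(c)
            if c == NEWLINE:
                to_host.append((bytes(line), None))
                line.clear()
    return bytes(echoed), to_host


# Board that predates job control: ^Z is echoed and never reaches the host.
old_echo, old_events = board_input(b"ls\x1a")
# Board told about the suspend character: ^Z goes up to the host instead.
new_echo, new_events = board_input(b"ls\x1a", frozenset({INTR, QUIT, SUSP}))
```

With the default set, `old_echo` contains the raw control-Z and `old_events` is empty, which is the behaviour described above. Passing the special set in as data, as in the second call, is one way the interface could absorb a new requirement without new firmware, at the cost of a slightly richer host-to-board protocol.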
johnl@iecc.cambridge.ma.us (John R. Levine) (03/19/91)
In article <1088@spim.mips.COM> mash@mips.com (John Mashey) writes:
>7. Make the IOP smart enough to distinguish between "ordinary"
>characters which should be echoed, and stuck in an input queue,
>and other characters and/or sequences, which demanded response from
>the CPU.

Funny you should mention that. We did something like that on the GEM
system at Yale in about 1976. The CPU was an 11/45 running 6th edition
Unix (or maybe 7th by then.) The IOP was an 11/05 running a terminal
emulator I wrote in C. They shared a chunk of screen buffer memory for 16
screens, the IOP had a keyboard multiplexor for the keyboards in front of
the screens, and there was a word-at-a-time interface between the /45 and
the /05 that looked on each end like a paper tape reader and punch. (Paper
tape was the usual boot device in those days; by emulating its interface we
could boot the /05 with the standard boot ROM.)

The /45 passed characters to the /05 to draw on the terminals, and the /05
passed back characters that were typed on the keyboards. We defined a
variety of control sequences to handle the various special devices that we
attached, e.g. "send 8KB of screen buffer to the D/A converter" that we
used for real-time music synthesis.

We had a screen editor, about a second cousin once removed of the Rand
editor, that ran on the 11/45. It used a special editor input mode in Unix.
All printing characters and "easy" control characters, e.g. cursor motion,
were echoed directly. Anything else was passed to the application so it
could update the screen.

It really worked well. We could run 8 to 12 editor users on a poor little
11/45 with 208KB of RAM, total, with pretty much instantaneous response, in
spite of the fact that the 11/05 had to draw the characters on the screen
in software, there being no hardware character generator. In our
implementation, the special editing mode was actually on the Unix side,
since the /05 had to pass it the characters anyway, but the /05's ability
to handle the easy screen updates itself was critical.

One might claim that we took the "salvation through suffering" mentioned in
the original CACM Unix paper to extremes, but we got an awful lot of
computing done on some rather small computers. For further details you can
see my 1982 article in Software Practice and Experience.

--
John R. Levine, IECC, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@iecc.cambridge.ma.us, {ima|spdcc|world}!iecc!johnl
Cheap oil is an oxymoron.