BILLW@SRI-KL.ARPA (07/01/84)
From: William Chops Westfield <BILLW@SRI-KL.ARPA>

If this whole thing doesn't make it, I'm going to give up. You can FTP the description from SRI-KL::<BILLW>CONCEPT.DATA-COMPRESSION using anonymous login. Perhaps INFO-MICRO-REQUEST will FTP it over to BRL and send it locally... We should be running new net software Real Soon Now...

--------------------------------------------------------------------
Date: 5 Aug 1981 2251-EDT
From: (Leonard N Zubkoff) <Zubkoff at CMU-20C>
Subject: New Concept-100/104 Software

This is a general announcement to the ArpaNet community of an alternate set of software that may become available for the Concept line of terminals made by Human Designed Systems (HDS). For the past several months, I have been engaged in a personal project to rewrite the Concept software in order to provide a level of functionality more in keeping with the needs and capabilities of the Computer Science community. While the software is not yet completely written, it has been operational for the last 3 months and is in use in several terminals here at Carnegie-Mellon University.

Since we at CMU have copies of the original HDS software under a non-disclosure agreement, I am not at liberty to distribute the software I have written beyond CMU. I have been in contact with officials of HDS, and they have shown interest in making this available to the Concept user community if there is sufficient demand.

The purpose of this message is two-fold. First, I want to solicit comments about the software I've designed in order that I may incorporate other features that I may have overlooked. Second, I need to determine whether there is a demand in the Concept user community either to upgrade existing terminals or purchase new ones with my software as opposed to the standard software supplied by HDS.

Let me begin by giving a brief description of the goals I set for the software.
The software was designed explicitly for the type of environment we have now and are developing at CMU. At present, screen oriented editing with Tops-20 Emacs/Tops-10 Fine/Unix Emacs is the norm and terminals are used both at home over dialup lines and in offices at 1200 baud and 9600 baud. In the future, Spice (personal) machines will be the dominant resource in the department and terminals like the Concept will have little use except as a home terminal with which to call up one's Spice machine.

Thus the dominant use of the terminal where sophisticated capabilities are required is in the area of screen management by an editor. In addition, the likelihood that most of these terminals will ultimately be used from home over a 300 or 1200 baud dialup line has made the issue of efficient screen management extremely critical. All office terminals will be supported at 9600 baud in the near future. Thus the issues of efficient screen management and data transmission to the terminal emerged as the most critical areas deficient in most terminals available commercially, and it is to optimize the Concept terminal in these respects that the greatest part of the development of my software has been devoted.

Since we already have a great deal of software that supports Concept terminals, it was also necessary that my software be able to emulate a Concept with HDS software to such a degree that the normal Emacs, Fine, BBoard, and other programs would operate properly without change.

In order to utilize the unique features of my software, and to gain information about how it is performing, I have written a screen management program that I use to run Tops-20 Emacs. This program is invoked like Emacs but crosspatches the terminal through to an Emacs running as a subfork. In general, support for the new terminal software should be placed in the editor itself, but the nature of the implementation of Emacs/Teco is such as to preclude doing this directly without a great deal of work.
Emacs itself provides a very poor screen management facility; it is not smart about making optimal use of the primitives available in a terminal to cut down on the number of characters sent to update the screen. My program maintains screen images representing the actual state of the screen and the desired state of the screen, and attempts to perform a near-optimal transform from the actual to desired states at appropriate intervals.

In order to measure the improvement in screen update time, the program keeps counts of both the number of characters that Emacs sent to the program (which would have gone to the terminal if it were not for the screen management algorithms) and the number of characters that the program was required to send to the terminal in order to effect the same resulting screen image. I term the ratio between these two numbers the compression ratio achieved by the screen management program. It represents a very real measure of the actual speedup in screen redisplay provided by the program.

In actual use, this program now achieves compression ratios that are typically in the range from 2.5 to 4.0. In paging through a typical textual file (such as a Scribe manuscript file), the compression ratio is usually between 2.5 and 3.0. In paging through some Bliss programs, I have achieved ratios of 4.5. Thus in editing Bliss code from home over a 1200 baud modem, I regularly get an effective transmission rate to the terminal well in excess of 3600 baud. In addition, the new terminal software is written to enable a full screen update to be performed at 9600 baud with no padding whatsoever, even when achieving compression ratios of 4 to 1. Thus the new software may be used efficiently either locally over a high speed line or remotely over a lower speed modem without change to the program driving the terminal.
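To make the bookkeeping above concrete, here is a minimal Python sketch of the idea: keep an "actual" and a "desired" screen image, send only the span of each line that differs, and compare the byte counts. It is an illustration only, not the program described in this message. The function names, the line-by-line strategy, and the 3-byte cost assumed for a cursor-positioning command (ADDRESS_COST) are all assumptions made for the example; the real program's transform is described only as "near-optimal" and surely does more.

```python
ADDRESS_COST = 3  # hypothetical: bytes assumed for one cursor-positioning command


def update_commands(actual, desired):
    """Return (row, col, text) writes that turn `actual` into `desired`.

    Both screens are lists of strings; corresponding lines are assumed to
    have equal length (padded with spaces to the screen width).
    """
    writes = []
    for row, (old, new) in enumerate(zip(actual, desired)):
        if old == new:
            continue  # unchanged line: nothing needs to be sent
        # Skip the common prefix (a difference must exist, since old != new).
        start = 0
        while old[start] == new[start]:
            start += 1
        # Skip the common suffix.
        end = len(new)
        while end > start and old[end - 1] == new[end - 1]:
            end -= 1
        writes.append((row, start, new[start:end]))
    return writes


def bytes_to_terminal(writes):
    """Bytes needed for the writes under the assumed addressing cost."""
    return sum(ADDRESS_COST + len(text) for _, _, text in writes)


def compression_ratio(chars_from_editor, chars_to_terminal):
    """Characters the editor sent divided by characters actually transmitted."""
    if chars_to_terminal == 0:
        raise ValueError("nothing was sent to the terminal")
    return chars_from_editor / chars_to_terminal


def effective_baud(line_baud, ratio):
    """Apparent transmission rate seen by the user at a given ratio."""
    return line_baud * ratio
```

With these definitions, a ratio of 3.0 over a 1200 baud line corresponds to effective_baud(1200, 3.0), that is, 3600 baud, which is the arithmetic behind the figure quoted above.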
The following sections describe orthogonal ideas which are all (to varying degrees) present in the current terminal: virtual terminals, screen management support, and data compression. Unfortunately, the implementation of these ideas in the current software makes it very difficult for them to be exploited.

Virtual terminals

The terminal as a whole is composed of one or more virtual terminals, each possessing the state one would normally associate with a physical display terminal. A redisplay process handles the mapping of virtual terminal screen images to non-overlapping rectangular windows on the screen whenever the terminal is not otherwise occupied in processing keyboard or communication line input (the currently running version only supports a single virtual terminal, but this should change shortly). A virtual terminal has a fixed number of lines and columns, independent of the number of lines and columns actually being displayed on the screen at any given time; the user may select whether a window narrower than the virtual terminal it displays is to display a truncated line or wrap the logical line onto the next physical line.

Within each virtual terminal, there are four contexts. Contexts provide the means for switching quickly between radically different states of terminal operation, without the high overhead of sending commands to effect all the individual changes, and allow for a program to use the terminal without interfering with the user's preferred settings of parameters. At any time, the input stream is connected to exactly one of these contexts.
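As a rough illustration of the context idea, here is a small Python sketch of one virtual terminal holding four contexts, with exactly one connected at a time. It anticipates the fields and the switching styles (push/pop, connecting by name, and initializing a new context from the old one) that the next paragraph describes. The class and method names, the field types, and the default values (8-column tabs, a 24-line region) are assumptions for the example, not the terminal's actual data layout or command set.

```python
from dataclasses import dataclass, replace


@dataclass
class Context:
    # Field list follows the message; types and defaults are assumed.
    cursor: tuple = (0, 0)      # logical cursor position (line, column)
    mark: tuple = (0, 0)        # saved cursor position
    char_set: int = 0
    video_attrs: int = 0
    mode: str = "overwrite"     # "insert", "overwrite", or "overstrike"
    tab_width: int = 8
    region_top: int = 0
    region_lines: int = 24


class VirtualTerminal:
    NUM_CONTEXTS = 4

    def __init__(self):
        self.contexts = [Context() for _ in range(self.NUM_CONTEXTS)]
        self.current = 0        # the input stream is connected here
        self._stack = []        # previously connected contexts, for pop()

    @property
    def active(self):
        return self.contexts[self.current]

    def connect(self, index, inherit=False):
        """Connect to a context by number, optionally copying the old state."""
        if not 0 <= index < self.NUM_CONTEXTS:
            raise IndexError("no such context")
        if inherit:
            # Initialize the new context from the one being left.
            self.contexts[index] = replace(self.contexts[self.current])
        self.current = index

    def push(self, index, inherit=False):
        """Connect to a context, remembering the old one for pop()."""
        self._stack.append(self.current)
        self.connect(index, inherit)

    def pop(self):
        """Return to the most recently pushed context, state intact."""
        self.current = self._stack.pop()
```

The point of the design is that a program can push to its own context, change whatever modes it likes, and pop on exit, leaving the user's settings exactly as they were without retransmitting any of them.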
A context contains the information describing where and how a received character is to be displayed: logical cursor position; mark position (the mark is a saved cursor position, and is a familiar idea to EMACS users); character set; video attributes; insert, overwrite, or overstrike mode; the width of fixed tab-stop settings; and the current region top and region line count (a region is a horizontal band the full width of the virtual terminal to which all operations are limited; it is identical to the HDS notion of window if the window left and window columns parameters are restricted to be 0 and 80, respectively).

Switching between contexts may be done either in a push/pop style or may be done by explicitly naming the context to be connected to. In addition, one may connect to a context specifying that the old context is to be used to initialize the new one. Thus, for example, a user may be handling normal typein to the system in wrap mode (ala ITS wrap mode, but with a blank line kept between old and new text at all times), can enter my screen management program which changes various modes, and then exit my program to be returned to the context to which he was previously connected leaving the terminal again in wrap mode.

Screen management

Some of the screen management support is inherent in the proper implementation of virtual terminals. When moving from one virtual terminal to another, or one context to another, it is not necessary to send dozens of bytes of control information to the terminal to establish the new set of parameters. However, the compression achieved by this technique does not apply to the sort of screen management done by screen editors like EMACS, where one remains in one context in one virtual terminal during the editing session.

In order to achieve the efficient screen management and data compression described above, several techniques have been used.
Eight bit transmission to the terminal is used so that the most commonly needed screen management commands may be invoked with a single byte to specify the type of command followed by whatever parameter bytes are necessary. The screen management program was heavily instrumented to determine the types of operations which were most frequently needed, and commands that minimize the total number of bytes to be sent to the terminal have been defined. Commands are provided with built-in repetition counts for two reasons: it is poor practice to waste precious communication bandwidth sending five commands when one can send a command with a parameter of five; it also requires more processing time in the terminal to perform the five individual operations than the single unified one.

Data compression

In order to speed display of text and programs still further, two token dictionaries are present in the terminal. A token, in this use, is an all upper case, all lower case, or capitalized sequence of letters. I analyzed over 13 million tokens from textual-type files on CMUA and CMUC and have stored a predefined dictionary of the most common 1024 tokens in the terminal eproms. When the screen manager determines that it would be about to send one of these, it can send a command to the terminal that specifies which token to display, the case to be used, and whether to follow the token by a space. These commands require two bytes to send, thus saving a great number of characters for most uses of the terminal. In addition, the best 32 combinations of token number, case, and spacing are directly displayable with a single byte command. For example, "the " may be displayed by sending a single byte to the terminal.

In order to handle the case of tokens not stored in the static dictionary, there is a dynamic dictionary as well.
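Before turning to the dynamic dictionary, it may help to see why two bytes suffice for a static-dictionary command: 1024 tokens need 10 bits, three case forms need 2 bits, and the trailing-space flag needs 1 bit, leaving 3 bits of a 16-bit command to identify the command type. The Python sketch below packs and unpacks such a command. The bit layout, the opcode value, and every name in it are invented for illustration; the message does not give the terminal's actual encoding, and the single-byte commands for the 32 best combinations are not modeled.

```python
TOKEN_OPCODE = 0b111  # hypothetical 3-bit prefix marking a static-token command
CASES = ("lower", "upper", "capitalized")


def classify_case(word):
    """Return the case form of a token, or None if it is not a token."""
    if not (word.isascii() and word.isalpha()):
        return None
    if word.islower():
        return "lower"
    if word.isupper():
        return "upper"
    if word[0].isupper() and word[1:].islower():
        return "capitalized"
    return None  # mixed case such as "tHe" is not a token in this scheme


def encode_static(word, trailing_space, dictionary):
    """Return a 2-byte command for `word`, or None if it cannot be encoded.

    `dictionary` maps lower-case tokens to indices 0..1023.
    """
    case = classify_case(word)
    index = dictionary.get(word.lower())
    if case is None or index is None:
        return None
    # Assumed layout: opcode(3) | case(2) | space(1) | index(10)
    value = (
        (TOKEN_OPCODE << 13)
        | (CASES.index(case) << 11)
        | (int(trailing_space) << 10)
        | index
    )
    return value.to_bytes(2, "big")


def decode_static(command, words):
    """Expand a 2-byte command back into text, as the terminal would."""
    value = int.from_bytes(command, "big")
    case = CASES[(value >> 11) & 0b11]
    word = words[value & 0x3FF]
    if case == "upper":
        word = word.upper()
    elif case == "capitalized":
        word = word.capitalize()
    return word + (" " if (value >> 10) & 1 else "")
```

Under this assumed layout, a capitalized dictionary word followed by a space, such as "The ", goes out as two bytes instead of four, and longer tokens save proportionally more.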
When token parsing mode is enabled, the terminal will parse tokens out of the input stream and will place them sequentially in an internal table containing the last 256 tokens received. The screen manager recognizes when it is about to send a token that is already in the dynamic dictionary and can request its display with a two byte command. There is no transmission overhead involved in this process since both the program and the terminal parse the input stream and the program knows exactly what tokens are in the terminal at all times. This token parsing process (static and dynamic) is a simple state machine and symbol table, but it is responsible for a large percentage of the speedup in data transmission rates attainable through the use of my software.

I apologize for the brevity of the above description, and for the overall length of this message. In general, the software has been designed to provide exactly the features that are most needed by our type of environment; misfeatures have been avoided (I hope). No terminal supplied with a non-test version of my software has ever crashed (requiring power-cycling, or loss of high voltage), nor can the terminal be placed in a state where the operator does not have complete control. Those desiring to examine an actual list of the commands implemented to date may ftp and peruse the files [CMUC]<Zubkoff>Concept.mss and Concept.press.

Now I have several questions, both of a design nature and a logistics one. At this point, little is cast in concrete with regard to some aspects of the design. If this software is ever to extend beyond use at CMU, now is the time when those who will be working with it may affect the design.

(1) What should function keys do? Should they be programmable to send or execute variable sequences as now, or should they send fixed character sequences?
In the best of all possible worlds, the operating system would be capable of performing the translation from a special input code to a user-defined string. It does seem ridiculous to send a string to the terminal just so that it can regularly send it back. Is this practical, or should the terminal conform to the world?

(2) Is transmitting part of the screen back useful? The primitives available in the terminal are generally dead wrong for any reasonable notion of text editing. Should this be provided at all? Is printing directly from the screen useful, or should hooks merely be provided to allow the host to send text directly to an attached printer?

(3) Which pieces of status information are interesting enough to be displayed on a status line or screen, and which are not of interest except in bizarre cases?

(4) If a meta key is available, how is it best handled? Setting the high order bit of the input byte appears best. Would this be acceptable to most systems?

(5) Would transmission of packets with CRC to the terminal be beneficial to cut the overall error rate over phone lines?

(6) Do you think that you would be interested in upgrading existing Concept-100 and Concept-104 terminals to this level of functionality, or would only consider purchasing new ones so equipped? If there is sufficient demand, I expect it would not require a great deal of work to port my software to the newer hardware in the Concept-108. Would $300 be a reasonable figure if HDS were to offer an upgrade kit for existing Concepts?

(7) Assuming an upgrade is being considered, the terminal board must be jumpered to accommodate 16k of dynamic ram (newer Concepts have this already) and two jumpers installed to permit the replacement of the 2716 proms with 2732s. Would you want to purchase a kit to perform this in-house or would you feel that it was necessary to return the terminal boards to HDS?
I ask these last two questions due to the fact that any release of this software beyond CMU will have to come through HDS, and they must be convinced that enough people in the computer science world care about having it.

I shall be happy to receive any comments on the above questions and will be glad to discuss my Concept software further with anyone who is interested. Please address all replies and questions to Zubkoff @ CMUC. Unless requested otherwise, I will make responses publicly available.

Leonard N Zubkoff
-------
-------