aarons@syma.sussex.ac.uk (Aaron Sloman) (02/13/90)
In the recent discussion of editor buffer management I don't think anyone mentioned the sort of technique used in VED, the Poplog editor. It's a bit like a linked list of records, but more compact, and less flexible. On the other hand, for some patterns of use it seems to work very well, although it was originally implemented a long time ago as a temporary measure, to be replaced by something cleaner and more general eventually! VED is implemented in Pop-11, the core Poplog language, with Lisp-like facilities but a Pascal-like syntax. In fact VED uses the "Syspop" dialect of Pop-11 which also includes C-like pointer manipulation for efficiency, and which is used for building and porting Poplog. From the point of view of a programmer, VED is just an extendable, collection of Pop-11 procedures that operate on a collection of Pop-11 data-structures (something like Emacs and its Lisp?). Because Poplog provides a very fast garbage collector VED simply uses it to reclaim space when necessary. Garbage collecting a process around 2MB, on a Sun360 with 8Meg takes just over a second, and doesn't happen often while editing. VED's strategy would not be tolerable in those AI systems where a garbage collection takes up time for a coffee break. In VED most users are never aware of garbage collection. VED reads in complete files on the assumption that it would always be used on machines with virtual memory, and we could not hope to improve on the performance of the pager. The text buffer is a vector of strings in memory, where the strings are represented by pointers. Each string corresponds to one line of text. So shifting N lines of K characters to insert or delete a line requires shifting only N pointers, not NxK bytes. In general that's very fast, though not as fast as manipulating a linked list. The vector is allowed to have spare space at the end, but occasionally it overflows. Then a new larger one is created and the string pointers copied to it. How much new space that has to be allocated and how much has to be copied depends on the number of lines of text, not the number of characters in the buffer, so in general it is MUCH faster than copying a complete character buffer would be, and turns over much less store. Two integer variables, vedline and vedcolumn record cursor position, and there is no data movement during browsing through a file, once it has been read in. Editing uses the concept of the "current" line, which is the string of characters returned by vedbuffer(vedline). On every (non-static) insertion or deletion the characters to the right of the cursor are shifted left or right. This is generally very quick, because lines are not very long, and most of the work is done near the end of a line anyway (except for things like global substitions and text justificiation). Whenever there's not enough space in the string, it is replaced with a copy that has 20 extra spaces on the right, then when you are about to move off that line the line has to be trimmed. (A point that people who extend VED by programming it at a very low level sometimes forget). The number 20 was a guess made around 1982 and has never been reconsidered. Perhaps it should be context sensitive. (However increments to fixed sizes would simplify the use of a set of free lists for strings, which we probably should implement to reduce garbage collection even further, but haven't bothered.) VED is one of those dirty hacks, that works. In order to avoid the problem of working out what has changed whenever it is time to refresh the screen, which a clean implementation with separate buffer manager and screen manager would do, VED provides a collection of procedures for altering the buffer (e.g. linedelete, charinsert, chardelete, etc.) that also automatically update the screen directly via procedure calls or sending out screen control characters. (Side-effecting the screen can be turned off for programs that just need to use the VED buffer as a data-structure). This was done because initially VED was implemented for teaching on an overloaded VAX running VMS and efficiency was at a premium, especially as the system management kept asking why we didn't use EDT. We intended to revise the implementation, but somehow it has survived since 1982, though with the usual accretions. There are some features of VED that make it slower than it need be, but more flexible than some other editors, especially the handling of tabs. Users can switch on and off the conversion of tabs to spaces. They can decide whether tabs should be hard or soft. If soft, a tab is automatically converted to spaces if you try to delete or insert in the middle of it. Users can also specify how far apart the tab stops are (default 4, which is nicer than 8 for indenting programs.) A single tab is represented in the buffer by a number of special characters, depending on how much space on the screen the tab should take up, which simplifies screen management. All this means, however, that when reading and writing files containing tabs there is an extra overhead. Also searching is more complicated than if we just used a vector of characters with a gap, since newlines are not explicitly represented in the buffer, but are implicit between the strings. Unlike Emacs, VED is typically part of the same process as user programs, when used with the Poplog languages (Pop-11, Prolog, Common Lisp, ML), and can therefore provide a convenient set of facilities for text manipulation, including portable user interfaces for some programs. There are many subtle differences between Emacs and VED, partly because the design of VED was driven by the needs and responses of totally naive student users, secretaries, etc. as well as the experts. For example, in Emacs you can inadvertently leave trailing spaces in a line and then wonder why your favourite formatter will not centre your lines properly, whereas in VED you will find it hard to leave a space at the end of a line when you want to! Vertical movement of the cursor to the right of text, and joining lines when the second line has leading tabs or spaces are examples of different behaviour. As a result some people love Emacs and hate VED (even though it has a partial Emacs emulation mode) while some love VED and hate Emacs. Nothing will ever change that polarisation, I am sure. Aaron Sloman, School of Cognitive and Computing Sciences, Univ of Sussex, Brighton, BN1 9QH, England EMAIL aarons@cogs.sussex.ac.uk aarons%uk.ac.sussex.cogs@nsfnet-relay.ac.uk