[comp.editors] How it's done in VED

aarons@syma.sussex.ac.uk (Aaron Sloman) (02/13/90)
In the recent discussion of editor buffer management I don't think
anyone mentioned the sort of technique used in VED, the Poplog editor.
It's a bit like a linked list of records, but more compact, and less
flexible. On the other hand, for some patterns of use it seems to work
very well, although it was originally implemented a long time ago as
a temporary measure, to be replaced by something cleaner and more
general eventually!

VED is implemented in Pop-11, the core Poplog language, with Lisp-like
facilities but a Pascal-like syntax. In fact VED uses the "Syspop"
dialect of Pop-11 which also includes C-like pointer manipulation for
efficiency, and which is used for building and porting Poplog.

From the point of view of a programmer, VED is just an extendable
collection of Pop-11 procedures that operate on a collection of Pop-11
data-structures (something like Emacs and its Lisp?). Because Poplog
provides a very fast garbage collector, VED simply uses it to reclaim
space when necessary. Garbage collecting a process of around 2MB on a
Sun360 with 8Meg takes just over a second, and doesn't happen often
while editing. VED's strategy would not be tolerable in those AI systems
where a garbage collection takes up time for a coffee break. In VED most
users are never aware of garbage collection.

VED reads in complete files on the assumption that it would always
be used on machines with virtual memory, and that we could not hope to
improve on the performance of the pager.

The text buffer is a vector of strings in memory, where the strings are
represented by pointers. Each string corresponds to one line of text. So
shifting N lines of K characters to insert or delete a line requires
shifting only N pointers, not NxK bytes. In general that's very fast,
though not as fast as manipulating a linked list. The vector is allowed
to have spare space at the end, but occasionally it overflows. Then a
new larger one is created and the string pointers copied to it. How much
new space that has to be allocated and how much has to be copied depends
on the number of lines of text, not the number of characters in the
buffer, so in general it is MUCH faster than copying a complete
character buffer would be, and turns over much less store.
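To make the cost argument concrete, here is a minimal sketch in Python (not Pop-11, and not VED's actual code) of a vector of line references with spare slots at the end. The class name, method names, the initial spare count and the growth factor are all assumptions made for illustration; the post says only that "a new larger one is created".

```python
class LineBuffer:
    """Vector of line references with spare slots at the end (hypothetical)."""

    def __init__(self, lines, spare=8):
        self.used = len(lines)                  # slots holding real lines
        self.slots = list(lines) + [None] * spare

    def insert_line(self, index, text):
        if self.used == len(self.slots):
            # Overflow: make a larger vector and copy only the references.
            # The growth factor is an assumption, not VED's policy.
            bigger = [None] * (2 * len(self.slots) + 1)
            bigger[:self.used] = self.slots[:self.used]
            self.slots = bigger
        # Shift references (not characters) one slot towards the end.
        self.slots[index + 1:self.used + 1] = self.slots[index:self.used]
        self.slots[index] = text
        self.used += 1

    def delete_line(self, index):
        # Shift the following references one slot towards the start.
        self.slots[index:self.used - 1] = self.slots[index + 1:self.used]
        self.used -= 1
        self.slots[self.used] = None

    def lines(self):
        return self.slots[:self.used]


buf = LineBuffer(["a", "b", "c"], spare=1)
buf.insert_line(1, "x")      # uses the one spare slot
buf.insert_line(0, "y")      # overflows, so the vector is reallocated
buf.delete_line(2)
print(buf.lines())
```

Either operation moves at most one reference per line, whatever the length of the lines themselves.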

Two integer variables, vedline and vedcolumn, record cursor position,
and there is no data movement during browsing through a file, once
it has been read in.

Editing uses the concept of the "current" line, which is the string
of characters returned by vedbuffer(vedline). On every (non-static)
insertion or deletion the characters to the right of the cursor are
shifted left or right. This is generally very quick, because lines
are not very long, and most of the work is done near the end of a
line anyway (except for things like global substitutions and text
justification).

Whenever there's not enough space in the string, it is replaced with
a copy that has 20 extra spaces on the right. Then, when you are
about to move off that line, the line has to be trimmed. (A point
that people who extend VED by programming it at a very low level
sometimes forget.) The number 20 was a guess made around 1982 and
has never been reconsidered. Perhaps it should be context sensitive.
(However, increments to fixed sizes would simplify the use of a set
of free lists for strings, which we probably should implement to
reduce garbage collection even further, but haven't bothered.)
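The grow-by-20 and trim-on-leaving behaviour can be sketched as follows. This is an illustration in Python under stated assumptions, not VED's code: the class and method names are hypothetical, columns are counted from 0, and the separate count of characters in use is my own device for knowing where to trim.

```python
PAD = 20  # the increment mentioned in the post


class CurrentLine:
    """A line being edited, padded on the right with spare spaces."""

    def __init__(self, text):
        self.chars = bytearray(text, "ascii")
        self.length = len(text)        # characters actually in use

    def insert_char(self, column, ch):
        if self.length == len(self.chars):
            # No room left: extend the line by PAD spare spaces.
            self.chars.extend(b" " * PAD)
        # Shift the characters at and right of the cursor one place right.
        self.chars[column + 1:self.length + 1] = self.chars[column:self.length]
        self.chars[column] = ord(ch)
        self.length += 1

    def trimmed(self):
        # Called when the cursor is about to leave the line.
        return self.chars[:self.length].decode("ascii")


line = CurrentLine("helo")
line.insert_char(3, "l")
print(len(line.chars), repr(line.trimmed()))
```

A low-level extension that stores the padded string back in the buffer without trimming it would leave the spare spaces in the text, which is the mistake the parenthetical remark warns about.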

VED is one of those dirty hacks that works. In order to avoid the
problem of working out what has changed whenever it is time to refresh
the screen, which a clean implementation with separate buffer manager
and screen manager would do, VED provides a collection of procedures for
altering the buffer (e.g. linedelete, charinsert, chardelete, etc.) that
also automatically update the screen directly via procedure calls or
sending out screen control characters. (Side-effecting the screen can be
turned off for programs that just need to use the VED buffer as a
data-structure).
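A rough Python sketch of that arrangement: a buffer-altering procedure that updates the screen itself, with a switch to turn the side effect off. Everything here is hypothetical (the class, the flag name, and the list that stands in for control characters sent to the terminal); it only shows the shape of the design.

```python
class Editor:
    def __init__(self, lines, screen_updates=True):
        self.lines = list(lines)
        self.screen_updates = screen_updates
        self.sent = []   # stands in for control sequences sent to the terminal

    def line_delete(self, index):
        del self.lines[index]
        if self.screen_updates:
            # The operation knows exactly what changed, so nothing has to
            # be worked out later by comparing buffer and screen.
            self.sent.append(("delete_screen_line", index))


ed = Editor(["one", "two", "three"])
ed.line_delete(1)

batch = Editor(["one", "two"], screen_updates=False)  # buffer used as data only
batch.line_delete(0)
```

The price is that buffer and display code are tangled together, which is the "dirty" part of the hack.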

This was done because initially VED was implemented for teaching on an
overloaded VAX running VMS and efficiency was at a premium, especially
as the system management kept asking why we didn't use EDT. We intended
to revise the implementation, but somehow it has survived since 1982,
though with the usual accretions.

There are some features of VED that make it slower than it need be, but
more flexible than some other editors, especially the handling of tabs.
Users can switch on and off the conversion of tabs to spaces. They can
decide whether tabs should be hard or soft. If soft, a tab is
automatically converted to spaces if you try to delete or insert in the
middle of it. Users can also specify how far apart the tab stops are
(default 4, which is nicer than 8 for indenting programs). A single
tab is represented in the buffer by a number of special characters,
depending on how much space on the screen the tab should take up, which
simplifies screen management.
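One way to picture that representation, sketched in Python with assumed names: on reading, each tab becomes as many marker characters as it occupies screen columns, so that a character's index in the line equals its screen column; on writing, each run is folded back into one tab. Using the tab character itself as the marker is an assumption, and the reverse conversion assumes the runs are intact (no edits in the middle of a tab).

```python
def expand_tabs(line, tabstep=4, mark="\t"):
    """File form -> buffer form: one marker per screen column of each tab."""
    out = []
    for ch in line:
        if ch == "\t":
            width = tabstep - (len(out) % tabstep)   # columns to next stop
            out.extend(mark * width)
        else:
            out.append(ch)
    return "".join(out)


def compress_tabs(buffer_line, tabstep=4, mark="\t"):
    """Buffer form -> file form: each run up to a tab stop becomes one tab."""
    out = []
    col = 0
    while col < len(buffer_line):
        if buffer_line[col] == mark:
            out.append("\t")
            col += tabstep - (col % tabstep)          # skip the whole run
        else:
            out.append(buffer_line[col])
            col += 1
    return "".join(out)


buffered = expand_tabs("a\tb")
print(len(buffered), buffered.index("b"))   # "b" sits at its screen column
print(compress_tabs(buffered) == "a\tb")
```

The two conversions are the extra work on reading and writing files that a buffer holding raw tabs would avoid.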

All this means, however, that when reading and writing files containing
tabs there is an extra overhead. Also searching is more complicated
than if we just used a vector of characters with a gap, since newlines
are not explicitly represented in the buffer, but are implicit between
the strings.
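To show where the complication comes from, here is a hedged Python sketch of searching a vector of lines for a pattern that may contain newlines. It is not VED's search code; the function name and the (line, column) return convention are assumptions. A pattern spanning lines has to be split and matched piecewise: the first piece against the end of one line, any middle pieces against whole lines, and the last piece against the start of a line.

```python
def find(lines, pattern, start_line=0):
    """Return (line_index, column) of the first match, or None."""
    parts = pattern.split("\n")
    if len(parts) == 1:
        # No newline in the pattern: search each line on its own.
        for i in range(start_line, len(lines)):
            col = lines[i].find(pattern)
            if col != -1:
                return i, col
        return None
    first, middle, last = parts[0], parts[1:-1], parts[-1]
    for i in range(start_line, len(lines) - len(parts) + 1):
        if (lines[i].endswith(first)
                and lines[i + 1:i + 1 + len(middle)] == middle
                and lines[i + 1 + len(middle)].startswith(last)):
            return i, len(lines[i]) - len(first)
    return None


text = ["foo bar", "baz", "qux end"]
print(find(text, "qux"))
print(find(text, "bar\nbaz\nqu"))
```

With a single character vector and explicit newline characters, both cases would reduce to one substring search.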

Unlike Emacs, VED is typically part of the same process as user programs,
when used with the Poplog languages (Pop-11, Prolog, Common Lisp, ML),
and can therefore provide a convenient set of facilities for text
manipulation, including portable user interfaces for some programs.

There are many subtle differences between Emacs and VED, partly because
the design of VED was driven by the needs and responses of totally naive
student users, secretaries, etc. as well as the experts. For example, in
Emacs you can inadvertently leave trailing spaces in a line and then
wonder why your favourite formatter will not centre your lines properly,
whereas in VED you will find it hard to leave a space at the end of a
line when you want to! Vertical movement of the cursor to the right of
text, and joining lines when the second line has leading tabs or spaces
are examples of different behaviour.

As a result some people love Emacs and hate VED (even though it has a
partial Emacs emulation mode) while some love VED and hate Emacs.

Nothing will ever change that polarisation, I am sure.

Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QH, England
    EMAIL   aarons@cogs.sussex.ac.uk
            aarons%uk.ac.sussex.cogs@nsfnet-relay.ac.uk