[comp.lang.eiffel] Spanning Sets: A Design Guideline.

paj@uk.co.gec-mrc (Paul Johnson) (04/11/91)

I would like to propose a guideline for designing abstract data types
(ADTs).  It is

   When writing an ADT, pick a minimum spanning set of operations and
   define all other operations in terms of that set.  Then include a
   list of these operations in the documentation.

ADTs are things like sets, lists, hash tables, arrays, matrices, and
complex numbers, as opposed to the more concrete objects like
airplanes or hire_cars.  There is no sharp boundary, but some things
have the ADT nature more than others.

By `minimum spanning set' I mean the smallest set of operations
necessary to support the functionality of the class.

The justification for this rule is that subsequent reuse of a class
may include a descendant which must implement new invarients.  These
may require the interception of operations on the instances in order
to maintain these invarients.  If a minimum spanning set of operations
is used consistently then it is much easier to do this, since only the
relevant routines in the spanning set need to be redefined.

For example, consider the ISE list hierarchy.  The deferred class LIST
has only a few deferred methods.  All the position features are
defined in terms of `count' and `position', so `offright' returns True
if `position > count'.  Movement is (mostly) defined in terms of
`go_offleft', `go_offright' and `forth'.  This means that subsequent
implementors only need to write a few operations.

On the other hand the LINKED_LIST class does not follow this
philosophy.  Suppose I want a list of REALs which also includes
statistics such as average and standard deviation.  I define a class
STATISTIC_LIST which inherits from LINKED_LIST[REAL] and also defines
the features `average', `standard_deviation', `sigma_x' and
`sigma_x_squared'.  In my implementation `sigma_x' and
`sigma_x_squared' are variables rather than functions, since functions
would have to walk the list for each evaluation.  Every time the list
is changed (element changed, new element inserted, element deleted)
these variables will have to be updated.

This presents a problem because the update operations in LINKED_LIST
are all implemented separately.  The minimum spanning set for updates
is `put', `put_right' and `remove', but these are not used.  Hence I
must go through LINKED_LIST re-implementing all the update methods.

One problem with the spanning set concept is the requirement for
efficiency.  As an example, take the merging of linked lists.  Using
the minimum spanning set, each item in the source list would have to
be inserted into the target list and then removed from the source.
This would make the procedure an O(n) operation.  Instead the lists
are simply spliced together through their links, a much quicker
operation.

(Detailed point: it is still necessary to zip through the source list
in order to get to the last element, but even so the operation is still
much faster than with the minimum spanning set)

The answer to this point is that the minimum spanning set algorithm
should be used in an ancestor class (such as D_LIST or L_LIST) and
then redefined for efficiency in the effective classes while the
original routines are renamed and not exported.  Hence the minimum
spanning set algorithm is still available if it is required in
subsequent descendants.

If LINKED_LIST had followed this philosophy then I could create my
STATISTIC_LIST by redefining `put', `put_right' and `remove', and then
renaming and redefining in the `inherit' clause to bring back the
original spanning set implementations of the other methods.

(At the International Eiffel Users Conference in Paris, Dr. Meyer said
that Eiffel 3 will include `synonym' and `frozen' constructs to help
with the management of features in descendant classes.)

Paul.

Paul Johnson                               UUCP: <world>!mcvax!ukc!gec-mrc!paj
--------------------------------!-------------------------|-------------------
GEC-Marconi Research is not 	| Telex: 995016 GECRES G  | Tel: +44 245 73331
responsible for my opinions.	| Inet: paj@gec-mrc.co.uk | Fax: +44 245 75244