[comp.lang.ada] Self-Modifying Abstract Data Types

rjh@cs.purdue.EDU (Bob Hathaway) (03/12/89)

This is a copy of a long paper on self-modifying abstract data types which
is being mailed to comp.sw.components since a method being proposed could
apply to an implementation technique which was used here.  My apologies for
anyone getting this twice, I posted to comp.lang.sigplan first and
apparently because that newsgroup is moderated, the message wasn't sent
to comp.lang.ada.


(Readers uninterested in abstract data types can skip this article)

ABSTRACT

Last September I developed a program for an Icon subset which implemented
a self-modifying abstract data type using a transparent multi-level
design to allow simultaneous, multiple implementations.  I considered
writing a journal article on the technique at the time and will 
eventually get around to it, but for now I'll give a quick description.
Part of the technique presented could roughly correspond to a "synthesis"
approach and should be appropriate to the current discussion.  Since this
is essentially a rough draft, comments and criticisms are welcome.


TOP LEVEL SPECIFICATION OF TABLES

Icon subset tables are arrays indexed by strings and have a polymorphic
basetype, i.e. the elements are heterogeneous.  In the subset, table elements
contained objects (1. objects could also be tables, resulting in a recursive
definition).  It was simple to devise a top level design specification for
tables as an abstract data type, which is repeated below without 
documentation (2. the description and operation semantics aren't relevant
to the example).

    TOP LEVEL SPECIFICATION
	Icon Table
	    <Description deleted>
	Operations (interface)
	    CreateIconTable 
	    Initialize 
	    Lookup 
	    Insert 
	    Print 
	    GetDefaultValue 


LOWER LAYER DESIGN OF TABLES

There were several alternatives for the design and representation of the
table's lower layer.  It was desirable and sufficient to meet the above
interface in a general way.  A linked list would provide a simple but
inefficient implementation, a tree implementation would allow future slicing
extensions, and a hash table implementation would be more complex but allow
constant time lookup.

To provide the necessary generality to allow any representation, the lower
level design of the Adt was based on a generic dictionary.  This
dictionary provided the usual lookup and insert operations and a print
routine was later added to simplify debugging.  All operations in the 
top level design were implemented in terms of dictionary operations
with the exception of GetDefaultValue, which is specific to tables.


Dictionaries

An Adt had to provide the following (necessary and sufficient) operations
to satisfy the dictionary operations required to implement tables:
	Initialize
	Insert
	Lookup
	Print
Any Adt providing these operations could be used as the dictionary.  This
implementation of the table's lower layer resulted in:

	 TABLE SPECIFICATION GRAPH

1.   >-------> table:  Create Initialize Lookup Insert Print 
     |          /|\    GetDefaultValue
     |         / | \
     |        /  |  \
     |       /   |   \
2.   |    list  tree  hash table ...:  Initialize Insert Lookup Print
     |     /     |     \
3.   ^__ ...    ...    ... :  objects 

Here, layer 2 corresponds to both the implementation of tables and the
specification of dictionaries because the table implementation is an
application of the dictionary specification (there are actually four layers).
Since layer 2 is hidden in the lower layer of the table Adt, it should be
transparent, i.e. completely hidden from applications using tables.  This
scheme views tables as indexed containers of objects.  Layer 1 table
operations and layer 2 dictionary operations should then return objects and
not the internal elements used to store objects.  Since the implemented
Icon subset allows tables to change dictionary (layer 2) representations 
dynamically, an insecurity would result if the table returned an internal
element type and then changed representation.  By returning objects, tables
are free to undergo changes in representation without concern for the 
context of operations, i.e. self-modifying tables should be stateless.
Since objects are implemented by reference, the semantics of table entries
and returned objects are preserved.  For example, if a lookup operation
returned an internal element type and the table subsequently changed
representation, the returned element would be undefined because the table
entry corresponding to that element had changed.  Change of representation
(self-modification) occurs in layer 2 when a table operation detects a
specific condition such as a dramatic size change.  The table then extracts
all elements from the current dictionary and inserts them into a new
dictionary which then composes the table representation.  If the Adt is
implemented as a lightweight process (task or thread), any condition or
event could trigger self-modification.  The same applies to abstract data
objects.


LOWER LAYER IMPLEMENTATION

The table implementation used a record consisting of a pointer and a set of
operations implemented as procedural variables.  In Ada the operations
correspond to generic subprogram parameters.  Any Adt which provided the
dictionary operations described above was suitable to implement the lower
layer.  In Modula, initializing the table involved assigning the operations
provided by the dictionary Adts to the procedural variables and setting the
pointer variant to the dictionary Adt variable.  In Ada, generic subprogram
and type parameters are sufficient for the simplist case.  For example,
the hash table Adt provided lookup, insert, print, and initialize operations
which were assigned to the procedural variables.  The pointer (3. actually
a variant record with the dictionary and ADDRESS types as alternatives)
was then interpreted as an instance of a hash table Adt.  The lower layer
of the hash table was implemented as a pointer to an array but this was
only consequential to the initialize routine.


			TABLE IMPLEMENTATION

Layer 1                      Layer II           Layer III

Create		   >------>|-----------|
Initialize	   | 	   |Initialize |   (Dictionary Operations)
Lookup     	   | 	   |Insert     |
Insert		   | 	   |Lookup     |
Print		   | 	   |Print      |   (Dictionary Representations)
GetDefaultValue	   | 	   |table      |--->|--->linked list
value  ------------^  	   |-----------|    |
				    	    |--->tree
				            |		  
				            |--->hashtable
				            |
				            |--->other


In Ada, layer 2 instantiates the dictionary with the object type
(4. implemented as an Adt encapsulating a pointer to a variant record 
containing all Adts representing the object).  Polymorphic variables
would have provided better table elements, and in fact, the above
scheme uses non-polymorphic Adts to implement polymorphic table variables.
A Next operation was also used to implement self-modifying tables.


SUMMARY

A technique to allow simultaneous, multiple implementations of self-modifying
abstract data types was presented.  Two distinct advantages of the technique
are ease of modification and efficient representation.  Since no single
representation is "hard-wired" into the lower layer of an Adt, they are
easy to change.  Experiments with alternate representations can discover
optimal lower layers for particular applications.  New representations
can also be added easily.  Self-modification allows efficient representations
for each Adt variable or object.  In the presented example, small tables
could use a list dictionary representation and convert themselves
into trees or hash tables as necessary, e.g. during dramatic size changes.  


EARLIER DISCUSSIONS

In the "synthesis" terminology, if I follow it correctly, the dictionary
corresponds to a class structure and the list, tree, and hash tables
correspond to types satisfying the dictionary class.  The table operations
would then invoke dictionary class operations to invoke the appropriate
type operation for any dictionary representation; a high level view of
the above implementation technique.  For discussion, is this view of
synthesis correct?  What solutions would inheritance provide?

In regard to the earlier discussion in which I proposed dynamically
parameterized Adt operations, Ada generic subprogram parameters could allow
an instance to be programmed in the context of an instantiation but the
operation would not be defined outside that scope except by default. 
For this case procedural variables are necessary as memory to store a new
operation.  I don't know if the "synthesis" class can provide change of
operations, it appears classes could allow complete change of representation
(type) as described above but not of selected operations.

Bob Hathaway
rjh@purdue.edu