[comp.sw.components] What is a reusable software component

paul@athertn.Atherton.COM (Paul Sander) (08/01/90)

In article <71800004@m.cs.uiuc.edu> johnson@m.cs.uiuc.edu (Ralph Johnson)
writes:
>I think that Paul Sander's B-tree abstraction is a pretty good
>reusable component.  However, I think that Bill Ogden has a
>good point.  Bill's basic point is that it is really more
>important to reuse specifications than it is to reuse
>implementations.  Of course, we need to provide implementations
>for our specifications, but the user should think mostly of the
>specification and not the implementation.  B-trees are an
>implementation, while almost-constant functions are specification.

I was under the impression that a major reason for inventing reusable
components was to reuse code, that is to recycle implementations.
A clean, well-documented interface is a tool that encourages using a
piece of existing code rather than rewriting it.  Producing a set of
components with uniform (or nearly so) interfaces will hopefully flatten
the learning curve for that particular set of tools, encouraging
reuse of several components in that set.  Documenting all of the
components of that set in the same place will increase the chances
of discovery of a piece of code.

I believe, however, that reusing specifications for their own sake
is of limited value.  Though it is possible to write several implementations
of, say, an almost-constant-function (e.g. B-tree, doubly-linked list,
splay tree, etc.), each implementation has strengths, and implementation
details that could be very useful if they are presented in the interface.

Is the direction of the development of reusable software components really
toward interchangeability?  Is that really necessary?  Why would I want to
simply unplug a B-tree and plug in a doubly-linked list in its place?  Or
vice-versa?  Performance reasons, perhaps, but making minor changes to the
"glue" code surrounding the calls to the components doesn't add much to
relinking.  But other than performance characteristics, why would someone
want to keep multiple implementations of a specification, or multiple
implementations of an almost-constant-function?  If one does the job
just as well as the other, why have both?

>Further, I take the point of view held by the object-oriented
>community that it is not particular kinds of components that are
>important, but rather sets of components and their relationships.
>The advantage of concentrating on specification is that you end
>up with lots of components that share the same specification.
>When you learn one, you have learned something about all of them.
>Also, it becomes easier to replace one component with another
>because they will share specification.  Thus, it becomes easier
>to write utilities that work with any of the related components.

One of the major selling points of object-oriented anything is that
when refining something to make something else, the properties of
the original thing are inherited automatically by the new thing, and
those properties are incrementally changed by the new thing.  In
practice while programming, it turns out that the interfaces of the
two objects may be the same, but the objects are hardly interchangeable,
since their semantics are different.  They may not even be compatible in
the sense that one can replace the other in some section of code.

In any case, the inheritance mechanism is simply an automatic way
for a new object to reuse the implementation of another in all except
those features that are different from its predecessor.
-- 
Paul Sander        (408) 734-9822  | "Passwords are like underwear," she said,
paul@Atherton.COM                  | "Both should be changed often."
{decwrl,pyramid,sun}!athertn!paul  | -- Bennett Falk in "Mom Meets Unix"

ogden@seal.cis.ohio-state.edu (William F Ogden) (08/02/90)

Paul writes:

>I was under the impression that a major reason for inventing reusable
>components was to reuse code, that is to recycle implementations.

I believe that there could be an ends/means confusion here. It's something
like saying of war that the objective is clearly to discharge weapons 
[which is perhaps a natural perception for an arms manufacturer].

Similarly, if we look at the construction industry and only see that
it constantly reuses I-beams, bricks, etc., or look at the electronics
industry and only see that newer components are plug compatible with
older ones [so that you can, say, roll out an old mainframe computer,
roll in a newer model, plug in the peripherals, and you're off and running
again], then we are fixing on the physical aspect of the situation and
missing much of what is really happening.

A major reason architects and construction engineers reuse I-beams is because 
they have fixed, well-understood specifications and because their performance
characteristics are completely analysed. When they are used as components,
all sorts of structural analysis can treat the I-beams as whole units
which do not require further costly microanalysis. 

In electronics, plug compatible components don't just physically interchange
with their precursors, they provide the same [or even extended] functionality,
although they often differ in preformance or cost characteristics. [The
role of specs is again clear.]

Our intent in the reusable components effort is to help to support the
development of correct, efficient, useful, ... , brave, clean, and
reverent software in a timely and cost-effective manner. This will
certainly involve `recycling implementations', but that's just a piece
of the story. Adequate specifications and techniques for identifying
generally reusable facilities are some of the other pieces.

>  ...
>But other than performance characteristics, why would someone
>want to keep multiple implementations of a specification, or multiple
>implementations of an almost-constant-function?  If one does the job
>just as well as the other, why have both?

Performance is the reason [well perhaps programmer egotism too].
If you convert your B-tree code into a realization of the almost-constant-
function concept and add performance specs for the various operations,
you'll find that the operation for getting the value of F at index i
(in worst case) will have a duration O(log(n)), where n = Card( { j: index |
   F(j) # C } ). Now the performance of this operation might be improved
by using a hashing scheme instead of a B-tree, but then the operation which
gets the next [interesting] index at which F is non-constant would have
duration O(n). [Moreover, the hash table would require more space.] If an
application of the almost-constant-function mostly looks up random indices,
then the hashing realization is probably preferable. If it predominantly visits
the interesting indices in order, then the B-tree realization (or the
linked-list one perhaps) might be appropriate. 
The important point to note is that reusable facilities provide several
operations, so the performance optimization problem is far from single
dimensional. If realization A beats realization B in all aspects of
performance, then Paul is right and we don't need B [unless we or
some politically powerful group happened to write B :-) ]. After
we dispose of all such useless realizations we will normally be left
with a collection of performancewise incomparable realizations.
Functionally they are interchangable, so we may end up tuning an
application by the old swap and see technique.

Another concern here:
>   ....    Though it is possible to write several implementations
>of, say, an almost-constant-function (e.g. B-tree, doubly-linked list,
>splay tree, etc.), each implementation has strengths, and implementation
>details that could be very useful if they are presented in the interface.

No matter how fascinating implementation details are to a programer, they
are precisely the information we are trying to hide from facility clients.
The conceptual specification of a facility gives as simple a picture of
what functionality a client should expect as we can come up with. [E.g.
the almost-constant-function] In general this description should say
nothing about any clever pointer schemes, etc. that might be used to
realize the facility -- especially since we may develop realizations
that use completely different tricks. 
Now performance characteristics are particular to each different
realization of a concept, so specifications for them must be attached
to each realization. Even here, we do not want to complicate the clients
situation by drawing him into the details of our implementation, so
the performance specifications should be formulated in terms of the
simple conceptual picture we used to give the functional specs.

/Bill