[comp.sw.components] Reasons for low ADT reuse

scotth@boulder.Colorado.EDU (Scott Henninger) (09/22/89)

|>From: billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu (William Thomas Wolfe, 2847 )
|
|    All of the negatives you cite (except "may use more storage", which 
|    is bogus) are overcome by the reuse paradigm.  The ADT is developed
|    with extremely high quality at a high cost, but this cost is then 
|    spread over an extremely large number of users until the end of time;
|    the result is very much an economic win-win situation.

OK, so let's say this is true - that the high cost of creating "quality" ADTs
is distributed over the number of people who reuse them.  You gain a "cost"
advantage under the following conditions:

  1) The cost of reusing the component is small.
  2) It fits the needs of a large number of programmers AND they choose to use it. 

The extent to which these conditions are not realistic is the extent to which
this method will ultimately fail.

The ADT approach has been around for a long time, so I ask you, why is it not
currently practiced?  Don't tell me that training is the answer - programmers are
already trained to design and implement ADTs.

By the way, I'm still not convinced that ADTs will ever constitute anything more
than a trivial percentage of real application code, given either today's or tomorrow's
technology.


-- Scott
   scotth@boulder.colorado.edu

billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu (William Thomas Wolfe, 2847 ) (09/22/89)

From scotth@boulder.Colorado.EDU (Scott Henninger):
> The ADT approach has been around for a long time, so I ask you, 
> why is it not currently practiced?  Don't tell me that training 
> is the answer - programmers are already trained to design and 
> implement ADTs.

   Consider some of the things that have happened within the
   last few years: there is now a language whose definition
   is an ANSI and international standard which prohibits subsets
   and supersets (ensuring portability), which provides *secure*
   facilities for the construction of quality ADTs (and by that
   I mean limited private types, which can be built such that
   instances of the ADT will maintain consistency in the presence
   of multitasking, and will in fact process requests in parallel
   while ensuring serializability and recoverability, etc. etc.),
   and for which second-generation highly optimizing compilers
   are now widely available.  This language is also saving lots
   of money in practice (as described in the IEEE Software article
   of November 1988, "Large Ada projects show productivity gains"),
   which is stimulating considerable growth in its use.  Finally,
   companies have developed catalogs of components for this language
   (Wizard Software, lib systems, etc.) and are actively selling them 
   into the user market.  In short, the conditions in which the ADT
   approach can thrive are now being established.
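
   (To make the "limited private" point concrete, here is a rough
   sketch of what such an interface looks like -- a hypothetical
   Stacks package, with the tasking and serializability machinery
   left to the package body:)

```ada
-- Hypothetical Stacks package: "limited private" is what keeps
-- clients from copying or comparing instances behind the ADT's
-- back.  Multitasking protection would live in the package body.
package Stacks is

   type Stack is limited private;

   Underflow : exception;

   procedure Push (S : in out Stack; Item : in Integer);
   procedure Pop  (S : in out Stack; Item : out Integer);
   function Is_Empty (S : Stack) return Boolean;

private

   type Cell;
   type Cell_Ptr is access Cell;
   type Cell is
      record
         Item : Integer;
         Next : Cell_Ptr;
      end record;

   type Stack is
      record
         Top : Cell_Ptr;
      end record;

end Stacks;
```

   Because Stack is limited private, assignment and equality are
   simply unavailable to clients; every operation goes through the
   package, which is what makes the consistency guarantees enforceable.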

   You want results?  From the article cited above: Magnavox did an
   Ada project at the 1.2-million-line level in which reusable
   software NOT developed on the project was not counted at all,
   and reusable software developed on the project was counted only
   once; the productivity was 550 lines per man-month for the systems
   software and 704 lines per man-month for the applications software.
   The average productivity found in a productivity consultant's database
   of 1,500 projects at the 1.2-million-line level was only *77* lines per
   man-month.   Reuse saved 190 man-months (9%) of effort and reduced
   the schedule by two calendar months (4%); Magnavox expects to increase
   the reuse rate to 25% on the next project, and believes that a reuse
   rate of 50% is possible.

   Bottom line: Reuse is now proving itself.  We are making it happen.

   (By the way, reuse is going to be a major topic at the Tri-Ada '89
   conference up in Pittsburgh, and I will be in an excellent position
   to give recent results as soon as I get back from the conference.)


   Bill Wolfe, wtwolfe@hubcap.clemson.edu
   

scotth@boulder.Colorado.EDU (Scott Henninger) (09/22/89)

|>From scotth@boulder.Colorado.EDU (Scott Henninger):
|> The ADT approach has been around for a long time, so I ask you, 
|> why is it not currently practiced?  Don't tell me that training 
|> is the answer - programmers are already trained to design and 
|> implement ADTs.
|
|>From: billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu (William Thomas Wolfe, 2847 )
|   You want results?  From the article cited above: Magnavox did an
|   Ada project at the 1.2-million-line level in which reusable
|   software NOT developed on the project was not counted at all,
|   and reusable software developed on the project was counted only
|   once; the productivity was 550 lines per man-month for the systems
|   software and 704 lines per man-month for the applications software.
|   The average productivity found in a productivity consultant's database
|   of 1,500 projects at the 1.2-million-line level was only *77* lines per
|   man-month.  Reuse saved 190 man-months (9%) of effort and reduced
|   the schedule by two calendar months (4%); 

Although I am truly happy these managers were able to find ways to claim
productivity gains (and probably get promotions for it), we should be
skeptical of these numbers for many reasons, two of which follow:

  1) COBOL could probably do better by the lines-of-code measurement. 
     In other words, just because more lines of code were developed
     doesn't mean that productivity was enhanced.

  2) Different projects have different dynamics.  It is a fallacy to
     compare man-months of projects that simply had the same number of
     lines of code.  First of all, 1) still applies.  Secondly, the
     difficulty of the domain is not accounted for.  Thirdly, the skills
     of the programmers are not accounted for; ad infinitum.  You can't
     compare apples to oranges! 

|   Magnavox expects to increase the reuse rate to 25% on the next
|   project, and believes that a reuse rate of 50% is possible.

General Electric claimed the same thing in the 60's.  It never
materialized.

The bottom line is that there is a distinct possibility that these
results are contrived by self-serving interests.  I think you're right
in stating that reuse is making headway, but it is extremely small, and
is certainly much smaller than the results you cite indicate.  

I characterize it as a monkey climbing a palm tree to reach the moon. 
He sees he is making progress and begins to make wild claims. 
Unfortunately, he soon reaches a dead end.  I claim the dead end in
software reuse is cognitive overload.  It will take thousands or
millions of components to make a truly usable software reuse
environment.  There is no way that a human can keep track of what is
available at that kind of scale.

The cognitive overload also comes in understanding what a component does
for you.  Do you realize how hard it is to get novice programmers to
understand what a stack does?  And this is one of the simplest
constructs in computer science.  Just think of training someone to
understand 1000 ADTs (that are much more difficult to understand than
stacks) to the point that he can (re)use them. 

I also must emphasize that the problems of reuse ARE LANGUAGE
INDEPENDENT.  Even with all of the religious followers of Ada out there,
you will never in a million years convince me that Ada is anything more
than trivially different from Pascal, C, Algol, etc.

|   [...] there is now a language whose definition is an ANSI and
|   international standard [...]

Emerson said that "a foolish consistency is the hobgoblin of little
minds...".  In this spirit, I would claim that we do not know enough
about human factors in programming (note that I did not say programming 
*languages*) to point to standards as a means to claim that your favorite
language is the best. 


-- Scott
   scotth@boulder.colorado.edu

rang@cs.wisc.edu (Anton Rang) (09/23/89)

In article <11963@boulder.Colorado.EDU> scotth@boulder.Colorado.EDU (Scott Henninger) writes:
>  I claim the dead end in
>software reuse is cognitive overload.  It will take thousands or
>millions of components to make a truly usable software reuse
>environment.  There is no way that a human can keep track of what is
>available at that kind of scale.

  The trick is to let the computer help with this, and find a
reasonable way to organize the components.  There were those who
claimed in the 17th century that widespread printing would lead to a
disaster--libraries would be overloaded and it would be impossible to
find the texts on a given subject.
  What happened?  Library science started appearing.  It tackles
exactly the same problem which will occur with re-use in software
engineering: millions of components, all doing slightly different
things.  The library on campus here has somewhere between 1 and 2
million books; yet, when I go to look for a book on finite automata, I
can find a set of them easily.  We need to develop a similar
organizational technique for software components.

>The cognitive overload also comes in understanding what a component does
>for you.  Do you realize how hard it is to get novice programmers to
>understand what a stack does?  And this is one of the simplest
>constructs in computer science.

  Yes, but it gets easier to understand new data structures as one
goes along.  I've only been programming for 8 years or so, but I can
look at a data structure description in Knuth and understand it in an
hour or two (as a general rule :-).  I can even define what I need
("a priority queue with fast removal of the first element and a fast
way to merge two queues") and go find a data structure for it.  Of
course, there aren't enough books which describe the structures....
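
  (That sort of requirement can even be written down as an interface
before going hunting -- a hypothetical generic package spec; a leftist
or binomial heap in the body would satisfy the fast-merge requirement:)

```ada
-- Hypothetical Priority_Queues specification: the stated requirement,
-- written down as an interface.  A leftist or binomial heap in the
-- body gives O(log n) Remove_First and Merge.
generic
   type Item is private;
   with function "<" (Left, Right : Item) return Boolean;
package Priority_Queues is

   type Queue is limited private;

   procedure Insert       (Q : in out Queue; X : in Item);
   procedure Remove_First (Q : in out Queue; X : out Item);
   procedure Merge        (Target, Source : in out Queue);
   function Is_Empty      (Q : Queue) return Boolean;

private

   type Node;
   type Node_Ptr is access Node;

   type Queue is
      record
         Root : Node_Ptr;
      end record;

end Priority_Queues;
```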

>I also must emphasize that the problems of reuse ARE LANGUAGE
>INDEPENDENT.

  Definitely true.  I reuse a good chunk of my Pascal code...just
change the names of the structures and key fields.  Now, a way to do
this automatically (without manual editing) would be nice--that's
what Ada gives you.  But the basic problem isn't going through and
making the minor fixes, it's figuring out what you can reuse and how
to do it.

[ stuff about why "standard" doesn't necessarily mean "best" ]

  I agree and disagree with this, I guess.  Having a standard is good
because it means that components are more portable.  Right now, I have
a lot of Pascal libraries using VAX Pascal-specific constructs (i.e.
modules); moving these to other systems is possible, but quite a bit
of work.  On the other hand, I can move Ada code around without any
problems (barring compiler bugs/features :-).
  Using the library analogy, most of the books here are in English.
There are some in German, Spanish, and other languages.  If someone
wants to take the time, they can translate them, and it will take less
effort than writing them from scratch.  On the other hand, if I knew
how to read all the languages in the library, I could use every book
as a potential source.
  Similarly, I think more attention should be paid to interfacing
different languages together.  Calling Pascal or FORTRAN code from Ada
ought to be trivial, as long as certain Ada-specific semantics (such
as exception handling) are handled right.  Right now, I only know of
one operating system (VMS) which lets me mix languages freely without
worrying too much about their idiosyncrasies.  It might be nice to
have more development environments which handle multiple languages.
  Just my thoughts....
   
+----------------------------------+------------------+
| Anton Rang (grad student)        | rang@cs.wisc.edu |
| University of Wisconsin--Madison |                  |
+----------------------------------+------------------+

"You are in a twisty little maze of Unix versions, all different."

eichmann@h.cs.wvu.wvnet.edu (David A Eichmann,316K) (09/23/89)

From article <11963@boulder.Colorado.EDU>, by scotth@boulder.Colorado.EDU (Scott Henninger):
> 
> ...It will take thousands or
> millions of components to make a truly usable software reuse
> environment.  There is no way that a human can keep track of what is
> available at that kind of scale.
>

   The key word here is *environment*...requiring an electrical engineer
to have memorized the specifications of all components available for use
in a circuit would be a similarly daunting requirement.

   What's truly necessary in this setting is a library retrieval system
capable of narrowing the field of search to a manageable set of
alternatives.  Prieto-Diaz' work on faceted classification of software is
one such approach.  Classifying a component can be costly, but that cost
can be amortized over a lifetime of retrieval and reuse.
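
   A faceted scheme along those lines can be sketched directly (with
hypothetical facet names and a toy exact-match query; Prieto-Diaz'
actual scheme uses controlled vocabularies and conceptual closeness
rather than exact matching):

```ada
-- Hypothetical faceted catalog: each component is described by one
-- term per facet, and a query may leave any facet as a wildcard.
package Faceted_Catalog is

   type Facet is (By_Function, By_Object, By_Medium);

   subtype Term is String (1 .. 12);
   Any : constant Term := (others => ' ');   -- wildcard in queries

   type Descriptor is array (Facet) of Term;

   type Component is
      record
         Name   : Term;
         Facets : Descriptor;
      end record;

   -- True when every non-wildcard facet of the query matches C.
   function Matches (C : Component; Query : Descriptor) return Boolean;

end Faceted_Catalog;

package body Faceted_Catalog is

   function Matches (C : Component; Query : Descriptor) return Boolean is
   begin
      for F in Facet loop
         if Query (F) /= Any and then Query (F) /= C.Facets (F) then
            return False;
         end if;
      end loop;
      return True;
   end Matches;

end Faceted_Catalog;
```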

- Dave
------ 
David Eichmann
Dept. of Statistics and Computer Science
West Virginia University                  Phone: (304) 293-3607
Morgantown, WV  26506                     Email: eichmann@a.cs.wvu.wvnet.edu