[comp.lang.smalltalk] Results of the Search for Object-Oriented Metrics

eberard@ajpo.sei.cmu.edu (Edward Berard) (08/10/89)

I received about 20 replies to my request for information on
object-oriented metrics. A few were simply requests for more
information. Some respondents requested that I not publish all, or
part, of their response.

The respondents came from many different backgrounds, and represent a
variety of programming languages and approaches. If you choose to
contact any of them please be (at least temporarily) respectful of
their way of doing business.

Next, I would like to make some general observations:

	- There are more than a few graduate students (worldwide) who
	  are researching object-oriented technology in general, and  
	  several who are specifically focusing on object-oriented
	  metrics.

	- More than once, I was informed that while many people valued
	  metrics, few people actually collected them. Of those that
	  did collect metrics, few did it systematically and/or
	  regularly. 

	- I was also made aware that many, if not all, CASE vendors
	  did not supply tools which would automate the gathering of
	  object-oriented metrics.

Regarding software engineering metrics in general:

	- There are three broad categories of software engineering
	  metrics: 

		- Metrics which are used to measure software (both
		  code and non-code software) itself, e.g., lines of
		  code, pages of documentation, numbers of process
		  boxes, and ease of reusability.

		- Metrics which are used to measure various software
		  engineering processes, e.g., how long should
		  "design" take.

		- Metrics which are used to measure software engineers,
		  either individually, or in groups, e.g., effort per
		  unit output.

	- Seldom is any one metric sufficient, since emphasis in one
	  area can result in disaster in another area. Collecting only
	  one metric is usually not worth the effort, and may actually be
	  detrimental. 

	- For comparison of metrics to be meaningful, the terms have
	  to be precisely defined, e.g., what is a class, an
	  operation, a method.

	- For comparison of metrics to be meaningful, the environment
	  has to be defined, e.g., productivity goes down as the size
	  of the project goes up.

	- Even the best, and most meaningful, metrics can be rendered
	  useless through a faulty analysis.

[I, as always, have much more to say on this topic. Besides
object-oriented technology, and software reusability, metrics are also
an area of interest and study for me.]

I was surprised about how few people mentioned metrics like:

	- the typical number of operations in the interface of a
	  class

	- the typical number of instances created per class per
	  application 

	- the average size of a method (e.g., in lines of code)

	- the number of classes per application

	- metrics for measuring polymorphism

	- the numbers of superclasses for a typical class

	- the relative efficiencies of object-oriented databases
	  compared with relational databases

	- metrics for the effort and resources needed to transition an
	  organization to the object-oriented paradigm.

	- metrics that would allow for the meaningful comparison of
	  the object-oriented paradigm with other paradigms.

	- metrics which are especially useful for object-oriented
	  development in the large (OODIL)

	- metrics for such activities as testing, debugging, and
	  maintaining objects, classes, and other object-oriented
	  entities
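
Several of the metrics listed above (operations in a class interface,
number of superclasses, classes per application) can be gathered
mechanically. The following is a minimal sketch in Python, not a standard
tool: the helper `class_metrics` and the toy classes `Shape`, `Rectangle`,
and `Square` are invented for illustration, and "interface operation" is
assumed here to mean a public method defined directly on the class.

```python
import inspect


def class_metrics(cls):
    """Return simple size metrics for one class (hypothetical helper)."""
    # Public operations defined directly on the class, not inherited ones.
    operations = [
        name
        for name, member in vars(cls).items()
        if inspect.isfunction(member) and not name.startswith("_")
    ]
    # Every ancestor in the method resolution order, excluding the class
    # itself and the universal root `object`.
    superclasses = [c for c in cls.__mro__[1:] if c is not object]
    return {
        "class": cls.__name__,
        "interface_operations": len(operations),
        "superclasses": len(superclasses),
    }


# Toy classes, invented for the example.
class Shape:
    def area(self):
        raise NotImplementedError

    def describe(self):
        return f"{type(self).__name__} with area {self.area()}"


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)


report = [class_metrics(c) for c in (Shape, Rectangle, Square)]
classes_per_application = len(report)
```

A different definition of "operation" (for example, counting inherited
methods as well) would give different numbers, which is why the terms
need to be defined before results are compared.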


I wish to thank all who contributed. My next step will probably be to
write an article on the topic, after some more research.

				-- Edward V. Berard
				   Berard Software Engineering, Inc.
				   18620 Mateney Road
				   Germantown, Maryland 20874
				   Phone: (301) 353-9652
				   E-Mail: eberard@ajpo.sei.cmu.edu

----------------------------------

Ralph Johnson (johnson@p.cs.uiucf.edu):

Every so often I see a message that makes me want to mount a soap-box
and preach.  My apologies to Edward Berard.  This is certainly not
an attack on you, even though you are the cause of it!

I am usually suspicious of software metrics.  In general, we do
not know what to measure of software.  We really need to measure the
IDEAS in software, not the lines of code, though nobody knows how to
do that.  

The typical size of a class depends on how good the design is.
Those new to OOP often write huge classes, but a good design
will have a larger number of smaller classes.  The proper way
to measure the size of a class is NOT in source code instructions
but in methods (member functions, etc), since the most important
aspect of a class is its external interface.  The number of
methods is directly related to the number of source code instructions
since no method should be more than a page long.  In Smalltalk, the
median size of a method is three lines.  A class with more than 
50 methods is probably getting too large.
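
The two rules of thumb in this paragraph (methods per class, and lines per
method) are easy to check from source text. The sketch below uses Python's
`ast` module (Python 3.8 or later, for `end_lineno`) rather than Smalltalk;
the `Account` class and the helpers `method_sizes` and `summarize` are
invented for illustration, and a method's length is assumed to include its
`def` line.

```python
import ast
import statistics

# Toy source text, invented for the example.
SOURCE = '''
class Account:
    def __init__(self, owner):
        self.owner = owner
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
'''

MAX_METHODS = 50  # the threshold quoted above; a rule of thumb, not a law


def method_sizes(source):
    """Map each class name to the lengths, in lines, of its methods."""
    sizes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            sizes[node.name] = [
                item.end_lineno - item.lineno + 1
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
    return sizes


def summarize(source):
    """Report method count, median method length, and a size warning."""
    return {
        name: {
            "methods": len(lengths),
            "median_lines": statistics.median(lengths) if lengths else 0,
            "too_large": len(lengths) > MAX_METHODS,
        }
        for name, lengths in method_sizes(source).items()
    }


summary = summarize(SOURCE)
```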

All classes are not equal.  A minor variation of a well-understood concrete
class can be built in a few minutes.  A concrete subclass of a well-designed
abstract class takes a few hours.  Designing a good abstract class
can take months.  Thus, "the typical amount of time necessary for a designer
to design a class" is simply not a well-formed question!  This is like
asking the average lifetime of a mammal.  It makes a big difference whether
you are a mouse or an elephant.

In my opinion, the difficulty of moving to the object-oriented paradigm
is overrated.  The problem is that you have to have somebody who
understands it who can teach the others.  This person has to focus
on teaching, not on building or designing.  Most groups don't have
anybody like this.  I have just started teaching all the employees in a
small company object-oriented programming and design.  My guess is
that it will take about 3 months before they are able to use the tools
of the trade well and 6 months to become experts.  This is based on
the time it takes for me to train my own students.

The effort required to build systems using object-oriented technologies
depends almost entirely on the size and quality of the existing class
libraries.  Designing these libraries is hard and requires good designers.
Using them is much easier.  One of the major advantages of Smalltalk is
that it comes with a good class library.  Of course, eventually the other
languages will have something similar, but telling people to just go
out and write their own is silly.  The whole point of object-oriented
programming (in my opinion) is code reuse, and a system for code reuse
that doesn't come with any reusable code doesn't do much good in the
short term.  You will first have to spend the long process of writing
reusable classes before you can receive the tremendous payback that
is possible from object-oriented programming.

The size and performance of a system depends as much on the language
implementation and the class libraries as anything else.  Object-oriented
language implementation is still at an early stage, so size and speed
should improve a lot in the future.  For example, C++ is as efficient
as other languages, though programs that make heavy use of class libraries
tend to be a lot bigger because they tend to include code they don't need.
Smalltalk programs have a very large minimum size because they all include
the programming environment (though this will change) but they grow a lot
slower because of code reuse and efficient representations of programs.
Purely numeric Smalltalk programs are orders of magnitude slower than
the equivalent FORTRAN programs, but programs that use a lot of graphics
or that depend heavily on polymorphism might be just as fast or faster
than their C equivalents.  Thus, it is impossible to make simple
statements about relative size and speed.

---------------------------------------

Rajendra K. Raj <rkr@june.cs.washington.edu>

I am responding to your recent message about object-oriented metrics on the
net. I have been looking at object-oriented metrics that help measure the
amount of reuse present in an object-oriented system, but I don't have
anything to distribute as yet. Sometime in the next few months is what I
usually tell people who ask me when I'll have something ready.

Please keep me informed about any responses you get on this issue. If you
plan on posting a summary to the net, that will be okay too. Thanks,

    EMail:  rkr@cs.washington.edu,     rkr@uw-june.UUCP

    Mail:   Department of Computer Science, FR-35
            University of Washington
            Seattle   WA 98195
            USA

--------------------------------

Jakob Nielsen (jn@iddth.dk):

To your list I would add "The time it takes a programmer who is experienced
in traditional languages to transfer to the object-oriented language".

Check out my article in the May 1989 IEEE Software about the difficulties
of learning Smalltalk. We only give anecdotal numbers, but the learning
time for Smalltalk may be in the order of two months for an experienced
programmer. This is much longer than many people might expect and could
screw up the time estimates for some projects. Maybe other languages such
as C++ are faster to learn.

Jakob Nielsen
Asst.Prof. of User Interface Design
Technical University of Denmark, Dept. of Computer Science
Building 344, DK-2800 Lyngby Copenhagen, Denmark
datjn@neuvm1.bitnet or jn@iddth.dk

------------------------------------

David Weiss (weiss@software.org):

I suggest that you look at the papers published concerning
measurement of the A-7 project at the Naval Research Laboratory.
I believe the approach used on that project would certainly be
considered object oriented, despite the fact that the project predates
the term object oriented by several years.  In particular, I am thinking
of papers such as

Basili & Weiss, Evaluation of a Software Requirements Document By
Analysis of Change Data, Proc. 5ICSE, 1981 (This paper describes the
measurement approach and how it was applied to a requirements
document.  Except for the measurement approach, the data it contains
is probably of little interest from the object oriented viewpoint.)

Norcio & Chmura, Design Activity in the Software Cost Reduction
Project, NRL Report 8974, August, 1986 (This paper analyzes design
activity on the project, including an analysis of number of labor
hours needed to design different modules)

Chmura, Norcio, Wicinski, Design Changes in the Software Cost
Reduction Project, NRL Report 9124, June 1988 (This paper analyzes the
changes made to the design during the development process. The
analysis looks mostly at the effort to make changes.)

There are several other references that may be of interest, but these
are a good starting point.

-----------------------------

Doug L. Bryan (bryan@sierra.STANFORD.EDU):

Ed,

>       - any other object-oriented metrics, or metric techniques,
>         that you can think of, or have experience with

One area of computer science that I think software engineers have
woefully and wrongly ignored is graph theory (with the notable
exception of one good paper at the International S/W Eng. conference
in Singapore that showed why the cohesion of Booch's components wasn't
too good).  I've been thinking about metrics that use graph theory
lately since my real work is starting to use graphs a lot.

Consider the following object/class/package/type dependency:

                A
               /|
              v |
              B |
               \|
                v
                C

A depends on B and C; B depends on C.  This is nicely represented
by a directed graph.  Is this a good dependency structure?  Some would
say yes, some would say no; I say no.  This can be measured using many
heuristics/specifications, including:

        for all nodes x,y: exists 2 paths of different lengths
            between x and y.

food for thought.
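
The heuristic above is stated tersely. One possible reading, assumed
here, is that a pair of nodes joined by two directed paths of different
lengths marks a "shortcut" dependency such as A -> C in the diagram. A
minimal Python sketch under that assumption follows; the names
`path_lengths` and `shortcut_pairs` are invented for illustration, and the
graph is assumed to be acyclic.

```python
from functools import lru_cache

# The dependency graph from the message: A depends on B and C; B depends on C.
DEPENDS_ON = {"A": ("B", "C"), "B": ("C",), "C": ()}


def path_lengths(graph):
    """Map (x, y) to the set of lengths of all directed paths from x to y.

    Assumes the graph is acyclic; a cycle would recurse without end.
    """

    @lru_cache(maxsize=None)
    def from_node(x):
        lengths = {}
        for successor in graph[x]:
            # The edge x -> successor is a path of length 1.
            lengths.setdefault(successor, set()).add(1)
            # Every path leaving the successor is one edge longer from x.
            for y, found in from_node(successor).items():
                lengths.setdefault(y, set()).update(n + 1 for n in found)
        return lengths

    return {(x, y): ls for x in graph for y, ls in from_node(x).items()}


def shortcut_pairs(graph):
    """Return the node pairs connected by paths of at least two lengths."""
    return sorted(pair for pair, ls in path_lengths(graph).items() if len(ls) > 1)


flagged = shortcut_pairs(DEPENDS_ON)
chain = shortcut_pairs({"A": ("B",), "B": ("C",), "C": ()})
```

On the graph above the only flagged pair is (A, C), reached directly and
also through B; a plain chain A -> B -> C flags nothing.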

--------------------------------------

Steve Sanderson (halley!san@cs.utexas.edu):

I remember a paper at SIGCHI '89 here in Austin that may be relevant.  I
haven't looked for this paper in the proceedings, but I'll bet it's worth
a shot.

Steve Sanderson

san!halley@cs.utexas.ed  -or-  cs.utexas.edu!halley!san

---------------------------

Ra'ad Siraj (siraj@harvard.harvard.edu): 

Two months to get familiar with the environment, oo concepts, and oo syntax.
And then roughly one day per class.

----------------------------------------



David Wheeler (wheeler@ida.org): 

The "Law of Demeter" is documented in a few places, including
page 323 of the OOPSLA '88 Conference proceedings.
It's essentially a measure of good O-O "style".

Title: "Object-Oriented Programming: An Objective Sense of Style"
by K. Lieberherr, I. Holland, A. Riel.
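
In rough terms, the Law of Demeter says that a method should send messages
only to its own object, its arguments, objects it creates, and its object's
direct components, not to objects it obtains by reaching through another
object. The Python sketch below illustrates the idea; the classes `Wallet`,
`Customer`, and `Cashier` are invented for this illustration and do not come
from the paper.

```python
class Wallet:
    def __init__(self, balance):
        self.balance = balance

    def debit(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class Customer:
    def __init__(self, wallet):
        self.wallet = wallet

    def pay(self, amount):
        # Customer talks only to its own direct component.
        self.wallet.debit(amount)


class Cashier:
    def charge_reaching_through(self, customer, amount):
        # Violates the rule: the cashier reaches through the customer
        # and manipulates the customer's wallet directly.
        customer.wallet.debit(amount)

    def charge(self, customer, amount):
        # Follows the rule: the cashier only sends a message to its argument.
        customer.pay(amount)


customer = Customer(Wallet(100))
Cashier().charge(customer, 30)
remaining = customer.wallet.balance
```

Both `Cashier` methods have the same effect, but only the second keeps
`Cashier` independent of how `Customer` stores its money.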

--- David A. Wheeler
wheeler@ida.org

------------------------------

Shari Pfleeger (pfleeger@ctc.contel.com):

In response to your request for information on object-oriented
metrics:  We at Contel are doing quite a bit.  I have developed a
cost model for object-oriented development that seems to be far more
accurate than COCOMO and Ada-COCOMO on the small set on which I have
tested it.  We are using a count of objects and methods as a size
estimator, in the same way that HP is using the count.  As head of
the Contel Software Metrics Program, I am having all of our
object-oriented projects count objects and methods as well as lines
of code and function points.  In the next few years, we will be able
to tell which measure is most appropriate for what projects.

You may want to contact Sally Shlaer at Project Technology.  She is
asking companies to allow her to use their data on object-oriented
projects for various metrics calculations.  I don't know how many
companies have agreed, but she can tell you.

If you are interested in our metrics program at Contel, please give
me a call.

Shari Lawrence Pfleeger
Contel Technology Center
15000 Conference Center Drive
PO Box 10814
Chantilly, VA 22021-3808

(703)818-4498

pfleeger@ctc.contel.com

---------------------------------

Richard Locke (locke@pdn.paradyne.com): 

...  
[On Bertrand Meyer's presentation at a recent object-oriented forum]
Meyer decries the lack of libraries with C++, and feels that the
technical means to support reusable libraries don't exist in the
language, and that this is *why* there are no libraries.

[Meyer cited an] Example of successful Eiffel development effort:
350,000 lines, 55% reusable code, 100 on team, 1 year start to market
time.  Allegedly improved productivity, time to market, and quality of
result.  (Cognos Corp., Ottawa.)