[comp.software-eng] Re^2: Programmer productivity

schultz@cell.mot.COM (Rob Schultz) (12/01/89)

peterd@cs.washington.edu (Peter C. Damron) writes:
>                         how does one use source lines of code (SLOC)
>in any predictive way?  Presumably, we are talking about estimating
>how long a particular project will take.  I understand that once you
>know how many SLOC it will take for the project then you can predict
>how long the project will take.  But, how does one translate a
>specification into some number of SLOC?  This seems difficult to me.

Maurice Halstead (_Elements_of_Software_Science_, Elsevier North-Holland, NY,
1977) worked on a set of textual software metrics. Now I don't claim to fully
understand all of his work, but here's a shot at some of what he worked on.
(Don't be fazed by the math - once you understand it, it's not that bad!)

We can measure the number of distinct operators (n1) we will use in the
implementation of an algorithm. Similarly, we can measure the number of
distinct operands (n2) we will use in that implementation. Given a correct
design of the software component, we can also determine the total number
of operators (N1) and operands (N2) that we will use in this implementation.
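
To make the four counts concrete, here is a minimal Python sketch. The
statement being counted, the hand-tagged token list, and the function name
halstead_counts are all my own illustration, not Halstead's. Deciding what
counts as an operator versus an operand is language-specific and is simply
assumed here.

```python
from collections import Counter

# Hypothetical token stream for the statement: z = x + y * x
# Each token is tagged by hand; a real tool would need a
# language-specific rule for the operator/operand split.
TOKENS = [
    ("operand", "z"), ("operator", "="), ("operand", "x"),
    ("operator", "+"), ("operand", "y"), ("operator", "*"),
    ("operand", "x"),
]

def halstead_counts(tokens):
    """Return (n1, n2, N1, N2) for a list of (kind, text) pairs."""
    operators = Counter(text for kind, text in tokens if kind == "operator")
    operands = Counter(text for kind, text in tokens if kind == "operand")
    n1, n2 = len(operators), len(operands)                    # distinct
    N1, N2 = sum(operators.values()), sum(operands.values())  # total
    return n1, n2, N1, N2

print(halstead_counts(TOKENS))  # (3, 3, 3, 4)
```

Here x appears twice, so it adds 2 to N2 but only 1 to n2.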

Halstead went on to define the vocabulary of a software component to be
n = n1 + n2 (that is, the number of distinct operators and operands used
in the implementation of an algorithm), and its length to be N = N1 + N2
(or the total number of operators and operands used). He further postulated
that N can be estimated by N' = n1 log2 n1 + n2 log2 n2. (The accuracy of
this equation has been independently validated with confidence factors of
.95+.)
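
A quick sketch of these three quantities in Python. The function names and
the counts (n1 = n2 = 8, N1 = 30, N2 = 20) are hypothetical, chosen only so
the logarithms come out even; they are not measurements of any real program.

```python
import math

def vocabulary(n1, n2):
    """n = n1 + n2: distinct operators plus distinct operands."""
    return n1 + n2

def length(N1, N2):
    """N = N1 + N2: total operator and operand occurrences."""
    return N1 + N2

def estimated_length(n1, n2):
    """N' = n1 log2 n1 + n2 log2 n2, using the distinct counts only."""
    return n1 * math.log2(n1) + n2 * math.log2(n2)

# Hypothetical counts for illustration.
n1, n2, N1, N2 = 8, 8, 30, 20
print(vocabulary(n1, n2))        # 16
print(length(N1, N2))            # 50
print(estimated_length(n1, n2))  # 48.0
```

Note that N' needs only the distinct counts n1 and n2, not the totals.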

(Bear with me, this starts to get more interesting soon . . .)

The volume of the program is then defined as V = N log2 n. This means that
for each of the N elements of a program, log2 n bits are required to choose
one of the operators or operands for that element. So, V is the number of 
bits required to specify a program. As you might suspect, V increases as
we move from a higher level language to a lower level language. Put another
way, V is inversely proportional to the level of abstraction L of the program.
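
As a small worked example in Python (the function name and the numbers
N = 50, n = 16 are made up for illustration): with a vocabulary of 16, each
element takes log2 16 = 4 bits to pick out, so 50 elements take 200 bits.

```python
import math

def volume(N, n):
    """Halstead volume V = N * log2(n), in bits."""
    return N * math.log2(n)

# Hypothetical program: 50 tokens drawn from a vocabulary of 16.
print(volume(50, 16))  # 200.0
```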

Halstead proposed a conservation law between L and V (LV = k). L is defined
as the ratio of potential to actual volume, or L = V* / V where V* is the
volume of the most compact, or highest level implementation of the algorithm.
It follows that L will increase with n2, and decrease with both n1 and N2.
So Halstead proposed a level estimator, L' = (2 / n1)(n2 / N2). Again, further
research has indicated a correlation coefficient of .90, suggesting that L' / L
is very nearly constant.
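
Here is a short Python sketch of the estimator L', checking the direction in
which it moves with each input. The function name and all of the counts are
hypothetical, picked only to show the trend.

```python
def estimated_level(n1, n2, N2):
    """L' = (2 / n1) * (n2 / N2)."""
    return (2 / n1) * (n2 / N2)

# Hypothetical baseline, then one input doubled at a time.
base = estimated_level(n1=8, n2=8, N2=20)            # about 0.1
more_operands = estimated_level(n1=8, n2=16, N2=20)
more_operators = estimated_level(n1=16, n2=8, N2=20)
more_repeats = estimated_level(n1=8, n2=8, N2=40)

print(more_operands > base)   # True: L' rises with n2
print(more_operators < base)  # True: L' falls with n1
print(more_repeats < base)    # True: L' falls with N2
```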

Any given language will limit the level of even the tightest programs to some
L < 1. So we now come up with a language level, lambda, where lambda = LV*.
lambda measures the inherent limitation imposed by the language on the volume
of a program. Since V* = LV, lambda = L^2 * V.
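
The substitution can be checked numerically. This is only a sketch: the
function names and the values L = 0.1, V = 200 are assumptions of mine for
illustration.

```python
def potential_volume(L, V):
    """V* = L * V, rearranged from the definition L = V* / V."""
    return L * V

def language_level(L, V):
    """lambda = L * V*, which expands to L^2 * V."""
    return L * potential_volume(L, V)

# Hypothetical level and volume.
L, V = 0.1, 200.0
print(potential_volume(L, V))  # about 20
print(language_level(L, V))    # about 2, the same as L**2 * V
```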

Continuing on, we can easily see that the difficulty of programming increases
as the volume of the program increases, and decreases as its level rises.
Halstead proposed E = V / L as a measure of the mental effort required to
create a program. This number is
actually the number of mental discriminations, or decisions, that a fluent,
concentrating programmer should make in implementing an algorithm.
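
In Python, the effort measure is a one-liner. The function name, the range
check, and the sample values are my own; the check just encodes that L is a
ratio V* / V with V* no larger than V.

```python
def effort(V, L):
    """E = V / L, counted in elementary mental discriminations."""
    if not 0 < L <= 1:
        raise ValueError("level L must lie in (0, 1]")
    return V / L

# Hypothetical program: volume of 200 bits.
print(effort(200.0, 1.0))  # 200.0 at the highest possible level
print(effort(200.0, 0.1))  # about 2000 at a level of 0.1
```

For a fixed volume, dropping the level by a factor of ten raises the effort
by the same factor.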

But where does all this get us in terms of estimating the amount of time
required to implement this algorithm? Well, psychologist J M Stroud did
some research into the speed with which we make decisions. His experiments
led to the conclusion that a concentrating person is able to make between
5 and 20 mental discriminations per second, depending on the individual.
Halstead's research showed that concentrating programmers tend to be able
to work near the upper end of this range, at about 18 decisions per second.
He called this number S, or the Stroud rate. So it follows from this that
the time T in seconds for a fluent, concentrating programmer to implement
an algorithm is T = E / S, or the number of decisions to be made divided by
the speed at which those decisions can be made.
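
To get a feel for the numbers, here is a sketch in Python. The effort value
E = 2000 is hypothetical, and the names are mine; only the rates 5, 18, and
20 per second come from the figures quoted above.

```python
STROUD_RATE = 18  # discriminations per second, the value quoted above

def programming_time(E, S=STROUD_RATE):
    """T = E / S, in seconds."""
    return E / S

E = 2000.0  # hypothetical effort, in mental discriminations
for S in (5, 18, 20):
    print(S, round(programming_time(E, S), 1))
# 5 400.0
# 18 111.1
# 20 100.0
```

So the same hypothetical algorithm takes anywhere from 100 to 400 seconds of
concentrated work, depending on where in Stroud's range the programmer sits.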

Now I realise that this entire article has absolutely nothing to do with SLOC,
but we do now have a way to determine how long it should take a programmer
to implement an algorithm, based on data that is easily obtainable from a
good design.

There is much additional research to be done here. How long does it take to
design an algorithm? How does all of this relate to SLOC? Many other questions
remain unanswered.

Incidentally, additional research has shown that the total number of errors
in a program, B, is directly related to E, the effort required to specify the
program. Halstead proposed the following formula for estimating B:
B = E^(2/3) / 3000. If correct, then for S = 18, a programmer commits one
error every three minutes!! (Makes me glad I'm in testing and research, not
in software development :-)
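
A sketch of the error estimate in Python, reading the exponent as 2/3. The
function name and the effort value are hypothetical. The second half is only
my reading of where the three-minute figure comes from (one error per 3000
discriminations, made at 18 per second); treat that as an assumption.

```python
def estimated_errors(E):
    """B = E^(2/3) / 3000."""
    return E ** (2 / 3) / 3000

E = 1_000_000  # hypothetical effort, in mental discriminations
print(round(estimated_errors(E), 2))  # 3.33

# Assumption: one error per 3000 discriminations, at S = 18 per second.
seconds_per_error = 3000 / 18
print(round(seconds_per_error / 60, 1))  # 2.8 minutes, roughly three
```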

>Just curious,
>Peter.

Curiosity is the only way to learn. I wonder why cats are so dumb :-)
(No offense to cat lovers)

>---------------
>Peter C. Damron
>Dept. of Computer Science, FR-35
>University of Washington
>Seattle, WA  98195

>peterd@cs.washington.edu
>{ucbvax,decvax,etc.}!uw-beaver!uw-june!peterd
-- 
Thanks -            Rob Schultz, Motorola General Systems Group
     rms          1501 W Shure Dr, Arlington Heights, IL  60004
708 / 632 - 2757   schultz@mot.cell.COM   !uunet!motcid!schultz
"There is no innocence - only pain."        (Usual disclaimers)