[comp.software-eng] Using COCOMO to estimate development schedules

lrm5110@tahoma.UUCP (Larry R. Masden) (03/23/89)

I've seen references to a method of estimating software
development schedules and costs called COCOMO.  I'm interested,
and I would appreciate some good references or brief descriptions 
of the system.

Thanks!

-- 
Larry Masden       	      Voice: (206) 237-2564
Boeing Commercial Airplanes   UUCP: ..!uw-beaver!ssc-vax!shuksan!tahoma!lrm5110
P.O. Box 3707, M/S 66-22
Seattle, WA  98124-2207

victor@concour.CS.Concordia.CA (Victor Krawczuk) (03/24/89)

In article <351@tahoma.UUCP> lrm5110@tahoma.UUCP (Larry R. Masden) writes:
>I've seen references to a method of estimating software
>development schedules and costs called COCOMO.  I'm interested,
>and I would appreciate some good references or brief descriptions 
>of the system.
>
>Thanks!
>

    A brief description can be found on pp. 76-82 in a book
by Richard Fairley called "Software Engineering
Concepts" (McGraw-Hill, Inc.).
    Call letters are QA76.6.F35 1985.  It's a pretty 
popular book & should be available in all university
libraries.  This book gives a broad overall picture of
S.E.

shapiro@rb-dc1.UUCP (Mike Shapiro) (03/28/89)

In article <351@tahoma.UUCP> lrm5110@tahoma.UUCP (Larry R. Masden) writes:
>I've seen references to a method of estimating software
>development schedules and costs called COCOMO.  I'm interested,
>and I would appreciate some good references or brief descriptions 
>of the system.

The primary reference to COCOMO is the book by its originator:

     Barry W. Boehm
     _Software Engineering Economics_
     Prentice-Hall, 1981
     ISBN 0-13-822122-7

I have seen commercial software implementing the COCOMO calculations,
but do not have a reference at hand.

-- 

Michael Shapiro, Gould/General Systems Division (soon to be Encore)
15378 Avenue of Science, San Diego, CA 92128
(619)485-0910    UUCP: ...sdcsvax!ncr-sd!rb-dc1!shapiro

simpson@minotaur.uucp (Scott Simpson) (03/28/89)

In article <351@tahoma.UUCP> lrm5110@tahoma.UUCP (Larry R. Masden) writes:
>I've seen references to a method of estimating software
>development schedules and costs called COCOMO.  I'm interested,
>and I would appreciate some good references or brief descriptions 
>of the system.

I don't think Barry reads netnews so I'll answer it for him.  COCOMO
(COnstructive COst MOdel) was written about by Dr. Barry Boehm in the
book Software Engineering Economics published in 1981 by Prentice Hall
(ISBN 0-13-822122-7).  Barry is the Chief Scientist of the Defense
Systems Group here at TRW.  There are some computer programs out there
that implement the COCOMO model.  I know of one from the now defunct
Wang Institute of Graduate Studies called wicomo and there are others.
	Scott Simpson
	TRW Space and Defense Sector
	oberon!trwarcadia!simpson  		(UUCP)
	trwarcadia!simpson@oberon.usc.edu	(Internet)

shilling@pinto.gatech.edu (John Shilling) (03/28/89)

I have two questions about COCOMO

1.  Isn't the accuracy of the model limited by an initial guess of
    DELIVERED lines of source code?  Is there any reason to believe that
    a guess of delivered lines of source code is any more accurate than
    simply guessing at the resources required directly?  And just what is a 
    line of source code anyway?  Seriously.  Is it delineated by a newline
    character?  Is it the number of statements?  Does it include comments?

2.  In what I have seen on the COCOMO model the weights on the factors 
    are given to two significant digits past the decimal point.  This raises 
    all sorts of flags with me.  Back in numerical analysis I learned not to 
    represent digits that were below the level of accuracy of the 
    computation.  Given the level of uncertainty in the evaluation of 
    the factors, I have a hard time believing that the computations are 
    accurate to two decimal digits (it has to be more than crunching numbers, 
    it has to mean something).  Does anyone out there have evidence to the
    contrary?


John J. Shilling
School of Information & Computer Science, Georgia Tech, Atlanta GA 30332
uucp:	...!{decvax,hplabs,ihnp4,linus,rutgers}!gatech!shilling
Internet:	shilling@gatech.edu

coggins@coggins.cs.unc.edu (Dr. James Coggins) (03/29/89)

In article <18252@gatech.edu> shilling@pinto.UUCP (John Shilling) writes:
>I have two questions about COCOMO
>
>1.  Isn't the accuracy of the model limited by an initial guess of
>    DELIVERED lines of source code?  Is there any reason to believe that
>    a guess of delivered lines of source code is any more accurate than
>    simply guessing at the resources required directly?  And just what is a 
>    line of source code anyway?  Seriously.  Is it delineated by a newline
>    character?  Is it the number of statements?  Does it include comments?

Yes, you need to estimate first a figure denoted KDSI for "thousands
of delivered source instructions (excluding comments)".  By
assumption, size in KDSI is the principal factor driving software
cost.  Now, true, you still have to come up with this number somehow,
usually from experience on similar projects.  But at least you know
WHAT you need to estimate.  It guides your study of other similar
systems to the critical factor you need to learn about in order to
estimate accurately in the future. 
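
For readers without the book at hand, the Basic COCOMO equations can be
sketched as below.  The coefficients are the Basic-level values for the
three development modes as given in Boehm's 1981 book; the function and
table names are made up for this sketch, and the 32 KDSI project is a
hypothetical input.

```python
# Basic COCOMO (Boehm, 1981): effort and schedule from size alone.
# Each mode maps to (a, b, c), where
#   effort   = a * KDSI**b       (person-months)
#   schedule = 2.5 * effort**c   (calendar months)
BASIC_COCOMO = {
    "organic":      (2.4, 1.05, 0.38),
    "semidetached": (3.0, 1.12, 0.35),
    "embedded":     (3.6, 1.20, 0.32),
}


def basic_cocomo(kdsi, mode="organic"):
    """Return (effort in person-months, schedule in months)."""
    if kdsi <= 0:
        raise ValueError("kdsi must be positive")
    a, b, c = BASIC_COCOMO[mode]
    effort = a * kdsi ** b
    schedule = 2.5 * effort ** c
    return effort, schedule


if __name__ == "__main__":
    # Hypothetical 32 KDSI project, evaluated in each mode.
    for mode in BASIC_COCOMO:
        effort, months = basic_cocomo(32.0, mode)
        print(f"{mode:13s} {effort:7.1f} person-months, {months:5.1f} months")
```

Everything in the result follows from the size figure and the choice of
mode, which is why the quality of the KDSI estimate matters so much.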

It has been shown that counting "tokens" does not provide a
significant improvement over counting "lines", and counting "tokens"
requires a parse.  In counting such lines, several language statements
on one line count as 1 and a data declaration stretching over 8 lines
counts as 8.
You can also automate these things by counting semicolons or using
similar tricks.  Some students of mine wrote a script to count 
C statements, # directives, and macros correctly.  Since the unit
used is THOUSANDS of lines, these noise factors don't matter much.
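
The two counting conventions mentioned above can be sketched roughly as
follows.  This is an illustrative approximation, not the students'
script: it assumes C-style source, strips comments naively (comment
markers inside string literals are not handled), and all names are
made up for the example.

```python
import re


def strip_c_comments(source):
    """Remove /* ... */ and // comments (naive: ignores string literals)."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", source)


def count_physical_lines(source):
    """Count non-blank physical lines after comments are removed."""
    stripped = strip_c_comments(source)
    return sum(1 for line in stripped.splitlines() if line.strip())


def count_semicolons(source):
    """Rough statement count: semicolons outside comments."""
    return strip_c_comments(source).count(";")


SAMPLE = """\
/* toy example */
int a; int b;   // two statements on one line
int table[] = {
    1, 2,
    3, 4
};
"""

if __name__ == "__main__":
    print(count_physical_lines(SAMPLE), "lines,",
          count_semicolons(SAMPLE), "semicolons")
```

The two counts disagree on the sample, and a semicolon count will also
overcount constructs such as C "for" headers, so whichever convention
is chosen should be applied the same way to every project compared.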

>2.  In what I have seen on the COCOMO model the weights on the factors 
>    are given to two significant digits past the decimal point.  This raises 
>    all sorts of flags with me.  Back in numerical analysis I learned not to 
>    represent digits that were below the level of accuracy of the 
>    computation.  Given the level of uncertainty in the evaluation of 
>    the factors, I have a hard time believing that the computations are 
>    accurate to two decimal digits (it has to be more than crunching numbers, 
>    it has to mean something).  Does anyone out there have evidence to the
>    contrary?
>
>John J. Shilling
>Internet:	shilling@gatech.edu

You have been trained well.  But note: those figures are used only in
the Detailed COCOMO level where they have some chance of being valid.
In Intermediate COCOMO you just use numbers like .8, 1.2, etc. to
represent qualifiers like "low", "medium", or "high".  The precision
issue you raise was indeed factored into the modelling, and I'm not
going to quibble over whether 1 digit or 2 is "Right".  Now if
detailed COCOMO were presented with 5 digits after the decimal, I
would complain.  As it is, 2 digits after the point is not
unreasonable. 
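
One way to look at the precision question is to compare the product of
a set of cost-driver multipliers carried to two decimals against the
same set rounded to one decimal.  The multiplier values below are
hypothetical, chosen only for illustration, and are not taken from
Boehm's tables.

```python
import math


def adjustment_factor(multipliers):
    """Product of the cost-driver multipliers applied to nominal effort."""
    return math.prod(multipliers)


# Hypothetical ratings for five cost drivers, NOT Boehm's published values.
two_digit = [1.14, 0.91, 1.08, 0.86, 1.07]
one_digit = [round(m, 1) for m in two_digit]

full = adjustment_factor(two_digit)
coarse = adjustment_factor(one_digit)

if __name__ == "__main__":
    print(f"two decimals: {full:.3f}")
    print(f"one decimal:  {coarse:.3f}")
    print(f"relative difference: {abs(coarse - full) / full:.1%}")
```

With these made-up values, dropping the second decimal moves the
combined factor by a few percent.  Whether that matters depends on how
well the ratings themselves can be judged.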

See Boehm's book, Software Engineering Economics for the full (quite
readable) discussion.  Visit your library and Read More About It.
---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   Old: Algorithms+Data Structures = Programs
UNC-Chapel Hill               New: Objects+Objects=Objects
Chapel Hill, NC 27599-3175    
and NASA Center of Excellence in Space Data and Information Science
---------------------------------------------------------------------
 

hollombe@ttidca.TTI.COM (The Polymath) (03/29/89)

In article <351@tahoma.UUCP> lrm5110@tahoma.UUCP (Larry R. Masden) writes:
}I've seen references to a method of estimating software
}development schedules and costs called COCOMO.  I'm interested,
}and I would appreciate some good references or brief descriptions 
}of the system.

This is the system developed by Barry Boehm (pronounced Bame(?)).  I and
my boss at the time attended one of his lectures and acquired copies of
his book in the process.  Two things stick in my mind from that
experience:

1)  His entire system of formulas rests on a figure for the estimated
    difficulty of the project.  My boss and I badgered him for three hours
    to find out where this fundamental figure came from.  When we finally
    pinned him down, the answer he gave amounted to "You take a flying
    guess." (He actually said something about basing the estimate on past
    experience with other projects and the programmers involved.  Given
    that each project is different and involves a different mix of skills
    and people, this amounts to guessing).

2)  My boss, the company's software development manager, then took the book
    home and tried applying COCOMO to figures derived from projects we'd
    already completed.  This man had over 20 years experience in the field
    and routinely made project estimates as part of his job.  He was
    intimately familiar with the work and people involved in the test data
    he used.  The estimates he produced using the COCOMO model differed
    from the actual project completion times by as much as 50%, either
    way.

That said, I'll add that COCOMO is probably at least as good as any other
system available for software development estimates.  The problem is,
there's no good way to estimate something that's different every time you
do it.  I don't think that problem is going to go away for the foreseeable
future.

If I used COCOMO, I'd take the results as a ballpark figure, at best, and
add a large grain of salt.

Disclaimer:  The above information is over 5 years old.  In this business,
that means there's a good chance that 50% of it is obsolete or wrong.
Your mileage may vary.

-- 
The Polymath (aka: Jerry Hollombe, hollombe@ttidca.tti.com)  Illegitimati Nil
Citicorp(+)TTI                                                 Carborundum
3100 Ocean Park Blvd.   (213) 452-9191, x2483
Santa Monica, CA  90405 {csun|philabs|psivax}!ttidca!hollombe

sheppards@tiger.UUCP (Scott Sheppard) (03/29/89)

In article <18252@gatech.edu>, shilling@pinto.gatech.edu (John Shilling) writes:
> I have two questions about COCOMO
> 
> 1.  Isn't the accuracy of the model limited by an initial guess of
>     DELIVERED lines of source code? Is there any reason to believe that
>     a guess of delivered lines of source code is any more accurate than
>     simply guessing at the resources required directly?  

Yes, it is limited; however, our experience here at AG Communication Systems
has been that statement counts are easier to estimate than effort. We get to
ignore complexity and concentrate on the delivered product in terms of its
statement count. Complexity is handled by the model.

>     And just what is a 
>     line of source code anyway?  Seriously.  Is it delineated by a newline
>     character?  Is it the number of statements?  Does it include comments?

We use KNCSS - 1000 Non-Comment Source Statements. A statement is a language
statement - not a line in a file. For languages such as Pascal, we use a precise
counting mechanism that is implemented by a tool; however, a close approximation
to a Pascal statement count can be achieved by hand if one simply counts the number
of semicolons.

> 2.  In what I have seen on the COCOMO model the weights on the factors 
>     are given to two significant digits past the decimal point.  This raises 
>     all sorts of flags with me.  Back in numerical analysis I learned not to 
>     represent digits that were below the level of accuracy of the 
>     computation.  Given the level of uncertainty in the evaluation of 
>     the factors, I have a hard time believing that the computations are 
>     accurate to two decimal digits (it has to be more than crunching numbers, 
>     it has to mean something).  Does anyone out there have evidence to the
>     contrary?
> 

The degree of accuracy achieved with the model has been fair for us. The decimal
places are merely the level needed to propagate the impact of the factors. In other
words, something that has a value of 5.01 instead of 5.00 usually makes a difference
of about 1 unit since the 5.xx number gets multiplied by 100 before being used. We never
sit around and ask, "Is this project really a 5.01 or is it a 5.005?" The decimal points
are just part of the formulas that drive the model.
-- 
Scott Sheppard                               
  UUCP: ...!ncar!noao!asuvax!gtephx!sheppards

coggins@coggins.cs.unc.edu (Dr. James Coggins) (03/31/89)

In article <4148@ttidca.TTI.COM> hollombe@ttidcb.tti.com (The Polymath) writes:
>
>1)  His entire system of formulas rests on a figure for the estimated
>    difficulty of the project.  My boss and I badgered him for three hours
>    to find out where this fundamental figure came from.  When we finally
>    pinned him down, the answer he gave amounted to "You take a flying
>    guess."

Ridiculous.

Computer scientists still do not appreciate the difference between
guessing and estimating.  A guess is a number produced out of thin
air.  An estimate is the result of an explicit computation based on
clearly stated, testable assumptions.  Estimates can be accompanied by
error ranges and analyses of the sensitivity of the result to the
initial assumptions. Engineers learn this as undergrads.  Computer
scientists miss it (unless they encounter a course I'm teaching).
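
As a small illustration of an estimate that carries an error range, the
sketch below propagates an assumed uncertainty in the size figure
through the Basic COCOMO organic-mode effort equation
(effort = 2.4 * KDSI^1.05, from Boehm's book).  The 32 KDSI nominal size
and the plus-or-minus 25% size uncertainty are assumptions made up for
the example, as are the function names.

```python
def organic_effort(kdsi):
    """Basic COCOMO organic-mode effort in person-months."""
    return 2.4 * kdsi ** 1.05


def effort_range(kdsi, size_uncertainty):
    """Return (low, nominal, high) effort for a fractional size uncertainty."""
    if not 0 <= size_uncertainty < 1:
        raise ValueError("size_uncertainty must be in [0, 1)")
    low = organic_effort(kdsi * (1 - size_uncertainty))
    high = organic_effort(kdsi * (1 + size_uncertainty))
    return low, organic_effort(kdsi), high


if __name__ == "__main__":
    # Hypothetical project: 32 KDSI, size known only to within 25%.
    low, nominal, high = effort_range(32.0, 0.25)
    print(f"effort: {nominal:.0f} person-months "
          f"(range {low:.0f} to {high:.0f})")
```

Because the exponent is only slightly above 1, the relative error in
effort comes out a little larger than the relative error in size.  The
stated assumption (the size range) can be checked against later data,
which is what separates this from a bare guess.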

This is not the only instance of some engineering math that computer
scientists need and usually don't get, resulting in computer scientists
writing one form of idiocy or another in project estimates or in
net postings.

>    (He actually said something about basing the estimate on past
>    experience with other projects and the programmers involved. 

That sounds more like it.

>    Given
>    that each project is different and involves a different mix of skills
>    and people, this amounts to guessing).

Both the premises and the conclusion are false.

Parnas says "All design is redesign." He's right.  It is rare and
special to be writing a truly new software product.  In reality,
almost all projects fall into some large class of projects that have
been done before.  Boehm's COCOMO technique helps you to know WHAT you
need to observe from the previous projects.  Rather than the
seat-of-the-pants guesses your boss came up with, COCOMO shows a few
specific quantities that need to be observed over time.  It also helps
to focus efforts to evaluate and test programmer productivity that go
into the COCOMO equations.  A great contribution. 

>-- 
>The Polymath (aka: Jerry Hollombe, hollombe@ttidca.tti.com)

---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   old: Data Structures + Algorithms = Programs
UNC-Chapel Hill               new: Objects + Objects = Objects
Chapel Hill, NC 27599-3175    
and NASA Center of Excellence in Space Data and Information Science
---------------------------------------------------------------------
 

hollombe@ttidca.TTI.COM (The Polymath) (04/04/89)

In article <7518@thorin.cs.unc.edu> coggins@coggins.UUCP (Dr. James Coggins) writes:
}... Rather than the
}seat-of-the-pants guesses your boss came up with COCOMO shows a few
}specific quantities that need to be observed over time.  It also helps
}to focus efforts to evaluate and test programmer productivity that go
}into the COCOMO equations.  A great contribution. 

Apparently I didn't make myself clear.  My (former) boss didn't compare
COCOMO figures to seat-of-the-pants estimates.  He plugged actual figures
from real, completed projects into the COCOMO formulas and compared the
COCOMO projections with the actual project figures.

He found the COCOMO projections to be wrong by as much as 50%.

Maybe your company can live with 50% cost overruns or project over-bids.
Ours couldn't.

-- 
The Polymath (aka: Jerry Hollombe, hollombe@ttidca.tti.com)  Illegitimati Nil
Citicorp(+)TTI                                                 Carborundum
3100 Ocean Park Blvd.   (213) 452-9191, x2483
Santa Monica, CA  90405 {csun|philabs|psivax}!ttidca!hollombe

coggins@coggins.cs.unc.edu (Dr. James Coggins) (04/06/89)

In article <4183@ttidca.TTI.COM> hollombe@ttidcb.tti.com (The Polymath) writes:
>
>Apparently I didn't make myself clear.  My (former) boss didn't compare
>COCOMO figures to seat-of-the-pants estimates.  He plugged actual figures
>from real, completed projects into the COCOMO formulas and compared the
>COCOMO projections with the actual project figures.
>
>He found the COCOMO projections to be wrong by as much as 50%.
>
>Maybe your company can live with 50% cost overruns or project over-bids.
>Ours couldn't.
>
This just shows how little you know about estimation.  A factor of 2
is not bad for an initial estimate such as one obtains from the Basic
level of COCOMO.  Is this what you plugged numbers into?  I assume so,
since your description of the process your boss went through is
superficial.  The deeper levels of COCOMO (called Intermediate and
Detailed COCOMO) have plenty of degrees of freedom to bring that
estimate right into line with practice.

Read the book; note the assumptions; study the results; test the results.
Then complain.

---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   
UNC-Chapel Hill               Complaining is not a spectator sport.
Chapel Hill, NC 27599-3175    
and NASA Center of Excellence in Space Data and Information Science
---------------------------------------------------------------------
 

edward@vangogh.sybase.com (Ed Hirgelt) (04/07/89)

In article <4183@ttidca.TTI.COM> hollombe@ttidca.TTI.COM (The Polymath) writes:
+   Apparently I didn't make myself clear.  My (former) boss didn't compare
+   COCOMO figures to seat-of-the-pants estimates.  He plugged actual figures
+   from real, completed projects into the COCOMO formulas and compared the
+   COCOMO projections with the actual project figures.

+   He found the COCOMO projections to be wrong by as much as 50%.

But isn't this a necessary part of using the model -- tuning it to the
characteristics of the software and environment that you are trying to
represent?

Since this is a model and since there are things in it to tweak (if I
remember what little I've read about COCOMO), you can take your actual
results, evaluate the model and see what it estimates. From that
determine changes to the model that make it more accurate. With a
reasonable base of data (it sounds like your boss had that) it should be
possible to get a much better fit for a particular environment. Then the
model becomes a better predictor. 
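
Tuning of this kind can be sketched as a least-squares fit of the
effort equation to local history.  The sketch assumes effort follows
a * KDSI^b and fits a and b by linear regression on the logarithms.
It is a generic curve fit, not a procedure taken from Boehm's book, and
the project records are invented for illustration.

```python
import math


def calibrate(projects):
    """Fit effort = a * kdsi**b to (kdsi, person_months) pairs.

    Uses ordinary least squares on log(effort) = log(a) + b * log(kdsi).
    """
    if len(projects) < 2:
        raise ValueError("need at least two projects")
    xs = [math.log(kdsi) for kdsi, _ in projects]
    ys = [math.log(effort) for _, effort in projects]
    n = len(projects)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("projects must differ in size")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Invented history: (size in KDSI, actual effort in person-months).
HISTORY = [(8, 30), (15, 62), (32, 140), (50, 235), (90, 460)]

if __name__ == "__main__":
    a, b = calibrate(HISTORY)
    print(f"fitted: effort = {a:.2f} * KDSI^{b:.2f}")
    for kdsi, actual in HISTORY:
        predicted = a * kdsi ** b
        print(f"{kdsi:4d} KDSI: actual {actual:4d}, fitted {predicted:6.1f}")
```

A fit like this is only as good as the records behind it, and with a
handful of projects the fitted exponent can swing widely.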

Models only work well when their assumptions are met. If your
environment, project, or staff don't match the model, the model needs to
be enhanced.

I have often wished that I had the records necessary to tune a model and
make it moderately accurate. I know -- I should start keeping them. One
of these days I will, really I will.

Ed
--
---------------------------------------------------
Ed Hirgelt		sun!sybase!edward
Sybase, Inc.   		edward@sybase.com
6475 Christie Ave
Emeryville, Ca. 94608
