[comp.sources.unix] v20i009: Tools for generating software metrics, Part02/14

rsalz@uunet.uu.net (Rich Salz) (09/20/89)

Submitted-by: Brian Renaud <huron.ann-arbor.mi.us!bdr>
Posting-number: Volume 20, Issue 9
Archive-name: metrics/part02

---- Cut Here and unpack ----
#!/bin/sh
# this is part 2 of a multipart archive
# do not concatenate these parts, unpack them in order with /bin/sh
# file doc/References continued
#
CurArch=2
if test ! -r s2_seq_.tmp
then echo "Please unpack part 1 first!"
     exit 1; fi
( read Scheck
  if test "$Scheck" != $CurArch
  then echo "Please unpack part $Scheck next!"
       exit 1;
  else exit 0; fi
) < s2_seq_.tmp || exit 1
echo "x - Continuing file doc/References"
sed 's/^X//' << 'SHAR_EOF' >> doc/References
X	Systems Development Activities with Function Points", IEEE
X	Transactions on Software Engineering, Vol. SE-9, No. 6 (November
X	1983).
X
X	[Both articles reference: Allan J. Albrecht, "Measuring
X	application development productivity", Proceedings IBM Applications
X	Development Symposium Oct 14-17, 1979, GUIDE Int. and SHARE Inc.,
X	IBM Corp., p. 83; which I do not have.
X	
X	Conte, Dunsmore and Shen state "... [Albrecht] obtained a
X	reasonably smooth fit of his weighted function points to actual
X	costs.  However, the programs he studied were all commercial
X	applications, in which the function units are more or less clearly
X	definable and fairly homogeneous.  For systems programs, ...
X	function units are more difficult to define precisely and, in any
X	case, may differ significantly in size and scope." So, if you are
X	working with fairly standard commercial applications, you may want
X	to check this out.]
X
X
XB4.	Chris F. Kemerer, "An Empirical Validation of Software Cost
X	Estimation Models", Communications of the ACM, Vol 30, No. 5
X	(May 1987).
X
X	[Pretty easy to read.  Kemerer, like most writers, found that
X	different models do better in different environments.  He thinks
X	that function points (Albrecht, see above) may be a better basis
X	for estimation than lines of code.]
X
X
XModels based on software science (halstead):
X
XC1.	M.H. Halstead, _Elements of Software Science_, Elsevier
X	North-Holland, Inc., New York, New York, 1977.
X	[Apparently the seminal statement on software science.
X	Unfortunately, I don't have the book, so I cannot provide any
X	commentary on it.]
X
XC2.	"Commemorative Issue in Honor of Dr. Maurice H. Halstead", IEEE
X	Transactions on Software Engineering, Vol. SE-5, No. 2 (March 1979)
X	[A number of papers on the subject.  Some good, some not so
X	great.  Also contains a good article by Parnas.]
X
XC3.	Linda M. Ottenstein, "Quantitative Estimates of Debugging
X	Requirements",  IEEE Transactions on Software Engineering, Vol
X	SE-5, No. 5 (September 1979).
X	[Ok.  Nice example of use of software science model.]
X
XC4.	Martin Trachtenberg, "Order and Difficulty of Debugging",
X	correspondence in IEEE Transactions on Software Engineering, Vol.
X	SE-9, No. 6 (November 1983).
X
X	John E. Gaffney, "Estimating the Number of Faults in Code", 
X	correspondence in IEEE Transactions on Software Engineering, Vol.
X	SE-10, No. 4 (July 1984).
X
X	Martin Trachtenberg, "Validating Halstead's Theory with System 3
X	Data", correspondence in IEEE Transactions on Software
X	Engineering, Vol. SE-12, No. 4 (April 1986).
X
X	Myron Lipow, "Comments on ``Estimating the Number of Faults in
X	Code'' and Two Corrections to Published Data", correspondence in 
X	IEEE Transactions on Software Engineering, Vol. SE-12, No. 4
X	(April 1986).
X
X	[Various comments on estimating bugs, debugging difficulty.]
X
XC5.	R.R. Oldehoeft and Leonard J. Bass, "Dynamic Software Science with
X	Applications", IEEE Transactions on Software Engineering, Vol SE-5,
X	No 5 (September 1979)
X	[Somewhat interesting.]
X
XC6.	T.Y. Chen and S.C. Kwan, "An Analysis of Length Equation Using a
X	Dynamic Approach", ACM SIGPLAN Notices, Vol. 21, No. 4 (April 1986)
X	[Uhhh, well, to be honest, I didn't really understand what they
X	were trying to do here.  I think it was really just a pointer to a
X	more comprehensive article which Chen was publishing in Computer
X	Journal sometime later.  Looks interesting though.]
X
XC7.	Ann Fitzsimmons and Tom Love,  "A Review And Evaluation of Software
X	Science", ACM Computing Surveys, Vol. 10, No. 1 (March 1978).
X	[Reasonable overview of early work.]
X
X
X
XModels based on cyclomatic complexity (mccabe) and other syntactic
Xcomplexity measures:
X
XD1.	Thomas J. McCabe, "A Complexity Measure", IEEE Transactions on
X	Software Engineering, Vol SE-2, No. 4 (December 1976).
X	[The classic.]
X
XD2.	Edward T. Chen, "Program Complexity and Programmer Productivity",
X	IEEE Transactions on Software Engineering, Vol. SE-4, No. 3 (May
X	1978).
X	[An important reworking of some of McCabe's ideas.]
X
XD3.	Martin R. Woodward, Michael A. Hennell and David Hedley, "A Measure
X	of Control Flow Complexity", IEEE Transactions on Software
X	Engineering, Vol SE-5, No. 1 (January 1979).
X	[Another view of complexity]
X
XD4.	Sallie Henry and Dennis Kafura, "Software Structure Metrics Based
X	on Information Flow", IEEE Transactions on Software Engineering,
X	Vol. SE-7, No. 5 (September 1981).
X	[Develop concept of complexity as function of module
X	interconnection, apply this complexity measure to the Unix kernel.]
X
XD5.	Victor R. Basili and David H. Hutchens, "An Empirical Study of a
X	Syntactic Complexity Family", IEEE Transactions on Software
X	Engineering, Vol. SE-9, No. 6 (November 1983).
X	[A typically well done article from Basili and company.  They find
X	that (no surprise) some programmers handle complexity better than
X	others.  They are not especially thrilled with any particular
X	complexity measure, but find that statement counts and a hybrid
X	measure which combines software volume and control organization
X	seem a little better.]
X
XD6.	David H. Hutchens and Victor R. Basili, "System Structure Analysis:
X	Clustering with Data Bindings", IEEE Transactions on Software
X	Engineering, Vol. SE-11, No. 8 (August 1985).
X	[They attempt to use modularization as a complexity measure.
X	Conclusions are a little weak, but some interesting ideas here.]
X
XD7.	Kenneth Magel, "Efficient Calculation of the Scope Program
X	Complexity Metric", ACM SIGPLAN Notices, Vol. 21, No. 9 (September
X	1986).
X	[A complexity measure based on levels of nesting.  It has some
X	intuitive appeal, but no experimental results provided.]
X
XD8.	Dennis Kafura and Geereddy R. Reddy, "The Use of Software
X	Complexity Metrics in Software Maintenance", IEEE Transactions on
X	Software Engineering, Vol. SE-13, No. 3 (March 1987).
X	[Using metrics to guide and control software maintenance.]
X
XD9.	H. Dieter Rombach, "A Controlled Experiment on the Impact of
X	Software Structure on Maintainability", IEEE Transactions on
X	Software Engineering, Vol. SE-13, No. 3 (March 1987).
X	[Just like the title says.]
X
XD10.	Leslie J. Waguespack, Jr., and Sunil Badlani, "Software Complexity
X	Assessment: An Introduction and Annotated Bibliography", ACM SIGSOFT
X	Software Engineering Notes, Vol. 12, No. 4 (October 1987).
X	[A very complete bibliography, much better than this.]
X
X
XArticles that discuss more than one estimation method:
X
XE1.	Victor R. Basili, Richard W. Selby, Jr., Tsai-Yun Phillips, "Metric
X	Analysis and Data Validation Across Fortran Projects", IEEE
X	Transactions on Software Engineering, Vol SE-9, No. 6 (November
X	1983)
X	[Basili, et al. compare software science, cyclomatic complexity
X	and source lines of code measures and find them all somewhat
X	lacking as effort estimators.]
SHAR_EOF
echo "File doc/References is complete"
chmod 0644 doc/References || echo "restore of doc/References fails"
echo "x - extracting doc/Results1 (Text)"
sed 's/^X//' << 'SHAR_EOF' > doc/Results1
XThe results of running a multi-variate linear regression on the
X(somewhat massaged) output of the various tools for the source code
Xin a Pascal compiler (30,000 - 40,000 lines of code).  "Changes"
Xare the number of deltas (excluding version rollovers) in the sccs
Xfiles.  I assumed that was a good estimator for errors.
X
Xchanges	= -0.1362 comments  +  0.00282 volume  -  0.1065 mccabe 
X	  + 0.3178 returns  +  3.11129
X
XCorrelation Matrix:
Xchanges		1.0000
Xcomments	0.3670	1.0000
Xvolume		0.8801	0.5137	 1.0000
Xmccabe		0.7912	0.4550	 0.8969	1.0000
Xreturns		0.5319	0.2885	 0.5221	0.7635	1.0000
XVariable	changes	comments volume	mccabe	returns
X
XThe equation was significant, with an R-squared of 0.8034 and the
Xfollowing t test values:
X
Xcomments	3.5532
Xvolume		13.3951
Xmccabe		3.5085
Xreturns		4.5460
X
X
XSo, what does all this stuff mean?  Well, that is a good question.
XNote that I am not sure what the mccabe and returns variables mean,
Xexactly.  I believe that I summed the mccabe and returns values for
Xeach file; however, I am not positive about that.
X
XObviously, the most important predictor of the number of changes in a
Xmodule is its halstead volume.  This is very intuitive.  Comments
Xapparently help cut down on changes.  This is also intuitive.  Code
Xcomplexity (mccabe) also seems to cut down on changes.  This is not at all
Xintuitive.  In fact, I don't understand it at all.  My guess is that there
Xwere a few routines with much higher complexity than the others which were
Xnot changed much, while a number of ``simple'' routines had many changes.
X
XIn other words, I think it is a statistical anomaly.  Clearly, additional
Xdata is needed.  Finally, the number of returns contributes pretty
SHAR_EOF
echo "End of part 2"
echo "File doc/Results1 is continued in part 3"
echo "3" > s2_seq_.tmp
exit 0
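
A minimal sketch of how the regression equation in doc/Results1 might be
applied to one source file.  The coefficients are copied from the equation
as posted; the function name predict_changes and the sample inputs are
hypothetical and are not part of the posted tools.  It assumes an awk that
accepts -v assignments.

```sh
#!/bin/sh
# predict_changes COMMENTS VOLUME MCCABE RETURNS
# Hypothetical helper: evaluates the regression equation from doc/Results1
# for one file and prints the estimated number of changes (sccs deltas).
predict_changes() {
    awk -v c="$1" -v v="$2" -v m="$3" -v r="$4" 'BEGIN {
        # Coefficients as given in doc/Results1.
        printf "%.2f\n", -0.1362*c + 0.00282*v - 0.1065*m + 0.3178*r + 3.11129
    }'
}

# Made-up inputs: 10 comments, halstead volume 2000, mccabe 15, 4 returns.
predict_changes 10 2000 15 4    # prints 7.06
```

With all four inputs at zero the function prints only the intercept (3.11),
which is a quick check that the arguments are reaching awk in the order
given above.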


-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.