[comp.software-eng] use of metrics

cml@cs.umd.edu (Christopher Lott) (06/01/91)

In some-article somebody (Kambic|hlavaty|Showalter) writes:
> ....
> a system where the metrics person is NOT the management chain - in fact,
> it is forbidden for the metrics person to EVER reveal a specific name to a 
> management person.  The reason is exactly as you describe - once anyone got
> burned because of the metric data, the accuracy ... of data is shot

In article <795@tivoli.UUCP> alan@tivoli.UUCP (Alan R. Weiss) writes:
>Watts Humphrey is a very bright person, but its clear .... that 
>Watts never had to answer to the board of directors, stockholders,
>or customers.

I am working with a large set of data on developments right now, and
am struggling with the analysis.  The data isn't perfect, but my more
experienced colleagues tell me that this data (from NASA, btw) is so
much better than data sets at other institutions that I should just be happy.
(don't worry, be hap...., um, sorry.)  Quality of the data you collect is
absolutely vital - or else you'll find yourself analyzing just plain garbage.

The argument I'd like to make is that if you're going to put 2-5% of your
project resources into data collection and validation, you better not shoot
yourself in the foot right off by EVER using your metrics to evaluate your
workers.  (Here you might read ``punish'' for ``evaluate.'')

It's naive to think that a group manager doesn't know already who are
the stellar performers and who are already not so hot -- metrics, if they
are given, will only confirm this knowledge.  The point is that the metrics
must NEVER be used as evaluatory tools FOR PEOPLE.  This is the idea that
management has to buy into (to use b-world jargon) before a metrics program
can succeed.  You want to use metrics to evaluate your processes and products,
not your personnel.

I also don't like Mr Weiss's statement that Mr Humphrey never had to answer
to stockholders et al.  Clearly the metrics person must justify that there
will be some return on the metrics investment, o/w there's no point. 
The payback comes in the form of better awareness of your problem areas and
in the eventual improvement in the way your shop does business.  You can 
leave evaluations of personnel completely out of the picture - and should.

chris...
-- 
Christopher Lott \/ Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
  cml@cs.umd.edu /\ 4122 AV Williams Bldg  301 405-2721 <standard disclaimers>

alan@tivoli.UUCP (Alan R. Weiss) (06/05/91)

In article <35121@mimsy.umd.edu> cml@cs.umd.edu (Christopher Lott) writes:
>
>In some-article somebody (Kambic|hlavaty|Showalter) writes:
>> ....
>> a system where the metrics person is NOT the management chain - in fact,
>> it is forbidden for the metrics person to EVER reveal a specific name to a 
>> management person.  The reason is exactly as you describe - once anyone got
>> burned because of the metric data, the accuracy ... of data is shot
>
>In article <795@tivoli.UUCP> alan@tivoli.UUCP (Alan R. Weiss) writes:
>>Watts Humphrey is a very bright person, but its clear .... that 
>>Watts never had to answer to the board of directors, stockholders,
>>or customers.
>
>I am working with a large set of data on developments right now, and
>am struggling with the analysis.  The data isn't perfect, but my more
>experienced colleagues tell me that this data (from NASA, btw) is so
>much better than data sets at other institutions that I should just be happy.
>(don't worry, be hap...., um, sorry.)  Quality of the data you collect is
>absolutely vital - or else you'll find yourself analyzing just plain garbage.

This is true. Mom & Apple pie. 
I don't understand your merging of the Kambic | Hlavaty | Showalter
posting with my own.  They are related, but address different ends
of the problem.  What gives?

>The argument I'd like to make is that if you're going to put 2-5% of your
>project resources into data collection and validation, you better not shoot
>yourself in the foot right off by EVER using your metrics to evaluate your
>workers.  (Here you might read ``punish'' for ``evaluate.'')

Ah, this addresses the Kambic | Hlavaty | Showalter point.

>It's naive to think that a group manager doesn't know already who are
>the stellar performers and who are already not so hot -- metrics, if they
>are given, will only confirm this knowledge.  The point is that the metrics
>must NEVER be used as evaluatory tools FOR PEOPLE.  This is the idea that
>management has to buy into (to use b-world jargon) before a metrics program
>can succeed.  You want to use metrics to evaluate your processes and products,
>not your personnel.


This has been said many times.  We all agree on this.


>I also don't like Mr Weiss's statement that Mr Humphrey never had to answer
>to stockholders et al.  Clearly the metrics person must justify that there
>will be some return on the metrics investment, o/w there's no point. 
>The payback comes in the form of better awareness of your problem areas and
>in the eventual improvement in the way your shop does business.  You can 
>leave evaluations of personnel completely out of the picture - and should.
>
>chris...
>-- 
>Christopher Lott \/ Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
>  cml@cs.umd.edu /\ 4122 AV Williams Bldg  301 405-2721 <standard disclaimers>


Now we get to the part that concerns me.  Look, I'm sorry if I offended
your sense of propriety or tarnished a religious icon :-) [grin] but
the fact of the matter is this:  the SEI, which Watts heads, focuses
on long-term quality improvements.  My point, which you have NOT addressed,
is that business people (you know, the ROI-folks) are like Captain James
T. Kirk:  "Dammit, Bones, I want answers *now*!"  If you can't produce
short-term improvements, you're probably in trouble ('specially if your
guardian angel gets re-orged or early-retired!).

Don't misunderstand me:  I LIKE the SEI, and I wish them well.  But, IMHO
and in my experience, they need to handle the business case better, and they need
to address the hellaciously short product life cycles before they get
widespread adoption in the software biz.  As Joe Friday says, "just the facts."

Secondly, I disagree:  as with Heisenberg, the mere act of setting up a metrics
program can have an impact on total quality (see the Hawthorne Effect,
Western Electric, 1924).  So there IS A POINT in setting up such a program.
Or, in more modern management theory terms, Tom Peters puts it this way:
"try it, break it, fix it, repeat."


_______________________________________________________________________
Alan R. Weiss                           TIVOLI Systems, Inc.
E-mail: alan@tivoli.com                 6034 West Courtyard Drive,
E-mail: alan@whitney.tivoli.com	        Suite 210
Voice : (512) 794-9070                  Austin, Texas USA  78730
Fax   : (512) 794-0623
_______________________________________________________________________

kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) (06/06/91)

In article <35121@mimsy.umd.edu>, cml@cs.umd.edu (Christopher Lott) writes:
> In some-article somebody (Kambic|hlavaty|Showalter) writes:
>> ....
>> a system where the metrics person is NOT the management chain - in fact,
>> it is forbidden for the metrics person to EVER reveal a specific name to a 
>> management person.  The reason is exactly as you describe - once anyone got
>> burned because of the metric data, the accuracy ... of data is shot
> 
> In article <795@tivoli.UUCP> alan@tivoli.UUCP (Alan R. Weiss) writes:
>>Watts Humphrey is a very bright person, but its clear .... that 
>>Watts never had to answer to the board of directors, stockholders,
>>or customers.
I believe that Watts Humphrey spent many years at IBM, which would seem to
imply that he knows about deadlines and customer commitments.
> 
[...]
> Quality of the data you collect is
> absolutely vital - or else you'll find yourself analyzing just plain garbage.
Yes.  If you're working with GSFC data, you should have good luck.

> The argument I'd like to make is that if you're going to put 2-5% of your
> project resources into data collection and validation, you better not shoot
> yourself in the foot right off by EVER using your metrics to evaluate your
> workers.  (Here you might read ``punish'' for ``evaluate.'')
Can you use metrics to determine if a person needs training?
> 
> It's naive to think that a group manager doesn't know already who are
> the stellar performers and who are already not so hot -- metrics, if they
> are given, will only confirm this knowledge. 
How does the manager determine that in the first place?
> The point is that the metrics
> must NEVER be used as evaluatory tools FOR PEOPLE.  This is the idea that
> management has to buy into (to use b-world jargon) before a metrics program
> can succeed.  You want to use metrics to evaluate your processes and products,
> not your personnel.
[...]
> You can leave evaluations of personnel completely out of the picture - 
> and should.
Please explain how you separate software people from the software process. 

GXKambic
standard disclaimer

cml@cs.umd.edu (Christopher Lott) (06/07/91)

In article <4794.284cfad3@iccgcc.decnet.ab.com> kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) writes:
>Can you use metrics to determine if a person needs training?

I think you can invent a metric to determine, by your standards, anything
you like.  But inventing one to determine if a member of your staff needs
training is not going to win you any friends or cooperation among
your technical staff.   My argument is that this is a wholly inappropriate
use of a software metrics program.

>>I write:
>> It's naive to think that a group manager doesn't know already who are
>> the stellar performers and who are already not so hot -- metrics, if they
>> are given, will only confirm this knowledge. 
>Mr Kambic writes:
>How does the manager determine that in the first place?

You're leading me down a path that I can't quite see.  But here's a response.
I have never managed people, so I can't place myself in that situation.
However, in all my positions, I soon figured out who were the good workers
doing good things and who were the average ones.  (is average pejorative
in this group? ;-)  Surely the manager knows from previous experience who
meets deadlines, meets expectations, produces work that doesn't need to be
redone, etc.  And they talk to their tech people who, in my limited 
experience, seem to know each other's work pretty well.  

>Mr Kambic makes me work by writing:
>Please explain how you separate software people from the software process. 

Clearly, this is a contradiction.  Thank you for bringing it up.  

I think people who discuss metrics programs all too often confuse the ROLE of
the software person with the INDIVIDUAL.  The roles which people play
constitute our work.  Individuals are then cast into these roles and told Go!
So, I argue that when running a metrics program, we want to gather information
about the processes (roles) which people perform and about the artifacts they
produce, not directly info about the individual.  Only by focusing on the
larger picture, the role and the process, can any organizational learning
or improvement happen.

The reasons for not evaluating individuals directly are manifold, but primarily
for the success of the metrics program.  Put another way, we don't live in a
perfect world.  If metrics are used for worker evaluation, the workers will
respond immediately in ways to make the metrics show them in a favorable light.
If no such pressure is exerted, they can afford to be honest.  When you're
collecting human-intensive data, this is vital. 

Say that we care deeply about the amount of computer resources spent on
project X.  But say much of project X work is done on PCs, where automated
collection is basically impossible.  Then the metrics program relies on the
employees to make accurate reports.  Say the reports are way off on the low
side.  When planning the next project, the planner sees that not much machine
time was used previously, and decides that not many machines are necessary.
A situation such as this seems like it would lead to failure, or at least some
interesting recantings of earlier results.  And such a failure is exactly what
the metrics program could have prevented with ease!

I can invent any number of metrics scenarios in which people fake data to make
themselves look better.  Lines of code for productivity?  Hey, blank lines!
Effort to effect repair of a bug?  Hey, low-ball it!  Time spent on the 
computer?  Uh, two or three runs, a few seconds, yeah!  Effort charged to
the project?  What, you mean all those night hours I put in fixing my 
screwups?  forget it!

A metrics program that evaluates roles/processes and products, and does not
take any part in the worker evaluation process, has a chance. Once people 
are convinced that the metrics program will benefit them, they'll be happy
participants.  But it may be a hard sell.  (maybe I understate this point ;-)

chris...
-- 
Christopher Lott \/ Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
  cml@cs.umd.edu /\ 4122 AV Williams Bldg  301 405-2721 <standard disclaimers>

cml@cs.umd.edu (Christopher Lott) (06/07/91)

In some article I write:
>>Clearly the metrics person must justify that there
>>will be some return on the metrics investment, o/w there's no point. 

In article <801@tivoli.UUCP> alan@tivoli.UUCP (Alan R. Weiss) writes:
>My point, which you have NOT addressed,
>is that ... If you can't produce
>short-term improvements, you're probably in trouble 

It occurs to me that Mr Weiss is looking for the silver bullet.  Count
this, do that if the number is in this range, Boom, done.  Swell.  But I 
also think I'm selling him short here.  It's a good point, but not a great
one.

Metrics programs are instituted with the goal (as I see it) of improvement
in the way a shop does business.  I believe that it is vital to understand
what you're doing before throwing it away and doing something different.
I hesitate to say that a metrics program is inappropriate for short-term
payback, because I lack the experience.  But I do believe so.

A metrics program must begin with a goal.  [ I refer folks to the 
Goal/Question/Metric paradigm developed by my advisor, previously mentioned
here, published in IEEE TSE some years ago, don't have the ref at home. ]
Your goal could be short term, fine.  Your questions and metrics could
be very limited and inexpensive to answer and collect.  But evaluation
(which is what I think you want) is impossible without some corporate memory,
some record of prior efforts.  Without reference data, you can't evaluate the
numbers you get very well.  Without an evaluation, how can you improve?

Still, the numbers give you insights that you lacked before you measured. 
Maybe you can justify these early insights as a short-term payback, and
a chance to improve.

Once you understand your environment thoroughly (probably as a result of a
long-standing metrics program), the possibilities for improvement will seem
endless.  But characterization does not lend itself well to the short term.
It takes time and a lot of data to allow you to construct an environment
profile, and without a basis for evaluation, your numbers are too likely
to be just numbers, subject to any interpretation you come up with.

chris...
-- 
Christopher Lott \/ Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
  cml@cs.umd.edu /\ 4122 AV Williams Bldg  301 405-2721 <standard disclaimers>

jls@netcom.COM (Jim Showalter) (06/07/91)

>Surely the manager knows from previous experience who
>meets deadlines, meets expectations, produces work that doesn't need to be
>redone, etc.

These are all metrics! Every single one of these could be measured,
calibrated, quantified, etc. I think people do this sort of metric
compilation constantly--it's just that when someone talks about making
the process explicit, people claim it can't be done.

>Lines of code for productivity?  Hey, blank lines!
>Effort to effect repair of a bug?  Hey, low-ball it!  Time spent on the 
>computer?  Uh, two or three runs, a few seconds, yeah!  Effort charged to
>the project?  What, you mean all those night hours I put in fixing my 
>screwups?  forget it!

The above assumes that metrics can only be gathered by asking people
to compile them for you. This is, I think, the WORST way to acquire
metrics data, since it puts humans--with all their bias, subjectivity,
and defensiveness--into the loop. Every item listed above could be
acquired automatically with relatively simple tools.
-- 
**************** JIM SHOWALTER, jls@netcom.com, (408) 243-0630 ****************
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *

cml@cs.umd.edu (Christopher Lott) (06/07/91)

>>I write:
>>Lines of code for productivity?  Hey, blank lines!
>>Effort to effect repair of a bug?  Hey, low-ball it!  Time spent on the 
>>computer?  Uh, two or three runs, a few seconds, yeah!  Effort charged to
>>the project?  What, you mean all those night hours I put in fixing my 
>>screwups?  forget it!

In article <1991Jun7.074226.12105@netcom.COM> jls@netcom.COM (Jim Showalter) writes:
>The above assumes that metrics can only be gathered by asking people
>to compile them for you. This is, I think, the WORST way to acquire
>metrics data, since it puts humans--with all their bias, subjectivity,
>and defensiveness--into the loop. Every item listed above could be
>acquired automatically with relatively simple tools.

I do NOT assume that metrics can be gathered only by asking people for them,
and I disagree strongly with Mr Showalter about this point.  Many metrics can
be automated, and should be, in order to reduce the load on the personnel. 
BUT you cannot automagically collect data which is invisible.  Kindly explain
to me how a simple tool will determine the amount of unpaid (unrecorded) time
spent?  Or decide how much of the 40 hours spent on error correction last week
was spent on each of the 17 error fixes that were performed?  This is highly
valuable data for the project metrics program, and it only comes from humans.  

Let me add that I also believe that a metrics program absolutely needs an
error / metrics analyst to audit reports for sanity.  Bias, subjectivity,
and other problems will taint metrics reporting, so the reports cannot be
taken verbatim - some careful checking of reports shortly after submission
serves as much-needed validation.
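
For illustration only, here is the flavor of an automated first pass such an
analyst might run before reading the reports by hand.  This is a sketch in
Python; the report format and the thresholds are invented, not taken from
any real metrics program.

    # Sketch of a first-pass sanity check on submitted effort reports,
    # run before the metrics analyst reads them by hand.  The report
    # format and thresholds are invented for illustration.

    def suspicious_reports(reports, min_hours_per_fix=0.25, max_week_hours=80):
        """Return (who, reasons) pairs for reports that deserve a closer look.

        Each report is a dict like:
          {"who": "cml", "week_hours": 40, "fixes": 17, "fix_hours": 2}
        """
        flagged = []
        for r in reports:
            reasons = []
            if r["fixes"] > 0 and r["fix_hours"] / r["fixes"] < min_hours_per_fix:
                reasons.append("implausibly little effort per fix")
            if r["fix_hours"] > r["week_hours"]:
                reasons.append("more fix effort than total hours reported")
            if r["week_hours"] > max_week_hours:
                reasons.append("total hours exceed a plausible week")
            if reasons:
                flagged.append((r["who"], reasons))
        return flagged

    if __name__ == "__main__":
        sample = [{"who": "a", "week_hours": 40, "fixes": 17, "fix_hours": 2},
                  {"who": "b", "week_hours": 45, "fixes": 3, "fix_hours": 12}]
        for who, reasons in suspicious_reports(sample):
            print(who, "->", "; ".join(reasons))

A check like this only flags reports; a human still has to decide whether a
flagged report is dishonest, mis-recorded, or simply unusual.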

Yes, this can get expensive.  That is a valid concern.  The argument to be
made is that the improvements realized from a good metrics program more than
pay for the overhead associated with the program.

chris...
-- 
Christopher Lott \/ Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
  cml@cs.umd.edu /\ 4122 AV Williams Bldg  301 405-2721 <standard disclaimers>

saxena@motcid.UUCP (Garurank P. Saxena) (06/07/91)

cml@cs.umd.edu (Christopher Lott) writes:

>In article <4794.284cfad3@iccgcc.decnet.ab.com> kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) writes: 
>meets deadlines, meets expectations, produces work that doesn't need to be 
>redone, etc.  And they talk to their tech people who, in my limited 
>experience, seem to know each other's work pretty well.  

>>Mr Kambic makes me work by writing:
>>Please explain how you separate software people from the software process. 

>Clearly, this is a contradiction.  Thank you for bringing it up.  

>I think people who discuss metrics programs all too often confuse the ROLE of
>the software person with the INDIVIDUAL.  The roles which people play
>constitute our work.  Individuals are then cast into these roles and told Go!
>So, I argue that when running a metrics program, we want to gather information
>about the processes (roles) which people perform and about the artifacts they
>produce, not directly info about the individual.  Only by focusing on the
>larger picture, the role and the process, can any organizational learning
>or improvement happen.

At the outset, let me state that I am in total agreement with the
fact that metrics programs should be used to improve the process and
*not* be used for evaluating people. However, the real problem with
software is that it's done by people and not by machines (ignoring the
translation of programs and requirements by things like compilers).
Therefore you can *not* dissociate people from the process, as the
people are an integral part of the process. I quote from a paper 
titled "On Building Software Process Models Under the Lamppost" by
Bill Curtis et al, of MCC Software Technology Program, 1987. This paper
describes the empirical results gleaned from studies done by the
MCC team on 19 software projects in various companies and different
project domains. Among the most important factors affecting
productivity and quality were "project relevant personnel
experience and continuity" and "capability of the personnel assigned
to the project".

My point is that you simply *cannot* separate software people from
the software process, as long as software continues to be developed
by people. 

-------------------------------------------------------------------
Garurank P. Saxena		"the real problem in software is
CASE Development Group		 that it is done by people".
Motorola Inc,
Arlington Heights, Illinois
Voice: (708)-632-4757		Fax: (708)-632-4430
-------------------------------------------------------------------

alesha@auto-trol.com (Alec Sharp) (06/08/91)

>
>A metrics program must begin with a goal.  [ I refer folks to the 
>Goal/Question/Metric paradigm developed by my advisor, previously mentioned
>here, published in IEEE TSE some years ago, don't have the ref at home. ]

Key point (although I must confess my distaste for the overused word
paradigm :-) ).

Developed by Victor Basili, it says: Set your goals.  To know if you
are meeting those goals, you have to ask certain questions.  To answer
those questions you need information, some of which can only come from
measurements.

Preserve your sanity - don't institute metrics for their own sake.
Set well defined, well understood, well accepted goals, and measure
only those things that answer the questions that must be answered.
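
To make the shape of that concrete, here is a toy sketch in Python.  The
goal, questions, and metrics below are invented for illustration; they are
not from Basili's paper.

    # Toy Goal/Question/Metric structure: a goal, the questions that tell
    # you whether you are meeting it, and the measurements that answer
    # each question.  All of the entries are invented examples.

    gqm = {
        "goal": "Reduce rework on project X, from the maintainer's viewpoint",
        "questions": {
            "How much effort goes into fixing faults?": [
                "hours charged to fault correction per week",
                "number of faults corrected per week",
            ],
            "Which modules attract the most faults?": [
                "faults reported per module",
                "changes checked in per module",
            ],
        },
    }

    # Collect only what answers a question; anything else is measurement
    # for its own sake.
    for question, metrics in gqm["questions"].items():
        print(question)
        for m in metrics:
            print("   measure:", m)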

Alec...


-- 
------Any resemblance to the views of Auto-trol is purely coincidental-----
Don't Reply - Send mail: alesha%auto-trol@sunpeaks.central.sun.com
Alec Sharp           Auto-trol Technology Corporation
(303) 252-2229       12500 North Washington Street, Denver, CO 80241-2404

warren@eecs.cs.pdx.edu (Warren Harrison) (06/08/91)

In article <35346@mimsy.umd.edu> cml@cs.umd.edu (Christopher Lott) writes:
>In article <4794.284cfad3@iccgcc.decnet.ab.com> kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) writes:
>>Can you use metrics to determine if a person needs training?
>
>>>I write:
>>> It's naive to think that a group manager doesn't know already who are
>>> the stellar performers and who are already not so hot -- metrics, if they
>>> are given, will only confirm this knowledge. 
>>Mr Kambic writes:
>>How does the manager determine that in the first place?
>
>You're leading me down a path that I can't quite see.  

I think what George Kambic is getting at is that the manager is using
an implicit measurement system when evaluating his/her staff. You yourself
have just created an implied measurement system that (somehow) assigns
programmers into two classes: "stellar performers" and "not so hot" (there's
probably an "in between" there too). Sadly, implicit measurement systems
are just as easily confounded as explicit ones - my wife used to work for
a manager who thought that a person's commitment to the company and
performance was a function of how much overtime they put in. Even though
she consistently outperformed her colleagues (they were mainly retrained
accountants and she had both a BS and MS in Computer Science and several
years of professional experience), when raise time came around, guess what.
She never missed a deadline, but likewise refused to work weekends and
evenings to clean up other people's mistakes.

Had the manager looked at virtually any other explicit metric - number of
functions completed, lines of code written, etc. her performance would
have been obvious. At the same time, much of the deadwood she was expected to
support would have been out in a flash. 

At least if your manager makes his/her metric explicit you can (a) either
convince him it's no good, (b) work to it, or (c) refuse to work for him
or the company. With an implicit metric you can end up working for quite a
while before you find out that you're not doing the right things.

Warren

==========================================================================
Warren Harrison                                          warren@cs.pdx.edu
Center for Software Quality Research                          503/725-3108
Portland State University/CMPS   

jls@netcom.COM (Jim Showalter) (06/08/91)

>BUT you cannot automagically collect data which is invisible.  Kindly explain
>to me how a simple tool will determine the amount of unpaid (unrecorded) time
>spent?

Duration of login?

>Or decide how much of the 40 hours spent on error correction last week
>was spent on each of the 17 error fixes that were performed?  This is highly
>valuable data for the project metrics program, and it only comes from humans.  

Well, an integrated CM system that ties work orders to check-out/check-in of
compilation units does a great job of automating the collection of this
data. I know where you can buy such a system if you're interested.
-- 
**************** JIM SHOWALTER, jls@netcom.com, (408) 243-0630 ****************
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *

cml@cs.umd.edu (Christopher Lott) (06/08/91)

In article <1991Jun7.173800.23746@auto-trol.com> alesha@auto-trol.com (Alec Sharp) writes:
>Developed by Victor Basili it says: Set your goals.  To know if you
>are meeting those goals, you have to ask certain questions.  To answer
>those questions you need information, some of which can only come from
>measurements.
>
>Preserve your sanity - don't institute metrics for their own sake.
>Set well defined, well understood, well accepted goals, and measure
>only those things that answer the questions that must be answered.

well summarized!  Better than I could have done on short notice.
Here's the ref:

Basili, Victor R., and H. Dieter Rombach,
"The Tame Project:  Towards Improvement-Oriented Software Environments,"
IEEE Transactions on Software Engineering,
Vol. 14, No. 6, June 1988, pp. 758-773.


chris...
-- 
Christopher Lott \/ Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
  cml@cs.umd.edu /\ 4122 AV Williams Bldg  301 405-2721 <standard disclaimers>

mcgregor@hemlock.Atherton.COM (Scott McGregor) (06/08/91)

>BUT you cannot automagically collect data which is invisible.  Kindly explain
>to me how a simple tool will determine the amount of unpaid (unrecorded) time
>spent?  Or decide how much of the 40 hours spent on error correction last week
>was spent on each of the 17 error fixes that were performed?  This is highly
>valuable data for the project metrics program, and it only comes from humans.

If we accept that absolute precision is unnecessary (and, with typical
corner cases, impossible to define uniquely in advance), then *rough estimates*
of how much time was spent on each error might be good enough.  Estimates can
be hypothesized by systems that pay attention to what work the individual
is doing.  For instance, work done while a copy of bug report A is
concurrently displayed in a window is more likely to be associated with
error A than with error B.  An agent might therefore estimate that more
time was spent working on A than on B.  The programmer might take these
estimates and review them for general correctness, correcting any
misestimates made by the system (note this keeps ownership of the data
with the engineer--they are free to alter it if they feel the data would
result in punishment, which encourages managers not to use the data that
way, since such use would lead to unusable numbers in the future).  With
such estimates, predictive estimates can be made of the complexity of
various pieces of work, of their cost to fix, etc.  They won't be
perfect, but may be better than total ignorance.
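
As a very rough sketch of the bookkeeping such an agent might do, here are a
few lines of Python.  The focus-event log is invented; a real agent would
obtain these events from the window system, and the engineer would review
the resulting estimates as described above.

    # Attribute elapsed time to whichever bug report's window currently
    # has focus.  The (seconds, window_title) event stream is a made-up
    # stand-in for what a window-system agent would record.

    def attribute_time(focus_events, end_time):
        """focus_events: list of (seconds, title) pairs, sorted by time."""
        totals = {}
        pairs = zip(focus_events, focus_events[1:] + [(end_time, None)])
        for (t, title), (t_next, _) in pairs:
            if title.startswith("bug-report-"):
                totals[title] = totals.get(title, 0) + (t_next - t)
        return totals

    events = [(0, "bug-report-A"), (1200, "mail"),
              (1500, "bug-report-B"), (2700, "bug-report-A")]
    print(attribute_time(events, end_time=3600))
    # -> {'bug-report-A': 2100, 'bug-report-B': 1200}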

The point is that very little research has looked at what can be learned
from watching the work process itself.  The past assumption is that the
computer must be explicitly told everything, or it must be derivable from
static representations of code, or tests done in special harnesses.  Little
attention has been paid to the way people work at a keyboard, mouse, and
monitor on a particular task as input to this sort of estimation process.

However, in fact a great deal of information can be gained from watching
users at work.  People doing UI testing have long known this, and have
frequently taken video tapes of people at work.  From watching the
video tape, it is easy to see about how long someone spends on a
task, or how frequently they switch from task to task, and when they seem to
be confused, thinking, etc.  You can tell this even without their
explicitly saying anything about what they are doing.  Interestingly enough,
while watching facial movements, body movements, etc. helps a lot, I have
found you can discover a lot of information MERELY from watching a video
of the display they are looking at (with no sound, no body movements,
etc.).  But this is just people watching the person at work from outside.

Yet, IF PROPERLY INSTRUMENTED, the same information can be monitored
from the inside of the system. Once the data is collected, it can
not only help generate metrics, but it can also be used to do
pre-caching of menus, databases, automatic raising of help, etc. that
assists the user directly in a way similar to the way company clerk Radar
O'Reilly of M*A*S*H helped his colonel by preparing files in advance, having
information available just before it was needed, etc.  This use of
instrumented UIs to provide additional assistant-style help has been
the focus of the work that I have done and presented in my articles
and lectures on "Prescient Agents."  Many other figures in CHI and CSCW
areas have made similar observations.

Of course, almost no one does such monitoring today, and almost no one
thinks about it, or how they would use such a system. Yet
characteristics of X-Windows ICCCM conventions, NFS file systems, ptrace
system calls, ps and "accounting"
monitoring capabilities in today's Unix systems provide a reasonable base
for building such agents. Such agents can be aware of such things as
when windows and sprite locations change, when "X rooms" change,
locations and durations of windows, accesses to files, spawning and
death of child processes, etc.  People work in certain habitual ways,
determined by their neural wiring systems.  For example, if people are
working on multiple projects, and they have a system with X rooms, then the
most common organization of a project is to have closely related objects in
the same rooms at the same time.  On the other hand, it is often inconvenient
for people to work on related things if they are overlapping, so
non-overlapping windows in the same "room" are even more likely to be
strongly related.  Not every window in a room is necessarily related to the
other objects (e.g. a clock window), but most are, and often the exceptions
can be easily handled (often just by ignoring them!).  Exceptions often even
have patterns
that allow them to be easily recognized (for instance a window that is
shown in ALL rooms is probably less likely to be related to any of them.)
Norman, Shneiderman, Malone, Engelbart, Kay, and many others have long noted
that people use spatial and temporal proximity and cues to help them manage
complex tasks.  Secretaries, clerks, and others use these heuristics without
conscious awareness to try to understand other people's work, but these
observations can also be applied consciously, and even mechanically by
agents, to achieve surprising results.

> Let me add that I also believe that a metrics program absolutely needs an
> error / metrics analyst to audit reports for sanity.  Bias, subjectivity,
> and other problems will taint metrics reporting, so the reports cannot be
> taken verbatim - some careful checking of reports shortly after submission
> serves as much-needed validation.

Despite the fact that people are already observed doing their work  by other
people today, I suspect that some people will be wary of their computer
doing this sort of monitoring.  Despite how the computer may actually be able
to use this to actively assist in the tasks being observed, people may be
concerned with other uses of the monitored data by others.  It is clear
that like many other technologies, such monitoring can be quite beneficial
but in other hands can be quite threatening.  This can be especially true
of computer information, which many people quite uncritically presume to
be ultra-precise and accurate even in situations where the data are known
to be only approximate.  I have recommended that the individual being
monitored/assisted be in control of their data, and that they be free to
provide it, deny it, or alter it before giving it to others such as
managers or metrics analysts.  I perceive this as being like credit
information, which you might want to share with someone you want credit
from, but which you also might not want to share.  You might
also want to correct some mistakes or misleading random features in the data.
It is clear that organizations can be created where the user does not control
nor own their data, and in such a situation monitored data can be used
for tyrannical purposes.  I do not recommend that individuals submit
themselves to such organizations if they have the choice, and I feel badly for
those who do not have the choice, and who may be monitored in more ways
than mechanically (e.g. secret police, etc.).  Information technology
seems to always be a two edged sword.

Scott McGregor
Atherton Technology
mcgregor@atherton.com

alan@tivoli.UUCP (Alan R. Weiss) (06/11/91)

In article <35353@mimsy.umd.edu> cml@cs.umd.edu (Christopher Lott) writes:
>In some article I write:
>>>Clearly the metrics person must justify that there
>>>will be some return on the metrics investment, o/w there's no point. 
>
>In article <801@tivoli.UUCP> alan@tivoli.UUCP (Alan R. Weiss) writes:
>>My point, which you have NOT addressed,
>>is that ... If you can't produce
>>short-term improvements, you're probably in trouble 
>
>It occurs to me that Mr Weiss is looking for the silver bullet.  Count
>this, do that if the number is in this range, Boom, done.  Swell.  But I 
>also think I'm selling him short here.  It's a good point, but not a great
>one.

[the rest of Chris' posting deleted to save time ... plus I agree with
most of it]

Yep, I'm lookin' for LOTS of silver bullets.  But, if you've read
my previous postings, you'll note that I am an overwhelming
believer in people first, systems second.  I came to this realization
the hard way, believe you me :-) 

I don't want to be needlessly didactic here, but I do this for a living,
Chris.  And have for over 10 years.  I'm constantly looking for new
ideas.  If these are characterized as silver bullets, so be it. :-)


_______________________________________________________________________
Alan R. Weiss                           TIVOLI Systems, Inc.
E-mail: alan@tivoli.com                 6034 West Courtyard Drive,
E-mail: alan@whitney.tivoli.com	        Suite 210
Voice : (512) 794-9070                  Austin, Texas USA  78730
Fax   : (512) 794-0623
_______________________________________________________________________

adam@visix.com (06/12/91)

In article <1991Jun7.201655.12088@netcom.COM>, jls@netcom.COM (Jim Showalter) writes:
|> >BUT you cannot automagically collect data which is invisible.  Kindly explain
|> >to me how a simple tool will determine the amount of unpaid (unrecorded) time
|> >spent?
|> 
|> Duration of login?

168 hrs/wk.

Adam

jls@netcom.COM (Jim Showalter) (06/12/91)

adam@visix.com writes:
>In article <1991Jun7.201655.12088@netcom.COM>, jls@netcom.COM (Jim Showalter) writes:

>|> >BUT you cannot automagically collect data which is invisible.  Kindly explain
>|> >to me how a simple tool will determine the amount of unpaid (unrecorded) time
>|> >spent?
>|> 
>|> Duration of login?

>168 hrs/wk.

Duration of login where keystrokes occurred at least once every 30 minutes?
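
Half-seriously, that rule is easy enough to sketch.  The keystroke timestamp
list below is assumed input from somewhere; this is not a real tool.

    # Count a login as "active" only during stretches where consecutive
    # keystrokes are no more than 30 minutes apart.  Timestamps are in
    # seconds; the list itself is assumed to come from elsewhere.

    def active_seconds(keystroke_times, max_gap=30 * 60):
        times = sorted(keystroke_times)
        total = 0
        for earlier, later in zip(times, times[1:]):
            gap = later - earlier
            if gap <= max_gap:
                total += gap
        return total

    # Two bursts of typing separated by a three-hour break:
    print(active_seconds([0, 60, 120, 3 * 3600, 3 * 3600 + 300]))   # -> 420
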
-- 
*** LIMITLESS SOFTWARE, Inc: Jim Showalter, jls@netcom.com, (408) 243-0630 ****
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *

G.Joly@cs.ucl.ac.uk (Gordon Joly) (06/13/91)

Jim Showalter <jls@netcom.COM> writes
> adam@visix.com writes:
> >In article <1991Jun7.201655.12088@netcom.COM>, jls@netcom.COM (Jim Showalter) writes:
> 
> >|> >BUT you cannot automagically collect data which is invisible.  Kindly explain
> >|> >to me how a simple tool will determine the amount of unpaid (unrecorded) time
> >|> >spent?
> >|>
> >|> Duration of login?
> 
> >168 hrs/wk.
> 
> Duration of login where keystrokes occurred at least once every 30 minutes?
> --

Aha! At work, we must press keys. None of that sitting at a desk
reading a book or writing a design nonsense:-)

Sorry if this related not to this->thread!

____

Gordon Joly                                       +44 71 387 7050 ext 3716
Internet: G.Joly@cs.ucl.ac.uk          UUCP: ...!{uunet,ukc}!ucl-cs!G.Joly
Computer Science, University College London, Gower Street, LONDON WC1E 6BT

                        Drop a utensil.

styri@cs.hw.ac.uk (Yu No Hoo) (06/13/91)

In article <1991Jun12.003809.24084@netcom.COM> jls@netcom.COM (Jim Showalter) writes:
>>|> >BUT you cannot automagically collect data which is invisible.  Kindly explain
>>|> >to me how a simple tool will determine the amount of unpaid (unrecorded) time
>>|> >spent?
>>|> 
>>|> Duration of login?
>>168 hrs/wk.
>

Gee, this reminds me of the pain I went through trying to create a progress
report sheet that would satisfy the accountant, the project manager, the
guys who did the job, and of course myself - the QA kid.  :-)

Guess there are some things that cannot be recorded automatically. You'll
always have people who work in the lunch room and read a newspaper in the
office. The keyboard is only used to 'record' the work (a tedious task).
(Guess we have to define 'work' here - before paying.)

Anyway, as long as a guy gets paid for recorded time I think unrecorded
time is likely to turn up as a health problem. However, a serious problem
for the project manager is the time recorded incorrectly. ("You see, there
was no more money allocated to this task, so we "used" that task instead."
or... "Well, I was called to this meeting, but there was no place to record
the time I used...")

Tradition says programmers always underestimate. Well, I've experienced the
same with managers - they just forget to multiply. ("Sorry I'm 5 min. late"
- "Well, we've been waiting for an hour", replied the 12 others.) Sometimes
meetings eat a lot of project time.

Sorry for sidetracking the thread from sw-eng theory, but I couldn't resist.
----------------------
Haakon Styri
Dept. of Comp. Sci.              ARPA: styri@cs.hw.ac.uk
Heriot-Watt University          X-400: C=gb;PRMD=uk.ac;O=hw;OU=cs;S=styri
Edinburgh, Scotland

jgautier@vangogh.ads.com (Jorge Gautier) (06/13/91)

In article <1991Jun12.003809.24084@netcom.COM> jls@netcom.COM (Jim Showalter) writes:
> adam@visix.com writes:
> >In article <1991Jun7.201655.12088@netcom.COM>, jls@netcom.COM (Jim Showalter) writes:
> >|> >BUT you cannot automagically collect data which is invisible.  Kindly explain
> >|> >to me how a simple tool will determine the amount of unpaid (unrecorded) time
> >|> >spent?
> >|> 
> >|> Duration of login?

> >168 hrs/wk.

> Duration of login where keystrokes occurred at least once every 30 minutes?

Good to see that you're measuring something.  Remember that if you
measure something, *anything*, something good will happen.  (I forget
exactly what...)  Now everybody, remember to strike that keyboard
while you're thinking! 
--
Jorge A. Gautier| "The enemy is at the gate.  And the enemy is the human mind
jgautier@ads.com|  itself--or lack of it--on this planet."  -General Boy
DISCLAIMER: All statements in this message are false.
"Mommy, where do programs come from?"

kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) (06/14/91)

In article <801@tivoli.UUCP>, alan@tivoli.UUCP (Alan R. Weiss) writes:
> In article <35121@mimsy.umd.edu> cml@cs.umd.edu (Christopher Lott) writes:
[...]
> 
>>It's naive to think that a group manager doesn't know already who are
>>the stellar performers and who are already not so hot -- metrics, if they
>>are given, will only confirm this knowledge.  The point is that the metrics
>>must NEVER be used as evaluatory tools FOR PEOPLE.  This is the idea that
>>management has to buy into (to use b-world jargon) before a metrics program
>>can succeed.  You want to use metrics to evaluate your processes and products,
>>not your personnel.
> 
> This has been said many times.  We all agree on this.

Once again, how does a group manager get to know who the stellar performers
are?  What are their skills in particular areas?  Where do they need 
training?  What do YOU use to evaluate employees?  Do you use nonquantitative
goals?  Do you measure their output in some manner?  If you do not use 
metrics, what do you use, and how do you use it, especially when you are 
deciding the amount of increase each employee should receive?  Fun 
questions.  I am still not saying that I am using or advocate using metrics
to evaluate software engineers.  But I want to know if they differ in any 
significant manner from other employees who are evaluated on quantitative
targets, and if so, how they are different, and then how they should be 
evaluated on these differences.  I still haven't seen these answers.  I have 
my own crazy ideas, on which I shall currently keep silent.
> 
BTW, Humphrey worked at IBM for many years. (20?)
> Don't misunderstand me:  I LIKE the SEI, and I wish them well.  But, IMHO
> and in my experience, they need to handle the business case better, and they need
> to address the hellaciously short product life cycles before they get
> widespread adoption in the software biz.  As Joe Friday says, "just the facts."
> 
Very good point.  A lot of what they do is/will be useful.  Just takes time.

GXKambic metric disclaimer

kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) (06/14/91)

In article <35346@mimsy.umd.edu>, cml@cs.umd.edu (Christopher Lott) writes:
> In article <4794.284cfad3@iccgcc.decnet.ab.com> kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) writes:
>>Can you use metrics to determine if a person needs training?
> 
> I think you can invent a metric to determine, by your standards, anything
> you like.  But inventing one to determine if a member of your staff needs
> training is not going to win you any friends or cooperation among
> your technical staff.   My argument is that this is a wholly inappropriate
> use of a software metrics program.
OK - now justify it.  I am investing time and money in supporting this 
person doing it.  S/he is not doing it correctly because of lack of training
(theoretically).  Now, how do I as a manager justify waiting until I 
"know" the person needs training, and then how do I present it without
useful positive-looking information to back it up?
> 
>>>I write:
>>> It's naive to think that a group manager doesn't know already who are
>>> the stellar performers and who are already not so hot -- metrics, if they
>>> are given, will only confirm this knowledge. 
>>Mr Kambic writes:
>>How does the manager determine that in the first place?
> 
> You're leading me down a path that I can't quite see.  But here's a response.
> I have never managed people, so I can't place myself in that situation.
> However, in all my positions, I soon figured out who were the good workers
> doing good things and who were the average ones.  (is average pejorative
> in this group? ;-)  Surely the manager knows from previous experience who
> meets deadlines, 
A measurable date and deliverable.
> meets expectations, 
Software works, is testable
> produces work that doesn't need to be
> redone, etc.  
No bugs, customer satisfied, pays invoice.
> And they talk to their tech people who, in my limited 
> experience, seem to know each other's work pretty well.  
Peer review technique is very valuable and works well, but it also has the
flavor of opinion over quantification.
>>Mr Kambic makes me work by writing:
>>Please explain how you separate software people from the software process. 
> 
> Clearly, this is a contradiction.  Thank you for bringing it up.  
> 
> I think people who discuss metrics programs all too often confuse the ROLE of
> the software person with the INDIVIDUAL.  The roles which people play
> constitute our work.  Individuals are then cast into these roles and told Go!
> So, I argue that when running a metrics program, we want to gather information
> about the processes (roles) which people perform and about the artifacts they
> produce, not directly info about the individual.  Only by focusing on the
> larger picture, the role and the process, can any organizational learning
> or improvement happen.
How do you select people for roles?  Certainly not arbitrarily - you
select the right people.  But you have to know they are right.
> 
> The reasons for not evaluating individuals directly are manifold, but primarily
> for the success of the metrics program.  Put another way, we don't live in a
> perfect world.  If metrics are used for worker evaluation, the workers will
> respond immediately in ways to make the metrics show them in a favorable light.
> If no such pressure is exerted, they can afford to be honest.  When you're
I know all these points.  Let me ask a question I asked in another post.  
How are software people different from many other people in the world who 
are evaluated on quantitative metrics (sales, no. produced, etc.)?  Please 
elucidate.
> 
> I can invent any number of metrics scenarios in which people fake data to make
> themselves look better.  Lines of code for productivity?  Hey, blank lines!
> Effort to effect repair of a bug?  Hey, low-ball it!  Time spent on the 
> computer?  Uh, two or three runs, a few seconds, yeah!  Effort charged to
> the project?  What, you mean all those night hours I put in fixing my 
> screwups?  forget it!
Yea.  Now how do we convince people that if we are to make money we really 
do have to know these things.
> 
> A metrics program that evaluates roles/processes and products, and does not
> take any part in the worker evaluation process, has a chance. Once people 
> are convinced that the metrics program will benefit them, they'll be happy
> participants.  But it may be a hard sell.  (maybe I understate this point ;-)
Slightly!  8:-)

Fun.

GXKambic
Just try measuring this disclaimer

jls@netcom.COM (Jim Showalter) (06/14/91)

>Now everybody, remember to strike that keyboard
>while you're thinking! 

Okay, okay--I give! Jeez--I came up with automatic ways to measure the
number of changes to a particular unit and to tie those changes to a particular
bug fix, ways to track the error-proneness of particular units, ways to
compute both static and dynamic metrics of various kinds, and so forth,
and all of it possible with off-the-shelf technology I've been working
with for the past four years...and what do people concentrate on? An
off-the-cuff tongue-in-cheek response concerning duration of login. You'd
think people would focus on the meat of the argument, not the gristle.
-- 
*** LIMITLESS SOFTWARE, Inc: Jim Showalter, jls@netcom.com, (408) 243-0630 ****
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *

cml@cs.umd.edu (Christopher Lott) (06/14/91)

In article <1991Jun13.235937.24165@netcom.COM> jls@netcom.COM (Jim Showalter) writes:
>I came up with automatic ways to measure the
>number of changes to a particular unit and to tie those changes to a particular
>bug fix, ways to track the error-proneness of particular units, ways to
>compute both static and dynamic metrics of various kinds, and so forth,
>and all of it possible with off-the-shelf technology I've been working
>with for the past four years

I for one am extremely interested in hearing more about your successes,
especially if this is as easy as you make it seem to be.

Let me pose some questions, hopefully more meat than gristle ;-)

1.  how do you define a change?  does fixing two bugs in a single module
    constitute 1 or 2 changes?  Is it every word?  Or an editing session?
    if this is collected automatically, I suspect you're using editing
    sessions, because keystrokes would be very difficult to interpret.
    so what if I run 17 editing sessions of the same document, and wind
    up changing only comments 16 of 17 times?  how do you measure granularity
    of the changes?

2.  how do you count errors?  are errors == changes?  are you familiar with
    IEEE standard notation for bugs?  It is:
	error == human misconception
	fault == error as manifested in a document (possibly code)
	failure == visible manifestation of a fault (runtime, etc)
	(note that faults don't always cause failures)

    so is an error really a fault?  Do change forms (do you collect forms?)
    distinguish faults from enhancements?  Do you count failures?

3.  what sort of static and dynamic metrics do you compute?  what is a dynamic
    metric, runtime performance?  profile data?  did you write tools to compute
    these metrics?  

4.  do you see any interesting correlations between data from static analysis
    and error(fault)(?)-proneness of modules?

Whew, got a little carried away.  Well, I look forward to the discussion.

chris...
-- 
Christopher Lott \/ Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
  cml@cs.umd.edu /\ 4122 AV Williams Bldg  301 405-2721 <standard disclaimers>

kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) (06/14/91)

In article <35391@mimsy.umd.edu>, cml@cs.umd.edu (Christopher Lott) writes:
> In article <1991Jun7.173800.23746@auto-trol.com> alesha@auto-trol.com (Alec Sharp) writes:
>>Developed by Victor Basili it says: Set your goals.  To know if you
>>are meeting those goals, you have to ask certain questions.  To answer
>>those questions you need information, some of which can only come from
>>measurements.
>>
>>Preserve your sanity - don't institute metrics for their own sake.
>>Set well defined, well understood, well accepted goals, and measure
>>only those things that answer the questions that must be answered.
To set well defined goals that are well understood and accepted, you
must know exactly where you are now.  That is not the case in sw engineering
at this point in its development.  The paradigm that Basili sets out is good,
but I think it is well within the confines of standard project management,
only applied to software.  IMHO you cannot have a well defined goal in 

Hey Chris - with Basili as your advisor, here's a thesis topic.  Create 
a new software company, get funding, keep it profitable for 4 years, 
win the Malcolm Baldrige award, and you walk out with Ph.D.'s in marketing
and software engineering.  8:-).

Jus' random thoughts....

GXKambic
standard sanity preserving disclaimer

jls@netcom.COM (Jim Showalter) (06/15/91)

cml@cs.umd.edu (Christopher Lott) writes:
>In article <1991Jun13.235937.24165@netcom.COM> jls@netcom.COM (Jim Showalter) writes:
>>I came up with automatic ways to measure the
>>number of changes to a particular unit and to tie those changes to a particular
>>bug fix, ways to track the error-proneness of particular units, ways to
>>compute both static and dynamic metrics of various kinds, and so forth,
>>and all of it possible with off-the-shelf technology I've been working
>>with for the past four years

>I for one am extremely interested in hearing more about your successes,
>especially if this is as easy as you make it seem to be.

Without its preceding context, the phrase "I came up with" in the first
sentence above sounds as if I'm claiming to have designed and implemented
all of the stuff listed thereafter. This is NOT how it read in the original
post--in the original post the phrase "I came up with" meant "in response
to a challenge to produce a list of useful metrics that could be computed
automatically, I CAME UP WITH a list that contained the following".

That cleared up, allow me to now respond to your specific questions.
The tools I'm about to describe were invented and implemented by very
clever people working at a company called Rational, with which I was
formerly but am no longer associated. The tools all work, they are a
positive boon to anybody trying to engineer large complex systems,
and they're commercially available...

>Let me pose some questions, hopefully more meat than gristle ;-)

>1.  how you define a change?  does fixing two bugs in a single module
>    constitute 1 or 2 changes?  Is it every word?  Or an editing session?
>    if this is collected automatically, I suspect you're using editing
>    sessions, because keystrokes would be very difficult to interpret.
>    so what if I run 17 editing sessions of the same document, and wind
>    up changing only comments 16 of 17 times?  how do you measure granularity
>    of the changes?

The system available from Rational stores units as a time-dependent series
of line differentials. Each check-out/check-in cycle constitutes a generation:
all changes (stored as line differentials) during a generation can be reverted
or advanced to any designated generation. Reports of all such changes can be
easily obtained from the database, and make for a dandy audit trail for people
into that sort of thing and/or for the basis of computing metrics. Check-out/
check-in cycles are bound to work orders, so that the set of changes to the
set of units required to fix a particular bug can be kept together, automating
bookkeeping chores and putting an army of clerks out on the street (good for
the bottom line, bad for the clerks--such is the march of progress). If desired,
multiple work orders can be tied to changes to the same unit, for cases in which
a change fixes (or contributes to the fixing of) more than one bug.

This greatly increases productivity (by getting rid of all those clerks) while
greatly improving accuracy. It provides direct, quantitative visibility into
the software development process. It is non-invasive from the standpoint of
the programmer, since it is simply all integrated into the editors and
library managers (to check something out you basically hit the check-out
key).
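
For readers without access to such a system, the bookkeeping it automates
looks roughly like the Python below.  The change-log format is invented for
illustration; it is not Rational's interface.

    # Each check-in records the unit, its new generation number, and the
    # work order it was done under.  From that log you can recover which
    # units were touched to fix which bug, and which units churn the most.
    # The log format is made up for this sketch.

    from collections import defaultdict

    check_ins = [
        {"unit": "parser.ada", "generation": 7, "work_order": "WO-112"},
        {"unit": "lexer.ada",  "generation": 3, "work_order": "WO-112"},
        {"unit": "parser.ada", "generation": 8, "work_order": "WO-118"},
    ]

    changes_per_work_order = defaultdict(list)
    changes_per_unit = defaultdict(int)

    for c in check_ins:
        changes_per_work_order[c["work_order"]].append((c["unit"], c["generation"]))
        changes_per_unit[c["unit"]] += 1

    print(dict(changes_per_work_order))   # changes grouped by bug fix
    print(dict(changes_per_unit))         # crude churn / error-proneness signal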

>2.  how do you count errors?  are errors == changes?  are you familiar with
>    IEEE standard notation for bugs?  It is:
>	error == human misconception
>	fault == error as manifested in a document (possibly code)
>	failure == visible manifestation of a fault (runtime, etc)
>	(note that faults don't always cause failures)

>    so is an error really a fault?  Do change forms (do you collect forms?)
>    distinguish faults from enhancements?  Do you count failures?

The system just described is flexible enough that you could support any
classification scheme you desire. Work orders can be tailored to contain
user-defined fields--there is no reason such fields could not be the IEEE
scheme you describe. Or, alternatively, one could group work orders relating
to errors on one list of work orders (work orders belong to work order lists),
those for faults on another list, etc. The system was deliberately designed
to accommodate an arbitrary methodology, so that any site could tailor it
to their particular view of how things should be done.

>3.  what sort of static and dynamic metrics do you compute?  what is a dynamic
>    metric, runtime performance?  profile data?  did you write tools to compute
>    these metrics?  

Static metrics can be computed quite easily, since Rational represents programs
(specifically, Ada programs) using an underlying representation called DIANA:
basically a tree of sufficient richness to completely and unambiguously capture
all of the static syntactic and semantic information of the program. This is
not a particularly radical notion--the p-code system was a similar idea--and
yet, for some mysterious reason, Rational is IT when it comes to doing things
this way. Everybody else, for reasons that completely elude me, chooses to
rebuild this sort of information as in-memory data structures during the run-
time execution of the compiler and then THROW IT AWAY. Rational not only keeps
it around so the system can do things like intelligent recompilation (e.g.
allow the incremental addition/modification/deletion of portions of code
from both specs and bodies without requiring batch "big bang" recompilation
of all clients [transitively] as would be required on other systems),
but, furthermore, provides programmatic access to this information
so that a toolsmith wanting to analyze static metrics can traverse the DIANA
tree and extract whatever information is desired. In short, whereas on most
systems doing code analysis essentially involves writing the front-end for
a compiler, on Rational's equipment it involves writing some basically trivial
applications atop a significant, preexisting abstraction.

Consider a VERY
simple example: someone wants to write a SLOC counter. How does one usually
do this? By searching for semi-colons. But then, of course, this isn't really
accurate because semi-colons could be embedded in strings, so a little more
complexity is needed in the parser, etc, and pretty soon you've had to write
a brain-damaged subset of a scanner/tokenizer/parser. On Rational's system,
you simply traverse the DIANA tree and count up all the nodes that are of
kind Statement. It takes about 10 lines. Considerably more complicated analysis
is possible--it is possible, for example, to flag all exceptions that will
be propagated out of named scope (yes, this can be done statically). Try
doing THAT without DIANA support! Note that there is absolutely no reason
why this same approach could not be used for any other language, particularly
the more modern software engineering oriented ones that have a well-thought-out
structure (e.g. C++, Eiffel, Modula-X), and I am aware of at least one effort
to do this very thing for C++, but, in general, the state of practice is to
default to the lowest common denominator--parsing ASCII source files.
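
To show how little code the tree-walking approach takes, here is a sketch
over a made-up tree in Python; the node kinds are hypothetical stand-ins, not
real DIANA node kinds, but the traversal is the whole trick:

# Sketch of the "count the Statement nodes" idea.  The tree and node kinds
# are hypothetical stand-ins, not the real DIANA representation; the point
# is that with a semantic tree the counter is a few lines of traversal
# rather than a cut-down parser.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    kind: str                  # e.g. "Statement", "Declaration", "Expression"
    children: List["Node"] = field(default_factory=list)

def count_kind(node, kind):
    """Count all nodes of the given kind anywhere in the tree."""
    total = 1 if node.kind == kind else 0
    return total + sum(count_kind(c, kind) for c in node.children)

# Toy program body: two top-level statements, one containing a nested one.
body = Node("Body", [
    Node("Declaration"),
    Node("Statement"),
    Node("Statement", [Node("Statement")]),
])
print(count_kind(body, "Statement"))   # -> 3; no string/semicolon pitfalls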

As for dynamic metrics, one uses, of course, performance and coverage analyzers.
There are also tools available that (using DIANA again) can analyze
a compilation unit and automatically generate for it a unit test.
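
For the flavor of a dynamic metric, here is a toy statement-coverage
calculation (purely illustrative; this is not how the commercial analyzers
work):

# Toy statement-coverage calculation: a dynamic metric computed from an
# execution trace.  Purely illustrative.

def coverage(all_statements, executed_trace):
    """Fraction of statements hit at least once during a run."""
    hit = set(executed_trace) & set(all_statements)
    return len(hit) / len(all_statements)

stmts = ["init", "loop_body", "error_branch", "cleanup"]
trace = ["init", "loop_body", "loop_body", "cleanup"]
print(coverage(stmts, trace))   # -> 0.75; the error branch was never hit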

>4.  do you see any interesting correlations between data from static analysis
>    and error(fault)(?)-pronenes of modules?

Depends considerably on which metrics one chooses to use. It has been my
experience that most static analysis performed by most tools is largely
syntactical, since without something like DIANA support that is the only
kind of analysis that is cheap enough to be worth doing. With full
access to all syntactic and semantic information, one can perform static
analysis of things that have a much stronger correlation with the robustness
of the code (one can, for example, identify unhandled exceptions, exits
from functions without return values, etc).
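
Here is a sketch of one such semantics-aware check, the function-without-a-
return case, again over a made-up tree with hypothetical node kinds. (It is
deliberately crude: it only asks whether ANY return exists, not whether
every path ends in one.)

# Sketch of a semantics-aware static check: flag functions that contain no
# return statement at all.  Node kinds are hypothetical stand-ins for what
# full DIANA-style access would provide.  Deliberately crude: it does not
# check that every path returns, only that some return exists.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    kind: str
    name: str = ""
    children: List["Node"] = field(default_factory=list)

def contains_return(node):
    return node.kind == "Return" or any(contains_return(c)
                                        for c in node.children)

def functions_missing_return(node):
    """Names of Function nodes with no Return anywhere in their body."""
    bad = []
    if node.kind == "Function" and not contains_return(node):
        bad.append(node.name)
    for c in node.children:
        bad.extend(functions_missing_return(c))
    return bad

unit = Node("Body", children=[
    Node("Function", "area", [Node("Statement"), Node("Return")]),
    Node("Function", "perimeter", [Node("Statement")]),
])
print(functions_missing_return(unit))   # -> ['perimeter']
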
-- 
*** LIMITLESS SOFTWARE, Inc: Jim Showalter, jls@netcom.com, (408) 243-0630 ****
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *

dc@sci.UUCP (D. C. Sessions) (06/17/91)

  There has been some discussion in this thread of the inappropriate use
  of metrics to evaluate individual contributors.  There seems to be a
  misconception that the primary risk lies in using metric-program 
  data to evaluate individuals; would that this were so.  There is, 
  unfortunately, a serious risk in the process itself, as used to 
  improve S/W development procedures, as illustrated:

  At a large and *very* well-regarded institution, the metrics program 
  had determined that successful projects devoted A % of applied time 
  to initial specification, B % to initial design, C % to coding, D % 
  to debugging, etc.  Having determined this, individuals were 
  expected to conform to the profile.  One misfit, for whatever 
  reason, spent more time on the initial phases and (perhaps as a 
  result) converged more rapidly on working product.  (Maybe the 
  relatively short test programs had something to do with the fact 
  that his products used DFT principles, or perhaps with the fact that 
  he spent more time on the test plan -- who knows?)

  As with any nonconformist, his misguided notions eventually resulted 
  in his downfall.  When evaluation time came around, he was upbraided 
  for insufficient attention to debug & test.  The word got around, 
  and other developers who had been thinking of trying to match his 
  consistent on-time low bugcounts rapidly disabused themselves of the 
  notion.

  (Here's another programmer legend in the making!)
-- 
| The above opinions may not be original, but they are mine and mine alone. |
|            "While it may not be for you to complete the task,             |
|                 neither are you free to refrain from it."                 |
+-=-=-    (I wish this _was_ original!)        D. C. Sessions          -=-=-+

jgautier@vangogh.ads.com (Jorge Gautier) (06/19/91)

In article <1205@mgt3.sci.UUCP> dc@sci.UUCP (D. C. Sessions) writes:
>   At a large and *very* well-regarded institution, the metrics program 
>  had determined that successful projects devoted A % of applied time 
>  to initial specification, B % to initial design, C % to coding, D % 
>  to debugging, etc.  Having determined this, individuals were 
>  expected to conform to the profile.  One misfit, for whatever 
>  reason, spent more time on the initial phases and (perhaps as a 
>  result) converged more rapidly on working product.  (Maybe the 
>  relatively short test programs had something to do with the fact 
>  that his products used DFT principles, or perhaps with the fact that 
>  he spent more time on the test plan -- who knows?)
>
>   As with any nonconformist, his misguided notions eventually resulted 
>  in his downfall.  When evaluation time came around, he was upbraided 
>  for insufficient attention to debug & test.  The word got around, 
>  and other developers who had been thinking of trying to match his 
>  consistent on-time low bugcounts rapidly disabused themselves of the 
>  notion.

As I have noted previously in this group, metrics and correlations do
not fully support decision making in software development.  Inductive
reasoning is not truth preserving, it merely points out possible
truths.  You need some kind of deduction to justify conclusions
suggested by induction.  For example, simple association of success or
failure with measurements is not enough.  At some point you need to find
the real cause of the success or failure.  This is of course much more
difficult than measurement and correlation, so it is rarely done, and
we all pay the price.

Some people are blinded and falsely reassured by metrics.  They read a
few articles and books and they think that metrics is the answer to
everything in software development.  They come up with a few
correlations and all of a sudden they think they know what's good and
bad in software development.  They ignore significant but un-metricized
phenomena in favor of measurable but insignificant ones.  And
(unfortunately) sometimes they make decisions about software
development and software developers.  Sad but true.
--
Jorge A. Gautier| "The enemy is at the gate.  And the enemy is the human mind
jgautier@ads.com|  itself--or lack of it--on this planet."  -General Boy
DISCLAIMER: All statements in this message are false.
"Mommy, where do programs come from?"

ivanp@apertus.mn.org (Ivan Peters) (06/20/91)

In article <1991Jun13.174427.4861@iccgcc.decnet.ab.com>, kambic@iccgcc.decnet.ab.com (George X. Kambic, Allen-Bradley Inc.) writes:

|> How are software people different from many other people in the world who
|> are evaluated on quantitative metrics (sales, no. produced, etc.)?  Please
|> elucidate.

Just an attempt at this question, from my perspective.  I shall restate the
question somewhat: what are the job characteristics of a good software
engineer in general?  I hope this is helpful.

Many people consider computers an extension of one's mind...

I tend to think that the main characteristics for software engineering are
a good reasoning process and good organizational skills.  (Some knowledge
and training concerning computers is also quite helpful :-) )

These skills are also important in other jobs, but they seem to be core
to this profession, while in other professions the people with the better
skills stand out for other reasons.  Other professions tend to have fewer
things that can vary on an average problem, while in software just about
everything is a "variable", and the more variables, the more complex the
problem and the more one needs good reasoning and organization.  Training
is supposed to enhance/improve these areas, but some people have these
skills naturally better than others.

In this profession, where these skills are (assumed to be) core, the
expression of one's code is where these skills show, but many times this
expression is not possible due to maintaining/cleaning up someone else's
work/mess, and/or time constraints forced upon one, or interfacing with
others on the project, some of whom have more seniority or political clout
(so, right or wrong, you do things their way).  I am sure there are others.

Another observation in this mental (take that word either way :-) process
is that some people may be able to produce code very well on a given
task, better than many others (probably because their reasoning fits that
problem better), but are unable to change/adapt to other, more complicated,
or even just different problems.  Example:

                Medium Task AA  Difficult Task BB
programmer A:      2 weeks       4 months
programmer B:      3 weeks       6 weeks
programmer C:      1 week        unable

If the majority of tasks at one's company are similar to AA, which
programmer is more productive/better?  What if the task load
changes?  Is measurement unique (to some extent) to a company?
I have seen this variation occur many times.
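
To make the "it depends on the task mix" point concrete, here is a toy
calculation in Python using the made-up numbers from the table above
("4 months" taken as roughly 16 weeks, "unable" treated as infinite):

# Toy throughput calculation under different task mixes, using the made-up
# numbers from the table above ("4 months" taken as roughly 16 weeks;
# "unable" treated as infinite).  Purely illustrative.

durations = {                        # weeks per task
    "A": {"AA": 2, "BB": 16},
    "B": {"AA": 3, "BB": 6},
    "C": {"AA": 1, "BB": float("inf")},
}

def avg_weeks(programmer, mix):
    """Expected weeks per task under a mix such as {'AA': 0.5, 'BB': 0.5}."""
    return sum(share * durations[programmer][task]
               for task, share in mix.items())

all_aa  = {"AA": 1.0}
half_bb = {"AA": 0.5, "BB": 0.5}
for p in "ABC":
    print(p, avg_weeks(p, all_aa), avg_weeks(p, half_bb))
# All-AA work: C is fastest (1 week/task).  Half-BB work: B is fastest
# (4.5 weeks/task) and C cannot cope at all.  Which programmer is "more
# productive" depends entirely on the mix.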

Another factor that becomes involved at this point is maintenance, which
is many times done by other people, which makes it hard to relate/measure
it to the initial completion of a task.

The input from whoever maintains the software is also a metric, and many
times a main one (if more time is spent on maintenance than on development),
and it should include not only the number of problems but also how difficult
a problem was to fix due to the code's expression/design versus the
complexity of the problem itself.

Aside: if metrics for this profession involve reasoning and organization,
isn't this starting to sound like an IQ test?  And one knows how
controversial those are. :-)

Question: Does someone who is good at designing have better
   organizational skills?  I have seen this in general, but
   there have been some exceptions.  Exceptions invalidate
   a metric.  (So the metric/statement must not be complete?)

Just my 2 cents' worth (2 cents doesn't buy much anymore...).  :-)
Ivan