emuleomo@paul.rutgers.edu (Emuleomo) (11/19/89)
I heard that the average programmer produces 3-4 lines of *finished* code a day! This sounds ridiculously low. Does anybody out there know what the real figure is? Or is it misleading to try and gauge productivity this way? If it is, what are the recommended ways to measure programmer productivity using some sort of metrics! Any hints will be appreciated via email or otherwise. Thanx --Emuleomo O.O. (emuleomo@yes.rutgers.edu) -- ** The ONLY thing we learn from history is that we don't learn from history!
sccowan@watmsg.waterloo.edu (S. Crispin Cowan) (11/20/89)
In article <Nov.18.22.47.26.1989.9685@paul.rutgers.edu> emuleomo@paul.rutgers.edu (Emuleomo) writes: >I heard that the average programmer produces 3-4 lines of *finished* >code a day! >This sounds ridiculously low. Does anybody out there know what the real >figure is? The key words here are _average_ and _finished_. Yes, a good programmer can barf out 1000 lines of code in a day, if it's simple, but it isn't always simple. The claim is that for _average_ problems and _average_ programmers, it will then take 250 days to fully debug and document that code. Also keep in mind that average includes a LOT of DP programmers/problems, which means that those 1000 lines of code have to interface to 1,000,000 lines of OTHER people's code, and get it right. >--Emuleomo O.O. (emuleomo@yes.rutgers.edu) ---------------------------------------------------------------------- Login name: sccowan In real life: S. Crispin Cowan Office: DC3548 x3934 Home phone: 570-2517 Post Awful: 60 Overlea Drive, Kitchener, N2M 1T1 UUCP: watmath!watmsg!sccowan Domain: sccowan@watmsg.waterloo.edu "We have to keep pushing the pendulum so that it doesn't get stuck in the extremes--only the middle is worth having." Orwell, Videobanned -- Kim Kofmel
madd@world.std.com (jim frost) (11/21/89)
emuleomo@paul.rutgers.edu (Emuleomo) writes: >I heard that the average programmer produces 3-4 lines of *finished* >code a day! >This sounds ridiculously low. Does anybody out there know what the real >figure is? Or is it misleading to try and gauge productivity this way? I got the following information from Brooks' _Mythical Man Month_, pp. 90-94. This table was originally from John Harr of Bell Telephone Laboratories:

Task        | Prog Units | # prgmrs | Yrs  | Man-yrs | Program Words | Words/man-yr
------------+------------+----------+------+---------+---------------+-------------
Operational |     50     |    83    | 4    |   101   |    52,000     |      515
Maintenance |     36     |    60    | 4    |    81   |    51,000     |      630
Compiler    |     13     |     9    | 2.25 |    17   |    38,000     |     2230
Translator  |     15     |    13    | 2.5  |    11   |    25,000     |     2270

There are a couple of points which Brooks brings up with regards to this (and some other) data. First, programmer productivity drops quickly as the number of people that a programmer must communicate with increases. This follows the above data exactly. Second, productivity depends on the complexity of the task. The first two tasks above are control programs, the last two are translators. All four are of similar size, yet control program productivity averaged 500-600 words/man-year, while translators were over 2,200. Brooks says "[several examples of productivity] data all confirm striking differences in productivity related to the complexity and difficulty of the task itself". It is my belief that both points -- group size and code complexity -- are reflected in the above data, although Brooks didn't look at this particular set of data that way. It would have been interesting to see what smaller groups could have done in the control programming tasks. Other data supplied by Brooks: IBM OS/360 showed 600-800 debugged instructions per man-year in control program groups, and 2000-3000 instructions per man-year in translator groups. These numbers are virtually identical to the above.
MULTICS had a mean of 1200 PL/I statements per man-year, which is right about in the middle of both of the above data sets. Somebody is likely to note that "program words" doesn't correspond too well with "program lines". Brooks says:

"[The MULTICS] number is *lines* per man-year, not *words*! Each statement in [the MULTICS] system corresponds to about three to five words of handwritten code! This suggests two important conclusions.

  o Productivity seems constant in terms of elementary statements, a conclusion that is reasonable in terms of the thought a statement requires and the errors it may include.(1)

  o Programming productivity may be increased as much as five times when a suitable high-level language is used.(2)"

The footnotes are:

(1) W.M. Taliaffero also reports a constant productivity of 2400 statements/year in assembler, Fortran, and Cobol. See "Modularity. The key to system growth potential," _Software_, 1, 3 (July 1971) pp. 245-257.

(2) E.A. Nelson's System Development Corp. Report TM-3225, _Management Handbook for the Estimation of Computer Programming Costs_, shows a 3-to-1 productivity improvement for high-level language (pp. 66-67), although his standard deviations are wide.

>what are the recommended ways to measure programmer productivity >using some sort of metrics! I recommend measuring productivity by whether or not the product works. It's the only measure which is always correct. Almost every other measure is easy to fake or won't give you a good idea of how much effort actually went into the result. In conclusion, 3-4 lines per day sounds about right. I suspect that small groups of people, such as those found in small software companies, do substantially better than that. This could be the result of paying much more attention to programmer ability than a large company working on a large project could do.
In support of this theory, a highly-skilled three-man team I was working with wrote 27,000 lines of code in ten months, which is roughly 2.5 man-years, for an average of 10,800 lines per man-year or about 29 lines per man-day of fairly complex code. This code was not completely debugged and I considered the programming rate to be exceptionally high (read: we were killing ourselves to try to make a deadline). Fully debugged would likely have been closer to 15 lines per man-day. I consider these to be pretty much the highest numbers I'll ever see, the result of a close-knit group of skilled people, something which rarely happens in large industry. Hope this is useful, jim frost software tool & die madd@std.com
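The arithmetic behind these figures is easy to check. What follows is a minimal Python sketch and not anything from Brooks or from the post: the names `HARR_TABLE`, `rate_per_year`, and `DAYS_PER_YEAR` are made up for illustration, and the choice of 365 calendar days per man-year is an assumption (it is the one under which 10,800 lines per man-year comes out near 29 lines per man-day).

```python
# Figures quoted from the Harr table in the post:
# task -> (man-years, program words, reported words per man-year).
HARR_TABLE = {
    "Operational": (101, 52_000, 515),
    "Maintenance": (81, 51_000, 630),
    "Compiler": (17, 38_000, 2230),
    "Translator": (11, 25_000, 2270),
}

DAYS_PER_YEAR = 365  # assumption: calendar days, not working days


def rate_per_year(units: float, man_years: float) -> float:
    """Output per man-year: total units divided by total effort."""
    return units / man_years


# Recompute the last column of the table from the two before it.
for task, (man_years, words, reported) in HARR_TABLE.items():
    computed = rate_per_year(words, man_years)
    print(f"{task:12s} computed {computed:7.0f}  reported {reported}")

# The three-man team: 27,000 lines in ten months.
team_man_years = 3 * 10 / 12
team_rate = rate_per_year(27_000, team_man_years)
print(f"team: {team_rate:.0f} lines/man-year, "
      f"{team_rate / DAYS_PER_YEAR:.1f} lines/man-day")
```

With 250 working days per year instead, the same team rate would be about 43 lines per day, so any per-day figure depends on which calendar is assumed.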
chrisp@regenmeister.uucp (Chris Prael) (11/21/89)
From article <Nov.18.22.47.26.1989.9685@paul.rutgers.edu>, by emuleomo@paul.rutgers.edu (Emuleomo): > I heard that the average programmer produces 3-4 lines of *finished* > code a day! > This sounds ridiculously low. Does anybody out there know what the real > figure is? Or is it misleading to try and gauge productivity this way? > If it is, what are the recommended ways to measure programmer productivity > using some sort of metrics! Are there metrics to measure the productivity of electronic engineers? No? Then how about measuring the productivity of mechanical engineers? Another blank? Perhaps they measure the productivity of civil engineers some way? No again? Perhaps a pattern can be seen here. The real question here is: can you find a functional definition of programmer productivity? I submit that you cannot. Certainly, lines of finished code fails the test of being meaningful quite thoroughly! What is the point of trying to equate a programmer with a manufacturing robot? > ** The ONLY thing we learn from history is that we don't learn from history! Speak for yourself! Chris Prael
baalke@mars.jpl.nasa.gov (Ron Baalke) (11/21/89)
>From article <Nov.18.22.47.26.1989.9685@paul.rutgers.edu>, by emuleomo@paul.rutgers.edu (Emuleomo): >> I heard that the average programmer produces 3-4 lines of *finished* >> code a day! >> This sounds ridiculously low. Does anybody out there know what the real >> figure is? Or is it misleading to try and gauge productivity this way? >> If it is, what are the recommended ways to measure programmer productivity >> using some sort of metrics! Well, it is low if you count only the coding phase of the software development. The coding phase takes up only 10% of the entire software cycle. If you include the time to write the requirements, design, code, parameter & assembly test, integration and final testing, then the lines/day that a programmer produces is not low. Ron Baalke | baalke@mars.jpl.nasa.gov Jet Propulsion Lab M/S 301-355 | baalke@jems.jpl.nasa.gov 4800 Oak Grove Dr. | Pasadena, CA 91109 |
hollombe@ttidca.TTI.COM (The Polymath) (11/21/89)
In article <Nov.18.22.47.26.1989.9685@paul.rutgers.edu> emuleomo@paul.rutgers.edu (Emuleomo) writes: }I heard that the average programmer produces 3-4 lines of *finished* }code a day! }This sounds ridiculously low. Does anybody out there know what the real }figure is? Or is it misleading to try and gauge productivity this way? It is a bit misleading unless you specify the language. On the other hand, the figure isn't unreasonable. Sure, I can crank out a few hundred lines on a good day with a lot of borrowing, BUT that isn't "finished" code. It still has to be debugged, tested, certified and documented. And _that's_ assuming the code is for in-house or personal use. If it's for use by relatively naive customers I'll have to spend a lot of time researching and designing the user interface before I even _begin_ to write the code to run it -- not to mention all the time spent designing the underlying data structures and algorithms. Then there's the days when the system crashes or the test facilities aren't available and I catch up on journals and net news. (-: When you put it that way, 4 lines per day is pretty good "average" coding. -- The Polymath (aka: Jerry Hollombe, hollombe@ttidca.tti.com) Illegitimis non Citicorp(+)TTI Carborundum 3100 Ocean Park Blvd. (213) 452-9191, x2483 Santa Monica, CA 90405 {csun|philabs|psivax}!ttidca!hollombe
dopey%looney@Sun.COM (Can't ya tell by the name) (11/21/89)
From article <Nov.18.22.47.26.1989.9685@paul.rutgers.edu>, by emuleomo@paul.rutgers.edu (Emuleomo): > I heard that the average programmer produces 3-4 lines of *finished* > code a day! > This sounds ridiculously low. Does anybody out there know what the real > figure is? Or is it misleading to try and gauge productivity this way? > If it is, what are the recommended ways to measure programmer productivity > using some sort of metrics! I cannot tell you what the average is, but I can say that whatever value you measure will grow. Human nature being what it is, people maximize the value they are measured by. So if you want me to code 1000 lines of code a day I can (though they may be do-nothing lines). But if you want me to develop well-structured, robust code it will take longer AND BE FEWER LINES OF CODE. So how does measuring lines of code relate to programmer productivity? If I were measuring programmer productivity (off the top of my head, so don't get too excited if I miss something obvious) I would set strict standards that deal with well written programs (e.g. IMHO a. few if any globals, b. one routine per file, c. well documented, etc.). Then I would use CodeCops or Graders or whatever you want to call them to see that the standards were followed and justify those times that they were not followed (there will ALWAYS be some). These CodeCops can then measure how well they can read the code, the number and severity of the bugs found in the code, how well the code stands up to user abuse, etc. Just the type of things that other engineers (mechanical, electrical etc.) are measured by. This method is more time consuming but it "MAY" measure that which you REALLY WANT to measure and not a programmer's typing speed. This is one man's opinion.
jim@mks.com (Jim Gardner) (11/21/89)
In article <1989Nov20.170957.19588@world.std.com> madd@world.std.com (jim frost) writes: > >I got the following information from Brooks' _Mythical Man Month_, pp. >90-94. This table was originally from John Harr of Bell Telephone >Laboratories: I don't want to disparage Mr. Frost or the eminent Mr. Brooks, but The Mythical Man-Month is written about an entirely different age of computing (pre-1975). I recently read the book and was surprised at just how much has changed. For example, Mr. Brooks predicted that programmers would some day move from developing under batch to working extensively with time-sharing systems, and he thought this was a good idea. On the other hand, he deplored the fact that OS/360 wasted 26 whole bytes just to handle leap years correctly; he said that sort of thing could be better left up to the operator. Times have changed, programming techniques have changed, and our ideas of software quality have changed. Many of Brooks' insights remain valid, of course, but things like actual numbers have to be regarded as outdated. (It's possible that the numbers haven't changed, even though the programming environment is utterly different. However, we need more modern studies to see if that's true.) Jim Gardner, Mortice Kern Systems
crm@romeo.cs.duke.edu (Charlie Martin) (11/23/89)
In article <34796@regenmeister.uucp> chrisp@regenmeister.uucp (Chris Prael) writes: >From article <Nov.18.22.47.26.1989.9685@paul.rutgers.edu>, by emuleomo@paul.rutgers.edu (Emuleomo): >> I heard that the average programmer produces 3-4 lines of *finished* >> code a day! Most places now get closer to 10 SLOC a day. > >Are there metrics to measure the productivity of electronic engineers? >No? Sure there are. >Then how about measuring the productivity of mechanical engineers? >Another blank? Nope. Happens all the time. >Perhaps they measure the productivity of civil engineers >some way? No again? Sorry, but civil engineers as well. >Perhaps a pattern can be seen here. Yes, I do think you're right: there is a pattern. It obviously isn't the one that you expect. I'm a little bit sorry to take this approach, but I think you'll find that there is some kind of way to measure the productivity, and to estimate based on that productivity, in every engineering field and many other fields as well (for example, the common rule of thumb for technical writing is one staff day per completed page of camera-ready text.) What's more, I'm only a little bit sorry because I think you have failed the back-of-the-envelope first-principles test, which is one that every engineer ought to consider. If there *were* no way to measure, estimate, and predict the productivity of engineers, what would happen? There would be, for example, no way to say "we'll start designing this building on Monday, and we'll expect to review the designs with the customer in 16 weeks; if that goes okay, then we'll plan for final review with the contractor in 30 weeks, and construction can start on week 40." It *is* true that many other engineering areas have difficulty making very precise estimates, or measuring progress precisely. ("Good news, dear! Today my staff engineered 0.000301 percent of the new bridge. 
That's up 0.000007 from yesterday!") To that extent, there is some justification in your statement, and your comparison. What does it mean to say that 90% of the system is done? (Another 90% of the work left, of course....) But to claim that other engineering fields have no measures of estimating techniques, and (by implication) that it is foolish for software engineering to look for suitable ones, is probably suitable for fertilizer after suitable composting. > >The real question here is: can you find a functional definition of >programmer productivity? I submit that you cannot. What is your claim here? That programmers produce nothing, and therefore it isn't measurable? I can offer many functional definitions under an appropriate definition of the word "functional":

 - number of lines of code
 - number of machine-level instructions
 - number of pages of finished product
 - total cost divided by number of staff days

and for maintenance,

 - number of problem reports solved per staff day
 - progress of quality measures toward a goal, per staff day of effort

These measures have theoretical problems (what is a line of code? a page of product?) but they have all proven useful in practice, and all have some predictive value. >Certainly, lines of >finished code fails the test of being meaningful quite thoroughly! What >is the point of trying to equate a programmer with a manufacturing robot? > What is the point of trying to measure the efforts at all? What does "meaningful" mean here? Empirically, lines of code work reasonably well in practice, although this usually requires a lot of tuning of the weighting factors; lines of code estimates and measures also correlate quite highly with other "better founded" measures, when those measures correlate with observed performance. Or is there so little of science in software engineering that it is impossible to measure its effect at all? Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
schultz@cell.mot.COM (Rob Schultz) (11/23/89)
emuleomo@paul.rutgers.edu (Emuleomo) writes: >I heard that the average programmer produces 3-4 lines of *finished* >code a day! >This sounds ridiculously low. Does anybody out there know what the real >figure is? Or is it misleading to try and gauge productivity this way? >If it is, what are the recommended ways to measure programmer productivity >using some sort of metrics! >Any hints will be appreciated via email or otherwise. >Thanx Our developers spend as much as 6 or 8 months working on functional specs, analysis specs, interface specs, and design specs (HLD and LLD). Then the developer may spend 2 to 3 weeks coding, another week or 2 unit testing, and then spend more time in integration. During the actual coding phase, a developer may crank out 5 or 10 modules of perhaps 30 to 60 NCSL's per day. But averaged over the entire development cycle, yeah, 3 - 4 lines of finished code per day is not unreasonable. Given this, LOC per day is not really a good metric to measure productivity with. However, one could easily use it during the strict coding phase (taking into consideration such things as quality of design, etc). During other phases of development one would have to measure productivity in other ways. I don't know how at this point :-) but then I haven't seen any research on that point. Hope this helps . . . -- Thanks - Rob Schultz, Motorola General Systems Group rms 1501 W Shure Dr, Arlington Heights, IL 60004 708 / 632 - 2875 schultz@mot.cell.COM !uunet!motcid!schultz "There is no innocence - only pain." (Usual disclaimers)
chrisp@regenmeister.uucp (Chris Prael) (11/23/89)
From article <16170@duke.cs.duke.edu>, by crm@romeo.cs.duke.edu (Charlie Martin): > In article <34796@regenmeister.uucp> chrisp@regenmeister.uucp (Chris Prael) writes: >>Are there metrics to measure the productivity of electronic engineers? >>No? > Sure there are. >>Then how about measuring the productivity of mechanical engineers? >>Another blank? > Nope. Happens all the time. >>Perhaps they measure the productivity of civil engineers >>some way? No again? > Sorry, but civil engineers as well. >>Perhaps a pattern can be seen here. > Yes, I do think you're right: there is a pattern. It obviously isn't > the one that you expect. Perhaps I oversimplified. I was referring to objective measures commonly used by competent professionals. I was not referring to hypotheses that have never managed to escape academia. Sorry to have confused you. The term metric is generally taken to mean a number arrived at by a completely "objective" and simple algorithm. I believe lines of finished code per day were mentioned in the posting to which I responded. > I'm a little bit sorry to take this approach, but I think you'll find > that there is some kind of way to measure the productivity, and to > estimate based on that productivity, in every engineering field and > many other fields as well First, I did not say, nor imply, that there was no way to estimate a project. There is one and it is commonly used in each of the fields I listed as well as in software engineering. Estimation of engineering projects is a complex process that is practiced effectively by only a small percentage of the more senior professionals in most engineering fields. > (for example, the common rule of thumb for > technical writing is one staff day per completed page of camera-ready > text.) Simple tools for simple users, eh? 
> But to claim that other engineering fields have no measures of > estimating techniques, and (by implication) that it is foolish for > software engineering to look for suitable ones, is probably suitable for > fertilizer after suitable composting. I claimed, and still do, that other fields of engineering do not attempt to fool themselves with simple-minded numbers. Every project lead and manager "measures" the productivity of the staff working with him/her. That measure is one's self. One estimates how long a task would have taken one to do, adjusts for assumed relative competence, and compares that to what has actually happened. I gave some thought to responding to the rest of your posting. I finally remembered that one should not kick puppies or bash kids. Get out in the real world and get yourself some experience. In five to ten years, if you find a good mentor, are fairly bright, and work hard, you will start to have some knowledge of what you are talking about. Good luck. Chris Prael
crm@romeo.cs.duke.edu (Charlie Martin) (11/25/89)
In article <34819@regenmeister.uucp> chrisp@regenmeister.uucp (Chris Prael) writes: >From article <16170@duke.cs.duke.edu>, by crm@romeo.cs.duke.edu (Charlie Martin): >> In article <34796@regenmeister.uucp> chrisp@regenmeister.uucp (Chris Prael) writes: >>>Are there metrics to measure the productivity of electronic engineers? >>>No? >> Sure there are. >>>Then how about measuring the productivity of mechanical engineers? >>>Another blank? >> Nope. Happens all the time. >>>Perhaps they measure the productivity of civil engineers >>>some way? No again? >> Sorry, but civil engineers as well. >>>Perhaps a pattern can be seen here. >> Yes, I do think you're right: there is a pattern. It obviously isn't >> the one that you expect. > >Perhaps I over simplified. I was referring to objective measures >commonly used by competant professionals. I was not referring to >hypotheses that have never managed to escape academia. Sorry to have >confused you. > Chris, do you have any, like, *evidence* that what you are saying is true? Which civil engineers have you quizzed? What books on management of design have you read? If there is some evidence to what you are saying I'd like to hear it. But empty assertions and sneers aren't argument, nor do I count your assertions as being more meaningful than mine because you currently work at a company while I currently attend college. On the side, I'll bet 58 cents I was programming for a living before you graduated from high school. >The term metric is generally taken to mean a number arrived at by a >completely "objective" and simple algorithm. I believe lines of >finished code per day were mentioned in the posting to which I >responded. I can't *imagine* a more objective and simple algorithm than that used to count lines of code in most places I've worked. "Number of non-blank lines" was common. "Number of non-blank, non-comment lines" was also common. 
It fulfils all the mathematical definitions of a measure, I believe (forgive me that I don't dig up Halmos's book to check.) Yes, the algorithm can be circumvented: someone can write a C statement in a form like x = \n 3 \n + \n 4 \n ; and in theory get 5 lines credit for a simple statement. But -- you work for Sun, you tell me -- would you get away with that kind of crap in your code, independent of the question of productivity measures? I sure hope not. When my friends who had single-digit badge numbers at Sun worked there, you certainly could not. The source line of code measure is used precisely BECAUSE it is simple and objective (see e.g. _Software Engineering Metrics and Models_, Conte et al., or _Software Engineering Economics_, Boehm.) One reason it is objective is that it presumes that the source code being counted fits other constraints on its form that constrain away pathological cases like the above. This is the same reason that the "unroll the loop" argument won't work: if you unroll 'for(i=1;i<10000;i++)' into 10,000 separate statements, you are not going to make friends with your manager. > >First, I did not say, nor imply, that there was no way to estimate a >project. There is one and it is commonly used in each of the fields I >listed as well as in software engineering. Estimation of engineering >projects is a complex process that is practiced effectively by only a >small percentage of the more senior professionals in most engineering >fields. > Sorry, but that appears to be exactly what you did imply. Estimating a project is not something that only a small percentage of the most senior professionals do; even a first-line manager or a senior programmer does it in software engineering. There is indeed a way to estimate a project; in most fields including software engineering, it comes down to something like "how many things (SLOC, drawings, components) will it take to build one of these? How long does it take us to do a thing?" 
Thing counting and empirical models based on previous experience are precisely the way other fields do it; it's the way it is done in practice in software engineering. Source lines of code has been a useful measure BECAUSE within standards it works; empirical models based on SLOC fit predictions nicely to production, AND turn out to correlate strongly with every other model that is also predictive. (This really isn't much of a surprise if you think about it.) These empirical models of productivity seem to be what you claim is not well founded or does not work. Why then are they predictive? >I claimed, and still do, that other fields of engineering do not attempt >to fool themselves with simple minded numbers. Every project lead and >manager "measures" the productivity of the staff working with him/her. >That measure is one's self. One estimates how long a task would have >taken one to do, adjusts for assumed relative competance, and compares >that to what has actually happened. And I clain, and still do, that only fools think that SLOC is a simple minded number without empirical support. Are you a competent software engineer? Do you know what the state of practice in software engineering is? Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
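The two counting rules mentioned in this post ("non-blank lines" and "non-blank, non-comment lines") can be sketched in a few lines of Python. This is an illustration only, not the counter used at any of the sites mentioned: the function name `count_sloc` is made up, C-style comments are assumed, a line is treated as a comment only when nothing but comment text appears on it, and string literals or a block comment opened after code on the same line are not handled.

```python
def count_sloc(source: str) -> dict:
    """Count lines two ways: non-blank, and non-blank non-comment.

    Assumes C-style comments (// and /* ... */). This is a rough sketch:
    it ignores string literals and comments that open after code.
    """
    non_blank = 0
    code = 0
    in_block = False  # True while inside an unterminated /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count under neither rule
        non_blank += 1
        if in_block:
            if "*/" not in line:
                continue  # the whole line is inside the comment
            in_block = False
            line = line.split("*/", 1)[1].strip()  # keep text after the comment
        elif line.startswith("/*"):
            if "*/" not in line:
                in_block = True
                continue
            line = line.split("*/", 1)[1].strip()
        if line and not line.startswith("//"):
            code += 1
    return {"non_blank": non_blank, "non_blank_non_comment": code}


ONE_LINE = "x = 3 + 4;\n"
SPLIT_UP = "x =\n3\n+\n4\n;\n"  # the pathological layout from the post

print(count_sloc(ONE_LINE)["non_blank"])  # 1
print(count_sloc(SPLIT_UP)["non_blank"])  # 5
```

Both rules are mechanical, which is the sense of "objective" being argued for; the same statement scoring 1 or 5 shows why the count only means something when a layout standard is enforced alongside it.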
sccowan@watmsg.waterloo.edu (S. Crispin Cowan) (11/27/89)
In article <16186@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes: >Why then are they predictive? In the software engineering course that I took, I was given to understand that: -SLOC is about as accurate as any other measure devised -SLOC is accurate to within a factor of 2 to 4 (depending on the application domain). Big deal, so is tummy rubbing (i.e. expert opinion based on experience). Yes, there is a lot of good work already done in software engineering, and a lot of good work yet to be done. We know lots of things, most of them relating to how NOT to do it. Software engineering research has not yet succeeded in making programming an engineering field, it's still an art. We WANT it to be engineering, life would be easier if it were engineering, but I don't believe that it's so, yet. >Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm) ---------------------------------------------------------------------- Login name: sccowan In real life: S. Crispin Cowan Office: DC3548 x3934 Home phone: 570-2517 Post Awful: 60 Overlea Drive, Kitchener, N2M 1T1 UUCP: watmath!watmsg!sccowan Domain: sccowan@watmsg.waterloo.edu "We have to keep pushing the pendulum so that it doesn't get stuck in the extremes--only the middle is worth having." Orwell, Videobanned -- Kim Kofmel
warren@psueea.uucp (Warren Harrison) (11/29/89)
>From article <Nov.18.22.47.26.1989.9685@paul.rutgers.edu>, by emuleomo@paul.rutgers.edu (Emuleomo): >> I heard that the average programmer produces 3-4 lines of *finished* >> code a day! >> This sounds ridiculously low. Does anybody out there know what the real >> figure is? Or is it misleading to try and gauge productivity this way? >> If it is, what are the recommended ways to measure programmer productivity >> using some sort of metrics! > Typically, these figures are based on person-days of effort for the entire project. So if you have a 1,000,000 line project and you spend 1,000 person years on it you come up with about 3 lines per person day. However, those 1,000 person years usually include the effort to do requirements, specs, design, testing, etc., as well as programming. I would personally suspect anyone who claims to have detailed enough information on a large enough sample of software to break down how much was spent "programming" (what exactly *is* programming anyway? inspections? low-level design? thinking? hmmm???) in a meaningful manner. Warren Warren Harrison CSNET: warren@cs.pdx.edu Department of Computer Science UUCP: {ucbvax,decvax}!tektronix!psueea!warren Portland State University Internet: warren%cs.pdx.edu@relay.cs.net Portland, OR 97207-0751
crm@romeo.cs.duke.edu (Charlie Martin) (11/29/89)
In article <31986@watmath.waterloo.edu> sccowan@watmsg.waterloo.edu (S. Crispin Cowan) writes: >In article <16186@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes: >>Why then are they predictive? >In the software engineering course that I took, I was given to >understand that: > -SLOC is about as accurate as any other measure devised > -SLOC is accurate to within a factor of 2 to 4 (depending on > the application domain). Big deal, so is tummy rubbing (i.e. > expert opinion based on experience). Good point: SLOC is about as good as anything, and SLOC isn't particularly good. My suspicion is that we haven't figured out effectively enough how to standardize our measures for the psychological side (understanding of and complexity of the specifications.) I do think that most people who use an empirically-based weighted model, e.g. COCOMO with weights derived statistically within the organization, find far better accuracy than within a factor of 2 to 4. My wife manages maintenance on about 1.2 megaSLOC and her model is within about 15 percent. But that's just been my experience; I don't have a good study to quote to you. > >.... Software engineering research >has not yet succeeded in making programming an engineering field, it's >still an art. We WANT it to be engineering, life would be easier if >it were engineering, but I don't believe that it's so, yet. I agree. Fortunately so, because otherwise I might not have a research area :-) Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
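For readers who have not seen it, Basic COCOMO (from Boehm's _Software Engineering Economics_) estimates development effort in person-months as a * KSLOC^b. The Python sketch below uses Boehm's published Basic-model coefficients. The `calibrate_a` helper is a hypothetical illustration of what "weights derived statistically within the organization" could look like in its simplest form (fitting only the multiplier, in log space, with the exponent held fixed); it is an assumption and not a description of the maintenance model mentioned in the post.

```python
import math

# Basic COCOMO coefficients (Boehm, 1981): effort = a * KSLOC ** b, in person-months.
BASIC_COCOMO = {
    "organic": (2.4, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def effort_person_months(ksloc: float, mode: str = "organic") -> float:
    """Nominal effort for a project of `ksloc` thousand source lines."""
    a, b = BASIC_COCOMO[mode]
    return a * ksloc ** b


def calibrate_a(history, b: float) -> float:
    """Fit the multiplier `a` to past projects, holding the exponent `b` fixed.

    `history` is a list of (ksloc, actual_person_months) pairs. The fit is a
    least-squares estimate of log(a), i.e. the geometric mean of
    actual_effort / ksloc**b. Hypothetical helper, for illustration only.
    """
    logs = [math.log(effort) - b * math.log(ksloc) for ksloc, effort in history]
    return math.exp(sum(logs) / len(logs))


# A 32 KSLOC organic-mode project, the size used in Boehm's own example.
print(round(effort_person_months(32), 1))
```

For the 32 KSLOC case the nominal estimate is roughly 91 person-months. How close a locally calibrated model gets in practice depends on how much history an organization has and how similar its projects are.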
kww@cbnews.ATT.COM (Kevin W. Wall) (11/30/89)
In article <16186@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes: >The source line of code measure is used precisely BECAUSE it is simple >and objective (see e.g. _Software Engineering Metrics and Models_, Conte >et al., or _Software Engineering Economics_, Boehm.) One reason it is >objective is that it presumes that the source code being counted fits >other constraints on its form that constrain away pathological cases >like the above. This is the same reason that the "unroll the loop" >argument won't work: if you unroll 'for(i=1;i<10000;i++)' into 10,000 >separate statements, you are not going to make friends with your >manager. Counting SLOC is reasonably *simple*, yes. But I'm not sure I understand what you mean by objective. E.g., Programmers X & Y both are given the same assignment. Suppose that both write code of comparable quality and take approximately 5 days to write it, but programmer X writes 500 source lines of code to do it, and programmer Y does the same task in only 150 source lines of code. So, programmer X wrote more code than programmer Y, so what; big deal! The question we have to answer is "what does this imply?" That programmer X's code is somehow better? No. That programmer X is somehow more productive? Well yes, sort of, in the literal sense. Prog X's productivity was 100 SLOC/day, whereas prog Y's was only 30 SLOC/day. So obviously prog X is the better programmer, right? Again no. Maybe prog X is just a very inefficient coder. Also, there is the whole issue of maintenance (often 50% or more of the entire life cycle). Whose code is easier to maintain? SLOC gives no real indication of this, but all OTHER things being equal, *I* would have to say that for maintenance LESS SLOC is (usually) favorable. Unfortunately, using SLOC as a metric leads to all sorts of (apparent) paradoxes such as these. [See the book, "Programming Productivity", by Capers Jones for an excellent discussion of these and others.] 
For these reasons, I have a hard time swallowing the claim that SLOC as a
metric is "objective". I'll accept it, but only after you define in what
sense it is objective.

>Thing counting and empirical models based on previous experience are
>precisely the way other fields do it; it's the way it is done in
>practice in software engineering. Source lines of code has been a
>useful measure BECAUSE within standards it works; empirical models based
>on SLOC fit predictions nicely to production, AND turn out to correlate
>strongly with every other model that is also predictive. (This really
>isn't much of a surprise if you think about it.) These empirical models
>of productivity seem to be what you claim is not well founded or does
>not work.
>

Well, that is ONE reason; however, I think the main reason that it has
been useful is that, as we both agree, it is SIMPLE. It involves almost
no effort to count SLOC. (I'm not sure what you mean when you write
"within standards it works". [See above.] To what standards are you
referring? I need a little clarification here.)

I personally have seen only poor correlations when using SLOC to estimate
remaining project effort. This is for 2 reasons: 1) the initial estimates
of the total # of SLOC were way out of line, and 2) the assumption that
the last X lines of code are just as easy to write as the first X lines
of code were. (I believe that the assumption in 2) is false because
a) the difficult code is often written last, and b) there is a larger
integration problem with the last X lines of code than there is with the
first X lines.)

>And I clain [sic], and still do, that only fools think that SLOC is a simple
>minded number without empirical support.

I don't know if I'd call SLOC a "simple minded number without empirical
support", but (for all the reasons stated above), I would say it is
virtually useless. And in my book, that's about the same thing as calling
it the former. I guess that makes me a fool then.
Oh well, I've been called worse. :-)
--
In person: Kevin W. Wall            AT&T Bell Laboratories
Usenet/UUCP: {att!}cblpf!kww        6200 E. Broad St.
Internet: kww@cblpf.att.com         Columbus, Oh. 43213
"Death is life's way of firing you!" -- Hack rumor
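The two estimation problems raised in the post above (a wrong total-SLOC estimate, and the assumption that late lines cost the same as early ones) can be made concrete with a small Python sketch. Everything here is hypothetical: the function names, the project figures, and especially `late_cost_factor`, which is just an assumed multiplier and not a measured quantity.

```python
def remaining_days_linear(total_sloc_estimate: int, sloc_done: int,
                          days_spent: float) -> float:
    """Extrapolate remaining effort, assuming every line costs the same."""
    if sloc_done <= 0 or days_spent <= 0:
        raise ValueError("need some completed work to extrapolate from")
    remaining_sloc = max(total_sloc_estimate - sloc_done, 0)
    return remaining_sloc * days_spent / sloc_done


def remaining_days_weighted(total_sloc_estimate: int, sloc_done: int,
                            days_spent: float,
                            late_cost_factor: float) -> float:
    """Same extrapolation, but the remaining lines are assumed to be
    late_cost_factor times as expensive as the lines already written."""
    linear = remaining_days_linear(total_sloc_estimate, sloc_done, days_spent)
    return linear * late_cost_factor


# Hypothetical project: 6,000 lines written in 60 staff days.
naive = remaining_days_linear(10_000, 6_000, 60)         # planned total
bigger = remaining_days_linear(15_000, 6_000, 60)        # total was underestimated
harder = remaining_days_weighted(15_000, 6_000, 60, 2.0) # and late code costs double

print(naive, bigger, harder)  # 40.0 90.0 180.0
```

With these made-up numbers the naive estimate is off by more than a factor of four once both effects are included, even though the per-line rate observed so far was measured exactly.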
sullivan@aqdata.uucp (Michael T. Sullivan) (12/01/89)
I, too, don't count the number of lines in my software as I always thought it was pretty silly. During some job interviews I would mention some systems I had worked on and was asked how many lines of code they were. The interviewer wasn't looking for a precise count, he/she was merely trying to get a feel for how big the projects were. This way, 1000 and 2000 wouldn't have been too different. However, 1000 and 10,000 would have been. A light bulb went off in my head and I realized that lines of code weren't so evil when used in this light. -- Michael Sullivan uunet!jarthur.uucp!aqdata!sullivan aQdata, Inc. aqdata!sullivan@jarthur.claremont.edu San Dimas, CA
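The "feel for how big" use of line counts amounts to comparing orders of magnitude rather than exact numbers. A minimal Python sketch, with the hypothetical helper `size_class` standing in for that rough judgment:

```python
def size_class(sloc: int) -> int:
    """Order of magnitude of a line count: 1000 -> 3, 10000 -> 4."""
    if sloc <= 0:
        raise ValueError("line count must be positive")
    # Number of decimal digits minus one; avoids floating-point log10.
    return len(str(sloc)) - 1


print(size_class(1000) == size_class(2000))   # True: same ballpark
print(size_class(1000) == size_class(10000))  # False: a different scale
```

A hard cutoff like this is cruder than an interviewer's judgment (9,999 and 10,000 land in different classes), so it is only meant to show the idea.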
crm@romeo.cs.duke.edu (Charlie Martin) (12/01/89)
I didn't include the article I'm replying to because I'm too lazy to do
a good job of cutting it. I don't think I'm taking any liberties with
the reasoning, but if I did, just let me know.

I think there is a hangup between "is an objective metric" and "is a
GOOD objective metric." I claim SLOC as an objective metric in the sense
(1) that it is objective because when I give an exact and effective
description of SLOC, I'll get the same count every try, and anyone else
who uses the exact and effective description to count with will also get
the same count every try, agreeing both with counts from observation to
observation and between observers; and (2) that it has the properties we
associate with a metric mathematically (essentially that it is additive
and obeys the triangle inequality).

I'm pretty well aware of the various paradoxes of SLOC; I used to argue
this from the other side before I changed my mind. The reason I changed
my mind is that over a large number of observations and a long time, the
figure "5--10 source lines of code per staff day" has stood up quite
well. It has even stood up under the change between assembler and HOLs.

(One thing about this sort of figure is that it is natural to read this
as meaning "they only write 10 SLOC a day and that is it" when the
observation is instead

        total source lines of code
        ------------------------------  ~= 10
        total staff days over project

Naturally one writes more than 10 SLOC a day; the remainder of the time
is spent writing all the other stuff, reviewing it and testing it.)

Given appropriate weightings derived empirically, SLOC seems to lead to a
model that is predictive, with both testable time/cost estimates and
testable error margins (admittedly large) that can be verified.

More oddly -- and I don't UNDERSTAND this, I've just OBSERVED it --
things like Halstead and McCabe don't seem to do a much better job than
SLOC-based models do.
Halstead in particular seems only to be predictive in the areas and
under the conditions that SLOC is predictive. Many of the paradoxes of
SLOC have analogous paradoxes in other metrics (like the confounding
examples for cyclomatic complexity). Most of these things are anywhere
between harder and LOTS harder to compute, but don't seem to model any
more effectively.

For all of these reasons, SLOC (and the others) aren't very good
metrics. I only use any of them because I know of nothing better, and
they allow me to estimate with reasonable accuracy (including knowing
what the error margin may be, and when my models are of no use).

The original assertion wasn't that these are not good metrics -- all I
could say then is "have you got something better? May I have it?" It was
instead that it was meaningless to even assert that productivity metrics
could be defined or used.

Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
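The sense of "objective" argued for in the post above (fix an exact counting rule, and every observer gets the same number) and the whole-project ratio can be sketched in Python. The counting rule below is only one assumed definition (non-blank lines that are not whole-line comments), and the project figures are hypothetical; the post does not say which definition or data were used.

```python
def count_sloc(source: str, comment_prefix: str = "#") -> int:
    """Count lines that are neither blank nor whole-line comments.

    This is one possible exact rule; a different rule gives a different
    count, but any fixed rule gives the same count on every run.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


def sloc_per_staff_day(total_sloc: int, total_staff_days: float) -> float:
    """Project-wide ratio: all delivered lines over all staff days,
    including design, review, testing and documentation time."""
    if total_staff_days <= 0:
        raise ValueError("total_staff_days must be positive")
    return total_sloc / total_staff_days


sample = "# header\n\nx = 1\ny = 2  # trailing comment\n\nprint(x + y)\n"

# Two "observers" applying the same rule agree exactly.
first_count = count_sloc(sample)
second_count = count_sloc(sample)
print(first_count, first_count == second_count)  # 3 True

# Hypothetical project: 20,000 delivered lines, 2,500 staff days in total.
print(sloc_per_staff_day(20_000, 2_500))  # 8.0
```

The ratio says nothing about how many lines anyone typed on a given day; it spreads the delivered lines over all of the project's effort, which is why a figure in the 5 to 10 range is compatible with people writing far more than that on a coding day.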