soft-eng@MITRE.ARPA (Alok Nigam) (07/17/88)
Soft-Eng Digest             Sat, 16 Jul 88       V: Issue 18

Today's Topics:
                        10th ICSE Proceedings
                          3B2 CASE Software
              abott@aerospace and conklin@cs.utexas.edu
                       C code metrics (5 msgs)
                 cflow --> structure chart (2 msgs)
                            Code Metrics
                  Cynic's Guide, part 5: Bookshelf
                             expressway
                      Fortran follies (5 msgs)
                        Project documentation
     References sought: Maintenance of Knowledge Based Systems
               Reporting progress on a software project
                SVCC Workshop proceedings available
----------------------------------------------------------------------

Date: Mon, 11 Jul 88 11:55:01 SST
From: Yeo Chun Cheng <NCBISE2%NUSVM.BITNET@cunyvm.cuny.edu>
Subject: 10th ICSE Proceedings

Copies of the proceedings for the 10th International Conference on
Software Engineering are available for sale. Could you announce this
on the Soft-Eng list?

The cost is:
        US$40 for members of IEEE or ACM
        US$80 for others
inclusive of postage.

Payment should be made by cheque to:
        Treasurer, 10th ICSE,
        c/o National Computer Board,
        71 Science Park Drive,
        Singapore 0511.

Enquiries can be directed to me at the above address or thru my bitnet
account: ncbise2@nusvm.bitnet or chuncheng@itivax.bitnet

regards, chun cheng

------------------------------

Date: 8 Jul 88 20:12:40 GMT
From: ecsvax!khj@mcnc.org (Kenneth H. Jacker)
Subject: 3B2 CASE Software

Our department is interested in locating CASE (Computer Aided Software
Engineering) software that will run on AT&T 3B2s. We have multiple
630 MTG graphics terminals as well as an AT&T laser printer.

Any information will be appreciated!

------------------------------

Date: 6 Jul 88 15:10:31 GMT
From: attcan!utzoo!yunexus!geac!david@uunet.uu.net (David Haynes)
Subject: abott@aerospace and conklin@cs.utexas.edu

Could you two folk please mail me again about the todo system
mentioned here. I have not been able to reach your sites via email,
so if you know your uucp address to a well known site, could you
include that too?

------------------------------

Date: 28 Jun 88 14:07:00 GMT
From: apollo!ulowell!cg-atla!bradlee@beaver.cs.washington.edu (Rob Bradlee X5153)
Subject: C code metrics

I've recently been reading DeMarco's "Controlling Software Projects"
(Yourdon Press 1982), and am now interested in adding some metrics to
our project. However, his metrics are designed for EDP projects
written in COBOL. We have an interactive system written in C and
running UNIX. Can any net readers offer advice about or tools for
measuring the following:

Code Volume
    Past attempts to measure the size of code have used number of
    lines. However, this appears to be pretty inaccurate. DeMarco
    suggests that counting the number of operands and operators and
    multiplying that count by the log (base 2) of the number of unique
    identifiers is a much more accurate measure. Anyone have a program
    to parse C code and do this? What about including .h files and
    macros?

Code Quality
    DeMarco suggests analysing code by counting the number of entry
    and exit points from routines, looking for GOTOs, etc. Anyone
    have a program that could parse C code and at least give some
    indication of relative complexity of different modules and perhaps
    highlight areas of code that might profit from a code review?

I'm looking both for information explaining how to judge the size and
quality of C code, and also for any tools that will automatically
perform some analysis. Send me email, and if I get any good info I'll
summarize to the net. Thanks in advance.

------------------------------

Date: 30 Jun 88 13:46:24 GMT
From: ihnp4!twitch!hoqax!twb@bloom-beacon.mit.edu (T.W. Beattie)
Subject: C code metrics

In article <4820@cg-atla.UUCP>, bradlee@cg-atla.UUCP (Rob Bradlee X5153) writes:
> I'm looking both for information explaining how to judge the size and
> quality of C code, and also for any tools that will automatically
> perform some analysis.

Earlier this year CACM had an excellent article about an NPATH
complexity metric.
It suggests that the complexity of the code (levels of nesting, etc)
is a useful measure of the quality of the code and a particularly good
measure of the maintainability of the code.

I particularly like this metric because it seems difficult to make the
code worse by decreasing the metric. Many other metrics encourage bad
coding practices.

------------------------------

Date: 1 Jul 88 20:44:41 GMT
From: tektronix!reed!psu-cs!warren@bloom-beacon.mit.edu (Warren Harrison)
Subject: C code metrics

> I'm looking both for information explaining how to judge the size and
> quality of C code, and also for any tools that will automatically
> perform some analysis. Send me email, and if I get any good info I'll
> summarize to the net. Thanks in advance.

Look into PC-METRIC from SET Laboratories [503-289-4758]. Does
Halstead's Software Science (the operator/operand strategy you refer
to) and Cyclomatic Complexity. Versions for C, Pascal, Modula-2,
COBOL, FORTRAN, etc. It was reviewed in this month's Computer
magazine from IEEE.

------------------------------

Date: 6 Jul 88 08:22:00 GMT
From: mcvax!ukc!stc!datlog!dlhpedg!cl@uunet.uu.net (Charles Lambert)
Subject: C code metrics

In article <4820@cg-atla.UUCP> bradlee@cg-atla.UUCP (Rob Bradlee X5153) writes:
>I'm looking both for information explaining how to judge the size and
>quality of C code, and also for any tools that will automatically
>perform some analysis. Send me email, and if I get any good info I'll
>summarize to the net. Thanks in advance.

Can we keep this discussion in the open, please? I know it gets
hashed over fairly regularly but it is important and perspectives are
changing all the time. At the moment, there are several initiatives
in the UK to promote better engineering practices in software; we
cannot engineer what we cannot measure, so any discussion of methods
is useful.

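The operator/operand measure asked about in this thread is Halstead's
volume, V = N * log2(n), where N is the total number of operators and
operands and n is the number of distinct ones. The sketch below is only
an illustration of that arithmetic, not any of the tools named above:
the tokenizer is deliberately crude (it ignores the preprocessor, and a
comment marker inside a string literal will confuse it), and the choice
to count C keywords and punctuation as operators is an assumption.

```python
import math
import re

# Crude C tokenizer: string/char literals, numbers, identifiers, operators.
TOKEN_RE = re.compile(r'''
    (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<op>->|\+\+|--|<<=|>>=|<<|>>|<=|>=|==|!=|&&|\|\||[-+*/%&|^]=|[-+*/%=<>!~&|^?:;,.(){}\[\]])
''', re.VERBOSE)

# Assumption: keywords are counted as operators, everything else named
# (identifiers, constants, literals) as operands.
C_KEYWORDS = {
    "auto", "break", "case", "char", "continue", "default", "do", "double",
    "else", "enum", "extern", "float", "for", "goto", "if", "int", "long",
    "register", "return", "short", "sizeof", "static", "struct", "switch",
    "typedef", "union", "unsigned", "void", "while",
}


def strip_comments(src):
    """Remove /* ... */ comments (naively; strings are not protected)."""
    return re.sub(r"/\*.*?\*/", " ", src, flags=re.DOTALL)


def halstead_volume(src):
    """Return (N, n, V): program length, vocabulary size, and volume."""
    operators, operands = [], []
    for m in TOKEN_RE.finditer(strip_comments(src)):
        kind, text = m.lastgroup, m.group()
        if kind == "op" or (kind == "ident" and text in C_KEYWORDS):
            operators.append(text)
        else:
            operands.append(text)
    length = len(operators) + len(operands)
    vocabulary = len(set(operators)) + len(set(operands))
    volume = length * math.log2(vocabulary) if vocabulary > 1 else 0.0
    return length, vocabulary, volume


if __name__ == "__main__":
    sample = "int main() { int a = 1; a = a + 1; return a; } /* demo */"
    print(halstead_volume(sample))
```

Header files and macros, which the original question raises, are not
handled here; running the source through the C preprocessor first is
one option, at the cost of measuring expanded rather than written code.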
------------------------------

Date: 13 Jul 88 15:45:34 GMT
From: b-tech!umich!neti1!bdr@umix.cc.umich.edu (Brian Renaud)
Subject: C code metrics

In article <816@dlhpedg.co.uk>, cl@datlog.co.uk (Charles Lambert) writes:
> In article <4820@cg-atla.UUCP> bradlee@cg-atla.UUCP (Rob Bradlee X5153) writes:
> >I'm looking both for information explaining how to judge the size and
> >quality of C code, and also for any tools that will automatically
> >perform some analysis. Send me email, and if I get any good info I'll

I have some programs (shell/awk/C, etc.) which provide some metrics
for C programs. As I remember, they consist of:

    line counter - print actual lines of code (useful for COCOMO
                   modeling) as well as blank lines, number of
                   comments, comment lines
    halstead     - provide various software science metrics, based on
                   Halstead's work
    mccabe       - program complexity analyzer, based on McCabe, etc.

I also have some rather arcane scripts which will run these tools on a
specified set of programs and produce some flat files containing only
the metrics I have found to be statistically significant. (You may
find that these metrics are not right for you, I only analyzed two 30K
DSI projects.)

If you would like these let me know. Email is the best way to reach
me, since my access to this news feed is rather shaky.

Brian Renaud
bdr%huron.uucp@umix.cc.umich.edu
umix!huron!bdr

------------------------------

Date: 6 Jul 88 23:59:41 GMT
From: portal!cup.portal.com!Jeffrey_J_Vanepps@uunet.uu.net
Subject: cflow --> structure chart

If no one has written this little beast before, I guess I'll do it,
but I thought I would ask.

I'm looking for something to take the output of "cflow" and create
(with pic/troff/whatever) a structure chart. Pointers, anyone?

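The core of such a filter is small: recover caller/callee pairs from
the indented call listing, then hand them to a drawing tool. The sketch
below is a rough illustration under stated assumptions. It assumes each
input line looks like "<number> <indentation> <name>: ...", with deeper
indentation meaning a deeper call level; real cflow output differs
between implementations, so the pattern would need adjusting. It emits
Graphviz DOT text rather than pic, which is a substitution made here
for brevity, and the names `call_edges` and `to_dot` are hypothetical.

```python
import re

# Assumed line shape: sequence number, indentation, function name, colon.
LINE_RE = re.compile(r"^\s*\d+(?P<indent>[ \t]+)(?P<name>[A-Za-z_]\w*)\s*:")


def call_edges(listing):
    """Return a list of (caller, callee) pairs from an indented call tree."""
    edges = []
    stack = []  # (indent_width, name) for the current chain of callers
    for line in listing.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that does not fit the assumed shape
        indent = len(m.group("indent").expandtabs(8))
        name = m.group("name")
        # Pop back to the nearest entry that is strictly less indented.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            edge = (stack[-1][1], name)
            if edge not in edges:
                edges.append(edge)
        stack.append((indent, name))
    return edges


def to_dot(edges):
    """Render the edges as DOT text, one box per function."""
    lines = ["digraph structure_chart {", "  node [shape=box];"]
    lines += ['  "%s" -> "%s";' % edge for edge in edges]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        "1 main: int(), <prog.c 5>\n"
        "2     parse: int(), <prog.c 20>\n"
        "3         lex: int(), <prog.c 40>\n"
        "4     report: void(), <prog.c 60>\n"
        "5         lex: 3\n"
    )
    print(to_dot(call_edges(sample)))
```

Generating pic instead would mean replacing `to_dot` with a routine
that also assigns box positions, since pic does no layout of its own.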
------------------------------

Date: 8 Jul 88 21:10:45 GMT
From: tektronix!orca!stank@bloom-beacon.mit.edu (Stan Kalinowski)
Subject: cflow --> structure chart

The Tektronix CASE division has a tool that will read C code and
create a structure chart, perhaps this will suit your needs. The tool
is called "codetosc" and it's part of the TekCASE Designer tools
package. The product nomenclature is SDT1, SDT2, SDT3, or SDT4
depending upon which compute platform you are using.

We use this tool to fill in the gaps in our code documentation. (Not
all modules were complicated enough to justify creating structure
charts during the design phase.) I find that it does a very good
first cut at creating the structure diagrams. It is usually a simple
matter to jump into the structure chart editor and make the layout of
the charts pretty.

Tektronix recently sold the CASE division to Mentor Graphics, another
smaller company here in Oregon, but I believe it is still possible to
buy the package through Tek. At any rate, contact your local Tek
field office and they can fill in the details.

------------------------------

Date: 8 Jul 88 18:23:21 GMT
From: necntc!dandelion!ulowell!cg-atla!bradlee@ames.arpa (Rob Bradlee X5153)
Subject: Code Metrics

Some time ago I put out this plea for help:

>I'm looking both for information explaining how to judge the size and
>quality of C code, and also for any tools that will automatically

I suggested response by email with a later summary to the group.
Unfortunately, I was underwhelmed by the number of responses, but I
have gotten some very good tips from those kind enough to reply.
Here's a summary:

From: Randy Neff <neff@shasta.stanford.edu>

Code quality is not something that is apparent in the simplistic
manipulation of lines, characters, semi-colons, etc. The Code metrics
are snake oil attempts to find simple things to measure. Avoid the
charlatans.

From: uunet!pdn!bob (Bob Hickle)

I suggest measuring code complexity by measuring open/close brace
pairs.

From: warren@psu-cs.UUCP (Warren Harrison)

Look into PC-METRIC from SET Laboratories [503-289-4758]. Does
Halstead's Software Science (the operator/operand strategy you refer
to) and Cyclomatic Complexity. Versions for C, Pascal, Modula-2,
COBOL, FORTRAN, etc. It was reviewed in this month's Computer
magazine from IEEE.

From: cl@datlog.co.uk (Charles Lambert)

Can we keep this discussion in the open, please? I know it gets
hashed over fairly regularly but it is important and perspectives are
changing all the time. At the moment, there are several initiatives
in the UK to promote better engineering practices in software; we
cannot engineer what we cannot measure, so any discussion of methods
is useful.

**********************************************************

There was also a comp.software-eng entry suggesting I look at the Feb
88 ACM Communications for "NPATH: a measure of execution path
complexity and its applications" by Brian Nejmeh. This is a good
article that has led me to several others both pro and con.

Also I received a call from Keith Wible at Analytics (301-381-4300).
Seems Keith has just written a program for C software metrics. He
will be mailing me info.

I have ordered the PC-METRIC stuff today ($99 PCs only).

This seems like a very interesting field, how about some net input?
Anybody out there use any metrics in their projects past or present?
Are you for or against their use? Speak your piece!

And to all those that contributed, many thanks for the input.

------------------------------

Date: Sun, 26 Jun 88 16:01:04 edt
From: shull@scrolls.wharton.upenn.edu (Christopher E. Shull)
Subject: Cynic's Guide, part 5: Bookshelf

I have a couple of books to add to the Soft-Eng bookshelf:

James Gleick, !Chaos: Making a New Science!, Viking Penguin 1987,
ISBN: 0-670-81178-5

Edward R.
Tufte, !The Visual Display of Quantitative Information!, 1983,
Graphics Press, Box 430, Cheshire, Connecticut, 06410
no ISBN on my copy.

------------------------------

Date: 27 Jun 88 19:57:18 GMT
From: ubc-cs!alberta!calgary!hole@beaver.cs.washington.edu (Steve Hole)
Subject: expressway

Has anyone ever heard of a software development environment named
expressway(???). I had heard that it is a research project currently
underway at Stanford. If anyone has any information on what its
capabilities are and what its progress is, I would appreciate it if
they would send it to me. Myself and another person are currently
doing similar research and are interested in the directions that other
projects have taken.

Thanks.

------------------------------

Date: 25 Jun 88 21:05:44 GMT
From: garth!smryan@unix.sri.com (Steven Ryan)
Subject: Fortran follies

>I'm not sure about that. Vectorizers will only rarely need the largest
>dimension since it does not appear in the addressing arithmetic.

It is critical for dependency analysis. Given a loop like

        for i from m to n
            a[xi] := f a[yi]

dependency analysis determines if xi=yj for m<=i<j<=n. (which means a
value is computed and the result subsequently used--on a vector
machine the results might still be in flight.)

In practice, many subscript functions x and y have solutions for i<j
if they are otherwise unbounded. Hence it is critical to get good
values for m and n. They can be used directly from the loop, but the
resulting expressions may be nasty. If Cyber 205 Fortran is unable to
safely determine recursion with the actual loop bounds it will try
again with array bounds. Hence the assumption that the array bounds
are valid.

The fact that the largest dimension does not affect address is
irrelevant--it is iteration size that is needed.

> Furthermore, unless the bound
>is hardwired as a constant, it won't be very useful anyway.

The vectoriser handles constant bounds as a special case.
It uses symbolic expressions for loop bounds, array dimensions, and
subscript expressions.

> If you
>see reduced vectorization it may be due to an assumption that the
>dimension is short and hence vectorization would be unprofitable.

The Cyber 205's breakeven vector length is from 20 to 50 elements. To
get large enough vectors the compiler has always concentrated on
vectorising a loop nest rather than the innermost loop. (Cray, Kuck,
the Good Folks at Rice only worry about the innermost loop according
to the literature.)

So..... If you have a loop nest like,

        for i to m
            scalar := ....
            a[i] := ....
            for j to n
                b[i,j] := ....
            c[i] := scalar + ....

If everything is otherwise vectorisable, the j loop can be vectorised
even if n>hardware vector length by surrounding it with a scalar
stripmining loop. If m*n<=hardware vector length, the entire nest can
be vectorised. But if m*n>hardware vector length, the i-loop as
written cannot be vectorised. If the loops are split it is possible,
but such a split must correctly handle the promoted scalar which is
defined above the split and used below.

Finally to the point: if m and n are expressions, it is difficult or
impossible to compare m*n to the hardware limit. In this case, FTN200
again hunts for constant bounds of the array. If it can find an upper
bound for m*n less than 65535, it will vectorise the entire loop nest.
If greater than 65535 or a constant upper bound is not known, it can
only vectorise the innermost.

------------------------------

Date: 27 Jun 88 22:54:30 GMT
From: garth!smryan@unix.sri.com (Steven Ryan)
Subject: Fortran follies

>The Cyber 205's breakeven vector length is from 20 to 50 elements.

[A person asked where this number came from. I really don't know how
to respond personally (I only learned about *f* and *F* by accident)
through this strange network, so....]

That is the number Arden Hills always gave us. Where did they get it?
I'm not sure, but I think it was murkily derived from benchmark tests.
The vector math library routines are rather arcane. They start by
checking the vector length. If less than 20, they use scalar loops
unrolled by a factor of three (the memory handles up to three
concurrent load/stores). Otherwise they use vector instructions.

------------------------------

Date: 28 Jun 88 14:50:53 GMT
From: s.cc.purdue.edu!ags@h.cc.purdue.edu (Dave Seaman)
Subject: Fortran follies

>>The Cyber 205's breakeven vector length is from 20 to 50 elements.

I have found the breakeven length to vary from about 5 to 50 elements,
depending on the type of operations being performed. For a simple
vector add, the breakeven length is around 5 or 6.

------------------------------

Date: 1 Jul 88 00:36:23 GMT
From: osu-cis!killer!tness7!tness1!nuchat!sugar!ssd@ohio-state.arpa (Scott Denham)
Subject: Fortran follies

In article <801@garth.UUCP>, smryan@garth.UUCP writes:
> Actually, you want the compiler to know if you want really snazzy dependency
> analysis. (Ah, yes, see this diophantine equation has a solution for n=xxx.
> But my vectors are only yyy long. Oh, no problem.) Of course nobody has
> dependency analysis quite that snazzy.

YOW - perhaps it's a good thing that nobody does, too!! I've used
those sorts of tricks when writing AP microcode and have found that
though they may yield impressive performance when done right, they may
also lead to strange and not-so-wonderful things happening when
someone gets in there and tweaks a bit. Still, I wouldn't turn down a
compiler with that kind of snazzy analysis if it were offered!! :}

------------------------------

Date: 2 Jul 88 22:04:40 GMT
From: garth!smryan@unix.sri.com (Steven Ryan)
Subject: Fortran follies

>YOW - perhaps it's a good thing that nobody does, too!! I've used those
>sorts of tricks when writing AP microcode and have found that though
>they may yield impressive performance when done right, they may also lead
>to strange and not-so-wonderful things happening when someone gets in
>there and tweaks a bit.

Obviously the compiler and hardware people have to talk to each other.
Because engineers are not willing to make guarantees, this trick is
not used.

If the vectoriser is done right, it just means stuffing in an upper
bound. That is already done, in principle, but always with +infinity.

------------------------------

Date: Thu, 7 Jul 88 19:44:52 BST
From: Gordon Howell <mcvax!hci.hw.ac.uk!gordon@uunet.UU.NET>
Subject: Project documentation

As a designer of (primarily) user interfaces, I have found that
software systems require documentation in each of the following roles:

1. User tutorial --- how to quickly operate the basics. Intended for
   novice users. Must be an integral document. Must include a basic
   glossary of terms and concepts, a 'script' for each common
   operation, and a functional description of remaining documentation.

2. System documentation --- all the comments, internals, diagrams,
   etc. that go into producing the system. Intended for programmers.
   Frequently

3. User's guide and reference manual --- answers all questions about
   using the software. Must be heavily indexed and can be split over
   several volumes. (often a single general concept per volume) May
   expand on tutorial (say to include more glossary; more advanced
   scripts, etc.)

I intentionally make no reference to on/off line issues. Ideally all
doc should exist in both forms (wouldn't you *love* a hypertext
implementation of the system documentation!), but the basic ideas
apply to both implementations (for the purposes of this general
treatment --- I do believe there are some fundamental differences in
on-line documentation that are outside the scope of this discussion)

The order here is crucial. Writing the tutorial first is about the
best method I know for ironing out user interface problems before you
even go to the drawing board. Think of it as a design document
itself. At the other end, the User's Guide should be a pulling
together of notes made in the system documentation.
While not strictly of the "literate programming" school of thought, I
treat documentation as an integral part of system development.
Usually this is in the form of notes to imaginary programmers and
(from UI design) "this is what the user should be thinking/doing now"
mini-scenarios. Several manifestations of this practice:

1. I *design* documentation like I would design a system --- modular,
   single-concept, usually order independent (ie. the user should not
   need to read A,B,C and D in order to get useful information from
   E). I once even did a data flow diagram for documentation, but
   that may be overkill. Essentially any structured design technique
   ought to result in well structured documentation. Everything I
   write requires a 'makefile'.

2. As I write code (or design documents), I adopt a standard for
   comments, which includes standard nomenclature for terms, concepts,
   etc. For example, I put all words that I feel require definition
   in double quotes, to enable me to extract them later into a
   glossary. References to other sections of code go in [] and so
   on... This helps greatly in organizing (and achieving closure) in
   both system and user documentation. When possible, build
   commenting standards into your editor (easy to do in GNUemacs C
   mode for example).

3. Inversely, thinking about how I am going to explain something can
   influence how I design it... (There is a zen-like saying to the
   effect that "the true master is he who can communicate his mastery
   to others". For some reason, I think this applies here)

4. As a result of the above, software engineers should be expected to
   participate in documentation. [but don't neglect the technical
   writer in the process...
   I know you aren't interested in style, but I am convinced that poor
   writing style, poor typography, and poor organization are
   responsible for most documentation failures]

One goal is to attempt to produce complete, understandable
documentation; but I feel the *efficient* (and economical) production
of documentation is equally important --- time saved in actually
producing the doc is time that can be spent proofreading it...

I have never applied these techniques to large projects --- 3
person/one year at the most. I am very interested to hear more on
this, publicly or privately. (Thus, I better stop and let someone
else have a word...)

------------------------------

Date: 30 Jun 88 18:11:30 GMT
From: dgw@mimsy.umd.edu (Daniel Winkowski)
Subject: References sought: Maintenance of Knowledge Based Systems

I am planning a paper contrasting life cycle support for traditional
software systems and knowledge based (expert) systems. References on
similar papers, or those considered classic in either area would be
appreciated.

------------------------------

Date: 30 Jun 88 12:40:22 GMT
From: mcvax!ukc!stl!stc!datlog!dlhpedg!cl@uunet.uu.net (Charles Lambert)
Subject: Reporting progress on a software project

In article <1378@wor-mein.UUCP> pete@wor-mein.UUCP (Pete Turner) writes:
>any suggestion that the schedule
>must slip will be either ignored or thrown right back in the face of the
>suggestee,
>
>The usual solution was to pressure all developers to work 80+ hour weeks at
>no extra pay and then refuse to discuss comp time because the next project
>schedule can't be slipped (hence their previousness).

I think you've found the only answer to this (alarmingly widespread)
kind of management machismo - make them previous! Only a high
turnover of staff due to discontent and nervous breakdowns will
eventually penetrate the skull of the corporate Genghis Khan.
Tragically, there are too many eager young hawks who buy the lie that
working 80+ hours without "grasping" for compensation shows you're "a
professional"; what it really shows is that you're "cannon fodder".

Fortunately, there are some employers who recognise their staff as
valuable assets rather than recalcitrant pack-animals.

Enough! I can feel my arms beginning to wave...

------------------------------

Date: 11 Jul 1988 10:18:52-EDT
From: Peter.Feiler@sei.cmu.edu
Subject: SVCC Workshop proceedings available

International Workshop on Software Version and Configuration Control,
Grassau 1988-Jan-27/29

At last, the final version (sic !) of the proceedings is available :

        J.F.H. Winkler (ed.)
        Proceedings of the International Workshop on
        Software Version and Configuration Control
        B.G.Teubner Stuttgart, 1988
        ISBN 3-519-02671-6, 466 pp.

The price is DM 78.-

The proceedings can be ordered at the following address

        B.G.Teubner Stuttgart
        Postfach 80 10 69
        D-7000 Stuttgart 80
        Fed.Rep.of Germany

Perhaps, we will meet at the 2nd SVCC which will be announced soon.

------------------------------

End of Soft-Eng Digest
******************************