[comp.software-eng] Soft-Eng digest v5n18

soft-eng@MITRE.ARPA (Alok Nigam) (07/17/88)
Soft-Eng Digest             Sat, 16 Jul 88       V5 : Issue  18

Today's Topics:
                        10th ICSE Proceedings
                          3B2 CASE Software
              abott@aerospace and conklin@cs.utexas.edu
                       C code metrics (5 msgs)
                  cflow --> structure chart (2 msgs)
                             Code Metrics
                  Cynic's Guide, part 5:  Bookshelf
                              expressway
                       Fortran follies (5 msgs)
                        Project documentation
      References sought: Maintenance of Knowledge Based Systems
               Reporting progress on a software project
                 SVCC Workshop proceedings available
----------------------------------------------------------------------

Date: Mon, 11 Jul 88 11:55:01 SST
From: Yeo Chun Cheng <NCBISE2%NUSVM.BITNET@cunyvm.cuny.edu>
Subject: 10th ICSE Proceedings

Copies of the proceedings for the 10th International Conference on Software
Engineering are available for sale.  Could you announce this on the
Soft-Eng list?

The cost is:  US$40 for members of IEEE or ACM
              US$80 for others
inclusive of postage.

Payment should be made by cheque to: Treasurer, 10th ICSE, c/o National
Computer Board, 71 Science Park Drive, Singapore 0511.
Enquiries can be directed to me at the above address or thru my bitnet
account:  ncbise2@nusvm.bitnet    or    chuncheng@itivax.bitnet
regards, chun cheng

------------------------------

Date: 8 Jul 88 20:12:40 GMT
From: ecsvax!khj@mcnc.org  (Kenneth H. Jacker)
Subject: 3B2 CASE Software

Our department is interested in locating CASE (Computer
 Aided Software Engineering) software that will run on
 AT&T 3B2s.  We have multiple 630 MTG graphics terminals
 as well as an AT&T laser printer.

Any information will be appreciated!

------------------------------

Date: 6 Jul 88 15:10:31 GMT
From: attcan!utzoo!yunexus!geac!david@uunet.uu.net  (David Haynes)
Subject: abott@aerospace and conklin@cs.utexas.edu

Could you two folk please mail me again about the todo
system mentioned here.

I have not been able to reach your sites via email, so
if you know your uucp address to a well known site, could
you include that too?

------------------------------

Date: 28 Jun 88 14:07:00 GMT
From: apollo!ulowell!cg-atla!bradlee@beaver.cs.washington.edu  (Rob Bradlee X5153)
Subject: C code metrics

I've recently been reading DeMarco's "Controlling Software Projects"
(Yourdon Press 1982), and am now interested in adding some metrics
to our project.  However, his metrics are designed for EDP projects
written in COBOL.  We have an interactive system written in C and
running UNIX.  Can any net readers offer advice about or tools for
measuring the following:

        Code Volume
                Past attempts to measure the size of code have used
                number of lines.  However, this appears to be pretty
                inaccurate.   DeMarco suggests that counting the
                number of operands and operators and multiplying by
                the log(base 2) of the number of unique identifiers
                is a much more accurate measure.  Anyone have a
                program to parse C code and do this?  What about
                including .h files and macros?

        Code Quality
                DeMarco suggests analysing code by counting the
                number of entry and exit points from routines,
                looking for GOTOs, etc.  Anyone have a program
                that could parse C code and at least give
                some indication of relative complexity of
                different modules and perhaps highlight
                areas of code that might profit from a
                code review?


I'm looking both for information explaining how to judge the size and
quality of C code, and also for any tools that will automatically
perform some analysis.  Send me email, and if I get any good info I'll
summarize to the net.  Thanks in advance.
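
The operator/operand measure described under Code Volume matches
Halstead's "volume", V = N * log2(n), where N is the total number of
operator and operand occurrences and n is the number of distinct ones.
A minimal sketch follows, assuming a deliberately crude tokenizer: the
TOKEN pattern and the name code_volume are illustrative only, and
comments, string literals, multi-character operators and .h expansion
are not handled.

```python
import math
import re

# Assumption: identifiers and integer literals stand in for operands and
# every other non-blank character for an operator.  A real tool would
# use a proper C lexer and would expand macros and headers first.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_0-9]")

def code_volume(source: str) -> float:
    """Return N * log2(n) for a fragment of C source."""
    tokens = TOKEN.findall(source)
    if not tokens:
        return 0.0
    total = len(tokens)            # N: every operator/operand occurrence
    distinct = len(set(tokens))    # n: size of the vocabulary
    return total * math.log2(distinct)
```

Comparing code_volume() across modules gives a relative size figure; the
absolute numbers depend heavily on the tokenizer chosen.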

------------------------------

Date: 30 Jun 88 13:46:24 GMT
From: ihnp4!twitch!hoqax!twb@bloom-beacon.mit.edu  (T.W. Beattie)
Subject: C code metrics

In article <4820@cg-atla.UUCP>, bradlee@cg-atla.UUCP (Rob Bradlee X5153) writes:
> I'm looking both for information explaining how to judge the size and
> quality of C code, and also for any tools that will automatically
> perform some analysis.

Earlier this year CACM had an excellent article about an NPATH complexity
metric.
It suggests that the complexity of the code (levels of nesting, etc) is
a useful measure of the quality of the code and a particularly good measure of
the maintainability of the code.

I particularly like this metric because it seems difficult to make the
code worse by decreasing the metric.
Many other metrics encourage bad coding practices.
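
To make the idea concrete: NPATH counts acyclic execution paths, so
statements in sequence multiply while the branches of a conditional
add.  The sketch below is a simplified illustration, not the full
definition from the article: it works on a hand-built tuple tree (a
hypothetical representation), covers only plain statements, sequences,
if/else and while, and ignores the contribution of && and || inside
conditions.

```python
from math import prod

def npath(node):
    """Count acyclic paths through a tiny statement tree.

    Node shapes (hypothetical, for illustration only):
      ("stmt",)                     a simple statement
      ("seq", [child, ...])         statements in sequence
      ("if", then_node, else_node)  else_node may be None
      ("while", body_node)
    """
    kind = node[0]
    if kind == "stmt":
        return 1
    if kind == "seq":
        # Paths through consecutive statements multiply.
        return prod(npath(child) for child in node[1])
    if kind == "if":
        _, then_part, else_part = node
        # A missing else still contributes one path (the fall-through).
        else_paths = npath(else_part) if else_part is not None else 1
        return npath(then_part) + else_paths
    if kind == "while":
        # Either the body runs, or the loop is skipped.
        return npath(node[1]) + 1
    raise ValueError("unknown node kind: %r" % (kind,))
```

Three consecutive simple ifs give 2 * 2 * 2 = 8 paths, whereas a
decision-counting metric would report 4 for the same code.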

------------------------------

Date: 1 Jul 88 20:44:41 GMT
From: tektronix!reed!psu-cs!warren@bloom-beacon.mit.edu  (Warren Harrison)
Subject: C code metrics

> I'm looking both for information explaining how to judge the size and
> quality of C code, and also for any tools that will automatically
> perform some analysis.  Send me email, and if I get any good info I'll
> summarize to the net.  Thanks in advance.

Look into PC-METRIC from SET Laboratories [503-289-4758].  Does Halstead's
Software Science (the operator/operand strategy you refer to) and Cyclomatic
Complexity.  Versions for C, Pascal, Modula-2, COBOL, FORTRAN, etc.  It was
reviewed in this month's Computer magazine from IEEE.
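
For readers unfamiliar with cyclomatic complexity: for a single
function it is the number of decision points plus one.  A rough sketch
of that count follows.  It is not how PC-METRIC works (that is not
described here); the DECISIONS pattern is an assumption, and comments
and string literals are not stripped before counting.

```python
import re

# Assumption: each of these constructs adds one decision point.
# Counting && and || is a common extension of the basic measure.
DECISIONS = re.compile(r"\b(?:if|for|while|case)\b|&&|\|\||\?")

def cyclomatic_estimate(function_body: str) -> int:
    """Estimate cyclomatic complexity of one C function body."""
    return 1 + len(DECISIONS.findall(function_body))
```

Straight-line code scores 1; each added branch or loop raises the score
by one.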

------------------------------

Date: 6 Jul 88 08:22:00 GMT
From: mcvax!ukc!stc!datlog!dlhpedg!cl@uunet.uu.net  (Charles Lambert)
Subject: C code metrics

In article <4820@cg-atla.UUCP> bradlee@cg-atla.UUCP (Rob Bradlee X5153) writes:
>I'm looking both for information explaining how to judge the size and
>quality of C code, and also for any tools that will automatically
>perform some analysis.  Send me email, and if I get any good info I'll
>summarize to the net.  Thanks in advance.

Can we keep this discussion in the open, please?  I know it gets hashed over
fairly regularly but it is important and perspectives are changing all the
time.  At the moment,  there are several initiatives in the UK to promote
better engineering practices in software;  we cannot engineer what we cannot
measure,  so any discussion of methods is useful.

------------------------------

Date: 13 Jul 88 15:45:34 GMT
From: b-tech!umich!neti1!bdr@umix.cc.umich.edu  (Brian Renaud)
Subject: C code metrics

In article <816@dlhpedg.co.uk>, cl@datlog.co.uk (Charles Lambert) writes:
> In article <4820@cg-atla.UUCP> bradlee@cg-atla.UUCP (Rob Bradlee X5153) writes:
> >I'm looking both for information explaining how to judge the size and
> >quality of C code, and also for any tools that will automatically
> >perform some analysis.  Send me email, and if I get any good info I'll

I have some programs (shell/awk/C, etc.) which provide some metrics
for C programs.   As I remember, they consist of:

   line counter - print actual lines of code (useful for COCOMO modeling)
                  as well as blank lines, number of comments, comment lines

   halstead     - provide various software science metrics, based on Halstead's
                  work

   mccabe       - program complexity analyzer, based on McCabe, etc.
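
As an illustration of the first item only, a line counter of this kind
might look like the sketch below.  This is not the tool offered in this
message; the name count_lines and the classification rules are
assumptions, and a line that starts with a comment is counted as a
comment line even if code follows the comment.

```python
def count_lines(source: str) -> dict:
    """Classify each line of C source as code, blank or comment."""
    counts = {"code": 0, "blank": 0, "comment": 0}
    in_comment = False              # inside an unterminated /* ... */
    for raw in source.splitlines():
        line = raw.strip()
        if in_comment:
            counts["comment"] += 1
            if "*/" in line:
                in_comment = False
        elif not line:
            counts["blank"] += 1
        elif line.startswith("/*"):
            counts["comment"] += 1
            in_comment = "*/" not in line
        else:
            counts["code"] += 1
            # A comment opened after some code may run onto later lines.
            start = line.rfind("/*")
            if start != -1 and "*/" not in line[start:]:
                in_comment = True
    return counts
```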

I also have some rather arcane scripts which will run these tools on
a specified set of programs and produce some flat files containing only
the metrics I have found to be statistically significant.  (You may find
that these metrics are not right for you, I only analyzed two 30K DSI
projects.)

If you would like these let me know.  Email is the best way to reach me,
since my access to this news feed is rather shaky.

Brian Renaud    bdr%huron.uucp@umix.cc.umich.edu
                umix!huron!bdr

------------------------------

Date: 6 Jul 88 23:59:41 GMT
From: portal!cup.portal.com!Jeffrey_J_Vanepps@uunet.uu.net
Subject: cflow --> structure chart

If no one has written this little beast before, I guess I'll do it, but
I thought I would ask. I'm looking for something to take the output of
"cflow" and create (with pic/troff/whatever) a structure chart. Pointers,
anyone?
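
A possible first step is sketched below: turn cflow output into
(caller, callee) pairs, which a pic generator could then lay out as
boxes and arrows.  The input format is an assumption (a line number,
then one tab per nesting level, then "name:"); cflow output differs
between systems, so the LINE pattern would need adjusting.

```python
import re

# Assumed line shape: "<number><tabs><name>: <rest>", where the number
# of tabs gives the call depth (one tab for a top-level function).
LINE = re.compile(r"^\d+(\t+)(\w+):")

def call_edges(cflow_text):
    """Return a list of (caller, callee) pairs from cflow-style output."""
    edges = []
    stack = []                      # stack[d] is the function open at depth d
    for line in cflow_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue                # skip anything that does not fit
        depth = len(match.group(1)) - 1
        name = match.group(2)
        del stack[depth:]           # close deeper or sibling entries
        if stack:
            edges.append((stack[-1], name))
        stack.append(name)
    return edges
```

Each pair corresponds to one connecting line in the structure chart;
duplicates can be dropped if a function is called from several places.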

------------------------------

Date: 8 Jul 88 21:10:45 GMT
From: tektronix!orca!stank@bloom-beacon.mit.edu  (Stan Kalinowski)
Subject: cflow --> structure chart

The Tektronix CASE division has a tool that will read C code and
create a structure chart, perhaps this will suit your needs.  The tool
is called "codetosc" and it's part of the TekCASE Designer tools
package. The product nomenclature is SDT1, SDT2, SDT3, or SDT4
depending upon which compute platform you are using.  We use this tool
to fill in the gaps in our code documentation.  (Not all modules were
complicated enough to justify creating structure charts during the
design phase.)  I find that it does a very good first cut at creating
the structure diagrams.  It is usually a simple matter to jump into
the structure chart editor and make the layout of the charts pretty.

Tektronix recently sold the CASE division to Mentor Graphics, another
smaller company here in Oregon, but I believe it is still possible to
buy the package through Tek.  At any rate, contact your local Tek
field office and they can fill in the details.

------------------------------

Date: 8 Jul 88 18:23:21 GMT
From: necntc!dandelion!ulowell!cg-atla!bradlee@ames.arpa  (Rob Bradlee X5153)
Subject: Code Metrics

Some time ago I put out this plea for help:
>I'm looking both for information explaining how to judge the size and
>quality of C code, and also for any tools that will automatically

I suggested response by email with a later summary to the group.
Unfortunately, I was underwhelmed by the number of responses, but I have
gotten some very good tips from those kind enough to reply.  Here's a
summary:

From: Randy Neff <neff@shasta.stanford.edu>

Code quality is not something that is apparent in the simplistic manipulation
of lines, characters, semi-colons, etc.   The Code metrics are snake oil
attempts to find simple things to measure.   Avoid the charlatans.

From: uunet!pdn!bob (Bob Hickle)

I suggest measuring code complexity by measuring open/close brace pairs.
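
One way to read this suggestion is sketched below, assuming that both
the number of matched pairs and the deepest nesting level are of
interest (the nesting depth is an addition, not part of the
suggestion).  Braces inside comments, strings and character constants
are counted too, which a real tool would have to avoid.

```python
def brace_metrics(source: str):
    """Return (matched brace pairs, maximum nesting depth)."""
    depth = 0
    max_depth = 0
    pairs = 0
    for ch in source:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}" and depth > 0:
            depth -= 1
            pairs += 1              # this close matches an earlier open
    return pairs, max_depth
```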

From: warren@psu-cs.UUCP (Warren Harrison)

Look into PC-METRIC from SET Laboratories [503-289-4758].  Does Halstead's
Software Science (the operator/operand strategy you refer to) and Cyclomatic
Complexity.  Versions for C, Pascal, Modula-2, COBOL, FORTRAN, etc.  It was
reviewed in this month's Computer magazine from IEEE.

From: cl@datlog.co.uk (Charles Lambert)

Can we keep this discussion in the open, please?  I know it gets hashed over
fairly regularly but it is important and perspectives are changing all the
time.  At the moment,  there are several initiatives in the UK to promote
better engineering practices in software;  we cannot engineer what we cannot
measure,  so any discussion of methods is useful.

**********************************************************

There was also a comp.software-eng entry suggesting I look at the Feb 88
ACM Communications for "NPATH: a measure of execution path complexity and its
applications" by Brian Nejmeh.  This is a good article that has led me to
several others both pro and con.

Also I received a call from Keith Wible
at Analytics (301-381-4300).  Seems Keith has just written a program
for C software metrics.  He will be  mailing me info.

I have ordered the PC-METRIC stuff today ($99 PCs only).

This seems like a very interesting field, how about some net input?
Anybody out there use any metrics in their projects past or present?
Are you for or against their use?  Speak your piece!  And to all
those that contributed, many thanks for the input.

------------------------------

Date: Sun, 26 Jun 88 16:01:04 edt
From: shull@scrolls.wharton.upenn.edu (Christopher E. Shull)
Subject: Cynic's Guide, part 5:  Bookshelf

I have a couple of books to add to the Soft-Eng bookshelf:

James Gleick, !Chaos: Making a New Science!, Viking Penguin 1987,
     ISBN: 0-670-81178-5

Edward R. Tufte, !The Visual Display of Quantitative Information!,
     1983, Graphics Press, Box 430, Cheshire, Connecticut, 06410
     no ISBN on my copy.

------------------------------

Date: 27 Jun 88 19:57:18 GMT
From: ubc-cs!alberta!calgary!hole@beaver.cs.washington.edu  (Steve Hole)
Subject: expressway

Has anyone ever heard of a software development environment named
expressway(???).  I had heard that it is a research project currently
underway at Stanford.   If anyone has any information on what its
capabilities are and what its progress is, I would appreciate it if they
would send it to me.  Myself and another person are currently doing
similar research and are interested in the directions that other
projects have taken.  Thanks.

------------------------------

Date: 25 Jun 88 21:05:44 GMT
From: garth!smryan@unix.sri.com  (Steven Ryan)
Subject: Fortran follies

>I'm not sure about that. Vectorizers will only rarely need the largest
>dimension since it does not appear in the addressing arithmetic.

It is critical for dependency analysis.

Given a loop like
           for i from m to n
             a[xi]:=f a[yi]
dependency analysis determines if xi=yj for m<=i<j<=n. (which means a
value is computed and the result subsequently used--on a vector machine
the results might still be in flight.) In practice, many subscript functions
x and y have solutions for i<j if they are otherwise unbounded. Hence it
is critical to get good values for m and n. They can be used directly from the
loop, but the resulting expressions may be nasty.
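
The role of the bounds can be seen in a brute-force version of the
test.  This is a sketch only: it assumes m and n are known numbers and
that x and y are ordinary functions, whereas a compiler works on
symbolic expressions.  The name has_dependency is illustrative.

```python
def has_dependency(x, y, m, n):
    """True if x(i) == y(j) for some m <= i < j <= n.

    x maps an iteration to the subscript it stores into and y to the
    subscript it loads from, as in  a[x(i)] := f a[y(i)].
    """
    written = set()                 # subscripts stored by earlier iterations
    for j in range(m, n + 1):
        if y(j) in written:         # iteration j reads an earlier result
            return True
        written.add(x(j))
    return False
```

With x(i) = i + 10 and y(j) = j there is no dependency for a loop from
1 to 5, but there is one for a loop from 1 to 20: the same subscript
functions are safe or unsafe depending only on the bounds.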

If Cyber 205 Fortran is unable to safely determine recursion with the
actual loop bounds it will try again with array bounds. Hence the assumption
that the array bounds are valid. The fact that the largest dimension does not
affect address is irrelevant--it is iteration size that is needed.

>                                     Furthermore, unless the bound
>is hardwired as a constant, it won't be very useful anyway.

The vectoriser  handles constant bounds as a special case.  It uses symbolic
expressions for loop bounds, array dimensions, and subscript expressions.

>                                                            If you
>see reduced vectorization it may be due to an assumption that the
>dimension is short and hence vectorization would be unprofitable.

The Cyber 205's breakeven vector length is from 20 to 50 elements. To get large
enough vectors the compiler has always concentrated on vectorising a loop nest
rather than the innermost loop. (Cray, Kuck, the Good Folks at Rice only worry
about the innermost loop according to the literature.) So.....

If you have a loop nest like,
      for i to m
        scalar := ....
        a[i] := ....
        for j to n
            b[i,j] := ....
        c[i] := scalar + ....

If everything is otherwise vectorisable, the j loop can be vectorised
even if n>hardware vector length by surrounding it with a scalar stripmining loop.
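
Strip mining in this sense can be sketched as follows.  The names
strips and vector_add are illustrative, and the slice assignment merely
stands in for a single hardware vector instruction; nothing here
reflects the actual code the compiler generates.

```python
def strips(n, vector_length):
    """Split the range 0..n-1 into chunks of at most vector_length."""
    return [(start, min(start + vector_length, n))
            for start in range(0, n, vector_length)]

def vector_add(a, b, vector_length):
    """Add two equal-length lists one strip at a time."""
    out = [0] * len(a)
    # The outer loop is the scalar strip-mining loop; each pass handles
    # one strip that fits within the hardware vector length.
    for start, stop in strips(len(a), vector_length):
        out[start:stop] = [p + q for p, q in zip(a[start:stop], b[start:stop])]
    return out
```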

If m*n<=hardware vector length, the entire nest can be vectorised. But if
m*n>hardware vector length, the i-loop as written cannot be vectorised. If the
loops are split it is possible, but such a split must correctly handle the
promoted scalar which is defined above the split and used below.

Finally to the point: if m and n are expressions, it is difficult or impossible
to compare m*n to the hardware limit. In this case, FTN200 again hunts for
constant bounds of the array. If it can find an upper bound for m*n less than
65535, it will vectorise the entire loop nest. If greater than 65535 or a
constant upper bound is not known, it can only vectorise the innermost.
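
The decision just described might be sketched like this.  The function
and its arguments are hypothetical; the only figure taken from the text
is the 65535 limit, and treating a product of exactly 65535 as
acceptable is an assumption.

```python
MAX_VECTOR_LENGTH = 65535           # limit quoted in the text above

def vectorisation_plan(loop_bounds, array_bounds):
    """Decide how much of a two-deep loop nest to vectorise.

    Each argument is a pair (m, n); an entry is None when no constant
    upper bound is known from that source.
    """
    bounds = []
    for loop_bound, array_bound in zip(loop_bounds, array_bounds):
        # Prefer the loop's own bound; fall back to the array dimension.
        bound = loop_bound if loop_bound is not None else array_bound
        if bound is None:
            return "innermost loop only"
        bounds.append(bound)
    m, n = bounds
    if m * n <= MAX_VECTOR_LENGTH:
        return "entire loop nest"
    return "innermost loop only"
```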

------------------------------

Date: 27 Jun 88 22:54:30 GMT
From: garth!smryan@unix.sri.com  (Steven Ryan)
Subject: Fortran follies

>The Cyber 205's breakeven vector length is from 20 to 50 elements.

[A person asked where this number came from. I really don't know how to respond
personally (I only learned about *f* and *F* by accident) through this strange
network, so....]

That is the number Arden Hills always gave us. Where did they get it? I'm not
sure, but I think it was murkily derived from benchmark tests.

The vector math library routines are rather arcane. They start by checking the
vector length. If less than 20, they use scalar loops unrolled by a factor
of three (the memory handles up to three concurrent load/stores). Otherwise
they use vector instructions.
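
In outline, the dispatch looks something like the sketch below.  It is
an illustration in Python, not the library's code: vsqrt is a made-up
routine name and the list comprehension stands in for a vector
instruction.

```python
import math

BREAKEVEN = 20                      # threshold mentioned above

def vsqrt(values):
    """Element-wise square root with a short-vector scalar path."""
    n = len(values)
    if n >= BREAKEVEN:
        # Long enough: stand-in for a single vector instruction.
        return [math.sqrt(v) for v in values]
    out = [0.0] * n
    i = 0
    # Scalar loop unrolled by three, matching the three concurrent
    # load/stores the memory can handle.
    while i + 3 <= n:
        out[i] = math.sqrt(values[i])
        out[i + 1] = math.sqrt(values[i + 1])
        out[i + 2] = math.sqrt(values[i + 2])
        i += 3
    while i < n:                    # leftover one or two elements
        out[i] = math.sqrt(values[i])
        i += 1
    return out
```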

------------------------------

Date: 28 Jun 88 14:50:53 GMT
From: s.cc.purdue.edu!ags@h.cc.purdue.edu  (Dave Seaman)
Subject: Fortran follies

>>The Cyber 205's breakeven vector length is from 20 to 50 elements.

I have found the breakeven length to vary from about 5 to 50 elements,
depending on the type of operations being performed.  For a simple vector
add, the breakeven length is around 5 or 6.

------------------------------

Date: 1 Jul 88 00:36:23 GMT
From: osu-cis!killer!tness7!tness1!nuchat!sugar!ssd@ohio-state.arpa  (Scott Denham)
Subject: Fortran follies

In article <801@garth.UUCP>, smryan@garth.UUCP writes:
> Actually, you want the compiler to know if you want really snazzy dependency
> analysis. (Ah, yes, see this diophantine equation has a solution for n=xxx.
> But my vectors are only yyy long. Oh, no problem.) Of course nobody has
> dependency analysis quite that snazzy.

YOW - perhaps it's a good thing that nobody does, too!! I've used those
sorts of tricks when writing AP microcode and have found that though
they may yield impressive performance when done right, may also lead
to strange and not-so-wonderful things happening when someone gets in
there and tweaks a bit.
Still, I wouldn't turn down a compiler with that kind of snazzy
analysis if it were offered!! :}

------------------------------

Date: 2 Jul 88 22:04:40 GMT
From: garth!smryan@unix.sri.com  (Steven Ryan)
Subject: Fortran follies

>YOW - perhaps it's a good thing that nobody does, too!! I've used those
>sorts of tricks when writing AP microcode and have found that though
>they may yield impressive performance when done right, may also lead
>to strange and not-so-wonderful things happening when someone gets in
>there and tweaks a bit.

Obviously the compiler and hardware people have to talk to each other.
Because engineers are not willing to make guarantees, this trick is not used.

If the vectoriser is done right, it just means stuffing in an upper bound.
That is already done, in principle, but always with +infinity.

------------------------------

Date: Thu, 7 Jul 88 19:44:52 BST
From: Gordon Howell <mcvax!hci.hw.ac.uk!gordon@uunet.UU.NET>
Subject: Project documentation

As a designer of (primarily) user interfaces, I have found
that software systems require documentation in each of the following
roles:
1. User tutorial --- how to quickly operate the basics.  Intended for
    novice users.  Must be an integral document.  Must include a basic
    glossary of terms and concepts, a 'script' for each common operation,
    and a functional description of remaining documentation.
2. System documentation --- all the comments, internals, diagrams,
    etc. that go into producing the system.  Intended for programmers.
    Frequently
3. User's guide and reference manual --- answers all questions about
    using the software.  Must be heavily indexed and can be split over
    several volumes.  (often a single general concept per volume)  May
    expand on tutorial (say to include more glossary; more advanced
    scripts, etc.)

I intentionally make no reference to on/off line issues.  Ideally all
doc should exist in both forms (wouldn't you *love* a hypertext
implementation of the system documentation!), but the basic ideas
apply to both implementations (for the purposes of this general
treatment --- I do believe there are some fundamental differences in
on-line documentation that are outside the scope of this discussion)

The order here is crucial.  Writing the tutorial first is about the
best method I know for ironing out user interface problems before you
even go to the drawing board.  Think of it as a design document itself.

At the other end, the User's Guide should be a pulling together of
notes made in the system documentation.  While not strictly of the
"literate programming" school of thought, I treat documentation as an
integral part of system development.  Usually this is in the form of
notes to imaginary programmers and (from UI design) "this is what the
user should be thinking/doing now" mini-scenarios.

Several manifestations of this practice:

1. I *design* documentation like I would design a system --- modular,
single-concept, usually order independent (ie.  the user should not
need to read A,B,C and D in order to get useful information from E).
I once even did a data flow diagram for documentation, but that may be
overkill.  Essentially any structured design technique ought to
resolve in well structured documentation.  Everything I write requires
a 'makefile'.

2. As I write code (or design documents), I adopt standards for
comments, which include standard nomenclature for terms, concepts,
etc.  For example, I put all words that I feel require definition in
double quotes, to enable me to extract them later into a glossary.
References to other sections of code go in [] and so on...  This helps
greatly in organizing (and achieving closure) in both system and user
documentation.  When possible, build commenting standards into your
editor (easy to do in GNUemacs C mode for example).

3. Inversely, thinking about how I am going to explain something
can influence how I design it...  (There is a zen-like saying to the
effect that "the true master is he who can communicate his mastery to
others".  For some reason, I think this applies here)

4. As a result of the above, software engineers should be expected to
participate in documentation.  [but don't neglect the technical writer
in the process...  I know you aren't interested in style, but I am
convinced that poor writing style; poor typography and poor
organization are responsible for most documentation failures]
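
The commenting convention in item 2 lends itself to simple tooling.  A
minimal sketch, assuming C-style /* */ comments: terms needing
definition appear in double quotes and cross-references in square
brackets, both inside comments only.  The function name and the three
patterns are illustrative, not the tools actually used.

```python
import re

COMMENT = re.compile(r"/\*(.*?)\*/", re.DOTALL)   # body of a C comment
TERM = re.compile(r'"([^"\n]+)"')                 # "glossary term"
XREF = re.compile(r"\[([^\]\n]+)\]")              # [other section]

def extract_doc_items(source: str):
    """Return (sorted glossary terms, sorted cross-references)."""
    terms, refs = set(), set()
    for body in COMMENT.findall(source):
        # Only comment text is searched, so string literals and array
        # subscripts in the code itself are ignored.
        terms.update(TERM.findall(body))
        refs.update(XREF.findall(body))
    return sorted(terms), sorted(refs)
```

The sorted term list is a starting point for the glossary; the
cross-references can be checked against the list of sections that
actually exist.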

One goal is to attempt to produce complete, understandable
documentation; but I feel the *efficient* (and economical) production of
documentation is equally important --- time saved in actually
producing the doc is time that can be spent proofreading it...

I have never applied these techniques to large projects --- 3
person/one year at the most.  I am very interested to hear more on
this, publicly or privately.  (Thus, I better stop and let someone
else have a word...)

------------------------------

Date: 30 Jun 88 18:11:30 GMT
From: dgw@mimsy.umd.edu  (Daniel Winkowski)
Subject: References sought: Maintenance of Knowledge Based Systems

        I am planning a paper contrasting life cycle support for
traditional software systems and knowledge based (expert) systems.
References on similar papers, or those considered classic in either
area would be appreciated.

------------------------------

Date: 30 Jun 88 12:40:22 GMT
From: mcvax!ukc!stl!stc!datlog!dlhpedg!cl@uunet.uu.net  (Charles Lambert)
Subject: Reporting progress on a software project

In article <1378@wor-mein.UUCP> pete@wor-mein.UUCP (Pete Turner) writes:
>any suggestion that the schedule
>must slip will be either ignored or thrown right back in the face of the
>suggestee,
>
>The usual solution was to pressure all developers to work 80+ hour weeks at
>no extra pay and then refuse to discuss comp time because the next project
>schedule can't be slipped (hence their previousness).

I think you've found the only answer to this (alarmingly widespread) kind
of management machismo - make them previous!  Only a high turnover of
staff due to discontent and nervous breakdowns will eventually penetrate
the skull of the corporate Genghis Khan.  Tragically,  there are too many
eager young hawks who buy the lie that working 80+ hours without "grasping"
for compensation shows you're "a professional";  what it really shows is
that you're "cannon fodder".

Fortunately,  there are some employers who recognise their staff as valuable
assets rather than recalcitrant pack-animals.

Enough!  I can feel my arms beginning to wave...

------------------------------

Date: 11 Jul 1988 10:18:52-EDT
From: Peter.Feiler@sei.cmu.edu
Subject: SVCC Workshop proceedings available

International Workshop on Software Version and Configuration
Control,  Grassau    1988-Jan-27/29


At last, the final version (sic !) of the proceedings
is available :

      J.F.H. Winkler (ed.)
      Proceedings of the International Workshop
      on Software Version and Configuration Control
      B.G.Teubner Stuttgart, 1988  ISBN  3-519-02671-6, 466 pp.

      The price is DM 78.-

The proceedings can be ordered at the following address
      B.G.Teubner Stuttgart
      Postfach 80 10 69
      D-7000 Stuttgart 80
      Fed.Rep.of Germany

Perhaps, we will meet at the 2nd SVCC which will be
announced soon.

------------------------------

End of Soft-Eng Digest
******************************