[alt.sources.wanted] WANTED: "C" code line counter program

dcavasso@ntpal.uucp (Dana Cavasso) (03/07/91)

     I need a "C" code line counter program, preferably written in
"C".  It will be used on several platforms, so solutions involving
shell scripts and other UNIX utilities won't work.  I'm not very 
picky (although I'd like something that did a little more than count 
newlines :-) 

     With the growing trend toward gathering metrics, I expect
such beasts are out there in force.  If you would be willing to
share your source, let me know.

-- 
Dana Cavasso                            | "A rock pile ceases to be a rock pile
dcavasso%ntpal@egsner.cirr.com          | the moment a single man contemplates 
ntpal!dcavasso@egsner.cirr.com          | it, bearing within him the image of a
...!cs.utexas.edu!egsner!ntpal!dcavasso | cathedral." - Antoine de Saint-Exupery

theo.bbs@shark.cs.fau.edu (Theo Heavey) (03/09/91)

dcavasso@ntpal.uucp (Dana Cavasso) writes:

> 
>      I need a "C" code line counter program, preferably written in
> "C".  It will be used on several platforms, so solutions involving
> shell scripts and other UNIX utilities won't work.  I'm not very 
> picky (although I'd like something that did a little more than count 
> newlines :-) 
> 
Why not use the "wc" program on UNIX systems?  It gives a line
count.  Not very sophisticated, BUT the source may be available
for altering!

Theo Heavey
Florida Atlantic University

raj@crosfield.co.uk (Ray Jones) (03/11/91)

In article <1991Mar6.214157.18633@ntpal.uucp> dcavasso@ntpal.uucp (Dana Cavasso) writes:
>
>     I need a "C" code line counter program, preferably written in
>"C".  It will be used on several platforms, so solutions involving
>shell scripts and other UNIX utilities won't work.  I'm not very 
>picky (although I'd like something that did a little more than count 
>newlines :-) 

So how about one that counts semi-colons :-)

Ray

-- 
- raj@cel.co.uk           - Ray Jones, Crosfield Electronics, -
- raj@crosfield.co.uk     - Hemel Hempstead, HP2 7RH  UK      -

lws@comm.wang.com (Lyle Seaman) (03/12/91)

dcavasso@ntpal.uucp (Dana Cavasso) writes:
>     I need a "C" code line counter program, preferably written in
>"C".  It will be used on several platforms, so solutions involving

Counting semi-colons is a pretty good approach, as that counts C 
statements.  Lines is kind of less meaningful.  Counting '{' is 
an interesting one, too.
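
A minimal sketch of the raw count suggested above, assuming the source
text has already been read into a NUL-terminated buffer.  The name
count_char is hypothetical, and the function makes no attempt to tell
code apart from string literals or comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count every occurrence of ch in a NUL-terminated
 * buffer.  It knows nothing about string literals or comments, so a ';'
 * inside either is counted like any other. */
size_t count_char(const char *src, char ch)
{
    size_t n = 0;

    assert(src != NULL);
    for (; *src != '\0'; src++) {
        if (*src == ch)
            n++;
    }
    return n;
}
```

Calling count_char(buf, ';') gives a rough statement count, and
count_char(buf, '{') a rough block count.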

-- 
Lyle 	508 967 2322  		"We have had television problems directly
lws@capybara.comm.wang.com 	 attributable to something not understandable"
Wang Labs, Lowell, MA, USA 	 - unnamed believer in poltergeists

bhoughto@pima.intel.com (Blair P. Houghton) (03/12/91)

In article <1991Mar11.182848.26693@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
>Counting semi-colons is a pretty good approach, as that counts C 
>statements.  Lines is kind of less meaningful.  Counting '{' is 
>an interesting one, too.

{{{{{{{{printf("Oh, if I were a rich man... ;;;;;;;;;;;;;;;;;;;;;;;\n");}}}}}}}}

				--Blair
				  "Currently sleeping with my eyes open."

pyoung@axion.bt.co.uk (Pete Young) (03/13/91)

From article <2969@inews.intel.com>, by bhoughto@pima.intel.com (Blair P. Houghton):
> In article <1991Mar11.182848.26693@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
>>Counting semi-colons is a pretty good approach, as that counts C 
>>statements.  Lines is kind of less meaningful.  Counting '{' is 
>>an interesting one, too.

> {{{{{{{{printf("Oh, if I were a rich man... ;;;;;;;;;;;;;;;;;;;;;;;\n");}}}}}}}}


Tee Hee.

Good point though.

Counting lines, or semicolons, or braces is much more meaningful if
you have some kind of standard to compare your figures with. In this
instance such a standard might take the form of a set of guidelines
about the use of symbols in comments, layout and indentation of code
etc. Or even a machine to generate the code from a specification
(don't scoff too loud, it might happen one day!)

It seems to me (although I am quite prepared to admit I'm wrong) that
there are two generic questions about gathering metrics. The first is,
"what do I want to know about this
program/specification/bridge/whatever?" The second is "What can I
measure to get this information?"

Counting statements is a possible answer to the second question. So,
has the first question been satisfactorily answered? In many cases, I
suspect not. But counting lines of code is a lot easier than thinking
about useful measures of the size and complexity of a program.



  ____________________________________________________________________
  Pete Young         pyoung@axion.bt.co.uk        Phone +44 473 645054
  British  Telecom  Research Labs,  Martlesham Heath   IPSWICH IP5 7RE

cpm00@duts.ccc.amdahl.com (Craig P McLaughlin) (03/13/91)

In article <1991Mar11.182848.26693@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
>dcavasso@ntpal.uucp (Dana Cavasso) writes:
>>     I need a "C" code line counter program, preferably written in
>>"C".  It will be used on several platforms, so solutions involving
>
>Counting semi-colons is a pretty good approach, as that counts C 
>statements.  Lines is kind of less meaningful.  Counting '{' is 
>an interesting one, too.
>

  Counting semi-colons may miscount setups like the one below:

    while(condition)
        do_this;

  That's two, I think. :)  What about counting newlines, but ignoring those
that immediately follow another newline (ie, skip blank lines)?
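
  A minimal sketch of that rule, taken literally and assuming the source
is already in a NUL-terminated buffer.  The name count_nonblank_newlines
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count newlines, ignoring any newline that
 * immediately follows another newline (or that starts the buffer).
 * A line holding only spaces or tabs still counts, and a final line
 * with no terminating newline does not. */
size_t count_nonblank_newlines(const char *src)
{
    size_t n = 0;
    char prev = '\n';   /* treat start of buffer as start of a line */

    assert(src != NULL);
    for (; *src != '\0'; src++) {
        if (*src == '\n' && prev != '\n')
            n++;
        prev = *src;
    }
    return n;
}
```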

Craig McLaughlin   cpm00@duts.ccc.amdahl.com   V:(408)737-5502
   I think it's time to come up with a witty signature and disclaimer...

session@uncw.UUCP (Zack C. Sessions) (03/14/91)

cpm00@duts.ccc.amdahl.com (Craig P McLaughlin) writes:

|In article <1991Mar11.182848.26693@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
||dcavasso@ntpal.uucp (Dana Cavasso) writes:
|||     I need a "C" code line counter program, preferably written in
|||"C".  It will be used on several platforms, so solutions involving
||
||Counting semi-colons is a pretty good approach, as that counts C 
||statements.  Lines is kind of less meaningful.  Counting '{' is 
||an interesting one, too.
||

|  Counting semi-colons may miscount setups like the one below:

|    while(condition)
|        do_this;

|  That's two, I think. :)  What about counting newlines, but ignoring those
|that immediately follow another newline (ie, skip blank lines)?

|Craig McLaughlin   cpm00@duts.ccc.amdahl.com   V:(408)737-5502

Counting newlines may not be the way to go either. It is perfectly
legitimate for a statement to span multiple source lines. Take a
complex if() condition, for example, which you might spread over
a few lines for readability. A true C source line counter would
almost have to be the front end to a full compiler.

Zack Sessions
session@uncw.UUCP

mwb@ulysses.att.com (Michael W. Balk) (03/14/91)

In article <1991Mar11.182848.26693@comm.wang.com>, lws@comm.wang.com (Lyle Seaman) writes:
> dcavasso@ntpal.uucp (Dana Cavasso) writes:
> >     I need a "C" code line counter program, preferably written in
> >"C".  It will be used on several platforms, so solutions involving
> 
> Counting semi-colons is a pretty good approach, as that counts C 
> statements.  Lines is kind of less meaningful.  Counting '{' is 
> an interesting one, too.


If you just count semi-colons, then in for-loops such as

	for(i = 0; i < 10; i++)
	  {
	     ...
	  }


i = 0; and i < 10; will be counted as individual statements.
In fact they are, but if you want to count for( ... ) as a single statement
then count the semi-colons and correct the count by subtracting 1 for every
for-statement.  There might be other cases like this that you may want to
consider.  Then again, in most cases this is probably just nit-picking.
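
A minimal sketch of that correction, assuming the source is already in a
NUL-terminated buffer.  The name count_statements is hypothetical, and
the scan is deliberately naive: it does not skip string literals or
comments.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count ';' characters, then subtract 1 for every
 * `for` keyword found.  Identifiers such as `format` or `before` are not
 * mistaken for the keyword.  String literals and comments are NOT
 * skipped, so this is only a rough estimate. */
long count_statements(const char *src)
{
    long semis = 0, fors = 0;
    const char *p;

    assert(src != NULL);
    p = src;
    while (*p != '\0') {
        if (isalpha((unsigned char)*p) || *p == '_') {
            /* Scan one whole identifier or keyword. */
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            if (p - start == 3 && strncmp(start, "for", 3) == 0)
                fors++;
        } else {
            if (*p == ';')
                semis++;
            p++;
        }
    }
    return semis - fors;
}
```

On the loop above with a single statement in its body, this yields 2:
one for the for-statement and one for the body.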

dlee@pallas.athenanet.com (Doug Lee) (03/14/91)

In article <dcda02id05Q.01@JUTS.ccc.amdahl.com> cpm00@DUTS.ccc.amdahl.com (PUT YOUR NAME HERE) writes:
>What about counting newlines, but ignoring those
>that immediately follow another newline (ie, skip blank lines)?

My first thought was to skip all comments (single- and multi-line) and then
count only lines containing characters other than whitespace.  This should
be close, though it will still overcount on constructs like
    if (( <long_condition_1> ) ||
        ( <long_condition_2> ) ||
        ... )
Then again, maybe a line that long *should* count as more than one line.  We
also run into the somewhat common declaration syntax
    char *
    foo()
which, by my method, counts as two lines.

Unfortunately, I see no quick way to give a consistent line count regardless
of program syntax.  Counting lines ending in '{' or ';' (after removing
comments and trailing whitespace) would catch most loops and function
definitions without counting them more than once, but constructs like
    while (line = get_next_line(file))
        (void) process_line(line);
would still count only once unless the braces were included (not a bad idea,
imho).  We need a more precise definition of "line" for this, I fear.
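
A minimal sketch of both counts described above, assuming the source is
already in a NUL-terminated buffer.  The names line_counts and
count_code_lines are hypothetical.  Only the classic comment form is
recognised, and literals are assumed not to span lines.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type for the two counts discussed above. */
struct line_counts {
    size_t nonblank;    /* lines holding any code after comment removal */
    size_t terminated;  /* lines whose last code character is '{' or ';' */
};

struct line_counts count_code_lines(const char *src)
{
    struct line_counts c = {0, 0};
    int in_comment = 0;   /* nonzero while inside a comment */
    char quote = '\0';    /* '"' or '\'' while inside a literal */
    char last = '\0';     /* last non-whitespace code char on this line */
    const char *p;

    assert(src != NULL);
    for (p = src; ; p++) {
        if (*p == '\n' || *p == '\0') {
            /* End of line: record it if it held any code. */
            if (last != '\0') {
                c.nonblank++;
                if (last == '{' || last == ';')
                    c.terminated++;
            }
            last = '\0';
            quote = '\0';   /* literals are assumed not to span lines */
            if (*p == '\0')
                break;
        } else if (in_comment) {
            if (p[0] == '*' && p[1] == '/') {
                in_comment = 0;
                p++;
            }
        } else if (quote != '\0') {
            if (*p == '\\' && p[1] != '\0' && p[1] != '\n')
                p++;              /* skip the escaped character */
            else if (*p == quote)
                quote = '\0';
            last = 'x';           /* literal text is code, never a terminator */
        } else if (p[0] == '/' && p[1] == '*') {
            in_comment = 1;
            p++;
        } else if (!isspace((unsigned char)*p)) {
            if (*p == '"' || *p == '\'')
                quote = *p;
            last = *p;
        }
    }
    return c;
}
```

On the while example above, nonblank comes out as 2 and terminated as 1,
which matches the "count only once" behaviour described.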

Does this remind anyone else of _The Mythical Man Month_?  :-)

-- 
Doug Lee  (dlee@athenanet.com or {bradley,uunet}!pallas!dlee)

lws@comm.wang.com (Lyle Seaman) (03/15/91)

bhoughto@pima.intel.com (Blair P. Houghton) writes:

>In article <1991Mar11.182848.26693@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
>>Counting semi-colons is a pretty good approach, as that counts C 
>>statements.  Lines is kind of less meaningful.  Counting '{' is 
>>an interesting one, too.

>{{{{{{{{printf("Oh, if I were a rich man... ;;;;;;;;;;;;;;;;;;;;;;;\n");}}}}}}}}

Yeah, but that could just as easily be written:
{
{
{
{
{
{
{
{
printf
(
"Oh, if I were a rich man... ;;;;;;;;;;;;;;;;;;;;;;;\n"
)
;
}
}
}
}
}
}
}
}

So either simple approach is susceptible to intentional obfuscation
(but then, most such schemes are).  No one claimed that counting semis 
and curlies was foolproof.  You've demonstrated that it isn't.  On the
other hand, seasoned coders don't usually use ; and } to such excess.
(Yes, there are *occasional* duplicates).

However, they do usually include quite a few redundant newlines.  Comments,
preprocessor directives and white space are very common, and apparently
the original poster didn't wish to count them.  I stand by my suggestion.
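
A minimal sketch of a counter that at least ignores semicolons and
braces hidden in string literals, character constants and comments,
assuming the source is already in a NUL-terminated buffer.  The name
count_code_char is hypothetical, and preprocessor directives are not
treated specially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count occurrences of ch that appear in code,
 * skipping string literals, character constants and comments. */
size_t count_code_char(const char *src, char ch)
{
    size_t n = 0;
    const char *p;

    assert(src != NULL);
    p = src;
    while (*p != '\0') {
        if (p[0] == '/' && p[1] == '*') {
            /* Skip to the closing delimiter or the end of the buffer. */
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;
        } else if (*p == '"' || *p == '\'') {
            /* Skip a literal, honouring backslash escapes. */
            char quote = *p++;
            while (*p != '\0' && *p != quote) {
                if (*p == '\\' && p[1] != '\0')
                    p++;
                p++;
            }
            if (*p != '\0')
                p++;
        } else {
            if (*p == ch)
                n++;
            p++;
        }
    }
    return n;
}
```

On the one-line printf example, this counts one ';' and eight '{'.  The
semicolons inside the string no longer count, but the redundant braces
still do.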

-- 
Lyle 	508 967 2322  		"We have had television problems directly
lws@capybara.comm.wang.com 	 attributable to something not understandable"
Wang Labs, Lowell, MA, USA 	 - unnamed believer in poltergeists

jpc@fct.unl.pt (Jose Pina Coelho) (03/18/91)

In article <1991Mar14.192419.1576@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:

   bhoughto@pima.intel.com (Blair P. Houghton) writes:

   >In article <1991Mar11.182848.26693@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
   >>Counting semi-colons is a pretty good approach, as that counts C 
   >>statements.  Lines is kind of less meaningful.  Counting '{' is 
   >>an interesting one, too.

   >{{{{{{{{printf("Oh, if I were a rich man... ;;;;;;;;;;;;;;;;;;;;;;;\n");}}}}}}}}

[... How to fool char counters ...]

Why not compile the sources and check the size of the object code?

--
Jose Pedro T. Pina Coelho   | BITNET/Internet: jpc@fct.unl.pt
Rua Jau N 1, 2 Dto          | UUCP: ...!mcsun!unl!jpc
1300 Lisboa, PORTUGAL       | Home phone: (+351) (1) 640767

- If all men were brothers, would you let one marry your sister ?

carl@p4tustin.UUCP (Carl W. Bergerson) (03/19/91)

dcavasso@ntpal.uucp (Dana Cavasso) writes:
> 
>      I need a "C" code line counter program, preferably written in
> "C".  It will be used on several platforms, so solutions involving
> shell scripts and other UNIX utilities won't work.  I'm not very 
> picky (although I'd like something that did a little more than count 
> newlines :-) 

In the October or November issue of Unix World the Wizard's Grabbag column
contained three programs for removing comments from C and C++ code. I
believe that one of them was in C.

Once you have the comments removed, you can use the wc program that is
listed in "Software Tools in Pascal" by Kernighan and (memory fails me).
Translating to C shouldn't be all that difficult.
-- 
Carl Bergerson                                           uunet!p4tustin!carl  
Point 4 Data Corporation                                     carl@point4.com
15442 Del Amo Avenue                                   Voice: (714) 259 0777
Tustin, CA 92680-6445                                    Fax: (714) 259 0921

hammes@dill.informatik.uni-kl.de (Stefan Hammes (HiWi Mattern)) (03/20/91)

In article <JPC.91Mar17162220@terra.fct.unl.pt>, jpc@fct.unl.pt (Jose
Pina Coelho) writes:
|>
|>In article <1991Mar14.192419.1576@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
|>
|>   bhoughto@pima.intel.com (Blair P. Houghton) writes:
|>
|>   >In article <1991Mar11.182848.26693@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
|>   >>Counting semi-colons is a pretty good approach, as that counts C 
|>   >>statements.  Lines is kind of less meaningful.  Counting '{' is 
|>   >>an interesting one, too.
|>
|>   >{{{{{{{{printf("Oh, if I were a rich man... ;;;;;;;;;;;;;;;;;;;;;;;\n");}}}}}}}}
|>
|>[... How to fool char counters ...]
|>
|>Why not compile the sources and checking the size of object code ?

Because this is very machine, compiler and linker dependent!

Stefan
                      
+---------------------------------------+-------------------------------------+
| Stefan Hammes                         | e-Mail: hammes@informatik.uni-kl.de |
| FB Informatik, SFB 124-D1             +-------------------------------------+
| Universitaet Kaiserslautern, P.O.Box 3049, D-W6750 Kaiserslautern, Germany  |
+-------------+------------------------------------------------+--------------+
              | Language definition: Recursion - see recursion |
              +------------------------------------------------+

jgautier@vangogh.ads.com (Jorge Gautier) (03/21/91)

In article <1991Mar20.114450.19653@rhrk.uni-kl.de> hammes@dill.informatik.uni-kl.de (Stefan Hammes (HiWi Mattern)) writes:
>   |>[... How to fool char counters ...]
>   |>
>   |>Why not compile the sources and checking the size of object code ?
>
>   Because this is very machine, compiler and linker dependent!

So is the size of the source code.

jpc@fct.unl.pt (Jose Pina Coelho) (03/21/91)

In article <1991Mar20.114450.19653@rhrk.uni-kl.de>
	hammes@dill.informatik.uni-kl.de (Stefan Hammes (HiWi Mattern)) writes:
   In article <JPC.91Mar17162220@terra.fct.unl.pt>, jpc@fct.unl.pt (Jose
   Pina Coelho) writes:
   |>
   |>In article <1991Mar14.192419.1576@comm.wang.com> lws@comm.wang.com (Lyle Seaman) writes:
   |>
   |>
   |>[... How to fool char counters ...]
   |>
   |>Why not compile the sources and checking the size of object code ?

   Because this is very machine, compiler and linker dependent!


It's probably the best bet [better reliability/work ratio]: you can
compare by compiling all programs on the same architecture with the
same compiler.

Another bet [machine, compiler and linker independent]:

	Get a C grammar and count tokens.
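
A very rough sketch of the token-counting idea, assuming the source is
already in a NUL-terminated buffer.  The name count_tokens is
hypothetical.  It does not consult a grammar at all: it is a crude
lexical pass that treats each punctuation character as its own token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: rough lexical token count.  Identifiers, numbers,
 * string literals and character constants count as one token each; every
 * other non-blank character counts as one token, so `++` or `<=` count
 * as two.  Comments are skipped. */
size_t count_tokens(const char *src)
{
    size_t n = 0;
    const char *p;

    assert(src != NULL);
    p = src;
    while (*p != '\0') {
        unsigned char c = (unsigned char)*p;

        if (isspace(c)) {
            p++;
        } else if (p[0] == '/' && p[1] == '*') {
            /* Skip a comment up to its closing delimiter. */
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;
        } else if (isdigit(c)) {
            /* Number: digits, letters and '.'; exponent signs split off. */
            while (isalnum((unsigned char)*p) || *p == '.')
                p++;
            n++;
        } else if (isalpha(c) || c == '_') {
            /* Identifier or keyword. */
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            n++;
        } else if (c == '"' || c == '\'') {
            /* String literal or character constant, with escapes. */
            char quote = *p++;
            while (*p != '\0' && *p != quote) {
                if (*p == '\\' && p[1] != '\0')
                    p++;
                p++;
            }
            if (*p != '\0')
                p++;
            n++;
        } else {
            p++;    /* a single punctuation character */
            n++;
        }
    }
    return n;
}
```

Layout and comments have no effect on the result, which is the point of
counting tokens rather than lines.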



On the other hand, you need a method that is programmer-independent because:
	- Programmer A came from Fortran, he isn't used to:
	  - ``Creative'' for's :-) .
	  - Recursion.
	- Programmer B came from C
	
B is going to produce the same functionality with a smaller amount of
code.




--
Jose Pedro T. Pina Coelho   | BITNET/Internet: jpc@fct.unl.pt
Rua Jau N 1, 2 Dto          | UUCP: ...!mcsun!unl!jpc
1300 Lisboa, PORTUGAL       | Home phone: (+351) (1) 640767

- If all men were brothers, would you let one marry your sister ?