[comp.software-eng] complexity question

marick@m.cs.uiuc.edu (08/03/90)

I have a question related to program complexity vs size.  But, first,
my terms:

foo(a)
{
  if (global1 == global2)
    global1 = a;
}

This has 4 data references but only 3 distinct references (since there
are two duplicate references to global1).

Now, if I had to guess, I'd guess that both the number of references
and the number of distinct references grow linearly with program size
(measured in lines of code or in number of tokens).  Does anyone know?
Number of distinct references is bounded below by the number of
variables in the program -- only a bound because of array references
-- so I'd be interested in figures on that, too.

Thanks.  I'm sure I've seen something like this somewhere.

Brian Marick
Motorola @ University of Illinois
marick@cs.uiuc.edu, uiucdcs!marick

warren@eecs.cs.pdx.edu (Warren Harrison) (08/03/90)

In article <39400115@m.cs.uiuc.edu> marick@m.cs.uiuc.edu writes:
>
>I have a question related to program complexity vs size.  But, first,
>my terms:
>  if (global1 == global2)
>    global1 = a;
>This has 4 data references but only 3 distinct references (since there
>are two duplicate references to global1).
The distinct references are the Software Science "number of unique
operands" - n2 and the data references are the Software Science
"number of total operands" - N2.
>
>Now, if I had to guess, I'd guess that both the number of references
>and the number of distinct references grow linearly with program size
>(measured in lines of code or in number of tokens).  Does anyone know?
Most research suggests that Software Science length is highly correlated
with lines of code (Length= N1 + N2 - where N1 is the total number of
operators used) - this indicates that as the code size grows, more
operators/operands are being referenced (for this I got a PhD? :-).
What is more interesting is the n1/n2 relationship (unique references
to operators/operands). As program size grows, unique operators reach
a limit because a language only has so many operators - or at least a
programmer has a favored "working set of operators". In fact after
some point in size, one might want to consider n1 a constant. However,
program vocabulary (n1+n2) is also correlated with program size, which
suggests as you do that operands are added to a program as it increases
in size.

Warren



==========================================================================
Warren Harrison                                          warren@cs.pdx.edu
Department of Computer Science                                503/725-3108
Portland State University

murphyn@motcid.UUCP (Neal P. Murphy) (08/03/90)

marick@m.cs.uiuc.edu writes:

>I have a question related to program complexity vs size.  But, first,
>my terms:

>foo(a)
>{
>  if (global1 == global2)
>    global1 = a;
>}

>This has 4 data references but only 3 distinct references (since there
>are two duplicate references to global1).

>Now, if I had to guess, I'd guess that both the number of references
>and the number of distinct references grow linearly with program size
>(measured in lines of code or in number of tokens).  Does anyone know?

Well, if I may hazard a guess, I will say that you are correct, provided
your program is a simple filter or a simple pipe. But once you start
manipulating the data, the number of distinct references should fall
significantly behind the total number of references. And once you start
using the given data to create new data (lies, damn lies, statistics,
fitted curves and other such mathematical stuff), the total number of
references should grow even faster (arithmetically perhaps?).

Just a guess, though.

NPN