[comp.software-eng] Effect of execution-speed on reliability/testing

zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy) (12/19/90)

I know this may be an unpopular opinion, but I have personal experiences
to back up the position:

R   |          /---------------\
e   |         /                 \
l   |        /                   \
i   |       /                     \
a   |      /                       \
b   |     /                         \
i   |    /                           \
l   |   /                             \
i   |  /                               \
t   | /                                 \
y   |/                                   \
    +------------------------------------------>
    slow       Execution Speed       faster

I don't know what the fine detail of this graph should look like, but
personal experience has shown the general trends to be true.  Namely,
both extremely fast and extremely slow programs suffer from reliability
problems.

Every textbook that I have read, and every software engineering professor
that I have spoken with has emphasized the right-hand side of this graph,
but they have all neglected the left-hand side.  The impression that I 
have gotten from most professionals in the software trade is something
like: "Speed isn't important.  The hardware will speed up and take care
of our inefficiencies."  I beg to differ.  Speed has a direct impact on
reliability.  Systems that are too slow to test thoroughly, just don't
get tested.  


Case in point:
    Last August I rewrote a significant portion of one of our systems 
    in C (after hours, on my own time, of course).  Here are some rough
    statistics comparing the two implementations:

                         C            Database Package

  *1* Lines of Code:     5k                5k

  *2*   Reusability:    20%                2%

   Development time:     4wks              ?
  *3*  Testing time:     3days             2wks 

    Number of Files:     7                60

  *4* Service Calls:     1                28 

        Speed N=10k:    20min.            90min.
        Speed N=50k:    45min.           360min. (6 hrs!)
  *5*  Speed N=250k:   120min.          1440min. (24 hrs!)


  *1* These estimates include comments and blank lines, which are
      heavy in the C version, and non-existent in the database
      version.  The original database version, which is used for
      these comparisons, was very poorly written.  A cleaned-up
      implementation, written with the same database package had 
      only about 3k lines.  

  *2* About 1k of the C version is a btree-module which has been LINKed 
      into about 20 programs by myself and others since this system was
      released, including ?-many programs in 3 separate production systems,
      and numerous "one-shots".  Certain sections of the database version 
      are duplicated in other parts of the system.  I don't know which 
      came first.  They are all cut-and-paste copies due to the inter- 
      pretative nature of the database package, and every single copy 
      has to be manually updated, re-"compiled", and retested whenever 
      a bug is detected, or dependent factors change.  I don't rate 
      this kind of reuse very highly.  (I tried to promote a macro-
      preprocessor when I came on board here, but management is too
      conservative.  They don't want to *change* anything.  Apparently,
      employing 5 full-time programmers to continually patch their
      baling-wire and bubble-gum 70k lines of source without adding
      any functionality is cost-effective.)  

  *3* I don't think the database version was ever thoroughly tested,
      but it took two weeks to use it to generate the output for 
      comparison with the C version.  

  *4* The C version was installed at the end of August.  One bug has 
      been detected so far (fence-post) in three and a half months of
      daily use at four sites.  The database version was generating at 
      least two service calls a week due to: its dependencies on numerous 
      files; its slow speed (forcing users to abort it while running to 
      handle priority items, leaving it exposed to power-loss problems for 
      extended periods of time, etc., etc.); and the numerous latent bugs 
      that it contained which were never uncovered because it was too slow 
      to test thoroughly.  

  *5* Times taken from production runs.  Average production run: N=100k.

    I put a lot of extra effort into the C implementation to match the
    output from the database system byte-for-byte, duplicating bugs and 
    all (to automate testing with file comparisons), so the systems are 
    functionally identical.  Before I undertook the task of rewriting
    this application in C, I reimplemented it with the database package
    first.  I gained roughly a factor of two in performance by cleaning
    up the code, but it was still so incredibly slow that it would have
    taken at least four weeks to run all the necessary tests.  What's
    more, I tried three or four different approaches, using both the 
    highest- and the lowest- level features of the package to try to 
    squeeze more performance out of it.  -- It just wasn't there.
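
    For what it's worth, the byte-for-byte comparison that drives this kind
    of automated testing is trivial to write.  A minimal sketch in C (not
    the actual harness used here) might look like this:

    /* cmpfiles.c -- minimal sketch of a byte-for-byte output comparison.
     * Run both implementations on the same input, then compare their
     * output files; exit 0 means identical, 1 means they differ. */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        FILE *a, *b;
        long pos = 0L;
        int ca, cb;

        if (argc != 3) {
            fprintf(stderr, "usage: %s file1 file2\n", argv[0]);
            return 2;
        }
        if ((a = fopen(argv[1], "rb")) == NULL ||
            (b = fopen(argv[2], "rb")) == NULL) {
            perror("fopen");
            return 2;
        }
        do {
            ca = getc(a);
            cb = getc(b);
            if (ca != cb) {
                printf("files differ at byte %ld\n", pos);
                return 1;
            }
            pos++;
        } while (ca != EOF);

        printf("files are byte-for-byte identical\n");
        return 0;
    }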

I was able to run 5-8 tests of the C version in the amount of time
that it took to run 1 test of the database version.  I was able to
test the C version much more thoroughly than the database version had
ever been tested.  Because the C version ran so much faster than the
database version, it was far less vulnerable to user-aborts, power-losses,
etc..  

Many people in the software field promote an attitude that performance
is not important, or that increasing performance degrades reliability.
These statements are misleading.  Highly optimized code which sacrifices
legibility for performance can seriously degrade reliability, but extremely
slow code is just as bad -- or worse.

Regards, 
			--- Paul...

burley@pogo.ai.mit.edu (Craig Burley) (12/19/90)

In article <1990Dec19.102005.11830@engin.umich.edu> zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy) writes:

   Speed has a direct impact on
   reliability.  Systems that are too slow to test thoroughly, just don't
   get tested.  

Great example you give.  I have a similar, though funnier I think, example.

At one company I was at, they had an in-house bulletin board system kind of
like this newsgroup system, but less sophisticated and using the built-in
file networking of the systems instead of "real" network calls.

Anyway, it was written and maintained on an "on the programmer's free time"
basis, and the "programmer" at one point was someone who apparently had
little free time except to "keep shoveling", as it were, to keep his head
above the water, since the system required several kinds of maintenance
(dealing with problem postings, creating new categories, and so on).

One of the tasks, an annual one, was to run a "renumber" program that would
renumber all the postings so searches would go much faster.

Problem was, this program was written in the system's command language,
the equivalent of a unix shell script or an MS-DOS .BAT (?) file I guess,
and that made it a) slow, and b) unable to deal with simple errors.

I had mentioned this to the programmer in charge, but he said that because
he was so busy, he couldn't take the two weeks or so he thought it might
take to rewrite it in a compiled language.  (And, since he was a compiler
person or some such thing, that made sense, because he wasn't an expert
on the OS-level file system calls that would comprise most of the
application.)

So, one night around the New Year, he had started the thing running over
the central bulletin board data base at around 7 pm.  This task was
projected to take over 6 hours, based on earlier runs, and prevented people
from using the bulletin board during that time.

Meanwhile, in our group we had a local bulletin board data base that also
needed renumbering, and I was kind of in charge of that one, having set it
up.

Rather than just run the renumber program, I decided to figure it out.  After
doing that, I decided to write a new version in a high-level language.  (Want
a good laugh?  I used PL/I Subset G.)  I then used a copy of our smaller
local data base to debug and test it, including the error recovery features
I implemented that weren't in the original.  Because it ran fast enough, it
was easy to do this -- purposely trash postings, user profiles, whatever,
in the sample data base, run my program, and make sure it did something
sensible.  When I was confident in the program, I ran it on our live local
data base, and it worked great.
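
That "trash it, run it, check it" loop is easy to automate.  Here is a
rough C sketch of the idea; the "renumber" command and the data-file
names are made up for illustration, not taken from the original system:

/* faultrun.c -- rough sketch of the fault-injection loop described
 * above.  A fast repair program is what makes this loop cheap to run. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    int trial, i;

    srand((unsigned) time(NULL));
    for (trial = 0; trial < 20; trial++) {
        FILE *f;
        long size;

        /* Work on a throwaway copy of the sample data base. */
        if (system("cp board.dat scratch.dat") != 0) {
            fprintf(stderr, "could not copy sample data base\n");
            return 1;
        }

        /* Corrupt a few bytes at random offsets in the copy. */
        f = fopen("scratch.dat", "r+b");
        if (f == NULL) {
            perror("fopen");
            return 1;
        }
        fseek(f, 0L, SEEK_END);
        size = ftell(f);
        for (i = 0; i < 5 && size > 0; i++) {
            fseek(f, rand() % size, SEEK_SET);
            putc(rand() % 256, f);
        }
        fclose(f);

        /* Run the repair program on the damaged copy; a human still
         * inspects what it did, but a clean exit is the first check. */
        if (system("renumber scratch.dat") != 0) {
            printf("trial %d: renumber reported failure\n", trial);
            return 1;
        }
    }
    puts("all trials survived");
    return 0;
}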

So I decided to give it a really big test.  I made a copy of the huge central
data base and ran my program on that.  Some 5-10 minutes later (maybe it was
15), it was done, and had fixed some "problems" it encountered that I had
programmed it to deal with (and let me know it had done so, so I could check
the original and make sure it did the right thing).

The result was a new renumber program that was around two orders of magnitude
faster than the old one, just as easy to maintain (I think), far better
at error recovery, and not much larger (the executable image -- though this
wasn't an important consideration).

How long did it take me to do this?  Well, at the time I knew the OS-level
file system calls quite well -- I was writing a manual on these in fact --
so it didn't take me long at all.  In fact, it took only a few hours!  When
I was finished, the "production" run of the original renumber program was
still going on the original data base.  So I left my copy of the "renumbered"
data base around in case the other programmer wanted it, sent him mail about
it and my new program, and prepared to head home.

Before I got home, I took another look at the system either running the
program or with the disk holding the live central data base, and noticed the
operations staff was just about to shut it down -- thus terminating the
program, which would require restarting it from (essentially) the beginning.

But there wasn't any need to ask operations to delay shutdown -- the program
had already crashed!  It had encountered one of those problems that the new,
faster version of the renumber program also had encountered, but instead of
fixing it, it just died.

The other programmer was appreciative, because not only did he have a new
renumber program -- one he'd wanted to write himself, but hadn't had the
time -- but he also had a ready-to-use renumbered central data base to
replace the old one with, so people could use the bulletin board system again,
and he wouldn't have to deal with the deluge of email asking why it wasn't
back again!

The moral?  Just because something is slow, unwieldy, old, and appears to get
the job done doesn't mean it isn't worth rewriting to be fast, new,
and better able to do the same job.

Even if the program gets run only once or so per year!
--

James Craig Burley, Software Craftsperson    burley@ai.mit.edu

oman@groucho (12/20/90)

In article <1990Dec19.102005.11830@engin.umich.edu> zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy) writes:

> The impression that I 
>have gotten from most professionals in the software trade is something
>like: "Speed isn't important.  The hardware will speed up and take care
>of our inefficiencies."  I beg to differ.  Speed has a direct impact on
>reliability.  Systems that are too slow to test thoroughly, just don't
>get tested.  
>
>Many people in the software field promote an attitude that performance
>is not important, or that increasing performance degrades reliability.
>These statements are misleading.  Highly optimized code which sacrifices
>legibility for performance can seriously degrade reliability, but extremely
>slow code is just as bad -- or worse.
>
IEEE Software Magazine recognized this problem over a year ago and 
decided to do something about it.  We have scheduled a special issue
on software for performance analysis.  Following is the call for papers.

----------------------------Call for Papers---------------------------


 Call for Papers
 IEEE Software Magazine
 Theme Issue on
 Software for Performance Analysis
 
 
Performance analysis is rapidly becoming a hot topic in software engineering.
Performance analysis is meant to answer the questions beyond "Is it correct?".
The focus is on whether the system under study is "fast enough" or "reliable 
enough" and "what is the effect of making changes in the system?".   These 
questions are important in several key emerging areas:
 
* Graphics workstations have made it possible to create sophisticated software 
systems with visually-oriented interfaces,  but response time is a critical 
performance attribute.
* Real-time embedded systems must meet requirements of timing and reliability.
* High-speed networks are being designed to meet the demands of distributed 
computing and of connecting ultra-fast CPUs.  Whether such networks can be made
fast enough at a reasonable cost must be addressed before significant resources 
are devoted to them.
* Parallel computing is highly performance-oriented: significant speedups or 
significantly larger problems must be achieved to make the additional software, 
hardware, and programming overhead cost effective.
  
These are four widely disparate areas, but all require performance analysis.  
 
In the past, performance analysis has either been ignored or carried out by 
running discrete and repetitious experiments resulting in lengthy dumps of 
metrics which were analyzed by hand.  Today's complex systems require more 
than simple dumps of metrics.  Systems are studied through instrumenting, 
modeling, and simulating software and hardware.  The overwhelming amount of 
data produced by such systems and the numerous options involved require that 
performance information be given to the user in a digested form, possibly with 
some analysis done or important data highlighted.  These methodologies and 
techniques for the study of  software performance need to be shared.  This 
issue of IEEE Software will provide a forum for recent developments in software
methods and techniques that serve to expedite performance analysis.
 
This is a call for papers on performance analysis tools and reports of 
particular software studies presented as a systematic approach to measurement 
and analysis of software performance.  Papers describing both case studies of 
general importance and tools of proven effectiveness are solicited.  In 
particular, innovative methodologies and novel presentations of performance 
data that have a potential for wide application are sought.
 
 Submit 6 copies of manuscript by February 1, 1991 to:
 
 	Kathleen M. Nichols	or	Paul Oman
 	Apple Computer, Inc.		Computer Science Dept.
 	20525 Mariani Avenue		College of Engineering
 	M/S 76-3K		        University of Idaho
 	Cupertino, CA 95014		Moscow, ID 83843
 	(408) 974-1136		        (208) 885-6589
 	nichols@apple.com		oman@ted.cs.uidaho.edu
 
 For further information, contact Kathleen or Paul at the above addresses.

bertrand@eiffel.UUCP (Bertrand Meyer) (12/25/90)

From <1990Dec19.102005.11830@engin.umich.edu>
by zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy):

> Every textbook that I have read, and every software engineering professor
> that I have spoken with has emphasized the right-hand side of this graph,
> but they have all neglected the left-hand side.  The impression that I 
> have gotten from most professionals in the software trade is something
> like: "Speed isn't important.  The hardware will speed up and take care
> of our inefficiencies."

This is quite interesting. I don't remember ever reading any such
comment. In fact, what comes to mind is one of the classic books
about algorithm design (Aho, Hopcroft and Ullman's ``The Design and
Analysis of Computer Algorithms'', Addison-Wesley, 1974) and
its famous introductory chapter explaining that faster machines
make efficient algorithms more crucial, not less.

I also do not recall meeting any ``software professionals'' who
were not concerned about efficiency in both time and space.

Coming back to the published literature, here is a question
to Mr. McCarthy: can you quote a publication in the field
which states that ``speed is not important''?

-- Bertrand Meyer
bertrand@eiffel.com

davecb@yunexus.YorkU.CA (David Collier-Brown) (12/27/90)

From <1990Dec19.102005.11830@engin.umich.edu>
by zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy):
[...]						 The impression that I 
| have gotten from most professionals in the software trade is something
| like: "Speed isn't important.  The hardware will speed up and take care
| of our inefficiencies."

bertrand@eiffel.UUCP (Bertrand Meyer) writes:
| This is quite interesting. I don't remember ever reading any such
| comment. In fact, what comes to mind is one of the classic books
| about algorithm design (Aho, Hopcroft and Ullman's ``The Design and
| Analysis of Computer Algorithms'', Addison-Wesley, 1974) and
| its famous introductory chapter explaining that faster machines
| make efficient algorithms more crucial, not less.

  Ah yes, the snake-oil problem once again!  
  Many moderately experienced people in the commercial world believe that
optimization is the work of the devil.  Not premature optimization, not
careful consideration of algorithm speed at design time, but **all**
optimization, all consideration of time/space tradeoffs.

  This comes from a basic misunderstanding of the purpose of such
consideration, which in turn comes mostly (IMHO) from rote-learning.  And
that learning comes variously from the ``become a programmer in 9 months''
courses, universities with weak CS departments and, most notably, from
people who put on ``upgrading'' courses for companies.
  I've personal experience with the latter: I had to bowdlerize a course
**I** was giving from someone else's lesson plans because it was full of
plausible, malicious lies[1]. 

  In some companies (Xanaro, Geac, Honeywell CCSC, etc), they were recognized
as oversimplifications. In others (Honeywell TSDC, NDX, etc), they were
believed.  The latter group of companies had **serious** deliverability
problems.

  One of the best, and also most memorable, restatements came from Dick
McMurray and Ashok Patel of Xanaro:

	First you make it right.
	Then you make it fast.
	Then you make it small.


| I also do not recall meeting any ``software professionals'' who
| were not concerned about efficiency in both time and space.

	Alas, your use of quotation marks[2] was exactly correct: those
who are not and never will be professional will quote whatever
cant is to their personal advantage, no matter who they harm in the
process.  Those who are will worry about correctness, timeliness and
efficiency.

--dave
[1] PML's are actually what you use to test programs with. Alas,
	some people assume that they're the same as simplifying
	assumptions.
[2] For those who don't know, quotation marks around a non-quotation
	are to emphasize that the speaker **does not** wish the reader
	to think he believes that quoted statement. For ``software
	professional'' read snake-oil salesman.
-- 
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave or just
Willowdale, Ontario,  | postmaster@{nexus.}yorku.ca
CANADA. 416-223-8968  | work phone (416) 736-5257 x 22075

foster@gdc.portal.com (Sharon Foster) (12/27/90)

In article <470@eiffel.UUCP>, bertrand@eiffel.UUCP (Bertrand Meyer) writes:
> by zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy):
> 
>> Every textbook that I have read, and every software engineering professor
>> that I have spoken with has emphasized the right-hand side of this graph,
>> but they have all neglected the left-hand side.  The impression that I 
>> have gotten from most professionals in the software trade is something
>> like: "Speed isn't important.  The hardware will speed up and take care
>> of our inefficiencies."
> 
> This is quite interesting. I don't remember ever reading any such
> comment. In fact, what comes to mind is one of the classic books
> about algorithm design (Aho, Hopcroft and Ullman's ``The Design and
> Analysis of Computer Algorithms'', Addison-Wesley, 1974) and
> its famous introductory chapter explaining that faster machines
> make efficient algorithms more crucial, not less.
> 
> I also do not recall meeting any ``software professionals'' who
> were not concerned about efficiency in both time and space.
> 
> Coming back to the published literature, here is a a question
> to Mr. McCarthy: can you quote a publication in the field
> which states that ``speed is not important''?
> 
> -- Bertrand Meyer
> bertrand@eiffel.com

How about Michael A. Jackson's two rules of optimization:

Rule 1: Don't do it.

Rule 2: Don't do it yet.

from _Principles of Program Design_, chapter 12.


-- 
/************************************************************************/
/* Sharon Foster....First Generation Trekkie  *  Internet:              */
/* "So many books, so little time!"           *   foster@gdc.portal.com */
/************************************************************************/

locke@nike.paradyne.com (Richard Locke) (01/05/91)

In article <470@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:
>From <1990Dec19.102005.11830@engin.umich.edu>
>by zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy):

>> ...The impression that I 
          ^^^^^^^^^^
>> have gotten from most professionals in the software trade is something
>> like: "Speed isn't important.  The hardware will speed up and take care
>> of our inefficiencies."
>
>This is quite interesting. I don't remember ever reading any such
>comment....
>Coming back to the published literature, here is a a question
>to Mr. McCarthy: can you quote a publication in the field
>which states that ``speed is not important''?

I think Mr. Mccarthy [sic] was just exaggerating a bit for the sake
of discussion.  One might restate things as "Good design and coding
practices are more important than execution speed, in part because
hardware will speed up..."

One could get Mccarthy's impression even by a superficial glance at Meyer's
book, where the quality factors of correctness, robustness, extendibility,
reusability, and compatibility are showcased.  Execution speed is
group with "other qualities" -- though "good use of...resources...
is of course an essential requirement...." ;-)  Anyway, that's really
the point -- competent professionals will always strive to make
good use of resources.

In some applications execution speed is clearly critical and needs to
be considered as an integral part of the requirements; not so in
others.  If you want to read the comments of a real speed freak,
check out Bill Gates in "Programmers at Work".  When he started out
programming on slow machines, he clearly was correct in placing
strong emphasis on speed.  Such an obsession is obviously uncalled for
in a minor and infrequently used feature of a system running on a fast
processor.  It all depends on what you're trying to accomplish.

Boris Beizer (author of "Software Testing Techniques") has some
inflammatory things to say about performance.  In his test class
notes, he says:

	good algorithms far more important than code details
	programmer's perspective on performance usually wrong
	do it straight and simple and redo it if it turns out
		to be a problem
	justify all horrors in the name of performance by a formal
		quantitative, statistically valid, in-context, model
	no module accounts for more than 5% of execution time yet
		every programmer "optimizes" his code (anecdote)
	programmers confuse the need for path with the execution
		probability of the path and so all paths are
		"optimized"
	execution time is still being preached and trained even
		though it is important only in exceptional cases
	detailed modeling and measurement of software shows that
		programmers haven't the slightest knowledge of
		the performance impact (or lack thereof) of their
		program

Boris says the three best ways to destroy quality are to
squeeze the schedule, squeeze the code (memory), and squeeze
execution time.


-dick

flint@gistdev.gist.com (Flint Pellett) (01/08/91)

>From <1990Dec19.102005.11830@engin.umich.edu>
>by zarnuk@caen.engin.umich.edu (Paul Steven Mccarthy):

>> Every textbook that I have read, and every software engineering professor
>> that I have spoken with has emphasized the right-hand side of this graph,
>> but they have all neglected the left-hand side.  The impression that I 
>> have gotten from most professionals in the software trade is something
>> like: "Speed isn't important.  The hardware will speed up and take care
>> of our inefficiencies."

If there is someone who says they aren't concerned with efficiency of the
algorithms, then I'd have to wonder how experienced that person is in the
real world of business.  I've noted two things:

1.  In large multi-user systems, the power of the system often does
not expand any faster than the demands on it do.  (The day you double
the power of the system is the same day the administrators decide that
the system is capable of supporting 2.0 times as many users at once. 
(Since income is often based on number of users, this is a natural
reaction.)  It is also the same day that the users discover that now
they can run more programs at once, or more complex programs: the work
the machine is asked to do expands to consume the power available.) 
It's a little like conserving gas: if you knew your gas supplies were
always increasing, you might argue that nobody should care if all the
cars made only get 12 miles per gallon.  But if you try to increase the
number of cars being run at once faster than the gas supply is increasing,
you're headed for a shortage.  If you're smart, you'll work on both ends:
increasing the supply, (the cpu power available) and reducing the demand
(the cpu power consumed by each program.)

2.  You have to compete with the rest of the world.  For example, if
it takes you 3 days of computing to figure out what the weather is
going to be like tomorrow, the results aren't useful at all.  You then
have two choices: wait until someone develops a machine that is 4
times faster that you can afford, or improve your algorithm so that it
can produce results before they are obsolete.  You won't stay in
business very long if you decide to wait for better hardware when your
competition is improving their algorithms. 
-- 
Flint Pellett, Global Information Systems Technology, Inc.
1800 Woodfield Drive, Savoy, IL  61874     (217) 352-1165
uunet!gistdev!flint or flint@gistdev.gist.com

gbeary@uswat.uswest.com (Greg Beary) (01/08/91)

In article <826@gdc.portal.com> foster@gdc.portal.com (Sharon Foster) writes:

.... Stuff Deleted 
>> which states that ``speed is not important''?
>> 
>> -- Bertrand Meyer
>> bertrand@eiffel.com
>
>How about Michael A. Jackson's two rules of optimization:
>
>Rule 1: Don't do it.
>
>Rule 2: Don't do it yet.
>
>from _Principles of Program Design_, chapter 12.
>
>

My personal experience in realtime systems is just the opposite, and
just as misleading.  When working for an IBM-compatible hardware company,
the disk folks were selecting new microprocessors.  The criterion was 
"the fastest is the best". 

This begged the question, "how fast does it have to be". The answer was
"as fast as we can get". I think we need to add a healthy dose of
engineering into this discussion, so we can understand "how fast does it
have to be". 



--
Greg Beary 				|  (303)889-7935
US West Advanced Technology  		|  gbeary@uswest.com	
6200 S. Quebec St.         		| 
Englewood, CO  80111			|

foster@gdc.portal.com (Sharon Foster) (01/08/91)

In article <14141@uswat.UUCP>, gbeary@uswat.uswest.com (Greg Beary) writes:
> In article <826@gdc.portal.com> foster@gdc.portal.com (Sharon Foster) writes:
> 
> .... Stuff Deleted 
>>> which states that ``speed is not important''?
>>> 
>>> -- Bertrand Meyer
>>> bertrand@eiffel.com
>>
>>How about Michael A. Jackson's two rules of optimization:
>>
>>Rule 1: Don't do it.
>>
>>Rule 2: Don't do it yet.
>>
>>from _Principles of Program Design_, chapter 12.
>>
>>
> 
> My personal experience in realtime systems is just the opposite, and
> just as mis-leading. When working for a IBM-compatible hardware company
> the Disk-folks were selecting new micro-processors. The criteria was 
> "the fastest is the best". 
> 
> This begged the question, "how fast does it have to be". The answer was
> "as fast as we can get". I think we need to add a healthy dose of
> engineering into this discussion, so we can understand "how fast does it
> have to be". 
> 
> 
> 
> --
> Greg Beary 				|  (303)889-7935
> US West Advanced Technology  		|  gbeary@uswest.com	
> 6200 S. Quebec St.         		| 
> Englewood, CO  80111			|

Well, I didn't say I agreed with him, but I also work in real-time
systems, and I do try to adhere to Rule #2.  Actually, my rule is really
"First you make it right, then you make it fast."
-- 
/************************************************************************/
/* Sharon Foster....First Generation Trekkie  *  Internet:              */
/* "So many books, so little time!"           *   foster@gdc.portal.com */
/***** These are my own Biased Personal Opinions and no one else's! *****/

EGNILGES@pucc.Princeton.EDU (Ed Nilges) (01/08/91)

In article <15809.278a0d04@levels.sait.edu.au>, xtbjh@levels.sait.edu.au writes:

>
>I agree strongly that software is an engineering discipline.  Any program
>is a trade off between many competing forces, including correctness,
>reliability, reusability, maintainability as well as speed, memory, disk etc.

Hmph?  I am not sure that you can "trade off" various levels of
correctness.  A software system is either correct, or else it isn't.
Also, correctness UNDERLIES reliability, reusability, maintainability
and even speed: an incorrect program is ipso facto unreliable, com-
petent programmers are loth to reuse incorrect software, and when you
maintain such software your first job is to make it correct...only
then should you make changes to it.  As to speed, you shouldn't even
worry about it until you have a correct program, and in my experience
minor uncorrected bugs can in certain circumstances be a drag on speed.

I know that in the so-called real world "today is more important than
correct."  Okay, ship with minor bugs if your customer says ship (and
buy malpractice insurance.)  But just as Richard Slomka, in NO NONSENSE
MANAGEMENT says that the fact that "the numbers" aren't perfect means
you should not stop your search for correct numbers, the fact that
real software almost always has some bugs does not mean that bug removal
is not an honorable calling, and that correctness is not the goal.

xtbjh@levels.sait.edu.au (01/09/91)

In article <1047@gistdev.gist.com>, flint@gistdev.gist.com (Flint Pellett) writes:

[...]
> 
> If there is someone who says they aren't concerned with efficiency of the
> algorithms, then I'd have to wonder how experienced that person is in the
> real world of business.  I've noted two things:
> 
> 1.  In large multi-user systems, the power of the system often does
> not expand any faster than the demands on it do. [...]
> 
> 2.  You have to compete with the rest of the world. [...] 

I agree strongly that software is an engineering discipline.  Any program 
is a trade off between many competing forces, including correctness, 
reliability, reusability, maintainability as well as speed, memory, disk etc.

Our main product is a fuel management system that sits alongside petrol 
pumps in refuelling bays.  Each time we ship a system, we have to pay for 
the parts and assembly of the hardware, whereas we can copy the software 
essentially for free.  Since the market is very price-sensitive and 
competition is fairly fierce, we choose to trade high-powered hardware for 
more effort on software, which in turn forces attention onto the performance 
of the algorithms.  

The art of producing "good" systems will always turn out to be the art of 
engineering "good" interfaces (hardware, software, user).  Efficiency is 
always an issue in the technology involved in the interface and in the 
engine that is behind the interface.  Many existing interfaces - and in 
particular procedural languages such as Algol, C etc - are quite inefficient 
since they force parallel tasks into a series of sequential operations.  Any 
interface overhead (procedure call, opcode fetch, signal propagation) 
is then magnified many times.

> -- 
> Flint Pellett, Global Information Systems Technology, Inc.
> 1800 Woodfield Drive, Savoy, IL  61874     (217) 352-1165
> uunet!gistdev!flint or flint@gistdev.gist.com

--
Brenton Hoff (behoffski)		xtbjh@levels.sait.edu.au
Senior Software Engineer
Transponder Australia

dave@cs.arizona.edu (Dave P. Schaumann) (01/09/91)

In article <15809.278a0d04@levels.sait.edu.au> xtbjh@levels.sait.edu.au writes:
> [...]
>
>I agree strongly that software is an engineering discipline.

I agree too.  Of course, programming also includes aspects of science and
mathematics.  (Just my $.02...)

>--
>Brenton Hoff (behoffski)		xtbjh@levels.sait.edu.au
>Senior Software Engineer
>Transponder Australia


Dave Schaumann		| You are in a twisty maze of little
dave@cs.arizona.edu	| C statements, all different.

Chris.Holt@newcastle.ac.uk (Chris Holt) (01/09/91)

In article <12212@pucc.Princeton.EDU> EGNILGES@pucc.Princeton.EDU writes:
>In article <15809.278a0d04@levels.sait.edu.au>, xtbjh@levels.sait.edu.au writes:
>>
>>I agree strongly that software is an engineering discipline.  Any program
>>is a trade off between many competing forces, including correctness,
>>reliability, reusability, maintainability as well as speed, memory, disk etc.
>
>Hmph?  I am not sure that you can "trade off" various levels of
>correctness.  A software system is either correct, or else it isn't.

Not quite right.  Case 1: the results of a system are not critical,
or the time between production of results and their use is long; and
it is feasible to determine by other means whether or not they are
right, once they are produced.  Then, you chug away, and every time
you check and find one wrong, you sigh and use some different longer
algorithm instead.

Case 2: a real time system has a choice of different algorithms
that yield different levels of accuracy, or an iterative procedure
is used that converges on the "right" answer.  Then as soon as
you need an answer of some sort, you take the most accurate one
available; here you trade off time and accuracy.

You might argue that in the latter case, the time/accuracy
tradeoff is part of the specification, and so the program as
a whole is correct; but the real world isn't very good at
formal specifications that incorporate several attributes of
this kind, much less specifications of tradeoffs.
-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "Between the dark and the daylight, when the net is beginning to lower..."

tgl@g.gp.cs.cmu.edu (Tom Lane) (01/10/91)

In article <14141@uswat.UUCP>, gbeary@uswat.uswest.com (Greg Beary) writes:
> >
> >How about Michael A. Jackson's two rules of optimization:
> >Rule 1: Don't do it.
> >Rule 2: Don't do it yet.
> >from _Principles of Program Design_, chapter 12.
> 
> My personal experience in realtime systems is just the opposite, and
> just as mis-leading. When working for a IBM-compatible hardware company
> the Disk-folks were selecting new micro-processors. The criteria was 
> "the fastest is the best". 
> 
> This begged the question, "how fast does it have to be". The answer was
> "as fast as we can get". I think we need to add a healthy dose of
> engineering into this discussion, so we can understand "how fast does it
> have to be". 

Amen.  Here's a relevant engineering opinion:

P. J. Plauger has an excellent article on just this topic in the Jan '91
issue of _Embedded Systems Programming_.  He puts forward something he
calls the Seventy Percent Rule:

	The cost of programming any resource rises rapidly once you
	use up about 70% of that resource.

For "resource" read "available CPU time", "disk space", etc.  You spend
your time in trying to shave usage, improving resource-exhaustion error
recovery algorithms, and so on.  As you get closer to 100% utilization,
development cost and system unreliability increase exponentially.

A corollary to this is that you need to choose an adequately powerful
mechanism: fast CPU chip, large disk, etc.  Plauger illustrates this
with the issue of operations that have to be performed within some
specified time after receipt of a trigger signal.  Among the possible
software mechanisms (assuming a Unix-like OS) are:
  1. Interrupt handlers inside the kernel;
  2. A waiting process, locked in core;
  3. A process ready to run, but possibly swapped out;
  4. A program (or even shell script) to be started on demand.
Plauger says you should "choose the slowest mechanism that is fast
enough.  By fast enough, I imply that you allow plenty of headroom, in
accordance with the 70% rule."  This gives you the easiest-to-program
alternative, because (a) you don't use a more restrictive mechanism than
necessary, and (b) you don't get into the expensive close-to-the-
resource-limit style of programming.  Picking a level that is just a bit
too slow is a common mistake: it looks like it will work, but the costs
of programming at 95% (or 105%) of the resource limit will kill you.
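
To make mechanism 2 concrete, a waiting process locked in core might be
sketched as below.  These are today's POSIX calls, offered only as an
illustration of the mechanism, not as what Plauger had in mind:

/* Sketch of "a waiting process, locked in core": lock the image into
 * memory, then block waiting for a trigger signal. */
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    sigset_t set;
    int sig;

    /* Keep every page resident so the response path never page-faults. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }

    /* Block the trigger signal and wait for it synchronously. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, NULL);

    for (;;) {
        if (sigwait(&set, &sig) == 0 && sig == SIGUSR1) {
            /* Respond to the trigger here, well inside the deadline. */
            puts("trigger handled");
        }
    }
}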

I'm not sure where Plauger got the 70% figure from; he cites no
references.  Maybe the curve doesn't really take off until 90%; or maybe
you want to figure 50% to give yourself some room for estimation error.
In any case the general principle is surely correct.

This can actually be read as supporting Jackson's rules against
optimization, if you define "optimization" to mean the kind of marginal
fiddling that people tend to do when they are close to a resource limit.
Plauger is advocating intelligent consideration of performance
requirements in the initial system design, so as to *avoid the need for*
that kind of optimization.  Unfortunately, as a previous poster already
pointed out, some people think proper design means no consideration of
performance at any time.  That is a recipe for failure.

It's not clear to me how Plauger's principle applies in situations where
there's not a hard resource limit.  In real-time-response systems there's
usually a very clear speed threshold: when you miss incoming data, or
fail to stop the conveyor belt in time, you're too slow.  In systems
where 5% slower than the originally specified speed is not catastrophic,
the exponential effort curve may not apply.  Any comments?

-- 
				tom lane
Internet: tgl@cs.cmu.edu
UUCP: <your favorite internet/arpanet gateway>!cs.cmu.edu!tgl
BITNET: tgl%cs.cmu.edu@cmuccvma
CompuServe: >internet:tgl@cs.cmu.edu

gbeary@uswat.uswest.com (Greg Beary) (01/10/91)

In article <11541@pt.cs.cmu.edu> tgl@g.gp.cs.cmu.edu (Tom Lane) writes:
>In article <14141@uswat.UUCP>, gbeary@uswat.uswest.com (Greg Beary) writes:
>> >
>> >How about Michael A. Jackson's two rules of optimization:
>> >Rule 1: Don't do it.
>> >Rule 2: Don't do it yet.
>> >from _Principles of Program Design_, chapter 12.
>> 
>> My personal experience in realtime systems is just the opposite, and
>> just as mis-leading. When working for a IBM-compatible hardware company
>> the Disk-folks were selecting new micro-processors. The criteria was 
>> "the fastest is the best". 
>> 
>> This begged the question, "how fast does it have to be". The answer was
>> "as fast as we can get". I think we need to add a healthy dose of
>> engineering into this discussion, so we can understand "how fast does it
>> have to be". 
>
>Amen.  Here's a relevant engineering opinion:
>
>P. J. Plauger has an excellent article on just this topic in the Jan '91
>...lines deleted
>	The cost of programming any resource rises rapidly once you
>	use up about 70% of that resource.
>
>
>I'm not sure where Plauger got the 70% figure from; he cites no
>references.  Maybe the curve doesn't really take off until 90%; or maybe
>you want to figure 50% to give yourself some room for estimation error.
>In any case the general principle is surely correct.
>

Many years ago I did a bit of capacity planning for large data centers.
The 75% mark was noted as a decision point. It was at that
point in time that you had better be planning what your next upgrade was
going to be. By the time you reached 85% utilization the 
problem was very apparent to your users (spikes took utilization above 85%).
As you approached 90-95% utilization things got really bad, and above 
that they got to be damn near ridiculous. I don't know where Plauger
got his numbers, but there was lots of empirical data around the 
capacity planning world to support the idea. 

I return to my central point: whether it be real-time or commercial 
software, most designers have no idea what the resource constraints 
are.  Further, it's been my experience that performance estimation is 
difficult and not much fun, so people don't do it.  The folks I knew 
chickened out and just assumed that if they didn't know how fast it
had to be...buying the fastest available was the best they could do
(i.e., no one would be able to moan about the choice).

This is similar to buying a Ferrari to commute to work in... if late,
you might want to get there in a hurry. This allows for three results:
You may never be late. The Ferrari, at 160 mph, gets you there. Or,
you're	2 hours late and even the fastest car can't get you there on
time. 

Most of us won't opt for such an expensive solution. It's funny though
when other people's money is on the line (ala shareholders) the same 
frugality doesn't seem to be shown.  If we want to be treated like
engineers we ought to act like engineers. 


--
Greg Beary 				|  (303)889-7935
US West Advanced Technology  		|  gbeary@uswest.com	
6200 S. Quebec St.         		| 
Englewood, CO  80111			|

rtm@christmas.UUCP (Richard Minner) (01/12/91)

In article <11541@pt.cs.cmu.edu> tgl@g.gp.cs.cmu.edu (Tom Lane) writes:
>P. J. Plauger ... the Seventy Percent Rule:
>	The cost of programming any resource rises rapidly once you
>	use up about 70% of that resource.
>I'm not sure where Plauger got the 70% figure from; he cites no
>references.  ...

I'd guess he used the Dave Barry method: "based on a study in which
[he] wrote down numbers until one of them looked about right."

But seriously, Plauger says "about 70%", and I don't doubt he
has enough experience to make such general claims.  It looks
about right to me.  Good Rule -- I'll remember it.
-- 
Richard Minner  rtm@island.COM  {uunet,sun,well}!island!rtm
Island Graphics Corporation  Sacramento, CA  (916) 736-1323

bernie@metapro.DIALix.oz.au (Bernd Felsche) (01/12/91)

In <12212@pucc.Princeton.EDU> EGNILGES@pucc.Princeton.EDU (Ed Nilges) writes:

[.. notes about "CORRECTNESS" deleted ]
>                               ...   As to speed, you shouldn't even
>worry about it until you have a correct program, and ....

I beg to differ!  I don't believe one line of code should be 
written until the most efficient method of calculating the
correct result has been determined.

As with all engineering problems, there is always more than
one solution to a problem.  If you don't seriously consider at least
half a dozen significantly different ones, then you are doing
yourself and your client a dis-service.

Far too much code gets generated which produces the "right" numbers,
but takes forever to crunch.  Some more thought about the problem,
before the keyboard got bashed, would have revealed much more 
efficient and manageable code, giving results in a far shorter time.

I have seen code to drive graphics displays, which used trig
functions to compute pixel positions for drawing circles.  (OK,
that's given away that I'm not exactly new to this game.) There 
is nothing wrong with this, but a more efficient, sufficiently 
correct solution is obvious to anybody skilled in the art.
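
Presumably the kind of solution meant here is the integer midpoint
(Bresenham-style) circle algorithm, which needs no trig and no floating
point at all.  A sketch in C, with plot() standing in for whatever the
display driver actually provides:

/* Integer midpoint circle algorithm: compute one octant, get the other
 * seven by symmetry.  plot() is a placeholder for the real driver. */
#include <stdio.h>

static void plot(int x, int y)
{
    printf("pixel (%d,%d)\n", x, y);
}

static void octants(int cx, int cy, int x, int y)
{
    plot(cx + x, cy + y);  plot(cx - x, cy + y);
    plot(cx + x, cy - y);  plot(cx - x, cy - y);
    plot(cx + y, cy + x);  plot(cx - y, cy + x);
    plot(cx + y, cy - x);  plot(cx - y, cy - x);
}

void midpoint_circle(int cx, int cy, int r)
{
    int x = 0, y = r;
    int d = 1 - r;                      /* integer decision variable */

    while (x <= y) {
        octants(cx, cy, x, y);
        if (d < 0) {
            d += 2 * x + 3;             /* midpoint inside: keep y */
        } else {
            d += 2 * (x - y) + 5;       /* midpoint outside: step y inward */
            y--;
        }
        x++;
    }
}

int main(void)
{
    midpoint_circle(0, 0, 5);
    return 0;
}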

Often the excuse is:  The client wants it now.  But consider the
fact that what the client gets will be an inferior product.  Put
yourself in the client's position, and see how much confidence it
inspires!  (Gee, why'd I have to buy a Stardent when something much
like it runs quicker on my Sinclair QL?)

QUALITY is more than just getting the right numbers. More EFFICIENT
code is not always less reliable.
-- 
______________________________Bernd_Felsche______________________________
Metapro Systems, 328 Albany Highway, Victoria Park Western Australia 6100
Phone: +61 9 362 9355   Fax: +61 9 472 3337   bernie@metapro.DIALix.oz.au

rlk@telesoft.com (Bob Kitzberger @sation) (01/13/91)

In article <11541@pt.cs.cmu.edu>, tgl@g.gp.cs.cmu.edu (Tom Lane) writes:
>
> Plauger says you should "choose the slowest mechanism that is fast
> enough.  By fast enough, I imply that you allow plenty of headroom, in
> accordance with the 70% rule."  

While arguably a worthwhile heuristic, Plauger's approach still begs 
the question of how fast the system should be.  Seat-of-the-pants
methods won't get us reliable mission-critical systems of the scale
of the new air-traffic control system or Space Station.

Anybody using Rate Monotonic Scheduling for filling hard real-time 
requirements?  Any success?
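
For anyone unfamiliar with it, the core of RMS is the Liu & Layland
utilization bound: n periodic tasks with rate-assigned priorities are
guaranteed schedulable if total utilization stays below n*(2^(1/n) - 1).
A sketch of that (sufficient, not necessary) test in C, with a made-up
task set:

/* Liu & Layland schedulability test for rate monotonic scheduling.
 * Compile with -lm; the task set in main() is purely illustrative. */
#include <math.h>
#include <stdio.h>

struct task {
    double exec_time;   /* worst-case execution time */
    double period;      /* period, in the same units */
};

int rms_schedulable(const struct task *t, int n)
{
    double u = 0.0, bound;
    int i;

    for (i = 0; i < n; i++)
        u += t[i].exec_time / t[i].period;
    bound = n * (pow(2.0, 1.0 / n) - 1.0);   /* tends toward ~0.693 */
    printf("utilization %.3f, bound %.3f\n", u, bound);
    return u <= bound;
}

int main(void)
{
    struct task set[] = { { 5.0, 20.0 }, { 10.0, 50.0 }, { 20.0, 100.0 } };

    printf("schedulable by the sufficient test: %s\n",
           rms_schedulable(set, 3) ? "yes" : "no");
    return 0;
}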

	.Bob.
-- 
Bob Kitzberger               Internet : rlk@telesoft.com
TeleSoft                     uucp     : ...!ucsd.ucsd.edu!telesoft!rlk
5959 Cornerstone Court West, San Diego, CA  92121-9891  (619) 457-2700 x163
------------------------------------------------------------------------------
"Civilization rests on two things," said Hitzig; "the discovery that
fermentation produces alcohol, and voluntary ability to inhibit defecation."

				-- Robertson Davies, "The Rebel Angels"

petergo@microsoft.UUCP (Peter GOLDE) (01/13/91)

In article <12212@pucc.Princeton.EDU> EGNILGES@pucc.Princeton.EDU writes:
>In article <15809.278a0d04@levels.sait.edu.au>, xtbjh@levels.sait.edu.au writes:
>
>Hmph?  I am not sure that you can "trade off" various levels of
>correctness.  A software system is either correct, or else it isn't.
>Also, correctness UNDERLIES reliability, reusability, maintainability
>and even speed: an incorrect program is ipso facto unreliable, com-
>petent programmers are loth to reuse incorrect software, and when you
>maintain such software your first job is to make it correct...only
>then should you make changes to it.  As to speed, you shouldn't even
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>worry about it until you have a correct program, and in my experience
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>minor uncorrected bugs can in certain circumstances be a drag on speed.

I find statements like this bizarre, especially in a group devoted
to software engineering.  Essentially you are saying that one of the
key requirements of your software (speed) should be completely ignored
until after implementation is complete.  This is contrary to virtually
all software engineering advice that I have ever heard.  You have to
consider ALL the requirements, including performance, during the design
and implementation of software.  In my experience, hacking in either
performance or correctness "after the fact" is very difficult.

--Peter Golde
Microsoft Corp.
petergo%microsoft@uunet.uu.net

dave@cs.arizona.edu (Dave P. Schaumann) (01/14/91)

In article <1991Jan12.093726.4935@metapro.DIALix.oz.au> bernie@metapro.DIALix.oz.au (Bernd Felsche) writes:
>In <12212@pucc.Princeton.EDU> EGNILGES@pucc.Princeton.EDU (Ed Nilges) writes:
>
>[.. notes about "CORRECTNESS" deleted ]
>>                               ...   As to speed, you shouldn't even
>>worry about it until you have a correct program, and ....
>
>I beg to differ!  I don't believe one line of code should be 
>written until the most efficient method of calculating the
>correct result has been determined.
>
>As with all engineering problems, there is always more than
>one solution to a problem.  If you don't seriously consider at least
>half a dozen significantly different ones, then you are doing
>yourself and your client a dis-service.

I would have to disagree.  In a situation where you have no deadline for
completion, you might be able to do this.  You might never finish, either.  In the
real world (even in school) you have intense pressure to get it done.
A bubble sort that works now is better than a heap sort (or maybe a quick sort?
or maybe some other sort?) that will work in a month or a year.  (Yes, I
realize this is a contrived situation).

Writing good code involves a thorough understanding of the problem.
Unfortunately, you don't really have a complete understanding of a
(non-trivial) problem until you've written a program for it.  It's been my
experience that the best way to write programs is to use 20-20 hindsight --
that is, write something that works, look at it, and say "how can this be made
better?"

If you try to see the best solution without writing any code, you could get
hopelessly mired in your problem.  How do you know you have the best solution?
You really can't know how good a piece of code is until you see it in the
context of the rest of the program.  Maybe you'll discover you will never need
to sort more than 10 items, in which case a bubble sort will be perfectly
adequate, and using an O(n log n) sort would be overkill.

>Far too much code gets generated which produces the "right" numbers,
>but takes forever to crunch.  Some more thought about the problem,
>before the keyboard got bashed, would have revealed much more 
>efficient and manageable code, giving results in a far shorter time.

I agree that it is easy to implement the first thing that pops into your head.
Thinking about different ways to approach a problem is a Good Thing.  But
searching for *the best* solution before you write any code is impossible.

>______________________________Bernd_Felsche______________________________
>Metapro Systems, 328 Albany Highway, Victoria Park Western Australia 6100
>Phone: +61 9 362 9355   Fax: +61 9 472 3337   bernie@metapro.DIALix.oz.au

Dave Schaumann      | We've all got a mission in life, though we get into ruts;
dave@cs.arizona.edu | some are the cogs on the wheels, others just plain nuts.
						-Daffy Duck.

price@helios.unl.edu (Chad Price) (01/14/91)

petergo@microsoft.UUCP (Peter GOLDE) writes:

>In article <12212@pucc.Princeton.EDU> EGNILGES@pucc.Princeton.EDU writes:
>>In article <15809.278a0d04@levels.sait.edu.au>, xtbjh@levels.sait.edu.au writes:
>>
>>Hmph?  I am not sure that you can "trade off" various levels of
>>correctness.  A software system is either correct, or else it isn't.
>>Also, correctness UNDERLIES reliability, reusability, maintainability
>>and even speed: an incorrect program is ipso facto unreliable, com-
>>petent programmers are loth to reuse incorrect software, and when you
>>maintain such software your first job is to make it correct...only
>>then should you make changes to it.  As to speed, you shouldn't even
>                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>worry about it until you have a correct program, and in my experience
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>minor uncorrected bugs can in certain circumstances be a drag on speed.

>I find statements like this bizarre, especially in a group devoted
>to software engineering.  Essentially you are saying that one of the
>key requirements of your software (speed) should be completely ignored

much criticism deleted. 

It seems that some gross assumptions about the original poster's meaning
are being made here. 

Given that one has decided upon an algorithm which is 1) correct for the
purpose, and 2) the most efficient that one can find; there are still
many ways to code that algorithm.

I agree that efficiency should not be considered until after coding is
finished and a correct result is apparent (via testing ...). Then one
should take the algorithm and analyze the method by which it is
implemented and see if there are better ways to code the same
algorithm. 

Thus these steps seem the best way to approach the problem:

1) Find the best algorithm for the purpose 
2) Code it in the most obviously simple and reliable way 
3) If that works correctly, optimize the resultant code.

This method does NOT imply that no prior thought has gone into the
question of efficiency and speed, only that the best algorithm can still
be coded in different ways, some of which may be noticably faster than
others.

A case in point occurred some years ago when I was working on a Pick
system coding in Pick Basic, which has static arrays as well as dynamic
(undimensioned) arrays which locate data based on delimiters embedded in
the arrays.  My initial coding of a problem produced correct results, but
ran slowly using the dynamic arrays.  By dimensioning the arrays at compile
time, the run-time of the program decreased by an order of magnitude,
using the same algorithm. All that was changed was the way in which the
data structure was declared (or not as the case turned out).
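
The effect is easy to see outside Pick, too.  Here is a rough C analogue
(the Pick internals themselves are not shown in this post): a "dynamic
array" reference has to rescan the record for delimiters on every lookup,
while a dimensioned array is indexed directly.

/* Illustrative only: delimiter-scanned access versus direct indexing.
 * The record, the '^' attribute mark, and the field values are made up. */
#include <stdio.h>
#include <string.h>

/* Find field i (0-based) in a delimited record by scanning, the way a
 * dynamic-array reference must.  The cost grows with i on every access. */
static const char *scan_field(const char *record, int i, int *len)
{
    const char *p = record, *end;

    while (i-- > 0) {
        p = strchr(p, '^');
        if (p == NULL)
            return NULL;
        p++;
    }
    end = strchr(p, '^');
    *len = end ? (int) (end - p) : (int) strlen(p);
    return p;
}

int main(void)
{
    const char *record = "alpha^beta^gamma^delta";
    const char *dimensioned[] = { "alpha", "beta", "gamma", "delta" };
    const char *f;
    int len;

    /* "Dynamic" access: every reference rescans from the start. */
    f = scan_field(record, 2, &len);
    printf("scanned:     %.*s\n", len, f);

    /* "Dimensioned" access: split once, then index in constant time. */
    printf("dimensioned: %s\n", dimensioned[2]);
    return 0;
}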

chad price
price@fergvax.unl.edu

rst@cs.hull.ac.uk (Rob Turner) (01/17/91)

Bernd Felsche writes:

>I beg to differ! I don't believe one line of code should be
>written until the most efficient method of calculating the
>correct result has been determined.

If we take this to extremes, no software would ever get written, as
for most problems there is no way of knowing whether you have the
"most efficient" (whatever that means) method of solution.

> ... If you don't seriously consider at least
>half a dozen significantly different ones, then you are doing
>yourself and your client a dis-service.

I'm not sure where you picked up "half a dozen" from, but I generally
agree with this. The number of possibilities must change considerably
from problem to problem, though.

>I have seen code to drive graphics displays, which used trig
>functions to compute pixel positions for drawing circles.

This is an interesting example. For graphic displays, speed is of the
essence (far more so than most pieces of software), so you should be
looking at the fastest answers even though it might mean not many
people can understand your code. I suspect "software engineering"
flies out of the window for tasks such as drawing circles. It only
gets applied at the higher levels like designing window managers.

You have to weigh the time it would take you to work out all the fancy
circle-plotting algorithms against the trigonometric solution, which
requires virtually no thought, as most people are aware of the
equations for drawing circles.

In a given time, more (admittedly slower) software will be written if
you choose the conventional, well-known or obvious techniques rather
than spending a lot of time making software which doesn't need to run
fast go like a bullet. The software will also be easier to understand.
The choice is yours.

With the advent of large software engineering environments, the
implementation of a lot of programming tasks is being taken out of the
hands of the programmer. People have been saying that we need code
reusability, and they have to realise the consequences. The
consequences are that half (could be any proportion, actually) the
code in your program will not have been implemented by you. You then
have to hope that the person who *did* code it made it fast enough for
your requirements. To make things worse, I suspect that most of the
code that already exists as libraries (or any fancy name you like)
within the environment will be very general purpose code (i.e. slow)
by definition, as it is going to be used in a lot of different
situations.

Rob

petergo@microsoft.UUCP (Peter GOLDE) (01/19/91)

In article <price.663815908@helios> price@helios.unl.edu (Chad Price) writes:
>Given that one has decided upon an algorithm which is 1) correct for the
>purpose, and 2) the most efficient that one can find; there are still
>many ways to code that algorithm.
>
>I agree that efficiency should not be considered until after coding is
>finished and a correct result is apparent (via testing ...). Then one
>should take the algorithm and analyze the method by which it is
>implemented and see if there are better ways to accomplish the same
>algorithm. 

I agree this can be a reasonable approach to some problems, especially
those with difficult or tricky algorithms that are small and localized. 
I myself have used this approach, for example by first coding a routine 
in C and then recoding it in assembly code later.  

However, this wasn't really the point I was addressing in my 
original post.  I was responding to a claim that one should not
consider efficiency until one is done writing a _program_.  Since
a program is likely to take at least a year to complete, this involves
going back to code not considered for a very long time, possibly
written by someone else, and speeding it up, without breaking any
other code in the program at the same time.  Since most real speedups
are obtained by changing data structures, not by minor code tweaks, this
results in the all-too-common nightmare of trying to find all the
places that a given change impacts.  At best, you have to retest the
whole program; at worst, you can't make any changes at all because
you can't find all the places that depend on it.  Abstract data
types and object-oriented techniques can partially alleviate these
problems, but it is better not to have the problems in the first place.
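
To make concrete the sort of thing I mean by abstract data types
helping, here is a rough sketch using a hypothetical symbol-table
module.  Callers see only the interface; the representation lives in
one file, so replacing the linear list with a hash table later is a
local change rather than a program-wide hunt.

/* ---- symtab.h: all that callers ever see ---- */
typedef struct symtab Symtab;           /* opaque: representation hidden */

Symtab *symtab_create(void);
int     symtab_insert(Symtab *t, const char *name, long value);
int     symtab_lookup(Symtab *t, const char *name, long *value);

/* ---- symtab.c: the only file that knows the data structure ---- */
#include <stdlib.h>
#include <string.h>

struct entry {
    struct entry *next;
    char          name[32];
    long          value;
};

struct symtab {         /* today a linear list; tomorrow, perhaps a hash table */
    struct entry *head;
};

Symtab *symtab_create(void)
{
    Symtab *t = (Symtab *)malloc(sizeof(Symtab));

    if (t != NULL)
        t->head = NULL;
    return t;
}

int symtab_insert(Symtab *t, const char *name, long value)
{
    struct entry *e = (struct entry *)malloc(sizeof(struct entry));

    if (e == NULL)
        return -1;
    strncpy(e->name, name, sizeof(e->name) - 1);
    e->name[sizeof(e->name) - 1] = '\0';
    e->value = value;
    e->next  = t->head;
    t->head  = e;
    return 0;
}

int symtab_lookup(Symtab *t, const char *name, long *value)
{
    struct entry *e;

    for (e = t->head; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0) {
            *value = e->value;
            return 0;
        }
    return -1;                          /* not found */
}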

This is the reason that I consider statements like "I won't worry
about efficiency now, I'll speed it up later" dangerous and 
unprofessional.  If speed (or space -- even harder to address after
the fact) is important then it is important enough to worry about
from the beginning.

--Peter Golde
petergo%microsoft@uunet.uu.net

stevebr@microsoft.UUCP (Steve BRANDLI) (01/19/91)

In article <70101@microsoft.UUCP> petergo@microsoft.UUCP (Peter GOLDE) writes:
>However, this wasn't really the point I was addressing in my 
>original post.  I was responding to a claim that one should not
>consider efficiency until one is done writing a _program_.  Since
>a program is likely to take at least a year to complete, this involves
>going back to code not considered for a very long time, possibly
>written by someone else, and speeding it up, without breaking any
>other code in the program at the same time.  Since most real speedups
>are obtained by changing data structures, not by minor code tweaks, this
>results in the all-too-common nightmare of trying to find all the
>places that a given change impacts.  At best, you have to retest the
>whole program; at worst, you can't make any changes at all because
>you can't find all the places that depend on it.  Abstract data
>types and object-oriented techniques can partially alleviate these
>problems, but it is better not to have the problems in the first place.
>
>This is the reason that I consider statements like "I won't worry
>about efficiency now, I'll speed it up later" dangerous and 
>unprofessional.  If speed (or space -- even harder to address after
>the fact) is important then it is important enough to worry about
>from the beginning.

I'm sure there are software projects that are understood well enough that
you know which sections of the code are worth optimizing.  But I haven't
worked on one (of decent size) yet where this was obvious.

I think a good developer can/should write relatively efficient code from a
local viewpoint, but attempting to be globally optimal early in the
development cycle can result in time spent in the wrong areas.  So, in
my opinion:

1) Spec., design, etc.

2) Write the program (or separable section)

3) Determine where extra work will provide a benefit relative to the cost
   (e.g. by measuring where the time actually goes; see the sketch below)

4) Perform that work
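
As a crude sketch of the measurement in step 3: time the routine you
suspect before touching it.  process_record() below is only a stand-in
for whatever code you think dominates the run time; on Unix a real
profiler (prof or gprof) does this job far better.

#include <stdio.h>
#include <time.h>

/* Stand-in for the suspected hot spot; in a real program this would be
 * the routine you believe is eating the time. */
static double process_record(int i)
{
    double sum = 0.0;
    int j;

    for (j = 1; j <= 1000; j++)
        sum += (double)i / j;
    return sum;
}

int main(void)
{
    clock_t start, stop;
    double sink = 0.0;
    int i;

    start = clock();
    for (i = 0; i < 10000; i++)
        sink += process_record(i);
    stop = clock();

    printf("process_record: %.2f CPU seconds for 10000 calls (sink = %g)\n",
           (double)(stop - start) / CLOCKS_PER_SEC, sink);
    return 0;
}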

/Steve (uunet!microsoft!stevebr)

rsd@sei.cmu.edu (Richard S D'Ippolito) (01/23/91)

In article <10061@as0c.sei.cmu.edu> Bruce Benson writes:

>In article <10060@as0c.sei.cmu.edu> Richard S D'Ippolito writes:
>>In article <654@caslon.cs.arizona.edu> Dave P. Schaumann writes:

[Text eliminated -- see references]

 >I think Dave expresses a deeper and more practical point: it is easier to
 >build a bridge, if you've built one before.  Building the first one may be
 >as much education as engineering, regardless of how good your engineering
 >handbook is.  Dave suggests expressing ones ideas into an implementation to
 >see what it will look like.  This is exactly (or close enough to) the
 >scientific method used when one is searching for solutions to a problem (or
 >testing a hypothesis).  

There is some confusion here between the roles of engineering and science.
Science is discovery through hypothesis testing (in books, anyway, where
serendipity is ignored); engineering is fabrication from the known.  Engineers
build what they already know how to build, with only small differences
allowed between successive designs.  We would never award a contract to an
engineering firm that has never built a bridge before.

However, we let programmers hack away at things they've never built before,
on our money, and express surprise at the results.


>>A proper engineering anaylsis will generate and evaluate several solutions
>>whose implementations are fully known BEFORE specific implementation
>>(coding) begins.  Imagine buying, shipping, and erecting the steel before
>>the bridge is designed!
>
>This seems to be the key: are there well known solutions to the problem?
>Dave makes the point that it is tough to know if one even understands the
>problem.

If I don't have a well known solution to the problem, I shouldn't call what
I'm doing 'engineering'.  Engineering is problem setting, not problem
solving.


>"Print all numbers from 1 to a million" seems easy to code and has well
>known solutions, but the average programmer may not appreciate how big one
>million is.  

Whoa!  This is implementation-level stuff.  If this were a critical part of
a design, it would not have made it past PDR (preliminary design review)
unless there were a bounded solution.  And, if it's a critical task, we
wouldn't give it to an 'average programmer'.


>...So, if I don't recognize that I'm building a bridge, I may not look at the
>appropriate solutions.

If I don't recognize that I'm supposed to be building a bridge, perhaps I
shouldn't be in the gorge except to fish.


>>There are real engineering methods which help to answer the "best solution"
>>question above, but alas, we teach them only to engineering students.
>>Parnas took a good shot at this in the Jan. 90 issue of "Computer".
>
>In our rush to achieve engineering in software, we shouldn't automatically
>disdain the exploratory methods that are so useful when problems are not
>well understood.

A bit of a non sequitur... please read the Parnas article!


Rich

cml@tove.cs.umd.edu (Christopher Lott) (01/24/91)

In article <10185@as0c.sei.cmu.edu> rsd@sei.cmu.edu (Richard S D'Ippolito) writes:
>Science is discovery through hypothesis testing (in books, anyway, where
>serendipity is ignored); engineering is fabrication from the known.  Engineers
>build what they already know how to build, with only small differences
>allowed between successive designs.  We would never award a contract to an
>engineering firm that has never built a bridge before.

I agree with Mr. D. up to a point, but....

Someone built the first cable-suspension bridge.  The first hydroelectric dam.
The first stealth bomber.  The first rotary engine.  The first VLSI chip.
The first tunnel across the English Channel.  Even the first light bulb.

I would call these engineering problems, of various sizes.  Yet they certainly
involved much empirical testing (e.g., the light bulb).   My argument is that
there's a good mix of engineering and science in these large projects.
I claim the same thing for software projects.

If one argues that having built a large jet plane gives one the skills to
build a stealth bomber, then a similar argument goes for software.  Hey, it's
only software, right?  )-:


chris...
--
Christopher Lott    Dept of Comp Sci, Univ of Maryland, College Park, MD 20742
  cml@cs.umd.edu    4122 AV Williams Bldg  301-405-2721 <standard disclaimers>