[comp.arch] Bloat costs

chip@tct.uucp (Chip Salzenberg) (05/30/90)

According to ian@sibyl.OZ (Ian Dall):
>Indeed, these days, an employer could be justified in berating a
>programmer for wasting time reducing the size of a program instead of
>increasing its speed or functionality or getting it to market
>earlier!

Well, I'm not advocating a 16K limit on code size, either.  :-)

However, I consider program size vs. programmer time to be a
false dichotomy.  I would argue that, once gained, the skill of
writing programs that begin and stay small makes a programmer
more productive and valuable.  It's a part of the overall
"think small, do one thing at a time" mentality that produced
Software Tools, Research Unix and Plan Nine.  (But not System V
or BSD, unfortunately.)

I agree that it may not be worth the time to modify a working
program just for the purpose of making it smaller.  However, it
is my discouraged opinion that many programmers -- perhaps a
majority -- don't take resource usage into consideration when
designing and writing *new* code unless forced to do so.  The
apparent omission of resource considerations from the planning
stages of programming is what I detest.

The "think small" approach is, in my opinion, a big win over
the "just write it the most obvious way" approach.  It seems,
however, that the latter has a patina of respectability among
new programmers by virtue of all the bloated programs they use
as they learn.  "GNU Emacs is huge, X is huge, SunView, System
V, BSD, SunOS, AIX are all huge -- why should I have to think
about how to use less memory?"  Sigh.
-- 
Chip Salzenberg at ComDev/TCT   <chip%tct@ateng.com>, <uunet!ateng!tct!chip>

chip@tct.uucp (Chip Salzenberg) (05/30/90)

According to jca@pnet01.cts.com (John C. Archambeau):
>chip@tct.uucp (Chip Salzenberg) writes:
>>Competent C compilers can be written in small model.  I once worked on
>>a C compiler that ran on a PDP-11, which as everyone knows, is limited
>>to 64K of data under most (all?) Unix implementations.
>
>Which brings forth the argument in favor of progress.  How many people
>actually use PDP-11's anymore?

PDP-11 usage statistics matter not at all.  The point is
that it can be done, but some people would have you think
that it can't be done, so they can escape the mental effort
required to do it.

The "What do you want to do, return to the dark ages?"
retort reminds me of a quote from Theodor Nelson, who in
turn was quoting a computer consultant of the 70s:

    "If it can't be done in COBOL,
     I just tell them it can't be done by computer.
     It saves everyone a lot of time."

Obviously this consultant was a troglodyte.  One would
hope that such attitudes are a thing of the past.

Substitute "four megabytes of RAM" for "COBOL", however,
and you get a depressingly accurate summary of the attitude
of the day.  Am I implying that 4M-or-die programmers
are troglodytes as well?  You bet your data space I am.
-- 
Chip Salzenberg at ComDev/TCT   <chip%tct@ateng.com>, <uunet!ateng!tct!chip>

jtc@van-bc.UUCP (J.T. Conklin) (05/31/90)

In article <2662D045.3F02@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>Substitute "four megabytes of RAM" for "COBOL", however,
>and you get a depressingly accurate summary of the attitude
>of the day.  Am I implying that 4M-or-die programmers
>are troglodytes as well?  You bet your data space I am.

Although I agree with Chip in general, there are some cases where
using memory is better than scrimping on principle.

I'm sure that many faster algorithms had to be passed over because
of limited address space.  Some of the GNU equivalents of UNIX
programs are many times faster because of the faster, yet more
memory-intensive, algorithms.

I don't think I have to mention another optimization that ``wastes''
memory: large lookup tables.  It was once quite common to have to
re-compute an index on every iteration because there wasn't enough
memory to hold a table.

Another unrelated application is high-resolution image processing.  Is
processing a 16MB frame-buffer with kerjillions of processors doing
ray-tracing wasting memory?


On the other hand, there is something to be said about giving
beginning programmers 6 MHz Xenix/286 machines to work on.  I
think you'd be surprised at the small, fast, and portable code
that can come out of that environment.  I recommend it, as the
good habits that result will last for life.


To summarize, I have written programs that need 4M to run --- only
because it takes 4M to do the job.  Programs that require less take
less.  I do not consider myself a troglodyte.

	--jtc

-- 
J.T. Conklin	UniFax Communications Inc.
		...!{uunet,ubc-cs}!van-bc!jtc, jtc@wimsey.bc.ca

pjg@acsu.Buffalo.EDU (Paul Graham) (05/31/90)

chip@tct.uucp (Chip Salzenberg) writes:

|According to ian@sibyl.OZ (Ian Dall):
|>Indeed, these days, an employer could be justified in berating a
|>programmer for wasting time reducing the size of a program instead of
|>increasing its speed or functionality or getting it to market
|>earlier!

|Well, I'm not advocating a 16K limit on code size, either.  :-)

|However, I consider program size vs. programmer time to be a
|false dichotomy.  I would argue that, once gained, the skill of
|writing programs that begin and stay small makes a programmer
|more productive and valuable.  It's a part of the overall
|"think small, do one thing at a time" mentality that produced
|Software Tools, Research Unix and Plan Nine.  (But not System V
|or BSD, unfortunately.)

|The "think small" approach is, in my opinion, a big win over
|the "just write it the most obvious way" approach.  It seems,
|however, that the latter has a patina of respectability among
|new programmers by virtue of all the bloated programs they use

woof.  i can't buy any of this.  as someone who spent his formative
years on an lsi-11 and thought you could do anything in 8k i find such
discourse on discipline silly.  i've run sunos, x, gnu and other
typical things all at the same time in 4M, and in 16M.  i'm not going
back.  nor do i write programs that are overly large.  i've used paper
tape based tools too, with 1k assemblers. i'm still not going back.

while not wishing to encourage waste in any particular thing i don't
really find it the least bit productive for various folks to keep
harping about how things are soooo huge, all the programmers are sooo
lazy and trillions of bytes are going to waste.  in the grand scheme
of things (and in this newsgroup) it sounds more like petulance than
discussion.  particularly when couched in an inflammatory or
aggressive style generously leavened with loaded terms like bloat.

now if one wishes to discuss the relative merits of writing
straightforward code (the obvious way) versus bumming every last byte or
whether life-cycle costs are better served by long term maintenance
issues or short term resource issues it would seem another group
is in order.
--
"as a user i'll take speed over features."  hmmm, me i'll take both.
or was that bourbon over ice.

khb@chiba.Eng.Sun.COM (Keith Bierman - SPD Advanced Languages) (05/31/90)

In article <26798@eerie.acsu.Buffalo.EDU> pjg@acsu.Buffalo.EDU (Paul Graham) writes:

...
   while not wishing to encourage waste in any particular thing i don't
   really find it the least bit productive for various folks to keep
   harping about how things are soooo huge, all the programmers are sooo
   lazy and trillions of bytes are going to waste.  in the grand scheme

....

Amen.

If we assume that Amdahl was even slightly right .... each "MIP"
requires 1Mb main memory and 1Mb/sec of "disk" channel capacity.

For the sake of argument, accept vendor's claims of MIP ratings at
face value. The result is that many, if not most,  of the nifty
machines announced over the last several months should have 10+ Mb
RAM. 

The ever growing need for RAM is, IMHO, directly related to CPU
performance gains. Software bloat would appear to be a second order
effect.
--
Keith H. Bierman    |*My thoughts are my own. !! kbierman@Eng.Sun.COM
It's Not My Fault   | MTS --Only my work belongs to Sun* khb@chiba.Eng.Sun.COM
I Voted for Bill &  | Advanced Languages/Floating Point Group (415 336 2648)   
Opus                | "When the going gets Weird .. the Weird turn PRO"

davecb@yunexus.UUCP (David Collier-Brown) (05/31/90)

>chip@tct.uucp (Chip Salzenberg) writes:
[or possibly ``According to ian@sibyl.OZ (Ian Dall)'': the origin is unclear]
>|However, I consider program size vs. programmer time to be a
>|false dichotomy.  I would argue that, once gained, the skill of
>|writing programs that begin and stay small makes a programmer
>|more productive and valuable.  It's a part of the overall
>|"think small, do one thing at a time" mentality that produced
>|Software Tools, Research Unix and Plan Nine.  (But not System V
>|or BSD, unfortunately.)

   In a previous life, I worked for a company that had to get
a 5-subsystem large program package to market in 8 months or less.
The approach was really simple:

	First you make it right,
	Then you make it fast, 
	Then you make it small.[1]

  This spawned ``let's get smaaaaaaalllllll'' as an
invitation to drinks and dinner (:-))

--dave
[1] Attributed to Dick McMurray & Ashok Patel.

-- 
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave 
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |   thunder across the bay" --david kipling

chip@tct.uucp (Chip Salzenberg) (06/01/90)

According to pjg@acsu.Buffalo.EDU (Paul Graham):
>now if one wishes to discuss the relative merits of writing
>straightforward code (the obvious way) versus bumming every last byte or
>whether life-cycle costs are better served by long term maintenance
>issues or short term resource issues it would seem another group
>is in order.

I consider comp.arch to be the right place.  Among the
forces driving computer architecture are the decisions
programmers make.  And, of course, computer architecture
drives many decisions programmers make.

And I would like to know whether anyone agrees with Mr.
Graham that "resource issues" are "short term" while
"maintenance issues" are "long term."  Resources are used
as long as you run a program; maintenance is over
whenever you decide to stop maintaining it.  The latter
often happens before the former.
-- 
Chip, the new t.b answer man    <chip%tct@ateng.com>, <uunet!ateng!tct!chip>

chip@tct.uucp (Chip Salzenberg) (06/01/90)

According to jtc@van-bc.UUCP (J.T. Conklin):
>I'm sure that many faster algorithms had to be passed over because
>of limited address space.  Some of the GNU equivalents of UNIX
>programs are many times faster because of the faster, yet more
>memory-intensive, algorithms.

However, as has been pointed out before, the memory isn't
free, paging takes time, swap space isn't free, etc.  At the
very least, where practical, programs with memory-eating
algorithms should include a more frugal algorithm as an
option.  IMHO, of course.

>Another unrelated application is high-resolution image processing.  Is
>processing a 16MB frame-buffer with kerjillions of processors doing
>ray-tracing wasting memory?

Well, there are exceptions to every rule.  :-)

>On the other hand, there is something to be said about giving
>beginning programmers 6 MHz Xenix/286 machines to work on.

Amen.
-- 
Chip, the new t.b answer man    <chip%tct@ateng.com>, <uunet!ateng!tct!chip>

wsd@cs.brown.edu (Wm. Scott `Spot' Draves) (06/02/90)

In article <266577FA.6D99@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
   According to jtc@van-bc.UUCP (J.T. Conklin):

   >On the other hand, there is something to be said about giving
   >beginning programmers 6 MHz Xenix/286 machines to work on.

   Amen.

If you are suggesting that novice programmers be given slow/obsolete
hardware so that they learn to write efficient code, I would disagree
with you strongly.

Efficiency is just one of many attributes that are generally
desirable in programs.  Learning to program on a machine that is
slower than the state of the art will artificially inflate the importance
of eff. programming.

One of the wonderful things about 20Mip 32Mb workstations is that I
don't have to worry about eff. when writing most code.  I can
concentrate on other issues such as clarity of code, speed of
execution, speed of development, fancy features, ...

by "eff." i mean "frugal of code and data".

--

Scott Draves		Space... The Final Frontier
wsd@cs.brown.edu
uunet!brunix!wsd

wwm@pmsmam.uucp (Bill Meahan) (06/02/90)

In article <WSD.90Jun1130958@miles.cs.brown.edu> wsd@cs.brown.edu (Wm. Scott `Spot' Draves) writes:
>In article <266577FA.6D99@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>   According to jtc@van-bc.UUCP (J.T. Conklin):
>
>  [stuff deleted]
>
>One of the wonderful things about 20Mip 32Mb workstations is that I
>don't have to worry about eff. when writing most code.  I can
>concentrate on other issues such as clarity of code, speed of
>execution, speed of development, fancy features, ...
>
>by "eff." i mean "frugal of code and data".
>

May I be among the first to say HORSEPUCKY!

There seems to be a mindset among many CS majors that
"memory is cheap and hardware is fast, so why worry about efficiency?"

This kind of thinking is the result of looking only at chip prices and
the latest hot-rod announcements.  In truth, only a SMALL subset of the
(potential) customers for any given piece of software are running the
'latest and greatest' with beaucoup RAM.  The rest of us are running on
whatever we've got now and often this is older equipment or 'bare-bones'
versions of the hotter stuff because that was all we could afford.

There is a simple financial reality that is often overlooked:

	1) Regardless of the **theoretical prices**, if I don't HAVE 'it'
	   I have to go buy it.
	2) The money I have to go buy 'it' with could also go towards
	   the purchase of other things.
	3) Therefore, I have to demonstrate (to myself, my spouse,
	   my manager, the bean-counters, etc) that buying 'it' has
	   sufficient return on investment to justify THAT purchase
	   instead of some other.
	4) It is very hard to justify continual upgrades of equipment
	   just to get the 'latest and greatest' features, unless these
	   features translate DIRECTLY into some real benefit.
	5) If the latest and greatest is not directly upwards compatible
	   with my current configuration, there is an ADDITIONAL hidden cost
	   associated with converting/replacing my current installed base
	   of software and hardware.
	6) Even 'cheap' upgrades get expensive if you have to buy a lot
	   of copies.  (This site has over 250 PC's, think the Controller
	   wants to spend $500 each to upgrade the memory just to get some
	   fancier display?)
	7) Customers DON'T CARE how clear/modular/elegant your code is
	   unless the clarity/elegance/whatever has some demonstrable
	   benefit to THEM!

Maybe all CS majors should be forced to take a few economics courses along
with the rest of their curriculum!

FAST, SMALL, CHEAP   <--- Pick any 2, you can't have all 3.
-- 
Bill Meahan  WA8TZG		uunet!mailrus!umich!pmsmam!wwm
I speak only for myself - even my daughter's cat won't let me speak for her!

mike@thor.acc.stolaf.edu (Mike Haertel) (06/02/90)

In article <266577FA.6D99@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>According to jtc@van-bc.UUCP (J.T. Conklin):
>>On the other hand, there is something to be said about giving
>>beginning programmers 6 MHz Xenix/286 machines to work on.
>
>Amen.

Not a 286!  If you want to teach someone about memory constraints give
them a PDP-11 running UNIX v7.  A much cleaner architecture.

The problem is, people all too often assume that their past experience
defines how things "should" be, and so when they in turn design things in
the future they apply their preconceptions.  We don't need any intellectual
descendents of the 286.
--
Mike Haertel <mike@acc.stolaf.edu>
``There's nothing remarkable about it.  All one has to do is hit the right
  keys at the right time and the instrument plays itself.'' -- J. S. Bach

merriman@ccavax.camb.com (06/03/90)

In article <266576A7.6D17@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes:

[stuff deleted]

> 
> And I would like to know whether anyone agrees with Mr.
> Graham that "resource issues" are "short term" while
> "maintenance issues" are "long term."  Resources are used
> as long as you run a program; maintenance is over
> whenever you decide to stop maintaining it.  The latter
> often happens before the former.
> -- 

Where I work (the New York financial community), it seems that maintenance
never ends. We are usually rushing to move an application from an old (hardware
and/or software) platform to a newer one before the old one runs out of steam
or falls apart, while tearing our hair out trying to keep the old system
running. These applications are always tangled up in some way with other
systems, many of which are not under any local control. Hardware is so much
cheaper than manpower that it is foolish to spend any time worrying about
saving a byte here or there. In fact, a lot of our maintenance headaches are
caused by scrimping on resources ("Why use a longword here instead of a word?
We'll never see a value of more than a few thousand." Ha!). Projects seem to be
a year late before they even get started. You may call this bad management, but
the forces driving the applications are beyond the control of the enterprise,
and probably beyond anyone's control. Straightforward, easy-to-maintain code is
of the utmost importance in this environment, especially considering that the
people who wrote the code are often long gone by the time you end up having to
fix something.

> Chip, the new t.b answer man    <chip%tct@ateng.com>,<uunet!ateng!tct!chip>


George Merriman, Cambridge Computer Associates

jonah@cs.toronto.edu (Jeff Lee) (06/04/90)

khb@chiba.Eng.Sun.COM (Keith Bierman - SPD Advanced Languages) writes:
>If we assume that Amdahl was even slightly right .... each "MIP"
>requires 1Mb main memory and 1Mb/sec of "disk" channel capacity.

Amdahl was speaking of shared mainframes, not single-user workstations.
He was also comparing systems of the same generation.  The analysis
may not transfer across different types and generations of
systems.

wsd@cs.brown.edu (Wm. Scott `Spot' Draves) writes:
>One of the wonderful things about 20Mip 32Mb workstations is that I
>don't have to worry about eff. when writing most code.  I can
>concentrate on other issues such as clarity of code, speed of
>execution, speed of development, fancy features, ...

>by "eff." i mean "frugal of code and data".

Being frugal with code and data often leads to clarity of code, speed of
execution, and speed of development.  You can then concentrate on
optimizing the 10% of the code that consumes 80% of the time.  The one
thing it doesn't get you is fancy features.  Give everyone a 20Mip 32Mb
workstation and they tend to forget about simplicity, modularity,
compatibility, and just stick in a Lisp extension language or 3D buttons
instead.  Designing for the general case instead of the special case
takes a bit longer, but you save in the long run on reusability.
The trouble is often that most people look at every problem as being a
"special case".

merriman@ccavax.camb.com (George Merriman) writes:
>In fact, a lot of our maintenance headaches are
>caused by scrimping on resources ("Why use a longword here instead of a word?
>We'll never see a value of more than a few thousand." Ha!).

This is the wrong sort of "space efficiency", but alas, the most prevalent.
On the other hand, if the task is inherently sequential, why load *all*
the data into memory, process it, and then write out the result?  This is
one of the *worst* ways to use virtual memory, but it is becoming much
more common.  People often use the size and speed of modern systems as an
excuse not to worry about picking the right *algorithm*.  As a result,
the solutions often don't scale well.

Jeff Lee -- jonah@cs.toronto.edu or utai!jonah

chip@tct.uucp (Chip Salzenberg) (06/05/90)

According to merriman@ccavax.camb.com:
>In fact, a lot of our maintenance headaches are caused by scrimping
>on resources ("Why use a longword here instead of a word?  We'll
>never see a value of more than a few thousand." Ha!).

Such "scrimping" is often done in the name of saving memory.  However,
there are smarter ways to save memory, such as keeping only one record
of a file in memory instead of the whole file, etc.

Just because some programmers are penny wise and pound foolish does
NOT mean that frugality is counterproductive.
-- 
Chip Salzenberg at ComDev/TCT     <chip@tct.uucp>, <uunet!ateng!tct!chip>

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/05/90)

In article <266A93A8.528F@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes:
> According to merriman@ccavax.camb.com:
> >In fact, a lot of our maintenance headaches are caused by scrimping
> >on resources ("Why use a longword here instead of a word?

> Such "scrimping" is often done in the name of saving memory.  However,
> there are smarter ways to save memory, such as keeping only one record
> of a file in memory instead of the whole file, etc.

Just on the subject of bloat, tradeoffs, &c, there is an interesting
tradeoff in the way COFF format stores relocation information.  Precisely
in the interests of keeping small amounts of data in memory, it more than
doubles the size of a data structure held on disc.

Each address in a segment that needs to be relocated has a triple
	[address:long, symindex:long, type:short]
stored for it in that segment's relocation table.  If you examine a
typical object file, you find that most of the references are to
_repeated_ symindex values (e.g. every time you call printf() you get a
relocation triple pointing to printf).  This is an obvious candidate for
compression:  store
	[symindex:long, 0type:short, address:long] -- for unique references
	[symindex:long, 1type:short, count:short   -- for repeated references
		{,address:long}...]
which change would reduce the size of the relocation table by nearly 60%,
and would save repeated references to the symbol table.  Looks like a win
all around.  (Alternatively, we might forget about singleton references,
and make everything [symindex,type,count,address*] and benefit from having
everything longword aligned.  Another tradeoff.)

The relocation information is actually stored in increasing order of
address, so fixing up the addresses requires one sequential pass over
the segment and one sequential pass over the relocation table.  There
are accesses all over the symbol table, but the symbol table had to be
read into memory anyway.  COFF's data structure means that the linker
doesn't have to hold the whole segment it is relocating all in memory,
as the "transposed" structure would.  That was clearly a Good Idea on
PDP-11s.  Maybe with a virtual memory machine it isn't a good idea any more.
The tradeoff here was that with a small memory (64k) it was quite likely
that the program and data wouldn't all fit into memory at once; the data
were made _bigger_ so that they could be got at easily.  With a large
memory it's unlikely that the linker and an object segment can't fit
together, so it would make sense to save the disc space.

So, while I heartily agree that you can get 80% of the power of Emacs in
50k of code, let's not forget that _small_ memories can warp things too.
-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

dricejb@drilex.UUCP (Craig Jackson drilex1) (06/05/90)

As someone who has been programming for fifteen years or so, I can well
appreciate the value of tight programming.  However, there is such a thing
as too-tight programming.  The use of too-small field widths was one
problem noted.  Another problem, which I believe is just as severe,
is the tendency to leave out assertion-checking and internal debugging
support in "finished" code.  I'm a firm believer in checking lots of
assumptions early and often, even in production code.  Also, it's useful
to have lots of hidden debugging capabilities.  All too often, when
you were trying to squeeze in another feature, it was a debugging
print statement that got squeezed out.
-- 
Craig Jackson
dricejb@drilex.dri.mgh.com
{bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}

aglew@oberon.csg.uiuc.edu (Andy Glew) (06/09/90)

>With an orthogonal architecture and a good compiler, you can write
>maintainable programs in high-level languages and still produce
>products that run quickly on machines with a lot fewer than 20 MIPS.

With a good compiler you don't care how orthogonal the architecture is.
--
Andy Glew, aglew@uiuc.edu

bp@retiree.cis.ufl.edu (Brian Pane) (06/09/90)

In article <1990Jun1.200333.10672@pmsmam.uucp> wwm@pmsmam.UUCP (Bill Meahan) writes:
>In article <WSD.90Jun1130958@miles.cs.brown.edu> wsd@cs.brown.edu (Wm. Scott `Spot' Draves) writes:
>>In article <266577FA.6D99@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>>   According to jtc@van-bc.UUCP (J.T. Conklin):
>>
>>  [stuff deleted]
>>
>>One of the wonderful things about 20Mip 32Mb workstations is that I
>>don't have to worry about eff. when writing most code.  I can
>>concentrate on other issues such as clarity of code, speed of
>>execution, speed of development, fancy features, ...
>>
>>by "eff." i mean "frugal of code and data".
>>
>
>May I be among the first to say HORSEPUCKY!
>
>There seems to be a mindset among many CS majors that
>"memory is cheap and hardware is fast, so why worry about efficiency?"
>
>This kind of thinking is the result of looking only at chip prices and
>the latest hot-rod announcements.  In truth, only a SMALL subset of the

If such a mindset exists, it is not because of the abundance of powerful
hardware.  It is because CS majors are taught to build robust, maintainable,
and therefore seemingly elegant programs rather than compact and clever
programs.  If we get used to writing ruthlessly brilliant programs,
we'll only add to the "software crisis" when we graduate.

I agree that efficiency is important, but  it must be kept in
its proper perspective.  This group is devoted to the implementation
of a UNIX-like OS on an architecture that should have been allowed to
die ten years ago.  No matter how well you write your C code, an average
compiler will probably produce disgracefully slow executables.  There
is little to be gained by writing efficient C programs for inefficient
machines.  You *can* write fairly efficient code for the 8086--in
assembly language.  However, few people have that much time to waste.
While you're shouting about the expense of "improved" software and the
expense of the hardware on which such software must run, don't forget
about the cost of programmer time.

Finally, note that large and "inefficient" programs advance the state
of the art in software more often than small and clever programs.
Consider X Windows.  It is a huge system designed for flexibility
rather than efficiency, and it requires significant hardware power,
but it has revolutionized the way we use computers.

>Maybe all CS majors should be forced to take a few economics courses along
>with the rest of their curriculum!
>
Don't blame us for the economic problems of software development; blame the
EE's who design the hardware.  With an orthogonal architecture and a
good compiler, you can write maintainable programs in high-level languages
and still produce products that run quickly on machines with a lot fewer
than 20 MIPS.


>FAST, SMALL, CHEAP   <--- Pick any 2, you can't have all 3.
Not yet.  And not ever, if we all devote our efforts to
optimizing tiny programs for tiny machines.  20-MIPS
workstations will become affordable only when lots of software
is available for them.

>Bill Meahan  WA8TZG		uunet!mailrus!umich!pmsmam!wwm
>I speak only for myself - even my daughter's cat won't let me speak for her!

-Brian F. Pane
-------------------------------------------------------------------------
Brian Pane	University of Florida Department of Computer Science
bp@beach.cis.ufl.edu		Class of 1991

"If you can keep your expectations tiny,
 you'll get through life without being so whiny" - Matt Groening

#ifdef OFFENDED_ANYONE
#  include "disclaimer.h"
// Sorry to indulge in such 8086-bashing, folks, but I had a point to make.
#endif
-------------------------------------------------------------------------

peter@ficc.ferranti.com (Peter da Silva) (06/09/90)

In article <23473@uflorida.cis.ufl.EDU> bp@beach.cis.ufl.edu (Brian Pane) writes:
> If such a mindset exists, it is not because of the abundance of powerful
> hardware.  It is because CS majors are taught to build robust, maintainable,
> and therefore seemingly elegant programs rather than compact and clever
> programs.  If we get used to writing ruthlessly brilliant programs,
> we'll only add to the "software crisis" when we graduate.

Lots of nice buzzwords there, fella. Trouble is, it doesn't mean anything.
First of all, I haven't noticed that much, if any, difference in the quality
of net contributions from academia and industry. Quantity, yes... industry
can't afford the time to write the latest and greatest freeware. Second,
nobody's advocating gratuitous microefficiency here, just a consideration
of space-time tradeoffs in choosing algorithms. Like not loading a whole
file when you can get away with reading a line at a time. Or if you *do*,
check how much there is to read before you read it instead of just allocating
a big array and doubling in size when it fills up. Using a simplistic
algorithm makes as much sense as using bubble-sort on a megabyte array.

> Finally, note that large and "inefficient" programs advance the state
> of the art in software more often than small and clever programs.
> Consider X Windows.

Yes, lets.

> It is a huge system designed for flexibility
> rather than efficiency, and it requires significant hardware power,
> but it has revolutionized the way we use computers.

Actually, it was the Xerox Star and the Apple Macintosh that did that.
Machines with a fraction of the resources of the typical X workstation.
-- 
`-_-' Peter da Silva. +1 713 274 5180.  <peter@ficc.ferranti.com>
 'U`  Have you hugged your wolf today?  <peter@sugar.hackercorp.com>
@FIN  Dirty words: Zhghnyyl erphefvir vayvar shapgvbaf.

kt4@prism.gatech.EDU (Ken Thompson) (06/11/90)

>In article <266577FA.6D99@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>   According to jtc@van-bc.UUCP (J.T. Conklin):
>
>  [stuff deleted]
>
>One of the wonderful things about 20Mip 32Mb workstations is that I
>don't have to worry about eff. when writing most code.  I can
>concentrate on other issues such as clarity of code, speed of
>execution, speed of development, fancy features, ...
>
>by "eff." i mean "frugal of code and data".
>

I strongly disagree that efficiency (including code/data size) can reasonably 
be ignored.  No matter how quickly the power of machines grows, the things that
we want to do with them grow even faster. I believe it is a grave mistake
not to be concerned with the efficiency of the algorithms used in programming.
IMHO, this attitude has led to a severe decline in the capability of software
vs. the hardware resources required to execute it. Note I did not say 
anything about the cost of these resources. I find this depressing to say
the least.

				Ken
 

-- 
Ken Thompson  GTRI, Ga. Tech, Atlanta Ga. 30332 Internet:!kt4@prism.gatech.edu
uucp:...!{allegra,amd,hplabs,ut-ngp}!gatech!prism!kt4
"Rowe's Rule: The odds are five to six that the light at the end of the
tunnel is the headlight of an oncoming train."       -- Paul Dickson

greg@sce.carleton.ca (Greg Franks) (06/13/90)

In article <23473@uflorida.cis.ufl.EDU> we find:
...
>>There seems to be a mindset among many CS majors that
>>"memory is cheap and hardware is fast, so why worry about efficiency?"
>>
>>This kind of thinking is the result of looking only at chip prices and
>>the latest hot-rod announcements.  In truth, only a SMALL subset of the
>
>If such a mindset exists, it is not because of the abundance of powerful
>hardware.  It is because CS majors are taught to build robust, maintainable,
>and therefore seemingly elegant programs rather than compact and clever
>programs.  If we get used to writing ruthlessly brilliant programs,
>we'll only add to the "software crisis" when we graduate.

David Parnas would beg to differ.  He is not certain which is worse,
an Engineer who has been writing Fortran for the last 20 years, or a
present day CS major.  The former do not know ``modern'' programming
practices, hence they produce goto-full programs that do one thing
rather well.  The latter produce ``elegant'' programs that not only do
what the customer wanted (maybe), but twenty billion other things as
well.  After all, does `ls' really need 18 different options?
Unfortunately, computer programming still seems to live in the CISC
era.

Prof. Parnas recently wrote an article in IEEE Computer on this very
subject.  I recommend reading it.

From:  "just call me Tex (as in massacre) - my productivity is
measured in negative lines"  :-) :-) :-)
-- 
Greg Franks, (613) 788-5726              |"The reason that God was able to
Systems Engineering, Carleton University,|create the world in seven days is
Ottawa, Ontario, Canada  K1S 5B6.        |that he didn't have to worry about
greg@sce.carleton.ca uunet!mitel!sce!greg|the installed base" -- Enzo Torresi

bpendlet@bambam.UUCP (Bob Pendleton) (06/13/90)

From article <2662D045.3F02@tct.uucp>, by chip@tct.uucp (Chip Salzenberg):

> Substitute "four megabytes of RAM" for "COBOL", however,
> and you get a depressingly accurate summary of the attitude
> of the day.  Am I implying that that 4M-or-die programmers
> are trogolodytes as well?  You bet your data space I am.
> -- 
> Chip Salzenberg at ComDev/TCT   <chip%tct@ateng.com>, <uunet!ateng!tct!chip>

A long time ago (about 10 years), at a company that has since changed
its name several times, I and 3 other damn good programmers spent a
year or so writing the runtime support libraries for a COBOL system
that generated code for an 8080 based "terminal" called the UTS400.
The compiler ran on a number of different machines and generated code
that ran on the '400. You linked the code with our runtime code and
you got an application you could down load to an eight inch floppy and
then boot on the '400. 

Our library did all the weird arithmetic and data formatting that
COBOL needs.  It also implemented a disk file system, host
communications, screen formatting, data entry validation,
multithreading (yes it was a multiuser system, up to 4 users if I
remember correctly), and segment swapping. It fit in 10K bytes. Normal
'400s had 24K, some had 32K. I know that at least one 20K-line COBOL
program ran on the machine all day, every day. 

Marketing decided we should also support indexed sequential files.
They "gave" us 1K to implement it. That is, the code for the indexed
sequential file system could not increase the size of the library by
more than 1K bytes.  We wrote the indexed sequential files module in
2K and rewrote the rest of the system to fit in 9K. 

So when people tell me they have done incredible things in tiny
memories on absurd machines, I believe them. I've even been known to buy
them a drink. 

Yes, it can be done. But for most things it is an absurd waste of
time. I can write code 5 to 10 times faster when I DON'T have to
worry about every byte I spend than when I'm memory tight. And I can
write code that RUNS several times faster when I'm free with memory
than when I have to count every byte. 

Sometimes you must run a ton of program on a pound of computer. Many,
if not most, commercial programs in the MS-DOS world fall into that
realm. But, most programming done in the name of "memory efficiency"
is just wasted time. You have to sell a lot of copies to make back the
cost of all that code tightening. Not to mention what it does to the
cost of further development. 

			Bob P.

P.S.

I also learned an important lesson on the power of structured design
and prototyping from this project. But that's another story. 

-- 
              Bob Pendleton, speaking only for myself.
UUCP Address:  decwrl!esunix!bpendlet or utah-cs!esunix!bpendlet

                      X: Tools, not rules.

oz@yunexus.UUCP (Ozan Yigit) (06/14/90)

In article <23473@uflorida.cis.ufl.EDU> bp@beach.cis.ufl.edu (Brian Pane) 
babbles:

>Finally, note that large and "inefficient" programs advance the state
>of the art in software more often than small and clever programs.

And, you are writing this on an operating system that advanced the "state
of the art" without appearently needing 1/50th of what you may have on
your desk as a computing resource. So ironic.

oz
-- 
First learn your horn and all the theory.	Internet: oz@nexus.yorku.ca
Next develop a style. Then forget all that 	uucp: utzoo/utai!yunexus!oz
and just play.		Charlie Parker [?]	York U. CCS: (416) 736 5257

amr@rti.rti.org (Alan Roberts) (06/15/90)

In article <23473@uflorida.cis.ufl.EDU> bp@beach.cis.ufl.edu (Brian Pane) writes:
> ...
>Finally, note that large and "inefficient" programs advance the state
>of the art in software more often than small and clever programs.
>Consider X Windows.  It is a huge system designed for flexibility
>rather than efficiency, and it requires significant hardware power,
>but it has revolutionized the way we use computers.
>

Well, when you have a large base of "3M" workstations (well, they
are slightly better than that, but not much) that management will
not immediately replace (for many of the geo-political reasons that 
were discussed in an earlier article in this group), then I'm afraid
you have to argue that X Windows has great promise to revolutionize
how your users operate, but is "part of the current problem" 
relative to providing them service.

I don't have too much problem with your making use of large and
inefficient programs to advance the state-of-the-art in a research
context, but I'd like my VENDORS to work somewhat harder at keeping
their existing base operational when they convert those research
results into products.  The kind of turnover of large numbers of
platforms that bloated products require may cause a gleam in the
eyes of vendor bean-counters, but end-user bean-counters "just
say no," and the frustration, disappointment, and anger flows
downhill from there.

hedrick@athos.rutgers.edu (Charles Hedrick) (06/16/90)

Indeed.  I ported Kermit to Minix.  It took me several days to do.
On other versions of Unix you do it by typing "make", and maybe
fixing a few system dependencies.  The time was spent removing help
facilities and shortening text strings to get it to fit.  This is
not the way I want to spend my time (aside from being irked that
Kermit's nice user interface is being butchered in the process).

peter@ficc.ferranti.com (Peter da Silva) (06/16/90)

In article <Jun.16.00.15.42.1990.13822@athos.rutgers.edu> hedrick@athos.rutgers.edu (Charles Hedrick) writes:
> Indeed.  I ported Kermit to Minix.  It took me several days to [get it
> to fit]

Indeed. Which kermit were you using? Ours runs fine in small model.

+ which kermit
/usr/bin/kermit
+ size /usr/bin/kermit 
62124 + 30776 + 8606 = 101506 = 0x18c82
+ file /usr/bin/kermit 
/usr/bin/kermit:	separate executable not stripped
+ dates /usr/bin/kermit
C-Kermit, 4C(057) 31 Jul 85
Unix tty I/O, 4C(037), 31 Jul 85
Unix file support, 4C(032) 25 Jul 85
C-Kermit functions, 4C(047) 31 Jul 85
Wart Version 1A(003) 27 May 85
C-Kermit Protocol Module 4C(029), 11 Jul 85
Unix cmd package V1A(021), 19 Jun 85
User Interface 4C(052), 2 Aug 85
Connect Command for Unix, V4C(014) 29 Jul 85
Dial Command, V2.0(008) 26 Jul 85
Script Command, V2.0(007) 5 Jul 85
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.
<peter@ficc.ferranti.com>

wright@stardent.Stardent.COM (David Wright @stardent) (06/27/90)

In article <62777@sgi.sgi.com>, karsh@trifolium.esd.sgi.com (Bruce Karsh)
writes:

>> -- me
> - Karsh

>>But I must take issue with statements like:
>>In order to make the FFT be
>>really efficient, it's necessary to pretabulate trig tables and powers of 2, 
>>use tricky methods to perform the multiplications, arrange the code so that
>>the integer and floating point calculations are overlapped... etc.  When you
>>do all that, you wind up with a pretty fast, but very unclear program.

>I've never seen an optimizing compiler that will do any of these optimizations.
>It's a widely held belief that compiler optimizations can get the most out
>of an algorithm, but it's hardly ever true.  There are optimizations that
>compilers can do and optimizations that they can't do.  These are in the
>later set.

Whoops.  My fault for being unclear.  I didn't mean to imply that compilers
were smart enough to figure out how to pretabulate the trig tables.  Powers
of two, maybe.  However, you're going to have to be more specific about
what you mean by overlapped adds and multiplies that the compiler isn't
smart enough to reorganize.  Seems to me that with the advent of superscalar
chips, for example, the compilers damn well better be smart enough to do
this.  Or are we not talking about the same thing?

>Is there a C compiler which will recognize a complex multiplication and
>generate code which needs 3 multiplications instead of four?  Perhaps they
>ought to exist, but they don't.  I don't believe that compiler techniques yet
>exist to make such a compiler.

But when you say "complex multiplication", exactly what do you mean?  An
example would be illuminating here.

>Assembly language is just one of the tools for
>solving these problems.  There seems to be an attitude that these
>problems aren't important, or are too "messy" to be dealt with.  Sometimes
>the result of this attitude is uncompetitively slow systems.

Could be.  But I'm not trying to argue against the use of assembler where
it's needed; the question is "where is it needed?"  As an OS programmer,
I'm apt to be in situations where assembler is the only thing I can use,
and sometimes, as in your example of bcopy, it's just plain faster to do
it in assembler.  But obviously we're only going to do this where it 
makes a difference that matters.

>Quicksort is nowhere near optimal.  Its worst-case behavior is O(n*n).

Golly gee, no kidding.  But its expected behavior (unless it's really
stupidly coded) is O(n*log(n)).  I'm not arguing that the programmer doesn't
need to understand the algorithm, nor am I arguing that for the utterly
fastest behavior, the programmer can avoid looking at what the compiler
produces (I do this myself).  However, sometimes the result of this is
a note to the compiler developers that results in faster code.  And
sometimes it's the pleasant discovery that the compiler did just what I
wanted it to do.
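The "not stupidly coded" point above can be made concrete: a quicksort
that always picks a fixed pivot position degenerates to O(n*n) on
already-sorted input, while picking the pivot at random keeps the
*expected* time O(n*log(n)) on any input.  A minimal C sketch (not from
the thread, just an illustration):

```c
#include <stdlib.h>

/* Quicksort with a randomly chosen pivot.  Picking the pivot at
   random, instead of always taking the first or last element, is
   what makes the expected running time O(n log n) even on sorted
   input, the classic worst case for the naive version. */
static void quicksort(int *a, int lo, int hi)
{
    if (lo >= hi)
        return;

    /* swap a random element into the pivot slot at a[hi] */
    int p = lo + rand() % (hi - lo + 1);
    int tmp = a[p]; a[p] = a[hi]; a[hi] = tmp;

    /* Lomuto partition around a[hi] */
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            tmp = a[i]; a[i] = a[j]; a[j] = tmp;
            i++;
        }
    }
    tmp = a[i]; a[i] = a[hi]; a[hi] = tmp;

    quicksort(a, lo, i - 1);
    quicksort(a, i + 1, hi);
}
```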

And my compiler can beat up your compiler :-)

  -- David Wright, not officially representing Stardent Computer Inc
     wright@stardent.com  or  uunet!stardent!wright

After all, this is still the land of opportunity.  If you know where
to look.  -- Jack Douglas

karsh@trifolium.esd.sgi.com (Bruce Karsh) (06/28/90)

In article <1990Jun27.011149.2406@Stardent.COM> wright@stardent.Stardent.COM (David Wright @stardent) writes:
>Whoops.  My fault for being unclear.  I didn't mean to imply that compilers
>were smart enough to figure out how to pretabulate the trig tables.  Powers
>of two, maybe.

Maybe they could, but right now they don't.

>However, you're going to have to be more specific about
>what you mean by overlapped adds and multiplies that the compiler isn't
>smart enough to reorganize.

The MIPS processor can overlap a couple of floating point calculations with
an integer operation.  The MIPS optimizing compiler does not always generate
optimal overlap for these cases.  Perhaps someday someone will generate
a compiler which does this optimally, but right now the MIPS compiler doesn't.

>  Seems to me that with the advent of superscalar
>chips, for example, the compilers damn well better be smart enough to do
>this.  Or are we not talking about the same thing?

We'll see how good a job the superscalar compilers can do.

>But when you say "complex multiplication", exactly what do you mean?  An
>example would be illuminating here.

I mean a multiplication of two complex numbers.  For example:

	(e + jf) = (a + jb) * (c + jd)

can be calculated as:

	e = a*c - b*d;
	f = a*d + b*c;

This is the clear, obvious way to code this calculation.  It has four
multiplies and two additions.

It can also be coded as:

	u = (a-b)*d;
	e = u + a*(c-d);
	f = u + b*(c+d);

Which has only 3 multiplies (but 5 additions).

Some architectures (e.g. SPARC) take much longer to multiply than to add.

I know of no C compiler which will recognize a complex multiplication and
rewrite it in the faster form.  (It's not even clear that you'd want it to).
If you are interested in making these complex multiplies occur very rapidly,
then the latter method should be used, since it is faster.  It's not as clear,
though.
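The two formulations can be checked against each other directly.  A
small C sketch of both (function names are mine):

```c
/* Four-multiply form of (e + jf) = (a + jb) * (c + jd):
   the clear, obvious way.  4 multiplies, 2 additions. */
static void cmul4(double a, double b, double c, double d,
                  double *e, double *f)
{
    *e = a * c - b * d;
    *f = a * d + b * c;
}

/* Three-multiply form: trades one multiply for three extra
   additions, a win only where multiplication is much slower
   than addition.  3 multiplies, 5 additions. */
static void cmul3(double a, double b, double c, double d,
                  double *e, double *f)
{
    double u = (a - b) * d;
    *e = u + a * (c - d);
    *f = u + b * (c + d);
}
```

(In exact arithmetic the two agree; in floating point the
three-multiply form can round slightly differently.)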

>Golly gee, no kidding.  But its expected behavior (unless it's really
>stupidly coded) is O(n*log(n)).  I'm not arguing that the programmer doesn't
>need to understand the algorithm, nor am I arguing that for the utterly
>fastest behavior, the programmer can avoid looking at what the compiler
>produces (I do this myself).  However, sometimes the result of this is
>a note to the compiler developers that results in faster code.  And
>sometimes it's the pleasant discovery that the compiler did just what I
>wanted it to do.

I fully agree.  But the original posting stated that it was enough to just
write clear code because the optimizer will make the most of it.  In general,
it won't.  Maybe someday.

			Bruce Karsh
			karsh@sgi.com

ps@fps.com (Patricia Shanahan) (06/28/90)

In article <63034@sgi.sgi.com> karsh@trifolium.sgi.com (Bruce Karsh) writes:
>In article <1990Jun27.011149.2406@Stardent.COM> wright@stardent.Stardent.COM (David Wright @stardent) writes:
...
>
>Some architectures (e.g. SPARC) take much longer to multiply than to add.
>
...

There is certainly a major architectural difference in SPARC between
integer add (a register-to-register command) and integer multiply (which
has only indirect support). However, I don't think I have seen a program
that uses integer-complex. It would certainly not be a usable data type
for FFT.

There is no architectural difference in the status of floating point add
and floating point multiply in SPARC. They are both floating point
register-to-register commands. Of course, it is possible that some
implementations of SPARC do floating point adds faster than floating
point multiplies, but the difference should not normally be very big.

...
>But the original posting stated that it was enough to just
>write clear code because the optimizer will make the most of it.  In general,
>it won't.  Maybe someday.
...

I certainly agree that there are steps that can be taken to make FFT
fast that would not be within the state of the art for compilation
from a simple source, unless your compiler can recognise FFT's, and is
in the habit of reading the latest papers on how to do them.
--
	Patricia Shanahan
	ps@fps.com
        uucp : ucsd!celerity!ps
	phone: (619) 271-9940

karsh@trifolium.esd.sgi.com (Bruce Karsh) (06/29/90)

In article <9570@celit.fps.com> ps@fps.com (Patricia Shanahan) writes:

>There is certainly a major architectural difference in SPARC between
>integer add (a register-to-register command) and integer multiply (which
>has only indirect support). However, I don't think I have seen a program
>that uses integer-complex. It would certainly not be a usable data type
>for FFT.

Integer-complex FFT's are probably executed much more often than floating
point FFT's.  There are several microprocessor chips on the market, notably
the Motorola DSP56001, which are highly optimized for exactly this
calculation.  There are lots of articles on how to do this.
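An integer-complex multiply of the kind those DSP chips are built for
might look like the following in C.  The Q15 fixed-point format and the
truncating multiply here are illustrative assumptions, not the
DSP56001's actual arithmetic:

```c
#include <stdint.h>

/* Complex multiply in Q15 fixed point (16-bit values scaled by
   2^15, so 0x4000 == 0.5).  DSP chips do this as a handful of
   multiply-accumulate steps; in portable C it takes 32-bit
   intermediates and explicit shifts. */
typedef struct { int16_t re, im; } cq15;

static int16_t q15_mul(int16_t x, int16_t y)
{
    /* 16x16 -> 32-bit product, truncated back to Q15 */
    return (int16_t)(((int32_t)x * y) >> 15);
}

static cq15 cq15_mul(cq15 x, cq15 y)
{
    cq15 r;
    r.re = q15_mul(x.re, y.re) - q15_mul(x.im, y.im);
    r.im = q15_mul(x.re, y.im) + q15_mul(x.im, y.re);
    return r;
}
```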

>There is no architectural difference in the status of floating point add
>and floating point multiply in SPARC. They are both floating point
>register-to-register commands. Of course, it is possible that some
>implementations of SPARC do floating point adds faster than floating
>point multiplies, but the difference should not normally be very big.

The shift from integer multiplies being much faster than FP multiplies
to being slower than them is one of the most interesting trends occurring
in computer architecture.

Is it really true that SPARC fp register to register ops are all constant-
time?  Are SPARC fp adds and multiplies both equal-time?  This isn't true for
the MIPS.  Does anyone out there know for sure?

>I certainly agree that there are steps that can be taken to make FFT
>fast that would not be within the state of the art for compilation
>from a simple source, unless your compiler can recognise FFT's, and is
>in the habit of reading the latest papers on how to do them.

I want THAT compiler.

			Bruce Karsh
			karsh@sgi.com

ps@fps.com (Patricia Shanahan) (06/29/90)

In article <63065@sgi.sgi.com> karsh@trifolium.sgi.com (Bruce Karsh) writes:
>In article <9570@celit.fps.com> ps@fps.com (Patricia Shanahan) writes:
>
[Dumb remark about non-usability of integer complex for FFT]
>
>Integer-complex FFT's are probably executed much more often than floating
>point FFT's.  There are several microprocessor chips on the market, notably
>the Motorola DSP56001, which are highly optimized for exactly this
>calculation.  There are lots of articles on how to do this.

Sorry - I was wrong about that. I forgot the DSP micros. I beg their 
pardon.

>The change from having integer multiplies being much faster than FP
>multiplies rather than having them be slower is one of the most interesting
>trends that's occuring in computer architecture.

I think in the long term it is going to turn out to be a temporary aberration.
A system that can do FP multiplies reasonably fast probably has the
ability somewhere to do e.g. 53*53->53 bit integer multiplies. It just
isn't necessarily accessible from the integer processor.

The general trend seems to be for getting the data to and from the registers
to become the real bottleneck, and arithmetic to get so fast that you
can do it as fast as the data can be moved around. I expect in the long
term a lot of operations will all become the same speed.

>
>Is it really true that SPARC fp register to register ops are all constant-
>time?  Are SPARC fp adds and multiplies both equal-time?  This isn't true for
>the MIPS.  Does anyone out there know for sure?

I didn't actually say they are constant time, just equal architectural
status and similar time. Also, remember that there are several implementations
of SPARC around with different floating point speeds. How does the difference
in speed between a MIPS FP add and a MIPS FP multiply compare with the
difference between a SPARC integer add and a typical SPARC integer multiply?
--
	Patricia Shanahan
	ps@fps.com
        uucp : ucsd!celerity!ps
	phone: (619) 271-9940

cik@l.cc.purdue.edu (Herman Rubin) (06/29/90)

In article <9570@celit.fps.com>, ps@fps.com (Patricia Shanahan) writes:
> In article <63034@sgi.sgi.com> karsh@trifolium.sgi.com (Bruce Karsh) writes:
> >In article <1990Jun27.011149.2406@Stardent.COM> wright@stardent.Stardent.COM (David Wright @stardent) writes:

			........................

> >But the original posting stated that it was enough to just
> >write clear code because the optimizer will make the most of it.  In general,
> >it won't.  Maybe someday.
> ...
> 
> I certainly agree that there are steps that can be taken to make FFT
> fast that would not be within the state of the art for compilation
> from a simple source, unless your compiler can recognise FFT's, and is
> in the habit of reading the latest papers on how to do them.

When will people in the computer field realize that this will always be the
case?  There is much that a compiler cannot handle well, and this will 
remain the case.  Computers cannot beat a human adept at something as
simple as chess, let alone the ingenuity required to produce efficient
computational procedures.

One could put FFTs in the language, but there are lots of others.  What
about such simple things as overflows and decent multiplication and
division?  There are numerous operations easily implemented in hardware
and expensive in software.
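Overflow checking is a good example of "easy in hardware, expensive in
software": where hardware sets a condition flag for free, portable C
needs several comparisons per operation.  A sketch of a software
overflow test for signed addition (the function name is mine):

```c
#include <limits.h>

/* Returns nonzero if a + b would overflow a signed int.  In
   hardware this is a single flag test after the add; in portable
   C it must be done with range checks *before* the add, because
   signed overflow itself is undefined behavior. */
static int add_overflows(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) return 1;
    if (b < 0 && a < INT_MIN - b) return 1;
    return 0;
}
```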
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

markv@gauss.Princeton.EDU (Mark VandeWettering) (07/03/90)

In article <2273@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:

>When can people in the computer field realize that this will always be the
>case?  There is much that a compiler cannot handle well, and this will 
>remain the case.  Computers cannot beat a human adept at something as
>simple as chess, let alone the ingenuity required to produce efficient
>computational procedures.

*sigh*  Computers can certainly beat ME at chess, and I would imagine 
that most good compilers can beat most average assembly hackers on 
a given problem.  In particular, humans can be bad at effeciently managing
registers (let's just save 'em all).  Compilers ARE getting better, and its
not clear that humans are, so draw your own conclusion.

Also, once you have a good compiler, you needn't incur the cost of training
a person to become a proficient assembly language programmer.  (Therapy 
is expensive! :-)

Mark VandeWettering