[comp.arch] Small is beautiful

fouts@orville.nas.nasa.gov.UUCP (09/01/87)

In article <1883@encore.UUCP> adamm@encore.UUCP (Adam S. Moskowitz) writes:

> . . ., but why add the size limit?
>Why kill yourself to deal with what many people feel is a bad hardware design?

I do much of my work on a Cray 2.  It has 2 gigabytes of REAL memory.
I still try to write small programs.  (Some are even under a gigabyte
;-)  Seriously, small is something that a reasonable amount of effort
should be put into achieving.  Like most programming credos, it should
be practiced in moderation rather than to excess.  When you can do it
without destroying maintainability, you can accomplish:

1) Faster code.  By taking the time to come up with a compact
   algorithm, you can usually find one faster than you were 
   going to use in the first place.

2) Easier to understand code.  This is only true if you try for
   moderation, but less code, if it is well thought out, means
   less to understand when trying to comprehend the program.

3) More useful code.  The more people who can use your code, the
   better off you are.

You can carry the less-is-more credo too far and get unmaintainable
code.  You can take "memory is cheap" too far and get unusable code.  As
always in engineering, truth lies in the middle ground, near the
swamp.

Marty

fpst@hubcap.UUCP (09/02/87)

in article <2640@ames.arpa>, fouts@orville.nas.nasa.gov (Marty Fouts) says:
> 1) Faster code.  By taking the time to come up with a compact ...
> 2) Easier to understand code.  This is only true if you try for...
> 3) More useful code.  The more people who can use your code, the
>    better off you are.

The other caveat has to do with not completely confounding the
compiler's optimization.  As we get further away from von Neumann
architectures, the compiler will become more important as a tool beyond
the "save me from the detail" genre.  Now one doesn't know where or how
invocation proceeds.

Comments and/or proposals?

-- 
Steve Stevenson                            fpst@hubcap.clemson.edu
(aka D. E. Stevenson),                     fpst@clemson.csnet
Department of Computer Science,            comp.hypercube
Clemson University, Clemson, SC 29634-1906 (803)656-5880.mabell

srg@quick.COM (Spencer Garrett) (09/21/87)

I'd like to start by saying that I'm firmly in Henry Spencer's camp on this
one.  Performance isn't the only benefit, however.  Smaller programs are
easier to write, debug, maintain, and (drumroll here) *use*.  I measure
"small" by economy of concepts more than line or byte counts, though they
tend to follow.  Let me give an example.  For many years (back in the dark
ages) I used TECO as my primary text editor.  I used it day in and day out,
and I consider myself a wizard (I wrote a multi-user, protected mode OS
for the PDP-8 during this time!), but I *never* reached the point where I
didn't want the TECO manual handy at all times.  Later I found myself in
charge of a system with a screen-oriented editor very reminiscent of TECO
(you know - every ascii code is a command, and then some).  These editors
are very powerful, (I've seen a screen editor written as a TECO macro, for
pete's sake!) but they are complicated to use and impossible to remember.
Well, *everyone* waited to use the screen editor rather than using the line
editor, but it only ran on an expensive vector display which we couldn't
afford to replicate.  I spent quite a while watching how people used it
before starting to write a screen editor which would run on ordinary terminals
(this was pre-UNIX and very pre-VI).  I discovered that half a dozen commands
constituted nearly the entire working set for most users.  Most of the time
they would hold the right arrow key down and let it auto-repeat to get to
the end of the line, rather than trying to remember the direct sequence.  I
decided that what people needed was an editor simple enough to use that it
wouldn't get in their way, so they could think about the program and not
about the editor.  Well, I wrote it and it has many rabid adherents to this
day.  It has grown to a whopping 2800 lines of C, and nobody bothers to keep
the manual handy.  It takes about 32k + buffer(expanding) and averages
1/6th the cpu of vi.  It has maybe 20% of the functionality of vi, but that's
more than almost anybody can remember how to use in vi.  The source code for
vi, on the other hand, is twice as big as the V7 kernel, and there are well-
known bugs that nobody can find.  On the other end of the scale, I've seen
a single copy of EMACS bring a 750 to its knees, and you have to think very
carefully about every keystroke or you may delete every source file in the
county.  The difference here is one of design, not implementation.  I couldn't
write vi in 2800 lines, much less EMACS.  I simply chose to seek the essence
of editing, not the outer limits.  It was exactly this feeling which gave us
UNIX instead of VMS, and I don't think the shrinking cost of mips and megabytes
is any reason to abandon the quest.  I therefore humbly proclaim Spencer's Law:

	No program over 10,000 lines long can ever be made to work (correctly).

I think this fundamental constant describes people, not computers, but that
is, after all, what this business is really about.

I hereby don my asbestos suit and set my autodialer to 911 ---

					spencer

rogerh@arizona.edu (Roger Hayes) (09/21/87)

I am thinking of having this article engraved on my office wall.
Both for myself and for future denizens.

		Roger Hayes, rogerh@arizona.edu

bldrnr@apple.UUCP (Brian Hurley) (09/22/87)

>I measure
>"small" by economy of concepts more than line or byte counts, though they
>tend to follow.  Let me give an example.  For many years (back in the dark
>ages) I used TECO as my primary text editor.  I used it day in and day out,
>and I consider myself a wizard (I wrote a multi-user, protected mode OS
>for the PDP-8 during this time!), but I *never* reached the point where I
>didn't want the TECO manual handy at all times  [...]

	Some people have no problem remembering how to use 'vi' -- THE REST OF
US would do better if we added complexity where complexity is needed.
Terminals, not the software, need expanding.  I use a microcomputer for my
text editing needs.  It makes an incredible terminal/editor.  I fail to see
why there is such a fuss over ergonomics in a program that must by definition
communicate with its user by having him/her bang away at 58 switches mounted
in a confusing array.  As if that were not problem enough, add the cases where
every key has an alternate meaning, depending on the MODE of the program.
(How many times have you had to use the 'dog-terminal' in a lab?  Isn't it
strange that the only keys that don't work are the ones you really need?)
	I'm not talking about <CR> or <BS>; I'm talking about arrow keys,
[Y]ank, and [P]ut.  Just try splitting up a paragraph sometime, or relocating a
function for 'documentation purposes'.  Character-only devices have admittedly
been around for a very long time, but they are not the problem.  I do perform
most of my text work off line, but that need not be so.  This article was
written using a program that is an interface to large portions of the unix
environment.  Uploads and downloads are transparent, and it keeps track of
'vi's mode for me.  I click a mouse and it positions the cursor... I could
elaborate more but you get the point.
		Yeah, it is a Macintosh, and I am not ashamed of that.  There's
got to be something to a 'terminal-emulator' that sorts the [i]rritation out
of 'vi'.

		Summing up, I would say that simplicity in software does not
solve the problem.  It creates incomplete solutions.  What we need is complete
hardware, as well as complete software, to supplement the 'wetware'.

>	No program over 10,000 lines long can ever be made to work (correctly).
>
>I think this fundamental constant describes people, not computers, but that
>is, after all, what this business is really about.
>
>I hereby don my asbestos suit and set my autodialer to 911 ---
>
>					spencer

	Thanks, Spencer, for pointing out that what is needed is a simple
interface.

--/--bldrnr--> 

*******
NOTE: My views are my views, and do not reflect in any way the views of my
employer.  (I am not an Evangelist for Macintosh... I hate what it puts an
innocent C programmer through.)

brucek@hpsrla.HP.COM (Bruce Kleinman) (09/22/87)

> No program over 10,000 lines long can ever be made to work (correctly).
  ...
> I hereby don my asbestos suit and set my autodialer to 911 ---

Flames?  Not from me.  Only a hearty *sigh* because I couldn't agree more.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              Bruce Kleinman
              Hewlett Packard -- Network Measurements Division
                          Santa Rosa, California

                         ....hplabs!hpsrla!brucek
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

keith@reed.UUCP (Keith Packard) (09/24/87)

> No program over 10,000 lines long can ever be made to work (correctly).
  ...
> I hereby don my asbestos suit and set my autodialer to 911 ---


Actually, I would only agree with this statement about C.  I'd rate maximum
program size differently with different languages.  Data hiding and run-time
checks can substantially reduce the number of bugs caused by mysterious
interactions.

Object-oriented programming often lets substantial portions of a program
be borrowed from an already working system (code sharing), which makes the
actual amount of new code substantially smaller while still ending up with a
giant system.  Here's how I rate them:

	Forth:		1 line
	Basic:		10 lines
	Asm:		50 lines
	Fortran:	100 lines
	Pascal:		1000 lines (standard pascal w/o modules)
	lisp:		3000 lines
	C:		10000 lines
	Modern Pascal:	20000 lines (w/ modules and type bashing)
	C++:		20000 lines
	
Modern Pascal and C++ trade problems -- C++ doesn't have run-time array
bounds checking, and some people have a tendency to use pointers a bit wildly.
Pascal, on the other hand, has poor support for initialization and no real
code-sharing possibilities (as well as no object support).

I know of no system useful in designs > 20000 lines.  Smalltalk comes close,
but most Smalltalk systems would run that amount of code at a snail's pace
(if they could hold all of it in the object table anyway).

I've worked on too many 100000-200000 line C programs...

-- 
Keith Packard			tektronix!reed!keith

trb@stag.UUCP ( Todd Burkey ) (09/25/87)

In article <121@quick.COM> srg@quick.COM (Spencer Garrett) writes:
>I'd like to start by saying that I'm firmly in Henry Spencer's camp on this
>one.  Performance isn't the only benefit, however.  Smaller programs are
>easier to write, debug, maintain, and (drumroll here) *use*.  I measure
>"small" by economy of concepts more than line or byte counts, though they
>tend to follow.

I couldn't agree more. Having used line editors for years, and then
the various Vax editors, Emacs, VI, and pseudo-editors like Apollo's
DM thing (which I am currently using), I finally broke down several
weeks ago and started a spare-time project to develop a programmer's
editor. After glancing at the size of the EMACS/uEmacs/Vi source code
and the horrendously inefficient way data is kept track of, I decided
to start from scratch. And I would be surprised if my code even came
to 2800 lines (even with procedure and code segment snap control and
smart variable lookup)...The key thing in getting to small code is to
think out the project completely before you start and to not get
carried away with gimmicks. For example, I won't have a twiddle
character or line option...if I design cut and paste correctly, then
there will be no need for it. I also don't plan on putting any
limitations on the code for the sake of small systems with segment
restrictions and the like (i.e. it will run on my Atari ST, but I feel
it is silly to butcher the code to make it work around the segment
sizes, limited memory, and slow speed of my IBM PC).

  Also awaiting with my Fire cap on...
    -Todd Burkey
    trb@stag.UUCP

mdg@suprt.UUCP (Marc de Groot) (10/07/87)

In article <121@quick.COM>, srg@quick.COM (Spencer Garrett) writes:
> (I've seen a screen editor written as a TECO macro, for
> pete's sake!)

Didn't Richard Stallman write the original EMACS as a TECO macro? That's
what I thought I heard.

> I
> decided that what people needed was an editor simple enough to use that it
> wouldn't get in their way, so they could think about the program and not
> about the editor.  Well, I wrote it and it has many rabid adherents to this
> day.  It has grown to a whopping 2800 lines of C, and nobody bothers to keep
> the manual handy.

Yay! This is my idea of how software should be done!

> I therefore humbly proclaim Spencer's Law:
> 
> 	No program over 10,000 lines long can ever be made to work (correctly).

May I quote you on this?

I have only recently started working professionally in the UNIX world. I came
from more modest microcomputer environments. While I can understand the big
boys' argument that people who live with 64K limitations and the like become
crippled (or at least blindered), I still feel that the trend towards eating
memory for breakfast, megalithic programs, and inefficient use of CPU cycles
is misguided.
I have programmed in Forth
	(half the readers are deciding to skip the rest of the article now)
for a number of years on 8- and 16-bit machines, *and* the 68020. I have
used interrupts, implemented multitasking, simulated privileged mode
operation on the Z80, written custom I/O drivers for hardware I built myself,
and many other "heavy" systems-type jobs. I have never experienced the
sensation that the code changes by itself overnight before working with UNIX
and its hopelessly obfuscated, under-documented utilities, nor have I ever
worked with the conviction that "you can't get much of anything done in 1
day".
	I hasten to point out that Forth is not an environment for running
off-the-shelf DBMS and accounting software. It *is* an environment for
software development which affords unequalled flexibility in a compact
space. The only environments which provide such extensive power are Lisp
environments, and I've never owned a machine that was big enough for Lisp.
	I have developed *many* programs over the past few years that have
no bugs in them. These programs were designed with Mr. Garrett's philosophy
in mind: Try to include the essence without bogging the program down in
unnecessary functionality. Mr. Charles Moore, the inventor of Forth and
chief architect of the Novix NC4000 hardware Forth processor, has written
a Forth interpreter (for the NC4000) that fits in 2 kilowords, and is
sufficiently powerful to recompile itself. It is also a reasonable
software development environment.
	Now THAT'S what I call software.
-- 
Marc de Groot (KG6KF) @ Microport Systems, Inc.
UUCP: {hplabs, sun, ucbvax}!amdcad!uport!mdg
FONE: 408 438 8649 Ext. 31
DISCLAIMER: "..full of sound and fury, not necessarily agreeing with anyone.."

callen@ada-uts (10/07/87)

>Object-oriented programming often lets substantial portions of a program
>be borrowed from an already working system (code sharing), which makes the
>actual amount of new code substantially smaller while still ending up with a
>giant system.  Here's how I rate them:
>
>        Forth:          1 line
>        Basic:          10 lines
>        Asm:            50 lines
>        Fortran:        100 lines
>        Pascal:         1000 lines (standard pascal w/o modules)
>        lisp:           3000 lines
>        C:              10000 lines
>        Modern Pascal:  20000 lines (w/ modules and type bashing)
>        C++:            20000 lines

What about Ada? (No snickering, please.) I've been working on the runtime
system of an Ada compiler that is written in Ada (self-bootstrapped) for
well over a year, and at this point I wouldn't trade Ada for any of the
languages on the list above, FOR LARGE SYSTEMS (quick hacks in Ada are a
pain). But then I don't know Lisp or C++. Ada really makes it easy to
write nearly bug-free code that works first time. I've only done a small
amount of programming in Modula-II, but since it shares many of Ada's
key concepts, I suspect that this is true of Modula-II as well. Of course,
I'm not about to claim that our compiler is bug-free.... :-)

-- Jerry Callen    ...{ihnp4,ima}!inmet!callen

earl@mips.UUCP (Earl Killian) (10/12/87)

In article <115@suprt.UUCP>, mdg@suprt.UUCP (Marc de Groot) writes:
> Didn't Richard Stallman write the original EMACS as a TECO macro? That's
> what I thought I heard.

Yes, this is true.  However, it was not written in ordinary TECO, but
rather ITS TECO, which is a high-powered programming language in its
own right, with far more control structure and data primitives than C
or Pascal (the control structures are mostly borrowed from Lisp).  A
better debugging environment too.  While powerful, ITS TECO has an
obscurity quotient higher than just about any programming language
ever invented.  (About as close as a hacker can get to mainlining.)

If you think small is beautiful, then you'd probably love the original
EMACS; the TECO code was probably 8x smaller than the corresponding
pdp10 machine code would be.  The number of lines of source is
probably 4x smaller than the corresponding C code would be.  But for
some very strange reason, none of the subsequent EMACS implementations
has ever used ITS TECO as an implementation language :-).

eugene@pioneer.arpa (Eugene Miya N.) (10/12/87)

In article <56700003@ada-uts> callen@ada-uts writes:
>>        Forth:          1 line
	. . . List of Languages
>What about Ada? (No snickering, please.) I've been working on the runtime

Funny, this harkens back to the days when these arguments were used with APL
and PL/I.  Shows the limitations of Usenet readership, sigh!  A real loss
for the hundreds of IBM language design people.

From the Rock of Ages Home for Retired Hackers:

--eugene miya
  NASA Ames Research Center
  eugene@ames-aurora.ARPA
  "You trust the `reply' command with all those different mailers out there?"
  "Send mail, avoid follow-ups.  If enough, I'll summarize."
  {hplabs,hao,ihnp4,decwrl,allegra,tektronix}!ames!aurora!eugene