[comp.misc] Basics of Program Design

brister@td2cad.intel.com (James Brister) (06/20/88)

This may be a stupid question, and I hope I'm in the right newsgroup.
Send any flames directly to me and I'll shut up.

For the past few years my professors have rambled on about data
structures, compiler design, and more, ad nauseam. But none of them
have ever mentioned the best way (or any way, really) to go about
sitting down and actually writing a program.

What I mean is: do experienced programmers usually write flow charts
or pseudo code?  Do they write the whole program out by hand first, or
build it up in their favorite editor?

What's a good way to organize the writing process? I have all these
great ideas running around in my head that I find difficult to get
out without making mistakes in the task itself.

Feel free to reply to me personally; if anyone else is
interested, I will summarize any answers and then post them.

+------------------------------------+----------------------------------------+
| Any opinions are my own and not    |   brister@td2cad.intel.com             |
| of my employer.                    |   USNAIL: 3065 Bowers Ave MS SC9-37    |
| (Corporations only have policies)  |           Santa Clara, CA 95051        |
+------------------------------------+----------------------------------------+

jfh@rpp386.UUCP (John F. Haugh II) (06/22/88)

[ remember kids, this is just the way this lowly hacker does things.  your
  mileage may vary.  i'm bored, so writing this posting seemed like a nice
  way to pass a few minutes. - john. ]

In article <901@td2cad.intel.com> brister@td3cad.UUCP (James Brister) writes:
>This may be a stupid question, and I hope I'm in the right newsgroup.
>Send any flames directly to me and I'll shut up.

not much else going on.  not enough daylight left for a good bike ride,
so here goes.

>For the past few years my professors have rambled on about data
>structures, compiler design and more, ad nauseam. But none of them
>have ever mentioned the best way, (or any way really) to go about
>sitting down and actually writing a program.

yes, colleges and universities don't seem too big on useful information.
a friend in college was charged with writing a string length function
in C for a compiler writing class, so he chose the `elegant' solution.

strlen (s)
char	*s;
{
	if (*s)
		return (strlen (s + 1) + 1);
	else
		return (0);
}

not such a smart move.  always consider the cost of your algorithm.
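for comparison, the obvious loop does the same job in constant stack
space.  it's roughly what a library strlen looks like, though the real
one may be cleverer:

strlen (s)
char	*s;
{
	int	n;

	for (n = 0; *s++; n++)
		;
	return (n);
}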

>What I mean is: do experienced programmers usually write flow charts,
>or pseudo code. Do they write the whole program out by hand first or
>build it up in their favorite editor?

i don't think i've written a flow chart in 3 years, and i've been
programming for 10 years now.  the methods i've used have ranged from
hand code, flow chart, and keypunch over and over again, to pseudo code,
to just plain hacking at a terminal.  my current favorite is a mix of
the pseudo code game and hacking at a terminal.  i just finished 45,000
lines of C this spring, most of which i hacked at a terminal.

>What's a good way to organize the writing process? I have all these
>great ideas running around in my head that I find difficult to get
>out without making mistakes in the task itself.

i try to use successive refinements.  throw some good ideas on a 
piece of paper or tube, crank the sucker up, and see where you missed.
the most useful thing i find is to get the ideas written down while
they are fresh, then work out the algorithms later.  and after that,
do the final coding.  coding isn't such a big deal, algorithms are.

worst mistake is to write code too soon.  code once written becomes
engraved in stone.  always be ready to throw the whole mess away and
start again.  there is a fantastic text, "The Mythical Man-Month:
Essays on Software Engineering", which is a must read.  go buy it and
read it fifty or a hundred times.

- john.
-- 
John F. Haugh II                 +--------- Cute Chemistry Quote ---------
River Parishes Programming       |  "If you aren't part of the solution,
UUCP:   killer!rpp386!jfh        |   you are part of the precipitate."
DOMAIN: jfh@rpp386.uucp          |             -- Some FORTUNE program

webber@aramis.rutgers.edu (Bob Webber) (06/23/88)

In article <3061@rpp386.UUCP>, jfh@rpp386.UUCP (John F. Haugh II) writes:
> ...
> yes, colleges and universities don't seem too big on useful information.
> a friend in college was charged with writing a string length function
> in C for a compiler writing class so he chose the `elegant' solution.
> 
> strlen (s)
> char	*s;
> {
> 	if (*s)
> 		return (strlen (s + 1) + 1);
> 	else
> 		return (0);
> }
> 
> not such a smart move.  always consider the cost of your algorithm.

A perfectly fine algorithm.  Any decent compiler would remove the tail
recursion.  I suppose one could do a little loop unrolling, but on
cpu's with instruction caches that is not always a big win.  Also it
is not clear that strlen would be heavily used (certainly it was never
a bottleneck in any of the compilers I wrote).  Of course, it would
have been nice if he could have psychically reached into the C
libraries and invoked the standard strlen function.  [Anyway it is a
much smarter move than declaring a function called ``write'' to handle
the write statement in the source language.]

>...
> worst mistake is to write code too soon.  code once written becomes
> engraved in stone.  always be ready to throw the whole mess away and

Even more important, sometimes when you think long enough about a problem
it just goes away.

> start again.  there is a fantastic text, "The Mythical Man-Month:
> Essays on Software Engineering", which is a must read.  go buy it and
> read it fifty or a hundred times.

For sure, and it is still available in print as a trade paperback.
Brooks wrote a note in IEEE Software Engineering not too long ago
saying how every time someone comes up to him and tells him how useful
his book is and how true to life the examples are -- he shudders.
Apparently he had hoped things would get better.

------- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

p.s., your description of how to ``actually write'' code looked fine to
me (I bet that makes you nervous).

stevev@tekchips.TEK.COM (Steve Vegdahl) (06/25/88)

In article <Jun.22.22.48.57.1988.7380@aramis.rutgers.edu>, webber@aramis.rutgers.edu (Bob Webber) writes:
> In article <3061@rpp386.UUCP>, jfh@rpp386.UUCP (John F. Haugh II) writes:
> > ...
> > yes, colleges and universities don't seem too big on useful information.
> > a friend in college was charged with writing a string length function
> > in C for a compiler writing class so he chose the `elegant' solution.
> > 
> > strlen (s)
> > char	*s;
> > {
> > 	if (*s)
> > 		return (strlen (s + 1) + 1);
> > 	else
> > 		return (0);
> > }
> > 
> > not such a smart move.  always consider the cost of your algorithm.
> 
> A perfectly fine algorithm.  Any decent compiler would remove the tail
> recursion.

Unfortunately, the above program is not tail-recursive.  The result of the
recursive "strlen" call is incremented before its value is returned.  It
would take a pretty sophisticated compiler to transform this into an
iteration.  Among other things, it would probably have to use the
associativity of "+".

Consider changing the algorithm so that we sum the elements of a "string" of
floating-point numbers.  (Floating-point addition is not associative.)

float floatStringSum (s)
float	*s;
{
	if (*s != 0.0)
		return (floatStringSum (s + 1) + *s);
	else
		return (0.0);
}
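
For contrast, a version that really is tail-recursive carries the count
in an extra argument (call it strlen2; just a sketch to show the shape,
with the caller starting it as strlen2(s, 0)):

strlen2 (s, n)
char	*s;
int	n;
{
	if (*s)
		return (strlen2 (s + 1, n + 1));
	else
		return (n);
}

Here nothing remains to be done after the recursive call returns, so a
compiler that eliminates tail calls can turn it into a loop.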

BTW, does a typical C compiler perform tail-call optimization?

		Steve Vegdahl
		Computer Research Lab
		Tektronix Labs
		Beaverton, Oregon

bill@proxftl.UUCP (T. William Wells) (07/01/88)

In article <Jun.22.22.48.57.1988.7380@aramis.rutgers.edu>, webber@aramis.rutgers.edu (Bob Webber) writes:
> In article <3061@rpp386.UUCP>, jfh@rpp386.UUCP (John F. Haugh II) writes:
> > strlen (s)
> > char        *s;
> > {
> >     if (*s)
> >             return (strlen (s + 1) + 1);
> >     else
> >             return (0);
> > }
> >
> > not such a smart move.  always consider the cost of your algorithm.
>
> A perfectly fine algorithm.  Any decent compiler would remove the tail
> recursion.

Ahhhh.  Last I heard, tail recursion implies a function call
immediately before a return.  This function has an add (or
increment) before the return.  Has the definition of tail
recursion changed?

Also, get REAL.  Merely asserting that "any decent compiler
would..." is not relevant.  This just asserts that YOU think that
a compiler that does not deal with tail recursion is not decent.
In the real world, most compilers do NOT deal with tail recursion,
it not often being very useful to do this for C programs.

A perfectly good algorithm can be a BAD choice in the real
world.  Yes, the algorithm is "good", which is to say, it works
and is of the best possible order, but it is poorer than the
standard implementation since its constant factor is going to be
larger.  What you learn in school are the tools of the trade;
what that unfortunate is running into is that school does not
teach how to select the right tool for the job.

webber@porthos.rutgers.edu (Bob Webber) (07/02/88)

In article <395@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
> In article <Jun.22.22.48.57.1988.7380@aramis.rutgers.edu>, webber@aramis.rutgers.edu (Bob Webber) writes:
> > > [recursive strlen which you should all be familiar with by now]
> > > not such a smart move.  always consider the cost of your algorithm.
> > A perfectly fine algorithm.  Any decent compiler would remove the tail
> > recursion.
> Ahhhh.  Last I heard, tail recursion implies a function call
> immediately before a return.  This function has an add (or
> increment) before the return.  Has the definition of tail
> recursion changed?

Nope.  There is more to it than just tail recursion, but tail
recursion is an essential part of the optimization.

> In the real world, most compilers do NOT deal with tail recursion,
         ^^^^^^^^^^ -- what makes you think your world is real?
> it not often being very useful to do this for C programs.

Well, there is a free, high quality C compiler for 68000-based systems
and various others that does do tail recursion (i.e., gcc) among other
things.  [However, it wouldn't catch this particular case -- perhaps
that is why you aren't using it? ]  Optimizations fall into two categories:
those that handle the poor fit between language and architecture and those
that save programmer time.  While for a language like SCHEME, this
would be viewed as a language-related optimization, in a language like
C it falls in the second category.  

> A perfectly good algorithm can be a BAD choice in the real
> world.  Yes, the algorithm is "good", which is to say, it works
> and is of the best possible order, but it is poorer than the
> standard implementation since its constant factor is going to be
> larger. 

Yes, you have learned the first lesson.  Let me introduce you to the
second lesson: Not all code is executed the same amount of time.  Thus
it is more important for some things to be fast than it is for others.
Strlen does not make sense as a function to be optimizing the death out
of.  After all, if it is used all that much -- why isn't it a macro?

------ BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

chris@mimsy.UUCP (Chris Torek) (07/03/88)

In article <395@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) answers
Bob Webber with:
[tail-recursive strlen deleted]
>... Merely asserting that "any decent compiler would..." is not relevant.
>This just asserts that YOU think that a compiler that does not deal with
>tail recursion is not decent.  In the real world, most compilers do NOT
>deal with tail recursion, it not often being very useful to do this for
>C programs.

Actually, there are (only) two reasons `real world' C compilers do nothing
about tail recursion:  Hysterical Raisins, and debugging.

top:
	since C compilers do not eliminate tail recursion, programmers
	avoided it; since programmers avoided it, it does not occur
	often; since it does not occur often, compilers that eliminate
	it do work that rarely pays off; since the work rarely pays
	off, those compilers are not as fast as others; since the
	compilers are not as fast, they are not considered as good;
	since they are not considered as good, they are not used; since
	they are not used, C compilers do not eliminate tail recursion;
	goto top.

:-)

Debugging programs in which tail recursion has been optimised away is
often a confusing experience.  Combined with the Raisins, this may
make enough of an argument to keep compiler writers from doing more work.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

bill@proxftl.UUCP (T. William Wells) (07/06/88)

In article <Jul.2.04.25.59.1988.8179@porthos.rutgers.edu>, webber@porthos.rutgers.edu (Bob Webber) writes:
> > In the real world, most compilers do NOT deal with tail recursion,
>          ^^^^^^^^^^ -- what makes you think your world is real?

14 years of commercial programming, the last 5 almost exclusively
C. Languages used at least a year to earn paychecks,
chronologically: Fortran, Basic, various database languages,
COBOL, RPG, C.  Over half a dozen assembly languages used for
commercial programming.  Recent projects: project leader for
improvements to a spelling checker for 6 (at the time) languages,
currently doing R&D for a grammar checker.

Real enough?

> > it not often being very useful to do this for C programs.
>
> Well, there is a free, high quality C compiler for 68000-based systems
> and various others that does do tail recursion (i.e., gcc) among other
> things.  [However, it wouldn't catch this particular case -- perhaps
> that is why you aren't using it? ]

That's fine for 68000's (like the Sun I am working on) but not
for IBM-PC's (our secondary machines) nor for the 100+ compilers
our code is regularly compiled with. Relying on compilers to do
tail recursion optimization would be slitting our own throats.

>                                     Optimizations fall into two categories:
> those that handle the poor fit between language and architecture and those
> that save programmer time.  While for a language like SCHEME, this
> would be viewed as a language-related optimization, in a language like
> C it falls in the second category.

Sorry, but using purely tail recursive routines does not save
time for this programmer. Anywhere you would use tail recursion,
I'd automatically code a while loop.  (See below for why.) This
means that your optimization actually falls in a third category:
optimizations designed to let the programmer choose his own
personal coding style.  This does not apply to tail recursion
discovered by the optimizer, however.

Also, another real-world point: recursive routines in general
need to be used carefully, to avoid excess resource consumption.
As an example, I recently translated a rather complicated parsing
routine from SCHEME to C. In the translated version were several
recursive routines. One of the recursive routines was left alone:
the greatest depth of recursion was the length of a grammar rule,
and removing the recursion would have been a real pain.  Another
got fixed (even though it required extensive recoding) because it
was not possible to bound the stack usage. Our customers WILL
NOT let us chew resources with wild abandon.
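
The usual shape of that kind of fix is to trade the call stack for a
table you manage yourself, so the worst case is a number you pick
instead of whatever stack the caller happens to have.  Schematically
(a generic sketch with a made-up node type and limit, not the actual
parser code):

#define	MAXWORK	256

struct node {
	struct node	*left, *right;
	/* ... whatever the real nodes carry ... */
};

walk (root)
struct node	*root;
{
	struct node	*work[MAXWORK];
	struct node	*n;
	int	top = 0;

	if (root)
		work[top++] = root;
	while (top > 0) {
		n = work[--top];
		/* ... do the per-node work on n here ... */
		if (n->right) {
			if (top >= MAXWORK)
				return (-1);	/* bounded: report, don't blow up */
			work[top++] = n->right;
		}
		if (n->left) {
			if (top >= MAXWORK)
				return (-1);
			work[top++] = n->left;
		}
	}
	return (0);
}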

> > A perfectly good algorithm can be a BAD choice in the real
> > world.  Yes, the algorithm is "good", which is to say, it works
> > and is of the best possible order, but it is poorer than the
> > standard implementation since its constant factor is going to be
> > larger.
>
> Yes, you have learned the first lesson.  Let me introduce you to the
> second lesson: Not all code is executed the same amount of time.  Thus
> it is more important for some things to be fast than it is for others.
> Strlen does not make sense as a function to be optimizing the death out
> of.  After all, if it is used all that much -- why isn't it a macro?

Look buddy, I know both lessons. From the contents of your
posting, I'd say better than you. One of the things I do better
than damn near anyone else is make C programs go faster and get
smaller.

So here is a third lesson for you: try to make everything you do
as good as you can. The notion that a routine is only going to be
used a small fraction of the time is no excuse for sloppiness.
Even if the routines are only going to be used 1% of the time,
those damn little routines can end up eating much of your
program's run time.

To make this concrete, let me describe the profile output from a
typical program after I have finished with it: The first few
routines take about 5-10% of the execution time apiece.  The
remaining routines take less than a few percent apiece of the
execution time.  (This happens because I follow your second
lesson: I beat the execution time hogs down until they are no
longer hogs.)

If the coder of those small change routines was as sloppy as you
advocate, that sloppiness would result in many or most of those
trivial routines being slower. Since those routines, in
aggregate, make up a significant fraction of my program (from
the standpoint of execution time), that means that my program
will go significantly slower.

Here is another example. A program I had to work with (a C
version of the pattern generator for Knuth's hyphenator) spent
40% of its time in sscanf. There was NO GOOD REASON for
sscanf(buf, "%d %s", &n, &ptr) to take that long, other than
sloppy programming.  (I replaced the call with my own scanning
routines, the time spent in them was then 5% of the execution
time.) Perhaps someone else can give an example of a program
where strlen() did the same thing?
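
(For what it's worth, the replacement for a case like that does not have
to be anything exotic.  The general shape of a hand-rolled scan for
"%d %s" is roughly this; the name and details are made up for
illustration rather than copied from the real program, and as with %s
the caller's word buffer has to be big enough:)

scanpair (p, np, word)
char	*p;
int	*np;
char	*word;
{
	int	n = 0;
	int	neg = 0;

	while (*p == ' ' || *p == '\t')
		p++;
	if (*p == '-') {
		neg = 1;
		p++;
	}
	if (*p < '0' || *p > '9')
		return (0);
	while (*p >= '0' && *p <= '9')
		n = n * 10 + (*p++ - '0');
	*np = neg ? -n : n;

	while (*p == ' ' || *p == '\t')
		p++;
	if (*p == '\0' || *p == '\n')
		return (1);
	while (*p && *p != ' ' && *p != '\t' && *p != '\n')
		*word++ = *p++;
	*word = '\0';
	return (2);	/* mimic sscanf's item count */
}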

And a final point. You have encouraged sloppiness in writing
strlen() by justifying it with "that routine is never going to be
executed much anyway." Even granted that proposition you have
forgotten an important point. If you do not make it a habit to
write good code ALL THE TIME, that means two things: 1) Your code
will be generally less efficient than it could have been.  2)
When it comes time to write efficient code, you will not have
the experience needed to do it.

Learning is a cumulative process. This little bitty routine that
will not be executed much is as good a place for thinking about
efficiency as any other, and possibly better. Do it enough and
efficient coding becomes second nature. Then you will naturally
write efficient programs and as a matter of course will not do
stupid things like writing strlen() recursively.

chris@mimsy.UUCP (Chris Torek) (07/06/88)

In article <430@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
>... try to make everything you do [including strlen()]
>as good as you can. The notion that a routine is only going to be
>used a small fraction of the time is no excuse for sloppiness.

Actually, strlen is all too often more than just a few percent.  This
shows up in particular on 4.3BSD on the MicroVAX II, where strlen is
implemented using the `locc' instruction.  On the MicroVAX II this
instruction is not implemented in hardware, and causes a trap to
emulation code.  Unfortunately, the fancy locc version is almost as
much faster on the other VAXen as it is slower on the uVAX II, when
compared with the obvious version.

>Learning is a cumulative process. This little bitty routine that
>will not be executed much is as good a place for thinking about
>efficiency as any other, and possibly better. Do it enough and
>efficient coding becomes second nature. Then you will naturally
>write efficient programs ....

(wow, someone agrees with me :-) )  This needs to be tempered a bit,
though: there is usually no point in optimising the test in main to see
whether argc is correct....
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

chpf127@ut-emx.UUCP (J. Eaton) (07/06/88)

In article <430@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
> 
>    [stuff about no excuse for sloppiness deleted]
>  (hey, I agree with this, at least for programming)
> 
> To make this concrete, let me describe the profile output from a
> typical program after I have finished with it: The first few
> routines take about 5-10% of the execution time apiece.  The
> remaining routines take less than a few percent apiece of the
> execution time.

   But there are thousands of 'em and the whole thing takes weeks
   to execute!! !-) !-)

   J. Eaton
   UT Department of Chemical Engineering and ugly Fortran

   ( But we're trying to get prettier! )

gateley@mips.csc.ti.com (John Gateley) (07/07/88)

In article <430@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
>[...]
>our code is regularly compiled with. Relying on compilers to do
>tail recursion optimization would be slitting our own throats.
>[...]
>Sorry, but using purely tail recursive routines does not save
>time for this programmer. Anywhere you would use tail recursion,
>I'd automatically code a while loop.  (See below for why.) This
>[...]
>Also, another real-world point: recursive routines in general
>need to be used carefully, to avoid excess resource consumption.
>As an example, I recently translated a rather complicated parsing
>routine from SCHEME to C. In the translated version were several
>recursive routines. One of the recursive routines was left alone,
>the greatest depth of recursion was the length of a grammar rule
>and removing the recursion would have been a real pain.  Another
>got fixed (even though it required extensive recoding) because it
>was not possible to bound the stack usage. Our customers WILL
>NOT let us chew resources with wild abandon.
>[...]
>efficient coding becomes second nature. Then you will naturally
>write efficient programs and as a matter of course will not do
>stupid things like writing strlen() recursively.

The tail-recursion debate has been going on for a while, and I would
like to defend tail recursion, recursion, and optimizations in general
against the points made above. First: tail recursion has only been
discussed as a "true" recursive call optimization in these postings.
It also applies to many non-recursive and indirectly recursive calls.
Tail recursion optimization serves several more purposes than what has
been brought out here: it does not require pushing a return address on
the stack (no stack growth; this is the optimization that has been discussed
here); it allows more efficient argument passing (in the self-recursive
case, all you have to do is change the arguments on the stack, not push
new ones); it eliminates the need for saving register variables on the
stack; and probably several more that I don't know about. I prefer these
optimizations to take place. It has been mentioned that it is hard to
debug with a tail-recursion optimizing compiler. I disagree, and I use
one every day.
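
To make the no-stack-growth point concrete: a self tail call can be
compiled as "overwrite the arguments and jump back to the top".  In
source terms the effect is roughly the following, using Euclid's gcd as
a stand-in example (a hand-written equivalence, not compiler output):

gcd (a, b)
int	a, b;
{
	if (b == 0)
		return (a);
	return (gcd (b, a % b));	/* tail call: nothing left to do after it */
}

gcd_loop (a, b)
int	a, b;
{
	int	t;

	while (b != 0) {	/* what the eliminated tail call amounts to */
		t = a % b;
		a = b;
		b = t;
	}
	return (a);
}

The arguments of the "call" become the loop variables, no return address
is pushed, and no registers need saving across it.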

As for recursion: it is a tool just like while loops. It is a useful
abstraction: many algorithms are much simpler in recursive than in
non-recursive form (my opinion, of course). I feel that in order to get
the most out of a language, it should have good tools (including recursion).
It is up to the compiler writers to make those tools perform well (including
being efficient). For example, there is a simple transformation which
completely removes the stack from a recursive program (translating the
program into continuation passing style). This is useful for compiler
writers: they can use it where appropriate to make the program execute
more efficiently.

I have a story which is similar to some that have been posted. In college,
one of my homework assignments was to determine if a graph was connected.
The graph was represented as an array with 1's for edges. I thought that
recursion was terribly inefficient, so I just multiplied the matrix against
itself the necessary number of times. I also did it recursively. Guess which
one was faster! The recursive one, of course.
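
(The recursive version is just a depth-first marking pass over the
matrix.  Something like this, reconstructed as a sketch rather than
the original homework:)

#define	N	8		/* number of vertices, for the sketch */

int	edge[N][N];		/* adjacency matrix: 1 where there is an edge */
int	seen[N];

visit (v)
int	v;
{
	int	w;

	seen[v] = 1;
	for (w = 0; w < N; w++)
		if (edge[v][w] && !seen[w])
			visit (w);
}

connected ()
{
	int	v;

	for (v = 0; v < N; v++)
		seen[v] = 0;
	visit (0);
	for (v = 0; v < N; v++)
		if (!seen[v])
			return (0);
	return (1);
}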

I do not feel that recursion and/or tail-recursion are bad things. They
are extremely powerful tools, and like all tools (especially extremely
powerful ones) they must be used with care. But, if you think about it,
just about any programming language tool must be used with care.

John Gateley
The above statements are my opinions.

friedl@vsi.UUCP (Stephen J. Friedl) (07/07/88)

In article <12328@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
< (wow, someone agrees with me :-) )  This needs to be tempered a bit,
< though: there is usually no point in optimising the test in main to see
< whether argc is correct....

Hey Chris, I thought *everybody* agreed with you? :-) :-)

-- 
Steve Friedl     V-Systems, Inc. (714) 545-6442     3B2-kind-of-guy
friedl@vsi.com     {backbones}!vsi.com!friedl    attmail!vsi!friedl
--------Nancy Reagan on Professor Chomsky: "Just say Noam" --------

larry@merlin.cvs.rochester.edu (Lawrence Snyder) (07/08/88)

In article <430@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
>To make this concrete, let me describe the profile output from a
>typical program after I have finished with it: The first few
>routines take about 5-10% of the execution time apiece.  The
>remaining routines take less than a few percent apiece of the
>execution time.

How did you get these statistics?
(I'm not challenging your numbers.
Obtaining a breakdown of execution time by function call sounds
like a real useful tool.)

thanx,
lar

bill@proxftl.UUCP (T. William Wells) (07/10/88)

In article <807@vax.UUCP>, larry@merlin.cvs.rochester.edu (Lawrence Snyder) writes:
> In article <430@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
> >To make this concrete, let me describe the profile output from a
> >typical program after I have finished with it: The first few
> >routines take about 5-10% of the execution time apiece.  The
> >remaining routines take less than a few percent apiece of the
> >execution time.
>
> How did you get these statistics?
> (I'm not challenging your numbers.
> Obtaining a breakdown of execution time by function call sounds
> like a real useful tool.)
>
> thanx,
> lar

You bet it is. In fact, often in the late phases of program
development, it is one of my most important tools.

The tool is called an `execution time profiler', `profiler' for
short. They are available on many systems. My primary
development system is UNIX. On UNIX, if you want a profile, you
compile and link your program with the -p option.  When your
program runs, and if it terminates by calling exit, a file
`mon.out' is created. You then run the program `prof', which
creates a nice listing showing the time spent in each function, the
number of times each function is called, and the percent of the
total time used by the function.

There are other profilers available on UNIX; this is just the
easiest to use and interpret.  There are similar tools available
on other systems, MS-DOS for example; also, it is often possible
to roll your own on systems that do not have one.

On UNIX, there is another program, `tcov', which is also
useful.  This program gives you the number of times that each
statement of a program is executed. While this does not tell you
how long the statements took, it does give you a good idea where
to look for inefficiencies.

webber@aramis.rutgers.edu (Bob Webber) (07/10/88)

In article <430@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
< In article <Jul.2.04.25.59.1988.8179@porthos.rutgers.edu>, webber@porthos.rutgers.edu (Bob Webber) writes:
< < [Bill writes:]
< < < In the real world, most compilers do NOT deal with tail recursion,
< <          ^^^^^^^^^^ -- what makes you think your world is real?
< 14 years of commercial programming, the last 5 almost exclusively
< C. Languages used at least a year to earn paychecks,
< chronologically: Fortran, Basic, various database languages,
< COBOL, RPG, C.  Over a half a dozen assembly languages used for
< commercial programming.  Recent projects: project leader for
< improvements to a spelling checker for 6 (at the time) languages,
< currently doing R&D and for a grammar checker.
< 
< Real enough?

Commercial programming in Basic, database languages, COBOL, and RPG you
call REAL?   Phooey.  Glad to hear you finally got to C -- sorry to
hear you got to it after all the fun.

< < < it not often being very useful to do this for C programs.
< <
< < Well, there is a free, high quality C compiler for 68000-based systems
< < and various others that does do tail recursion (i.e., gcc) among other
< < things.  [However, it wouldn't catch this particular case -- perhaps
< < that is why you aren't using it? ]
< 
< That's fine for 68000's (like the Sun I am working on) but not
< for IBM-PC's (our secondary machines) nor for the 100+ compilers
< our code is regularly compiled with. Relying on compilers to do
< tail recursion optimization would be slitting our own throats.

Fine.  From all the comp.binaries.* talk, I understand that it is a major
heroic effort just to compile a program on an IBM-PC, so we certainly
wouldn't expect you to have access to modern compiler technology there.
However, I see no point in crippling an entire generation of programmers
due to the limitations of the IBM-PC.


< Sorry, but using purely tail recursive routines does not save
< time for this programmer. Anywhere you would use tail recursion,
< I'd automatically code a while loop.  

Yeah, well there are some people who never did catch on to recursion
-- but don't worry, I am not banning the while loop, just saying that
there is no reason for people to have to do that sort of coding when
they are trying to organize a large program.

< Also, another real-world point: recursive routines in general
< need to be used carefully, to avoid excess resource consumption.

And while loops need to be used carefully to avoid infinite loops.
So?

< < < A perfectly good algorithm can be a BAD choice in the real
< < < world.  Yes, the algorithm is "good", which is to say, it works
< < < and is of the best possible order, but it is poorer than the
< < < standard implementation since its constant factor is going to be
< < < larger.
< <
< < Yes, you have learned the first lesson.  Let me introduce you to the
< < second lesson: Not all code is executed the same amount of time.  Thus
< < it is more important for some things to be fast than it is for others.
< < Strlen does not make sense as a function to be optimizing the death out
< < of.  After all, if it is used all that much -- why isn't it a macro?
< 
< Look buddy, I know both lessons. From the contents of your
< posting, I'd say better than you. 

Well, you are wrong, but I can hardly say I am surprised to hear
you say it.

<                                  One of the things I do better
< than damn near anyone else is make C programs go faster and get
< smaller.

:-)

< So here is a third lesson for you: try to make everything you do
< as good as you can. The notion that a routine is only going to be
< used a small fraction of the time is no excuse for sloppiness.
< Even if the routines are only going to be used 1% of the time,
< those damn little routines can end up eating much of your
< program's run time.
<....
< And a final point. You have encouraged sloppiness in writing
< strlen() by justifying it with "that routine is never going to be
< executed much anyway." Even granted that proposition you have
< forgotten an important point. If you do not make it a habit to
< write good code ALL THE TIME, that means two things: 1) Your code
< will be generally less efficient than it could have been.  2)
< When it comes time to write efficient code, you will not have
< the experience needed to do it.

Sigh.  Why spend your life writing the world's best strlen routine
to run on an operating system that routinely crashes (such as UNIX)
or a chip as badly built as the 8086-family (what on earth were they
thinking about when they set up pointers -- surfing?)?  The world does
not need your super-spiffy strlen routine.  It needs a stable operating
system, a compiler that works ALL the time, a screen editor that works
ALL the time, ...

---- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

webber@aramis.rutgers.edu (Bob Webber) (07/10/88)

In article <440@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
> In article <807@vax.UUCP>, larry@merlin.cvs.rochester.edu (Lawrence Snyder) writes:
<...
< On UNIX, there is also another program `tcov' which is also
< useful.  This program gives you the number of times that each
< statement of a program is executed. While this does not tell you
< how long the statements took, it does give you a good idea where
< to look for inefficiencies.

Err, tcov was actually designed for something slightly different, test
coverage analysis.  The idea is that you might want to test out all that
spiffy code to make sure it works and tcov would tell you which chunks
of code your tests haven't yet executed.  Believe it or not, a lot of
commercial code makes it out the door with remarkably few execution
paths tested (for that matter, a lot makes it out the door with known
bugs because of deadlines, but that is another matter; at least it
runs fast, sometimes).

----- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

bill@proxftl.UUCP (T. William Wells) (07/13/88)

In article <Jul.10.00.37.03.1988.21259@aramis.rutgers.edu>, webber@aramis.rutgers.edu (Bob Webber) writes:
) In article <440@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
) < On UNIX, there is also another program `tcov' which is also
) < useful.  This program gives you the number of times that each
) < statement of a program is executed. While this does not tell you
) < how long the statements took, it does give you a good idea where
) < to look for inefficiencies.
)
) Err, tcov was actually designed for something slightly different, test
) coverage analysis.  The idea is that you might want to test out all that
) spiffy code to make sure it works and tcov would tell you which chunks
) of code your tests haven't yet executed.

Good point, and one I did not think of since my mind was on
tools for execution time optimization. I personally do not often
use tcov for this purpose because my debugging techniques usually
have me hand step through each line of the program. Of course,
some programs are so complex that this is not practical, e.g., a
parser I wrote recently; then I used tcov.

)                                           Believe it or not, a lot of
) commercial code makes it out the door with remarkably few execution
) paths tested (for that matter, a lot makes it out the door with known
) bugs because of deadlines, but that is another matter; at least it
) runs fast, sometimes).

Unfortunately, that is all too true.  Here we try to check all
the possibilities in the code, though we do not mandate use of
tcov.  And, we have had to ship code with known bugs, though they
are usually of the "this feature is not present" variety.  That's
life in the deadline-ridden world.  Ugh.

bill@proxftl.UUCP (T. William Wells) (07/13/88)

Sigh....  Some people just can't seem to read.

My first posting in this series, <395@proxftl.UUCP>:

1) Queried the definition of tail recursion.

2) Lambasted a poster for saying an algorithm was good because:
   "any good compiler [would get rid of the recursion]", in
   spite of the fact that essentially all C compilers do not.

3) Asserted that the choice of algorithms is determined by
   pragmatic factors, not just the theoretical characteristics
   of the algorithm.

In my next posting, <430@proxftl.UUCP>, I:

1) Said that I've been in this business a long time and have
   worked with a variety of systems.

2) Pointed out that the only compiler that anyone says can do
   tail recursion, gcc, applies to a single processor, and that
   my work requires porting to over a hundred compilers.

3) Asserted that because of this I can not rely on compilers to
   do tail recursion.

4) Pointed out that tail recursion is equivalent to a while loop
   and that a while loop is the preferred way of coding this kind of
   algorithm. [The implied context is C, I'd not say this for
   some other languages.]

5) Asserted that recursion often needs to be checked carefully
   because it consumes resources.

6) Gave examples of how I dealt with some recursion: one by
   leaving it, it being the better code, and another by fixing
   it, as its unbounded recursion would cause problems.

7) Responded to a gratuitous insult with one of my own.

8) Pointed out that I have a lot of experience in optimizing C
   programs.

9) Proposed that the argument that a routine is going to be used
   a small fraction of the time is no excuse for sloppy coding.

10) Gave an example of where that approach could cost one, and
   another example where a sloppy implementation of sscanf
   accounted for 40% of the execution time of a program.

11) Pointed out the way to become the best programmer you can be
   is to try to program as well as possible, all the time.

12) Pointed out that a recursive strlen is just plain stupid.

Nowhere did I suggest that there is anything wrong with
recursion per se.  So let me say this real clear for those of you
who insist on interpolating your own fears and fanaticisms into
others' writings.

	Recursion is a programming tool.  Like any tool, it has
	its uses and limitations.  Under certain circumstances,
	coding it any way but recursive is absolute stupidity.
	Under others, using recursion should get you condemned
	to writing RPG programs for the rest of your life.  In
	many other circumstances, a strong case can be made for
	using recursion, or for doing it some other way.  And in
	other cases, the use of recursion is purely a matter of
	taste.

With that taken care of, I'll address the points raised in the
article I am responding to:

In article <53374@ti-csl.CSNET>, gateley@mips.csc.ti.com (John Gateley) writes:
> In article <430@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
> >[...]
> >our code is regularly compiled with. Relying on compilers to do
> >tail recursion optimization would be slitting our own throats.
> >[...]
> >Sorry, but using purely tail recursive routines does not save
> >time for this programmer. Anywhere you would use tail recursion,
> >I'd automatically code a while loop.  (See below for why.) This
> >[...]
> >Also, another real-world point: recursive routines in general
> >need to be used carefully, to avoid excess resource consumption.
> >As an example, I recently translated a rather complicated parsing
> >routine from SCHEME to C. In the translated version were several
> >recursive routines. One of the recursive routines was left alone,
> >the greatest depth of recursion was the length of a grammar rule
> >and removing the recursion would have been a real pain.  Another
> >got fixed (even though it required extensive recoding) because it
> >was not possible to bound the stack usage. Our customers WILL
> >NOT let us chew resources with wild abandon.
> >[...]
> >efficient coding becomes second nature. Then you will naturally
> >write efficient programs and as a matter of course will not do
> >stupid things like writing strlen() recursively.
>
> The tail-recursion debate has been going on for a while, and I would
> like to defend tail recursion, recursion, and optimizations in general
> against the points made above.

There is flatly no defense against those points.  Fortunately,
the rest of your article does not really try to defend against
them, so you actually say something useful.

1) It is a fact that our code is ported to compilers that do not
   handle recursion well. This is not avoidable.  (Do YOU want to
   write me a good compiler for an 8051 or a 7809?) We can not
   avoid this fact; therefore, we must deal with it. That means
   that often we may not write routines that do extensive
   recursion. And for others, should you want to write portable
   code, you ought not to assume that you have megabytes of
   stack space. You won't.

2) Tail recursion, as a programming technique, is equivalent to
   iteration. Given that most compilers do not properly handle
   tail recursion, that makes the iteration the better coding
   technique. Unless, of course, one is a gcc chauvinist, in
   which case, one's opinions are irrelevant.

3) Like any other programming technique, recursion has its
   costs; failure to properly consider them, in the context of
   the task at hand, is bad programming. Period.

4) Strlen has absolutely no business being coded recursively.
   And while a recursive strlen algorithm might be an
   interesting exercise in alternative thinking, it is simply
   the wrong way to do it. It is just as silly as those example
   factorial algorithms which I suspect misled the individual
   who wrote the strlen. An algorithm implements an idea; the
   idea of strlen is that of the number of characters in a
   string.  The recursive implementation of it simply fails to
   capture this property, except, perhaps, to a mathematician, or
   someone else who counts using Peano's axioms.

>                                First: tail recursion has been only
> discussed as a "true" recursive call optimization in these postings.
> It also applies to many non-recursive and indirectly recursive calls.
> Tail recursion optimizations serve several more purposes than what has
> been brought out here: it does not require pushing a return address on
> the stack (no stack growth, this is the optimization that has been discussed
> here); it allows more efficient argument passing (in the self-recursive
> case, all you have to do is change the arguments on the stack, not push
> new ones); it eliminates the need for saving register variables on the
> stack; and probably several more that I don't know about. I prefer these
> optimizations to take place.

These are good points in favor of adding these optimizations to a
compiler; they have to be balanced against the fact that there
is an almost unbreakable vicious circle that will make it
unlikely that these optimizations will become general.

>                              It has been mentioned that it is hard to
> debug with a tail-recursion optimizing compiler. I disagree and I use
> one every day.

I too thought that objection was silly.

> As for recursion: it is a tool just like while loops. It is a useful
> abstraction: many algorithms are much more simple in a recursive than
> non recursive form (my opinion of course). I feel that in order to get
> the most out of a language, it should have good tools (including recursion).
> It is up to the compiler writers to make those tool perform well (including
> being efficient). For example, there is a simple transformation which
> completely removes the stack from a recursive program (translating the
> program into continuation passing style). This is useful for compiler
> writers: they can use it where appropriate to make the program execute
> more efficiently.

I agree with this; however, remember that we are writing
programs today; we can't write inefficient programs, giving the
excuse that someday a compiler writer will make them efficient
for us. (Well, ok, we can; would you hire someone who thought
like that?) [B.T.W., doesn't a continuation have a context, and
isn't that context equivalent to an activation record?]

> I have a story which is similar to some that have been posted. In college,
> one of my homework assignments was to determine if a graph was connected.
> The graph was represented as an array with 1's for edges. I thought that
> recursion was terribly inefficient, so I just multiplied the matrix against
> itself the necessary number of times. I also did it recursively. Guess which
> one was faster! The recursive one of course.

A good illustration of the tool-ness of techniques.  Another
example of this is a parsing algorithm that is O(n**2.78) (I
think, anyway < 3). It too relies on matrix multiplication.
Standard parsing methods are O(n**3) or worse and tend to be
recursive; however, the constant factor for the "faster"
algorithm is huge. On the other hand...would that change for a
parallel system? Or one especially designed for matrix
multiplication?

> I do not feel that recursion and/or tail-recursion are bad things. They
> are extremely powerful tools, and like all tools (especially extremely
> powerful ones) they must be used with care. But, if you think about it,
> just about any programming language tool must be used with care.

No disagreement here. Though I'd emphasize that the care required
with recursion is of a slightly different kind than for most
other control structures. When I write a while loop, for
example, the care is that I get the algorithm right; the cost of
a while loop is usually evident.  When I code recursively, not
only must I get the algorithm right, but I must also consider
the space and time consumed by the recursive calls.  This
requires a significantly more complex analysis than a while loop
usually requires.

The power of recursion is a good reason for emphasizing it in
programming courses. As long as they also teach techniques for
estimating the cost of the tool.  And grind it into the
student's head that these cost estimation techniques are
necessary.

chris@mimsy.UUCP (Chris Torek) (07/13/88)

In an article whose referent is missing (what weird news software *are*
they running at proxftl anyway? :-) ), someone writes:
>>It has been mentioned that it is hard to
>>debug with a tail-recursion optimizing compiler. I disagree and I use
>>one every day.

In article <464@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
>I too thought that objection was silly.

I was the one who said `harder to debug'.  Note: hard*er*, not *hard*.
I think both of you will agree that it takes a moment's thought to
first decide that even though there is only one occurrence of some
recursive function on the stack trace, you may not be in the first
level of recursion, and then to find out how deep you really are.  (On
the other hand, I suppose it can be argued that hundreds of screenfuls
of nested calls can be rather overwhelming.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gateley@mips.csc.ti.com (John Gateley) (07/14/88)

In article <464@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
>2) Pointed out that the only compiler that anyone says can do
>   tail recursion, gcc, applies to a single processor, and that
>   my work requires porting to over a hundred compilers.

There are many compilers which do tail recursion: several Common Lisp compilers
do, and the programming language Scheme requires it by definition.

>4) Pointed out that tail recursion is equivalent to a while loop
>   and that is the preferred method of coding this kind of
>   algorithm. [The implied context is C, I'd not say this for
>   some other languages.]

Perhaps the name tail recursion is misleading: a function call in tail
recursive position does not need to be a recursive call. In this C program:

main()
{
	if (some_c_exp()) {
		foo(1, 2, 3);
	}
	else {
		bar(4, 5, 6);
	}
}

(pardon my c, I do not usually use this language)
both the call to foo and the call to bar are in tail recursive position,
even though neither is a call to main. The difference between this and the
recursive example is that both of them can be compiled as a "jump" to the
function instead of a subroutine call, and in the self-recursive case the
argument space is reused (instead of pushing new arguments on the stack).
This means that tail recursion is not equivalent to while loops.

>   technique. Unless, of course, one is a gcc chauvinist, in
>   which case, one's opinions are irrelevant.

I am worse than a gcc chauvinist (actually I have never used gcc): I am a Lisp
chauvinist.

>4) Strlen has absolutely no business being coded recursively.
>   And while a recursive strlen algorithm might be an
>   interesting exercise in alternative thinking, it is simply
>   the wrong way to do it. It is just as silly as those example
>   factorial algorithms which I suspect mislead the individual
>   who wrote the strlen. An algorithm implements an idea; the
>   idea of strlen is that of the number of characters in a
>   string.  The recursive implementation of it simply fails to
>   capture this property, except, perhaps, to a mathematician, or
>   someone else who counts using Peano's axioms.

To me, the recursive implementation of strlen tells me exactly what it does;
a while loop would require extra effort on my part to understand. This may
be our different coding styles, but it is not a reason to call recursion
silly. (I am not a mathematician, nor can I use Peano's axioms)

>like that?) [B.T.W., doesn't a continuation have a context, and
>isn't that context equivalent to an activation record?]

Yes, kind of, I think. But what I was describing was continuation passing style,
which is a different thing. In continuation passing style, a program is
converted into an equivalent program where every call is in a tail recursive
position. This makes the program more amenable to some optimizations (not
just the tail recursive ones).
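
In C it takes some squinting (and you will have to forgive my C again),
because there are no closures, so the continuation has to be faked with
a record that carries the code and the saved data.  A rough sketch,
summing a 0-terminated array, where every call is written as a tail call:

struct cont {
	int	(*run) ();		/* run(self, result) */
	int	saved;
	struct cont	*next;
};

int
k_done (k, result)
struct cont	*k;
int	result;
{
	return (result);		/* final continuation: hand the answer back */
}

int
k_add (k, result)
struct cont	*k;
int	result;
{
	/* built for each element: add it in, then run the rest */
	return ((*k->next->run) (k->next, result + k->saved));
}

int
sum_cps (p, k)
int	*p;
struct cont	*k;
{
	struct cont	more;

	if (*p == 0)
		return ((*k->run) (k, 0));
	more.run = k_add;
	more.saved = *p;
	more.next = k;
	return (sum_cps (p + 1, &more));
}

The caller seeds it with a record whose run field is k_done.  The stack
has not vanished, of course; it has become the explicit chain of cont
records, which is exactly what gives a compiler (or a Scheme system) the
freedom to decide where that space lives and to make every call a jump.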

bill@proxftl.UUCP (T. William Wells) (07/20/88)

In article <Jul.10.00.30.47.1988.21186@aramis.rutgers.edu> webber@aramis.rutgers.edu (Bob Webber) writes:
> In article <430@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
> < Real enough?
>
> Commercial programming in Basic, database languages, COBOL, and RPG you
> call REAL?

I hate to disillusion you, but those languages still account for
most of the programming being done today.

>              Phooey.  Glad to hear you finally got to C -- sorry to
> hear you got to it after all the fun.

Well, I'll say that I am glad to have got to C.  Believe me,
programming in those other languages was a real chore.

> < < < it not often being very useful to do this for C programs.
> < <
> < < Well, there is a free, high quality C compiler for 68000-based systems
> < < and various others that does do tail recursion (i.e., gcc) among other
> < < things.  [However, it wouldn't catch this particular case -- perhaps
> < < that is why you aren't using it? ]
> <
> < That's fine for 68000's (like the Sun I am working on) but not
> < for IBM-PC's (our secondary machines) nor for the 100+ compilers
> < our code is regularly compiled with. Relying on compilers to do
> < tail recursion optimization would be slitting our own throats.
>
> Fine.  From all the comp.binaries.* talk, I understand that it is a major
> heroic effort just to compile a program on an IBM-PC, so we certainly
> wouldn't expect you to have access to modern compiler technology there.
> However, I see no point in crippling an entire generation of programmers
> due to the limitations of the IBM-PC.

You missed the point: I write programs so that they will be
portable; unless you are doing research or are programming for a
particular environment, so will you.  That being the case, you'd
better not write programs that work well for one compiler and
fail for others.  Also, keep in mind that the discussion on
comp.binaries.* comes from the fact that most of the code that
they are having trouble with was written by people who have or
had little idea of how to write portable code.  (This comes from
an examination of some of the posted source code; it is not
intended as a blanket condemnation.)

A minor grouse: why complain about the IBM-PC and its supposed
limiting effect?  Especially since a good part of the design of C
caters to the characteristics of the PDP-11, a machine with
characteristics similar to the IBM-PC but much more restrictive.

> < Sorry, but using purely tail recursive routines does not save
> < time for this programmer. Anywhere you would use tail recursion,
> < I'd automatically code a while loop.
>
> Yeah, well there are some people who never did catch on to recursion
> -- but don't worry, I am not banning the while loop, just saying that
> there is no reason for people to have to do that sort of coding when
> they are trying to organize a large program.

You know, I just finished writing a data compression program.
That was over 3000 lines of code, better than half of it recursive
routines.  I, at least, know when to use recursion.  And when and
why to avoid it.

> < So here is a third lesson for you: try to make everything you do
> < as good as you can. The notion that a routine is only going to be
> < used a small fraction of the time is no excuse for sloppiness.
> < Even if the routines are only going to be used 1% of the time,
> < those damn little routines can end up eating much of your
> < program's run time.
> <....
> < And a final point. You have encouraged sloppiness in writing
> < strlen() by justifying it with "that routine is never going to be
> < executed much anyway." Even granted that proposition you have
> < forgotten an important point. If you do not make it a habit to
> < write good code ALL THE TIME, that means two things: 1) Your code
> < will be generally less efficient than it could have been.  2)
> < When it comes time to write efficient code, you will not have
> < the experience needed to do it.
>
> Sigh.  Why spend your life writing the world's best strlen routine

Sigh.  Don't you read carefully?  I did NOT say to optimize
strlen to death.  I said to do the best job you can.  That
certainly includes making decisions on how much effort you should
expend in writing strlen.  Which, in a given context, might mean
writing it the easiest way possible, or it might mean optimizing
it to death.

The original point was the rather elementary fact that a
recursive strlen is a bad idea, sufficiently bad that thinking of
a recursive strlen algorithm as being good is idiotic.  Just wait
till you use said strlen on a machine with a few K of available
stack space and you try a strlen on a long string.  Boom.

Your reply is likely to be something of the order: why limit
programmers to today's technology?  And if that is so, we have
nothing further to discuss.

> to run on an operating system that routinely crashes (such as UNIX)
> or a chip as badly built as the 8086-family (what on earth were they
> thinking about when they set up pointers -- surfing?)?  The world does
					      ^^^^^^^
Perhaps you are ignorant of history?  They were thinking of
COST.

> not need your super-spiffy strlen routine.  It needs a stable operating
> system, a compiler that works ALL the time, a screen editor that works
> ALL the time, ...

Perhaps *your* UNIX routinely crashes, perhaps your compiler,
editor, and other tools fail too often; but attitudes such as
yours are the cause for your complaints.  Those who routinely
make their best effort, investigate their options and their
consequences, and remember that today's programs must be written
for today's systems, write better programs than those who do not.

I have had enough of this discussion.  I am not going to waste
further time with someone whose position is so far removed from
mine that he believes that, in C, a recursive strlen is as good
as a nonrecursive one.

bill@proxftl.UUCP (T. William Wells) (07/20/88)

In article <54109@ti-csl.CSNET> gateley@mips.UUCP (John Gateley) writes:
> In article <464@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
> >2) Pointed out that the only compiler that anyone says can do
> >   tail recursion, gcc, applies to a single processor, and that
> >   my work requires porting to over a hundred compilers.
>
> There are many compilers which do tail recursion: several Common Lisp compilers
> do, and by definition the programming language Scheme is tail recursive.

But not C compilers.  That was the implied context.  Sorry if
that was not clear.  And, if one wants portability, none of the
languages where recursion is the preferred way to express
iteration are suitable.  The problem is not with the languages
themselves, but rather that we have to have better
implementations than are currently available.  Imagine running
your Scheme program on a Z80 (or an 8051)!  An 8086 is just
barely adequate for that.

ok@quintus.uucp (Richard A. O'Keefe) (07/21/88)

In article <505@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
>A minor grouse: why complain about the IBM-PC and its supposed
>limiting effect?  Especially since a good part of the design of C
>caters to the characteristics of the PDP-11, a machine with
>characteristics similar to the IBM-PC but much more restrictive.

No part of the design of C caters to the PDP-11.
C is BCPL
  +  plus Pascal-like types
  +  plus auto-increment pointer operations (*p++, *++p, *--p, *p--)
The *only* thing in C which would justify the "PDP-11" claim is
the autoincrement pointer operations, but note that (a) the PDP-11
directly supports only two of those operations, and not on all types,
and (b) history is against you:  C's designers claim that they put this
feature in *before* they got a PDP-11.
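
For readers who haven't run into them, those operations are what make
idioms like the classic copy loop so terse.  A sketch, with a made-up
name so as not to shadow the library's strcpy:

char *
copy_str (d, s)
register char	*d, *s;
{
	char	*save = d;

	while ((*d++ = *s++) != '\0')	/* copy up to and including the NUL */
		;
	return (save);
}

That loop looks tempting on any machine with autoincrement addressing,
which is how the "PDP-11" claim usually gets started; the point here is
that the operators predate the PDP-11.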

jbn@glacier.STANFORD.EDU (John B. Nagle) (07/22/88)

In article <182@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>No part of the design of C caters to the PDP-11.
>The *only* thing in C which would justify the "PDP-11" claim is
>the autoincrement pointer operations, but note that (a) the PDP-11
>directly supports only two of those operations, and not on all types,
>and (b) history is against you:  C's designers claim that they put this
>feature in *before* they got a PDP-11.

      Wrong.  See "The C Programming Language", by Kernighan and Ritchie,
original 1978 edition, ISBN 0-13-110163-3, page ix.  More of the
language philosophy can be found in the original Bell System Technical
Journal issue on "The UNIX operating system", circa 1978.

					John Nagle

cjl@ecsvax.uncecs.edu (Charles Lord) (07/23/88)

In article <182@quintus.UUCP>, ok@quintus.uucp (Richard A. O'Keefe) writes:
> No part of the design of C caters to the PDP-11.
> C is BCPL
 ...
> and (b) history is against you:  C's designers claim that they put this
> feature in *before* they got a PDP-11.

WRONG.  C *was* based on BCPL which predates the '11 *but*
K&R state in the Preface to THE book, 1st ed:
  "C was originally designed for ... UNIX ... on the DEC PDP-11, 
   by Dennis Ritchie"

They do go on to state that C was and is still meant to be machine-
independent, but the language WAS born on the PDP-11.

[quote copped without permission. Sorry, Dennis]
-- 
Charles Lord           cjl@ecsvax.UUCP    Usenet
Cary, NC               cjl@ecsvax.BITNET  Bitnet
#include <std.disclamers>
#include <cutsey.quote>

webber@aramis.rutgers.edu (Bob Webber) (07/23/88)

In article <505@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
< A minor grouse: why complain about the IBM-PC and its supposed
< limiting effect?  Especially since a good part of the design of C
< caters to the characteristics of the PDP-11, a machine with
< characteristics similar to the IBM-PC but much more restrictive.

You will be amused to know that I agree with you that C is PDP-11
biased, but that it turns out to be a minority opinion vigorously
denied by at least one of the inventors of C.

<...
< You know, I just finished writing a data compression program.
< That was over 3000 lines of code, better than half recursive
< routines.  I, at least, know when to use recursion.  And when and
< why to avoid it.

I hardly see how the above allows you to deduce that you know when to
use recursion and when to avoid it.  I too have written many lines of
both recursive and non-recursive code -- so the same line of reasoning
would say that I too know when to use it and when to avoid it.  Indeed,
more to the point, I have written a few (not just one) of these student
compiler projects that the original example was taken from.

<...
< Sigh.  Don't you read carefully?  I did NOT say to optimize
< strlen to death.  I said to do the best job you can.  That

Yes I read carefully.  ``The best job you can'' is a wonderfully
ambiguous phrase comparable to ``do it right.''  It has no meaning
outside of the context of the rest of the message.

<...
< The original point was the rather elementary fact that a
< recursive strlen is a bad idea, sufficiently bad that thinking of
< a recursive strlen algorithm as being good is idiotic.  Just wait

This is not an ``elementary fact'' -- it is sheer opinion.

< till you use said strlen on a machine with a few K of available
< stack space and you try a strlen on a long string.  Boom.

Boom?  I thought the IBM-PC only did that when you messed around with
the graphics card.

< Your reply is likely to be something of the order: why limit
< programmers to today's technology?  And if that is so, we have
< nothing further to discuss.

Why is it that people always come to conclusions like this at the
end of a long message?

< < to run on an operating system that routinely crashes (such as UNIX)
< < or a chip as badly built as the 8086-family (what on earth were they
< < thinking about when they set up pointers -- surfing?)?  The world does
< 					      ^^^^^^^
< Perhaps you are ignorant of history?  They were thinking of
< COST.

Considering how much it has COST everyone else, they must have had
a lot to think about then.

< Perhaps *your* UNIX routinely crashes, perhaps your compiler,
< editor, and other tools fail too often; but attitudes such as
< yours are the cause for your complaints.

Actually, I suspect that the people who wrote ``my'' UNIX, compiler,
etc., are much more likely to agree with your view of programming.

< I have had enough of this discussion.  I am not going to waste
< further time with someone whose position is so far removed from
< mine that he believes that, in C, a recursive strlen is as good
< as a nonrecursive one.

This is hardly my position -- my position is that I believe that for
a student compiler project (particularly in the early stages of
development), a recursive strlen is BETTER than a nonrecursive one,
regardless of the programming language being used.

--- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

bill@proxftl.UUCP (T. William Wells) (07/25/88)

In article <Jul.23.01.03.32.1988.29231@aramis.rutgers.edu> webber@aramis.rutgers.edu (Bob Webber) writes:
: < You know, I just finished writing a data compression program.
: < That was over 3000 lines of code, better than half recursive
: < routines.  I, at least, know when to use recursion.  And when and
: < why to avoid it.
:
: I hardly see how the above allows you to deduce that you know when to
: use recursion and when to avoid it.

The above was an example, not a proof; it does not allow me to
deduce anything.  My knowledge of my abilities comes from the
many years of programming I have done, a mind quite capable of
judging its actions and their results, and plenty of evidence of
my effectiveness.  Like many programs out there in the real
world.  And one thing my mind is telling me is that we are not
communicating.  Because you have chosen a wrong position and are
willing to go to any length to defend that wrong position.

:                                      I too have written many lines of
: both recursive and non-recursive code -- so the same line of reasoning
: would say that I too know when to use it and when to avoid it.  Indeed,
: more to the point, I have written a few (not just one) of these student
: compiler projects that the original example was taken from.

Where are the products you have produced?  How might I evaluate
what you have done?  As for me, you can find my code in any
Proximity spelling checker (other than the ones we put into
typewriters).  Like the ones used in Ashton-Tate products, and
lots of others that I am not going to name-drop.

As for your abilities, the only evidence I have of them is that
you seem to believe that a recursive strlen is a good thing in
C.  And that your arguments for this include insults and
vagueness.

: <...
: < Sigh.  Don't you read carefully?  I did NOT say to optimize
: < strlen to death.  I said to do the best job you can.  That
:
: Yes I read carefully.  ``The best job you can'' is a wonderfully
: ambiguous phrase comparable to ``do it right.''  It has no meaning
: outside of the context of the rest of the message.

You are quite right.  It does have to be taken in context.  And
the context, laboriously and frequently repeated by me, is that
programming is a utilitarian art.  The quality of an algorithm
has to be judged by the context.  You may interpret my "do the
best job you can" as vaguely as you wish; however, the rest of
us, who can read and understand what was read, will interpret it
as it was intended.

: <...
: < The original point was the rather elementary fact that a
: < recursive strlen is a bad idea, sufficiently bad that thinking of
: < a recursive strlen algorithm as being good is idiotic.  Just wait
:
: This is not an ``elementary fact'' -- it is sheer opinion.

OK, it is not elementary.  Obviously so, since you haven't
grasped it.

: < till you use said strlen on a machine with a few K of available
: < stack space and you try a strlen on a long string.  Boom.
:
: Boom?  I thought the IBM-PC only did that when you messed around with
: the graphics card.

A misinterpretation on your part; no doubt everyone else who read
the posting recognized this as idiom for a system crash.  And I'm
not talking about the IBM-PC, though I suppose that in your
infinite ignorance you don't know of the many other useful
machines that come with a limited amount of memory.
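
To put a rough, back-of-the-envelope number on it: if each level of a
recursive strlen costs on the order of six to eight bytes of stack (a
return address, a saved frame pointer, and the pointer argument), then
4K bytes of stack runs out somewhere around the 500th to 700th
character.  That is not an exotic string length.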

: < Your reply is likely to be something of the order: why limit
: < programmers to today's technology?  And if that is so, we have
: < nothing further to discuss.
:
: Why is it that people always come to conclusions like this at the
: end of a long message?

Because you provide the necessary evidence for them to come to
this conclusion.

: < < to run on an operating system that routinely crashes (such as UNIX)
: < < or a chip as badly built as the 8086-family (what on earth were they
: < < thinking about when they set up pointers -- surfing?)?  The world does
: <                                           ^^^^^^^
: < Perhaps you are ignorant of history?  They were thinking of
: < COST.
:
: Considering how much it has COST everyone else, they must have had
: a lot to think about then.

Again you are demonstrating your ignorance.  That decision did
not COST anyone anything.  Rather, it made it possible to create
a comparatively cheap microprocessor with some real power.  This
was valuable to those who made use of it; it did not COST them
anything.  But, like any technology, it gets outgrown; hindsight
might tell you how to do it better, but that is not evidence of
stupidity on the part of others.  And like any outgrown
technology it restricts those who use it.

And yes, they did have a lot to think about then.  Like how to do
the job at all. It was not easy, you can bet.

: < Perhaps *your* UNIX routinely crashes, perhaps your compiler,
: < editor, and other tools fail too often; but attitudes such as
: < yours are the cause for your complaints.
:
: Actually, I suspect that the people who wrote ``my'' UNIX, compiler,
: etc., are much more likely to agree with your view of programming.

I very much doubt it, having seen the quality of most published
code.  MY code rarely has these faults.

: < I have had enough of this discussion.  I am not going to waste
: < further time with someone whose position is so far removed from
: < mine that he believes that, in C, a recursive strlen is as good
: < as a nonrecursive one.
:
: This is hardly my position -- my position is that I believe that for
: a student compiler project (particularly in the early stages of
: development), a recursive strlen is BETTER than a nonrecursive one,
: regardless of the programming language being used.
:
: --- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

Again, you didn't read. I HAVE HAD ENOUGH!!!!!

My judgement of you is that you have vague thinking processes,
backed up by unthinking evaluations, an overwhelming ignorance,
and a loud mouth.  You have gone into my kill file.  You are the
very first.  Congratulations, you even beat weemba there.

webber@athos.rutgers.edu (Bob Webber) (07/25/88)

In article <536@proxftl.UUCP>, bill@proxftl.UUCP (T. William Wells) writes:
]...
] Where are the products you have produced?  How might I evaluate
] what you have done?  As for me, you can find my code in any
] Proximity spelling checker (other than the ones we put into
] typewriters). 

Oh?  Source comes with these?  Or are we to evaluate your esteemed programming
ability purely on the binaries of a sophomore project that you have managed
to sell to people who can't use a dictionary?  Perhaps we should have
some sort of programming duel?

] As for you abilities, the only evidence I have of them is that
] you seem to believe that a recursive strlen is a good thing in
] C.  And that your arguments for this include insults and
] vagueness.

You might recall that this whole thing began with someone (not me) insulting
a student for writing a perfectly good recursive strlen function.

] You are quite right.  It does have to be taken in context.  And
] the context, laboriously and frequently repeated by me, is that
] programming is a utilitarian art.  The quality of an algorithm
] has to be judged by the context.  You may interpret my "do the
] best job you can" as vaguely as you wish; however, the rest of
] us, who can read and understand what was read, will interpret it
] as it was intended.

Certainly this is vaguer than anything I have ever come up with.
Truly a retreat to a complete non-position.

] : < till you use said strlen on a machine with a few K of available
] : < stack space and you try a strlen on a long string.  Boom.
] :
] : Boom?  I thought the IBM-PC only did that when you messed around with
] : the graphics card.
] 
] A misinterpretation on your part; no doubt everyone else who read
] the posting recognized this as idiom for a system crash.  And I'm

A system crash from a stack overflow?  What kind of junk are you running
on?

] not talking about the IBM-PC, though I suppose that in your
] infinite ignorance you don't know of the many other useful
] machines that come with a limited amount of memory.

I certainly NEVER said the IBM-PC was a useful machine.  For the same
memory, a PDP-11/45 would be my choice ANY DAY.  Note that both the 11/45
and the Intel 8008 were introduced in 1972.

] : < I have had enough of this discussion.  I am not going to waste
] : < further time with someone whose position is so far removed from
] : < mine that he believes that, in C, a recursive strlen is as good
] : < as a nonrecursive one.
] :
] : This is hardly my position -- my position is that I believe that for
] : a student compiler project (particularly in the early stages of
] : development), a recursive strlen is BETTER than a nonrecursive one,
] : regardless of the programming language being used.
] 
] Again, you didn't read. I HAVE HAD ENOUGH!!!!!

So?  Does the ``fact'' that you have ``had enough'' give you some magical
right to mis-state my position?

] My judgement of you is ...

A matter too trivial to bother replying to.

--- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

dee@linus.UUCP (David E. Emery) (07/25/88)

See also the article by K&R in the latest Byte Magazine for some more
history on C, Unix, and the PDP-11.  Actually, if I remember right, there
was an initial version of Unix in PDP-7 assembler, then came the
PDP-11, C, and the first C version of Unix...

				dave emery
				emery@Mitre-bedford.arpa

bobmon@iuvax.cs.indiana.edu (RAMontante) (07/26/88)

cjl@ecsvax.uncecs.edu (Charles Lord) writes:
>In article <182@quintus.UUCP>, ok@quintus.uucp (Richard A. O'Keefe) writes:
>> No part of the design of C caters to the PDP-11.
>> C is BCPL
> ...
>> and (b) history is against you:  C's designers claim that they put this
>> feature in *before* they got a PDP-11.
>
>WRONG.  C *was* based on BCPL which predates the '11 *but*
>K&R state in the Preface to THE book, 1st ed:
>  "C was originally designed for ... UNIX ... on the DEC PDP-11, 
>   by Dennis Ritchie"

In 2nd ed., it still says "C was originally designed for and implemented on
the UNIX operating system on the DEC PDP-11, by Dennis Ritchie."

On the other hand, the preface to "The UNIX Programming Environment", by
Kernighan and Pike, says "The UNIX operating system started on a cast-off
DEC PDP-7 at Bell Laboratories in 1969....Ritchie, who helped move the system
to the PDP-11 in 1970.  Ritchie also designed and wrote a compiler for the
C programming language."

Meanwhile, S. R. Bourne (who's HE?) sez: "Thompson had also [in 1972] been
working on the language B and the assembler for [Second Edition UNIX] was
written in B.  B was a direct descendant of BCPL, but programs were compiled
in one pass to produce interpretive code.

"Both B and BCPL were typeless languages, providing a single data object
called the machine word making access to the PDP 11 byte handling instructions
difficult.  Types were therefore added to B to produce NB and an attempt to
rewrite the system in NB was unsuccessful.  Ritchie started work on a code
generator for NB to make execution of programs faster.  This language was
called C although there were no structures or global variables.  The
language was sufficiently attractive, however, that new utilities were being
written directly in C."

Bourne also says "...although the first compiler was written for a PDP
11/45." and "The historical origins of C help explain some aspects of the
language design.  It was intended for systems programming and similarity to
the machine was considered important.  For example, the ++ operator has a
direct equivalent in the PDP 11 instruction set.  Features allowing portable
programs to be written, such as unions, were added to the language in the
late 1970s."

Hope you paid attention; there'll be a quiz next period, and the first
question will be "Who cares?"...
-- 
--	bob,mon				(bobmon@iuvax.cs.indiana.edu)
--	"In this position, the skier is flying in a complete stall..."

ok@quintus.uucp (Richard A. O'Keefe) (07/27/88)

In article <10981@iuvax.cs.indiana.edu> bobmon@iuvax.UUCP (RAMontante) writes:
:cjl@ecsvax.uncecs.edu (Charles Lord) writes:
:>In article <182@quintus.UUCP>, ok@quintus.uucp (Richard A. O'Keefe) writes:
:>> No part of the design of C caters to the PDP-11.
:>> and (b) history is against you:  C's designers claim that they put this
:>> [++ and --] feature in *before* they got a PDP-11.
:>
:>WRONG.  C *was* based on BCPL which predates the '11 *but*
:>K&R state in the Preface to THE book, 1st ed:
:>  "C was originally designed for ... UNIX ... on the DEC PDP-11, 
:>   by Dennis Ritchie"
:In 2nd ed., it still says "C was originally designed for and implemented on
:the UNIX operating system on the DEC PDP-11, by Dennis Ritchie."
:
:On the other hand, the preface to "The UNIX Programming Environment", by
:Kernighan and Pike, says "The UNIX operating system started on a cast-off
:DEC PDP-7 at Bell Laboratories in 1969....

I apologise for expressing myself badly.  My point was that C got the
autoincrement operators from B, and that feature was put into >>B<<
before they got a PDP-11.  I could be wrong about that too, but it hardly
matters, because B was interpreted.

cjl@ecsvax.uncecs.edu (Charles Lord) (07/27/88)

OK, Bob - you were right on most issues.  It still seems that the
++ / -- operators WERE written with the PDP-11 in mind, since it has
autoincrement and autodecrement in its op-codes.  And yes, Richard,
UNIX does date back to the PDP-7, with BCPL and then B as untyped
languages.  It was the rich variety of types on the -11 that inspired
the typing in C.

If anyone has a burning question in this topic (it really belongs
in ..std.c or ..lang.c), post a request in those newsgroups.  
Dennis Ritchie himself reads and responds to those discussions
(he is dmr@att[something]..).
-- 
Charles Lord           ..!decvax!mcnc!ecsvax!cjl    Usenet
Cary, NC               cjl@ecsvax.uncecs.edu        Bitnet
#include <std.disclamers>
#include <cutsey.quote>

dmr@alice.UUCP (07/28/88)

I suppose I should save this article, since the subject seems
to come up every year.

C was developed on the PDP-11; most of it aside from the type
structure and associated syntax came from B, which was developed on
the PDP-7.  B already had the ++ and -- operators (and the associated
idioms like *p++).  The first B compiler produced interpreted
(threaded) code.  Doubtless the invention of ++ and -- (due to
Thompson) was suggested by the auto-increment cells of the PDP-7, or
perhaps the one such cell on the PDP-8, or perhaps some of the more
recondite indirection modes on the GE 635/645, or the
count-increment-refill mechanism of the Stretch.  In any event, neither
B nor the first C compiler generated -7 or -11 instructions that used
autoincrement.

C was less influenced by the PDP-11 than most people think.  Certainly
the addition of types was motivated by the desire to take advantage
of byte addressing and the (future) existence of floating point
(indeed, C compiled 11/45 floating point instructions before
the delivery of any 11/45s; it was an annoyance when DEC changed
its mind about what opcodes to use.)

Discounting general things like that, the only strong PDP-11isms I can
think of in C are the signed characters and about 50% of the
justification for the widening rules for floats.  (The other 50% is
simplification of the language rules and libraries).

	Dennis Ritchie
	research!dmr
	dmr@research.att.com