[comp.software-eng] Cynic's Guide to Software Engineering, part 3

neff@Shasta.STANFORD.EDU (Randy Neff) (04/06/88)

------		The Cynic's Guide to Software Engineering		------
------ an invitation to dialogue, starting with the personal view of	------
------		    Randall Neff @ sierra.stanford.edu			------
------	        	April 5, 1988   	part 3			------
------------------------------------------------------------------------------
	Monolingualism:  the Religion and Curse of Software Engineering

[Note:  A, U, E are free variables and not meant to be any specific program]

Joe Programmer learned one programming language, A, and one operating system,
U, and one text editor, E, while earning his CS degree.  He found a job that
used the same A,U,E triple.  As the years went by, he gained more experience
and learned all of the magic hacks (heuristics) for the A,U,E triple.
Joe also gained in hubris;  he knows that he can solve any problem with A,U,E.
Joe's group/company prides itself on the latest hardware; more mips and
megabytes every year.  Joe's group/company prides itself on using programming
language A, whose design is about twenty years old and has never been 
standardized;  operating system U, whose design is about twenty years old,
never standardized and locally hacked; and text editor E, whose design 
is about twenty years old.   In fact, next year Joe's group will be hiring
new programmers who are younger than A, U, and E.  

Now Joe's company only hires programmers with training and experience in
A,U,E, so they only get programmers that like A,U,E.   Any individuals that
suggest that maybe, perhaps, possibly, there might be something better
than A,U,E are carefully pushed out. The programmers are happy with
isolating themselves from new ideas in operating systems, programming languages,
and text editors.   The programmers have thousands of excuses for their
deliberate ossification.  

Now the management is very unhappy with the state of software in the company:
late schedules, cost overruns, buggy products, expensive maintenance, the
standard list of S.E. woes.  Management starts running around looking for the
silver bullet.  Is there a correlation?  Is the state of the software related 
to the choice and continued use of A,U,E?

-----------------------------------------------------------------------------
One of the more educational CS courses at Stanford is just called Programming
Languages.  It uses a textbook similar to Terrence W. Pratt: Programming
Languages, Design and Implementation.   Basically the course is a survey of a 
number of different programming languages such as Fortran, APL, Pascal, Snobol,
PL/I, Ada, Lisp, Smalltalk, C, and Prolog.  Typically each biweekly programming
assignment is in a different language.   The point of all of this is that
there are a lot more concepts and features in the study of programming 
languages than are available in any single language.  Declarative vs Procedural
or type polymorphism or tasking or garbage collection or exceptions or storage
models or object oriented/inheritance are important topics.  A follow-on
course, Designing Programming Languages, is offered when someone with enough
experience can be found.  

The basic conclusion is that a specific programming language is, at best, a 
tool;  imperfect, containing some subset of possible features.  There are
experimental tools, toy tools, personal tools, group tools, and industrial
strength tools.   A programming language is a tool to solve a problem; it is
not a religion nor a religious experience.   Unfortunately, monolinguists get
confused and seem to think that their language is the only true light.

Similar comments can be made about mono-operating-system types and the
great religious flame wars about text editors.  Mono-anything enthusiasts are
responsible for braking the flow of new software engineering concepts and 
tools.  Mono's like to freeze progress and  stagnate in their Mono religions.
This sort of explains why a programming language designed in 1954-57 is the
language of choice on the world's fastest computers.  

karenc@amadeus.TEK.COM (Karen Cate;6291502;92-734;LP=A;60sC) (04/08/88)

In article <2636@Shasta.STANFORD.EDU> neff@Shasta.stanford.edu (Randall Neff) writes:
>	Monolingualism:  the Religion and Curse of Software Engineering
>
>[Note:  A, U, E are free variables and not meant to be any specific program]
>
>Joe's group/company prides itself on using programming
>language A, ... operating system U, ... and text editor E... [which are] 
>about twenty years old.
> ...
>Now Joe's company only hires programmers with training and experience in
>A,U,E, so they only get programmers that like A,U,E.
> ...
>Is the state of the software related to the choice and continued use of A,U,E?
> ...
>The basic conclusion is that a specific programming language is, at best, a 
>tool;  imperfect, containing some subset of possible features.  
>...Unfortunately, monolinguists get
>confused and seem to think that their language is the only true light.
>

As I was reading this, I saw the situation from another point of view.  Let
me try it this way:

Joe's group/company started a large/complex project 10 years ago.  At that
time, they chose language A, operating system U, and text editor E.  The
product grew, and they hired people that could work with AUE.  Now
this product is still doing well, and other products in the same "family"
are being developed, additional "follow-on" projects are released, etc.
They now have 10 years of development time invested in AUE.  Later 
products can leverage heavily off the previous ones, so they continue in
the AUE tradition.  

Of course some of these developers are going to think AUE is wonderful...
Look at all the great things they can do with it.  They have spent the 
last 10 years becoming experts in it.  They don't want to have
to start all over.  The company has thousands of man-hours invested in
their product.  It's the best there is.  They don't want to start all over.

Of course the state of the software is related to the choice of AUE.
The software and environment are mutually reinforcing.  When one
develops a major system, one not only designs it to meet functional
specs, but also to meet environment requirements and limitations.
Application Q would be designed completely differently in, say, Ada than
it would be in, say, C.  One might be able to make it look and act just
the same (maybe...), but the data and control structures will be
completely different.


For example:

COBOL will be around for a very long time.  It has built up a lot of
inertia.  I happen to not like working with COBOL very much, but it is
well suited to many applications for which it is used.  Even if
something new is faster/easier to develop, whole departments, office
procedures, etc, are based on systems designed for COBOL.  Not only
would all the code have to be converted, but chances are that the
system would have to change, so procedures would have to change...  The
entire philosophy of the "world" would undergo a minor revolution.  One
can argue that the result would be better, but the cost is so high that
companies are not willing to pay it.  So old, slow COBOL still hangs
around, and COBOL programmers are still being hired.  I'm just not going
to be one of them.

We want to have libraries of common functions, and be able to re-use
algorithms, etc, but we can't do that if we change environments every
few years.  I admit, I like the environment I'm using.  I know it, I
"think" in it, and I'm comfortable with it.  We don't buy new cars every
year or two just because the new ones are a little better.  We like the
car we chose five years ago, why change?  I have a wonderful 20 year old
car that runs better now than many new cars.  It doesn't get as good 
gas mileage, it doesn't have emission control stuff, no beeps or buzzes to 
remind me to put on my seat belt, but it has almost 150,000 miles on it.
I trust it to take me across the country without breaking down.  I have 
the tools, and know how to fix it when it does break.  There are advantages
and disadvantages, but I'm not going to just junk all I have invested in
it just because it's out of style.

Karen A. Cate
Tektronix Inc, Beaverton OR
tektronix!amadeus!karenc -OR- karenc@amadeus.LA.TEK.COM

daveb@llama.rtech.UUCP (Crack? No thanks, I've got a new CD player) (04/13/88)

In <2636@Shasta.STANFORD.EDU> neff@Shasta.stanford.edu (Randall Neff) writes:

>	Monolingualism:  the Religion and Curse of Software Engineering
...
>Similar comments can be made about mono-operating-system types and the
>great religious flame wars about text editors.  Mono-anything enthusiasts are
>responsible for braking the flow of new software engineering concepts and 
>tools.  Mono's like to freeze progress and  stagnate in their Mono religions.
>This sort of explains why a programming language designed in 1954-57 is the
>language of choice on the world's fastest computers.  

Being prone to inventing contrary positions to investigate a dialectic,
I wonder about the curse of mono-lingualism.  There seem to be few
complaints about the use of English as the Air Traffic Control language. 
At some point, won't language hopping make you less able to communicate
effectively?  Does knowing a smattering of French, German, Japanese,
Chinese, and Swahili really help you write in English?

Monolinguistic people slow the flow of progress in the development of
new languages.  That does not necessarily equate to impeding the
ability of those people to express new thoughts in their native
language.  In general, it is probably more productive for most writers or
programmers to develop a mastery of one language than to dabble in many
or invent new ones.

It is only when the old languages/paradigms become incapable of handling
a new problem space that it is really necessary to change, and then the
followers of the old order are in trouble.  It is difficult to say when
one of these revolutions is happening until it is over.  

We have just barely gotten out of the "operating systems are written in
assembler" mind set.  It is hard for me to believe that, for sake of
argument, FORTRAN is incapable or even very seriously flawed in its
ability to model physics problems compared to any of the likely
alternatives.

-dB

See:  Thomas S. Kuhn, _The_Structure_of_Scientific_Revolutions_.  Many
people like to envision themselves as revolutionaries.  Very few are, or
should be.



"Remember, there's a seeker born every minute."  -- Happy Harry Cox.
{amdahl, cpsc6a, mtxinu, ptsfa, sun, hoptoad}!rtech!daveb daveb@rtech.uucp

cdshaw@alberta.UUCP (Chris Shaw) (04/14/88)

In article <1950@rtech.UUCP> daveb@rtech.UUCP  writes:
>In <2636@Shasta.STANFORD.EDU> neff@Shasta.stanford.edu (Randall Neff) writes:
>
>>	Monolingualism:  the Religion and Curse of Software Engineering
>
>Being prone to inventing contrary positions to investigate a dialectic,
>I wonder about the curse of mono-lingualism.  There seem to be few
>complaints about the use of English as the Air Traffic Control language. 
>At some point, won't language hopping make you less able to communicate
>effectively?  Does knowing a smattering of French, German, Japanese,
>Chinese, and Swahili really help you write in English?

Clearly this is a specious argument. The difference is that English is fluid 
and computer languages are not. Computer languages must have precise
meaning, which implies that a computer language changes at its peril.
English can change in style and usage without breaking software (or crashing
planes).

On the other hand, knowing a smattering of French, German, Japanese,
Chinese, Swahili really can help you write in English, because you can borrow 
words from other languages to better encapsulate a thought. English has
succeeded so well in part because it is expressive. It is expressive in part
because words have been stolen from other languages.

>-dB
>{amdahl, cpsc6a, mtxinu, ptsfa, sun, hoptoad}!rtech!daveb daveb@rtech.uucp


-- 
Chris Shaw    cdshaw@alberta.UUCP (via watmath, ihnp4 or ubc-vision)
University of Alberta
CatchPhrase: Bogus as HELL !

steve@nuchat.UUCP (Steve Nuchia) (04/16/88)

From article <1950@rtech.UUCP>, by daveb@llama.rtech.UUCP (Crack?  No thanks, I've got a new CD player):
> We have just barely gotten out of the "operating systems are written in
> assembler" mind set.  It is hard for me to believe that, for sake of
> argument, FORTRAN is incapable or even very seriously flawed in its
> ability to model physics problems compared to any of the likely
> alternatives.

FORTRAN is a very nice language for arranging to have arithmetic
FORmulas evaluated, and it is even ok (in its present form) at
organizing a moderately complex sequence of calculations.

The problem is, that isn't what most programs do.  In my experience
the text of most "scientific" programs begins life as some small
amount of FORTRAN embodying some algorithm - maybe as much as a
few thousand lines.  This much is as it should be.

Now, this little program starts to be _used_ for something.  It becomes
"supported", and starts to grow.  The algorithm doesn't grow.  What
grows is the data management and user interface cruft around the
central algorithm.  Since the whole wad is still in FORTRAN the
maintenance programmers have littered the algorithmic code with
flags and such to support all the new features, but by this time
the data management and user interface code outweighs the "scientific"
code by ten to one or better.

Sure, the program spends most of _its_ time in the scientific code,
but where are your programmers spending _their_ time?   Right.

Nuchia's law:  90% of any production program is doing data
	management and user interface for the 10% that is
	doing the real work.

With that out of the way, how good of a language is FORTRAN for
data management?  Pretty terrible.  It just doesn't have any
of the semantics you need.  Sure, you can kluge around it,
but why?
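
(To make "kluge around it" concrete, here is a minimal sketch - purely
hypothetical, every name invented - of the usual trick: fake a record
type with parallel arrays and a hand-carried index.)

	PROGRAM RECDEM
	INTEGER MAXREC
	PARAMETER (MAXREC = 1000)
	CHARACTER*20 NAME(MAXREC)	!Three parallel arrays stand in
	REAL SALARY(MAXREC)		!for one record type, which the
	INTEGER DEPT(MAXREC)		!language does not have
	INTEGER NRECS
	NRECS = 0
	NRECS = NRECS + 1		!"Allocation" is bumping a counter
	NAME(NRECS) = 'J. PROGRAMMER'	!Every routine must be handed the
	SALARY(NRECS) = 30000.0		!same index; nothing in the
	DEPT(NRECS) = 42		!language ties the arrays together
	WRITE (*,*) NAME(NRECS), SALARY(NRECS), DEPT(NRECS)
	END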

User interfaces, if implemented properly, just consist of
a large state machine which calls a giant UI library.  The
state machine can be written in FORTRAN or BASIC or SWAHILI
or whatever you want, as long as some kind of conventional
structure is used - the nodes of the state graph should
all follow a stereotyped format to a greater or lesser
degree.
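
(A minimal sketch of that shape - all names hypothetical, with plain
WRITE/READ standing in for the giant UI library: a state variable, a
dispatch loop, and stereotyped nodes that each name their successor.)

	PROGRAM UIDEMO
	INTEGER STATE, ICHOIC
	STATE = 1
10	IF (STATE .EQ. 3) STOP		!State 3 means "done"
	GOTO (20, 30) STATE		!Dispatch on the state variable
20	WRITE (*,*) 'MAIN MENU (1 = RUN, ANYTHING ELSE = QUIT)'
	READ (*,*) ICHOIC		!Node 1: the main menu
	IF (ICHOIC .EQ. 1) THEN
	   STATE = 2
	ELSE
	   STATE = 3
	END IF
	GOTO 10
30	WRITE (*,*) 'RUNNING...'	!Node 2: the "real work",
	STATE = 1			!then back to the menu
	GOTO 10
	END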

You wouldn't want to write the UI library in FORTRAN, but
thankfully I don't think there are very many people left
who would insist that you do so.

What's the solution?  Let your scientific programmers code in
FORTRAN.  They're getting their work done, and you've got enough
work to do without going on a crusade to "save" them.  And of
course you get to take advantage of the available optimization
technology for the long-running core of the programs.  But
make your maintenance programmers learn and use a modern
language; pick one with a good interface to your FORTRAN
environment.

I have to admit that I haven't seen this tried on a significant
scale.  The ideas presented above have been forming in my mind
for a couple of years, and lately I've had an opportunity to
watch scientists produce programs, confirming much of what I
had thought.  Take it for what it's worth, and I would appreciate
any evidence supporting or contradicting my positions.
-- 
Steve Nuchia	    | [...] but the machine would probably be allowed no mercy.
uunet!nuchat!steve  | In other words then, if a machine is expected to be
(713) 334 6720	    | infallible, it cannot be intelligent.  - Alan Turing, 1947

UH2@PSUVM.BITNET (Lee Sailer) (04/16/88)

In article <1950@rtech.UUCP>, daveb@llama.rtech.UUCP (Crack?  No thanks, I've got a new CD player) says:
>
>In <2636@Shasta.STANFORD.EDU> neff@Shasta.stanford.edu (Randall Neff) writes:
>
>>    Monolingualism:  the Religion and Curse of Software Engineering
>I wonder about the curse of mono-lingualism.  There seem to be few
>complaints about the use of English as the Air Traffic Control language.

   There are LOTS of complaints about air traffic controllers.  Many
controllers only know English as a kind of set of magic spells they chant in
tongues.  (Pilots, too.)


>At some point, won't language hopping make you less able to communicate
>effectively?  Does knowing a smattering of French, German, Japanese,
>Chinese, and Swahili really help you write in English?

Perhaps not, though it certainly helped me.  I knew a little Czech, which,
like Russian, uses a lot of special markers for the different cases.  It
not only helped my English, but made learning frame-based AI systems a
snap.

The issue here isn't knowing a smattering, anyway.  Being BILINGUAL certainly
improves a programmer's ability to get the job done.  Knowing PASCAL improved
my on-the-job programming in FORTRAN.  Knowing SIMULA makes all my
code in non-object-oriented languages better.
>
>It is only when the old languages/paradigms become incapable of handling
>a new problem space that it is really necessary to change, and then the
>followers of the old order are in trouble.  It is difficult to say when
>one of these revolutions is happening until it is over.
>
>We have just barely gotten out of the "operating systems are written in
>assembler" mind set.  It is hard for me to believe that, for sake of
>argument, FORTRAN is incapable or even very seriously flawed in its
>ability to model physics problems compared to any of the likely
>alternatives.
>

One of the reasons physicists need super computers is that they usually haven't
a clue about using data structures to implement sophisticated
algorithms.  Most P's do most everything by brute force.  Partly, this
is because FORTRAN makes it so hard to implement clever code.

EGNILGES@pucc.Princeton.EDU (Ed Nilges) (04/17/88)

In article <39501UH2@PSUVM>, UH2@PSUVM.BITNET (Lee Sailer) writes:
 
>One of the reasons physicists need super computers is that they usually haven't
>a clue about using data structures to implement sophisticated
>algorithms.  Most P's do most everything by brute force.  Partly, this
>is because FORTRAN makes it so hard to implement clever code.
 
 
   "In the good old days physicists repeated each other's
    experiments, just to be sure.  Today they stick to FORTRAN,
    so that they can share each other's programs, bugs
    included."
 
                             - Edsger Dijkstra, "How do we tell truths
                               that might hurt?", from Selected Writings
                               in Computing: a Personal Perspective.
                               New York, 1982: Springer-Verlag, Inc.

UH2@PSUVM.BITNET (Lee Sailer) (04/18/88)

In article <945@nuchat.UUCP>, steve@nuchat.UUCP (Steve Nuchia) says:
>
>Now, this little program starts to be _used_ for something.  It becomes
>"supported", and starts to grow.  The algorithm doesn't grow.  What
>grows is the data management and user interface cruft around the
>central algorithm.  Since the whole wad is still in FORTRAN the
>maintenance programmers have littered the algorithmic code with
>flags and such to support all the new features, but by this time
>the data management and user interface code outweighs the "scientific"
>code by ten to one or better.
>

This is a very good point.  Here's a testimonial---in the early 70's I
supported an in-house statistics package that was used primarily
for multiple regression.  The regression code was lifted from early
BMD (no P back then) code, and was 155 lines of FORTRAN.  The total
package was over 5000 lines.  User interface, idiot proofing,
a transformation language, file management, etc, took up the rest.


>Sure, the program spends most of _its_ time in the scientific code,
>but where are your programmers spending _their_ time?   Right.
>
>Nuchia's law:  90% of any production program is doing data
>    management and user interface for the 10% that is
>    doing the real work.
>
>With that out of the way, how good of a language is FORTRAN for
>data management?  Pretty terrible.  It just doesn't have any
>of the semantics you need.  Sure, you can kluge around it,
>but why?
>
>User interfaces, if implemented properly, just consist of
>a large state machine which calls a giant UI library.  The
>state machine can be written in FORTRAN or BASIC or SWAHILI
>or whatever you want, as long as some kind of conventional
>structure is used - the nodes of the state graph should
>all follow a stereotyped format to a greater or lesser
>degree.
>
>You wouldn't want to write the UI library in FORTRAN, but
>thankfully I don't think there are very manu people left
>who would insist that you do so.
>
>What's the solution?  Let your scientific programmers code in
>FORTRAN.  They're getting their work done, and you've got enough
>work to do without going on a crusade to "save" them.  And of
>course you get to take advantage of the available optimization
>technology for the long-running core of the programs.  But
>make your maintenance programmers learn and use a modern
>language; pick one with a good interface to your FORTRAN
>environment.
>

Except for COMPLEX variables, I have never been able to see why FORTRAN
is better than C, Pascal, ALGOL, etc etc etc for "scientific" programming.

I have always been bothered, too, that by not putting a Complex type
and its associated operators in the language, C and other modern
languages have thumbed their noses at physical scientists.
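
(For what it's worth, the built-in support really is convenient.  A tiny
illustration - CONJG, CSQRT and CABS are standard Fortran intrinsics:)

	PROGRAM CDEMO
	COMPLEX Z, W
	Z = (3.0, 4.0)
	W = Z * CONJG(Z) + CSQRT(Z)	!Arithmetic and intrinsics work
	WRITE (*,*) CABS(Z), W		!on COMPLEX values directly
	END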

simpson@spp3.UUCP (Scott Simpson) (04/19/88)

In article <39544UH2@PSUVM> UH2@PSUVM.BITNET (Lee Sailer) writes:
>Except for COMPLEX variables, I have never been able to see why FORTRAN
>is better than C, Pascal, ALGOL, etc etc etc for "scientific" programming.
>
>I have always been bothered, too, that by not putting a Complex type
>and its associated operators in the language, C and other modern
>languages have thumbed their noses at physical scientists.

FORTRAN is often used because it is fast.  It has no dynamic allocation and 
passes arrays by reference.  Unfortunately it doesn't support parallelism
well.  OCCAM attempted to solve that problem.  Some of FORTRAN's other problems
include
    - Implicit declaration
    - No spaces required
    - Card oriented
This list is obviously not exhaustive.
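
(The first two items interact in the classic folklore example, which I
reconstruct here from memory: with blanks insignificant and no
declarations required, a mistyped DO statement compiles cleanly.)

	DO10I = 1.10		!Meant to be a loop; the period turns it
				!into an assignment to a brand-new REAL
				!variable named DO10I
	DO 10 I = 1, 10		!What was intended
10	CONTINUE
	END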

C++ and Ada allow you to define new types, so complex numbers need not be built
in.
-- 
	Scott Simpson
	TRW Space and Defense Sector
	...{decvax,ihnp4,ucbvax}!trwrb!simpson  (UUCP)
	trwrb!simpson@trwind.trw.com		(Internet)

shf@well.UUCP (Stuart H. Ferguson) (04/21/88)


(Sorry about the length.  I guess I had a lot to say ...)

In a previous article, Steve Nuchia talks about Fortran:
>From article <1950@rtech.UUCP>, by daveb@llama.rtech.UUCP ...:
>> ...  It is hard for me to believe that, for sake of
>> argument, FORTRAN is incapable or even very seriously flawed in its
>> ability to model physics problems compared to any of the likely
>> alternatives.
>FORTRAN is a very nice language for arranging to have arithmetic
>FORmulas evaluated, and it is even ok (in its present form) at
>organizing a moderately complex sequence of calculations.

First of all, I like Fortran.  No really!  I've been working for about 
four years as a programmer with a group of solar physicists writing image
processing and data analysis applications.  I have a B.A. in computer
science from UC Berkeley, and am *fluent* in C and Pascal and a couple
of varieties of assembly language, as well as being fairly well versed
in Lisp and other more abstruse languages. 

I wrote a couple of image processing programs in Pascal once.  I was
using a recursive algorithm, so I thought it would make sense to use a
language that supported recursion (everybody knows you can't do
recursion in Fortran, right?).  It didn't work out.  Pascal turns out
just not to be a good language for this kind of thing -- I couldn't say
exactly why.  I found it easier to implement the algorithm in Fortran
using an array as an explicit stack so that the recursive algorithm
became iterative.  I know Pascal very well too, having written quite a
few things in Pascal, including a recursive descent parser and other
goodies.  The Fortran version of the image processing algorithm has gone
through several phases of refinement while the Pascal version has sat
around not doing much of anything. 
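
(Schematically - this is a from-memory sketch, not the actual program -
the transformation looks like this, with a flood fill standing in for
the real algorithm: the recursive calls become pushes onto an explicit
stack of pixel coordinates held in integer arrays.)

	SUBROUTINE FILL(IMG, NX, NY, IX0, IY0, NEW)
	INTEGER NX, NY, IMG(NX, NY), IX0, IY0, NEW
	INTEGER MAXSTK
	PARAMETER (MAXSTK = 10000)	!Overflow check omitted for brevity
	INTEGER XSTK(MAXSTK), YSTK(MAXSTK), ISP, IX, IY, OLD
	OLD = IMG(IX0, IY0)
	IF (OLD .EQ. NEW) RETURN
	ISP = 1				!Push the seed pixel
	XSTK(1) = IX0
	YSTK(1) = IY0
10	IF (ISP .EQ. 0) RETURN		!Stack empty: done
	IX = XSTK(ISP)			!Pop a pixel
	IY = YSTK(ISP)
	ISP = ISP - 1
	IF (IX .LT. 1 .OR. IX .GT. NX) GOTO 10
	IF (IY .LT. 1 .OR. IY .GT. NY) GOTO 10
	IF (IMG(IX, IY) .NE. OLD) GOTO 10
	IMG(IX, IY) = NEW		!Recolor, then push the four
	XSTK(ISP+1) = IX + 1		!neighbors -- these pushes are
	YSTK(ISP+1) = IY		!the former recursive calls
	XSTK(ISP+2) = IX - 1
	YSTK(ISP+2) = IY
	XSTK(ISP+3) = IX
	YSTK(ISP+3) = IY + 1
	XSTK(ISP+4) = IX
	YSTK(ISP+4) = IY - 1
	ISP = ISP + 4
	GOTO 10
	END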

I have, over the course of the past two years, written a tremendous 
volume of Fortran code, including a 'lexx'-like, a 'yacc'-like and a 
'make'-like program (VMS didn't have these utilities, so I wrote my own 
in Fortran).  Besides just writing image processing algorithms, I have 
also been concerned with the types of user interface issues that Nuchia
addresses. 

>... In my experience
>the text of most "scientific" programs begin life as some small
>amount of FORTRAN embodying some algorithm - maybe as much as a
>few thousand lines.  ...

An image processing algorithm can often be expressed as ONE line ...

>Now, this little program starts to be _used_ for something.  It becomes
>"supported", and starts to grow.  The algorithm doesn't grow.  What
>grows is the data management and user interface cruft around the
>central algorithm.  Since the whole wad is still in FORTRAN the
>maintenance programmers have littered the algorithmic code with
>flags and such to support all the new features, but by this time
>the data management and user interface code outweighs the "scientific"
>code by ten to one or better.

This is a very good observation.  User interface isn't much of an issue
for the scientists I work with, however, so that is not where the code
is.  Most scientists are happy with something as simple as: 

	ENTER FILENAME: 

A scientist's idea of a fancy user interface is being able to press
return at the prompt to exit the program.  (This is a stereotype, to be
sure.)  Actually, I have a lot of respect for the scientists here, and
many of them are very good programmers as well, but they'll usually only
work on user interface as a last resort.  The place I find the most 
'cruft' is in what I would call "I/O."

I think Nuchia's observation may be a bit of a simplification, and I
have some further refinement to suggest.  What happens to me is this:  I
write a little program to try some new data analysis technique.  When I
show the results to the scientists they get all excited and tell me to
try it on other data that we have.  So, now I need to go back and modify
the program to handle different types of data than I originally designed
it for, but this is no big deal, because I'm just making the program
better and more general so I go and do it gladly. 

Here's where the trouble starts.  Let's say I run the program on the
other data and the results are tantalizing but not as good as the 
scientists had hoped.  We then enter the "What If" Loop, where the 
scientists think up all sorts of various complicated and often contrived
ways to approach the problem to try and make the results better. 
Someone will come up with a good "what if we try it this way ..." type
suggestion and I'll go off and muck with my code, try it, and show them
the results.  Then they say, "Hmmm, that's not quite it.  What if
instead of _this_ we do _that_?" and off I go again into the code to try
that idea. 

The result is a real nightmare.  What I end up with is an algorithm that
is the result of many incremental and often temporary (or kludgy)
changes.  Ideally the program would eventually work right and then I
could go back and recode it correctly, but it rarely seems to happen
that way.  Usually what happens is the idea gets abandoned until six
months later when we get a new data set and someone suggests that the
idea we were working on six months ago might work well on this data, and
I dig the code up and bang on it again. 

>Sure, the program spends most of _its_ time in the scientific code,
>but where are your programmers spending _their_ time?   ...

In a type of *interactive* software development cycle for which few
languages are well adapted.

>Nuchia's law:  90% of any production program is doing data
>        management and user interface for the 10% that is
>        doing the real work.

This is a good start, but what exactly is meant by a "production"
program?  I also don't like the implication that user interface cannot
be considered "real" work.  What about something like a CAD program
which is nothing BUT user interface? 

>... how good of a language is FORTRAN for
>data management?  Pretty terrible.  It just doesn't have any
>of the semantics you need.  Sure, you can kluge around it,
>but why?
... and in the same vein ...
>You wouldn't want to write the UI library in FORTRAN, but
>thankfully I don't think there are very many people left
>who would insist that you do so.

It's easy for us software types to blame the programming language; I
mean, what could be worse than Fortran, right?  But I remind you that I
wrote 'lexx,' 'yacc' and 'make' in Fortran with very little trouble. 
This is Vax Fortran, an extension of FORTRAN-77 with 31 character
identifiers, structured control flow and structured data types, among
other things.  I've done recursion, linked lists, string manipulation
and even dynamic memory allocation in Fortran.  I'm not pushing Vax
Fortran as some kind of True and Great language, but I AM suggesting
that the source of some of the problems Steve addresses may be a result
of more than JUST the choice of programming language. 

The "scientific" programs which start small and grow, such as the ones 
that Steve refers to, are rarely planned.  Rather they evolve out of the
kind of interactive development (i.e. hacking) that I described.  For
cases where the program was actually designed in advance, such as my
'lexx,' 'yacc,' and 'make' utilities, Fortran was a workable choice. 
Some might argue that bad C code is better than good Fortran code, but I
don't find this to be the case, and I strongly suspect that those who
say this don't really know how to recognize a good Fortran program. 


Steve Nuchia also talks about user interfaces:

>User interfaces, if implemented properly, just consist of
>a large state machine which calls a giant UI library.  The
                                     -----
>state machine can be written in FORTRAN or BASIC or SWAHILI
>or whatever you want, as long as some kind of conventional
>structure is used - the nodes of the state graph should
>all follow a stereotyped format to a greater or lesser
>degree.
...
>What's the solution?  Let your scientific programmers code in
>FORTRAN.  They're getting their work done, and you've got enough
>work to do without going on a crusade to "save" them.  And of
>course you get to take advantage of the available optimization
>technology for the long-running core of the programs.  But
>make your maintenance programmers learn and use a modern
>language; pick one with a good interface to your FORTRAN
>environment.

This has proven to be a good general model to work from.  We have done
this in our lab to a certain extent at various levels with varying 
degrees of success.  Three of our more successful approaches validate 
Nuchia's model:

1) User interface libraries

In designing libraries, there's a parallel issue to user interface --
that of programmer interface.  There are really a lot of wonderful
libraries out there which do a lot of wonderful things, but many have no
concern for what the programmer sees.  Good examples are some of the
really big graphics libraries.  They are really powerful and really
general and really can do lots of terrific things, but it takes really a
LOT of calls to do even the simplest thing.  You sometimes have to learn
every weird formalization that the program uses internally in order to
do something as simple as draw a line.  Experience with these tends to
make me shudder when I see someone refer to building a GIANT user
interface library. 

The best library interface, as well as the most useful user interface
library for Fortran programs I've ever seen happens to be the same
library.  It is called GIRL (Generalized Input Routine Library) and the
basic programmer interface consists of ONE subroutine call.  The 
programmer can do something like:

	ALPHA = 1.0
	COUNT = 10
	CALL GIRL('Enter Alpha, Count',,,'RI',ALPHA,COUNT)

and the result would be a friendly prompt:

	Enter Alpha, Count [1.0,10]: 

The preset values for the variables get displayed in []'s, and the user 
can press return to get those as defaults or can type ",2" to leave 
ALPHA alone and set COUNT to 2.  The GIRL subroutine can take any 
number of arguments and determines their type from a string (the 'RI'
argument in the example above). 

This very simple interface makes it *easier* to use GIRL than the
equivalent 2-4 lines of Fortran, and you get defaults and a uniform
input method throughout your program.  GIRL also has a whole lot of
other features, such as on-line help, super-defaults, and backtracking,
but all of that is hidden behind a deceptively simple programmer
interface.  The other features get accessed by adding more parameters to
the GIRL call, or by calling other routines.  It's organized, however,
so that the programmer only needs to learn about those features he's
interested in when he's interested in them. 

2) Fortran pre-processor

We have been successful at separating the functions of user interface and
data I/O from the computational part of the program by using a Fortran
pre-processor.  The input to the pre-processor is a language which is a
super-set of Vax Fortran with the basic Fortran data types extended to
include data types specific to image processing.  Normally, in order to
read an image into memory to work on it, the Fortran programmer would
have to get the filename, open the file, get the filesize and read the
data into a static array in the program. Using the pre-processor, the
programmer can now get an image into memory by just declaring a variable
which is the file and a variable for the in-memory array and just assign
one to the other, as in: 

	IMEXT IMIN	!External Image IMIN, the input image
	IMWIN A		!In-memory Image A, the data array
	...
	A = IMIN	!Read file into A

The assignment statement would be expanded by the pre-processor into 
calls to the appropriate library routines, and the array for the image 
would be dynamically allocated and loaded.  Even getting the filename
from the user is part of the library.  As a result, what would have been a
lot of 'cruft' for reading an image is now a simple expression which 
much more eloquently expresses what the programmer is doing.  The 
pre-processor provides other capabilities to do simple image
manipulations easily, but since the output is Fortran code, the main
body of the algorithm can pass through unchanged. 

There have been a number of useful side effects.  Before using this
pre-processor, the "standard" image file format was very simple and
limited.  All images contained the same data type and had a fixed
maximum size of 256 by 256 pixels because it was easy to write code that
would read them, and the files had to be read into fixed sized arrays in
the program meaning that they had to have a maximum size.  Since the
pre-processor makes file I/O transparent and makes it possible to have
arbitrarily sized arrays, we have started using a more sophisticated
image file format allowing very large arrays and arbitrary data types.
For the same reasons, users are also starting to see a uniform style of
interface. 

By the way -- I wrote the pre-processor in Fortran ;-).

3) Interactive shell

Another approach we've used is to build an interactive shell around a 
core of basic operations which does all the "dirty work" of managing 
data and much of the user I/O.  The core operations can be coded in 
Fortran or assembly or whatever you like, and special core algorithms
can be spliced in with a bit of software "glue."  Since they become just
functions called from within the shell, they don't need to be concerned
with data I/O or user interface -- they just get data from the shell,
process it, and return the results to the shell for the user to display
or continue working on.  In the shell we use (Ana, written by Dr.
Richard Shine and associates) you can do many simple operations right in
the shell.  For example, to subtract the mean from an image stored in
the variable X and display the result on a TV monitor, you could do: 

	ANA> X=X-mean(X)
	ANA> TV,X

Although the shell could be relatively simple, the one we use is a
complete programming language similar to Fortran (but without GOTO's, 
thank goodness :).  A suitable such shell is IDL by Research Systems
Inc., although I don't know if you can add your own code to it.  We use
a home grown design which allows us to modify it any way we wish,
although it does tend to be less stable than a commercial product. 

One real advantage is that this approach facilitates the interactive
development cycle I mentioned earlier.  Since the shell is interactive
and results are calculated and displayed immediately on pressing return,
it is much easier to "fool around," and try different approaches in
order to get a certain result.  Once you get the algorithm working as a 
shell script, it is a fairly trivial process to code it up (using the 
pre-processor) into a self-contained Fortran program, or even as a new 
built-in function in the shell language.

>I have to admit that I haven't seen this tried on a significant
>scale.  The ideas presented above have been forming in my mind
>for a couple of years, and lately I've had an opportunity to
>watch scientists produce programs, confirming much of what I
>had thought.  Take it for what it's worth, and I would appreciate
>any evidence supporting or contradicting my positions.
>-- 
>Steve Nuchia        | [...] but the machine would probably be allowed no mercy.
>uunet!nuchat!steve  | In other words then, if a machine is expected to be
>(713) 334 6720      | infallible, it cannot be intelligent.  - Alan Turing, 1947

Our little lab has done some of the best and most impressive image 
processing in the field of solar physics in recent years.  We've been
doing, in a large part, just what Steve Nuchia is suggesting, and I
think that this general approach has been a strong force in our success.
I don't know if what we have done would work for everyone, but it has
certainly worked very well for us. 

	Stuart Ferguson
	Lockheed Palo Alto Research Lab
	Research and Development Division
	Solar and Optical Physics
-- 
		Stuart Ferguson		(shf@well.UUCP)
		Action by HAVOC		(shf@Solar.Stanford.EDU)

lee@diane.uucp (Leonid Poslavsky) (04/21/88)

In article <4962@pucc.Princeton.EDU>, EGNILGES@pucc.Princeton.EDU (Ed Nilges) writes:
> In article <39501UH2@PSUVM>, UH2@PSUVM.BITNET (Lee Sailer) writes:
>  
>>One of the reasons physicists need super computers is that they usually haven't
>>a clue about using data structures to implement sophisticated
>>algorithms.  Most P's do most everything by brute force.  Partly, this
>>is because FORTRAN makes it so hard to implement clever code.
>  
>    "In the good old days physicists repeated each other's
>     experiments, just to be sure.  Today they stick to FORTRAN,
>     so that they can share each other's programs, bugs
>     included."

In reality there are large libraries of highly optimized code for just about any
mathematical operation.  And should you wish to write a structure-dependent
program in FORTRAN, all that is required is ingenuity.
E.g.: a couple of years ago I asked some undergrads in the CS department
how they would process a couple of trees which have joined branches
(any offshoot of one tree can be connected to or disconnected from the root of
another).  None of the 5 people I asked could do this.  The problem was that
they were only familiar with Pascal and C and the applicable methodologies for
tree traversal.

By the way, the solution is very easy if instead of pointers you use integer
arrays, or, to be very extravagant, an array of pointers.
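
(A minimal sketch of what I mean - node numbers and array names are
invented: each node is an integer index, the "pointers" are integer
arrays, and grafting one tree under another is a pair of assignments.)

	PROGRAM FOREST
	INTEGER MAXN, I
	PARAMETER (MAXN = 100)
	INTEGER CHILD(MAXN)		!First child of each node
	INTEGER SIBLNG(MAXN)		!Next sibling of each node
	DO 10 I = 1, MAXN		!Zero plays the role of NIL
	   CHILD(I) = 0
	   SIBLNG(I) = 0
10	CONTINUE
	SIBLNG(7) = CHILD(3)		!Graft the tree rooted at node 7
	CHILD(3) = 7			!under node 3
	CHILD(3) = SIBLNG(7)		!Disconnect it again (while it is
	SIBLNG(7) = 0			!still the first child)
	END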

Thus, FORTRAN has its advantages: it forces one to evaluate one's options!!!


Lee (Captain)
U. of Notre Dame
BITNET: gba6bc

steve@nuchat.UUCP (Steve Nuchia) (05/01/88)

From article <5752@well.UUCP>, by shf@well.UUCP:
> I wrote a couple of image processing programs in Pascal once.  I was
> using a recursive algorithm, so I thought it would make sense to use a
> language that supported recursion (everybody knows you can't do

People who profess a preference for one kind of language over
others usually approach problems from a perspective appropriate
to that kind of language.  This colors their solution at a deep
structural level, leading to a mismatch if they are forced to
code the solution in some inappropriate language.  This is due
to the choice of solution architecture plus language, and has
much less to do with the combination of _problem_ and language
than is commonly believed.

I'm not trying to say that Mr. Ferguson is wrong, I'm generalizing.
But I do find his argument unconvincing - it amounts to proof
that he likes to program in FORTRAN by reasoning circularly from
the premise that he likes to program in FORTRAN. 


> This is a very good observation.  User interface isn't much of an issue
> for the scientists I work with, however, so that is not where the code
> is.  Most scientists are happy with something as simple as: 
> 	ENTER FILENAME: 
> A scientist's idea of a fancy user interface is being able to press
> return at the prompt to exit the program.  (This is a stereotype, to be

We are thinking of slightly different situations - your community
is using the programming language directly as a tool for data
manipulation, mine is (ultimately) building tools for production
data analysis.  The process is quite similar as far as it goes,
yours just stops when the code is thrown away or mutates out of
existence, mine becomes "supported".

[Much text describing the search for useful algorithms]

>>Sure, the program spends most of _its_ time in the scientific code,
>>but where are your programmers spending _their_ time?   ...
> 
> In a type of *interactive* software development cycle for which few
> languages are well adapted.

Again, we are looking at different phases of the process; you make
a very good point here.  I had glossed over the process by which
what I called the "algorithmic core" of my "supported programs" comes to be.

>>Nuchia's law:  90% of any production program is doing data
>>        management and user interface for the 10% that is
>>        doing the real work.
> 
> This is a good start, but what exactly is meant by a "production"
> program?  I also don't like the implication that user interface cannot
> be considered "real" work.  What about something like a CAD program
> which is nothing BUT user interface? 

As usual, the controversy is one of semantics :-)

CAD includes things like routers and design rule checkers and other
computationally intensive algorithms, but that is begging the question.
I was thinking of explicitly computational jobs when I wrote that, and
I meant "real" in its pop-culture sense - as in Real Programmers don't
write comments.  That is the sense in which your scientist might use
the word, "fancy users interfaces are useless - they don't do any
_real_ work".

By production program I mean one that has a user's manual, dozens
of users at several sites, and is run orders of magnitude more often
than it is compiled.  Obviously this isn't a definition of the term,
but you get the idea.

>>... how good of a language is FORTRAN for
>>data management?  Pretty terrible.  It just doesn't have any
>>[etc]

> It's easy for us software types to blame the programming language; I
> mean, what could be worse than Fortran, right?  But I remind you that I
> wrote 'lexx,' 'yacc' and 'make' in Fortran with very little trouble. 

And some people do pretty amazing things in assembler, too.  Does
that make it right?

Ultimately the choice of program architecture is much more
important than the choice of language; all languages are
after all formally equivalent.  I find that FORTRAN restricts
my ability to manage system resources efficiently, and greatly
limits the portability of the resulting program.  That coupled
with the fact that I don't do a whole lot of numerical work
myself leaves me very comfortable _not_ using FORTRAN.  Different
situations lead to different choices.

> This is Vax Fortran, an extension of FORTRAN-77 with 31 character
> identifiers, structured control flow and structured data types, among
> other things.  I've done recursion, linked lists, string manipulation
> and even dynamic memory allocation in Fortran.  I'm not pushing Vax

"and even" dynamic memory allocation?

shhh!  my case is resting.

> Fortran as some kind of True and Great language, but I AM suggesting
> that the source of some of the problems Steve addresses may be a result
> of more than JUST the choice of programming language. 

On that I agree completely.

> The "scientific" programs which start small and grow, such as the ones 
> that Steve refers to, are rarely planned.  Rather they evolve out of the
> kind of interactive development (i.e. hacking) that I described.  For
> cases where the program was actually designed in advance, such as my
> 'lexx,' 'yacc,' and 'make' utilities, Fortran was a workable choice. 

Excellent points.  My point in relation to the unplanned growth
was simply that at some point the program gets so large, and
is doing so many unexpected things, that the original language
choice is no longer appropriate.  Common practice does not have
a provision for recognizing and dealing with that threshold; I
am proposing that making such a provision explicitly would be
a good thing.  Secondarily I'm arguing that having that mechanism
in place would reduce a lot of the motivation for the language wars.

The point about FORTRAN being workable with advance planning says
a great deal more about the value of planning than it does about
the fundamental suitability of FORTRAN for systems programming.

> Some might argue that bad C code is better than good Fortran code, but I

Some might; I'm not one of them.

> don't find this to be the case, and I strongly suspect that those who
> say this don't really know how to recognize a good Fortran program. 

Or a bad C program.  FORTRAN is (was - 8X is a mess) a much simpler
language in some ways than C.  Bad FORTRAN programs are bad because
of scale, aliasing problems, and poor choice of names, primarily.
Bad C programs can have all of those problems, along with additional
aliasing brought about by pointer abuse, preprocessor abuses, and
unimaginably tortuous control structures (C does have goto, and bad
programs use it, but that is the least of the control structure problems
in a really bad C program).

An important concept in program quality is "referential transparency";
the idea that the meaning of a fragment should be as clear as possible
without reference to distant passages.  In my opinion, no program is
truly world-class bad unless it grossly lacks referential transparency.
Of course both C and FORTRAN have mechanisms by which a dedicated
hacker can obscure the meaning of his code; arguing about which is
better on the basis of bad programs is silly.  (and besides, FORTRAN
might win, and we can't have that!  :-)

A skilled and disciplined programmer can produce high quality
programs in any language.  No language can prevent poor code.



Now we dive into a discussion of user interfaces.

> do something as simple as draw a line.  Experience with these tends to
> make me shudder when I see someone refer to building a GIANT user
> interface library. 

Once again we are seeing things differently because we are looking
at different things.

> The best library interface, as well as the most useful user interface
> library for Fortran programs I've even seen happens to be the same
> library.  It is called GIRL (Generalized Input Routine Library) and the
> basic programmer interface consists of ONE subroutine call.  The 
> programmer can do something like:
> [description of prompt generator] 

Simple tools are just the thing for simple results, and the prompt
handler looks like just the thing for short-lived programs.  What
I had in mind were things like the Macintosh Toolkit or Suntools
or similar graphical interactive user interface environments.  Even
the typical spreadsheet or data entry screen needs significantly more
sophistication than your prompter can give, and that additional complexity
inevitably impacts the design of your code.



The remainder of Mr. Ferguson's article contains very good data
on his experiences with growing programs, exactly the kind of
information I had asked for.  Thank you!


-- 
Steve Nuchia	    | [...] but the machine would probably be allowed no mercy.
uunet!nuchat!steve  | In other words then, if a machine is expected to be
(713) 334 6720	    | infallible, it cannot be intelligent.  - Alan Turing, 1947

shf@well.UUCP (Stuart H. Ferguson) (05/05/88)

In a previous article, Steve Nuchia responds to my response to his 
previous article ...
> >From article <5752@well.UUCP>, by shf@well.UUCP:
> > I wrote a couple of image processing programs in Pascal once.  I was
...
> People who profess a preference for one kind of language over
> others usually approach problems from a perspective appropriate
> to that kind of language.  ...
...
> I'm not trying to say that Mr. Ferguson is wrong, I'm generalizing.
> But I do find his argument unconvincing - it amounts to proof
> that he likes to program in FORTRAN by reasoning circularly from
> the premise that he likes to program in FORTRAN. 

I undoubtedly made a mistake in phrasing my article the way I did.  It's 
easy to take what I said as material for a language war, but that was 
not at all my intention.  Let me re-iterate:

>> ... I'm not pushing 
>> Fortran as some kind of True and Great language, but I AM suggesting
>> that the source of some of the problems Steve addresses may be a result
>> of more than JUST the choice of programming language.

Language wars start with the premise that the Quest is to find the Right
Language.  Although there are always language choice issues involved in
software engineering, the ultimate goal has always really been to write
Good Programs.  In one's personal Quest for Good programs, one may find
that some programming paradigms can provide greater insight and freedom 
than others.

I wrote in an apparent defense of Fortran partly because it's one of
those languages that programmers snicker about even though they
themselves have never written a line of Fortran code.  (Rather like
COBOL in that respect -- or lack of it :-).)  But the real issue is not
about any particular language, it's about producing good code, and on
that front I agree pretty well with what Steve has said. 

He summed it up much more eloquently than I:

>A skilled and disciplined programmer can produce high quality
>programs in any language.  No language can prevent poor code.


On the real issue:

>> The "scientific" programs which start small and grow, such as the ones 
>> that Steve refers to, are rarely planned.  Rather they evolve out of the
>> kind of interactive development (i.e. hacking) that I described.  For

>Excellent points.  My point in relation to the unplanned growth
>was simply that at some point the program gets so large, and
>is doing so many unexpected things, that the original language
>choice is no longer appropriate.  Common practice does not have
>a provision for recognizing and dealing with that threshold; I
>am proposing that making such a provision explicitly would be
>a good thing.  Secondarily I'm arguing that having that mechanism
>in place would reduce a lot of the motivation for the language wars.

The motivation for this is excellent.  Mr. Nuchia has made an
observation (that I can verify to some extent) that small programs often
grow into big programs and the original language in which they were
formulated is no longer appropriate to the kind of program they have
become.  The original language Steve refers to is Fortran, but it could
just as easily be BASIC, Lisp, LOGO or Ada for that matter.  In a lot of
cases, there may be a point where it would make sense to use a different 
programming language in order to facilitate writing good code.  The 
problem then becomes one of recognizing and dealing with this threshold, 
rather than one of struggling with an inappropriate language.


[discussion of "Nuchia's Law" and the meaning of a "production" 
program]
>By production program I mean one that has a user's manual, dozens
>of users at several sites, and is run orders of magnitude more often
>than it is compiled.  Obviously this isn't a definition of the term,
>but you get the idea.

I do.  You mean a "production" program in the sense that the program is
the _product_, not in the sense that the program is _used_ for
production.  We have a lot of programs we use as part of a data 
production pipeline which get run orders of magnitude more than they get 
compiled, but we do not have programs which are a product in themselves.

[discussion of libraries]

>Once again we are seeing things differently because we are looking
>at different things.
...
>... What
>I had in mind were things like the Macintosh Toolkit or Suntools
>or similar graphical interactive user interface environments.  Even
>the typical spreadsheet or data entry screen needs significantly more
>sophistication than your prompter can give, and that additional complexity
>inevitably impacts the design of your code.

I was again not trying to push the prompt library as a great thing in
and of itself.  I'm sure nobody's interested in that, especially not in
comp.software-eng.  I was trying to illustrate my point that many
libraries are not well designed.  The issues of library interface and
user interface are strongly analogous, and lessons learned in one are
often applicable to the other.  The sophistication and sheer complexity
of graphical user interfaces means that all that much more time should
be spent designing a proper programmer interface. 

Sometimes with domains as complex as this one it makes sense to go 
beyond the conventional idea of just compiling a library of useful
functions.  When the task represents a whole formal domain in itself, it
can make sense to provide design tools which address the formal domain
directly. Databases are a good example.  Databases can be implemented as
a library interface to a general-purpose programming language, but since
database management is such a large and well-defined domain in itself it
can be worthwhile to write database languages -- DBaseII, for example --
which are a kind of special-purpose programming language giving more
direct access to the formalisms of database management. 

If you do this, the trick then becomes interfacing the special-purpose
language to other languages in order to implement the things not easily
accessible by the formalisms of the special-purpose language.  This is
exactly the issue Nuchia is addressing when he refers to the
"threshold," the point where it makes sense to shift language rather
than muddle through using an inappropriate formal domain.

>The remainder of Mr. Ferguson's article contains very good data
>on his experiences with growing programs, exactly the kind of
>information I had asked for.  Thank you!
>Steve Nuchia        | [...] but the machine would probably be allowed no mercy.


You're welcome.
-- 
		Stuart Ferguson		(shf@well.UUCP)
		Action by HAVOC		(shf@Solar.Stanford.EDU)

geoff@desint.UUCP (Geoff Kuenning) (05/07/88)

In article <5879@well.UUCP> shf@well.UUCP (Stuart H. Ferguson) writes:

> I wrote in an apparent defense of Fortran partly because it's one of
> those languages that programmers snicker about even though they
> themselves have never written a line of Fortran code.
...
> >A skilled and disciplined programmer can produce high quality
> >programs in any language.  No language can prevent poor code.

While this is technically true, it ignores a very important issue:
productivity.  I have sitting on my shelf a foot-thick listing of a
real-time control system written in Fortran;  about 80% of the code
was written by me.  The code is some of the best I've ever done.  But
if I had been allowed to do the project in C, Bliss, or another
language that had data structures, I would have finished it in 30%
less time.  (If I'd been allowed to use a Vax instead of a PDP-11,
I would have saved another 40%, but that's another issue).
-- 
	Geoff Kuenning   geoff@ITcorp.com   {uunet,trwrb}!desint!geoff