[comp.lang.misc] What makes a language "easy" to program in?

feldmark@hanako.stars.flab.Fujitsu.JUNET (05/31/88)

The other day I was having a discussion about "what I would put into a
parallel processing computer language, if I were designing one".  We
were trying to quantify what makes certain programming languages "easy"
to program in.

We were discussing general MIMD parallel processing, and I am a
consummate C programmer, but wouldn't dream of proposing a sequential
programming language that is "hacked" to provide parallelism.  It has
to be an inherently parallel or data flow type of language.  In
particular, we were discussing features of C, GHC (a concurrent
Prolog-ish logic programming language), and Smalltalk. GHC is
inherently parallel, resulting in many data-flow-like features.  I have
done a little programming in GHC here at work (but nothing really
big), and started learning Smalltalk about a week ago, so don't have a
lot of experience to base comparisons on.  (But I know that I was
constantly wishing I could put loops and arrays into my GHC programs.)
The points I could come up with may be a bit generalized and obvious,
but...

To me right now, sequential style programming is just easier than
programming in a language that is inherently parallel.  Too many
things to keep track of. In GHC for example, goals comprising the body
of a predicate may essentially be executed in a random order.  So one
has to be very careful to be sure his program works regardless of the
order of execution.  It is much easier to program, given a specific
order (sequential) to depend on.  (But obviously bad for parallel
processing.)  There is also some advantage in my being able to program
well with sequential language constructs, because I can ALREADY do it and know
instinctively how to attack a problem.  It was also suggested by an
experienced GHC programmer that if he came back a year after writing a
complex GHC program and a complex C program to add features, it would take
considerably longer to understand the GHC program than the C
program.  (Yes, even with sufficient comments. :-)

From the little I know of Smalltalk and object-oriented programming in
general, it seems that the best features are the modularity that is
imposed by the language itself and the lack of typing.  Modularity is
definitely not mandatory in GHC or C.  But then I have also heard that
really large systems get painful to deal with even in Smalltalk. (Any
comments on this?)

Maybe this all boils down to the pros and cons of sequential,
functional, logic, and object oriented programming.  If this hasn't
already been beaten to death at some time in the recent past, does
anyone have any comments on what it really is that makes any of these
"easy" (or hard) to program in?  Would loop macros in Prolog/GHC do
the trick for me?  Can I incorporate sequential features into an
inherently parallel language and still maintain a coherent language
model?  Would I have a different feeling about it being "easier" to
program in C than GHC or Smalltalk if I had learned a non-sequential,
logic, or object-oriented programming language first?  Or is the only
solution to just develop programming instincts for inherently parallel
languages?

Mark Feldman
Fujitsu Laboratories Limited
JAPAN: feldmark@hanako.stars.flab.fujitsu.junet
USA:   feldmark%hanako.stars.flab.fujitsu.junet@uunet.uu.net

baldwin@cs.rochester.edu (Douglas Baldwin) (06/01/88)

	Having thought a fair bit about parallel programming and why
it's hard to do and how it ought to be done, here're my comments
(for what they're worth).

	Coordinating a multiprocessor to do something IS harder than
getting a uniprocessor to do it - someone has to indicate how the
overall application is broken up into processes, how and when those
processes communicate, how they have to be synchronized in order to
avoid a whole host of problems (deadlock, starvation, violation of
data dependencies in accessing data, etc. etc. etc.). When programmers
have to describe all of this explicitly, in addition to just describing
the basic function they want computed, it's no wonder that they have a
hard time. This suggests that IMPLICIT parallelism (i.e., parallelism
detected and managed by a compiler or run-time system) is preferable to
explicit parallelism.

	Conventional imperative languages essentially describe state
machines - not necessarily machines with a finite number of states,
but still machines whose current actions are dependent on the state
they are currently in. This means that programs (i.e., state machines)
written in these languages had better sequence through their states in
just the right order if they are to produce the results that programmers
expect. This dependence on the order of states makes all imperative
languages (i.e., the whole Algol family and its close relatives,
object-oriented languages) inherently sequential, and any attempt to
extract parallelism from these languages fights a losing battle
against this sequentiality. I'd even argue that attempts to reason
about the correctness of programs written in explicitly parallel
imperative languages face the same battle.

	Declarative languages (functional, dataflow, logic, etc.)
do seem better for implicitly parallel programming, although they
have problems of their own. The main one is that to varying degrees
all of the declarative languages of which I am aware have problems
expressing general data parallel computations - the problem is that
the cleanest way to describe a computation done on all elements of
a data structure is via some sort of recursive traversal of that
structure, and the recursion introduces apparent data dependencies
that serialize the computation. This problem could probably be fixed
by proper language design, and in fact that's what my research is
about - things called "constraint languages" that I think avoid a
lot of the problems that other languages have in supporting implicitly
parallel programming.

	Just to put in a plug for myself, more details on the above
argument can be found in "Why We Can't Program Multiprocessors the
Way We're Trying to Do It Now", Technical Report number 224, University
of Rochester Computer Science Department, by me. I'll be glad to have
copies sent to anyone who asks.

	One final comment on Mark's "would loops help GHC?" question.
I suspect not. I haven't used GHC myself, although I've dabbled in
Prolog and followed some of the literature on parallel logic languages.
To use any language you really need to think in its idioms and paradigms,
and if GHC is anything like other logic languages the right paradigms
are recursion, backtracking, simultaneous search for all solutions to
a goal, etc., not explicit loops. Maybe as you get more used to GHC
you'll stop wishing for loops so much.

ephraim@think.COM (ephraim vishniac) (06/02/88)

In article <1802@hubcap.UUCP> baldwin@cs.rochester.edu (Douglas Baldwin) writes:
>	Having thought a fair bit about parallel programming and why
>it's hard to do and how it ought to be done, here're my comments (for
>what they're worth).

>	Coordinating a multiprocessor to do something IS harder than
>getting a uniprocessor to do it - someone has to indicate how the
>overall application is broken up into processes, how and when those
>processes communicate, how they have to be synchronized in order to
>avoid a whole host of problems (deadlock, starvation, violation of
>data dependencies in accessing data, etc. etc. etc.). When
>programmers have to describe all of this explicitly, in addition to
>just describing the basic function they want computed, it's no wonder
>that they have a hard time. This suggests that IMPLICIT parallelism
>(i.e., parallelism detected and managed by a compiler or run-time
>system) is preferable to explicit parallelism.

The area that Baldwin seems to be ignoring in this discussion is SIMD
machines.  His initial statement ("Coordinating a multiprocessor is
harder...") is true, but his consequent ("someone has to indicate how
the overall application is broken into processes...") is parochial.

In the parallel system I use (a Connection Machine), there's no such
problem as dividing the application into processes, hence no
interprocess communication, synchronization, etc.  Instead (yes, there
is a catch!), the hard part is deciding on the division of data among
the processors.  When coding the application, most of the parallelism
is implicit: all operations occur on all of the currently selected
processors.  In the language I use (C*), there are only slight
extensions to conventional C syntax.  The body of a typical C*
function can be understood as operating on a single set of data even
though it operates on an arbitrary number.

Sorry if the above sounds too much like an ad for the CM.  I'm not one
of the designers of the system or the language, just an in-house user
who thinks the designers did some smart things.

Ephraim Vishniac					  ephraim@think.com
Thinking Machines Corporation / 245 First Street / Cambridge, MA 02142-1214

     On two occasions I have been asked, "Pray, Mr. Babbage, if you put
     into the machine wrong figures, will the right answers come out?"

daveb@geac.UUCP (David Collier-Brown) (06/03/88)

In article <FELDMARK.88May31160241@hanako.stars.flab.Fujitsu.JUNET> feldmark@hanako.stars.flab.Fujitsu.JUNET writes:
>The other day I was having a discussion about "what I would put into a
>parallel processing computer language, if I were designing one".  We
>were trying to quantify what makes certain programming languages "easy"
>to program in.
>
>We were discussing general MIMD parallel processing, and I am a
>consummate C programmer, but wouldn't dream of proposing a sequential
>programming language that is "hacked" to provide parallelism.  

  Well, Per Brinch Hansen (of "Monitors" fame, along with Tony
Hoare) invented an interesting language some years ago for
expressing programmer-visible parallelism in a somewhat "normal"
notation...
  This was Edison, described in a special issue of SP&E, and a
parallel buffer-copier looked something like the following:

   <declarations for two buffers, in and out>
   while (1) do
	out = in;
	cobegin 
		1 do read(in);
	also
		2 do write(out);
	coend
   end

  As you can see, this allows two long-lived processes to rendezvous
in a common critical section (the assignment statement) without tons
of syntax.
  I infer the code generated from this was substantially more
complex, something like:

	<declarations>
	process_start(processor=1, process={read(in);});
	process_start(processor=2, process={write(out);});
    loop_label:
	wait_for(1); /* causes suspension */
	wait_for(2);
	out = in;
	run(1);
	run(2);
	goto loop_label;
    exit_label:
	process_stop(processor=1);
	process_stop(processor=2);
	exit;

--dave (it's an interesting notation, and subtle) c-b
-- 
 David Collier-Brown.  {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,  | "His Majesty made you a major 
 350 Steelcase Road,   |  because he believed you would 
 Markham, Ontario.     |  know when not to obey his orders"

evan@cunixc.columbia.edu (Evan Bigall) (06/05/88)

>Maybe this all boils down to the pros and cons of sequential,
>functional, logic, and object oriented programming.  If this hasn't
>already been beaten to death at some time in the recent past, does
>anyone have any comments on what it really is that makes any of these
>"easy" (or hard) to program in?  Would loop macros in Prolog/GHC do
>the trick for me?  Can I incorporate sequential features into an
>inherently parallel language and still maintain a coherent language
>model?  Would I have a different feeling about it being "easier" to
>program in C than GHC or Smalltalk if I had learned a non-sequential,
>logic, or object-oriented programming language first?  Or is the only
>solution to just develop programming instincts for inherently parallel
>languages?
>
>Mark Feldman
>Fujitsu Laboratories Limited
>JAPAN: feldmark@hanako.stars.flab.fujitsu.junet
>USA:   feldmark%hanako.stars.flab.fujitsu.junet@uunet.uu.net

You have managed to skip one class of language (or at least one language): APL.

Many people would argue that APL is not easy to program in, but there
are a fair number of people who use it (and swear by it) despite its
glaring flaws.

APL is exactly what you have asked for: it is a sequential language that
supports inherent parallelism.  It has both fine-grain parallelism in the
form of all the vectors lying around, and coarse-grain parallelism in the
form of "heavily data-independent" nodes in the parse tree that can be
efficiently scheduled for simultaneous execution.  For more on this idea,
see the papers by Wai-Mee Ching of IBM Watson Research describing his
parallel/vector APL compiler, which takes advantage of these things (I was
heavily involved in the code generation).

While I am not suggesting we all switch to APL for parallel applications,
I do think it would be wise to take an objective look at the language and
see what might be worth stealing.  I have also noticed that an uncanny
number of papers on parallel languages seem to end up footnoting APL
(Steele's C* paper, off the top of my head); maybe this is a hint?

Evan

-- 
APL is a mistake, carried through to perfection.  It is the language of the
future for the programming techniques of the past.    -	Edsger Dijkstra

mrspock@hubcap.UUCP (Steve Benz) (06/06/88)

   
 ( For brevity, I use the term 'parallel' to mean MIMD and MISD
   architectures.  Hopefully the SIMD crowd will see their way
   clear to forgive the slight.  )

From article <10216@sol.ARPA>, by baldwin@cs.rochester.edu (Douglas Baldwin):
> [It's harder to write programs for parallel computers than sequential
>  computers]... This suggests that IMPLICIT parallelism (i.e., parallelism
> detected and managed by a compiler or run-time system) is preferable to
> explicit parallelism.

  There is only a fine hair of difference between
Baldwin's "explicit" parallelism and "implicit" parallelism.
"Explicit" parallel constructs mean something like "do A in parallel
with B."  "Implicit" parallel constructs mean "do A in parallel with
B or do them in sequence -- I don't care."

  While implicit parallelism makes things easier on the programmer,
it makes things harder on the "optimizer" (excuse my very liberal use
of the term; the "optimizer" is the unit that has to determine the
best way to balance load across the processing elements).

> [imperative programming languages...]
> (i.e., the whole Algol family and its close relatives,
> object-oriented languages) [are] inherently sequential, and any attempt to
> extract parallelism from these languages fights a losing battle
> against this sequentiality.

  Just as with any other sort of optimization, there are ways to make
things easier on the optimizer.  I'll demonstrate my point by analogy;
consider these for loops (written in C):

{	int i, a[10];
	for (i = 0; i < 10; i++) a[i] = 0;
}

	--==*==--

{	int *p, a[10];
	for (p = a + 10; p != a; *--p = 0);
}

  I feel certain that the second for loop would run at least as fast
as the first on a PDP-11.  Some optimizers would pick up on the
fact that the first solution wasn't that great and would output
something more closely resembling the second for loop.

  This carries over to parallel systems as well.  On such
architectures, algorithms that are written in a manner more suited
to parallel architectures will run faster than those which are more
suited to sequential architectures.

  In the world of sequential processing, there are a wide variety
of programming languages to choose from.  Each is best suited to
a particular set of problems.  No one language is ideal in all
situations.  I think the same will be true for parallel languages.
Imperative languages will find their niche in parallel processing,
just as functional and logic languages will.

				- Steve Benz

g-rh@cca.CCA.COM (Richard Harter) (06/06/88)

In article <1818@hubcap.UUCP> mrspock@hubcap.UUCP (Steve Benz) writes:

>  Just as with any other sort of optimization, there are ways to make
>things easier on the optimizer.  I'll demonstrate my point from analogy,
>consider these for loops (written in C):

>{	int i,a[10];
>	for (i = 0; i < 10; i++) a[i]=0;

>	--==*==--

>{	int *p,a[10];
>	for (p = a+10; p != a; *--p = 0);

>  I feel certain that the second for loop would run at least as fast
>as the first on a PDP-11.  Some optimizers would pick up on the
>fact that the first solution wasn't that great and would output
>something more closely resembling the second for loop.

	As a side point, this example illustrates a common deficiency
in programming languages -- they force the user to specify too much.
What is wanted in this case is really something like

	for i in {0,...,n-1} a[i] = 0;

or
	for all i in range of a, a[i] = 0;
or even
	a = 0;

	The point is that the quoted forms specify too much, and too
little.  They specify too much in that they specify the details of the
calculation, and too little in that the desired result is not explicit,
but must be inferred from the results of the calculation.  I would think
that a language would be easier to program in to the extent that you
can specify what you want the result to be directly.  The best
programming language only has one instruction:

	"Do what I want"
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

chris@mimsy.UUCP (Chris Torek) (06/06/88)

>In article <1818@hubcap.UUCP> mrspock@hubcap.UUCP (Steve Benz) gives
two code fragments:
>>	int i,a[10];
>>	for (i = 0; i < 10; i++) a[i]=0;
and
>>	int *p,a[10];
>>	for (p = a+10; p != a; *--p = 0);

In article <29190@cca.CCA.COM> g-rh@cca.CCA.COM (Richard Harter) writes:
>What is wanted in this case is really something like ... or even
>	a = 0;
>... the quoted forms specify too much, and too little.  They specify
>too much in that they specify the details of the calculation, and too
>little in that the desired result is not explicit, but must be inferred
>from the results of the calculation.

That is, they say `how' (at some level) and not `what'.

>... The best programming language only has one instruction:
>	"Do what I want"

This is, alas, unrealistic.  We have enough trouble telling
(supposedly :-) ) intelligent people what we want done!

Something that would be interesting would be a semi-automatic system to
turn `how' into `what'.  Suppose you had a system language, where
you might feed in the original loop:

	for (i = 0; i < 10; i++) a[i] = 0;

and the system would ask, `Did you really mean set a[0] to 0, then
a[1], etc., or did you just mean to set all of a to zero?'  If you
answered the latter, it might `create' a construct:

	concept <concept-tag> /* set all of a to zero */;

then ask for a name for the concept and perhaps a parameterisation
(always `a', or any array? always all? etc.).

Clearly this sort of thing, being rather AIish (an optimisation expert
system---it is at least conceivable), is a bit beyond the state of the
art in conventional compilers.  But it would be nice to somehow get rid
of the notion of having to find vector-able FORTRAN loops, by replacing
them with the intended operation.  While the proposed FORTRAN 8X
standard gives some hope towards this, I would like to see something
more ambitious done.  (Not that I am going to do it! :-) )
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

faustus@ic.Berkeley.EDU (Wayne A. Christopher) (06/07/88)

I have a question for the APL users out there.  I'm used to programming
with C pointers, which make it easy to build any sort of strange data
structure you want.  As far as I can tell, APL only gives you arrays
(very powerful arrays, of course).  How easy is it to use data structures
like trees, queues, and lists in APL?  Is it necessary to simulate them
with arrays as you would in Fortran, or are there other ways to get
around the problem?

	Wayne

evan@cunixc.columbia.edu (Evan Bigall) (06/07/88)

It's not easy; in fact, it's a bloody nuisance.  Sometimes it helps to
remember that pointers are just an index into a big array (all of memory),
but not really.  But you have to remember that the lack of pointers is a
feature, not a bug.  It removes the problem of aliasing, allowing for much
better data flow/dependency analysis.  I would much rather see records
added to APL than pointers.  Typically things like trees are implemented
by indexing into tables.  It makes for very clumsy and slow code, but hey,
this is APL!  It wasn't meant to do things like that.
Evan
-- 
APL is a mistake, carried through to perfection.  It is the language of the
future for the programming techniques of the past.    -	Edsger Dijkstra

guido@cwi.nl (Guido van Rossum) (06/08/88)

In article [...] evan@cunixc.columbia.edu (Evan Bigall) writes:
>But, you have to remember that the lack of pointers is a feature not
>a bug.  It removes the problem of aliasing allowing for a much better 
>data flow/dependency analysis.  I would much rather see records added to APL
>than pointers.  Typically things like trees are implemented by indexing into
>tables.  It makes for very clumsy and slow code, but hey this is APL!  It
>wasnt meant to do things like that.

Remember that trees are often used as a way to implement other
abstractions: associative arrays, sorted sets with easy insert/
retrieve/delete operations, etc.  It might be more appropriate to
add such "higher-level" data types to a language than pointers.
--
Guido van Rossum, Centre for Mathematics and Computer Science (CWI), Amsterdam
guido@piring.cwi.nl or mcvax!piring!guido or guido%piring.cwi.nl@uunet.uu.net

faustus@ic.Berkeley.EDU (Wayne A. Christopher) (06/10/88)

In article <350@piring.cwi.nl>, guido@cwi.nl (Guido van Rossum) writes:
> Remember that trees are often used as a way to implement other
> abstractions: associative arrays, sorted sets with easy insert/
> retrieve/delete operations, etc.  It might be more appropriate to
> add such "higher-level" data types to a language than pointers.

The way you would add such data types to a language is to build them out
of the primitives available, which are pointers (in the case of C), or
arrays (in the case of APL).  Actually adding them as primitives would
be a very bad idea -- you could never add enough to make everybody happy.

The sorts of things I use pointers for that I couldn't use arrays for
are parse trees and graphs (both directed and undirected).  Often there
is no "higher-level concept" I am trying to express.  Has anybody written
code to deal with these things in APL?  Is it as ugly as it seems?

	Wayne

guido@cwi.nl (Guido van Rossum) (06/10/88)

In article <3880@pasteur.Berkeley.Edu> faustus@ic.Berkeley.EDU (Wayne A. Christopher) writes:
>The way you would add such data types to a language is to build them out
>of the primitives available, which are pointers (in the case of C), or
>arrays (in the case of APL).  Actually adding them as primitives would
>be a very bad idea -- you could never add enough to make everybody happy.

Not true.  ABC, a language I have helped implement, has no pointers but
provides very general associative arrays (you can use *any type* as the
key type for an array, and of course arrays are types) and sorted
multisets ("lists").  These two primitive data types make up for many
uses of pointers.  True, for things like graphs you have to implement
them using arrays, but now you can choose meaningful values as the
indices instead of assigning numbers 1, 2, 3, ...; e.g., in a directed
graph, you can use the label of the destination node to represent an
edge.
--
Guido van Rossum, Centre for Mathematics and Computer Science (CWI), Amsterdam
guido@piring.cwi.nl or mcvax!piring!guido or guido%piring.cwi.nl@uunet.uu.net

steven@cwi.nl (Steven Pemberton) (06/10/88)

In article <350@piring.cwi.nl>, guido@cwi.nl (Guido van Rossum) wrote:
> Remember that trees are often used as a way to implement other
> abstractions: associative arrays, sorted sets with easy insert/
> retrieve/delete operations, etc.  It might be more appropriate to
> add such "higher-level" data types to a language than pointers.

In article <3880@pasteur.Berkeley.Edu> faustus@ic.Berkeley.EDU (Wayne
A. Christopher) replied:
> The way you would add such data types to a language is to build them out
> of the primitives available, which are pointers (in the case of C), or
> arrays (in the case of APL).  Actually adding them as primitives would
> be a very bad idea -- you could never add enough to make everybody happy.

Actually, from the point of view of what's in the Subject: line - what
makes a language easy to program in - our experience here is that
there's a *big* win in making the high-level data-types the
'primitive' types of your language.  You hardly ever want the
low-level types themselves, except as a tool for creating the higher
ones.  On the occasions that you really need the low-level types,
you can use the high-level ones to simulate them.

You don't need to supply *all* the data-types, only the best set to
implement the others that you need.

In designing the type system for ABC, Lambert Meertens did something
like the following (I imagine Lambert reads this group, so if I get
it wrong, I'm sure he'll put me right):

- enumerate all the data-types you can think of: stacks, trees, lists,
  bags, sets, graphs, ...

- to each data-type assign an importance weight, based on experience,
  intuition, etc. For instance, sets are more important than deques.

- for each type, design a set of implementation schemes, based on
  other data-types. For instance, trees implemented as a special case
  of graphs. Associate with these usage efforts, based on how often
  you use the type, and how difficult it is to define in terms of the
  other types. 

- Write a program that takes as primitive types all subsets of the set
  of types, and computes the implementation complexity of the whole
  set of types, where the predefined types have complexity 1, and the
  other types have complexities calculated from the implementation
  schemes. 

- Scatter plot the results, with usage effort * importance on the one
  axis, and the total complexity on the other.

- Then consider all points close to the axes as candidates for the set
  of primitive types for your language. Select according to taste.

For ABC we came up with 5 high-level primitive types: strings,
numbers, compounds (fixed-size tuples), bags, tables.

> The sorts of things I use pointers for that I couldn't use arrays for
> are parse trees and graphs (both directed and undirected).  Often there
> is no "higher-level concept" I am trying to express.  Has anybody written
> code to deal with these things in APL?  Is it as ugly as it seems?

But 'parse-trees' and 'graphs' *are* the higher level concept.

Let me give an example, and use graphs as that example.
ABC has tables as a primitive type. These are generalised arrays: the
keys may be of *any* one type (including other tables), and so may the
elements. ABC also has bags (multisets).

To implement a graph, you use a table, mapping node names to bags of
node names, which represent the outward going arrows. E.g.
	graph[1] = {2; 3}
	graph[2] = {1; 3}
	graph[3] = {3}

As another example, define sets in terms of bags.  Here is set union
and difference:

HOW TO RETURN set1 union set2:
    FOR elem IN set1:
        IF elem not.in set2:
            INSERT elem IN set2
    RETURN set2

HOW TO RETURN set1 diff set2:
    FOR elem IN set2:
        IF elem IN set1:
            REMOVE elem FROM set1
    RETURN set1

Using these two definitions, I can now define garbage collection
(finding the minimum connected subgraph from a given node) in a dozen
lines:

HOW TO RETURN graph reachable.from root:
    PUT {root}, {} IN to.do, new
    WHILE to.do > {}:
        PUT min to.do IN node
        REMOVE node FROM to.do
        TREAT NODE
    RETURN new
TREAT NODE:
    IF node IN keys graph:
        PUT graph[node] IN new[node]
        PUT to.do union new.nodes IN to.do
new.nodes: RETURN new[node] diff (keys new)

More info on ABC? Send me mail.

One day I'll force Lambert to write an article on the type choosing
algorithm. I'll threaten to delete half his files from our overflowing
disk, or something like that :-)

Steven Pemberton, CWI, Amsterdam; steven@cwi.nl or try mcvax!steven.uucp

budd@mist.cs.orst.edu (Tim Budd) (06/10/88)

In article <3880@pasteur.Berkeley.Edu> faustus@ic.Berkeley.EDU (Wayne A. Christopher) writes:
>In article <350@piring.cwi.nl>, guido@cwi.nl (Guido van Rossum) writes:
>> Remember that trees are often used as a way to implement other
>> abstractions: associative arrays, sorted sets with easy insert/
>> retrieve/delete operations, etc.  It might be more appropriate to
>> add such "higher-level" data types to a language than pointers.
>
>The way you would add such data types to a language is to build them out
>of the primitives available, which are pointers (in the case of C), or
>arrays (in the case of APL).  Actually adding them as primitives would
>be a very bad idea -- you could never add enough to make everybody happy.
>
>The sorts of things I use pointers for that I couldn't use arrays for
>are parse trees and graphs (both directed and undirected).

In this particular case, however, I believe Guido speaks from personal
experience.  The language project he was formerly associated with, the ABC
language, developed by Lambert Meertens at the CWI, is designed for novice
users (a replacement for BASIC and the like).  It has exactly five data
types: numbers, texts (strings), compounds (records without field names), 
lists (which are sorted), and tables (associative arrays).  This seems to
be a fairly complete and useful set, particularly for things like graphs
and parse trees.  (I once wrote the LALR sets-of-states construction
algorithm in ABC; it was about two pages of code).

I've never been able to figure out why we think in computer science that
students should have to build their tools before they use them.  Sort of
like having to use an anvil and forge to build your hammers and
screwdrivers before you can learn carpentry.  Now really; wouldn't it make
much more sense to give people a few useful tools to start out with, and
let them get down to solving their problems that much sooner?

--tim budd, oregon state university  (budd@cs.orst.edu)

markv@uoregon.uoregon.edu (Mark VandeWettering) (06/11/88)

In article <3880@pasteur.Berkeley.Edu> faustus@ic.Berkeley.EDU (Wayne A. Christopher) writes:

>The way you would add such data types to a language is to build them out
>of the primitives available, which are pointers (in the case of C), or
>arrays (in the case of APL).  Actually adding them as primitives would
>be a very bad idea -- you could never add enough to make everybody happy.

	The plus and minus side of building complex data items into
	the language are:

	+	You don't have to code them (unless you are the language
		designer).
	+	They then form a standard reusable part of the language.
	-	You don't know how they are implemented, so they might
		be inefficient.  At the very least, you don't know the
		cost of individual primitives.
	
	The Icon programming language supplies sets, tables and lists
	as data types, and seems to provide acceptable performance (for
	an interpreter).  If optimum performance is not an issue, it can
	provide a convenient platform to do many kinds of programming
	for which C would be awkward or expensive.

>The sorts of things I use pointers for that I couldn't use arrays for
>are parse trees and graphs (both directed and undirected).  Often there
>is no "higher-level concept" I am trying to express.  Has anybody written
>code to deal with these things in APL?  Is it as ugly as it seems?
	
	I don't exactly see your point here, but there is definitely a
	higher-level concept in parsing.  And chances are a language
	with list or tree datatypes would allow you to express it more
	clearly (if less efficiently) than C.  As for APL, well, if you
	are bound and determined, I am sure you could do parsing, but
	why do it in a language poorly suited for the job?

>	Wayne

mark vandewettering

smryan@garth.UUCP (Steven Ryan) (06/13/88)

Others have already equated our free use of pointers and disdain of a small
set of data structures for the sake of efficiency to the free use of gotos.

Old programmers never die, they're just following a dangling reference.

simont@praxis.co.uk (Simon Tait) (06/13/88)

In article <11829@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>
>Something that would be interesting would be a semi-automatic system to
>turn `how' into `what'.  Suppose you had an system language, where
>you might feed in the original loop:
>
>	for (i = 0; i < 10; i++) a[i] = 0;
>
>and the system would ask, `Did you really mean set a[0] to 0, then
>a[1], etc., or did you just mean to set all of a to zero?'  If you
>answered the latter, it might `create' a construct:
>

I think the question 'Did you really mean...' is irrelevant.
The statement 'for...' works equally well as a specification,
whether the generated code includes parallel or sequential operations.
I would expect the system to pick the implementation which works out
quickest on average (:-)

-------------------------------------------------------------------------------
Simon Tait,
ELLA Group,                         Telephone (0225) 444700.  
Praxis Systems plc,                 Telex 445848 PRAXIS G.  
20 Manvers Street,                  Facsimile (0225) 65205.
Bath BA11PX,
England.
-------------------------------------------------------------------------------

sommar@enea.se (Erland Sommarskog) (06/14/88)

Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
>	As a side point, this example illustrates a common deficiency
>in programming languages -- they force the user to specify too much.
>What is wanted in this case is really something like
>	for i in {0,...,n-1} a[i] = 0;
>or
>	for all i in range of a, a[i] = 0;
>or even
>	a = 0;
>	The point is that the quoted forms specify too much, and too
>little.  They specify too much in that they specify the details of the
>calculation, and too little in that the desired result is not explicit,

Spot on! This concrete example gets better in Ada:
       FOR i IN a'range LOOP a(i) := 0; END LOOP;
Or better:
       a := (OTHERS => 0);
The ultimate winner is VAX-Pascal, though:
       a := zero;
The predefined function zero can be used to clear any array or record. 
  It should be added that although Ada helps you in many cases, it is 
too verbose for the purposes desired here. (VAX-Pascal, monster of a  
Pascal though it may be, is insufficient on many other occasions.)

>The best programming language only has one instruction:
>
>	"Do what I want"

Here's the pseudo-code for that compiler:
   Read(Line);
   Trim_and_compress(Line);
   IF Line = "Do what I want" THEN
      Write("But what do you want?");
   ELSE
      Write("Syntax error");
   END IF;

-- 
Erland Sommarskog           
ENEA Data, Stockholm        
sommar@enea.UUCP            
Mail your NO votes for rec.music.rock to: jfc%Athena.mit.edu@mit-eddie.UUCP

ubiquity@cs.utexas.edu (Richard Hoffman) (06/16/88)

In article <3496@enea.se>, sommar@enea.se (Erland Sommarskog) writes:
> Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
> >	As a side point, this example illustrates a common deficiency
> >in programming languages -- they force the user to specify too much.
> >What is wanted in this case is really something like
> >	for i in {0,...,n-1} a[i] = 0;
> >or even
> >	a = 0;
> 
> The ultimate winner is VAX-Pascal, though:
>        a := zero;
> The predefined function zero can be used to clear any array or record. 

In APL, one has (sort of): A<-(pA)p0
(Please read "<-" as a single character, "p" as a rho, and tilt your
head a little when you look at the "A"s :-) ).

And the much-maligned PL/I wins with: A=0;

Keystrokes: PL/I = 4, APL = 8, Pascal = 8 (not counting spaces).  But of course,
in APL no previous declaration of "A" is necessary.

> >The best programming language only has one instruction:
> >
> >	"Do what I want"

Please don't give a copy of the compiler of that language to the
Pentagon.
-- 
Richard Hoffman / 5751 Valkeith / Houston, TX 77096 / (713) 729-5716
  +- / 12166 Metric Blvd., #122 / Austin,  TX 78757 / (512) 823-1822

"Malt does more than Milton can / To justify God's ways to Man." -- ??

budd@mist.cs.orst.edu (Tim Budd) (06/18/88)

No, I'm not sure you want a ``do what I want'' command.

The following story is true; I've just forgotten some of the less important
details.  There was once a Lisp system that had something called
DWIM (Do What I Mean).  If you typed an expression that didn't make
sense, it would try various techniques to see if something close to it made
sense, and do that.  Now a friend of mine was using this system and kept
getting amazingly slow programs.  It turned out he was writing things like
(CAR ...) when the system wanted (car ...).  It would fail to recognize CAR,
go through some analysis, discover that a probable meaning was car, and then
do it.  The problem was that there was no feedback -- no indication he was
doing anything wrong; it produced the right answer, just slowly.  So there
are dangers in ``do what I want'' systems even if (and this is a big if)
they can figure out exactly what it is that you want.

karl@haddock.ISC.COM (Karl Heuer) (06/20/88)

In article <5158@orstcs.CS.ORST.EDU> budd@mist.UUCP (Tim Budd) writes:
>There was a Lisp system once that had something called DWIM (do what I mean).
>... Problem was there was no feedback

Indeed.  I have no objection to languages that warn about questionable things
before continuing with a best guess.  I dislike FORTRAN's implicit declaration
rules (modern implementations have a way to shut them off), and C's acceptance
of array notation when declaring a formal parameter.  And that silly rule in
BASIC that all undeclared arrays have size 10...

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
________
Quaxity quuxity,
Teitelman's InterLISP
Has a DWIM feature that's
Really a screw;
MacLISP has evident
Superiority,
Letting its customer
Mean what he do.
  --The Great Quux

pardo@june.cs.washington.edu (David Keppel) (06/22/88)

In article <5158@orstcs.CS.ORST.EDU> budd@mist.UUCP (Tim Budd) writes:
[ DWIM without warning ]

I believe that more recent versions of [Name Deleted] have warnings.
I also believe that yet more recent versions drop the capability
altogether; I think this is too bad, as I've found the interactive
"Gee, did you really mean this?" quite useful.

	;-D on  ( I wrecked my CAR, now I walk with a LISP )  Pardo