[comp.lang.icon] Survey Results : Perl vs Icon vs ....

bevan@cs.man.ac.uk (Stephen J Bevan) (03/30/91)

[Note I've crossposted to all the groups I sent my original message
to.  This was at the request of some of the respondents (sp?)]

Here are the results of my question regarding which language to use for
writing programs to extract information from files, generate reports
... etc.  I initially suggested languages like Perl, Icon, Python ...

As part of my original message I said :-

> Rather than FTP all of them and wade through the documentation, I was
> wondering if anybody has experiences with them that they'd like to
> share?

I would like to thank the following people for replying :-

Dan Bernstein - brnstnd@kramden.acf.nyu.edu 
Tom Christiansen - tchrist@convex.COM
Chris Eich - chrise@hpnmdla.hp.com
Richard L. Goerwitz - goer@midway.uchicago.edu
Clinton Jeffery - cjeffery@cs.arizona.edu
Guido van Rossum - guido@cwi.nl
Randal L. Schwartz - merlyn@iWarp.intel.com
Peter da Silva - peter@ficc.ferranti.com
Alan Thew - QQ11@LIVERPOOL.AC.UK
Edward Vielmetti - emv@ox.com
?? - russell@ccu1.aukuni.ac.nz

Most of the replies were about Perl, so I didn't learn much about the
other languages I suggested (other than very general things).
Even though I was originally hoping not to have to ftp any stuff, I
ended up getting the source to Python, GAWK, TCL, Icon and the texinfo
manual for Perl.

To save you going through my list of good and bad points of the
languages I looked at, here is the summary of what I see the languages
as :-

TCL - an embedded language i.e. an extension language for large
      programs (IMHO only if you haven't got, or don't
      like, Scheme based ones like ELK).
Perl - the de facto UNIX scripting language.  You name it, and you
       can probably cobble a solution together in Perl.
       Beyond the fact that a lot of people use it, I can see nothing
       to recommend it.  It's a bit like C in that respect.
Python - Good prototyping language with a consistent design.  It might
         not have all the low level UNIX stuff built in, but by using 
         modules, it's easy to add the necessary things in an ordered way.
Icon - the `nearly' language.  Well designed language, that never seemed
       to make it into general use.  Seems to cover the ground all the
       way from AWK type applications to Prolog/Lisp ones.
       If I wasn't already happy with Scheme, I'd use this for more
       general programming.
       I would recommend people at least look at this language.
GAWK - simple scripting language.  Definitely better than `old' awk.
       I would only use it if the job were really simple or if
       something like Python or TCL were not available.

Note I wouldn't expect anybody to make a choice on what I say.  I
suggest you get the source/manuals yourself and have a good long look
at the language/implementation before you decide.

For the types of things _I_ want to do, it would be a tie between Icon
and Python.  Having said that, given that I'd have to extend both to
cover the sort of things I want to do, I'll probably use Scheme
instead (ELK in particular).  The reason I didn't just use Scheme in
the first place is that I was hoping one of the languages would have
all the facilities I want without me having to extend them myself.

Before the summary of the languages themselves, I thought I'd try and
list some of the things I was looking for.  (Actually, I showed an
earlier version of this summary to somebody and they didn't understand
some of the terms I was using, so this is my attempt at an
explanation).  Note that most of the things are to do with structuring
the code and the like.  This is not the sort of thing you usually worry
about when writing small scripts, but I plan to convert and write a
number of tools, some of which are around the 1000 LOC mark.  For
example, I'd like to convert a particular lex/yacc/C program I have
into the chosen language.

You can skip ahead to the actual summary by searching for SUMMARY.
(Well I can do this in GNUS, I don't know about other news readers
like rn)

Packages/Modules
----------------
These are a mechanism for splitting up the name space so that function
name clashes are reduced.  Most systems work by declaring a package
and then all functions listed from then on are members of that
package.  You then access the functions using the package prefix, or
import the whole package so that you don't have to use the prefix.
The following is an example in CommonLisp :-

;;; foo.lsp                     ;;; bar.lsp
(in-package 'foo)               (in-package 'bar)
(export '(bob))                 (export '(bob))
(defun bob (a b) ...)           (defun bob (x) ...)

;;; main.lsp
(foo:bob 10 20)
(bar:bob 3)

Packages are not perfect, but they do help.  You can get the same
effect by declaring implicit package prefixes :-

;;; foo.lsp                     ;;; bar.lsp
(defun foo-bob (a b) ...)       (defun bar-bob (x) ...)

;;; main.lsp
(foo-bob 10 30)
(bar-bob 4)

The advantage of packages over this is that you don't have to use a
package prefix in the package itself when you want to call a function.
This can be a saving if you have lots of functions in a package, and
only a few are exported.
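
For comparison, here is a rough sketch of the same idea in present-day
Python (not the 0.9.1 release discussed in this post).  The modules `foo'
and `bar' are built in one file only to keep the example self-contained;
they and their `bob' functions are made-up stand-ins for two separate
source files.

```python
import types

# Hypothetical stand-ins for two files, foo.py and bar.py.
foo = types.ModuleType("foo")
bar = types.ModuleType("bar")


def _foo_bob(a, b):
    # foo's idea of bob: add two numbers
    return a + b


def _bar_bob(x):
    # bar's idea of bob: double one number
    return x * 2


# Both modules export a function under the same name, bob.
foo.bob = _foo_bob
bar.bob = _bar_bob

# The module prefix picks the namespace, much as foo:bob and bar:bob do.
print(foo.bob(10, 20))
print(bar.bob(3))
```

The two `bob' functions never clash because each lives in its own module
namespace.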

Exception Handling
------------------
This is useful for dealing with errors that shouldn't happen, e.g.
reaching the end of the file when you were looking for some valid
data.  For example, in CommonLisp :-

(defun foo (x y)
  ...
  (if (catch 'some-unexpected-error (bar x y) nil)
      (handle-the-exception ...)))

(defun bar (a b)
  ...
  (if (something-wrong) (throw 'some-unexpected-error t))
  ...)

Here the function `foo' calls `bar', and if any error occurs whilst
processing, it is handled by the exception handler.  (The example is a
bit primitive as I'm trying to save space).

The advantage of this is that you don't have to explicitly pass back
all sorts of error codes from your functions to handle unusual errors.
It also usually means you won't have so many nested `if's to handle
the special cases, therefore, making your code clearer.
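
The same shape can be sketched in present-day Python.  This is only an
illustration: `UnexpectedError' and the zero check are hypothetical
stand-ins for `some-unexpected-error' and `(something-wrong)'.

```python
class UnexpectedError(Exception):
    """Hypothetical error type, standing in for 'some-unexpected-error."""


def bar(a, b):
    if b == 0:  # stand-in for (something-wrong)
        raise UnexpectedError("b must not be zero")
    return a / b


def foo(x, y):
    try:
        return bar(x, y)
    except UnexpectedError:
        # stand-in for (handle-the-exception ...)
        return None
```

Note that `bar' returns no error code and `foo' has no nested `if's; the
unusual case is handled in one place.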

Records/Tuples/Aggregates/Structs
---------------------------------
It's handy to be able to define objects that contain a certain number of
elements.  You can then pass these objects around and access the
individual bits.  For example in CommonLisp :-

(defstruct point x y)

This declares `point' as a type containing two items called `x' and
`y'.  Some languages don't name the items, they rely on position
instead.  I see these as equivalent (assuming you have some sort of
pattern matching)
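
As a hedged illustration of why named and positional access can be seen
as equivalent, here is a small sketch in present-day Python using
`collections.namedtuple'.  The name `Point' mirrors the defstruct above;
the values are made up.

```python
from collections import namedtuple

# A record type with two named items, like (defstruct point x y).
Point = namedtuple("Point", ["x", "y"])

p = Point(x=3, y=4)

# Access by name ...
by_name = p.x + p.y

# ... or by position, using tuple unpacking as a simple form of
# pattern matching.
x, y = p
by_position = x + y
```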

Provide/Require
---------------
This is a primitive facility for declaring that one package depends on
another one.  For example in CommonLisp :-

;;; foo.lsp
(defun bob (a b) ...)
(provide 'foo)

;;; main.lsp
(require 'foo)
(bob 10 3)

The above declares that the file `foo' provides the function `bob' and
that the file `main' requires `foo' to be loaded for it to work.
So when you load in `main' and `foo' hasn't been loaded, it is
automatically loaded by the system.
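
A minimal sketch of the same behaviour in present-day Python, assuming
the standard `importlib' and `sys.modules' machinery.  The helper name
`require' is hypothetical and chosen only to mirror the Lisp form;
`colorsys' is just a small standard module used as the thing to load.

```python
import importlib
import sys


def require(name):
    """Return module `name`, loading it only if it is not loaded yet."""
    if name in sys.modules:  # already provided
        return sys.modules[name]
    return importlib.import_module(name)


first = require("colorsys")   # loaded if it was not loaded before
second = require("colorsys")  # found in sys.modules, not reloaded
same_object = first is second
```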

C Interface
-----------
How easy is it to call C from the language?
Is there a dynamic loading facility i.e. do I have to recompile the
program to use some arbitrary C code, or can it load in a .o file at
runtime?

Arbitrary Restrictions
----------------------
This really applies to the implementations rather than the languages.
However, as there is only one implementation for most of the languages
I'm looking at, they tend to be synonymous.

If there is one thing I hate about [an implementation of] a language,
it's arbitrary restrictions.  For example, `the length of the input
line must not exceed 80 characters', or `strings must be less than 255
characters long'.  I can accept some initial restrictions if :-

1) they are documented.
2) they will be removed in future versions.

Note. I realise that some restrictions are not arbitrary, or at least
not under the control of the language implementor e.g. the number of
open files under UNIX.

SUMMARY
-------
If you want to know more about the languages, there follows a brief
description of the languages, how to get an implementation and some
good and bad points as I see them.  Each point is preceded by a
character indicating the type of point :-

    +  good point
    -  bad point
    *  just a point to note
    !  subjective point

Other than the `*' items, I guess it is all subjective, however, I've
tried to put things that are generally good/bad in `+'/`-' and limit
really subjective statements to `!'.

                   TCL - version 4.0 patch level 1
                   -------------------------------

TCL (Tool Command Language) was developed by John Ousterhout at Berkeley.
It started out as a small language that could be embedded in
applications.  It has now been extended by some people at hackercorp
into more of a general purpose shell type programming language.
It is described by Peter Da Silva (one of the people who extended it)
as :-

> TCL is like a text-oriented Lisp, but lets you write algebraic
> expressions for simplicity and to avoid scaring people away.

The language itself for some reason reminds me of csh even though I
can only point to two things (the use of `set' and `$') which are
definitely like csh.

Unless you have other ideas about what an extension language should
look like (e.g. IMO it should be Scheme), then I'd definitely
recommend this.  It's small, and integrates easily with other C
programs (you can even have multiple TCL interpreters in an
application!)

Version 5.0 is available by anonymous ftp from sprite.berkeley.edu as
tk.tar.Z (it's part of an X toolkit called Tk).  Note, although it has
a higher number than the one above, it does not include the extensions
mentioned above.  These will apparently be integrated soon.

Version 4.0 pl1 is available by anonymous ftp from
media-lab.ai.mit.edu (sorry can't remember the exact path)

+  exceptions.
+  packages, called libraries
   However there is only one name-space.  The libraries are used as a
   way of storing single versions of code rather than as a solution to
   the name space pollution problem.
+  provide/require
+  C interface is excellent.  You can easily go TCL->C and C->TCL.
-  No dynamic loading ability that I'm aware of.
-  Arbitrary line length limit on `gets' and `scan'. i.e. the commands
   that read lines from files/strings.  I would guess this will go
   away in the next version.
-  No records.  The main data types are strings/lists/associative arrays
+  extensive test suite included.
!  doesn't look to have been tested on many systems.  The above
   version actually failed to link on a SPARCstation running SunOS 4.1
   as the source refers to `strerror'.  This has apparently been fixed
   in patch level 2.
+  lots of example code included in distribution.
+  extensive documentation (all in nroff)
+  Can trace execution.
!  To make arguments evaluate, you must enclose them in {} or []
   This shouldn't be a problem, except that being used to Lisp like
   languages I expect to quote constants.
!  The extensions though useful, are not seamless. e.g. some string
   facilities are in the core language and some in the extensions.
   This might happen when the hackercorp extensions are officially
   merged with the Berkeley core language and released by Berkeley.
+  As part of the extensions, you get tclsh.  This is a shell which you
   can type commands directly into.
+  scan contexts.  This is sort of regular expressions on files rather
   than strings.

                        Python - version 0.9.1
                        ----------------------

Available by anonymous ftp from wuarchive.wustl.edu as
pub/python0.9.1.tar.Z or for Europeans via the info server at
hp4nl.nluug.nl

I couldn't think of a good way to describe this, so I'm blatantly
copying the following from the Python tutorial :-

    Python is a simple, yet powerful programming language that bridges
    the gap between C and shell programming, and is thus ideally
    suited for rapid prototyping.  Its syntax is put together from
    constructs borrowed from a variety of other languages; most
    prominent are influences from ABC, C, Modula-3 and Icon

So far so good, here's some more from the tutorial :-

    Because of its more general data types Python is applicable to a
    much larger problem domain than Awk or even Perl, yet most simple
    things are at least as easy in Python as in those languages.

i.e. Python seems to be designed for larger tasks than you would
undertake using the shell/awk/perl.

+  packages.
+  exceptions (based on Modula 2/3 modules)
+  records (actually tuples.  I'm not sure they do everything I want
   as the documentation is a bit vague in this area)
   Other main types are lists, sets, tables (associative arrays)
+  C interface is good.  No dynamic linking that I am aware of.
-  Arbitrary Restrictions
   line length limit on readline.
   This has been fixed and I would guess will appear in the next release.
+  lots of example python programs included.
   There is even a TCL (version 2ish) interpreter!
+  Object oriented features.
   Based on Modula 3 i.e. classes with methods, all of which are
   virtual (to use a C++ term).
*  any uncaught errors produce a stack trace.
+  disassembler included
+  can inspect stack frames via traceback module
-  no single step or breakpoint facility
   (maybe in the next release)
+  functions can return multiple values.
*  The default output command `print' inserts a space between each
   field output.
!  I don't like the above, or rather I would like the option of not
   having it done.
*  Documentation includes tutorial and library reference as TeX files.
   Both are incomplete, but there is enough in them to be able to
   write Python code.  The reference manual is not yet finished, and
   is not currently distributed with the source.
+  Python mode for Emacs.
   (It's primitive, but it's a start)

                           Icon - version 8
                           ----------------

To quote from one of the Icon books :-

    Icon is a high-level, general purpose programming language that
    contains many features for processing nonnumeric data,
    particularly for textual material consisting of strings of
    characters.

Available :-
In USA :- ??, consult `archie'.
In UK :-  I picked up a copy from the sources archive at Imperial College.
          The JANET address is 00000510200001

-  no packages.  Everything is in one namespace.  However ...
-  no exceptions.
+  Object oriented features.
   An extension to the language called Idol is included.
   This converts Idol into standard Icon.
   Idol itself looks (to me) like Smalltalk.
+  has records.  Other types include :- sets, lists, strings, tables
+  unlimited line length when reading
   (Note. the newline is discarded)
!  The only language that has enough facilities to be able to re-write
   some of my Lex/Yacc code.
+  stack trace on error.
+  C interface is good.  Can extend the language by building `personal
   interpreter'.  No dynamic linking.
+  extensive documentation
   9 technical reports in all (PostScript and ASCII)
-  Unix interface is quite primitive.
   If you just want to use a command, you can use `callout', anything
   more complicated requires building a personal interpreter (not as
   difficult as it may sound)
+  extensive test suite
+  Usenet group exists specifically for it - comp.lang.icon
-  Unless you use Idol, all procedures are at the same level
   i.e. one scope.
-  regular expressions not supported.
   However, in many cases, you can use the Icon functions `find',
   `match', `many' and `upto' instead.
+  Can trace execution.
*  Pascal/C like syntax
   i.e. uses {} but has a few more keywords than C.
+  lots of example programs included.
+  can define your own iterators
   i.e. your own procedures for iterating through arbitrary structures.
+  co-expressions.  Powerful tool, hard to explain briefly.  See
   chapter 13 of the Icon Programming Language.
-  co-expressions haven't been implemented on Sun 4s (the type of
   machine I use)
+  has an `initial' section in procedures that is only ever executed
   once and allows you to initialise C like static variables with the
   result of other functions (unlike C).
+  arbitrary precision integers.
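
The `define your own iterators' point above can be illustrated by
analogy.  The following is present-day Python, not Icon, and the function
name `leaves' is made up; it only sketches what a user-defined procedure
for iterating through an arbitrary structure looks like.

```python
def leaves(tree):
    """Yield every non-list item of a nested list, left to right."""
    for item in tree:
        if isinstance(item, list):
            # Descend into the sub-structure and pass its items along.
            yield from leaves(item)
        else:
            yield item


# The caller iterates without knowing how the structure is nested.
flat = list(leaves([1, [2, [3, 4]], 5]))
```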

As well as the excellent documentation included in the source, there
are two books on Icon available (I skimmed through both of them) :-

    The Icon Programming Language
    Ralph E. Griswold and Madge T. Griswold
    Prentice Hall 1983

    The Implementation of the Icon Programming Language
    Ralph E. Griswold and Madge T. Griswold
    Princeton University Press 1986

The second one is particularly useful if you are considering
extending Icon yourself.  Appendix E of this book also contains a list
of projects that could be undertaken to extend and improve Icon.

Here are some projects, that if implemented, would greatly improve the
usefulness of Icon :-

E.2.4 Add a regular expression data type.  Modify the functions find
      and match to operate appropriately when their first argument is a
      regular expression.

E.2.5 \  All of these suggest extending
E.5.4  | the string scanning facilities to
E.5.5 /  cope with files and strings in a uniform way.

E.12.1 Provide a way to load functions (written in C) at runtime


                                 Perl
                                 ----
Available :-
USA :- ??, consult `archie'
UK :- Imperial sources archive

I received more responses about Perl than anything else, so I assume that
most people already know a lot about the language.

Here are some edited highlights from a message I received from Tom
Christiansen :-

First some good words from Tom :-

> ... I shall now reveal my true colors as perl disciple
> and perhaps not infrequent evangelist.  Perl is without question the
> greatest single program to appear to the UNIX community (although it runs
> elsewhere too) in the last 10 years.  It makes programming fun again.  It's
> simple enough to get a quick start on, but rich enough for some very
> complex tasks.

> ... perl is a strict superset of sed and awk, so much so that s2p and
> a2p translators exist for these utilities.  You can do anything in
> perl that you can do in the shell, although perl is not strictly
> speaking a command interpreter.  It's more of a programming language.

and now some of the low points of Perl.  [Note this is only a small
part of a long post, that explained a lot of good things about Perl.
As most people seem to use/like Perl, I thought I'd highlight some of
the things wrong with the language, and what better place to get
information than from the designer of the language.  Note also that
this is from a message dated June 90, so some of it may be out of date.]

Larry Wall :-

> The basic problem with Perl is that it's not about complex data structures.
> Just as spreadsheet programs take a single data structure and try to
> cram the whole world into it, so too Perl takes a few simple data structures
> and drives them into the ground.  This is both a strength and a weakness,
> depending on the complexity and structure of the problem.
> 
> The basic underlying fault of Perl is that there isn't a real good way
> of building composite structures, or to make one variable refer to a piece
> of another variable, without giving an operational definition of it.
> 
> ...  In a sense, the problem with Perl is not that it is too
> complicated or hard to learn, but that perhaps it is not expressive
> enough for the effort you put into learning it.  Then again, maybe it
> is.  Your call.  Some people are excited about Perl because, despite
> its obvious faults, it lets them get creative.
> 
> There are many things I'd do differently if I were designing Perl from
> scratch.  It would probably be a little more object oriented.  Filehandles
> and their associated magical variables would probably be abstract types
> of some sort.  I don't like the way the use of $`, $&, $' and $<digit>
> impact the efficiency of the language.  I'd probably consider some kind
> of copy-on-write semantics like many versions of BASIC use.  The subroutine
> linkage is currently somewhat problematical in how efficiently it can
> be implemented.  And of course there are historical artifacts that wouldn't
> be there.

I think the above is a very fair summary of the low points of the
language.  At one point it says `... perhaps it is not expressive
enough for the effort you put into learning it.  Then again maybe it
is.  Your call'.  Well _my_ call is that it is not.

Note I didn't actually pick up the source to this, just the manual.
Consequently I haven't been able to check all the points listed below.

+  packages.
!  Note in the examples that I've seen in comp.lang.perl, people don't
   seem to use the facility, instead they put everything directly in
   `main' (i.e. the top level scope) rather than in the local scope.
+  exceptions
+  provide/require
*  C Interface ??  I couldn't find this in the documentation I had.
+  No arbitrary restrictions
+  has a source level debugger
+  Well integrated with Unix (nearly all system calls are built in !)
!  However, like Unix, only one name space seems to be used (see above)
*  C like syntax
+  source contains texinfo manual.
   You can always buy the (Camel) book for more information.
-  no records.  Other types lists, strings, tables (associative arrays)
*  some types have distinct scopes.
!  You prefix the name with `@', `$', `%' to indicate which type
   you want.  This is one of the ugliest things I've ever seen.
!  Uses lots of short strings to contain often used things e.g. `$_'
   is the current input, `$.' is current line number.  I guess some
   people must like this, but I prefer names like `input' and
   `line-number' myself.
+  includes programs to convert existing awk, find and sed scripts into
   Perl.
+  Usenet news group - comp.lang.perl
+  Perl mode for Emacs.

				 GAWK
				 ----
Available :- 
USA :- prep.ai.mit.edu, probably other places as well.  Consult `archie'
UK :- Imperial sources archive.

A few points about GNU awk as it seems to fix some of the problems
with `old' awk.

-  no packages
-  no exceptions
-  no C interface 
-  no records
+  allows user defined functions
+  can read and write to arbitrary files
+  much more informative error messages than the old awk.

goer@ellis.uchicago.edu (Richard L. Goerwitz) (04/01/91)

We've seen an excellent summary of Icon's benefits and deficits, and I
think it is a good one (especially considering that the person in
question was only doing an initial survey).  Let me comment on some of
the conclusions reached in efforts to refine them, and ask some
questions of my own.

In <BEVAN.91Mar29162211@panda.cs.man.ac.uk> Stephen J Bevan writes
(regarding Icon) that it has

> -  no packages.  Everything is in one namespace.  However ...

The "however" is for Idol, I gather.  For people who don't want to add
yet another level of indirection to their Icon programs, though, naming
conflicts remain a problem.

>-  no exceptions.

Have you looked at the Icon "error conversion" capability?  Normally,
run-time errors will result in program termination.  You can, however,
turn off this feature, and catch the errors yourself, either passing
them through an exception handler, or else passing them back to the
normal termination routine via runerr().  It's not an elegant system,
since every expression that might normally cause error termination has
to be checked individually.  I wonder if there are plans to expand
this feature.
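
To show why this style gets clumsy, here is a loose analogy in
present-day Python.  It is not Icon and does not use Icon's interface;
`convert_errors' and `last_error' are hypothetical names.  Errors are
turned into a failure value, so each call site has to be checked on its
own.

```python
last_error = None  # hypothetical record of the most recent error


def convert_errors(func, *args):
    """Run func; on a run-time error, record it and return None."""
    global last_error
    try:
        return func(*args)
    except (ArithmeticError, TypeError, ValueError) as exc:
        last_error = exc
        return None


def divide(a, b):
    return a // b


# Every expression that might fail needs its own check.
quotient = convert_errors(divide, 7, 0)
if quotient is None:
    message = "recovered from " + type(last_error).__name__
else:
    message = "ok"
```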

> +  Object oriented features.
>    An extension to the language called Idol is included.
>    This converts Idol into standard Icon.
>    Idol itself looks (to me) like Smalltalk.
> +  has records.  Other types include :- sets, lists, strings, tables
> +  unlimited line length when reading
>    (Note. the newline is discarded)
> !  The only language that has enough facilities to be able to re-write
>    some of my Lex/Yacc code.
> +  stack trace on error.
> +  C interface is good.  Can extend the language by building `personal
>    interpreter'.  No dynamic linking.
> +  extensive documentation
>    9 technical reports in all (PostScript and ASCII)
 
> -  Unix interface is quite primitive.
>    If you just want to use a command, you can use `callout', anything
>    more complicated requires building a personal interpreter (not as
>    difficult as it may sound)
 
It is quite true that Icon does not provide a good low-level interface
with the operating system.  Moreover this is unlikely to change, since
one of the great aims of Icon has been to keep it portable.  Luckily,
customization (as you note) is not as difficult as it might seem.

> +  extensive test suite
> +  Usenet group exists specifically for it - comp.lang.icon
> 
> -  Unless you use Idol, all procedures are at the same level
>    i.e. one scope.
> -  regular expressions not supported.
>   However, in many cases, you can use the Icon functions `find',
>   `match', `many' and `upto' instead.

"In many cases" ain't so.  ANY pattern representable by regular
expressions can also be represented via Icon's builtin string
processing control structures and functions.

I note, though, that many still want regular expressions.  The
reason usually given for NOT including them is that they lack
sufficient power.  In point of fact, they represent a minuscule subset
of the range of patterns that can be specified using Icon's native
facilities.  The advantage they would bring is that they would allow
far greater recognition speed for those patterns which can be
recognized via regular expressions, and that they would allow much
more compact expression of these patterns than can be achieved with
Icon's intrinsic functions.
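
The compactness argument can be sketched in present-day Python.  The
helpers `upto' and `many' below are hypothetical Python functions only
loosely modelled on the Icon functions of the same names (they use
0-based indices and return None on failure); they are not Icon's actual
definitions.  Both versions pull the first run of digits out of a string.

```python
import re

DIGITS = set("0123456789")


def upto(cset, s, i=0):
    """Index of the first character of s in cset at or after i, else None."""
    for j in range(i, len(s)):
        if s[j] in cset:
            return j
    return None


def many(cset, s, i=0):
    """Index just past the run of cset characters starting at i, else None."""
    j = i
    while j < len(s) and s[j] in cset:
        j += 1
    return j if j > i else None


def first_number_scan(s):
    # Hand-written scanning: find where digits start, then where they end.
    start = upto(DIGITS, s)
    if start is None:
        return None
    return s[start:many(DIGITS, s, start)]


def first_number_re(s):
    # The same pattern as a regular expression: one short line.
    m = re.search(r"[0-9]+", s)
    return m.group(0) if m else None
```

Both functions give the same answer; the regular expression version is
simply shorter to write.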

Until someone does it *right*, I've written a prototype findre()
function, which is in one of the more recent IPL updates.  It
essentially combines Icon's find() function with an egrep-style
FSTN-description language.  Ideally, someone should write this in C.
Let's fool with the prototype for a while until we know exactly what
we want, and then let's try to talk some poor soul into coding it up
as part of the Icon run-time system.  A matchre() function should be
added as well.

> +  Can trace execution.
> *  Pascal/C-like syntax
>    i.e. uses {} but has a few more keywords than C.
> +  lots of example programs included.
> +  can define your own iterators
>    i.e. your own procedures for iterating through arbitrary structures.
> +  co-expressions.  Powerful tool, hard to explain briefly.  See
>    chapter 13 of the Icon Programming Language.
 
> -  co-expressions haven't been implemented on Sun 4s (the type of
>    machine I use)
 
Please correct me if I'm wrong, but I believe I saw the coexpression
code for the Sun4 posted almost a year ago.

> +  has an `initial' section in procedures that is only ever executed
>    once and allows you to initialise C like static variables with the
>    result of other functions (unlike C).
> +  arbitrary precision integers.

Wish list:

> E.2.4 Add a regular expression data type.  Modify the functions find
>       and match to operate appropriately when their first argument is a
>       regular expression.

I'd modify this to say, add findre() and matchre() to the list of
builtin functions.  Most C libraries have regexp routines that can be
drafted to serve in these capacities.  I know that regular expressions
don't fit into the traditional image of what Icon string processing
has always been.  Practical advantages of speed and compactness,
though, far outweigh this supposed disadvantage, and would make Icon
much more useful for many real-world tasks.

> E.2.5 \  All of these suggest extending
> E.5.4  | the string scanning facilities to
> E.5.5 /  cope with files and strings in a uniform way.

Not sure what you mean.
 
> E.12.1 Provide a way to load functions (written in C) at runtime

My impression is that inclusion of this feature would be
hopelessly implementation dependent, and would dramatically increase
the complexity of maintaining the many implementations that exist.
I'm curious why it is that you would see any advantage in run-time
loading other than decreased in-core mem. reqs.  If you were to use the
Icon compiler (i.e. Icon->C translator), you wouldn't even have to
worry about adding any code to any run-time system.
-- 

   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

cs450a03@uc780.umd.edu (04/01/91)

Richard Goerwitz writes:
>I'm curious why it is that you would see any advantage in run-time
>loading other than decreased in-core mem. reqs.  If you were to use the
>Icon compiler (i.e. Icon->C translator), you wouldn't even have to
>worry about adding any code to any run-time system.

Well, I don't Icon, but I'm willing to put my foot in my mouth
anyways...

(1)  If you compile an entire application, you lose the
maintainability that the {Icon} environment provides.

(2) If you have some method of adding new primitives (accessible as
proper objects of your system) you suddenly make it possible to use
{Icon} for commercial applications (e.g. where speed is important).

Also note that it should be considered in good taste to provide, along
with any object code, the {Icon} work-alike "source" so that a year or
two down the road when somebody else wants to know what this thing is
doing they can figure it out.  If you GPL, you'd want to keep around
the intermediate language source as well.

Raul Rockwell

guido@cwi.nl (Guido van Rossum) (04/03/91)

Richard L. Goerwitz replies to Stephen Bevan, regarding Icon:

Bevan:
>> -  Unix interface is quite primitive.
>>    If you just want to use a command, you can use `callout', anything
>>    more complicated requires building a personal interpreter (not as
>>    difficult as it may sound)

Goerwitz:
>It is quite true that Icon does not provide a good low-level interface
>with the operating system.  Moreover this is unlikely to change, since
>one of the great aims of Icon has been to keep it portable.  Luckily,
>customization (as you note) is not as difficult as it might seem.

I don't buy the argument that you can't provide a good Unix interface
because of portability.

Python is designed to be just as portable as Icon (runs on the Mac,
for starters) but its Unix interface is quite good (and will improve).
The trick is that all the Unix dependencies are encapsulated in a
separate module.  Unix dependent applications won't run on non-Unix
systems, but then they are probably not needed there either.  Many
applications and library modules can be (and are!) written without the
use of explicit Unix features.  Of course, the standard I/O interface
exists on all systems.
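
A minimal sketch of this arrangement, written in present-day Python
rather than the release discussed in this thread.  It assumes the
standard `os' module; the helper name `current_user_id' and the file
names are made up for illustration.

```python
import os


def current_user_id():
    """Return the numeric user id on Unix, or None where it is absent."""
    # os.getuid exists only on Unix-like systems, so look it up
    # instead of assuming it is there.
    getuid = getattr(os, "getuid", None)
    return getuid() if getuid is not None else None


# Portable part: no explicit Unix features are needed to build a path.
path = os.path.join("reports", "summary.txt")
name = os.path.basename(path)
```

Code that sticks to the portable part runs anywhere; only callers of
`current_user_id' need to care whether the Unix feature is present.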

There is no excuse for not providing a decent Unix interface for a
language that runs under Unix.  Leaving it up to local initiative
("customization") is fatal for portability.

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Life's gotta be more than meeting pretty faces and sitting on them"

goer@ellis.uchicago.edu (Richard L. Goerwitz) (04/03/91)

In article <3252@charon.cwi.nl> guido@cwi.nl (Guido van Rossum) writes:
>
>>> -  [the Icon-]Unix interface is quite primitive.
>>>    If you just want to use a command, you can use `callout', anything
>>>    more complicated requires building a personal interpreter (not as
>>>    difficult as it may sound)
>
>Goerwitz:
>>It is quite true that Icon does not provide a good low-level interface
>>with the operating system.  Moreover this is unlikely to change, since
>>one of the great aims of Icon has been to keep it portable....
>
>I don't buy the argument that you can't provide a good Unix interface
>because of portability.
>
>Python is designed to be just as portable as Icon (runs on the Mac,
>for starters) but its Unix interface is quite good (and will improve).
>The trick is that all the Unix dependencies are encapsulated in a
>separate module.  Unix dependent applications won't run on non-Unix
>systems, but then they are probably not needed there either.

What is portability?  Portability doesn't just involve the compiler or
interpreter itself.  It's a property of code written for it as well.
Why?  Because the code is as important as the language tools themselves.
What good is it, say, to make it easy to reimplement a compiler for
more than one system when code written for that compiler will present
a horrendous problem?

Portability is also not just a theoretical thing.  The proof is in the
pudding.  How many platforms is Python actively used on?  Here's a list
for Icon.  Note that most programs will run practically unaltered on
each of the listed platforms.  Do you know of any language for which a
similar claim could be made for so many machines and operating systems?

	MS-DOS
	OS/2
	Mac (MPW and standalone)
	Atari
	Apollo (AEGIS)
	IBM 370 (MVS/XA and VM/CMS)
	Amiga
	DEC VAX (8650, running VMS)

And for Unix-oids:

	BSD 4.3
	SunOS 4.0
	Ultrix
	AIX
	Xenix
	Mach (on the NeXT)
	SYSVR3 (4 also?)

This is just what I can think of offhand.  There are probably others as
well.

>There is no excuse for not providing a decent Unix interface for a
>language that runs under Unix.  Leaving it up to local initiative
>("customization") is fatal for portability.

I'm not sure, but I think you've got this backwards.  Customization
*creates* nonportability.  Still, I think you are right that languages
need a good OS interface in order to be useful for certain types of
tasks.  The question is, "What features would you regard as vital for
work in a Unix environment?"  I'll be curious to see your answer.
Mine would be:

	ability to call C functions
	ability to store C pointers for calls to C functions
	built-in support for conversion from Icon to C types
	intrinsic fork()/exec()/wait() ability
	intrinsic ability to work with pipes
	intrinsic system() function

These would be the basic things I'd want.  Icon has three of them.  It
lacks the other two.  Yet another it partially implements, but the inter-
face is nontrivial for complex objects (I'm talking about Icon->C type
conversions).

-Richard (goer@sophist.uchicago.edu)
-- 

   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

guido@cwi.nl (Guido van Rossum) (04/08/91)

goer@ellis.uchicago.edu (Richard L. Goerwitz) writes:

>What is portability?  Portability doesn't just involve the compiler or
>interpreter itself.  It's a property of code written for it as well.
>Why?  Because the code is as important as the language tools themselves.
>What good is it, say, to make it easy to reimplement a compiler for
>more than one system when code written for that compiler will present
>a horrendous problem?
>
>Portability is also not just a theoretical thing.  The proof is in the
>pudding.  How many platforms is Python actively used on?  Here's a list
>for Icon.  Note that most programs will run practically unaltered on
>each of the listed platforms.  Do you know of any language for which a
>similar claim could be made for so many machines and operating systems?

If I understand you well, you can make this claim for Icon because
Icon forbids things that are inherently system-dependent.  This means
that probably a host of programs that would benefit from Icon's
high-level problem-solving abilities won't be written in Icon because
it lacks the low-level interfaces needed to gather the data or
whatever.

True, if a program opens a pipe and forks off a process that calls
sendmail it won't be portable to the Mac.  But forbidding such things
even when the OS provides the functionality forces the author to use a
non-portable solution anyway (such as writing a shell script wrapper
around an Icon program).  I argue that if the language at least allows
you to make non-portable OS calls, users are better off -- of course
assuming standard modularization techniques are available to isolate
non-portable portions of programs, and encouraging portable solutions
where they exist.
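
One way to picture that isolation, as a sketch in present-day Python
(the `subprocess` module postdates this thread).  The function names and
the mailer_argv setting are hypothetical: the non-portable pipe to an
external program sits in one function, and the portable entry point
decides whether to use it.

```python
import subprocess
import sys


def pipe_to_command(argv, text):
    """Non-portable part: run an external command, feed it `text` on
    stdin, and return whatever it wrote to stdout."""
    result = subprocess.run(
        argv, input=text, capture_output=True, text=True, check=True
    )
    return result.stdout


def send_report(text, mailer_argv=None):
    """Portable entry point.  `mailer_argv` is a hypothetical
    configuration value, e.g. ["/usr/sbin/sendmail", "-t"] on a Unix
    host; when it is None the report goes to standard output instead.
    Returns True if the report was handed to the mailer."""
    if mailer_argv is None:
        sys.stdout.write(text)
        return False
    pipe_to_command(mailer_argv, text)
    return True
```

On a system without a mailer the program still runs; only the delivery
route changes, and the one OS-specific call is easy to find.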

>[list of platforms on which Icon is used deleted]

I don't really want to engage in "mine is longer than yours" contests,
but just for the record: Python has been out only two months and has
already been ported to all of the Unix platforms you mention (plus
hpux) and some of the micro ones (Mac, MS-DOS, Atari ST).  I don't
claim that all Python programs run on all platforms, because some
platforms don't provide some built-in modules, but Python programs
that don't use system-dependent modules will run everywhere
without change.  The crux is that a non-portable Python program is
immediately recognizable because it imports a system-dependent module.
Also note that Python provides uniform interfaces for OS-dependent
features that are available on many systems but not all -- if you have
a symbolic link system call, it will be called posix.symlink().
Programs can dynamically test for the presence of such features (which
is only useful if they have a way of handling their absence).
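
A sketch of such a dynamic test, in present-day Python, where the call
is reached as os.symlink (on Unix the `os` module takes it from
`posix`).  The helper make_alias and its copy fallback are hypothetical,
chosen only to show one way of handling the feature's absence.

```python
import os
import shutil
import tempfile


def make_alias(target, alias):
    """Create `alias` for `target`: a symbolic link where the platform
    offers one, otherwise a plain copy.  Returns the method used."""
    if hasattr(os, "symlink"):
        try:
            os.symlink(target, alias)
            return "symlink"
        except (OSError, NotImplementedError):
            pass  # the call exists but is not permitted here
    shutil.copyfile(target, alias)
    return "copy"


# Usage: try it in a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "data.txt")
    alias = os.path.join(scratch, "alias.txt")
    with open(target, "w") as f:
        f.write("hello")
    method = make_alias(target, alias)
    with open(alias) as f:
        content = f.read()
```

Either way the caller ends up with a readable file under the alias name;
only the mechanism differs between platforms.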

>[...]  Still, I think you are right that languages
>need a good OS interface in order to be useful for certain types of
>tasks.  The question is, "What features would you regard as vital for
>work in a Unix environment?"  I'll be curious to see your answer.
>Mine would be:
>
>	ability to call C functions
>	ability to store C pointers for calls to C functions
>	built-in support for conversion from Icon to C types
>	intrinsic fork()/exec()/wait() ability
>	intrinsic ability to work with pipes
>	intrinsic system() function

>These would be the basic things I'd want.  Icon has three of them.  It
>lacks the other two.  Yet another it partially implements, but the inter-
>face is nontrivial for complex objects (I'm talking about Icon->C type
>conversions).

You don't say which three Icon has and I don't know enough about Icon
to guess.  Python has all that you mention except fork/exec/wait and
pipes, which are easy enough to add, but since this is a one-person
project, for now I am content with system() and temporary files.

Disclaimer: maybe I seem stubborn on this point, but I have worked on
a language project where OS independence was considered so important
that the language didn't even have a primitive to open a file and read
data from it within a program.  The language didn't become a terrible
success, even though it had other properties that made it a big leap
forward from other languages...

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Twenty years ago, Dan Bernstein would be defending Assembler against HLL's"

rh@smds.UUCP (Richard Harter) (04/09/91)

In article <1991Apr3.151153.3447@midway.uchicago.edu>, goer@ellis.uchicago.edu (Richard L. Goerwitz) writes:

> What is portability?  Portability doesn't just involve the compiler or
> interpreter itself.  It's a property of code written for it as well.
> Why?  Because the code is as important as the language tools themselves.
> What good is it, say, to make it easy to reimplement a compiler for
> more than one system when code written for that compiler will present
> a horrendous problem?

There are some issues that weren't addressed in this discussion.  In
languages which have OS command capability one has to come to terms with
the fact that different OS's have differing command syntax and differing
file system syntax.  Portability of command code across OS's really implies
that the language must supply that portability.  Consider, for example,
path names.  UNIX and VMS both have a path naming system that amounts to
device - directory tree list - file name.  If the code refers to files by
path name then the language should provide a standard function to return
a correct path name from the components [or equivalent functionality].  I
am supposing here that the language is strong enough so that path name
elements are symbolic and switchable in a config file.

One can list a number of such requirements, depending on the objectives
of the language in question.  In general, however, portability of code
in the language requires that all host OS interface capability be portable
across the OS's being supported.
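
As an illustration of the path-name requirement, here is a sketch in
present-day Python.  The CONFIG table and build_path are hypothetical.
Python's standard library has no VMS flavour that I know of, so the
Windows rules in `ntpath` stand in for the second operating system.

```python
import ntpath
import posixpath

# Hypothetical configuration: symbolic path elements that a config file
# could switch per installation.
CONFIG = {"root": "data", "project": "reports", "file": "summary.txt"}


def build_path(config, flavour=posixpath):
    """Assemble a path name from its components using the joining rules
    of the given path module (posixpath or ntpath)."""
    return flavour.join(config["root"], config["project"], config["file"])


unix_name = build_path(CONFIG, posixpath)   # data/reports/summary.txt
dos_name = build_path(CONFIG, ntpath)       # data\reports\summary.txt
```

The program names only the components; the separator and layout come
from the language's library, which is the division of labour argued for
above.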
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.