bevan@cs.man.ac.uk (Stephen J Bevan) (03/30/91)
[Note I've crossposted to all the groups I sent my original message
to. This was at the request of some of the respondents (sp?)]

Here are the results of my question regarding which language to use
for writing programs to extract information from files, generate
reports ... etc. I initially suggested languages like Perl, Icon,
Python ...

As part of my original message I said :-

> Rather than FTP all of them and wade through the documentation, I was
> wondering if anybody has experiences with them that they'd like to
> share?

I would like to thank the following people for replying :-

Dan Bernstein      - brnstnd@kramden.acf.nyu.edu
Tom Christiansen   - tchrist@convex.COM
Chris Eich         - chrise@hpnmdla.hp.com
Richard L. Goerwitz - goer@midway.uchicago.edu
Clinton Jeffery    - cjeffery@cs.arizona.edu
Guido van Rossum   - guido@cwi.nl
Randal L. Schwartz - merlyn@iWarp.intel.com
Peter da Silva     - peter@ficc.ferranti.com
Alan Thew          - QQ11@LIVERPOOL.AC.UK
Edward Vielmetti   - emv@ox.com
??                 - russell@ccu1.aukuni.ac.nz

Most of the replies were about Perl, so I didn't learn much about the
other languages I suggested (other than very general things). Even
though I was originally hoping not to have to ftp any stuff, I ended
up getting the source to Python, GAWK, TCL, Icon and the texinfo
manual for Perl.

To save you going through my list of good and bad points of the
languages I looked at, here is the summary of what I see the languages
as :-

TCL - an embedded language i.e. an extension language for large
  programs (IMHO only if you haven't got, or don't like, Scheme based
  ones like ELK).

Perl - the de facto UNIX scripting language. You name it, and you can
  probably cobble a solution together in Perl. Beyond the fact that a
  lot of people use it, I can see nothing to recommend it. It's a bit
  like C in that respect.

Python - Good prototyping language with a consistent design.
  It might not have all the low level UNIX stuff built in, but by
  using modules, it's easy to add the necessary things in an ordered
  way.

Icon - the `nearly' language. Well designed language, that never
  seemed to make it into general use. Seems to cover the ground all
  the way from AWK type applications to Prolog/Lisp ones. If I wasn't
  already happy with Scheme, I'd use this for more general
  programming. I would recommend people at least look at this
  language.

GAWK - simple scripting language. Definitely better than `old' awk.
  I would only use it if the job were really simple or if something
  like Python or TCL were not available.

Note I wouldn't expect anybody to make a choice on what I say. I
suggest you get the source/manuals yourself and have a good long look
at the language/implementation before you decide.

For the types of things _I_ want to do, it would be a tie between Icon
and Python. Having said that, given that I'd have to extend both to
cover the sort of things I want to do, I'll probably use Scheme
instead (ELK in particular). The reason I didn't just use Scheme in
the first place is that I was hoping one of the languages would have
all the facilities I want without me having to extend them myself.

Before the summary of the languages themselves, I thought I'd try and
list some of the things I was looking for. (Actually, I showed an
earlier version of this summary to somebody and they didn't understand
some of the terms I was using, so this is my attempt at an
explanation).

Note that most of the things are to do with structuring the code and
the like. This is not the sort of thing you usually worry about when
writing small scripts, but I plan to convert and write a number of
tools, some of which are around the 1000 LOC mark. For example, I'd
like to convert a particular lex/yacc/C program I have into the chosen
language.

You can skip ahead to the actual summary by searching for SUMMARY.
(Well I can do this in GNUS, I don't know about other news readers
like rn)

Packages/Modules
----------------

These are a mechanism for splitting up the name space so that function
name clashes are reduced. Most systems work by declaring a package and
then all functions listed from then on are members of that package.
You then access the functions using the package prefix, or import the
whole package so that you don't have to use the prefix. The following
is an example in CommonLisp :-

  ;;; foo.lsp                  ;;; bar.lsp
  (in-package 'foo)            (in-package 'bar)
  (export '(bob))              (export '(bob))
  (defun bob (a b) ...)        (defun bob (x) ...)

  ;;; main.lsp
  (foo:bob 10 20)
  (bar:bob 3)

Packages are not perfect, but they do help. You can get the same
effect by declaring implicit package prefixes :-

  ;;; foo.lsp                  ;;; bar.lsp
  (defun foo-bob (a b) ...)    (defun bar-bob (x) ...)

  ;;; main.lsp
  (foo-bob 10 30)
  (bar-bob 4)

The advantage of packages over this is that you don't have to use a
package prefix in the package itself when you want to call a function.
This can be a saving if you have lots of functions in a package, and
only a few are exported.

Exception Handling
------------------

This is useful for dealing with errors that shouldn't happen, e.g.
reaching the end of the file when you were looking for some valid
data. For example, in CommonLisp :-

  (defun foo (x y)
    ...
    (if (catch 'some-unexpected-error
          (bar x y)
          nil)
        (handle-the-exception ...)))

  (defun bar (a b)
    ...
    (if (something-wrong)
        (throw 'some-unexpected-error t))
    ...)

Here the function `foo' calls `bar', and if any error occurs whilst
processing, it is handled by the exception handler. (The example is a
bit primitive as I'm trying to save space).

The advantage of this is that you don't have to explicitly pass back
all sorts of error codes from your functions to handle unusual errors.
It also usually means you won't have so many nested `if's to handle
the special cases, therefore making your code clearer.
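The same idea can be sketched in one of the languages under
discussion. What follows is only a minimal illustration in present-day
Python syntax (which differs from the 0.9.1 release surveyed here);
the names `UnexpectedEndOfFile', `read_record' and `summarise' are
hypothetical and do not come from any of the posts.

```python
class UnexpectedEndOfFile(Exception):
    """Raised when the input ends before any valid data is found."""


def read_record(lines):
    """Return the first non-blank line, or raise if there is none."""
    for line in lines:
        if line.strip():
            return line.strip()
    # No error code is passed back; the unusual case is signalled directly.
    raise UnexpectedEndOfFile("no valid data before end of input")


def summarise(lines):
    """Handle the unusual case in one place instead of nested ifs."""
    try:
        return "found: " + read_record(lines)
    except UnexpectedEndOfFile as err:
        return "error: " + str(err)
```

As with the `catch'/`throw' version, the function that detects the
problem does not need to know who will deal with it, and the caller
does not need to test a return code after every call.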

Records/Tuples/Aggregates/Structs
---------------------------------

It's handy to be able to define objects that contain a certain number
of elements. You can then pass these objects around and access the
individual bits. For example in CommonLisp :-

  (defstruct point x y)

This declares `point' as a type containing two items called `x' and
`y'. Some languages don't name the items, they rely on position
instead. I see these as equivalent (assuming you have some sort of
pattern matching).

Provide/Require
---------------

This is a primitive facility for declaring that one package depends on
another one. For example in CommonLisp :-

  ;;; foo.lsp
  (defun bob (a b) ...)
  (provide 'foo)

  ;;; main.lsp
  (require 'foo)
  (bob 10 3)

The above declares that the file `foo' provides the function `bob' and
that the file `main' requires `foo' to be loaded for it to work. So
when you load in `main' and `foo' hasn't been loaded, it is
automatically loaded by the system.

C Interface
-----------

How easy is it to call C from the language? Is there a dynamic loading
facility, i.e. do I have to recompile the program to use some
arbitrary C code, or can it load in a .o file at runtime?

Arbitrary Restrictions
----------------------

This really applies to the implementations rather than the languages.
However, as there is only one implementation for most of the languages
I'm looking at, they tend to be synonymous.

If there is one thing I hate about an [implementation of a] language,
it's arbitrary restrictions. For example, `the length of the input
line must not exceed 80 characters', or "strings must be less than 255
characters long". I can accept some initial restrictions if :-

  1) they are documented.
  2) they will be removed in future versions.

Note. I realise that some restrictions are not arbitrary, or at least
not under the control of the language implementor e.g. the number of
open files under UNIX.

SUMMARY
-------

If you want to know more about the languages, there follows a brief
description of the languages, how to get an implementation and some
good and bad points as I see them. Each point is preceded by a
character indicating the type of point :-

  + good point
  - bad point
  * just a point to note
  ! subjective point

Other than the `*' items, I guess it is all subjective, however, I've
tried to put things that are generally good/bad in `+'/`-' and limit
really subjective statements to `!'.

TCL - version 4.0 patch level 1
-------------------------------

TCL (Tool Command Language) was developed by John Ousterhout at
Berkeley. It started out as a small language that could be embedded in
applications. It has now been extended by some people at hackercorp
into more of a general purpose shell type programming language. It is
described by Peter Da Silva (one of the people who extended it) as :-

> TCL is like a text-oriented Lisp, but lets you write algebraic
> expressions for simplicity and to avoid scaring people away.

The language itself for some reason reminds me of csh even though I
can only point to two things (the use of `set' and `$') which are
definitely like csh.

Unless you have other ideas about what an extension language should
look like (e.g. IMO it should be Scheme), then I'd definitely
recommend this. It's small, and integrates easily with other C
programs (you can even have multiple TCL interpreters in an
application!)

Version 5.0 is available by anonymous ftp from sprite.berkeley.edu as
tk.tar.Z (it's part of an X toolkit called Tk). Note, although it has
a higher number than the one above, it does not include the extensions
mentioned above. These will apparently be integrated soon.

Version 4.0 pl1 is available by anonymous ftp from
media-lab.ai.mit.edu (sorry can't remember the exact path)

+ exceptions.
+ packages, called libraries
  However there is only one name-space.
The libraries are used as a way of storing single versions of code rather than as a solution to the name space pollution problem. + provide/require + C interface is excellent. You can easily go TCL->C and C->TCL. - No dynamic loading ability that I'm aware of. - Arbitrary line length limit on `gets' and `scan'. i.e. the commands that read lines from files/strings. I would guess this will go away in the next version. - No records. The main data types are strings/lists/associative arrays + extensive test suite included. ! doesn't look to have been tested on many systems. The above version actually failed to link on a SPARCstation running SunOS 4.1 as the source refers to `strerror'. This has apparently been fixed in patch level 2. + lots of example code included in distribution. + extensive documentation (all in nroff) + Can trace execution. ! To make arguments evaluate, you must enclose them in {} or [] This shouldn't be a problem, except that being used to Lisp like languages I expect to quote constants. ! The extensions though useful, are not seamless. e.g. some string facilities are in the core language and some in the extensions. This might happen when the hackercorp extensions are officially merged with the Berkeley core language and released by Berkeley. + As part of the extensions, you get tclsh. This is a shell which you can type command directly into. + scan contexts. This is sort of regular expressions on files rather than strings. Python - version 0.9.1 ---------------------- Available by anonymous ftp from wuarchive.wustl.edu as pub/python0.9.1.tar.Z or for Europeans via the info server at hp4nl.nluug.nl I couldn't think of a good way to describe this, so I'm blatantly copying the following from the Python tutorial :- Python is a simple, yet powerful programming language that bridges the gap between C and shell programming, and is thus ideally suited for rapid prototyping. 
Its syntax is put together from constructs borrowed from a variety of other languages; most prominent are influences from ABC, C, Modula-3 and Icon So far so good, here's some more from the tutorial :- Because of its more general data types Python is applicable to a much larger problem domain that Awk or even Perl, yet most simple things are at least as easy in Python as in those languages. i.e. Python seems to be designed for larger tasks than you would undertake using the shell/awk/perl. + packages. + exceptions (based on Modula 2/3 modules) + records (actually tuples. I'm not sure they do everything I want as the documentation is a bit vague in this area) Other main types are lists, sets, tables (associative arrays) + C interface is good. No dynamic linking that I am aware of. - Arbitrary Restrictions line length limit on readline. This has been fixed and I would guess will appear in the next release. + lots of example python programs included. There is even a TCL (version 2ish) interpreter! + Object oriented features. Based on Modula 3 i.e. classes with methods, all of which are virtual (to use a C++ term). * any un caught errors produce a stack trace. + disassembler included + can inspect stack frames via traceback module - no single step or breakpoint facility (maybe in the next release) + functions can return multiple values. * The default output command `print' inserts a space between each field output. ! I don't like the above, or rather I would like the option of not having it done. * Documentation includes tutorial and library reference as TeX files. Both are incomplete, but there is enough in them to be able to write Python code. The reference manual is not yet finished, and is not currently distributed with the source. + Python mode for Emacs. 
  (It's primitive, but it's a start)

Icon - version 8
----------------

To quote from one of the Icon books :-

  Icon is a high-level, general purpose programming language that
  contains many features for processing nonnumeric data, particularly
  for textual material consisting of strings of characters.

Available :-
  In USA :- ??, consult `archie'.
  In UK  :- I picked up a copy from the sources archive at Imperial
            College. The JANET address is 00000510200001

- no packages. Everything is in one namespace. However ...
- no exceptions.
+ Object oriented features.
  An extension to the language called Idol is included.
  This converts Idol into standard Icon.
  Idol itself looks (to me) like Smalltalk.
+ has records. Other types include :- sets, lists, strings, tables
+ unlimited line length when reading
  (Note. the newline is discarded)
! The only language that has enough facilities to be able to re-write
  some of my Lex/Yacc code.
+ stack trace on error.
+ C interface is good. Can extend the language by building a
  `personal interpreter'. No dynamic linking.
+ extensive documentation
  9 technical reports in all (PostScript and ASCII)
- Unix interface is quite primitive.
  If you just want to use a command, you can use `callout', anything
  more complicated requires building a personal interpreter (not as
  difficult as it may sound)
+ extensive test suite
+ Usenet group exists specifically for it - comp.lang.icon
- Unless you use Idol, all procedures are at the same level
  i.e. one scope.
- regular expressions not supported.
  However, in many cases, you can use the Icon functions `find',
  `match', `many' and `upto' instead.
+ Can trace execution.
* Pascal/C like syntax
  i.e. uses {} but has a few more keywords than C.
+ lots of example programs included.
+ can define your own iterators
  i.e. your own procedures for iterating through arbitrary structures.
+ co-expressions. Powerful tool, hard to explain briefly. See
  chapter 13 of the Icon Programming Language.
- co-expressions haven't been implemented on Sun 4s (the type of
  machine I use)
+ has an `initial' section in procedures that is only ever executed
  once and allows you to initialise C like static variables with the
  result of other functions (unlike C).
+ arbitrary precision integers.

As well as the excellent documentation included in the source, there
are two books on Icon available (I skimmed through both of them) :-

  The Icon Programming Language
  Ralph E. Griswold and Madge T. Griswold
  Prentice Hall 1983

  The Implementation of the Icon Programming Language
  Ralph E. Griswold and Madge T. Griswold
  Princeton University Press 1986

The second one is particularly useful if you are considering extending
Icon yourself. Appendix E of this book also contains a list of
projects that could be undertaken to extend and improve Icon. Here are
some projects, that if implemented, would greatly improve the
usefulness of Icon :-

  E.2.4  Add a regular expression data type. Modify the functions find
         and match to operate appropriately when their first argument
         is a regular expression.
  E.2.5 \  All of these suggest extending
  E.5.4  | the string scanning facilities to
  E.5.5 /  cope with files and strings in a uniform way.
  E.12.1 Provide a way to load functions (written in C) at runtime

Perl
----

Available :-
  USA :- ??, consult `archie'
  UK  :- Imperial sources archive

I received more responses about Perl than anything else, so I assume
that most people already know a lot about the language. Here are some
edited highlights from a message I received from Tom Christiansen :-

First some good words from Tom :-

> ... I shall now reveal my true colors as perl disciple
> and perhaps not infrequent evangelist. Perl is without question the
> greatest single program to appear to the UNIX community (although it runs
> elsewhere too) in the last 10 years. It makes programming fun again. It's
> simple enough to get a quick start on, but rich enough for some very
> complex tasks.
> ...
perl is a strict superset of sed and awk, so much so that s2p and > a2p translators exist for these utilities. You can do anything in > perl that you can do in the shell, although perl is not strictly > speaking a command interpreter. It's more of a programming language. and now some of the low points of Perl. [Note this is only a small part of a long post, that explained a lot of good things about Perl. As most people seem to use/like Perl, I thought I'd highlight some of the things wrong with the language, and what better place to get information than from the designer of the language. Note also that this is from a message dated June 90, so some of it may be out of date.] Larry Wall :- > The basic problem with Perl is that it's not about complex data structures. > Just as spreadsheet programs take a single data structure and try to > cram the whole world into it, so too Perl takes a few simple data structures > and drives them into the ground. This is both a strength and a weakness, > depending on the complexity and structure of the problem. > > The basic underlying fault of Perl is that there isn't a real good way > of building composite structures, or to make one variable refer to a piece > of another variable, without giving an operational definition of it. > > ... In a sense, the problem with Perl is not that it is too > complicated or hard to learn, but that perhaps it is not expressive > enough for the effort you put into learning it. Then again, maybe it > is. Your call. Some people are excited about Perl because, despite > its obvious faults, it lets them get creative. > > There are many things I'd do differently if I were designing Perl from > scratch. It would probably be a little more object oriented. Filehandles > and their associated magical variables would probably be abstract types > of some sort. I don't like the way the use of $`, $&, $' and $<digit> > impact the efficiency of the language. 
> I'd probably consider some kind
> of copy-on-write semantics like many versions of BASIC use. The subroutine
> linkage is currently somewhat problematical in how efficiently it can
> be implemented. And of course there are historical artifacts that wouldn't
> be there.

I think the above is a very fair summary of the low points of the
language. At one point it says `... perhaps it is not expressive
enough for the effort you put into learning it. Then again maybe it
is. Your call'. Well _my_ call is that it is not.

Note I didn't actually pick up the source to this, just the manual.
Consequently I haven't been able to check all the points listed below.

+ packages.
! Note in the examples that I've seen in comp.lang.perl, people
  don't seem to use the facility, instead they put everything
  directly in `main' (i.e. the top level scope) rather than in the
  local scope.
+ exceptions
+ provide/require
* C Interface ??
  I couldn't find this in the documentation I had.
+ No arbitrary restrictions
+ has a source level debugger
+ Well integrated with Unix (nearly all system calls are built in !)
! However, like Unix, only one name space seems to be used (see above)
* C like syntax
+ source contains texinfo manual.
  You can always buy the (Camel) book for more information.
- no records. Other types lists, strings, tables (associative arrays)
* some types have distinct scopes.
! You prefix the name with `@', `$', `%' to indicate which type you
  want. This is one of the ugliest things I've ever seen.
! Uses lots of short strings to contain often used things
  e.g. `$_' is the current input, `$.' is current line number.
  I guess some people must like this, but I prefer names like
  `input' and `line-number' myself.
+ includes programs to convert existing awk, find and sed scripts
  into Perl.
+ Usenet news group - comp.lang.perl
+ Perl mode for Emacs.

GAWK
----

Available :-
  USA :- prep.ai.mit.edu, probably other places as well.
         Consult `archie'
  UK  :- Imperial sources archive.

A few points about GNU awk as it seems to fix some of the problems
with `old' awk.

- no packages
- no exceptions
- no C interface
- no records
+ allows user defined functions
+ can read and write to arbitrary files
+ much more informative error messages than the old awk.
goer@ellis.uchicago.edu (Richard L. Goerwitz) (04/01/91)
We've seen an excellent summary of Icon's benefits and deficits, and I think it is a good one (especially considering that the person in question was only doing an initial survey). Let me comment on some of the conclusions reached in efforts to refine them, and ask some question of my own. In <BEVAN.91Mar29162211@panda.cs.man.ac.uk> Stephen J Bevan writes (re- garding Icon), that it has > - no packages. Everything is in one namespace. However ... The "however" is for Idol, I gather. For people who don't want to add yet another level of indirection to their Icon programs, though, naming conflicts remain a problem. >- no exceptions. Have you looked at the Icon "error conversion" capability? Normally, run-time errors will result in program termination. You can, however, turn off this feature, and catch the errors yourself, either passing them through an exception handler, or else passing them back to the normal termination routine via runerr(). It's not an elegant system, since every expression that might normally cause error termination has to be checked individually. I wonder if there are plans to expand this feature. > + Object oriented features. > An extension to the language called Idol is included. > This converts Idol into standard Icon. > Idol itself looks (to me) like Smalltalk. > + has records. Other types include :- sets, lists, strings, tables > + unlimited line length when reading > (Note. the newline is discarded) > ! The only language that has enough facilities to be able to re-write > some of my Lex/Yacc code. > + stack trace on error. > + C interface is good. Can extend the language by building `personal > interpreter'. No dynamic linking. > + extensive documentation > 9 technical reports in all (PostScript and ASCII) > - Unix interface is quite primitive. 
> If you just want to use a command, you can use `callout', anything > more complicated requires building a personal interpreter (not as > difficult as it may sound) It is quite true that Icon does not provide a good low-level interface with the operating system. Moreover this is unlikely to change, since one of the great aims of Icon has been to keep it portable. Luckily, customization (as you note) is not as difficult as it might seem. > + extensive test suite > + Usenet group exists specifically for it - comp.lang.icon > > - Unless you use Idol, all procedures are at the same level > i.e. one scope. > - regular expressions not supported. > However, in many cases, you can use an Icon functions `find', > `match', `many' and `upto' instead. "In many cases" ain't so. ANY pattern representable by regular expressions can also be represented via Icon's builtin string processing control structures and functions. I note, though, that many still want regular expressions. The reason usually given for NOT including them is that they lack sufficient power. In point of fact, they represent a miniscule subset of the range of patterns that can be specified using Icon's native facilities. The advantage they would bring is that they would allow far greater recognition speed for those patterns which can be recognized via regular expressions, and that they would allow much more compact expression of these patterns than can be achieved with Icon's intrinsic functions. Until someone does it *right*, I've written a prototype findre() function, which is in one of the more recent IPL updates. It essentially combines Icon's find() function with an egrep-style FSTN-description language. Ideally, someone should write this in C. Let's fool with the prototype for a while until we know exactly what we want, and then let's try to talk some poor soul into coding it up as part of the Icon run-time system. A matchre() function should also be added as well. > + Can trace execution. 
> * Pascal/C-like syntax > i.e. uses {} but has a few more keywords than C. > + lots of example programs included. > + can define your own iterators > i.e. your own procedures for iterating through arbitrary structures. > + co-expressions. Powerful tool, hard to explain briefly. See > chapter 13 of the Icon Programming Language. > - co-expressions haven't been implemented on Sun 4s (the type of > machine I use) Please correct me if I'm wrong, but I believe I saw the coexpression code for the Sun4 posted almost a year ago. > + has an `initial' section in procedures that is only ever executed > once and allows you to initialise C like static variables with the > result of other functions (unlike C). > + arbitrary precision integers. Wish list: > E.2.4 Add a regular expression data type. Modify the functions find > and match to perate appropriately when their first argument is a > regular expression. I'd modify this to say, add findre() and matchre() to the list of builtin functions. Most C libraries have regexp routines that can be drafted to serve in these capacities. I know that regular expression don't fit into the traditional image of what Icon string processing has always been. Practical advantages of speed and compactness, though, far outweigh this supposed disadvantage, and would make Icon much more useful for many real-world tasks. > E.2.5 \ All of these suggest extending > E.5.4 | the string scanning facilities to > E.5.5 / cope with files and strings in a uniform way. Not sure what you mean. > E.12.1 Provide a way to load functions (written in C) at runtime My impression is that inclusion of this feature would be hopelessly implementation dependent, and would dramatically increase the complexity of maintaining the many implementations that exist. I'm curious why it is that you would see any advantage in run-time loading other than decreased in-core mem. reqs. If you were to use the Icon compiler (i.e. 
Icon->C translator), you wouldn't even have to worry about adding any code to any run-time system. -- -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer
cs450a03@uc780.umd.edu (04/01/91)
Richard Goerwitz writes:
>I'm curious why it is that you would see any advantage in run-time
>loading other than decreased in-core mem. reqs. If you were to use the
>Icon compiler (i.e. Icon->C translator), you wouldn't even have to
>worry about adding any code to any run-time system.

Well, I don't Icon, but I'm willing to put my foot in my mouth
anyways...

(1) If you compile an entire application, you lose the
    maintainability that the {Icon} environment provides.

(2) If you have some method of adding new primitives (accessible as
    proper objects of your system) you suddenly make it possible to
    use {Icon} for commercial applications (e.g. where speed is
    important).

Also note that it should be considered in good taste to provide, along
with any object code, the {Icon} work-alike "source" so that a year or
two down the road when somebody else wants to know what this thing is
doing they can figure it out. If you GPL, you'd want to keep around
the intermediate language source as well.

Raul Rockwell
guido@cwi.nl (Guido van Rossum) (04/03/91)
Richard L. Goerwitz replies to Stephen Bevan, regarding Icon:

Bevan:
>> - Unix interface is quite primitive.
>>   If you just want to use a command, you can use `callout', anything
>>   more complicated requires building a personal interpreter (not as
>>   difficult as it may sound)

Goerwitz:
>It is quite true that Icon does not provide a good low-level interface
>with the operating system. Moreover this is unlikely to change, since
>one of the great aims of Icon has been to keep it portable. Luckily,
>customization (as you note) is not as difficult as it might seem.

I don't buy the argument that you can't provide a good Unix interface
because of portability.

Python is designed to be just as portable as Icon (runs on the Mac,
for starters) but its Unix interface is quite good (and will improve).
The trick is that all the Unix dependencies are encapsulated in a
separate module. Unix dependent applications won't run on non-Unix
systems, but then they are probably not needed there either. Many
applications and library modules can be (and are!) written without the
use of explicit Unix features. Of course, the standard I/O interface
exists on all systems.

There is no excuse for not providing a decent Unix interface for a
language that runs under Unix. Leaving it up to local initiative
("customization") is fatal for portability.

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Life's gotta be more than meeting pretty faces and sitting on them"
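
The encapsulation described in this post can be sketched as follows.
This is a rough illustration in present-day Python rather than the
0.9.1 release under discussion, and the function names `word_count'
and `current_user_id' are hypothetical. It relies on the fact that
`os.getuid' is only defined on Unix-like systems.

```python
import os


def word_count(text):
    """Portable: uses no operating-system features at all."""
    return len(text.split())


def current_user_id():
    """Unix-dependent: os.getuid exists only on Unix-like systems."""
    if not hasattr(os, "getuid"):
        # Fail clearly instead of pretending the feature exists everywhere.
        raise NotImplementedError("no numeric user ids on this platform")
    return os.getuid()
```

Code that only calls `word_count' runs anywhere; only callers of
`current_user_id' tie themselves to Unix, and the dependency is
visible at a single point.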
goer@ellis.uchicago.edu (Richard L. Goerwitz) (04/03/91)
In article <3252@charon.cwi.nl> guido@cwi.nl (Guido van Rossum) writes: > >>> - [the Icon-]Unix interface is quite primitive. >>> If you just want to use a command, you can use `callout', anything >>> more complicated requires building a personal interpreter (not as >>> difficult as it may sound) > >Goerwitz: >>It is quite true that Icon does not provide a good low-level interface >>with the operating system. Moreover this is unlikely to change, since >>one of the great aims of Icon has been to keep it portable.... > >I don't buy the argument that you can't provide a good Unix interface >because of portability. > >Python is designed to be just as portable as Icon (runs on the Mac, >for starters) but its Unix interface is quite good (and will improve). >The trick is that all the Unix dependencies are encapsulated in a >separate module. Unix dependent applications won't run on non-Unix >systems, but then they are probably not needed there either. What is portability? Portability doesn't just involve the compiler or interpreter itself. It's a property of code written for it as well. Why? Because the code is as important as the language tools themselves. What good is it, say, to be make it easy to reimplement a compiler for more than one system when code written for that compiler will present a horrendous problem? Portability is also not just a theoretical thing. The proof is in the pudding. How many platforms is Python actively used on? Here's a list for Icon. Note that most programs will run practically unaltered on each of the listed platforms. Do you know of any language for which a similar claim could be made for so many machines and operating systems? MS-DOS OS/2 Mac (MPW and standalone) Atari Apollo (AEGIS) IBM 370 (MVS/XA and VM/CMS) Amiga DEC VAX (8650, running VMS) And for Unix-oids: BSD 4.3 SunOS 4.0 Ultrix AIX Xenix Mach (on the NeXT) SYSVR3 (4 also?) This is just what I can think of offhand. There are probably others as well. 
>There is no excuse for not providing a decent Unix interface for a
>language that runs under Unix. Leaving it up to local initiative
>("customization") is fatal for portability.

I'm not sure, but I think you've got this backwards. Customization
*creates* nonportability. Still, I think you are right that languages
need a good OS interface in order to be useful for certain types of
tasks. The question is, "What features would you regard as vital for
work in a Unix environment?" I'll be curious to see your answer.
Mine would be:

    ability to call C functions
    ability to store C pointers for calls to C functions
    built-in support for conversion from Icon to C types
    intrinsic fork()/exec()/wait() ability
    intrinsic ability to work with pipes
    intrinsic system() function

These would be the basic things I'd want. Icon has three of them. It
lacks the other two. Yet another it partially implements, but the
interface is nontrivial for complex objects (I'm talking about Icon->C
type conversions).

-Richard (goer@sophist.uchicago.edu)
--
   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer
guido@cwi.nl (Guido van Rossum) (04/08/91)
goer@ellis.uchicago.edu (Richard L. Goerwitz) writes:
>What is portability? Portability doesn't just involve the compiler or
>interpreter itself. It's a property of code written for it as well.
>Why? Because the code is as important as the language tools themselves.
>What good is it, say, to make it easy to reimplement a compiler for
>more than one system when code written for that compiler will present
>a horrendous problem?
>
>Portability is also not just a theoretical thing. The proof is in the
>pudding. How many platforms is Python actively used on? Here's a list
>for Icon. Note that most programs will run practically unaltered on
>each of the listed platforms. Do you know of any language for which a
>similar claim could be made for so many machines and operating systems?

If I understand you well, you can make this claim for Icon because
Icon forbids things that are inherently system-dependent. This means
that probably a host of programs that would benefit from Icon's
high-level problem-solving abilities won't be written in Icon because
it lacks the low-level interfaces needed to gather the data or
whatever.

True, if a program opens a pipe and forks off a process that calls
sendmail it won't be portable to the Mac. But forbidding such things
even when the OS provides the functionality forces the author to use a
non-portable solution anyway (such as writing a shell script wrapper
around an Icon program). I argue that if the language at least allows
you to make non-portable OS calls, users are better off -- of course
assuming standard modularization techniques are available to isolate
non-portable portions of programs, and encouraging portable solutions
where they exist.
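A minimal sketch of the modularization argued for above, written in
present-day Python rather than the 1991 release under discussion. The
function names and the /usr/lib/sendmail path are assumptions made for
illustration and do not come from the post. The portable part only
knows how to feed text to a command; the one system-dependent decision
(which command delivers mail) sits in a single small function.

```python
import subprocess


def pipe_through(command, text):
    """Portable part: run command (a list of strings), feed it text on
    standard input, and return whatever it writes to standard output."""
    result = subprocess.run(command, input=text, capture_output=True,
                            text=True, check=True)
    return result.stdout


def mail_command(recipient):
    """Non-portable part, kept in one place.

    Assumption: a Unix system with a sendmail binary at this path.
    """
    return ["/usr/lib/sendmail", recipient]


def send_report(report, recipient, make_command=mail_command):
    """Deliver report by piping it to whatever make_command selects."""
    return pipe_through(make_command(recipient), report)
```

On a system without sendmail only mail_command needs replacing, or a
caller can pass a different make_command; nothing else in the program
has to know which operating system it is running on.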
>[list of platforms on which Icon is used deleted]

I don't really want to engage in "mine is longer than yours" contests,
but just for the record: Python has been out for only two months and
has already been ported to all of the Unix platforms you mention (plus
hpux) and some of the micro ones (Mac, MS-DOS, Atari ST). I don't claim
that all Python programs run on all platforms, because some platforms
don't provide some built-in modules, but Python programs that don't use
system-dependent modules will run everywhere without change. The crux
is that a non-portable Python program is immediately recognizable
because it imports a system-dependent module.

Also note that Python provides uniform interfaces for OS-dependent
features that are available on many systems but not all -- if you have
a symbolic link system call, it will be called posix.symlink().
Programs can dynamically test for the presence of such features (which
is only useful if they have a way of handling their absence).

>[...] Still, I think you are right that languages
>need a good OS interface in order to be useful for certain types of
>tasks. The question is, "What features would you regard as vital for
>work in a Unix environment?" I'll be curious to see your answer.
>Mine would be:
>
> ability to call C functions
> ability to store C pointers for calls to C functions
> built-in support for conversion from Icon to C types
> intrinsic fork()/exec()/wait() ability
> intrinsic ability to work with pipes
> intrinsic system() function
>
>These would be the basic things I'd want. Icon has three of them. It
>lacks the other two. Yet another it partially implements, but the
>interface is nontrivial for complex objects (I'm talking about Icon->C
>type conversions).

You don't say which three Icon has and I don't know enough about Icon
to guess.
Python has all that you mention except fork/exec/wait and pipes, which
are easy enough to add, but since this is a one-person project, for now
I am content with system() and temporary files.

Disclaimer: maybe I seem stubborn on this point, but I have worked on a
language project where OS independence was considered so important that
the language didn't even have a primitive to open a file and read data
from it within a program. The language didn't become a terrible
success, even though it had other properties that made it a big leap
forward from other languages...

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Twenty years ago, Dan Bernstein would be defending Assembler against HLL's"
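The dynamic feature test mentioned in the post above can be sketched as
follows. This is an illustration in present-day Python, where the call
is reached as os.symlink rather than the posix.symlink named in the
post; the helper name link_or_copy and the choice of copying as the
fallback are assumptions. It shows the point made in the post: the test
is only worth doing because the program has a way to cope when the
feature is absent.

```python
import os
import shutil


def link_or_copy(source, destination):
    """Make destination a symbolic link to source where the platform
    allows it; otherwise fall back to copying the file.

    Returns "symlink" or "copy" to report which route was taken.
    """
    if hasattr(os, "symlink"):  # is the feature present at all?
        try:
            os.symlink(source, destination)
            return "symlink"
        except (OSError, NotImplementedError):
            # The call exists, but the OS or file system refused it.
            pass
    shutil.copyfile(source, destination)
    return "copy"
```

Either way the caller ends up with a readable file at destination, so
the rest of the program stays the same on every platform.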
rh@smds.UUCP (Richard Harter) (04/09/91)
In article <1991Apr3.151153.3447@midway.uchicago.edu>, goer@ellis.uchicago.edu (Richard L. Goerwitz) writes:
> What is portability? Portability doesn't just involve the compiler or
> interpreter itself. It's a property of code written for it as well.
> Why? Because the code is as important as the language tools themselves.
> What good is it, say, to make it easy to reimplement a compiler for
> more than one system when code written for that compiler will present
> a horrendous problem?

There are some issues that weren't addressed in this discussion. In
languages which have OS command capability one has to come to terms
with the fact that different OS's have differing command syntax and
differing file system syntax. Portability of command code across OS's
really implies that the language must supply that portability.

Consider, for example, path names. UNIX and VMS both have a path
naming system that amounts to device - directory tree list - file name.
If the code refers to files by path name then the language should
provide a standard function to return a correct path name from the
components [or equivalent functionality]. I am supposing here that the
language is strong enough so that path name elements are symbolic and
switchable in a config file.

One can list a number of such requirements, depending on the objectives
of the language in question. In general, however, portability of code
in the language requires that all host OS interface capability be
portable across the OS's being supported.
--
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb. This sentence short. This signature done.
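The "standard function to return a correct path name from the
components" suggested above might look something like this. It is only
a sketch in Python: the name make_path, the style argument and the
treatment of the device on Unix (as a leading directory, since Unix
path names carry no device field) are all assumptions, and VMS version
numbers and other details are ignored.

```python
def make_path(device, directories, filename, style="unix"):
    """Build a path name from symbolic components for one OS style.

    device      -- device name, or "" for none
    directories -- list of directory names, outermost first
    filename    -- the file name itself
    style       -- "unix" or "vms" (hypothetical switch, e.g. set from
                   a config file)
    """
    if style == "unix":
        # Assumption: a device, if given, becomes a leading directory.
        parts = ([device] if device else []) + list(directories) + [filename]
        return "/" + "/".join(parts)
    if style == "vms":
        # VMS form: DEVICE:[DIR.SUBDIR]FILE
        prefix = device + ":" if device else ""
        tree = "[" + ".".join(directories) + "]" if directories else ""
        return prefix + tree + filename
    raise ValueError("unknown path style: " + style)
```

For example, make_path("", ["usr", "lib"], "report.txt") gives
/usr/lib/report.txt, while the same call shape with style="vms",
device "DUA0" and upper-case components gives
DUA0:[USR.LIB]REPORT.TXT. Code that only ever names files through such
a function does not need to change when the style is switched.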