goer@ellis.uchicago.edu (Richard L. Goerwitz) (04/09/91)
There seems to be some misunderstanding about where Icon "fits in" in the great scheme of programming languages. Having worked with Icon very intensely for several years now, I feel I can report effectively on it, and would like to do so here. I would only offer my standard word of warning: I'm trained as a philologist, and have never taken a CS or programming course from any CS department in any university I've attended.

First, let's speak diachronically. How did Icon evolve? A simple illustration will suffice here, I think:

SNOBOL ----> SL5 -----\
                       \----> Icon
                       /
ALGOL ---> Pascal ----/

Along about '76 or so, the people at the U of Arizona working on SL5 suddenly realized that the evaluation mechanisms originally confined to string scanning in SNOBOL could actually be generalized to the entire language. You might say, "Whaddya think Prolog is?" Prolog, though, is a somewhat constrained implementation of first-order predicate logic :-), and is foreign to most programmers (and even many theorists). Instead of going off on some "tangent," the people at the U of Arizona designed a language that used backtracking and goal-directed evaluation within the context of a more standard, Algol-derived structure. Icon was first implemented in FORTRAN (save the barfing, please), and then later in C (1979 or so?). It's now one of the most widely implemented of the "unknown" programming languages.

Okay, that's my mangled version of Icon's evolution. Now let's talk synchronically (i.e. typologically). One thing I find amusing is that people often put perl and Icon in the same category. About all they have in common is that they are both optimized for string handling of one kind or another. If there is any real relation, it's via awk, which took on a few SNOBOL-ish features (e.g. the ~ "contains" operator). Awk, perl, and Icon all have associative arrays and whatnot, and free the user from having to worry about storage.
Many LISP dialects, though, have these same features, and I don't see LISP as being all that closely related to Icon. Perl and awk are also regexp-based, which makes them very good at recognizing simple languages and patterns (fundamentally these are the same). Icon, though slower at handling these same patterns, is much more of a general-purpose programming language, and is capable of recognizing, and effectively parsing, patterns which cannot be handled by your run-of-the-mill DFA. Icon really isn't very much like perl, except in the very, very general typological sense of being geared for more than low-level systems programming and numerical processing.

Put in practical terms, Icon represents a successful admixture of Prolog-like backtracking mechanisms with an Algol-like structure and SNOBOL-inspired string handling capabilities. It is very strictly, but dynamically, typed, and offers conversion facilities allowing the user to move effortlessly from character set to string to integer or real data types and back again.

Icon is at its best doing string processing, parsing, data conversion, and anything involving a more heuristic, rather than purely algorithmic, approach. Icon is also good for prototyping and for small jobs that would ordinarily be done using awk. I use Icon mainly for medium-scale indexing and concordance programs. I also use it for fairly large-scale text retrieval engines and for things like semiautomatic collation of manuscripts and linguistic analysis of ancient textual corpora.

Icon is clearly at its worst doing anything that requires close interface with the hardware or operating system, or which requires pointer-based access to memory locations. Icon has no pointers, so there is just no way to "get at" anything on this low a level without dipping down into its C interface. Icon also lacks OS-specific I/O capabilities (e.g.
under Unix there is no support for terminfo or termio-based I/O, and interfacing the curses library to it is clumsy, due to Icon's inability to store C pointers without kludges like casting them to ints, and then converting them to its integer data type). If some solution to these problems could be found, Icon would become viable for commercial software systems generally. As it stands, it can be used for certain such projects, but it is not suitable for many others.

Icon is popular among people in the liberal arts, specifically literary and linguistic people. One of them, Alan Corre, has in fact written a book on using Icon. It is also, ironically, popular among language theorists who like to stick offbeat and interesting feathers in their caps. Icon is also, naturally, popular among the many people who have worked on it at one time or another.

Icon would be an ideal first language, since it offers all the advantages of a Pascal, without many of its disadvantages. It handles garbage collection, storage allocation, and necessary type conversions, freeing the beginning programmer to think about things that are more important. It also makes things like mathematical sets, lists, hash tables, and strings into trivially simple data objects, again freeing the programmer to think about the general typology of his or her solution, and not so much about the as-yet irrelevant details of implementation.

From Icon it is not difficult to move into other Algol-derived languages, since the overt structure is basically the same. Icon also gives one a leg up in languages like Prolog, which make use of vaguely similar backtracking mechanisms. Though these advantages would be important for beginning programmers planning to move into other areas, Icon's most valuable asset as an instructional language is that teachers competent to use it would not have to separate out the "humanities" students (who are making up a larger and larger share of low-level CS classes).
Icon has all the facilities that both the "Hum" and "Sci" students would need on the introductory level.

I hope that this posting clears up some of the many misconceptions people seem to have about Icon. If there are inaccuracies, I'm sure that someone who's been more "officially" involved with it can clear them up.

--
   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer