tchrist@convex.COM (Tom Christiansen) (06/14/90)
In article <18498@well.sf.ca.us> gregs@well.sf.ca.us (Greg Strockbine) writes:
>I'm just starting to look at perl. Is there a good reason
>to use it instead of sed, awk, etc.?

That's a good question, the quick answer to which, IMHO, is yes.  I know
this'll probably spark yet another net jihad, but I'm nonetheless going to
try to substantiate that claim.

Most of us have written, or at least seen, shell scripts from hell.  While
often touted as one of UNIX's strengths because they're conglomerations of
small, single-purpose tools, these shell scripts quickly grow so complex
that they're cumbersome and hard to understand, modify, and maintain.

Because perl is one program rather than a dozen others (sh, awk, sed, tr,
wc, sort, grep, ...), it is usually clearer to express yourself in perl
than in sh and allies, and often more efficient as well.  You don't need as
many pipes, temporary files, or separate processes to do the job.  You
don't need to go shoving your data stream out to tr and back and to sed and
back and to awk and back and to sort and back and then back to sed and back
again.  Doing so can often be slow, awkward, and/or confusing.

Anyone who's ever tried to pass command line arguments into a sed script of
moderate complexity or above can attest to the fact that getting the
quoting right is not a pleasant task.  In fact, quoting in the shell is in
general just not a pleasant thing to code or to read.

In a heterogeneous computing environment, the available versions of many
tools vary too much from one system to the next to be utterly reliable.
Does your sh understand functions on all your machines?  What about your
awk?  What about local variables?  It is very difficult to do complex
programming without being able to break a problem up into subproblems of
lesser complexity.  You're forced to resort to using the shell to call
other shell scripts and to allow UNIX's power of spawning processes to
serve as your subroutine mechanism, which is inefficient at best.
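To make the pipeline point concrete, here is a small sketch of my own (not
from the original post): a word-frequency count, the sort of job that often
turns into a tr | sort | uniq -c pipeline, done in a single perl process.
It's written in era-appropriate style (local rather than my), but runs
under modern perl as well.

```perl
# Count word frequencies in one process -- roughly what a
# "tr -cs A-Za-z '\012' | sort | uniq -c" pipeline would do.
sub word_freq {
    local($text) = @_;
    local(%seen, $w);
    foreach $w (split(/[^A-Za-z]+/, $text)) {
        $seen{$w}++ if $w ne '';
    }
    %seen;                      # returned as a flattened key/value list
}

%count = &word_freq("the cat and the hat");
foreach $w (sort keys %count) {
    printf "%5d %s\n", $count{$w}, $w;
}
```

No pipes, no temporary files, no quoting gymnastics: the hash does the
work of sort and uniq together.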
That means your script will require several separate scripts to run, and
getting all of these installed, working, and maintained on all the
different machines in your local configuration is painful.  Maybe if nawk
had been available sooner and for free and for all architectures, I would
use it for more, but it isn't free (yes, there's gawk, but that hasn't been
out long), and it actually isn't powerful enough for some of the things I
need to do.  Perl is free, and its Configure script knows how to compile
perl for a veritable plethora of different hardware and software platforms.

Besides being faster, perl is a more powerful tool than sh, sed, or awk.
I realize these are fighting words in some camps, but so be it.  There
exists a substantial niche between shell programming and C programming
that perl conveniently fills.  Tasks of this nature seem to arise
extremely often in the realm of systems administration.  Since a system
administrator almost invariably has far too much to do to devote a week to
coding up every task before him in C, perl is especially useful for him.
Larry Wall, perl's author, has been known to call it "a shell for C
programmers."

In what ways is perl more powerful than the individual tools?  What
follows is a long list, though not necessarily an exhaustive one.

To begin with, you don't have to worry about arbitrary and annoying
restrictions on string length, input line length, or number of elements in
an array.  These are all virtually unlimited, i.e. limited only by your
system's address space and virtual memory size.

Perl's regular expression handling is far and away the best I've ever
seen.  For one thing, you don't have to remember which tool wants which
particular flavor of regular expressions, or lament the fact that one tool
doesn't allow (..|..) constructs or +'s or \b's or whatever.  With perl,
it's all the same, and as far as I can tell, a proper superset of all the
others.
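As an illustration of the one-flavor point (a sketch of mine, not from the
post), alternation, +, \w, and anchors all work together in the same perl
pattern, with no egrep-versus-sed dialect to keep straight:

```perl
# One regexp flavor for everything: pull the first and last hops
# out of a uucp bang path with alternation-capable patterns.
$path = "uunet!mailrus!umich!pmsmam";
if ($path =~ /^(\w+)!.*!(\w+)$/) {
    ($first, $last) = ($1, $2);     # "uunet" and "pmsmam"
}

# (..|..) alternation, which some seds won't give you:
$ok = ("telnet" =~ /^(rlogin|telnet|ftp)$/);
```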
Perl has a fully functional symbolic debugger (written, of course, in
perl) that is an indispensable aid in debugging complex programs.  Neither
the shell nor sed/awk/sort/tr/... has such a thing.

Perl has a loop control mechanism that's more powerful even than C's.  You
can do the equivalent of a break or continue (last and next in perl) out
of any arbitrary loop, not merely the nearest enclosing one.  You can even
do a kind of continue that doesn't trigger the re-initialization part of a
loop, something you want to do from time to time.

Perl's data types and operators are richer than the shells' or awk's: you
have scalars, numerically-indexed arrays (lists), and string-indexed
(hashed) arrays.  Each of these holds arbitrary data values, including
floating point numbers, for which built-in mathematical subroutines and
power operators are available.

As for operators, to start with you've got all of C's (except for the
addressing operators, which aren't relevant), so unlike in awk, you don't
have to remember whether ~ or ^ or ^= or whatever are really there.
Furthermore, you've got distinct relational operators for strings versus
numeric operations: == for numeric equality (0x10 == 16) and 'eq' for
string equality ('010' ne '8'), and all the other possibilities as well.

You've got a range operator, so you can have expressions like (1..10) or
even ('a'..'zzz').  You can use it to say things like

    if (/^From/ .. /^$/)  { # process mail header

or

    if (/^$/ .. eof)      { # process mail body

There's a string repetition operator, so ('-' x 72) is a row of dashes.

You can operate on entire arrays conveniently, and not just with things
like push and pop and join and split, but also with array slices:

    @a = @b[$i..$j];

and built-in mapcar-like abilities for arrays, like

    for (@list)    { s/^foo//; }

and

    for $x (@list) { $x *= 3; }

or

    @x = grep(!/^#/, @y);

Speaking of lisp, you can generate strings, perhaps with sprintf(), and
then eval them.  That way you can generate code on the fly.
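The /^From/ .. /^$/ fragment above deserves a fuller sketch (mine; the
sample message is invented).  In scalar context, .. is a flip-flop: it
turns true when the left pattern first matches and turns false again after
the right one matches, so it neatly brackets a mail header:

```perl
# Separate a message's header lines from its body lines using the
# scalar (flip-flop) form of the range operator.
sub split_message {
    local(@lines) = @_;
    local(@head, @body);
    foreach (@lines) {
        if (/^From/ .. /^$/) { push(@head, $_); }
        else                 { push(@body, $_); }
    }
    (join("\n", @head), join("\n", @body));
}

($head, $body) =
    &split_message("From alice", "Subject: hi", "", "hello", "bye");
```

Note that the blank separator line lands in the header half, because the
line that matches the right-hand pattern is the last "true" one.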
You can even do lambda-type functions that return newly-created functions
that you can call later.  The scoping of variables is dynamic, fully
recursive subroutines are supported, and you can pass or return any type
of data into or out of your subroutines.

You have a built-in automatic formatter for generating pretty-printed
forms with automatic pagination and headers and center-justified and
text-filled fields like "%(|fmt)s", if you can imagine what that would
actually be were it legal.

There's a mechanism for writing suid programs that can be made more secure
than even C programs, thanks to an elaborate data-tracing mechanism that
understands the "taintedness" of data derived from external sources.  It
won't let you do anything really stupid that you might not have thought
of.

You have access to just about any system-related function or system call,
like ioctl's, fcntl, select, pipe and fork, getc, socket and bind and
connect and attach, and indirect syscall() invocation, as well as things
like getpwuid(), gethostbyname(), etc.  You can read in binary data laid
out by a C program or system call using structure-conversion templates.

At the same time you can get at the high-level shell-type operations like
the -r or -w tests on files or `backquote` command interpolation.  You can
do file globbing with the <*.[ch]> notation or do low-level readdir()s, as
suits your fancy.

Dbm files can be accessed using simple array notation.  This is really
nice for dealing with system databases (aliases, news, ...), for efficient
access to large data sets, and for keeping persistent data.

Don't be dismayed by the apparent complexity of what I've just discussed.
Perl is actually very easy to learn because so much of it derives from
existing tools.  It's like an interpreted C with sh, sed, awk, and a lot
more built into it.

I hope this answers your question.
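As a concrete aside on the structure-conversion templates mentioned above
(a sketch of mine, not from the post; the long/short/8-byte-string record
layout is made up):

```perl
# pack() lays out binary data much as a C struct would; unpack()
# with the same template reads it back.  Note that perl does no
# alignment padding, unlike many C compilers.
$record = pack("l s a8", 1990, 42, "perl");     # 4 + 2 + 8 = 14 bytes
($year, $count, $name) = unpack("l s a8", $record);
$name =~ s/\0+$//;      # "a" fields are NUL-padded; trim the padding
```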
--tom
-- 
Tom Christiansen                        {uunet,uiucdcs,sun}!convex!tchrist
Convex Computer Corporation                             tchrist@convex.COM
"EMACS belongs in <sys/errno.h>: Editor too big!"
wwm@pmsmam.uucp (Bill Meahan) (06/15/90)
OK, I'm convinced - where can I get the LATEST version of perl?

BTW I don't have Internet access so I can't FTP from anywhere.  I used to
have a method, but it doesn't seem to work any more.
-- 
Bill Meahan  WA8TZG			uunet!mailrus!umich!pmsmam!wwm
I don't speak for Ford - the PR department does that!
"stupid cat" is unnecessarily redundant
jv@mh.nl (Johan Vromans) (06/16/90)
In article <103056@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:
| ... lots of reasons why to use perl ...

To which I would like to add the splendid programming support environment
(including a multi-window symbolic debugger) available in GNU Emacs.

| "EMACS belongs in <sys/errno.h>: Editor too big!"

That's the one...

	Johan
-- 
Johan Vromans				jv@mh.nl via internet backbones
Multihouse Automatisering bv		uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands	phone/fax: +31 1820 62944/62500
------------------------ "Arms are made for hugging" -------------------------
merlyn@iwarp.intel.com (Randal Schwartz) (09/13/90)
In article <1990Sep11.211401.1556@ccu1.aukuni.ac.nz>, russell@ccu1
(Russell J Fulton;ccc032u) writes:
| I noticed the two scripts posted in response to a request for reaper
| programs were Pearl scripts. We are relatively new to UNIX and I have
| not come across Pearl before. Could some kind soul please send me a
| brief description of Pearl, and information on where to get it. (Or a
| pointer to where I can get the information.)
|
| It looks like a powerful tool for doing admin work!

A public reply, in case there are other lurkers with the same request...

Perl is a freely available (under the GNU Copyleft) program written by
Larry Wall, the author of the 'rn' newsreader (and a prolific hacker and
writer, I might add).  Perl is a mixture of sed, awk, sh, C, and your
favorite wishlist.  It's best at handling nearly any task that you would
have used a convoluted shell script for, and then some.  It's also *very*
portable, thanks to a fairly robust Configure script -- it's even running
on MS-DOS.  Well, here, let me quote the manpage...

NAME
     perl - Practical Extraction and Report Language

DESCRIPTION
     Perl is an interpreted language optimized for scanning arbitrary
     text files, extracting information from those text files, and
     printing reports based on that information.  It's also a good
     language for many system management tasks.  The language is
     intended to be practical (easy to use, efficient, complete) rather
     than beautiful (tiny, elegant, minimal).  It combines (in the
     author's opinion, anyway) some of the best features of C, sed, awk,
     and sh, so people familiar with those languages should have little
     difficulty with it.  (Language historians will also note some
     vestiges of csh, Pascal, and even BASIC-PLUS.)  Expression syntax
     corresponds quite closely to C expression syntax.  Unlike most Unix
     utilities, perl does not arbitrarily limit the size of your
     data--if you've got the memory, perl can slurp in your whole file
     as a single string.  Recursion is of unlimited depth.
     And the hash tables used by associative arrays grow as necessary to
     prevent degraded performance.  Perl uses sophisticated pattern
     matching techniques to scan large amounts of data very quickly.
     Although optimized for scanning text, perl can also deal with
     binary data, and can make dbm files look like associative arrays
     (where dbm is available).  Setuid perl scripts are safer than C
     programs through a dataflow tracing mechanism which prevents many
     stupid security holes.  If you have a problem that would ordinarily
     use sed or awk or sh, but it exceeds their capabilities or must run
     a little faster, and you don't want to write the silly thing in C,
     then perl may be for you.  There are also translators to turn your
     sed and awk scripts into perl scripts.

OK, enough hype.  Perl can be fetched anon-ftp from
devvax.jpl.nasa.gov:/pub/perl.3.0, as well as anon-uucp from osu-cis.
Many other sites also stock Perl.  Perl was posted to comp.sources.unix
a while back.

Support is excellent.  Perl has its own newsgroup "comp.lang.perl" and
mailing list "perl-users-request@virginia.edu".  Larry reads and posts
frequently, and in his spare time even answers private mail questions.
He's fast to respond to bugs and feature wishlists.  (Perl 3.0 is already
at patchlevel 28 after having been released only 9 months ago.)  Perl
comes with a 70-page manpage, and will soon be documented in a Nutshell
Handbook with over 200 pages of full examples, detailed descriptions of
operations, tutorials, and cookbook-style Perl recipes.

Read (and post to!) comp.lang.perl for further information.

print "Just another Perl hacker,"
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
lisch@mentor.com (Ray Lischner) (10/19/90)
Compiling Perl, patchlevel 36, on an HP Apollo DN4500 running Domain/OS
10.3, using the native C compiler, version 6.7 (316), the optimizer
generates bad code for eval.c.  Using -opt 2 works (or -W0,-opt,2 for
/bin/cc).
-- 
Ray Lischner        UUCP: {uunet,apollo,decwrl}!mntgfx!lisch
david@cs.odu.edu (Wm. David Vegh) (02/27/91)
I am rather new to the perl scene and am looking for a book/reference
that will help me learn it; any suggestions?  Also, which is the latest
version of perl?  (and where to ftp it from...)

Thanks in advance,
David
================
david@cs.odu.edu
emv@ox.com (Ed Vielmetti) (04/03/91)
(is there a uucp mapping project member in the house?)

i'm interested in developing code that does sanity checking on uucp map
entries.  ideally you would feed it one of the many files in
comp.mail.maps and it would flag any errors and format the information in
the map uniformly.  it would be good to make it interactive or batch, so
that errors could be fixed or so that it could be used to explore the
uucp maps.

this could well make use of a number of external databases or servers;
for instance, you'd like to get the latitude/longitude right, either from
the zip code or from the city name.  the telephone number should match up
similarly.  map entries which were too old (per the #W line) would be
flagged as such.  with the help of pathalias it could form a nice
browser; if you want to see who is connected to whom, and both ends of
the link, it should be straightforward to trace the path.

just thinking about all the possible things you might want to do, and all
the external dbm's you would want to keep around between runs so that you
could make lookups arbitrarily quick.  say you want to answer the query
"show me all the sites in alabama connected to uunet".  or "where is
handwriting research corp".  or, to the extent that people put real
information in their maps, "who is running news and mail on a mac".

thoughts?  ideas?  specs?  working code :-) ?

the geographic name server at martini.eecs.umich.edu port 3000 is a home
for some of this information.  there has to be a telco database lying
around somewhere to get at least rough agreement on phone numbers.  i
have a skeleton command line parser that could be thrown at the
interactive part.

send me mail or post, i'll summarize as needed.  if you have ideas,
comp.mail.uucp would be best; if you have code, contact me and i'll try
to coordinate mushing things together.
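A rough perl sketch of the kind of checker described above.  The #N, #L,
and #W field tags follow the comp.mail.maps conventions as I recall them;
treat the specifics as assumptions, and the sample entry as invented:

```perl
# Given the lines of one uucp map entry, return a list of complaints.
# Field tags (#N site name, #L lat/long, #W who-and-when) are assumed
# from memory of the map format -- check comp.mail.maps for the real
# specification.
sub check_entry {
    local(@lines) = @_;
    local(%field, @gripes);
    foreach (@lines) {
        $field{$1} = $2 if /^#([A-Z])\s*(.*)/;
    }
    push(@gripes, "missing #N site name")         unless defined $field{'N'};
    push(@gripes, "missing #W who-and-when line") unless defined $field{'W'};
    push(@gripes, "no latitude/longitude (#L)")   unless defined $field{'L'};
    @gripes;
}

@gripes = &check_entry("#N mysite",
                       "#L 42 17 N / 83 44 W",
                       "mysite\tuunet(DAILY)");
```

Staleness checks on the #W date, phone-number sanity, and pathalias
integration would hang off the same %field hash.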
-- 
Msen	Edward Vielmetti
/|---	moderator, comp.archives
	emv@msen.com

"With all of the attention and publicity focused on gigabit networks,
not much notice has been given to small and largely unfunded research
efforts which are studying innovative approaches for dealing with
technical issues within the constraints of economic science."  RFC 1216