rfinch@caldwr.UUCP (Ralph Finch) (04/08/89)
Perl: What is it? What does it do? How do I get it?

Thanks,
-- 
Ralph Finch
...ucbvax!ucdavis!caldwr!rfinch
gregs@well.sf.ca.us (Greg Strockbine) (06/14/90)
I'm just starting to look at perl. Is there a good reason to use it instead of sed, awk, etc.?
brnstnd@kramden.acf.nyu.edu (06/14/90)
In article <18498@well.sf.ca.us> gregs@well.sf.ca.us (Greg Strockbine) writes:
> I'm just starting to look at perl. Is there a good reason
> to use it instead of sed, awk, etc.?

Sure: it can handle much longer lines. Much much longer lines. It can do anything your shell can. It has a reasonably pleasant syntax. It is, on the other hand, somewhat more difficult to program efficiently than a judicious combination of sh, sed, and awk.

---Dan
tchrist@convex.COM (Tom Christiansen) (06/14/90)
In article <18498@well.sf.ca.us> gregs@well.sf.ca.us (Greg Strockbine) writes:
>I'm just starting to look at perl. Is there a good reason
>to use it instead of sed, awk, etc.?

That's a good question, the quick answer to which, IMHO, is yes. I know this'll probably spark yet another net jihad, but I'm nonetheless going to try to substantiate that claim.

Most of us have written, or at least seen, shell scripts from hell. While often touted as one of UNIX's strengths because they're conglomerations of small, single-purpose tools, these shell scripts quickly grow so complex that they're cumbersome and hard to understand, modify, and maintain.

Because perl is one program rather than a dozen others (sh, awk, sed, tr, wc, sort, grep, ...), it is usually clearer to express yourself in perl than in sh and its allies, and often more efficient as well. You don't need as many pipes, temporary files, or separate processes to do the job. You don't need to go shoving your data stream out to tr and back and to sed and back and to awk and back and to sort and back and then back to sed and back again. Doing so can often be slow, awkward, and/or confusing.

Anyone who's ever tried to pass command line arguments into a sed script of moderate complexity or above can attest to the fact that getting the quoting right is not a pleasant task. In fact, quoting in general in the shell is just not a pleasant thing to code or to read.

In a heterogeneous computing environment, the available versions of many tools vary too much from one system to the next to be utterly reliable. Does your sh understand functions on all your machines? What about your awk? What about local variables?

It is very difficult to do complex programming without being able to break a problem up into subproblems of lesser complexity. You're forced to resort to using the shell to call other shell scripts, letting UNIX's power of spawning processes serve as your subroutine mechanism, which is inefficient at best. That means your script will require several separate scripts to run, and getting all of these installed, working, and maintained on all the different machines in your local configuration is painful.

Maybe if nawk had been available sooner, for free, and for all architectures, I would use it for more, but it isn't free (yes, there's gawk, but that hasn't been out long) and it actually isn't powerful enough for some of the things I need to do. Perl is free, and its Configure script knows how to compile perl for a veritable plethora of different hardware and software platforms.

Besides being faster, perl is a more powerful tool than sh, sed, or awk. I realize these are fighting words in some camps, but so be it. There exists a substantial niche between shell programming and C programming that perl conveniently fills. Tasks of this nature seem to arise extremely often in the realm of systems administration. Since a system administrator almost invariably has far too much to do to devote a week to coding up every task before him in C, perl is especially useful for him. Larry Wall, perl's author, has been known to call it "a shell for C programmers."

In what ways is perl more powerful than the individual tools? The list is pretty long, and what follows is not necessarily exhaustive.

To begin with, you don't have to worry about arbitrary and annoying restrictions on string length, input line length, or number of elements in an array. These are all virtually unlimited, i.e. limited only by your system's address space and virtual memory size.
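To make the point about pipes concrete, consider a job the shell would do with four processes, say

    tr -d '\015' < datafile | awk -F: '{ print $2 }' | sort | uniq -c

(the colon-delimited datafile and the choice of field are invented for illustration). A sketch of the same job in perl runs as a single process:

    # count occurrences of the second colon-separated field,
    # stripping carriage returns along the way -- no pipes,
    # no temporary files, one process
    while (<>) {
        tr/\015//d;                      # what tr -d '\015' did
        @F = split(/:/);                 # what awk -F: did
        $count{$F[1]}++;                 # accumulate the counts
    }
    foreach $field (sort keys %count) {  # what sort | uniq -c did
        print "$count{$field} $field\n";
    }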
Perl's regular expression handling is far and away the best I've ever seen. For one thing, you don't have to remember which tool wants which particular flavor of regular expressions, or lament the fact that one tool doesn't allow (..|..) constructs, or +'s, or \b's, or whatever. With perl, it's all the same, and as far as I can tell, a proper superset of all the others.

Perl has a fully functional symbolic debugger (written, of course, in perl) that is an indispensable aid in debugging complex programs. Neither the shell nor sed/awk/sort/tr/... has such a thing.

Perl has a loop control mechanism that's more powerful even than C's. You can do the equivalent of a break or continue (last and next in perl) on any arbitrary loop, not merely the nearest enclosing one. You can even do a kind of continue that doesn't trigger the re-initialization part of a loop, something you want to do from time to time.

Perl's data types and operators are richer than the shell's or awk's, because you have scalars, numerically-indexed arrays (lists), and string-indexed (hashed) arrays. Each of these holds arbitrary data values, including floating point numbers, for which built-in mathematical subroutines and a power operator are available.

As for operators, to start with you've got all of C's (except for the addressing operators, which aren't relevant), so unlike in awk, you don't have to remember whether ~ or ^ or ^= or whatever are really there. Furthermore, you've got distinct relational operators for strings versus numeric operations: == for numeric equality (0x10 == 16) and 'eq' for string equality ('010' ne '8'), and all the other possibilities as well.

You've got a range operator, so you can have expressions like (1..10) or even ('a'..'zzz'). You can use it to say things like

    if (/^From/ .. /^$/) { # process mail header

or

    if (/^$/ .. eof) { # process mail body

There's a string repetition operator, so ('-' x 72) is a row of dashes.

You can operate on entire arrays conveniently, and not just with things like push and pop and join and split, but also with array slices:

    @a = @b[$i..$j];

and built-in mapcar-like abilities for arrays, like

    for (@list) { s/^foo//; }

and

    for $x (@list) { $x *= 3; }

or

    @x = grep(!/^#/, @y);

Speaking of lisp, you can generate strings, perhaps with sprintf(), and then eval them. That way you can generate code on the fly. You can even do lambda-type functions that return newly-created functions that you can call later. The scoping of variables is dynamic, fully recursive subroutines are supported, and you can pass or return any type of data into or out of your subroutines.

You have a built-in automatic formatter for generating pretty-printed forms with automatic pagination and headers and center-justified and text-filled fields like "%(|fmt)s", if you can imagine what that would actually be were it legal.

There's a mechanism for writing suid programs that can be made more secure than even C programs, thanks to an elaborate data-tracing mechanism that understands the "taintedness" of data derived from external sources. It won't let you do anything really stupid that you might not have thought of.

You have access to just about any system-related function or system call, like ioctl's, fcntl, select, pipe and fork, getc, socket and bind and connect and accept, and indirect syscall() invocation, as well as things like getpwuid(), gethostbyname(), etc. You can read in binary data laid out by a C program or system call using structure-conversion templates.
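A small runnable sketch pulling together the labeled loop control and the range operator just shown (the mailbox accounting itself is invented for illustration):

    # count header and body lines in a mail message on stdin; in a
    # scalar context, .. flips true at the left pattern and goes
    # false again after the right one matches
    LINE: while (<STDIN>) {
        next LINE if /^>From /;    # next can name any enclosing loop
        if (/^From / .. /^$/) {
            $header++;             # from "From " through the blank line
        }
        else {
            $body++;               # everything after the blank line
        }
    }
    print "header lines: $header, body lines: $body\n";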
At the same time you can get at the high-level shell-type operations, like the -r or -w tests on files or `backquote` command interpolation. You can do file globbing with the <*.[ch]> notation or do low-level readdir()s as suits your fancy.

Dbm files can be accessed using simple array notation. This is really nice for dealing with system databases (aliases, news, ...), for efficient access mechanisms over large data sets, and for keeping persistent data; a short sketch follows at the end of this article.

Don't be dismayed by the apparent complexity of what I've just discussed. Perl is actually very easy to learn because so much of it derives from existing tools. It's like interpreted C with sh, sed, awk, and a lot more built into it.

I hope this answers your question.

--tom
-- 
Tom Christiansen                          {uunet,uiucdcs,sun}!convex!tchrist
Convex Computer Corporation                               tchrist@convex.COM
"EMACS belongs in <sys/errno.h>: Editor too big!"
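The dbm access mentioned above, as a quick sketch; it assumes a perl built with dbm support, and the file name and the persistent counter are invented for illustration:

    # bind the associative array %seen to the dbm files /tmp/seen.dir
    # and /tmp/seen.pag; its entries persist from one run to the next
    dbmopen(%seen, "/tmp/seen", 0666) || die "dbmopen: $!\n";
    ($user) = getpwuid($<);         # login name of the real uid
    $seen{$user}++;                 # plain array notation updates the database
    print "$user has run this $seen{$user} time(s)\n";
    dbmclose(%seen);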
chris@utgard.uucp (Chris Anderson) (06/14/90)
In article <18498@well.sf.ca.us> gregs@well.sf.ca.us (Greg Strockbine) writes:
>I'm just starting to look at perl. Is there a good reason
>to use it instead of sed, awk, etc.?

Absolutely! It does much more for you than any of the other standard utilities. Anything you can do in them, you can do in perl, usually faster and more portably. Its regular expression handling is better than that supplied with egrep or sed, it is much more efficient than anything I've used before for text manipulation, and you can use it on binary files as well.

I use it a lot for systems administration duties, since the scripts will run without change on multiple machines. (Be careful, though: perl includes functions for dealing with sockets, symbolic links, and file locking if you compile it on a BSD machine; AT&T doesn't have those features yet.) But you can test at runtime for missing features, as in the sketch at the end of this article, so you can still write fairly portable scripts using those functions.

Read comp.lang.perl for a while, and don't be scared off by the syntax.

Chris
-- 
| Chris Anderson                                                        |
| QMA, Inc.                email : {csusac,sactoh0}!utgard!chris        |
|----------------------------------------------------------------------|
| My employer never listens to me, so why should he care what I say?    |
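The runtime test Chris mentions usually amounts to wrapping the dubious call in an eval and checking $@ afterward. A sketch, using symlink() as the possibly-missing function (the file names are invented for illustration):

    # on a perl built without symlink(), the call dies inside the
    # eval with an "unimplemented" message instead of killing us
    eval 'symlink("", "")';
    $have_symlinks = ($@ !~ /unimplemented/);

    if ($have_symlinks) {
        symlink("/tmp/data", "/tmp/data.lnk") || die "symlink: $!\n";
    }
    else {
        link("/tmp/data", "/tmp/data.lnk") || die "link: $!\n";
    }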
wwm@pmsmam.uucp (Bill Meahan) (06/15/90)
OK, I'm convinced - where can I get the LATEST version of perl?

BTW I don't have Internet access so I can't FTP from anywhere. I used to have a method, but it doesn't seem to work any more.
-- 
Bill Meahan  WA8TZG		uunet!mailrus!umich!pmsmam!wwm
I don't speak for Ford - the PR department does that!

"stupid cat" is unnecessarily redundant
jv@mh.nl (Johan Vromans) (06/16/90)
In article <103056@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:
| ... lots of reasons why to use perl ...

To which I would like to add the splendid programming support environment (including a multi-window symbolic debugger) available in GNU Emacs.

| "EMACS belongs in <sys/errno.h>: Editor too big!"

That's the one...

Johan
-- 
Johan Vromans				jv@mh.nl via internet backbones
Multihouse Automatisering bv		uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands	phone/fax: +31 1820 62944/62500
------------------------ "Arms are made for hugging" -------------------------
awol@vpnet.chi.il.us (Al Oomens) (12/21/90)
Does anyone have any documentation on the perl language? I have a copy of perl that was ported to MS-DOS, but there was no documentation whatsoever! If you have any documentation you could mail, please let me know first - please don't just mail me a large doc file. I'll respond by mail; that way I won't get several copies mailed to me from all over. Thanks!

Al