stef@zweig.exodus (Stephane Payrard) (12/08/90)
> From: chip@tct.uucp (Chip Salzenberg)
> Newsgroups: comp.lang.perl
> Subject: Re: Needed: a pointer for a perl compare script (long, sorry..)
> Date: 6 Dec 90 17:09:27 GMT
> Organization: Teltronics/TCT, Sarasota, FL
>
> According to goer@quads.uchicago.edu (Richard L. Goerwitz):
> >Perl is not the only language around that is optimized for file,
> >string, and symbol processing, which has associative arrays, and handles
> >sorting and printing elegantly. If you can't think of any examples
> >offhand then mail me, and I'll be glad to provide you with a few.
>
> Come now, Richard. If you criticize in public, you must put up your
> facts in public. Name these other languages. Oh yes, and please
> include availability and cost information.
> --
> Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>
> "I'm really sorry I feel this need to insult some people..."
> -- John F. Haugh II (He thinks HE'S sorry?)

I agree very much with Chip; if you know a better tool than perl, you
should not leave us in the dark. I am curious to know which tools are
better at the dirty tasks that involve string pattern-matching, some
nontrivial processing, and some system calls, with a big file
(say .5 to .1 MB) as main input, in a reasonable amount of time
(say less than 1 minute).

Surely not nawk. I am sure that nawk, sed, ex (or any tool, or
combination of tools, that comes with a "standard" Unix distribution)
would never let me write the kind of programs I have written with perl:
-it lacks the functionality offered by perl
-it lacks the performance of perl
-it offers no direct access to the OS (system calls)

I don't pretend that perl is an answer to every problem, but it is
certainly the best I know of for the class of program I defined in the
first paragraph of this mail.

The idea of combining basic tools using pipes, backquotes or whatever
is a UNIX myth propagated by most of the UNIX books.
Each time I have tried to do a nontrivial task this way, it has turned
out to be almost impossible for a simple-minded guy like me ;-). Each
command or shell has a different set of metacharacters; this makes
combining these atomic tools very tricky ("How many backquotes should I
put before this character?"). Moreover, none of these tools comes with a
debugger. Even if they did, it would not be very useful when your script
is a complicated combination of those tools.

I am sure that when the perl book comes out, it will be easier to learn
perl than to acquire the UNIX expertise necessary to use and combine the
UNIX atomic tools and shells (grep, sed, wc, awk, sh, csh, expr...).
I am confident that, someday, someone will come up with a program that
allows perl to be used as an extensible interactive shell; this will
relegate sh and csh to the rank of historically interesting tools.
In the meantime you need the UNIX expertise, because the perl
documentation constantly refers to the UNIX documentation.

An extreme example of what can be done with perl:
I have written a 600-line program in perl which deals with a PostScript
file generated by FrameMaker; it lets me preview and interactively
browse the corresponding document (using NeWS/TNT); it extracts
information to build menus; one of the menus lets me go directly to any
chapter of the documentation, and keyboard accelerators let me go from
the current page to the next or previous one. This program is able to
browse a .5MB file, generate a data file (used for subsequent runs) and
pop up the browser window in about 30 seconds on a Sun 4/110, assuming
the TNT toolkit is already loaded. Subsequent runs pop up the window in
5-6 seconds. Don't ask for this program: it uses a not-yet-released
version of TNT and makes many assumptions about the browsed file.

I am quite sure Larry never intended perl to be used to write simple
windowed tools, but with perl/TNT, it fits the bill.
It is quite exciting to use NeWS with perl because NeWS is an
interpreter as well. So perl can generate "on the fly" the NeWS code
which deals with the windowed part of the tool. I prefer not to imagine
a program such as the one I described written with whatever X toolkit
and Display PostScript. fooey.

In fact, perl is so powerful that I am very much tempted to use it for
stuff I should write in C. And I will write more in perl if Larry comes
up some day with an equivalent of C structs, because pack() and
unpack() are a horrible kludge. I don't really know how Larry could fit
such an extension into the language, syntactically and semantically.

stef
--
Stephane Payrard -- stef@eng.sun.com -- (415) 336 3726
SMI 2550 Garcia Avenue M/S 10-09 Mountain View CA 94043
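For readers who have not met the pack()/unpack() idiom the post above complains about, here is a minimal sketch of how a fixed-layout record (what C would declare as a struct) is usually emulated in Perl. The record layout, the field names and the helper names `encode_record` and `decode_record` are illustrative assumptions, not anything taken from the original program.

```perl
use strict;
use warnings;

# Hypothetical record layout, roughly
#   struct { char name[8]; uint16_t id; uint32_t size; }
# A8 = 8-byte space-padded string, n = 16-bit big-endian unsigned,
# N  = 32-bit big-endian unsigned.
my $template = 'A8 n N';

# Serialize three fields into a 14-byte string.
sub encode_record {
    my ($name, $id, $size) = @_;
    return pack($template, $name, $id, $size);
}

# Turn the byte string back into named fields.
sub decode_record {
    my ($bytes) = @_;
    my ($name, $id, $size) = unpack($template, $bytes);
    return { name => $name, id => $id, size => $size };
}

my $bytes = encode_record('perl', 3, 600);
my $rec   = decode_record($bytes);
printf "%s %d %d (%d bytes)\n", @{$rec}{qw(name id size)}, length $bytes;
```

The template string carries the whole layout, so the field names live only in the helper code. That separation is arguably the "kludge" being described: nothing ties a field name to its offset the way a struct declaration does.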
goer@ellis.uchicago.edu (Richard L. Goerwitz) (12/08/90)
In article <STEF.90Dec7130410@zweig.exodus> stef@eng.sun.com writes:
>>
>> According to goer@quads.uchicago.edu (Richard L. Goerwitz):
>> >Perl is not the only language around that is optimized for file,
>> >string, and symbol processing, which has associative arrays, and handles
>> >sorting and printing elegantly. If you can't think of any examples
>> >offhand then mail me, and I'll be glad to provide you with a few.
>>
>> Come now, Richard. If you criticize in public, you must put up your
>> facts in public. Name these other languages. Oh yes, and please
>> include availability and cost information.
>
>I agree very much with Chip; if you know a better tool than perl, you
>should not leave us in the dark.

I think everyone is getting the wrong impression. When I posted, I had
just read a description of a very specific problem. I then read a
response in which someone declared perl uniquely able to handle it.
While in some cases this is true, it was not true in the case I had just
read about.

The point was not that there were other tools out there which could
replace perl, but rather that certain features found in perl (e.g. good
string handling facilities, associative arrays, and what not) were by no
means unique, and that for problems which required such facilities, perl
was by no means a unique tool.

I fully expect that once perl stabilizes, and the documentation becomes
readily accessible, it will become widely installed, and will become the
tool of choice for most tasks now whipped together using a bunch of
heterogeneous tools and glued in place with /bin/sh. Perl is filling a
very important niche.

Please continue perling!

-Richard
les@chinet.chi.il.us (Leslie Mikesell) (12/09/90)
In article <1990Dec8.020706.28417@midway.uchicago.edu> goer@ellis.uchicago.edu (Richard L. Goerwitz) writes:
>The point was not that there were other tools out there which could
>replace perl, but rather that certain features found in perl (e.g. good
>string handling facilities, associative arrays, and what not) were by no
>means unique, and that for problems which required such facilities, perl
>was by no means a unique tool.

Ok, sticking to the text handling features relating to the original
question, there may be other languages that would easily sort text by
keys. But there was also a mention of needing to manipulate it when a
match occurred. Does anything else let you do those wonderful
combination test, assign and regexp extract operations like perl's:

    if (($got1,$got2,$got3) = ($var =~ /(pattern1) (pattern2) (pattern3)/)) {
        ... do whatever you want with $got1 etc.
    }

Or handle multi-line regexps like this piece from the example I posted,
where it takes everything between a SUMMARY: line and a STATUS: line in
one item and inserts it before the STATUS: in an update which lacks the
SUMMARY information?

    local ($*) = 1 ;  # multi-line match needed
    [...]
    # snarf summary from old - note multi-line
    if (($status) = $oitems{$oldid} =~ /(^SUMMARY:\n[^\0]*)^STATUS:/) {
        # and insert into new
        substr($nitems{$newid},index($nitems{$newid},"STATUS:\n"),0) = $status ;
    }

Yes, you could loop over the lines (or characters) explicitly, but why?

Les Mikesell
les@chinet.chi.il.us
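The two idioms in the post above can be tried as a small self-contained script. This is only a sketch against current Perl 5, where the `/m` modifier plays the role that `$*` played in 1990; the sample data, the hash keys `old1` and `new1`, and the use of a non-greedy `.*?` with `/s` in place of the original `[^\0]*` are assumptions made for illustration.

```perl
use strict;
use warnings;

# Test, assign and extract in one expression: the list assignment is
# true in the condition only when the match succeeds.
my $var = 'alpha 42 beta';
my @got;
if (my ($word, $num, $rest) = ($var =~ /^(\w+) (\d+) (\w+)$/)) {
    @got = ($word, $num, $rest);
}

# Hypothetical old and new items; the new one lacks its SUMMARY block.
my %oitems = (old1 => "ID: 7\nSUMMARY:\nfirst line\nsecond line\nSTATUS:\nopen\n");
my %nitems = (new1 => "ID: 7\nSTATUS:\nclosed\n");

# /m lets ^ match at every line start, /s lets . match newlines.
if (my ($summary) = $oitems{old1} =~ /(^SUMMARY:\n.*?)^STATUS:/ms) {
    my $pos = index($nitems{new1}, "STATUS:\n");
    # A zero-length substr used as an lvalue inserts without overwriting.
    substr($nitems{new1}, $pos, 0) = $summary if $pos >= 0;
}

print "@got\n";
print $nitems{new1};
```

The check on `$pos` is an addition: `index` returns -1 when the marker is absent, and a negative offset would make `substr` count from the end of the string.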
goer@ellis.uchicago.edu (Richard L. Goerwitz) (12/10/90)
In article <1990Dec09.052353.18018@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
>
>Ok, sticking to the text handling features relating to the original question,
>there may be other languages that would easily sort text by keys. But there
>was also a mention of needing to manipulate it when a match occurred.
>Does anything else let you do those wonderful combination test, assign
>and regexp extract like perl's:
>
>if (($got1,$got2,$got3) = ($var =~ /(pattern1) (pattern2) (pattern3)/)) {
>    ... do whatever you want with $got1 etc.
>}
>
>Or handle multi-line regexps like this piece from the example I posted
>where it takes everything between a SUMMARY: line and STATUS: line
>in one item and inserts it before the STATUS: in an update which lacks
>the SUMMARY information?
>
>local ($*) = 1 ;  # multi-line match needed
>[...]
># snarf summary from old - note multi-line
>if (($status) = $oitems{$oldid} =~ /(^SUMMARY:\n[^\0]*)^STATUS:/) {
>    # and insert into new
>    substr($nitems{$newid},index($nitems{$newid},"STATUS:\n"),0) = $status ;
>}

Again, it's not the string processing tools that make perl unique. It's
the combination of tools and their particularly facile integration with
the operating system that make perl unique.

The regexp stuff you mention above is peanuts in languages like Snobol
and Icon. In fact, regular expressions are felt, by Snobol and Icon
programmers, to be insufficiently powerful for the sorts of things they
do. Multi-line matches, non-regular languages, and other bits of
trickery are the bread and butter of languages like Snobol and Icon.

Note, though, that doing the things you mention above takes more space
in Icon, at least, than in perl - that is, if you restrict yourself to
patterns that can be recognized using a deterministic finite state
automaton. And for this restricted pattern type, perl will probably run
faster than Icon and Snobol (but what about Spitbol?). There are ups
and downs to everything.
I guess what I'm saying is that statements like the one I'm responding
to above indicate that people really don't know about the grand old
tradition of nonnumeric processing we see in systems like COMIT (ee
gads), SNOBOL4, Spitbol, Icon, and offshoots like awk, nawk, and now
languages which incorporate elements from these, like perl.

I really never wanted to get into any argument here. I've never taken a
course from a computer science department in my life (I'm currently
finishing up a PhD in Near Eastern Languages), and I feel out of my
element. When people started taking me to task for saying that perl
wasn't uniquely suited to sorting, hashing, and matching tasks, I guess
I felt I had to say something.

As I've said before, perl is a neat tool, and if it had no usefulness,
I would not be here.

Keep on perling!

-Richard (goer@sophist.uchicago.edu)
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/10/90)
Here are the three biggest things I can't really do in Perl but can do
with (some) other UNIX tools:

1. Compile some large subset of the language to portable C code.

2. Pass descriptors back and forth between programs. This is hellishly
useful for combining programs in different languages, for passing
messages securely, and for minimizing the overhead of a modular resource
controller. Practically every system in existence has some mechanism for
descriptor passing, but Perl doesn't standardize it.

3. Use signal-scheduled (aka non-preemptive) threads. In various
languages I can schedule threads to execute when the program receives a
``signal''---including signals such as ``descriptor 2 is writable,''
``we have just taken control of resource x,'' etc. This makes coroutines
and multithreaded programs a joy rather than a pain to write. Different
kinds of signals are available under different UNIX variants, but Perl
could certainly standardize the basic mechanism.

If Perl had these features, my objections about portability, efficiency,
and interoperability would almost disappear.

---Dan
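Point 3 above describes handlers that run only when an event such as "descriptor is ready" arrives. One way to approximate that in Perl is a select()-driven dispatch loop, sketched below with the core IO::Select module. It is not a thread library and not the mechanism the poster has in mind; the `watch` helper, the `%on_readable` table and the pipe used as an event source are all illustrative assumptions.

```perl
use strict;
use warnings;
use IO::Select;

# A pipe stands in for any descriptor that may become readable.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $sel = IO::Select->new;
my %on_readable;    # stringified handle => callback (hypothetical table)
my @log;            # everything the callbacks have read so far

# Register a callback to run whenever $fh becomes readable.
sub watch {
    my ($fh, $callback) = @_;
    $sel->add($fh);
    $on_readable{$fh} = $callback;
}

watch($reader, sub {
    my ($fh) = @_;
    my $n = sysread($fh, my $buf, 4096);
    if (!$n) {                  # end of file or read error: stop watching
        $sel->remove($fh);
        close $fh;
        return;
    }
    push @log, $buf;
});

# Produce one event, then signal end of input.
syswrite($writer, "hello\n");
close $writer;

# Non-preemptive scheduling: each callback runs to completion before
# the next ready handle is served.
while ($sel->count) {
    for my $fh ($sel->can_read) {
        $on_readable{$fh}->($fh);
    }
}

print "received: ", join('', @log);
```

Because nothing interrupts a callback, shared data such as `@log` needs no locking. The cost is that a slow callback stalls every other handler.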
tchrist@convex.COM (Tom Christiansen) (12/11/90)
In article <9592:Dec920:40:5190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>2. Pass descriptors back and forth between programs. This is hellishly
>useful for combining programs in different languages, for passing
>messages securely, and for minimizing the overhead of a modular resource
>controller. Practically every system in existence has some mechanism for
>descriptor passing, but Perl doesn't standardize it.

I'm not sure what you want here. It's pretty easy in perl to connect
processes through a file descriptor:

    if (open(HANDLE, "|-")) {
        # parent code writes to HANDLE
    } else {
        # child code just reads from STDIN per usual
    }

or else:

    if (open(HANDLE, "-|")) {
        # parent code reads from HANDLE
    } else {
        # child code just writes to STDOUT per usual
    }

(I know -- I didn't check whether open returned undefined.)

You can also play more elaborate games using explicit pipe() calls.
For unrelated processes, you're going to have to use named pipes or
sockets.

How does C offer a more standard mechanism for passing descriptors
which Perl can't use?

--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
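The second form above, with the check for an undefined return that the post admits to skipping, might look like the following in current Perl 5. The lexical handle name `$from_child` and the three sample lines are assumptions made for the sketch.

```perl
use strict;
use warnings;

# A forking open: returns the child's pid in the parent, 0 in the
# child, and undef if the fork failed.
my $pid = open(my $from_child, '-|');
die "cannot fork: $!" unless defined $pid;

my @lines;
if ($pid) {
    # Parent: the child's STDOUT arrives on $from_child.
    @lines = <$from_child>;
    close($from_child) or warn "child exited with status $?\n";
} else {
    # Child: plain writes to STDOUT travel down the pipe.
    print "line $_ from child\n" for 1 .. 3;
    exit 0;
}

print "parent got: $_" for @lines;
```

The explicit `exit 0` in the child matters: without it the child would fall through and run the parent's remaining code as well.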
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (12/12/90)
As quoted from <1990Dec8.020706.28417@midway.uchicago.edu> by goer@ellis.uchicago.edu (Richard L. Goerwitz):
+---------------
| I fully expect that once perl stabilizes, and the documentation begins
| to become readily accessible, it will become widely installed, and will
+---------------

I daresay Perl is more widely installed than Icon. And more widely
installed than nawk.

As far as documentation goes --- the Perl manpage was enough to get me
going in Perl. I have yet to find Griswold & Griswold locally, and I'm
not in a position to order it from Prentice-Hall; the Icon interpreter
sits, compiled but unused, on my machine waiting for me to learn enough
Icon to try to use it. Additionally, the number of Perl examples in the
distribution is more than enough to get one started even without the
manual. (I say "started", not "fully knowledgeable"... but I can't even
get started from the Icon examples.)

On the other hand, I want to learn Icon. Show me something along the
lines of the Perl manpage and I'll see what I can accomplish.

++Brandon
--
Me: Brandon S. Allbery VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery Delphi: ALLBERY
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (12/12/90)
As quoted from <110275@convex.convex.com> by tchrist@convex.COM (Tom Christiansen):
+---------------
| In article <9592:Dec920:40:5190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
| >2. Pass descriptors back and forth between programs. This is hellishly
|
| I'm not sure what you want here. It's pretty easy in perl to
| connect processes through a file descriptor:
+---------------

I think he means ioctl(streamfd, I_SENDFD, fd) or the socket equivalent.

Problem is, I use plenty of machines that *don't* support it. This is
about as portable as that alarm() replacement that uses setitimer...
less so, in fact, as SVR3 with Streams support has I_SENDFD.

++Brandon
goer@quads.uchicago.edu (Richard L. Goerwitz) (12/12/90)
In article <1990Dec12.005636.17687@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
>On the other hand, I want to learn Icon. Show me something along the lines of
>the Perl manpage and I'll see what I can accomplish.

Brief overviews of Icon can be ftp'd from a number of sites, the best
being cs.arizona.edu. Cd to icon/ and grab "technical" reports 90-1,
90-2, and 90-6.

Don't expect Icon to fill perl's shoes. It's not a good system
administration language. It occupies a different niche.

-Richard
worley@compass.com (Dale Worley) (12/13/90)
X-Name: Brandon S. Allbery KB8JRR

   I have yet to find Griswold & Griswold locally, and I'm not in a
   position to order it from Prentice-Hall;

Most bookstores will special order books.

Dale

Dale Worley Compass, Inc. worley@compass.com
--
The workers ceased to be afraid of the bosses. It's as if they suddenly
threw off their chains. -- a Soviet journalist, about the Donruss coal
strike
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/14/90)
In article <1990Dec12.010203.18075@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
> As quoted from <110275@convex.convex.com> by tchrist@convex.COM (Tom Christiansen):
> | In article <9592:Dec920:40:5190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> | >2. Pass descriptors back and forth between programs. This is hellishly
> | I'm not sure what you want here. It's pretty easy in perl to
> | connect processes through a file descriptor:
> I think he means ioctl(streamfd, I_SENDFD, fd) or the socket equivalent.

Yes. I seem to have this huge pile of secure resource managers, all of
which create a descriptor pointing to a secure resource, then fork off
a child process with access to that descriptor. In the latest program I
tried an option for passing the descriptor up to another process, which
would take control. You can't imagine how much better life would be if
there were a standard protocol and library routine for this job.

Add non-preemptive threads to this message-passing language, and it
would finally be conceivable that UNIX system resources be implemented
in---and used by---Perl.

> Problem is, I use plenty of machines that *don't* support it. This is about
> as portable as that alarm() replacement that uses setitimer... less so, in
> fact, as SVR3 with Streams support has I_SENDFD.

Uh, other way around? SVR3 with Streams does indeed have I_SENDFD, which
is why descriptor passing *is* so portable.

---Dan
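The simpler half of the pattern described above, a manager that opens a resource and then forks a child that inherits the descriptor, can be sketched in Perl as follows. It does not cover handing a descriptor to an unrelated or already-running process, which is the case that needs I_SENDFD or its socket equivalent. The anonymous temporary file standing in for the "secure resource" and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

# Stand-in for the secure resource: an anonymous temporary file that
# has no name any other process could open.
open(my $resource, '+>', undef) or die "cannot create temp file: $!";
$resource->autoflush(1);
print {$resource} "secret contents\n";
seek($resource, 0, 0) or die "seek failed: $!";

# A pipe so the child can report what it read.
pipe(my $from_child, my $to_parent) or die "pipe failed: $!";

my $pid = fork();
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: $resource is usable only because the open descriptor was
    # inherited across fork().
    close $from_child;
    my $line = <$resource>;
    print {$to_parent} $line;
    close $to_parent;
    exit 0;
}

# Parent: collect the child's report and reap it.
close $to_parent;
my $reply = <$from_child>;
waitpid($pid, 0);
print "child read: $reply";
```

Inheritance only works downward, from a parent to children created after the open. That limitation is what makes an explicit descriptor-passing call attractive for the "pass it up to another process" case mentioned above.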
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (12/16/90)
As quoted from <15024:Dec1322:59:4090@kramden.acf.nyu.edu> by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
+---------------
| In article <1990Dec12.010203.18075@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
| > as portable as that alarm() replacement that uses setitimer... less so, in
| > fact, as SVR3 with Streams support has I_SENDFD.
|
| Uh, other way around? SVR3 with Streams does indeed have I_SENDFD, which
| is why descriptor passing *is* so portable.
+---------------

Oops. Mental typo. Yeah, but the machine I use most often doesn't have
Streams (we have the add-on package but have yet to install it because
the network board we want to use it with is so unreliable...).

Non-preemptive multithreading: yesterday at work, I laid out a
nonpreemptive thread system of sorts. It's not particularly easy to
rewrite something big like Perl to use the implementation I came up
with, but it's there. (I have some fairly bizarre convolutions between
a 4GL and a Prolog interpreter at work to get a job done --- bizarre it
may be, but it runs 20x faster than the 4GL-only version. The threading
is for the interface to the Prolog, so if necessary I can have more than
one running.)

++Brandon