tchrist@convex.COM (Tom Christiansen) (12/11/90)
In article <9592:Dec920:40:5190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>1. Compile some large subset of the language to portable C code.

We usually say "well, but not evals of course." I've a suspicion
that this rules out a lot of code. For example, a user guy mailed
me recently with a problem that had a quick eval answer, and I'm
thinking that saying "no evals in compiled code" really limits
a large subset of the language. Here's the problem:

    I'm still working with the problem I was attempting to describe to
    you last night. It involves a simple search and replace, but the
    delimiting of the search string will vary, and I would like the new
    string to maintain the same variable delimiters. The strings can be
    internally delimited by a combination of underscores, spaces and
    newlines, and externally by newlines and commas. I want to replace
    with the same delimiters. For example:

        search string: my_search_string
        new string:    this_is_it

        may be matched by:      replace should be:
        ------------------      ------------------
        my search string        this is it
        my_search_string        this_is_it
        my_search               this_is
        string                  it
        my                      this
        search string           is it

    and so on. I know there must be some straightforward way to do it,
    but so far I have not figured it out. I've got the general one word
    case, and fixed number of words, but not a variable number solution.

The code he was trying to use was this:

###########################################################################
#!/usr/bin/perl
#
# gl - global replace for variable format strings

$#ARGV == 3 || die "Invalid no. of arguments";

($infile, $outfile, $oldexp, $newexp) = @ARGV;

@old = split(/[ _]/,$oldexp);
@new = split(/[ _]/,$newexp);

open(in,"$infile") || die "Can't open $infile: $!";
open(out,">$outfile") || die "Can't open $outfile: $!";

$foo = <in>;
while ( <in> ) {
    $foo .= $_;
}

#First pass, single line to single line
# new expression may contain underscores
if (!$#old) {
    # The following searches for the label making sure it begins
    # and ends with a space, comma or newline and replaces the
    # label and whatever separators it found around it.
    $foo =~ s/(\d\n|,\n|,)([ ]*)$oldexp([ ]*)(,|\n,|\n\d)/\1\2$newexp\3\4/g;
    print "Finished, output in $outfile.\n";
}

# Multi-line to multi line, equal size
# Need to parameterize for any size
if ($#old) {
    $test = $foo;
    $foo =~ s/(\d\n|,\n|,)([ ]*)$old[0]([ _\n])$old[1]([ _\n])$old[2]([ ]*)(,|\n,|\n\d)/\1\2$new[0]\3$new[1]\4$new[2]\5\6/g;
    print "Finished 2, output in $outfile.\n";
}
print out $foo;
###########################################################################

Which I found to be pretty convoluted. My solution was this:

    #!/usr/local/bin/perl
    # sanity checks first
    die "usage: $0 string1 string2 [files ...]" if @ARGV < 2;
    die "unbalanced underbars"
        unless ($count = $ARGV[0] =~ tr/_/_/) == ($ARGV[1] =~ tr/_/_/);
    die "too many underbars" unless $count < 10;

    ($find = shift) =~ s/[\s_]/([\\s_]+)/g;
    ($repl = shift) =~ s/[\s_]/'$'.++$i/eg;

    print STDERR "replacing all ``$find'' with ``$repl''\n";

    undef $/;
    $_ = <>;
    eval "s/$find/$repl/g";
    print;

Notice that I've used not one but two evals in this little program.
Of course, this is too short to bother wanting to compile (unless
someone has other motivations than speed for compilations), but I
think it illustrates the problem: evals are just too darn convenient.
I don't really want to think about how I might do that if I couldn't
have an eval, but I don't know how to compile it with one either.
--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
pphillip@cs.ubc.ca (Peter Phillips) (12/12/90)
In article <110306@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:
>In article <9592:Dec920:40:5190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>1. Compile some large subset of the language to portable C code.
>
>We usually say "well, but not evals of course." I've a suspicion
>that this rules out a lot of code. For example, a user guy mailed
>me recently with a problem that had a quick eval answer, and I'm
>thinking that saying "no evals in compiled code" really limits
>a large subset of the language. Here's the problem:

[ string replacing problem omitted ]

>Notice that I've used not one but two evals in this little program.
>Of course, this is too short to bother wanting to compile (unless
>someone has other motivations than speed for compilations), but I
>think it illustrates the problem: evals are just too darn convenient.
>I don't really want to think about how I might do that if I couldn't
>have an eval, but I don't know how to compile it with one either.

For some perl scripts, eval is indispensable. The debugger wouldn't
work without it. For other scripts, eval can be replaced by less
powerful operations. Eval is often used to get at the regular
expression compiler built into perl. If perl had a regular
expression variable and a regular expression compile function,
code fragments like:

    eval "s/$find/$repl/g";

could be replaced with the translatable-to-C code version:

    $pat1 = &compile_pattern($find);
    $pat2 = &compile_pattern($repl);
    s/$pat1/$repl/g;

Something like this could be added to perl, I think.

There are other common uses for eval, like simulating references.
I think with the right modifications, most uses of eval could be
eliminated. Perhaps the greatest and wisest perl hackers should
get together, examine their scripts which use eval, and decide
what reasonable extensions to perl would eliminate 90% of the
use for eval.

--
Peter Phillips, pphillip@cs.ubc.ca | "It's worse than that ... He has
{alberta,uunet}!ubc-cs!pphillip    | no brain." -- McCoy, "Spock's Brain"
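[Editor's sketch] The hypothetical compile_pattern above resembles what much later Perl releases provide as the qr// operator, which did not exist when this thread was written. The following is a minimal sketch, assuming a perl of version 5.10 or newer, of the same delimiter-preserving replacement done without a string eval. The helper name replace_keeping_delims is made up for illustration, and it checks word counts rather than underbar counts, so it is not a drop-in copy of Tom's script.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $find with $repl in $text. The words of
# $find may be separated by any run of spaces, underscores, or newlines,
# and each separator actually matched is reused between the words of $repl.
sub replace_keeping_delims {
    my ($find, $repl, $text) = @_;

    my @old = split /[\s_]+/, $find;
    my @new = split /[\s_]+/, $repl;
    die "word counts differ\n" unless @old == @new;

    # Compile the search pattern once; quotemeta keeps the words literal.
    my $src = join '[\s_]+', map { quotemeta($_) } @old;
    my $pat = qr/$src/;

    # The replacement is computed by ordinary code via s///e,
    # so no program text is assembled and re-parsed at run time.
    $text =~ s{$pat}{
        my $matched = $&;                   # copy before matching again
        my @sep = $matched =~ /[\s_]+/g;    # separators as actually found
        join '', map { $new[$_] . ($sep[$_] // '') } 0 .. $#new;
    }ge;

    return $text;
}

print replace_keeping_delims('my_search_string', 'this_is_it',
                             "x,my search\nstring,y"), "\n";
# prints "x,this is" and then "it,y" on the next line
```

Note that s///e still counts as an eval in Tom's bookkeeping, but its code block is fixed when the script is parsed, which is the case Dan argues could be compiled.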
tchrist@convex.COM (Tom Christiansen) (12/12/90)
In article <1990Dec12.064530.22356@cs.ubc.ca> pphillip@cs.ubc.ca (Peter Phillips) writes:
:There are other common uses for eval, like simulating references.
:I think with the right modifications, most uses of eval could be
:eliminated. Perhaps the greatest and wisest perl hackers should
:get together, examine their scripts which use eval, and decide
:what reasonable extensions to perl would eliminate 90% of the
:use for eval.
Yes, although I think in many cases you can use the *foo notation,
and it will be faster, too. I hope that wouldn't be barred as
well, as it's far too useful.
Two other reasons for using eval are for dynamic formats and
for the creatures that h2ph creates, although as I show in h2pl,
these can often be reduced.
Plus don't forget that s///e counts as an eval also.
Let's keep a list here. I also suspect that there'll be a fair number
of perl hackers at USENIX next month. More at the end than at the
beginning if I have anything to do with it. :-)
--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
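[Editor's sketch] To make the *foo point concrete, here is a minimal sketch, with made-up names (@scores, total_by_glob, total_by_eval), of handing an array to a subroutine two ways: by typeglob, and by passing its name through a string eval. It is written for a current perl with strict enabled and assumes the code lives in package main; a perl of this thread's era would spell the declarations differently.

```perl
use strict;
use warnings;

our @scores = (3, 4, 5);
our @alias;                     # package array that the glob assignment aliases

# Typeglob version: the caller passes *scores and the sub aliases it.
sub total_by_glob {
    local *alias = shift;       # @alias now names the caller's array
    my $sum = 0;
    $sum += $_ for @alias;
    return $sum;
}

# String-eval version: the caller passes the array's name as text,
# and the loop is re-parsed on every call.
sub total_by_eval {
    my ($name) = @_;
    my $sum = eval "my \$s = 0; \$s += \$_ for \@main::$name; \$s";
    die $@ if $@;
    return $sum;
}

print total_by_glob(*scores), "\n";     # prints 12
print total_by_eval('scores'), "\n";    # prints 12
```

The glob version involves no run-time parsing, which is presumably why Tom expects it to be faster and hopes a compiler would still allow it.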
weisberg@hpcc01.HP.COM (Len Weisberg) (12/13/90)
Peter Phillips writes:
> For some perl scripts, eval is indispensable. The debugger wouldn't
> work without it. For other scripts, eval can be replaced by less
> powerful operations. Eval is often used to get at the regular
> expression compiler built into perl.
> ... <some supporting details omitted> ...
> There are other common uses for eval, like simulating references.
> I think with the right modifications, most uses of eval could be
> eliminated. Perhaps the greatest and wisest perl hackers should
> get together, examine their scripts which use eval, and decide
> what reasonable extensions to perl would eliminate 90% of the
> use for eval.

Hear, hear!! My opinion exactly!!

Sorry for taking up bandwidth with this, but Peter has said it so well,
I just wanted to underline it.

I think the development outlined here would be a tremendous boost
to the usability and the use of perl.

- Len Weisberg - HP Corp Computing & Services - weisberg@corp.HP.COM
pvo@sapphire.OCE.ORST.EDU (Paul O'Neill) (12/13/90)
In article <110306@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:
> ..............
> eval "s/$find/$repl/g";
> ................
>

Gee, I've always glossed over this eval stuff. Now that I'm paying attention
I'm befuddled. Why is the eval needed, Tom?

Why does the substitution 1/2 work w/o the eval? The $find is parsed and
found but the $repl gets shoved in literally. I just hate it when I don't
have a model that will predict code's behavior and have to "just try it" to
see what it does.

>Notice that I've used not one but two evals in this little program.

Boy, I am dense. Where's the other one?

Thanks.

Paul O'Neill                 pvo@oce.orst.edu   DoD 000006
Coastal Imaging Lab
OSU--Oceanography
Corvallis, OR  97331         503-737-3251
tneff@bfmny0.BFM.COM (Tom Neff) (12/13/90)
I won't be at USENIX but here are my thoughts on compiled Perl:

1. Even with limited functionality it would be a godsend.

2. For many of us, it would be enough to be able to make fast-loadable
   "Perl object files," i.e., write all data structures to disk after
   compilation & before execution. The resulting "compiled scripts"
   would run faster because the parsing pass would be eliminated.
   Especially wonderful with large scripts!

3. A lot of the really troublesome 'eval' examples are hacks for the
   purpose of coaxing a little faster performance out of the
   interpreter. Presumably in exchange for the inherent speed of a
   compiled script you could give some of that up.

4. If the Perl 'eval' compiler were put into a shared library, compiled
   scripts could run and have access to a single, reentrant copy of the
   evaluator if they need it. Scripts themselves could stay small.

--
Anthrax Rampant in Kirghizia:    Oo*oO    Tom Neff
Izvestia Comment -- TASS        * *O* *   tneff@bfmny0.BFM.COM
tchrist@convex.COM (Tom Christiansen) (12/14/90)
From the keyboard of pvo@sapphire.OCE.ORST.EDU (Paul O'Neill):
:In article <110306@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:
:> eval "s/$find/$repl/g";
:Gee, I've always glossed over this eval stuff. Now that I'm paying attention
:I'm befuddled. Why is the eval needed, Tom?
:Why does the substitution 1/2 work w/o the eval? The $find is parsed and
:found but the $repl gets shoved in literally. I just hate it when I don't
:have a model that will predict code's behavior and have to "just try it" to
:see what it does.
Because perl only does one level of evaluation. If you want
more, you have to ask for it. There are $1 and $2 references
inside of $repl.
:>Notice that I've used not one but two evals in this little program.
:Boy, I am dense. Where's the other one?
It's hidden in the substitute that creates repl:
($repl = shift) =~ s/[\s_]/'$'.++$i/eg;
--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
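[Editor's sketch] A small illustration of the one-level rule described above, using made-up sample strings. The pattern side of s/// is interpolated and then compiled as a regex, so a pattern stored in $find works directly. The replacement side is interpolated only once, so a $1 stored inside $repl arrives as literal text unless the whole statement is built as a string and re-parsed with eval.

```perl
use strict;
use warnings;

my $find = '(\w+)=(\w+)';   # sample pattern with two capture groups
my $repl = '$2=$1';         # single quotes: the literal characters $2=$1

# Without eval: $repl is interpolated once and inserted as-is.
(my $plain = 'key=value') =~ s/$find/$repl/g;

# With eval: the string is assembled first and then parsed as the statement
#   s/(\w+)=(\w+)/$2=$1/g
# so $1 and $2 are now real capture references.
$_ = 'key=value';
eval "s/$find/$repl/g";
die $@ if $@;
my $evaled = $_;

print "$plain\n";    # prints $2=$1
print "$evaled\n";   # prints value=key
```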
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/14/90)
In article <110306@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:
> In article <9592:Dec920:40:5190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >1. Compile some large subset of the language to portable C code.
> We usually say "well, but not evals of course."

Even without evals this would make Perl a lot more useful. Of course,
half the advantage disappears if the Perl-in-C library isn't freely
redistributable---but at least that, unlike the entire language, can be
rewritten in pieces. The other half of the advantage stays in any case:
no parsing time, single executable, easy hand optimization, easy use of
fast calculation.

And there's no reason an eval can't be compiled. ``It's too much work to
stick the compiler into the library!'' you say. Well, most evals in
practice are just fixed operations applied to variable string arguments.
There's no reason your example couldn't be compiled into fixed code---
the only parsing left after compilation would be the regexp parsing.

---Dan
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/14/90)
In article <1990Dec12.064530.22356@cs.ubc.ca> pphillip@cs.ubc.ca (Peter Phillips) writes:
> For some perl scripts, eval is indispensable. The debugger wouldn't
> work without it.

I imagine that the debugger would remain one of the advantages of the
interpreted language.

> Perhaps the greatest and wisest perl hackers should
> get together, examine their scripts which use eval, and decide
> what reasonable extensions to perl would eliminate 90% of the
> use for eval.

This is a good idea for any language.

---Dan
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/14/90)
In article <93725765@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
> 2. For many of us, it would be enough to be able to make fast-loadable
> "Perl object files," i.e., write all data structures to disk after
> compilation & before execution.

Supposedly perl -u does that, but it doesn't work on many systems.

As an alternative I might suggest that you try to work my pmckpt
checkpointer into Perl. pmckpt 0.95 (which I just made available for
anonymous ftp from stealth.acf.nyu.edu) has been reported to work on
(gasp) System V machines, as well as my native environment. Both Larry
and Tom seemed slightly interested in the code a few weeks ago, but
appear to have abandoned it (sigh).

The reason pmckpt is so portable, btw, is that it doesn't use setjmp()
or longjmp(). Guess what it uses instead...

---Dan
tneff@bfmny0.BFM.COM (Tom Neff) (12/14/90)
In article <15591:Dec1323:30:2490@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <93725765@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>> 2. For many of us, it would be enough to be able to make fast-loadable
>> "Perl object files," i.e., write all data structures to disk after
>> compilation & before execution.
>
>Supposedly perl -u does that, but it doesn't work on many systems.

Perl -u is supposed to undump your core image to create a SELF
CONTAINED, executable program. Where this does work, the result is
HUGE, bigger than Perl itself (by definition).

What I want is to store JUST the compiled script data, suitable for
immediate interpretation by the regular Perl program. The results
should be quite small, and you save the parsing pass later on.

I think 'checkpointing' would be a good way to go if the results stored
compactly... haven't seen Dan's invention yet, maybe that qualifies.

--
"We plan absentee ownership.  I'll stick to       `o'   Tom Neff
 building ships." -- George Steinbrenner, 1973    o"o   tneff@bfmny0.BFM.COM
gee@client2.DRETOR.UUCP (Thomas Gee ) (12/15/90)
In article <1990Dec12.064530.22356@cs.ubc.ca> pphillip@cs.ubc.ca (Peter Phillips) writes:
>In article <110306@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:
>>In article <9592:Dec920:40:5190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>>1. Compile some large subset of the language to portable C code.
>
>For some perl scripts, eval is indispensable.

A related point on perl compilation.

If I am correct, perl "compiles" the input code to another internal
representation, and interprets the result. This results in a
significant pause at invocation before the program (ie perl script)
begins executing.

Would it be possible to save the internal representation to which the
script is translated and feed that directly into the interpreter? I
have at least one system that uses a "vast" number of perl scripts
which execute in sequence, and the overhead for the initial translation
is noticeable and non-trivial.

I believe this suggestion did come up in the last "where's my perl
compiler" flood, but was never addressed.

Thanks,
Tom.

-------------------------------------------------------------------------------
Thomas Gee      |
Aerospace Group | a man in search of a quote
DCIEM, DND      |
Canada          | gee@dretor.dciem.dnd.ca
-------------------------------------------------------------------------------
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (12/16/90)
As quoted from <93725765@bfmny0.BFM.COM> by tneff@bfmny0.BFM.COM (Tom Neff):
+---------------
| 2. For many of us, it would be enough to be able to make fast-loadable
| "Perl object files," i.e., write all data structures to disk after
| compilation & before execution. The resulting "compiled scripts"
| would run faster because the parsing pass would be eliminated.
| Especially wonderful with large scripts!
+---------------

I mentioned this to Larry once; he pointed out that Perl's internal structures
aren't particularly easy to save/restore in a portable way. Of course, it
might be possible to write(savefd, etext, sbrk(0) - etext), but this is also
nonportable.

+---------------
| 4. If the Perl 'eval' compiler were put into a shared library, compiled
| scripts could run and have access to a single, reentrant copy of the
| evaluator if they need it. Scripts themselves could stay small.
+---------------

...and shared libraries are another nonportable feature. Not to mention
that I have yet to make any sense out of the SVR3 version. (Of course,
that may simply be *my* problem, not a problem with the shared library
implementation.)

I may look into compiling a *subset* of Perl. It wouldn't accept
everything, and it might not treat everything the same as the
interpreter does (i.e. "do" would be treated as an include request...
although most uses of this are now subsumed by "require"), but the
speed increase would probably be worth the loss in functionality, as
you say. Of course, I need to find time to do this (grrr!).

++Brandon
--
Me: Brandon S. Allbery                      VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG                Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR                      AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery    Delphi: ALLBERY
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/17/90)
In article <1990Dec15.161911.27401@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
> I mentioned this to Larry once; he pointed out that Perl's internal structures
> aren't particularly easy to save/restore in a portable way. Of course, it
> might be possible to write(savefd, etext, sbrk(0) - etext), but this is also
> nonportable.

I wrote pmckpt exactly to prove that a checkpointer *can* be portable.

pmckpt assumes all the basic UNIX process structure. It doesn't make
any allowances for systems that don't conform (except that it
automatically figures out which way your stack grows). Yet people have
reported pmckpt working on several System V variants, as well as BSD.
How much more portable can you get?

---Dan
les@chinet.chi.il.us (Leslie Mikesell) (12/18/90)
In article <1990Dec15.161911.27401@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
>As quoted from <93725765@bfmny0.BFM.COM> by tneff@bfmny0.BFM.COM (Tom Neff):
>+---------------
>| 2. For many of us, it would be enough to be able to make fast-loadable
>| "Perl object files," i.e., write all data structures to disk after
>| compilation & before execution. The resulting "compiled scripts"
>| would run faster because the parsing pass would be eliminated.
>| Especially wonderful with large scripts!
>+---------------

>I mentioned this to Larry once; he pointed out that Perl's internal structures
>aren't particularly easy to save/restore in a portable way. Of course, it
>might be possible to write(savefd, etext, sbrk(0) - etext), but this is also
>nonportable.

A reasonable solution is to not require the saved copy to be portable
or even explicitly saved. Instead, add a statement and/or command line
option to specify a directory to cache the parsed output allowing the
usual expansions of ~/, $HOME, etc. to give a choice between saving in
a public-writable directory or making a private copy for each user.
Then, if the directory exists and some quick checks establish that the
cached copy was written later than the script on a machine with the
same variable types, the parsing pass could be skipped. Otherwise a
parsed copy would be saved in that directory (if permissions allow) for
the new run to use.

I think this would be a big help on machines with slow disks and demand
paged executables since it would likely avoid the need to page in a lot
of the perl program that would otherwise be needed for the compile
pass. It might chew up some disk space, but probably nowhere near to
the extent that perl -u does, and this way you still get the advantage
of shared text when multiple copies of perl are running.

Les Mikesell
les@chinet.chi.il.us
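[Editor's sketch] The "quick checks" described above might look roughly like the following. This is only a sketch under stated assumptions: perl has no such cache, the one-line header layout is invented here, and both sub names (cache_header, cache_is_usable) are hypothetical. The perl version and integer size in the header stand in for the "same variable types" test.

```perl
use strict;
use warnings;
use Config;

# Invented header line identifying the perl that wrote the cache.
sub cache_header {
    return 'perl ' . $] . ' ivsize ' . $Config{ivsize};
}

# Returns 1 if the cached parse in $cache may be used for $script, else 0.
sub cache_is_usable {
    my ($script, $cache) = @_;

    my @s = stat($script) or return 0;
    my @c = stat($cache)  or return 0;   # no cached copy yet
    return 0 unless $c[9] > $s[9];       # field 9 is mtime: cache must be newer

    open my $fh, '<', $cache or return 0;
    my $header = <$fh>;                  # first line of the cache file
    close $fh;
    return 0 unless defined $header;
    chomp $header;

    return $header eq cache_header() ? 1 : 0;
}
```

A caller would fall back to a normal parse, and rewrite the cache if permissions allow, whenever cache_is_usable returns 0.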
tneff@bfmny0.BFM.COM (Tom Neff) (12/18/90)
In article <1990Dec15.161911.27401@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
>As quoted from <93725765@bfmny0.BFM.COM> by tneff@bfmny0.BFM.COM (Tom Neff):
>| 2. For many of us, it would be enough to be able to make fast-loadable
>| "Perl object files," i.e., write all data structures to disk after
>| compilation & before execution. The resulting "compiled scripts"
>| would run faster because the parsing pass would be eliminated.
>| Especially wonderful with large scripts!
>
>I mentioned this to Larry once; he pointed out that Perl's internal structures
>aren't particularly easy to save/restore in a portable way. Of course, it
>                                              ^^^^^^^^
>might be possible to write(savefd, etext, sbrk(0) - etext), but this is also
>nonportable.

Is portability the issue here? This would be a proposed speed
optimization for individual sites. Precompiled scripts would not be
inherently portable across disparate OS's or machine architectures; but
neither are today's UNDUMP executables!

Also, precompiled scripts might not be portable across major Perl
versions even on the same platform; but it would be fairly
straightforward to record the version number at the beginning of the
precompiled script file, so that Perl could check for incompatibilities
before beginning execution.

--
"DO NOT, repeat, DO NOT blow the hatch!"    /)\   Tom Neff
"Roger....hatch blown!"                     \(/   tneff@bfmny0.BFM.COM
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (12/20/90)
As quoted from <12432668@bfmny0.BFM.COM> by tneff@bfmny0.BFM.COM (Tom Neff):
+---------------
| In article <1990Dec15.161911.27401@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
| >aren't particularly easy to save/restore in a portable way. Of course, it
|                                                ^^^^^^^^
+---------------

"Portable" may not be the word. I have used systems where this will
fail because a different execution of a program has a few things at
different addresses, so just restoring the data and bss from a file
leaves pointers dangling. (Consider that stdio is already initialized
by the time the data and bss are loaded.)

++Brandon
--
Me: Brandon S. Allbery                      VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG                Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR                      AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery    Delphi: ALLBERY
flee@cs.psu.edu (Felix Lee) (12/21/90)
Everyone seems to be giving up too easily. I'm nearly convinced that
Perl can be effectively compiled.

I've decided to attempt a Perl to Scheme compiler in my copious spare
time (tm). Don't hold your breath.
--
Felix Lee       flee@cs.psu.edu