lawrence@epps.kodak.com (Scott Lawrence) (03/13/91)
I think that I have discovered a problem with substitution and
recursion.  Please someone demonstrate that I am wrong.

I was working on translating simple logical expressions into perl
expressions so that I could eval them.  A simplified version of my
program is included below to demonstrate the problem.

The function Expr uses a substitution operator with both the 'e' and 'g'
modifiers to match each word in an expression, and pass it to a function
(Atom) which returns a perl fragment that could be evaluated to a boolean
value (the evaluation is not important to the demo, so I left it out).

For simple words, Atom just wraps the word in a subroutine invocation (as
in case 'foo' below).  If the word has been defined to be a set (using the
associative array %Set), Atom calls Expr with the value of the set, having
wrapped it in parens.

The problem occurs when a member of one set is another set, so that Expr
is called a third time; the remainder of the original call is lost.

-------------- example
demo% ./demo

Case = 'foo'                     # This case is simple
Expr = 'foo'                     # The expression is just one atom
Atom = 'foo'                     # which is not a set
Result => &In('foo')             # the result is correct

Case = 'test'                    # Slightly more complicated because test
Expr = 'test'                    # is defined as a set by:
Atom = 'test'                    #   $Set{ 'test' } = 'foo & bar';
Expr = '( foo & bar )'           # which works,
Atom = 'foo'                     # foo and bar are not sets
Atom = 'bar'
Result => ( &In('foo') & &In('bar') )     # so the result is correct

Case = 'set'                     # 'set' is defined as a set by
Expr = 'set'                     #   $Set{'set'} = "test | done";
Atom = 'set'                     #
Expr = '( test | done )'         # set is correctly expanded
Atom = 'test'                    # but the first atom (test) is also a set
Expr = '( foo & bar )'           # which is correctly expanded here
Atom = 'foo'                     # and each atom is expanded ok
Atom = 'bar'                     # but result is wrong - Atom is never called
Result => ( ( &In('foo') & &In('bar') )   # for 'done'

The final result should have been:

Result => ( ( &In('foo') & &In('bar') ) | &In('done') )
                                       ^^^^^^^^^^^^^^^^- omitted

I can rewrite this so that it doesn't rely on the s///eg to work, but if
it did work it would be much more elegant.  It looks to me as though I
either have a variable scoping problem or the return stack is getting
messed up.  Any suggestions?

Perl source for demo follows, with my perl version info...

$Header: perly.c,v 3.0.1.10 91/01/11 18:22:48 lwall Locked $
Patch level: 44

SunOS Release 4.1 (GENERIC) #1: Tue Mar 6 17:27:17 PST 1990

----------------- begin demo -----------------
#!/usr/local/bin/perl

$Set{'test'} = "foo & bar";      # one level of substitution
$Set{'set'}  = "test | done";    # recursive substitution

@Cases = ( 'foo', 'test', 'set' );

test: while( $_ = shift @Cases )
{
    print "\nCase = '$_'\n";
    $Result = &Expr( $_ );
    print "Result => $Result\n";
}

sub Expr
{
    local( $Expr ) = $_[0];
    print "Expr = '$Expr'\n";
    $Expr =~ s/(\w+)/&Atom($1)/eg;    # <<<<<<<<<<<<<<<<
    return $Expr;
}

sub Atom
{
    local( $Atom ) = $_[0];
    print "Atom = '$Atom'\n";
    return defined $Set{ $Atom } ? &Expr("( $Set{$Atom} )") : " &In('$Atom') ";
}
----------------------- end of demo ---------------------

--
Scott Lawrence      <lawrence@epps.kodak.com>      Voice: 508-670-4023
Atex Advanced Publishing Systems                   Fax:   508-670-4033
Atex, Inc; 165 Lexington St. MS 400/165L; Billerica MA 01821

brocher@urz.unibas.ch (Dominic Brocher) (03/13/91)
In article <5128@atexnet.UUCP>, lawrence@epps.kodak.com (Scott Lawrence) writes:
> I think that I have discovered a problem with substitution and
> recursion.  Please someone demonstrate that I am wrong.

I have executed your script on a microVAX 3500 running Ultrix 4.1 with the
same version of perl you used:

    This is perl, version 3.0

    $Header: perly.c,v 3.0.1.10 91/01/11 18:22:48 lwall Locked $
    Patch level: 44

I get exactly the same (wrong) result:

    Result => ( ( &In('foo') & &In('bar') )

But on a NeXT running NeXT Mach 1.0 and the same version of Perl I get the
right result!

    Result => ( ( &In('foo') & &In('bar') ) | &In('done') )

I have compiled Perl myself on both machines from the same source code and
they both passed all tests.

I'd really like to know the reason for this behaviour (and have a fix for
it, of course :-)

> --
> Scott Lawrence      <lawrence@epps.kodak.com>      Voice: 508-670-4023
> Atex Advanced Publishing Systems                   Fax:   508-670-4033
> Atex, Inc; 165 Lexington St. MS 400/165L; Billerica MA 01821

-- Dominic
--------------------------------------------------------------------------------
I am not bound to please thee with my answers.     | Dominic Brocher
  -- Shylock, in The Merchant of Venice (IV, 1/65) | brocher@urz.unibas.ch
================================================================================

lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (03/14/91)
In article <5128@atexnet.UUCP> lawrence@epps.kodak.com (Scott Lawrence) writes:
: I think that I have discovered a problem with substitution and
: recursion.  Please someone demonstrate that I am wrong.

You're right, but you'll be wrong when 4.0 comes out.  :-)

When you do a pattern match, the regular expression routines sometimes
save a copy of the input string so that $1, $&, etc. work right after
the pattern match. The substitution operator was depending on this
string to stay there so that it could continue the substitution using
the value of $', more or less. Unfortunately, the recursion clobbered
that temporary value. The do_subst() routine just needed to make sure
it could restore $' after evaluating the right-hand side, and I figured
out a way to do that by manipulating the pointers, so I don't have to
actually copy the contents of $' around.
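
Since the trouble described above comes from recursing while a
substitution is still in flight, one way to sidestep it is to take the
expression apart first and only then recurse.  What follows is a minimal
sketch of that idea, not the original poster's rewrite and not the 4.0
fix.  It assumes a modern Perl (the "my" declarations need Perl 5), and
the lower-case names expr and atom are stand-ins for the demo's Expr and
Atom; the trace prints are left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same set definitions as the demo in the original post.
my %Set = (
    test => 'foo & bar',      # one level of substitution
    set  => 'test | done',    # recursive substitution
);

# Expand an expression without running recursive code inside s///eg.
sub expr {
    my ($expr) = @_;
    my $out = '';

    # split with a capturing group returns the words *and* the text
    # between them, so the whole list is in hand before any recursion.
    for my $piece (split /(\w+)/, $expr) {
        $out .= $piece =~ /\A\w+\z/ ? atom($piece) : $piece;
    }
    return $out;
}

# Expand one word: a set becomes its parenthesised expansion,
# anything else becomes an &In('...') call fragment.
sub atom {
    my ($atom) = @_;
    return defined $Set{$atom}
        ? expr("( $Set{$atom} )")
        : " &In('$atom') ";
}

print expr('set'), "\n";
```

Because each call works on its own list of pieces, a nested call has no
shared match state to clobber, so the 'done' atom of the 'set' case is
still visited after 'test' has been expanded.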

Note, however, that order of evaluation will still be important.  If you say

    s/(whatever)/&recurse($1) . $1/eg;

the first $1 refers to the $1 from this substitution, while the second $1
refers to the $1 from the pattern match done within &recurse.  (I think.)

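
If the right-hand side needs the matched text after calling something
that may run its own pattern match, one cautious approach is to copy $1
into a variable first, so the result does not depend on which match $1
refers to at that point.  A minimal sketch, assuming a modern Perl 5;
the helper shout and the wrapper tag_words are hypothetical names made
up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: it runs its own substitution, which sets its
# own $1 while upper-casing the first word character it finds.
sub shout {
    my ($word) = @_;
    (my $up = $word) =~ s/(\w)/\U$1/;
    return $up;
}

# Replace each word with "<shouted>=<original>".
sub tag_words {
    my ($text) = @_;

    # Copy $1 before calling shout(), then use only the copy, so both
    # halves of the replacement are built from this substitution's match.
    $text =~ s/(\w+)/my $w = $1; shout($w) . "=" . $w/eg;
    return $text;
}

print tag_words("foo bar"), "\n";
```

The copy costs one extra variable per match, but the replacement then
reads the same whichever way the nested match treats $1.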
Larry