lawrence@epps.kodak.com (Scott Lawrence) (03/13/91)
I think that I have discovered a problem with substitution and
recursion. Please someone demonstrate that I am wrong.
I was working on translating simple logical expressions into perl
expressions so that I could eval them. A simplified version of
my program is included below to demonstrate the problem.
The function Expr uses a substitution operator with both the 'e'
and 'g' modifiers to match each word in an expression, and pass
it to a function (Atom) which returns a perl fragment that could
be evaluated to a boolean value (the evaluation is not important
to the demo, so I left it out). For simple words, Atom just
wraps the word in a subroutine invocation (as in case 'foo'
below). If the word has been defined to be a set ( using the
associative array %Set ), Atom calls Expr with the value of the
set, having wrapped it in parens. The problem occurs when a
member of one set is another set, so that Expr is called a third
time; the remainder of the original call is lost.
-------------- example
demo% ./demo
Case = 'foo' # This case is simple
Expr = 'foo' # The expression is just one atom
Atom = 'foo' # which is not a set
Result => &In('foo') # the result is correct
Case = 'test' # Slightly more complicated because test
Expr = 'test' # is defined as a set by:
Atom = 'test' # $Set{ 'test' } = 'foo & bar';
Expr = '( foo & bar )' # which works,
Atom = 'foo' # foo and bar are not sets
Atom = 'bar'
Result => ( &In('foo') & &In('bar') ) # so the result is correct
Case = 'set' # 'set' is defined as a set by
Expr = 'set' # $Set{'set'} = "test | done";
Atom = 'set' #
Expr = '( test | done )' # set is correctly expanded
Atom = 'test' # but the first atom (test) is also a set
Expr = '( foo & bar )' # which is correctly expanded here
Atom = 'foo' # and each atom is expanded ok
Atom = 'bar' # but result is wrong - Atom is never called
Result => ( ( &In('foo') & &In('bar') ) # for 'done'
The final result should have been:
Result => ( ( &In('foo') & &In('bar') ) | &In('done') )
                                        ^^^^^^^^^^^^^^^- omitted
I can rewrite this so that it doesn't rely on the s///eg to work,
but if it did work it would be much more elegant. It looks to me
as though I either have a variable scoping problem or the return
stack is getting messed up. Any suggestions? Perl source for
demo follows, with my perl version info...
$Header: perly.c,v 3.0.1.10 91/01/11 18:22:48 lwall Locked $
Patch level: 44
SunOS Release 4.1 (GENERIC) #1: Tue Mar 6 17:27:17 PST 1990
----------------- begin demo -----------------
#!/usr/local/bin/perl
$Set{'test'} = "foo & bar"; # one level of substitution
$Set{'set'} = "test | done"; # recursive substitution
@Cases = ( 'foo', 'test', 'set' );
test: while( $_ = shift @Cases )
{
    print "\nCase = '$_'\n";
    $Result = &Expr( $_ );
    print "Result => $Result\n";
}
sub Expr
{
    local( $Expr ) = $_[0];
    print "Expr = '$Expr'\n";
    $Expr =~ s/(\w+)/&Atom($1)/eg; # <<<<<<<<<<<<<<<<
    return $Expr;
}
sub Atom
{
    local( $Atom ) = $_[0];
    print "Atom = '$Atom'\n";
    return defined $Set{ $Atom }
        ? &Expr("( $Set{$Atom} )") : " &In('$Atom') ";
}
----------------------- end of demo ---------------------
--
Scott Lawrence <lawrence@epps.kodak.com> Voice: 508-670-4023
Atex Advanced Publishing Systems Fax: 508-670-4033
Atex, Inc; 165 Lexington St. MS 400/165L; Billerica MA 01821
brocher@urz.unibas.ch (Dominic Brocher) (03/13/91)
In article <5128@atexnet.UUCP>, lawrence@epps.kodak.com (Scott Lawrence) writes:
> I think that I have discovered a problem with substitution and
> recursion. Please someone demonstrate that I am wrong.
I have executed your script on a microVAX 3500 running Ultrix 4.1 with
the same version of perl you used:
This is perl, version 3.0
$Header: perly.c,v 3.0.1.10 91/01/11 18:22:48 lwall Locked $
Patch level: 44
I get exactly the same (wrong) result:
Result => ( ( &In('foo') & &In('bar') )
But on a NeXT running NeXT Mach 1.0 and the same version of Perl I get
the right result!
Result => ( ( &In('foo') & &In('bar') ) | &In('done') )
I have compiled Perl myself on both machines from the same source code,
and both builds passed all tests. I'd really like to know the reason for
this behaviour (and have a fix for it, of course :-)
> --
> Scott Lawrence <lawrence@epps.kodak.com> Voice: 508-670-4023
> Atex Advanced Publishing Systems Fax: 508-670-4033
> Atex, Inc; 165 Lexington St. MS 400/165L; Billerica MA 01821
--
Dominic
--------------------------------------------------------------------------------
I am not bound to please thee with my answers. | Dominic Brocher
-- Shylock, in The Merchant of Venice (IV, 1/65) | brocher@urz.unibas.ch
================================================================================
lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (03/14/91)
In article <5128@atexnet.UUCP> lawrence@epps.kodak.com (Scott Lawrence) writes:
: I think that I have discovered a problem with substitution and
: recursion. Please someone demonstrate that I am wrong.
You're right, but you'll be wrong when 4.0 comes out. :-)
When you do a pattern match, the regular expression routines sometimes
save a copy of the input string so that $1, $&, etc. work right after
the pattern match. The substitution operator was depending on this
string to stay there so that it could continue the substitution using
the value of $', more or less. Unfortunately, the recursion clobbered
that temporary value. The do_subst() routine just needed to make sure
it could restore $' after evaluating the right-hand side, and I figured
out a way to do that by manipulating the pointers, so I don't have to
actually copy the contents of $' around.
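On a build that still shows the problem, one way to sidestep it is to keep any substitution from being in progress while the recursive call runs. The following is only a minimal sketch of that idea, not the do_subst() fix described above. It assumes modern Perl 5 syntax (it has not been tried on 3.0), and the names expand_expr and expand_atom are made up for illustration. It splits the expression into words and separators first, then expands each word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same set definitions as the original demo.
my %Set = (
    test => 'foo & bar',
    set  => 'test | done',
);

# Expand an expression by tokenizing it first. The capturing group in
# split keeps the words as well as the separators between them, so the
# pieces can be glued back together after each word is expanded.
sub expand_expr {
    my ($expr) = @_;
    my $out = '';
    foreach my $token (split /(\w+)/, $expr) {
        $out .= ($token =~ /^\w+$/) ? expand_atom($token) : $token;
    }
    return $out;
}

# A word that names a set is expanded recursively; any other word is
# wrapped in a call to &In, as in the original Atom routine.
sub expand_atom {
    my ($atom) = @_;
    return defined $Set{$atom}
        ? expand_expr("( $Set{$atom} )")
        : " &In('$atom') ";
}

print expand_expr('set'), "\n";
```

Apart from spacing, the printed line should include the | &In('done') term that the s///eg version drops, because each recursive call works on its own copy of the tokens.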
Note, however, that the order of evaluation will still be important. If you say
    s/(whatever)/&recurse($1) . $1/eg;
the first $1 refers to the $1 from this substitution, while the second $1
refers to the $1 from the pattern match done within &recurse. (I think.)
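Since which $1 is visible after a nested match may depend on the perl version, a defensive habit is to copy $1 into a named variable before calling anything that does its own pattern match. A minimal sketch, assuming modern Perl 5 syntax; recurse and tag are hypothetical routines and the sample text is invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical routine that performs its own pattern match, and so
# sets its own $1: it returns the first character, upper-cased.
sub recurse {
    my ($w) = @_;
    return $w =~ /^(.)/ ? uc $1 : '';
}

# Copy the caller's $1 into a lexical straight away, so the result
# does not depend on what $1 holds after recurse() returns.
sub tag {
    my ($word) = @_;
    return recurse($word) . $word;
}

my $text = 'foo bar';
$text =~ s/(\w+)/tag($1)/eg;
print "$text\n";    # prints "Ffoo Bbar"
```

With the copy in place, the replacement expression never reads $1 after the nested match, so the order-of-evaluation question does not arise.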
Larry