[comp.sources.bugs] GCC, perl and regexp: Him no work proper!!

gnb@bby.oz (Gregory N. Bond) (07/07/89)

[ This article is in gnu.gcc.bug and comp.sources.bugs because I am
not sure if the error is in gcc or perl.  Please followup to the
correct group depending on your analysis of the error! ]

Environment: GCC 1.35, configure.gcc sun3, SunOS 3.5 Export, Sun 3/60
Software: perl 2, patchlevel 18, and Henry Spencer's REGEXP library
(mod.sources, vol 3 no 89, 19 Jan 1986 (Really pre-cambrian!))

I recently installed GCC 1.35, and recompiled perl to get a bit extra
speed (about 10% on the important tests here).  However, one perl
script ceased to function.  It had argument parsing like this:

	#! /usr/bin/perl -w

	while ($_ = $ARGV[0], /^-/) {
	    shift;
	    if (/^-s/) {
	        $sortflag = 1;
	    }
	}

When run, it produces these error messages:
	Use of uninitialized variable at ./t.perl line 3.
	Use of uninitialized variable at ./t.perl line 5.
	Possible typo: "sortflag" at ./t.perl line 6.

The same script only gave the last error when the exact same perl
source code was compiled with the standard sun compiler.

OK so far?

Now some work using options to gcc and various collections of binaries
eventually discovers the file in error: if the perl file regexp.c is
compiled with cc or gcc without optimisation, then it works as originally
expected.  If regexp.c is compiled gcc -O, then it fails.  All other
objects in the perl binary are irrelevant (cc, gcc, or gcc -O -
doesn't make any difference).

Still with me?  Looks like regexp.c is the problem!

Well the regexp.c file in perl is a hacked copy of Henry Spencer's
regexp library posted to mod.sources in about 1850.  I have a copy of
that library, so I compiled the original regexp with GCC and ran the
regression tests that Henry has thoughtfully provided.  They failed.
It turns out that this failure is actually a portability bug in the
regexp library (the regsub routine does a *(p-1) in situations where p
points to the start of an array).  Once this bug is fixed, then gcc
compiles a regsub library that passes the regexp regression test,
either with or without -O.  And that file (regsub.c) is not even
included in the perl source, so is not the source of my original
problem.
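
For anyone who has not met this class of bug, here is a minimal sketch
of the pattern in C.  It is not the actual regsub.c code: the function
name copy_span and its return convention are made up for illustration.
The point is only that *(p-1) must not be evaluated when p may still
point at the first element of an array, since that reads outside the
object and the result can differ between compilers and optimisation
levels.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from regsub.c): copy len bytes from src to
 * dst with strncpy and return a pointer just past the copied bytes.
 * Returns NULL if the last byte written was a NUL, meaning the copy
 * ran off the end of src.
 */
static char *copy_span(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    strncpy(dst, src, len);
    dst += len;

    /*
     * Unguarded form:  if (*(dst - 1) == '\0')
     * When len == 0 and dst is the start of the buffer, that reads
     * the byte before the array.  Testing len first avoids the read.
     */
    if (len != 0 && *(dst - 1) == '\0')
        return NULL;

    return dst;
}
```

With the guard in place, a zero-length copy into the start of a buffer
simply returns the buffer pointer and never looks at the byte before it.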

I have thus far been unable to isolate any specific problem with the
perl version of regexp.c; nor can I understand enough about the
structure of perl to work out in what way the regexp code is failing
and causing those spurious warnings, as the warnings seem to rely on
code not associated with the regexp code.  As time permits I may try
to make a standalone version of a buggy regexp.c that has different
behaviour when compiled with -O.  I post this in the hope that, perl
being a common program, others have noticed this and have a fix.  I
WANT that 10%!!

Greg.

--
Gregory Bond, Burdett Buckeridge & Young Ltd, Melbourne, Australia
Internet: gnb@melba.bby.oz.au    non-MX: gnb%melba.bby.oz@uunet.uu.net
Uucp: {uunet,pyramid,ubc-cs,ukc,mcvax,prlb2,nttlab...}!munnari!melba.bby.oz!gnb