brossard@sic.epfl.ch (Alain Brossard EPFL-SIC/SII) (05/03/91)
While trying to write a perl script, one of my regular
expressions didn't work, and I believe it is due to a bug in perl 4.003.
sasun1[15]$ perl
$pat = ' fwef ';
print $pat =~ /\s*([\S]+)/; # doesn't work
sasun1[16]$ perl
$pat = ' fwef ';
print $pat =~ /\s*(\S+)/; # does work
fwefsasun1[17]$ perl
$pat = ' fwef ';
print $pat =~ /\s*([f]+)/; # works for [f]
fsasun1[18]$ perl
$pat = ' fwef ';
print $pat =~ /\s*(f+)/; # yep, [f] and f are equivalent
fsasun1[19]$
So, shouldn't [\S]+ be equivalent to \S+, or [\S] to \S?
Another bug I have tickled causes my perl script to core
dump after printing a lot (>>100) of lines with an error (?) message:
Word too long.
Word too long.
Word too long.
/sic/news/spool: write failed, file system is full (#core will be incomplete)
[7] Segmentation fault news_du
news%sicsun[338]$ cd ../spool
news%sicsun[339]$ dbx /sic/public/bin/perl
warning: cannot read pcb in core file: registers' values may be wrong
Reading symbolic information...
Read 19931 symbols
warning: core file read error: address not in data space
warning: core file read error: address not in data space
warning: core file read error: address not in data space
program terminated by signal SEGV (no mapping at the fault address)
(dbx) where
warning: core file read error: address not in data space
warning: core file read error: address not in data space
warning: core file read error: address not in data space
safemalloc(size = 791621423), line 2526 in "install_public/src/sun4-4.1.1/langages/perl-4.003/util.c"
do_subr(arg = 0x2f2f2f2f, gimme = warning: core file read error: address not in data space
bad data address
I have repeated this "experiment" a few times trying to get a proper
core, but between the file system filling up (core > 36 MBytes)
and cores made unusable by a missing -g or by dynamic linking,
this is the best I have come up with.
The program worked on subsets of news/spool, but the whole tree makes
it croak.
This is on a sun4, Sunos 4.1, with -g, or -O, or -O4 with and
without dynamic linking. I'm including the perl program below in the
hope that it can be reproduced elsewhere. (Any suggestions on how to
improve it will be appreciated):
#!/usr/bin/perl
$spool = '/sic/news/spool';
$data = '/sic/news/lib/groups_size';
$rec_size = 9000; # if spool directory > nn blocks, go down recursively
$min_size = 200; # don't report spool directory if size <= nn
$diff_size = 1000; # warn if changes is > nn blocks
$percent_change = 10; # warn if changes is > nn percent
# Directories which are too big, they should not be scanned directly
@bigdir = ( 'alt', 'comp', 'rec', 'comp/sys' );
chdir $spool || die "Couldn't chdir to $spool: $!";
if( open( FILE, "<$data" ) ) {
    while( <FILE> ) {
        chop;
        ($group, $size) = split( ' ', $_ );
        $groups{$group} = $size;
    }
    close FILE;
}
&scan( "" ); # Never exits from scan, core dumps first!
print "After scan\n";
print "End\n";
sub scan {
    local($DIR, $dir ) = @_; # Forgot to local(@du), could this be it?
    @dirs = <$DIR[a-z]*>;
    $dirs = join( ' ', @dirs );
    foreach $dir ( @bigdir ) {
        if ( $dirs =~ /\b$dir\b/ ) {
            push( @scan, $dir );
            $dirs =~ s#\b$dir\b##;
        }
    }
    foreach $dir ( @scan ) {
        &scan( "$DIR$dir/" );
    }
    open( DU, "du -s $dirs|" ) || die "Couldn't exec du: $!\n";
    while( <DU> ) {
        chop;
        ($size, $group) = split( ' ', $_ );
        if( ! ($old_size = $groups{$group}) ) { # new group
            $new_groups{$group} = $size if $size > $min_size ;
            $groups{$group} = $size if $size > $min_size ;
        } else {
            $diff = $size - $old_size;
            $percent = (100 * $diff)/$old_size;
            if( $percent > $percent_change || $diff > $diff_size ) {
                printf ( "%25s %6d %3d%% increase: %d\n",
                    $group, $size, $percent, $diff);
            }
            $groups{$group} = $size;
        }
        if ( $size > $rec_size ) { &scan( "$group/" ); }
    }
}
--
Alain Brossard, Ecole Polytechnique Federale de Lausanne,
SIC/SII, EL-Ecublens, CH-1015 Lausanne, Suisse
brossard@sasun1.epfl.ch
lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (05/07/91)
In article <1991May3.175219@sic.epfl.ch> brossard@sasun1.epfl.ch writes:
: So, shouldn't [\S]+ be equivalent to \S+, or [\S] to \S?
You won't find any documentation that says it works. I only implemented
it for the lower case versions--it seemed too easy to say [^\s]+.
I suppose I could be argued out of this...
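A minimal sketch of the workaround mentioned above, using the negated class [^\s]+ in place of [\S]+. It is written for a current perl interpreter (strict, warnings, and "my" are assumptions here; perl 4.003 has none of them), and the variable names are illustrative only.

```perl
use strict;
use warnings;

# Same test string as in the original report.
my $pat = ' fwef ';

# \S+ outside a character class: one or more non-whitespace characters.
my ($with_escape) = $pat =~ /\s*(\S+)/;

# [^\s]+ : a negated class built from the lower-case escape, which is
# the form described above as supported inside brackets.
my ($with_negated_class) = $pat =~ /\s*([^\s]+)/;

print "$with_escape $with_negated_class\n";   # prints "fwef fwef"
```

Both patterns capture the same text, so [^\s] can stand in wherever [\S] was intended.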
: Another bug I have tickled causes my perl script to core
: dump after printing a lot (>>100) of lines with an error (?) message:
:
: Word too long.
: Word too long.
: Word too long.
This is a message from your shell, not from perl. Probably because you said:
: @dirs = <$DIR[a-z]*>;
This makes use of the shell to do globbing, which has its advantages and
its disadvantages. You've just discovered the primary disadvantage--shells
have arbitrary limits.
When writing a program like this, it's better to use opendir() and readdir().
It won't run into the limits of the shell, and it runs faster too.
Larry
bbs@hankel.rutgers.edu (Barry Schwartz) (05/12/91)
lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) writes:
]This is a message from your shell, not from perl. Probably because you said:
]
]: @dirs = <$DIR[a-z]*>;
]
]This makes use of the shell to do globbing, which has its advantages and
]its disadvantages. You've just discovered the primary disadvantage--shells
]have arbitrary limits.
]
]When writing a program like this, it's better to use opendir() and readdir().
]It won't run into the limits of the shell, and it runs faster too.
I just want to make a pitch for readdir(). At first it would
seem easier to use shell globbing, but using readdir is easy
once you start using it, _plus it gives you the power of Perl
regular expressions as your globbing mechanism_. That's saved
me trouble on at least one occasion.
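As a rough sketch of the opendir()/readdir() approach, with a Perl regular expression doing the job of the shell pattern [a-z]*: the helper name list_lowercase_entries is hypothetical (it does not appear in the script under discussion), and the code assumes a current perl with lexical directory handles rather than perl 4.003.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names in $dir that begin with
# a lowercase ASCII letter, roughly what the shell glob [a-z]* selects.
# No shell is started, so no shell word-length limit applies.
sub list_lowercase_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't opendir $dir: $!";
    # readdir returns bare names (including . and ..); the regex keeps
    # only those starting with a-z, which also drops dot entries.
    my @names = sort grep { /^[a-z]/ } readdir($dh);
    closedir($dh);
    return @names;
}

print join(' ', list_lowercase_entries('.')), "\n";
```

Unlike <$DIR[a-z]*>, readdir returns names without the directory prefix, so a caller that needs full paths would have to prepend the directory itself.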
--
Barry Schwartz bbs@hankel.rutgers.edu trashman@kb2ear.uucp