brossard@sic.epfl.ch (Alain Brossard EPFL-SIC/SII) (05/03/91)
While trying to write a perl script, one of my regular expressions didn't
work, and I believe it is due to a bug in perl 4.003.

sasun1[15]$ perl
$pat = ' fwef ';
print $pat =~ /\s*([\S]+)/;     # doesn't work
sasun1[16]$ perl
$pat = ' fwef ';
print $pat =~ /\s*(\S+)/;       # does work
fwefsasun1[17]$ perl
$pat = ' fwef ';
print $pat =~ /\s*([f]+)/;      # works for [f]
fsasun1[18]$ perl
$pat = ' fwef ';
print $pat =~ /\s*(f+)/;        # yep, [f] and f are equivalent
fsasun1[19]$

So, shouldn't [\S]+ be equivalent to \S+ or [\S] to \S?

Another bug I have tickled causes my perl script to core
dump after printing a lot (>>100) of lines with an error (?) message:

Word too long.
Word too long.
Word too long.
/sic/news/spool: write failed, file system is full
(#core will be incomplete)
[7] Segmentation fault news_du
news%sicsun[338]$ cd ../spool
news%sicsun[339]$ dbx /sic/public/bin/perl
warning: cannot read pcb in core file: registers' values may be wrong
Reading symbolic information...
Read 19931 symbols
warning: core file read error: address not in data space
warning: core file read error: address not in data space
warning: core file read error: address not in data space
program terminated by signal SEGV (no mapping at the fault address)
(dbx) where
warning: core file read error: address not in data space
warning: core file read error: address not in data space
warning: core file read error: address not in data space
safemalloc(size = 791621423), line 2526 in "install_public/src/sun4-4.1.1/langages/perl-4.003/util.c"
do_subr(arg = 0x2f2f2f2f, gimme = warning: core file read error: address not in data space
bad data address

I have repeated this "experiment" a few times trying to get a proper
core, but between the file system filling up (core > 36 MBytes) and
unusable cores due to missing -g or dynamic linking, this is the best
I have come up with. The program worked on subsets of news/spool, but
the whole tree makes it croak.
This is on a sun4, Sunos 4.1, with -g, or -O, or -O4 with and without
dynamic linking. I'm including the perl program below in the hope that
it can be reproduced elsewhere. (Any suggestions on how to improve it
will be appreciated):

#!/usr/bin/perl

$spool = '/sic/news/spool';
$data = '/sic/news/lib/groups_size';
$rec_size = 9000;       # if spool directory > nn blocks, go down recursively
$min_size = 200;        # don't report spool directory if size <= nn
$diff_size = 1000;      # warn if changes is > nn blocks
$percent_change = 10;   # warn if changes is > nn percent

# Directories which are too big, they should not be scanned directly
@bigdir = ( 'alt', 'comp', 'rec', 'comp/sys' );

chdir $spool || die "Couldn't chdir to $spool: $!";
if( open( FILE, "<$data" ) ) {
    while( <FILE> ) {
        chop;
        ($group, $size) = split( ' ', $_ );
        $groups{$group} = $size;
    }
    close FILE;
}
&scan( "" );    # Never exits from scan, core dumps first!
print "After scan\n";
print "End\n";

sub scan {
    local($DIR, $dir ) = @_;    # Forgot to local(@du), could this be it?
    @dirs = <$DIR[a-z]*>;
    $dirs = join( ' ', @dirs );
    foreach $dir ( @bigdir ) {
        if ( $dirs =~ /\b$dir\b/ ) {
            push( @scan, $dir );
            $dirs =~ s#\b$dir\b##;
        }
    }
    foreach $dir ( @scan ) {
        &scan( "$DIR$dir/" );
    }
    open( DU, "du -s $dirs|" ) || die "Couldn't exec du: $!\n";
    while( <DU> ) {
        chop;
        ($size, $group) = split( ' ', $_ );
        if( ! ($old_size = $groups{$group}) ) {     # new group
            $new_groups{$group} = $size if $size > $min_size ;
            $groups{$group} = $size if $size > $min_size ;
        } else {
            $diff = $size - $old_size;
            $percent = (100 * $diff)/$old_size;
            if( $percent > $percent_change || $diff > $diff_size ) {
                printf ( "%25s %6d %3d%% increase: %d\n",
                    $group, $size, $percent, $diff);
            }
            $groups{$group} = $size;
        }
        if ( $size > $rec_size ) {
            &scan( "$group/" );
        }
    }
}
--
Alain Brossard, Ecole Polytechnique Federale de Lausanne,
SIC/SII, EL-Ecublens, CH-1015 Lausanne, Suisse
brossard@sasun1.epfl.ch
lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (05/07/91)
In article <1991May3.175219@sic.epfl.ch> brossard@sasun1.epfl.ch writes:
: So, shouldn't [\S]+ be equivalent to \S+ or [\S] to \S?
You won't find any documentation that says it works. I only implemented
it for the lower case versions--it seemed too easy to say [^\s]+.
I suppose I could be argued out of this...
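
To illustrate the workaround mentioned above, here is a minimal sketch
showing the negated class [^\s] standing in for \S inside brackets. It
reuses the test string from the original post; the variable names
$with_class and $with_escape are illustrative only.

```perl
# Sketch: [^\s]+ and \S+ capture the same run of non-whitespace characters.
$pat = ' fwef ';

($with_class)  = $pat =~ /\s*([^\s]+)/;   # negated lowercase class in brackets
($with_escape) = $pat =~ /\s*(\S+)/;      # uppercase escape outside brackets

print "class:  $with_class\n";
print "escape: $with_escape\n";
```

Both captures should hold "fwef", since the two patterns describe the
same set of characters.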
: Another bug I have tickled causes my perl script to core
: dump after printing a lot (>>100) lines with an error (?) message:
:
: Word too long.
: Word too long.
: Word too long.
This is a message from your shell, not from perl. Probably because you said:
: @dirs = <$DIR[a-z]*>;
This makes use of the shell to do globbing, which has its advantages and
its disadvantages. You've just discovered the primary disadvantage--shells
have arbitrary limits.
When writing a program like this, it's better to use opendir() and readdir().
It won't run into the limits of the shell, and it runs faster too.
Larry
bbs@hankel.rutgers.edu (Barry Schwartz) (05/12/91)
lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) writes:
]This is a message from your shell, not from perl. Probably because you said:
]
]: @dirs = <$DIR[a-z]*>;
]
]This makes use of the shell to do globbing, which has its advantages and
]its disadvantages. You've just discovered the primary disadvantage--shells
]have arbitrary limits.
]
]When writing a program like this, it's better to use opendir() and readdir().
]It won't run into the limits of the shell, and it runs faster too.
I just want to make a pitch for readdir(). At first it would
seem easier to use shell globbing, but using readdir is easy
once you start using it, _plus it gives you the power of Perl
regular expressions as your globbing mechanism_. That's saved
me trouble on at least one occasion.
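
As a rough sketch of the approach described above, the shell glob
<$DIR[a-z]*> could be replaced with opendir() and readdir(), with a Perl
regular expression doing the filtering. The subroutine name
list_lowercase and the handle name DIRH are hypothetical and do not
appear in the original script.

```perl
# Sketch: list the entries of a directory whose names start with a
# lowercase letter, without invoking the shell.
# list_lowercase and DIRH are hypothetical names chosen for this example.
sub list_lowercase {
    local($dir) = @_;
    local(@names);
    $dir = '.' if $dir eq '';           # treat "" as the current directory
    opendir(DIRH, $dir) || die "Couldn't opendir $dir: $!\n";
    # The regular expression plays the role of the glob pattern [a-z]*.
    @names = sort grep(/^[a-z]/, readdir(DIRH));
    closedir(DIRH);
    @names;
}

foreach $name (&list_lowercase('.')) {
    print "$name\n";
}
```

Unlike the glob, this returns bare entry names without the directory
prefix, so a caller that needs full paths would have to prepend the
directory itself.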
--
Barry Schwartz bbs@hankel.rutgers.edu trashman@kb2ear.uucp