lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (09/09/88)
System: perl version 2.0 Patch #: 14 Priority: MEDIUM Subject: case insensitive search speedup Subject: searches should now work on chars with the 8th bit set Subject: plugged memory leak on searches compiled at run time Subject: some patterns such as /[Cc]at/ could fail Subject: /foo/ && s//bar/ could fail Subject: % should now work with a negative left argument Subject: closing a pipe now returns child process status in $? Subject: eof() will no longer dump core when no files are opened with <> Subject: printf no longer drops last argument after %% Subject: printf now works more like C version in weird cases Subject: srand always returns true now Subject: documented necessity of seek between reads and writes Subject: $foo = `echo $foo` now works right Subject: backreferences weren't treated as variable length Subject: attempted fix for machines where $* = 1 was failing Subject: added detection of "sort" not used as keyword Subject: evals of long strings could use up gobs of memory Subject: $) and $| weren't properly evaluated in `` or "" Subject: man pages for ld and cc probably not in $mansrc Description: Case insensitive searches have been sped up 5 to 10 times thanks to some super work by Bob Best. Searches should now work on chars with the 8th bit set. I had to double the table size for the Boyer-Moore searches and declare some chars as unsigned to make the pointer math work right. Patterns containing variables are compiled every time they are executed. Unfortunately, the storage from the last pattern compilation was not reclaimed, resulting in memory gobbling. Some patterns such as /[Cc]at/ could fail, in that the optimization that first searched for the substring "at" failed to back off the correct distance to the beginning of the pattern. This only happened to optimizations that couldn't be hoisted to the cmd level but remained at the spat level. Saying /foo/ && s//bar/ could fail if the pattern /foo/ was optimized at the spat level, since it failed to set lastspat correctly. The % operator should now work with a negative left argument. Since some compilers punt on this, I had to catch it and calculate it explicitly myself, using only positive operands to %. It does mod not rem, so -1 % 5 is 4. Closing a pipe now returns child process status in $?. I thought it did this already but I was wrong. The eof() function will no longer dump core when no files are opened with <>. Though why anyone would want eof() in that situation is beyond me. The printf operator no longer drops last argument after %%. This was caused by decrementing the number of fields remaining for the printf even though %% didn't consume one. printf also now works more like the C version when it meets strange things like % with an invalid format letter. Since the return value of srand is void on some machines and undocumented on others, perl's srand always returns true now. Most stdio packages require you to do an fseek or reach eof between reads and writes to the same stream. I've now documented the restriction in perl, which after all uses stdio. $foo = `echo $foo` used to destroy $foo before getting its value. In patterns, backreferences weren't treated as variable length, so they didn't disable optimizations that depend on finding constant substrings. In particular /(a*) b \1 c \1 d/ produced a constant substring of " b c d ", obviously wrong. I've heard of machines where the assignment $* = 1 doesn't work right. Nobody's sent me a fix (that I know of), so I've take a guess at what is going wrong. Lemme know if it still fails. A number of perl 1.0 scripts had "sort" as a filehandle. Since perl 2.0 added sort as a reserved word, I've added a diagnostic that catches most uses of "sort" as a non-keyword and complains appropriately. There's a nifty trick, when searching for lots of different strings that are not known till runtime, in which you build up a long string of commands and eval that so that the patterns are compiled only once. Unfortunately, the way the tokener handled the scanning of quoted strings caused enormous gobs of memory to be used up and not returned to the free memory pool. Throwing in a realloc at the appropriate place fixes this. References to variables $) and $| are illegal in search patterns because they look like the end-of-string test at the end of an optional pattern. Unfortunately, they were also made illegal in `` and "", where they should be perfectly legal. This has been remedied. Configure looked for manual pages in $mansrc. Since that's usually local or new manual pages, ld.1 and cc.1 probably weren't there. Some random cleanup: "make realclean" now deletes perl.man missing " in README op.sprintf now tests %% in x2p/Makefile.SH added redirection of stderr to /dev/null in a2py.c walk() needed to be declared outside of main() spelled caesar right in manual Fix: From rn, say "| patch -p -N -d DIR", where DIR is your perl source directory. Outside of rn, say "cd DIR; patch -p -N <thisarticle". If you don't have the patch program, apply the following by hand, or get patch (version 2.0, latest patchlevel). After patching: Configure -d make depend make make test make install If patch indicates that patchlevel is the wrong version, you may need to apply one or more previous patches, or the patch may already have been applied. See the patchlevel.h file to find out what has or has not been applied. In any event, don't continue with the patch. If you are missing previous patches they can be obtained from me: Larry Wall lwall@jpl-devvax.jpl.nasa.gov If you send a mail message of the following form it will greatly speed processing: Subject: Command @SH mailpatch PATH perl 2.0 LIST ^ note the c where PATH is a return path FROM ME TO YOU either in Internet notation, or in bang notation from some well-known host, and LIST is the number of one or more patches you need, separated by spaces, commas, and/or hyphens. Saying 35- says everything from 35 to the end. You can also get the patches via anonymous FTP from jpl-devvax.jpl.nasa.gov (128.149.8.43). Index: patchlevel.h Prereq: 13 1c1 < #define PATCHLEVEL 13 --- > #define PATCHLEVEL 14 Index: Configure Prereq: 2.0.1.4 *** Configure.old Wed Sep 7 17:16:11 1988 --- Configure Wed Sep 7 17:16:14 1988 *************** *** 8,14 **** # and edit it to reflect your system. Some packages may include samples # of config.h for certain machines, so you might look for one of those.) # ! # $Header: Configure,v 2.0.1.4 88/08/05 01:23:27 root Exp $ # # Yes, you may rip this off to use in other distribution packages. # (Note: this Configure script was generated automatically. Rather than --- 8,14 ---- # and edit it to reflect your system. Some packages may include samples # of config.h for certain machines, so you might look for one of those.) # ! # $Header: Configure,v 2.0.1.5 88/09/07 16:28:09 lwall Locked $ # # Yes, you may rip this off to use in other distribution packages. # (Note: this Configure script was generated automatically. Rather than *************** *** 163,168 **** --- 163,169 ---- attrlist="$attrlist ns32000 ns16000 iAPX286 mc300 mc500 mc700 sparc" attrlist="$attrlist nsc32000 sinix xenix venix posix ansi M_XENIX" attrlist="$attrlist $mc68k __STDC__ UTS M_I8086 M_I186 M_I286 M_I386" + attrlist="$attrlist i186" pth="/usr/ucb /bin /usr/bin /usr/local /usr/local/bin /usr/lbin /etc /usr/lib /lib /usr/local/lib" d_newshome="/usr/NeWS" defvoidused=7 *************** *** 776,783 **** *split) case "$split" in '') ! if $contains '\-i' $mansrc/ld.1 >/dev/null 2>&1 || \ ! $contains '\-i' $mansrc/cc.1 >/dev/null 2>&1; then dflt='-i' else dflt='none' --- 777,784 ---- *split) case "$split" in '') ! if $contains '\-i' /usr/man/man1/ld.1 >/dev/null 2>&1 || \ ! $contains '\-i' /usr/man/man1/cc.1 >/dev/null 2>&1; then dflt='-i' else dflt='none' Index: Makefile.SH Prereq: 2.0.1.4 *** Makefile.SH.old Wed Sep 7 17:16:21 1988 --- Makefile.SH Wed Sep 7 17:16:22 1988 *************** *** 25,33 **** echo "Extracting Makefile (with variable substitutions)" cat >Makefile <<!GROK!THIS! ! # $Header: Makefile.SH,v 2.0.1.4 88/08/03 22:00:44 root Exp $ # # $Log: Makefile.SH,v $ # Revision 2.0.1.4 88/08/03 22:00:44 root # # patch11: make install doesn't modify current directory any more --- 25,36 ---- echo "Extracting Makefile (with variable substitutions)" cat >Makefile <<!GROK!THIS! ! # $Header: Makefile.SH,v 2.0.1.5 88/09/07 16:29:26 lwall Locked $ # # $Log: Makefile.SH,v $ + # Revision 2.0.1.5 88/09/07 16:29:26 lwall + # patch14: make realclean now deletes perl.man + # # Revision 2.0.1.4 88/08/03 22:00:44 root # # patch11: make install doesn't modify current directory any more *************** *** 184,190 **** rm -f *.o realclean: ! rm -f perl *.orig */*.orig *~ */*~ *.o core $(addedbyconf) # The following lint has practically everything turned on. Unfortunately, # you have to wade through a lot of mumbo jumbo that can't be suppressed. --- 187,193 ---- rm -f *.o realclean: ! rm -f perl *.orig */*.orig *~ */*~ *.o core $(addedbyconf) perl.man # The following lint has practically everything turned on. Unfortunately, # you have to wade through a lot of mumbo jumbo that can't be suppressed. Index: x2p/Makefile.SH Prereq: 2.0.1.1 *** x2p/Makefile.SH.old Wed Sep 7 17:19:18 1988 --- x2p/Makefile.SH Wed Sep 7 17:19:19 1988 *************** *** 18,26 **** esac echo "Extracting x2p/Makefile (with variable substitutions)" cat >Makefile <<!GROK!THIS! ! # $Header: Makefile.SH,v 2.0.1.1 88/07/11 23:13:39 root Exp $ # # $Log: Makefile.SH,v $ # Revision 2.0.1.1 88/07/11 23:13:39 root # patch2: now expects more shift/reduce errors # --- 18,29 ---- esac echo "Extracting x2p/Makefile (with variable substitutions)" cat >Makefile <<!GROK!THIS! ! # $Header: Makefile.SH,v 2.0.1.2 88/09/07 17:13:30 lwall Locked $ # # $Log: Makefile.SH,v $ + # Revision 2.0.1.2 88/09/07 17:13:30 lwall + # patch14: added redirection of stderr to /dev/null + # # Revision 2.0.1.1 88/07/11 23:13:39 root # patch2: now expects more shift/reduce errors # *************** *** 93,99 **** # won't work with csh export PATH || exit 1 - mv $(bin)/a2p $(bin)/a2p.old 2>/dev/null ! - mv $(bin)/s2p $(bin)/s2p.old - if test `pwd` != $(bin); then cp $(public) $(bin); fi cd $(bin); \ for pub in $(public); do \ --- 96,102 ---- # won't work with csh export PATH || exit 1 - mv $(bin)/a2p $(bin)/a2p.old 2>/dev/null ! - mv $(bin)/s2p $(bin)/s2p.old 2>/dev/null - if test `pwd` != $(bin); then cp $(public) $(bin); fi cd $(bin); \ for pub in $(public); do \ Index: README *** README.old Wed Sep 7 17:16:27 1988 --- README Wed Sep 7 17:16:27 1988 *************** *** 66,72 **** 7) Read the manual entry before running perl. ! 8) Go down to the x2p directory and do a "make depend, a "make" and a "make install" to create the awk to perl and sed to perl translators. 9) IMPORTANT! Help save the world! Communicate any problems and suggested --- 66,72 ---- 7) Read the manual entry before running perl. ! 8) Go down to the x2p directory and do a "make depend", a "make" and a "make install" to create the awk to perl and sed to perl translators. 9) IMPORTANT! Help save the world! Communicate any problems and suggested Index: x2p/a2py.c Prereq: 2.0.1.2 *** x2p/a2py.c.old Wed Sep 7 17:19:25 1988 --- x2p/a2py.c Wed Sep 7 17:19:26 1988 *************** *** 1,6 **** ! /* $Header: a2py.c,v 2.0.1.2 88/08/03 22:50:05 root Exp $ * * $Log: a2py.c,v $ * Revision 2.0.1.2 88/08/03 22:50:05 root * patch11: in a2p, numbers couldn't start with '.' * --- 1,9 ---- ! /* $Header: a2py.c,v 2.0.1.3 88/09/07 17:15:57 lwall Locked $ * * $Log: a2py.c,v $ + * Revision 2.0.1.3 88/09/07 17:15:57 lwall + * patch14: walk() needed to be declared outside of main() + * * Revision 2.0.1.2 88/08/03 22:50:05 root * patch11: in a2p, numbers couldn't start with '.' * *************** *** 20,25 **** --- 23,29 ---- char *filename; int checkers = 0; + STR *walk(); main(argc,argv,env) register int argc; *************** *** 29,35 **** register STR *str; register char *s; int i; - STR *walk(); STR *tmpstr; linestr = str_new(80); --- 33,38 ---- Index: arg.c Prereq: 2.0.1.3 *** arg.c.old Wed Sep 7 17:16:35 1988 --- arg.c Wed Sep 7 17:16:38 1988 *************** *** 1,6 **** ! /* $Header: arg.c,v 2.0.1.3 88/08/03 22:06:41 root Exp $ * * $Log: arg.c,v $ * Revision 2.0.1.3 88/08/03 22:06:41 root * patch11: support for broken 386 compiler * patch11: join of null array could leave destination string non-null --- 1,17 ---- ! /* $Header: arg.c,v 2.0.1.4 88/09/07 16:46:25 lwall Locked $ * * $Log: arg.c,v $ + * Revision 2.0.1.4 88/09/07 16:46:25 lwall + * patch14: case insensitive search speedup + * patch14: plugged memory leak on searches compiled at run time + * patch14: some patterns such as /[Cc]at/ could fail + * patch14: /foo/ && s//bar/ could fail + * patch14: closing a pipe now returns child process status in $? + * patch14: eof() will no longer dump core when no files are opened with <> + * patch14: printf no longer drops last argument after %% + * patch14: printf now works more like C version in weird cases + * patch14: searches should now work on chars with the 128 bit set + * * Revision 2.0.1.3 88/08/03 22:06:41 root * patch11: support for broken 386 compiler * patch11: join of null array could leave destination string non-null *************** *** 28,33 **** --- 39,45 ---- #include <errno.h> extern int errno; + extern char fold[]; STR * do_match(arg,retary,sarg,ptrmaxsarg,sargoff,cushion) *************** *** 64,69 **** --- 76,83 ---- if (debug & 8) deb("2.SPAT /%s/\n",t); #endif + if (spat->spat_regexp) + regfree(spat->spat_regexp); spat->spat_regexp = regcomp(t,spat->spat_flags & SPAT_FOLD,1); if (!*spat->spat_regexp->precomp && lastspat) spat = lastspat; *************** *** 119,125 **** goto nope; else if (spat->spat_flags & SPAT_ALL) goto yup; ! else if (spat->spat_regexp->regback >= 0) { ++*(long*)&spat->spat_short->str_nval; s -= spat->spat_regexp->regback; if (s < t) --- 133,139 ---- goto nope; else if (spat->spat_flags & SPAT_ALL) goto yup; ! if (s && spat->spat_regexp->regback >= 0) { ++*(long*)&spat->spat_short->str_nval; s -= spat->spat_regexp->regback; if (s < t) *************** *** 183,188 **** --- 197,205 ---- yup: ++*(long*)&spat->spat_short->str_nval; + lastspat = spat; + if (spat->spat_flags & SPAT_ONCE) + spat->spat_flags |= SPAT_USED; if (sawampersand) { char *tmps; *************** *** 216,221 **** --- 233,240 ---- fatal("panic: do_subst"); else if (spat->spat_runtime) { m = str_get(eval(spat->spat_runtime,Null(STR***),-1)); + if (spat->spat_regexp) + regfree(spat->spat_regexp); spat->spat_regexp = regcomp(m,spat->spat_flags & SPAT_FOLD,1); } #ifdef DEBUGGING *************** *** 249,255 **** } else if (!(s = fbminstr(s, strend, spat->spat_short))) goto nope; ! else if (spat->spat_regexp->regback >= 0) { ++*(long*)&spat->spat_short->str_nval; s -= spat->spat_regexp->regback; if (s < m) --- 268,274 ---- } else if (!(s = fbminstr(s, strend, spat->spat_short))) goto nope; ! if (s && spat->spat_regexp->regback >= 0) { ++*(long*)&spat->spat_short->str_nval; s -= spat->spat_regexp->regback; if (s < m) *************** *** 363,368 **** --- 382,389 ---- arg_free(spat->spat_runtime); /* it won't change, so */ spat->spat_runtime = Nullarg; /* no point compiling again */ } + if (spat->spat_regexp) + regfree(spat->spat_regexp); spat->spat_regexp = regcomp(m,spat->spat_flags & SPAT_FOLD,1); } #ifdef DEBUGGING *************** *** 725,732 **** return FALSE; } if (stio->fp) { ! if (stio->type == '|') ! retval = (pclose(stio->fp) >= 0); else if (stio->type == '-') retval = TRUE; else { --- 746,756 ---- return FALSE; } if (stio->fp) { ! if (stio->type == '|') { ! status = pclose(stio->fp); ! retval = (status >= 0); ! statusvalue = (unsigned)status & 0xffff; ! } else if (stio->type == '-') retval = TRUE; else { *************** *** 755,762 **** register STIO *stio; int ch; ! if (!stab) /* eof() */ ! stio = argvstab->stab_io; else stio = stab->stab_io; --- 779,790 ---- register STIO *stio; int ch; ! if (!stab) { /* eof() */ ! if (argvstab) ! stio = argvstab->stab_io; ! else ! return TRUE; ! } else stio = stab->stab_io; *************** *** 1105,1120 **** break; /* not enough % patterns, oh well */ for (t++; *sarg && *t && t != s; t++) { switch (*t) { ! case '\0': ! t--; ! break; ! case '%': ch = *(++t); *t = '\0'; sprintf(buf,s); s = t; *(t--) = ch; break; case 'l': dolong = TRUE; break; --- 1133,1153 ---- break; /* not enough % patterns, oh well */ for (t++; *sarg && *t && t != s; t++) { switch (*t) { ! default: ch = *(++t); *t = '\0'; sprintf(buf,s); s = t; *(t--) = ch; + len++; break; + case '0': case '1': case '2': case '3': case '4': + case '5': case '6': case '7': case '8': case '9': + case '.': case '#': case '-': case '+': + break; + case '\0': + t--; + break; case 'l': dolong = TRUE; break; *************** *** 1751,1757 **** do_study(str) STR *str; { ! register char *s = str_get(str); register int pos = str->str_cur; register int ch; register int *sfirst; --- 1784,1790 ---- do_study(str) STR *str; { ! register unsigned char *s = (unsigned char*)(str_get(str)); register int pos = str->str_cur; register int ch; register int *sfirst; *************** *** 1794,1799 **** --- 1827,1839 ---- else snext[pos] = -pos; sfirst[ch] = pos; + + /* If there were any case insensitive searches, we must assume they + * all are. This speeds up insensitive searches much more than + * it slows down sensitive ones. + */ + if (sawi) + sfirst[fold[ch]] = pos; } str->str_pok |= 4; Index: eval.c Prereq: 2.0.1.5 *** eval.c.old Wed Sep 7 17:16:53 1988 --- eval.c Wed Sep 7 17:16:56 1988 *************** *** 1,6 **** ! /* $Header: eval.c,v 2.0.1.5 88/08/03 22:17:04 root Exp $ * * $Log: eval.c,v $ * Revision 2.0.1.5 88/08/03 22:17:04 root * patch11: support for incompetent 386 compiler * patch11: support for Sun compiler that can't cast double to unsigned long. --- 1,10 ---- ! /* $Header: eval.c,v 2.0.1.6 88/09/07 16:49:52 lwall Locked $ * * $Log: eval.c,v $ + * Revision 2.0.1.6 88/09/07 16:49:52 lwall + * patch14: % should now work with a negative left argument + * patch14: srand always returns true now + * * Revision 2.0.1.5 88/08/03 22:17:04 root * patch11: support for incompetent 386 compiler * patch11: support for Sun compiler that can't cast double to unsigned long. *************** *** 489,496 **** case O_MODULO: if ((tmplong = (long) str_gnum(sarg[2])) == 0L) fatal("Illegal modulus zero"); ! value = str_gnum(sarg[1]); ! value = (double)(((long)value) % tmplong); goto donumset; case O_ADD: value = str_gnum(sarg[1]); --- 493,503 ---- case O_MODULO: if ((tmplong = (long) str_gnum(sarg[2])) == 0L) fatal("Illegal modulus zero"); ! when = (long)str_gnum(sarg[1]); ! if (when >= 0) ! value = (double)(when % tmplong); ! else ! value = (double)(tmplong - (-when % tmplong)); goto donumset; case O_ADD: value = str_gnum(sarg[1]); *************** *** 1080,1087 **** #endif goto donumset; case O_SRAND: ! value = (double)srand((int)str_gnum(sarg[1])); ! goto donumset; case O_EXP: value = exp(str_gnum(sarg[1])); goto donumset; --- 1087,1095 ---- #endif goto donumset; case O_SRAND: ! srand((int)str_gnum(sarg[1])); ! str = &str_yes; ! break; case O_EXP: value = exp(str_gnum(sarg[1])); goto donumset; Index: t/op.sprintf Prereq: 2.0 *** t/op.sprintf.old Wed Sep 7 17:18:53 1988 --- t/op.sprintf Wed Sep 7 17:18:54 1988 *************** *** 1,8 **** #!./perl ! # $Header: op.sprintf,v 2.0 88/06/05 00:14:40 root Exp $ print "1..1\n"; ! $x = sprintf("%3s %-4s foo %5d%c%3.1f","hi",123,456,65,3.0999); ! if ($x eq ' hi 123 foo 456A3.1') {print "ok 1\n";} else {print "not ok 1\n";} --- 1,8 ---- #!./perl ! # $Header: op.sprintf,v 2.0.1.1 88/09/07 17:04:35 lwall Locked $ print "1..1\n"; ! $x = sprintf("%3s %-4s%%foo %5d%c%3.1f","hi",123,456,65,3.0999); ! if ($x eq ' hi 123 %foo 456A3.1') {print "ok 1\n";} else {print "not ok 1\n";} Index: perl.h Prereq: 2.0.1.2 *** perl.h.old Wed Sep 7 17:17:04 1988 --- perl.h Wed Sep 7 17:17:05 1988 *************** *** 1,6 **** ! /* $Header: perl.h,v 2.0.1.2 88/08/03 22:19:11 root Exp $ * * $Log: perl.h,v $ * Revision 2.0.1.2 88/08/03 22:19:11 root * patch11: some support for crippled compilers that don't grok str_get macro * patch11: str_peek improperly reused a buffer --- 1,9 ---- ! /* $Header: perl.h,v 2.0.1.3 88/09/07 16:51:18 lwall Locked $ * * $Log: perl.h,v $ + * Revision 2.0.1.3 88/09/07 16:51:18 lwall + * patch14: added sawi variable to optimize study when no //i found + * * Revision 2.0.1.2 88/08/03 22:19:11 root * patch11: some support for crippled compilers that don't grok str_get macro * patch11: str_peek improperly reused a buffer *************** *** 231,236 **** --- 234,240 ---- EXT bool allstabs INIT(FALSE); /* init all customary symbols in symbol table?*/ EXT bool sawampersand INIT(FALSE); /* must save all match strings */ EXT bool sawstudy INIT(FALSE); /* do fbminstr on all strings */ + EXT bool sawi INIT(FALSE); /* study must assume case insensitive */ #define TMPPATH "/tmp/perl-eXXXXXX" EXT char *e_tmpname; Index: perl.man.1 Prereq: 2.0.1.4 *** perl.man.1.old Wed Sep 7 17:17:16 1988 --- perl.man.1 Wed Sep 7 17:17:20 1988 *************** *** 1,7 **** .rn '' }` ! ''' $Header: perl.man.1,v 2.0.1.4 88/08/03 22:21:28 root Exp $ ''' ''' $Log: perl.man.1,v $ ''' Revision 2.0.1.4 88/08/03 22:21:28 root ''' patch11: random typos and clarifications ''' --- 1,10 ---- .rn '' }` ! ''' $Header: perl.man.1,v 2.0.1.5 88/09/07 16:52:04 lwall Locked $ ''' ''' $Log: perl.man.1,v $ + ''' Revision 2.0.1.5 88/09/07 16:52:04 lwall + ''' patch14: documented setting $? by closing pipe + ''' ''' Revision 2.0.1.4 88/08/03 22:21:28 root ''' patch11: random typos and clarifications ''' *************** *** 1261,1266 **** --- 1264,1270 ---- does not. Also, closing a pipe will wait for the process executing on the pipe to complete, in case you want to look at the output of the pipe afterwards. + Closing a pipe explicitly also puts the status value of the command into $?. Example: .nf Index: perl.man.2 Prereq: 2.0.1.5 *** perl.man.2.old Wed Sep 7 17:17:37 1988 --- perl.man.2 Wed Sep 7 17:17:42 1988 *************** *** 1,7 **** ''' Beginning of part 2 ! ''' $Header: perl.man.2,v 2.0.1.5 88/08/05 01:27:31 root Exp $ ''' ''' $Log: perl.man.2,v $ ''' Revision 2.0.1.5 88/08/05 01:27:31 root ''' patch13: clarified goto problems ''' --- 1,13 ---- ''' Beginning of part 2 ! ''' $Header: perl.man.2,v 2.0.1.6 88/09/07 16:54:49 lwall Locked $ ''' ''' $Log: perl.man.2,v $ + ''' Revision 2.0.1.6 88/09/07 16:54:49 lwall + ''' patch14: spelled caesar right + ''' patch14: generalized $? slightly + ''' patch14: removed caveat about % of negative numbers + ''' patch14: documented necessity of seek between reads and writes + ''' ''' Revision 2.0.1.5 88/08/05 01:27:31 root ''' patch13: clarified goto problems ''' *************** *** 233,239 **** open(LOG, \'>>/usr/spool/news/twitlog\'\|); # (log is reserved) ! open(article, "caeser <$article |"\|); # decrypt article open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process# --- 239,245 ---- open(LOG, \'>>/usr/spool/news/twitlog\'\|); # (log is reserved) ! open(article, "caesar <$article |"\|); # decrypt article open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process# *************** *** 323,329 **** open(FOO, "\-|") || exec \'cat\', \'\-n\', $file; .fi ! Explicitly closing the filehandle causes the parent process to wait for the child to finish, and returns the status value in $?. .Ip "ord(EXPR)" 8 3 Returns the ascii value of the first character of EXPR. --- 329,335 ---- open(FOO, "\-|") || exec \'cat\', \'\-n\', $file; .fi ! Explicitly closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?. .Ip "ord(EXPR)" 8 3 Returns the ascii value of the first character of EXPR. *************** *** 1270,1276 **** running this script. (Mnemonic: same as shells.) .Ip $? 8 ! The status returned by the last backtick (\`\`) command or .I system operator. Note that this is the status word returned by the wait() system --- 1276,1282 ---- running this script. (Mnemonic: same as shells.) .Ip $? 8 ! The status returned by the last pipe close, backtick (\`\`) command or .I system operator. Note that this is the status word returned by the wait() system *************** *** 1630,1637 **** .PP .I Perl is at the mercy of the C compiler's definitions of various operations ! such as % and atof(). ! In particular, don't trust % on negative numbers. .PP While none of the built-in data types have any arbitrary size limits (apart from memory size), there are still a few arbitrary limits: --- 1636,1646 ---- .PP .I Perl is at the mercy of the C compiler's definitions of various operations ! such atof(). ! .PP ! If your stdio requires an seek or eof between reads and writes on a particular ! stream, so does ! .IR perl . .PP While none of the built-in data types have any arbitrary size limits (apart from memory size), there are still a few arbitrary limits: Index: perl.y Prereq: 2.0.1.3 *** perl.y.old Wed Sep 7 17:17:53 1988 --- perl.y Wed Sep 7 17:17:55 1988 *************** *** 1,6 **** ! /* $Header: perl.y,v 2.0.1.3 88/08/03 22:25:12 root Exp $ * * $Log: perl.y,v $ * Revision 2.0.1.3 88/08/03 22:25:12 root * patch11: deleted fossilized join syntax * patch11: fixed join('a','b') --- 1,9 ---- ! /* $Header: perl.y,v 2.0.1.4 88/09/07 16:55:41 lwall Locked $ * * $Log: perl.y,v $ + * Revision 2.0.1.4 88/09/07 16:55:41 lwall + * patch14: case insensitive search speedup + * * Revision 2.0.1.3 88/08/03 22:25:12 root * patch11: deleted fossilized join syntax * patch11: fixed join('a','b') *************** *** 656,662 **** | FUNC2 '(' sexpr ',' expr ')' { $$ = make_op($1, 2, $3, $5, Nullarg, 0); if ($1 == O_INDEX && $$[2].arg_type == A_SINGLE) ! fbmcompile($$[2].arg_ptr.arg_str); } | FUNC3 '(' sexpr ',' sexpr ',' expr ')' { $$ = make_op($1, 3, $3, $5, $7, 0); } | STABFUN '(' WORD ')' --- 659,665 ---- | FUNC2 '(' sexpr ',' expr ')' { $$ = make_op($1, 2, $3, $5, Nullarg, 0); if ($1 == O_INDEX && $$[2].arg_type == A_SINGLE) ! fbmcompile($$[2].arg_ptr.arg_str,0); } | FUNC3 '(' sexpr ',' sexpr ',' expr ')' { $$ = make_op($1, 3, $3, $5, $7, 0); } | STABFUN '(' WORD ')' Index: perly.c Prereq: 2.0.1.6 *** perly.c.old Wed Sep 7 17:18:11 1988 --- perly.c Wed Sep 7 17:18:17 1988 *************** *** 1,6 **** ! char rcsid[] = "$Header: perly.c,v 2.0.1.6 88/08/05 01:29:43 root Exp $"; /* * $Log: perly.c,v $ * Revision 2.0.1.6 88/08/05 01:29:43 root * patch13: fixed loop stack overflow on goto * patch13: fixed recursive subroutine storage management --- 1,10 ---- ! char rcsid[] = "$Header: perly.c,v 2.0.1.7 88/09/07 16:57:47 lwall Locked $"; /* * $Log: perly.c,v $ + * Revision 2.0.1.7 88/09/07 16:57:47 lwall + * patch14: $foo = `echo $foo` now works right + * patch14: % should now work with a negative left argument + * * Revision 2.0.1.6 88/08/05 01:29:43 root * patch13: fixed loop stack overflow on goto * patch13: fixed recursive subroutine storage management *************** *** 1103,1110 **** (chld[1].arg_type == A_INDREAD && !(arg[1].arg_flags & AF_SPECIAL)) || (chld[1].arg_type == A_GLOB && !(arg[1].arg_flags & AF_SPECIAL)) ! || ! chld[1].arg_type == A_BACKTICK ) ) ) ) { arg[2].arg_type = chld[1].arg_type; arg[2].arg_ptr = chld[1].arg_ptr; free_arg(chld); --- 1107,1113 ---- (chld[1].arg_type == A_INDREAD && !(arg[1].arg_flags & AF_SPECIAL)) || (chld[1].arg_type == A_GLOB && !(arg[1].arg_flags & AF_SPECIAL)) ! ) ) ) ) { arg[2].arg_type = chld[1].arg_type; arg[2].arg_ptr = chld[1].arg_ptr; free_arg(chld); *************** *** 1184,1189 **** --- 1187,1193 ---- register char *tmps; int i; unsigned long tmplong; + long tmp2; double exp(), log(), sqrt(), modf(); char *crypt(); double sin(), cos(), atan2(), pow(); *************** *** 1227,1233 **** tmplong = (long)str_gnum(s2); if (tmplong == 0L) fatal("Illegal modulus of constant zero"); ! str_numset(str,(double)(((long)str_gnum(s1)) % tmplong)); break; case O_ADD: value = str_gnum(s1); --- 1231,1241 ---- tmplong = (long)str_gnum(s2); if (tmplong == 0L) fatal("Illegal modulus of constant zero"); ! tmp2 = (long)str_gnum(s1); ! if (tmp2 >= 0) ! str_numset(str,(double)(tmp2 % tmplong)); ! else ! str_numset(str,(double)(tmplong - (-tmp2 % tmplong))); break; case O_ADD: value = str_gnum(s1); Index: regexp.c Prereq: 2.0.1.4 *** regexp.c.old Wed Sep 7 17:18:30 1988 --- regexp.c Wed Sep 7 17:18:34 1988 *************** *** 7,15 **** * blame Henry for some of the lack of readability. */ ! /* $Header: regexp.c,v 2.0.1.4 88/08/03 22:37:26 root Exp $ * * $Log: regexp.c,v $ * Revision 2.0.1.4 88/08/03 22:37:26 root * patch11: deleted regchar() * patch11: fixed some pointer arithmetic that didn't work on the 286 --- 7,19 ---- * blame Henry for some of the lack of readability. */ ! /* $Header: regexp.c,v 2.0.1.5 88/09/07 17:02:10 lwall Locked $ * * $Log: regexp.c,v $ + * Revision 2.0.1.5 88/09/07 17:02:10 lwall + * patch14: case insensitive search speedup + * patch14: backreferences weren't treated as variable length + * * Revision 2.0.1.4 88/08/03 22:37:26 root * patch11: deleted regchar() * patch11: fixed some pointer arithmetic that didn't work on the 286 *************** *** 154,160 **** */ /* The following have no fixed length. */ ! char varies[] = {BRANCH,BACK,STAR,PLUS,REF,0}; /* The following always have a length of 1. */ char simple[] = {ANY,ANYOF,ANYBUT,ALNUM,NALNUM,SPACE,NSPACE,DIGIT,NDIGIT,0}; --- 158,165 ---- */ /* The following have no fixed length. */ ! char varies[] = {BRANCH,BACK,STAR,PLUS, ! REF+1,REF+2,REF+3,REF+4,REF+5,REF+6,REF+7,REF+8,REF+9,0}; /* The following always have a length of 1. */ char simple[] = {ANY,ANYOF,ANYBUT,ALNUM,NALNUM,SPACE,NSPACE,DIGIT,NDIGIT,0}; *************** *** 337,343 **** r->regback = -1; r->regstclass = Nullch; scan = r->program+1; /* First BRANCH. */ ! if (!fold && OP(regnext(scan)) == END) {/* Only one top-level choice. */ scan = NEXTOPER(scan); first = scan; --- 342,348 ---- r->regback = -1; r->regstclass = Nullch; scan = r->program+1; /* First BRANCH. */ ! if (OP(regnext(scan)) == END) {/* Only one top-level choice. */ scan = NEXTOPER(scan); first = scan; *************** *** 347,354 **** first = NEXTOPER(first); /* Starting-point info. */ ! if (OP(first) == EXACTLY) r->regstart = str_make(OPERAND(first)+1); else if ((exp = index(simple,OP(first))) && exp > simple) r->regstclass = first; else if (OP(first) == BOUND || OP(first) == NBOUND) --- 352,362 ---- first = NEXTOPER(first); /* Starting-point info. */ ! if (OP(first) == EXACTLY) { r->regstart = str_make(OPERAND(first)+1); + if (r->regstart->str_cur > !(sawstudy|fold)) + fbmcompile(r->regstart,fold); + } else if ((exp = index(simple,OP(first))) && exp > simple) r->regstclass = first; else if (OP(first) == BOUND || OP(first) == NBOUND) *************** *** 411,418 **** if (back < 0) back = -1; r->regback = back; ! if (len > !(sawstudy)) ! fbmcompile(r->regmust); *(long*)&r->regmust->str_nval = 100; } else --- 419,426 ---- if (back < 0) back = -1; r->regback = back; ! if (len > !(sawstudy|fold)) ! fbmcompile(r->regmust,fold); *(long*)&r->regmust->str_nval = 100; } else Index: stab.c Prereq: 2.0.1.4 *** stab.c.old Wed Sep 7 17:18:42 1988 --- stab.c Wed Sep 7 17:18:44 1988 *************** *** 1,6 **** ! /* $Header: stab.c,v 2.0.1.4 88/08/03 22:38:51 root Exp $ * * $Log: stab.c,v $ * Revision 2.0.1.4 88/08/03 22:38:51 root * patch11: added sanity check on $- going negative * --- 1,9 ---- ! /* $Header: stab.c,v 2.0.1.5 88/09/07 17:03:28 lwall Locked $ * * $Log: stab.c,v $ + * Revision 2.0.1.5 88/09/07 17:03:28 lwall + * patch14: attempted fix for machines where $* = 1 was failing + * * Revision 2.0.1.4 88/08/03 22:38:51 root * patch11: added sanity check on $- going negative * *************** *** 286,292 **** } break; case '*': ! multiline = (int)str_gnum(str) != 0; break; case '/': record_separator = *str_get(str); --- 289,296 ---- } break; case '*': ! i = (int)str_gnum(str); ! multiline = (i != 0); break; case '/': record_separator = *str_get(str); Index: str.h Prereq: 2.0.1.1 *** str.h.old Wed Sep 7 17:18:49 1988 --- str.h Wed Sep 7 17:18:50 1988 *************** *** 1,6 **** ! /* $Header: str.h,v 2.0.1.1 88/08/03 22:43:53 root Exp $ * * $Log: str.h,v $ * Revision 2.0.1.1 88/08/03 22:43:53 root * patch11: support for botched C compilers that ungrok && outside of conditionals * --- 1,9 ---- ! /* $Header: str.h,v 2.0.1.2 88/09/07 17:04:00 lwall Locked $ * * $Log: str.h,v $ + * Revision 2.0.1.2 88/09/07 17:04:00 lwall + * patch14: searches should now work on chars with the 128 bit set + * * Revision 2.0.1.1 88/08/03 22:43:53 root * patch11: support for botched C compilers that ungrok && outside of conditionals * *************** *** 20,27 **** } str_link; char str_pok; /* state of str_ptr */ char str_nok; /* state of str_nval */ ! char str_rare; /* used by search strings */ ! char str_prev; /* also used by search strings */ }; #define Nullstr Null(STR*) --- 23,30 ---- } str_link; char str_pok; /* state of str_ptr */ char str_nok; /* state of str_nval */ ! unsigned char str_rare; /* used by search strings */ ! unsigned char str_prev; /* also used by search strings */ }; #define Nullstr Null(STR*) Index: toke.c Prereq: 2.0.1.4 *** toke.c.old Wed Sep 7 17:19:02 1988 --- toke.c Wed Sep 7 17:19:05 1988 *************** *** 1,6 **** ! /* $Header: toke.c,v 2.0.1.4 88/08/03 22:47:39 root Exp $ * * $Log: toke.c,v $ * Revision 2.0.1.4 88/08/03 22:47:39 root * patch11: unterminated literal strings blew up tokener in eval * --- 1,12 ---- ! /* $Header: toke.c,v 2.0.1.5 88/09/07 17:09:52 lwall Locked $ * * $Log: toke.c,v $ + * Revision 2.0.1.5 88/09/07 17:09:52 lwall + * patch14: added detection of "sort" not used as keyword + * patch14: case insensitive search speedup + * patch14: evals of long strings could use up gobs of memory + * patch14: $) and $| weren't properly evaluated in `` or "" + * * Revision 2.0.1.4 88/08/03 22:47:39 root * patch11: unterminated literal strings blew up tokener in eval * *************** *** 591,596 **** --- 597,605 ---- if (strEQ(d,"symlink")) FUN2(O_SYMLINK); if (strEQ(d,"sort")) { + while (*s && isspace(*s)) s++; + if (*s == ';' || *s == ')') + fatal("sort is now a reserved word\n"); yylval.ival = O_SORT; OPERATOR(LISTOP); } *************** *** 781,786 **** --- 790,796 ---- { register SPAT *spat = (SPAT *) safemalloc(sizeof (SPAT)); register char *d; + SPAT savespat; bzero((char *)spat, sizeof(SPAT)); spat->spat_next = spat_root; /* link into spat list */ *************** *** 804,809 **** --- 814,820 ---- s++; if (*s == 'i') { s++; + sawi = TRUE; spat->spat_flags |= SPAT_FOLD; } for (d=tokenbuf; *d; d++) { *************** *** 817,843 **** goto got_pat; /* skip compiling for now */ } } ! if (!(spat->spat_flags & SPAT_FOLD)) { ! if (*tokenbuf == '^') { ! spat->spat_short = scanconst(tokenbuf+1); ! if (spat->spat_short) { ! spat->spat_slen = strlen(spat->spat_short->str_ptr); ! if (spat->spat_slen == strlen(tokenbuf+1)) ! spat->spat_flags |= SPAT_ALL; ! } } - else { - spat->spat_flags |= SPAT_SCANFIRST; - spat->spat_short = scanconst(tokenbuf); - if (spat->spat_short) { - spat->spat_slen = strlen(spat->spat_short->str_ptr); - if (spat->spat_slen == strlen(tokenbuf)) - spat->spat_flags |= SPAT_ALL; - } - } } ! spat->spat_regexp = regcomp(tokenbuf,spat->spat_flags & SPAT_FOLD,1); ! hoistmust(spat); got_pat: yylval.arg = make_match(O_MATCH,stab2arg(A_STAB,defstab),spat); return s; --- 828,873 ---- goto got_pat; /* skip compiling for now */ } } ! if (spat->spat_flags & SPAT_FOLD) ! #ifdef STRUCTCOPY ! savespat = *spat; ! #else ! bcopy((char *)spat, (char *)&savespat, sizeof(SPAT)); ! #endif ! if (*tokenbuf == '^') { ! spat->spat_short = scanconst(tokenbuf+1); ! if (spat->spat_short) { ! spat->spat_slen = strlen(spat->spat_short->str_ptr); ! if (spat->spat_slen == strlen(tokenbuf+1)) ! spat->spat_flags |= SPAT_ALL; } } ! else { ! spat->spat_flags |= SPAT_SCANFIRST; ! spat->spat_short = scanconst(tokenbuf); ! if (spat->spat_short) { ! spat->spat_slen = strlen(spat->spat_short->str_ptr); ! if (spat->spat_slen == strlen(tokenbuf)) ! spat->spat_flags |= SPAT_ALL; ! } ! } ! if ((spat->spat_flags & SPAT_ALL) && (spat->spat_flags & SPAT_SCANFIRST)) { ! fbmcompile(spat->spat_short, spat->spat_flags & SPAT_FOLD); ! spat->spat_regexp = regcomp(tokenbuf,spat->spat_flags & SPAT_FOLD,1); ! /* Note that this regexp can still be used if someone says ! * something like /a/ && s//b/; so we can't delete it. ! */ ! } ! else { ! if (spat->spat_flags & SPAT_FOLD) ! #ifdef STRUCTCOPY ! *spat = savespat; ! #else ! bcopy((char *)&savespat, (char *)spat, sizeof(SPAT)); ! #endif ! spat->spat_regexp = regcomp(tokenbuf,spat->spat_flags & SPAT_FOLD,1); ! hoistmust(spat); ! } got_pat: yylval.arg = make_match(O_MATCH,stab2arg(A_STAB,defstab),spat); return s; *************** *** 878,884 **** spat->spat_short = scanconst(tokenbuf); if (spat->spat_short) spat->spat_slen = strlen(spat->spat_short->str_ptr); ! } d = savestr(tokenbuf); get_repl: s = scanstr(s); --- 908,914 ---- spat->spat_short = scanconst(tokenbuf); if (spat->spat_short) spat->spat_slen = strlen(spat->spat_short->str_ptr); ! } d = savestr(tokenbuf); get_repl: s = scanstr(s); *************** *** 902,920 **** } if (*s == 'i') { s++; spat->spat_flags |= SPAT_FOLD; } } if (!spat->spat_runtime) { spat->spat_regexp = regcomp(d, spat->spat_flags & SPAT_FOLD,1); hoistmust(spat); safefree(d); } - if (spat->spat_flags & SPAT_FOLD) { /* Oops, disable optimization */ - str_free(spat->spat_short); - spat->spat_short = Nullstr; - spat->spat_slen = 0; - } yylval.arg = make_match(O_SUBST,stab2arg(A_STAB,defstab),spat); return s; } --- 932,953 ---- } if (*s == 'i') { s++; + sawi = TRUE; spat->spat_flags |= SPAT_FOLD; + if (!(spat->spat_flags & SPAT_SCANFIRST)) { + str_free(spat->spat_short); /* anchored opt doesn't do */ + spat->spat_short = Nullstr; /* case insensitive match */ + spat->spat_slen = 0; + } } } + if (spat->spat_short && (spat->spat_flags & SPAT_SCANFIRST)) + fbmcompile(spat->spat_short, spat->spat_flags & SPAT_FOLD); if (!spat->spat_runtime) { spat->spat_regexp = regcomp(d, spat->spat_flags & SPAT_FOLD,1); hoistmust(spat); safefree(d); } yylval.arg = make_match(O_SUBST,stab2arg(A_STAB,defstab),spat); return s; } *************** *** 1161,1167 **** int sqstart = line; char *tmps; ! tmpstr = str_new(strlen(s)); s = str_append_till(tmpstr,s+1,term,leave); while (!*s) { /* multiple line string? */ if (!rsfp || !(s = str_gets(linestr, rsfp))) { --- 1194,1200 ---- int sqstart = line; char *tmps; ! tmpstr = str_new(0); s = str_append_till(tmpstr,s+1,term,leave); while (!*s) { /* multiple line string? */ if (!rsfp || !(s = str_gets(linestr, rsfp))) { *************** *** 1172,1177 **** --- 1205,1214 ---- s = str_append_till(tmpstr,s,term,leave); } s++; + if (tmpstr->str_cur + 5 < tmpstr->str_len) { + tmpstr->str_len = tmpstr->str_cur + 1; + tmpstr->str_ptr = saferealloc(tmpstr->str_ptr,tmpstr->str_len); + } if (term == '\'') { arg[1].arg_ptr.arg_str = tmpstr; break; *************** *** 1182,1188 **** if (*s == '\\' && s[1] && isdigit(s[1]) && !isdigit(s[2]) && !index("`\"",term) ) *s = '$'; /* grandfather \digit in subst */ ! if (*s == '$' && s[1] && s[1] != ')' && s[1] != '|') { makesingle = FALSE; /* force interpretation */ } else if (*s == '\\' && s[1]) { --- 1219,1226 ---- if (*s == '\\' && s[1] && isdigit(s[1]) && !isdigit(s[2]) && !index("`\"",term) ) *s = '$'; /* grandfather \digit in subst */ ! if (*s == '$' && s[1] && ! (index("`\"",term) || (s[1] != ')' && s[1] != '|'))) { makesingle = FALSE; /* force interpretation */ } else if (*s == '\\' && s[1]) { Index: util.c Prereq: 2.0.1.3 *** util.c.old Wed Sep 7 17:19:12 1988 --- util.c Wed Sep 7 17:19:14 1988 *************** *** 1,6 **** ! /* $Header: util.c,v 2.0.1.3 88/08/03 22:48:34 root Exp $ * * $Log: util.c,v $ * Revision 2.0.1.3 88/08/03 22:48:34 root * patch11: fiddled with declarations to keep some compilers happy * --- 1,10 ---- ! /* $Header: util.c,v 2.0.1.4 88/09/07 17:12:49 lwall Locked $ * * $Log: util.c,v $ + * Revision 2.0.1.4 88/09/07 17:12:49 lwall + * patch14: case insensitive search speedup + * patch14: searches should now work on chars with the 128 bit set + * * Revision 2.0.1.3 88/08/03 22:48:34 root * patch11: fiddled with declarations to keep some compilers happy * *************** *** 201,210 **** register int i; register int len = str->str_cur; ! str_grow(str,len+128); s = str->str_ptr; table = s + len; ! for (i = 1; i < 128; i++) { table[i] = len; } i = 0; --- 205,214 ---- register int i; register int len = str->str_cur; ! str_grow(str,len+256); s = str->str_ptr; table = s + len; ! for (i = 1; i < 256; i++) { table[i] = len; } i = 0; *************** *** 219,224 **** --- 223,263 ---- } #endif /* NOTDEF */ + unsigned char fold[] = { + 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14, 15, + 16, 17, 18, 19, 20, 21, 22, 23, + 24, 25, 26, 27, 28, 29, 30, 31, + 32, 33, 34, 35, 36, 37, 38, 39, + 40, 41, 42, 43, 44, 45, 46, 47, + 48, 49, 50, 51, 52, 53, 54, 55, + 56, 57, 58, 59, 60, 61, 62, 63, + 64, 'a', 'b', 'c', 'd', 'e', 'f', 'g', + 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', + 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', + 'x', 'y', 'z', 91, 92, 93, 94, 95, + 96, 'A', 'B', 'C', 'D', 'E', 'F', 'G', + 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', + 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', + 'X', 'Y', 'Z', 123, 124, 125, 126, 127, + 128, 129, 130, 131, 132, 133, 134, 135, + 136, 137, 138, 139, 140, 141, 142, 143, + 144, 145, 146, 147, 148, 149, 150, 151, + 152, 153, 154, 155, 156, 157, 158, 159, + 160, 161, 162, 163, 164, 165, 166, 167, + 168, 169, 170, 171, 172, 173, 174, 175, + 176, 177, 178, 179, 180, 181, 182, 183, + 184, 185, 186, 187, 188, 189, 190, 191, + 192, 193, 194, 195, 196, 197, 198, 199, + 200, 201, 202, 203, 204, 205, 206, 207, + 208, 209, 210, 211, 212, 213, 214, 215, + 216, 217, 218, 219, 220, 221, 222, 223, + 224, 225, 226, 227, 228, 229, 230, 231, + 232, 233, 234, 235, 236, 237, 238, 239, + 240, 241, 242, 243, 244, 245, 246, 247, + 248, 249, 250, 251, 252, 253, 254, 255 + }; + static unsigned char freq[] = { 1, 2, 84, 151, 154, 155, 156, 157, 165, 246, 250, 3, 158, 7, 18, 29, *************** *** 255,274 **** }; void ! fbmcompile(str) STR *str; { ! register char *s; ! register char *table; register int i; register int len = str->str_cur; int rarest = 0; int frequency = 256; ! str_grow(str,len+128); table = str->str_ptr + len; /* actually points at final '\0' */ s = table - 1; ! for (i = 1; i < 128; i++) { table[i] = len; } i = 0; --- 294,314 ---- }; void ! fbmcompile(str, iflag) STR *str; + int iflag; { ! register unsigned char *s; ! register unsigned char *table; register int i; register int len = str->str_cur; int rarest = 0; int frequency = 256; ! str_grow(str,len+256); table = str->str_ptr + len; /* actually points at final '\0' */ s = table - 1; ! for (i = 1; i < 256; i++) { table[i] = len; } i = 0; *************** *** 275,293 **** while (s >= str->str_ptr) { if (!isascii(*s)) return; ! if (table[*s] == len) ! table[*s] = i; s--,i++; } str->str_pok |= 2; /* deep magic */ s = str->str_ptr; /* deeper magic */ ! for (i = 0; i < len; i++) { ! if (freq[s[i]] < frequency) { ! rarest = i; ! frequency = freq[s[i]]; } } str->str_rare = s[rarest]; str->str_prev = rarest; #ifdef DEBUGGING --- 315,352 ---- while (s >= str->str_ptr) { if (!isascii(*s)) return; ! if (table[*s] == len) { ! if (iflag) ! table[*s] = table[fold[*s]] = i; ! else ! table[*s] = i; ! } s--,i++; } str->str_pok |= 2; /* deep magic */ s = str->str_ptr; /* deeper magic */ ! if (iflag) { ! register int tmp, foldtmp; ! str->str_pok |= 8; ! for (i = 0; i < len; i++) { ! tmp=freq[s[i]]; ! foldtmp=freq[fold[s[i]]]; ! if (tmp < frequency && foldtmp < frequency) { ! rarest = i; ! /* choose most frequent among the two */ ! frequency = (tmp > foldtmp) ? tmp : foldtmp; ! } } } + else { + for (i = 0; i < len; i++) { + if (freq[s[i]] < frequency) { + rarest = i; + frequency = freq[s[i]]; + } + } + } str->str_rare = s[rarest]; str->str_prev = rarest; #ifdef DEBUGGING *************** *** 330,345 **** register char *bigend; STR *littlestr; { ! register char *s; register int tmp; register int littlelen; ! register char *little; ! register char *table; ! register char *olds; ! register char *oldlittle; register int min; ! if (littlestr->str_pok != 3) return instr(big,littlestr->str_ptr); littlelen = littlestr->str_cur; --- 389,404 ---- register char *bigend; STR *littlestr; { ! register unsigned char *s; register int tmp; register int littlelen; ! register unsigned char *little; ! register unsigned char *table; ! register unsigned char *olds; ! register unsigned char *oldlittle; register int min; ! if (littlestr->str_pok < 3) return instr(big,littlestr->str_ptr); littlelen = littlestr->str_cur; *************** *** 346,369 **** table = littlestr->str_ptr + littlelen; s = big + --littlelen; oldlittle = little = table - 1; ! while (s < bigend) { ! top: ! if (tmp = table[*s]) { ! s += tmp; } ! else { ! tmp = littlelen; /* less expensive than calling strncmp() */ ! olds = s; ! while (tmp--) { ! if (*--s == *--little) ! continue; ! s = olds + 1; /* here we pay the price for failure */ ! little = oldlittle; ! if (s < bigend) /* fake up continue to outer loop */ ! goto top; ! return Nullch; } ! return s; } } return Nullch; --- 405,452 ---- table = littlestr->str_ptr + littlelen; s = big + --littlelen; oldlittle = little = table - 1; ! if (littlestr->str_pok & 8) { /* case insensitive? */ ! while (s < bigend) { ! top1: ! if (tmp = table[*s]) { ! s += tmp; ! } ! else { ! tmp = littlelen; /* less expensive than calling strncmp() */ ! olds = s; ! while (tmp--) { ! if (*--s == *--little || fold[*s] == *little) ! continue; ! s = olds + 1; /* here we pay the price for failure */ ! little = oldlittle; ! if (s < bigend) /* fake up continue to outer loop */ ! goto top1; ! return Nullch; ! } ! return s; ! } } ! } ! else { ! while (s < bigend) { ! top2: ! if (tmp = table[*s]) { ! s += tmp; } ! else { ! tmp = littlelen; /* less expensive than calling strncmp() */ ! olds = s; ! while (tmp--) { ! if (*--s == *--little) ! continue; ! s = olds + 1; /* here we pay the price for failure */ ! little = oldlittle; ! if (s < bigend) /* fake up continue to outer loop */ ! goto top2; ! return Nullch; ! } ! return s; ! } } } return Nullch; *************** *** 374,385 **** STR *bigstr; STR *littlestr; { ! register char *s, *x; ! register char *big = bigstr->str_ptr; register int pos; register int previous; register int first; ! register char *little; if ((pos = screamfirst[littlestr->str_rare]) < 0) return Nullch; --- 457,468 ---- STR *bigstr; STR *littlestr; { ! register unsigned char *s, *x; ! register unsigned char *big = bigstr->str_ptr; register int pos; register int previous; register int first; ! register unsigned char *little; if ((pos = screamfirst[littlestr->str_rare]) < 0) return Nullch; *************** *** 391,410 **** if (!(pos += screamnext[pos])) return Nullch; } ! do { ! if (big[pos] != first) ! continue; ! for (x=big+pos+1,s=little; *s; /**/ ) { ! if (!*x) ! return Nullch; ! if (*s++ != *x++) { ! s--; ! break; } ! } ! if (!*s) ! return big+pos; ! } while (pos += screamnext[pos]); return Nullch; } --- 474,511 ---- if (!(pos += screamnext[pos])) return Nullch; } ! if (littlestr->str_pok & 8) { /* case insignificant? */ ! do { ! if (big[pos] != first && big[pos] != fold[first]) ! continue; ! for (x=big+pos+1,s=little; *s; /**/ ) { ! if (!*x) ! return Nullch; ! if (*s++ != *x++ && fold[*(s-1)] != *(x-1)) { ! s--; ! break; ! } } ! if (!*s) ! return big+pos; ! } while (pos += screamnext[pos]); ! } ! else { ! do { ! if (big[pos] != first) ! continue; ! for (x=big+pos+1,s=little; *s; /**/ ) { ! if (!*x) ! return Nullch; ! if (*s++ != *x++) { ! s--; ! break; ! } ! } ! if (!*s) ! return big+pos; ! } while (pos += screamnext[pos]); ! } return Nullch; }