tarvaine@tukki.jyu.fi (Tapani Tarvainen) (06/17/89)
I tried porting Gnu e?grep to MS-DOS (Turbo C 2.0). In the process
I found something of a bug, or at least a piece of not-so-portable
code in regex.c.

The program compiled easily, with only a few trivial modifications
like different include files and specifying stack size - in only one
place did I change actual code (in displaying usage it assumes the
directory separator is /). And of course the makefile had to be
changed rather drastically to get Borland's make to digest it.

At first it worked fine, until I tried a rather complicated regexp
and got "Memory exhausted". Well, I recompiled it with -mc (compact
memory model, i.e., far data pointers). After which it still gave
"Memory exhausted" for just about anything but fixed strings,
regardless of how much memory was available.

I traced the problem to the following macro, used in function
re_compile_pattern in regex.c:

#define EXTEND_BUFFER \
  { char *old_buffer = bufp->buffer; \
    if (bufp->allocated == (1<<16)) goto too_big; \
    bufp->allocated *= 2; \
    if (bufp->allocated > (1<<16)) bufp->allocated = (1<<16); \
    if (!(bufp->buffer = (char *) realloc (bufp->buffer, bufp->allocated))) \
      goto memory_exhausted; \
    c = bufp->buffer - old_buffer; \
    b += c; \
    if (fixup_jump) \
      fixup_jump += c; \
    if (laststart) \
      laststart += c; \
    begalt += c; \
    if (pending_exact) \
      pending_exact += c; \
  }

What do you think a stupid compiler with 16-bit ints makes out of
an expression like 1<<16? Right, zero.

I substituted 1L<<16 (what would be the aesthetically correct form?)
and changed the definition of allocated in struct re_pattern_buffer
in regex.h from int to long, and the problem disappeared.
(BTW, is there some machine where this could harm anything?
I mean, both ints and longs are 32 bits in 32-bit machines anyway,
aren't they?)

So far so good. But then I tried some even more complicated regexps
and - the machine crashed. Oh well, debugger out again, and so it
turned out the problem was again in the above macro.
Look at this piece of code:

    c = bufp->buffer - old_buffer; \
    b += c; \

Pointer subtraction is only guaranteed to work when the pointers
point to the same structure, which is not the case here.
And indeed, in the 80x86 large memory model pointer subtraction is
done by subtracting offsets only, which is OK as long as individual
structures are <64K, *as long as the segments are the same*.
And here they may not be.

Using huge pointers would solve the problem but waste time,
and I wanted a portable (and standard-conforming) solution.
This one seems to fit the bill:

#define EXTEND_BUFFER \
  { char *old_buffer = bufp->buffer; \
    if (bufp->allocated == (1L<<16)) goto too_big; \
    bufp->allocated *= 2; \
    if (bufp->allocated > (1L<<16)) bufp->allocated = (1L<<16); \
    if (!(bufp->buffer = (char *) realloc (bufp->buffer, bufp->allocated))) \
      goto memory_exhausted; \
    c = b - old_buffer; \
    b = bufp->buffer + c; \
    if (fixup_jump) { \
      c = fixup_jump - old_buffer; \
      fixup_jump = bufp->buffer + c; \
    } \
    if (laststart) { \
      c = laststart - old_buffer; \
      laststart = bufp->buffer + c; \
    } \
    c = begalt - old_buffer; \
    begalt = bufp->buffer + c; \
    if (pending_exact) { \
      c = pending_exact - old_buffer; \
      pending_exact = bufp->buffer + c; \
    } \
  }

I *think*

    b = bufp->buffer + (b - old_buffer);

etc should also work, but some compiler might rearrange it as

    b = (bufp->buffer - old_buffer) + b;

which again would fail. Anyway, a decent compiler (like gcc)
should produce as good code either way.
--
Tapani Tarvainen                 BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (06/18/89)
In article <920@tukki.jyu.fi> I wrote about problems in porting
Gnu e?grep to MS-DOS. It turned out I hadn't found them all:
It still failed with certain regexps. It didn't crash, just gave
wrong results, making the bug much harder to track down.
At this point it fortunately occurred to me that using large data model
wasn't really necessary after all - the "Memory exhausted" problem
with small model was only the result of the erroneous test described
in my previous posting. And indeed, after compiling with small model
I got perfect results every time. So I could hunt the bug in large
data model by tracing the versions side by side.
The problem was in regex.c, which presumably has been written when
compilers that violate its assumptions were but a dream in some
ANSI-committee-member-to-be's mind. Actually, there were two
related problems:
One, the code assumes that the result of subtracting pointers is
an int. According to pANS as well as TurboC, it is ptrdiff_t --
and in large data models the latter defines it as long.
Two, occasionally 0 (zero) is passed as a pointer argument without
cast, which I think is valid only when ints and pointers are of the
same size and the representation of null pointer is (int)0.
(If NULL had been used, it would've worked with Turbo C as it
defines NULL as 0L in large data models, but that is something
of a kludge, and might fail with some other compiler.)
Both had the effect of passing 4-byte values to functions
expecting 2-byte ones, with predictable results.
To fix these I changed most ints in regex.c and a few in grep.c
into ptrdiff_t and added casts where necessary (and probably some
that strictly aren't), including ptrdiff_t type constants.
To maintain compatibility with pre-ANSI compilers I added
#ifndef __STDC__
typedef int ptrdiff_t;
#endif
It is perhaps worth noting that with ANSI-style prototypes
these problems would not have occurred in the first place.
I _think_ there are no more bugs left, but I'll experiment a little
yet to make sure.
By the way, what is the recommended place for sending ports of Gnu
stuff like this - I mean when there are no bugs (which is what this
group is for), only things like different #include's or a makefile?
Where should I send the diffs for this & the TC makefile once I'm
satisfied with the thing?
--
Tapani Tarvainen BitNet: tarvainen@finjyu
Internet: tarvainen@jylk.jyu.fi -- OR -- tarvaine@tukki.jyu.fi
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (06/23/89)
Here are the patches I've made to get Gnu e?grep 1.3 to compile with
Turbo C 2.0 (and run after it ...).

Most of the changes are obvious (changed #includes &c) and isolated
with #ifdef __TURBOC__ or #ifdef __MSDOS__.
Otherwise the only change that really had to be made was changing
1<<16 in EXTEND_BUFFER to 1L<<16 (see previous posting); however,
I made a number of other changes to make this compile with large data
models also (even though there isn't really much reason to do that):
a number of ints have been changed to ptrdiff_t and some casts and
declarations added.

One of them perhaps deserves an extra comment, from getopt.c:

    {
      char c = *nextchar++;
      char *temp = (char *) index (optstring, c);

What is that cast doing there, index() is of type (char *) already, no?
Its only effect is to remove a compiler/lint warning ... which would
have been necessary: index() isn't declared, so the compiler thinks
it returns an int, and when ints are 16 and pointers 32 bits ...
you get the idea. I added the declaration and removed the cast
(I think this is a good example of how casts should NOT be used).

The definition of EXTEND_BUFFER in regex.c here is different from
the one I posted earlier. It has been brought to my attention since
that systems may exist where it still wouldn't work: systems may
exist where subtracting pointers to freed areas cannot be done at
all ... this one avoids that problem.

I'm still not perfectly happy with this. Despite the effort to make
it compile under large data models, it still wouldn't work too well
with them, if the buffer grows so big that using them would actually
be necessary; actually I suspect it may crash if the buffer exceeds 32K.
-- As I am writing this I realize it almost certainly will,
EXTEND_BUFFER isn't foolproof still: bufp->allocated, now a long,
is passed to realloc() which expects size_t. TurboC's include files
with prototypes prevent this from causing trouble, _as long as it
fits_ - but when it doesn't, truncating it may have all kinds of odd
effects. Hmmm ... actually it always will, the test against 1L<<16
ensures that ... except when it becomes exactly 2^16.
There are also some things which may fail if the regex or a source
line is longer than 32K.

I don't have time to investigate this further right now (for a week
at least :->), but I may return to it later; if anybody can come up
with an example which the small model version can't handle, I would
be very interested in seeing it.

In any event it should still work everywhere it worked before (only
EXTEND_BUFFER should produce different code), that is in machines
with sizeof(int)==sizeof(pointer) and flat address space.

In addition to the changes needed to make it work, as MS-DOS lacks
a standard help system I added a piece of built-in documentation:
compile with LONGHELP defined (my tcc makefile does) and the one-line
"usage:"-message is replaced with a screenful summarizing options &
regex special chars.

Incidentally, I tried compiling this with Microsoft C 5.0 too (which
apparently is full of bugs, I guess I should get 5.1): the definition
of EXTEND_BUFFER was too long for it, and it actually crashed with
dfa.c.

OK, enough blathering: feed this to patch(1) and the result should
compile without further trouble. Oh yes, the makefile is meant for
Borland's MAKE, I don't know if it'll work with others.
diff -c ./alloca.c tc/alloca.c
*** ./alloca.c Fri Mar 3 12:44:50 1989
--- tc/alloca.c Fri Jun 23 16:19:28 1989
***************
*** 26,31 ****
--- 26,35 ----
  static char SCCSid[] = "@(#)alloca.c 1.1"; /* for the "what" utility */
  #endif
+ #ifdef __MSDOS__
+ #define STACK_DIRECTION -1
+ #endif
+
  #ifdef emacs
  #include "config.h"
  #ifdef static
diff -c ./dfa.c tc/dfa.c
*** ./dfa.c Fri Mar 3 13:46:34 1989
--- tc/dfa.c Fri Jun 23 16:19:29 1989
***************
*** 2220,2223 ****
        ifree(mp[i].is);
      }
    free((char *) mp);
- }
--- 2220,2222 ----
diff -c ./dfa.h tc/dfa.h
*** ./dfa.h Fri Mar 3 16:53:14 1989
--- tc/dfa.h Fri Jun 23 16:19:30 1989
***************
*** 1,6 ****
  /* dfa.h - declarations for GNU deterministic regexp compiler
!    Copyright (C) 1988 Free Software Foundation, Inc.
     Written June, 1988 by Mike Haertel
     NO WARRANTY
--- 1,7 ----
  /* dfa.h - declarations for GNU deterministic regexp compiler
!    Copyright (C) 1988, 1989 Free Software Foundation, Inc.
     Written June, 1988 by Mike Haertel
+    TurboC mods June, 1989 by Tapani Tarvainen
     NO WARRANTY
***************
*** 103,108 ****
--- 104,112 ----
     You are forbidden to forbid anyone else to use, share and improve
     what you give them. Help stamp out software-hoarding! */
+ #ifdef __TURBOC__
+ #define USG
+ #endif
  #ifdef USG
  #include <string.h>
***************
*** 113,126 ****
  #endif
  #ifdef __STDC__
-
  /* Missing include files for GNU C. */
  /* #include <stdlib.h> */
  typedef int size_t;
  extern void *calloc(int, size_t);
  extern void *malloc(size_t);
  extern void *realloc(void *, size_t);
  extern void free(void *);
  extern char *bcopy(), *bzero();
--- 117,131 ----
  #endif
  #ifdef __STDC__
  /* Missing include files for GNU C. */
  /* #include <stdlib.h> */
+ #ifndef __TURBOC__
  typedef int size_t;
  extern void *calloc(int, size_t);
  extern void *malloc(size_t);
  extern void *realloc(void *, size_t);
  extern void free(void *);
+ #endif
  extern char *bcopy(), *bzero();
diff -c ./getopt.c tc/getopt.c
*** ./getopt.c Fri Mar 3 12:44:54 1989
--- tc/getopt.c Fri Jun 23 16:19:30 1989
***************
*** 1,6 ****
  /* Getopt for GNU.
!    Copyright (C) 1987 Free Software Foundation, Inc.
     NO WARRANTY
     BECAUSE THIS PROGRAM IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY
--- 1,8 ----
  /* Getopt for GNU.
!    Copyright (C) 1987, 1989 Free Software Foundation, Inc.
!    MS-DOS/TurboC mods June 1989 by Tapani Tarvainen
+
     NO WARRANTY
     BECAUSE THIS PROGRAM IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY
***************
*** 108,119 ****
--- 110,129 ----
     GNU application programs can use a third alternative mode in which
     they can distinguish the relative order of options and other arguments. */
+ #include <stdio.h>
+ #ifdef __TURBOC__
+ #define USG
+ #include <stdlib.h>
+ #include <string.h>
+ void * alloca (unsigned);
+ #endif
  #ifdef sparc
  #include <alloca.h>
  #endif
  #ifdef USG
+ extern char * index ();
  #define bcopy(s, d, l) memcpy((d), (s), (l))
  #endif
***************
*** 358,364 ****
    {
      char c = *nextchar++;
!     char *temp = (char *) index (optstring, c);
      /* Increment `optind' when we start to process its last character. */
      if (*nextchar == 0)
--- 368,374 ----
    {
      char c = *nextchar++;
!     char *temp = index (optstring, c);
      /* Increment `optind' when we start to process its last character. */
      if (*nextchar == 0)
diff -c ./grep.c tc/grep.c
*** ./grep.c Fri Mar 3 18:05:52 1989
--- tc/grep.c Fri Jun 23 16:19:31 1989
***************
*** 1,8 ****
  /* grep - print lines matching an extended regular expression
!    Copyright (C) 1988 Free Software Foundation, Inc.
     Written June, 1988 by Mike Haertel
     BMG speedups added July, 1988 by James A. Woods and Arthur David Olson
     NO WARRANTY
--- 1,9 ----
  /* grep - print lines matching an extended regular expression
!    Copyright (C) 1988, 1989 Free Software Foundation, Inc.
     Written June, 1988 by Mike Haertel
     BMG speedups added July, 1988 by James A. Woods and Arthur David Olson
+    MS-DOS/TurboC mods by Tapani Tarvainen June, 1989
     NO WARRANTY
***************
*** 104,118 ****
     In other words, you are welcome to use, share and improve this program.
     You are forbidden to forbid anyone else to use, share and improve
     what you give them. Help stamp out software-hoarding! */
  #include <ctype.h>
  #include <stdio.h>
  #ifdef USG
- #include <memory.h>
  #include <string.h>
! #else
  #include <strings.h>
! #endif
  #include "dfa.h"
  #include "regex.h"
--- 105,134 ----
     In other words, you are welcome to use, share and improve this program.
     You are forbidden to forbid anyone else to use, share and improve
     what you give them. Help stamp out software-hoarding! */
+
+
+ #ifdef __TURBOC__
+ #define USG
+ #endif
+
  #include <ctype.h>
  #include <stdio.h>
+
  #ifdef USG
  #include <string.h>
! #ifdef __TURBOC__
! #include <stdlib.h>
! #include <alloc.h>
! #include <mem.h>
! unsigned _stklen = 20000;
! #else !__TURBOC__
! #include <memory.h>
! #endif __TURBOC__
! #else !USG
  #include <strings.h>
! #endif USG
!
  #include "dfa.h"
  #include "regex.h"
***************
*** 298,310 ****
  static
  grep()
  {
!   int retain = 0; /* Number of bytes to retain on next call
                       to fill_buffer_retaining(). */
    char *search_limit; /* Pointer to the character after the last
                           newline in the buffer. */
    char saved_char; /* Character after the last newline. */
    char *resume; /* Pointer to where to resume search. */
!   int resume_index = 0; /* Count of characters to ignore after
                             refilling the buffer. */
    int line_count = 1; /* Line number. */
    int try_backref; /* Set to true if we need to verify the
--- 314,326 ----
  static
  grep()
  {
!   ptrdiff_t retain = 0; /* Number of bytes to retain on next call
                             to fill_buffer_retaining(). */
    char *search_limit; /* Pointer to the character after the last
                           newline in the buffer. */
    char saved_char; /* Character after the last newline. */
    char *resume; /* Pointer to where to resume search. */
!   ptrdiff_t resume_index = 0; /* Count of characters to ignore after
                                   refilling the buffer. */
    int line_count = 1; /* Line number. */
    int try_backref; /* Set to true if we need to verify the
***************
*** 369,375 ****
       a backtracking matcher to make sure the line is a match. */
    if (try_backref && re_search(&regex, matching_line,
                                 next_line - matching_line - 1,
!                                0,
                                 next_line - matching_line - 1,
                                 NULL) < 0)
      {
--- 385,391 ----
       a backtracking matcher to make sure the line is a match. */
    if (try_backref && re_search(&regex, matching_line,
                                 next_line - matching_line - 1,
!                                (ptrdiff_t)0,
                                 next_line - matching_line - 1,
                                 NULL) < 0)
      {
***************
*** 537,544 ****
  usage_and_die()
  {
    fprintf(stderr,
! "usage: %s [-CVbchilnsvwx] [-<num>] [-AB <num>] [-f file] [-e] expr [files]\n",
!           prog);
    exit(ERROR);
  }
--- 553,596 ----
  usage_and_die()
  {
    fprintf(stderr,
! "usage: %s [-CVbchilnsvwx] [-<num>] [-AB <num>] [-f file] [-e] expr [files]\n"
! #ifdef LONGHELP
! /* this assumes compiler merges adjacent strings */
! "\n-A <num> context after match\t\t-h\tdon't display filenames\n"
! "-B <num> context before match\t\t-i\tignore case\n"
! "-<num>\t context on each side\t\t-l\tlist files only\n"
! "-V\t version number\t\t-n\tline numbers\n"
! "-b\t byte offsets\t\t\t-s\trun silently\n"
! "-c\t total count only\t\t-v\tnon-matching lines only\n"
! "-e <expr> search for <expr>\t\t-w\tmatch only complete words\n"
! "-f <file> take <expr> from <file>\t-x\tmatch only whole lines\n\n"
! "In the regular expression:\n"
! ".\tany single character\t\t^\tbeginning of line\n"
! #ifndef EGREP
! "\\"
! #endif
! "?\trepeat 0 or 1 times\t\t$\tend of line\n"
! "*\trepeat 0 or more times\t\t\\<\tbeginning of word\n"
! #ifndef EGREP
! "\\"
! #endif
! "+\trepeat 1 or more times\t\t\\>\tend of word\n"
! "[ ]\tcharacter set, [^ ] complement\t"
! #ifdef EGREP
! "( )"
! #else
! "\\( \\)"
! #endif
! "\tgrouping\n"
! #ifndef EGREP
! "\\"
! #endif
! "|\tOR\t\t\t\t\\<n> \ttext inside <n>th parentheses\n"
! "\\w\t[a-zA-Z0-9]\t\t\t\\b\tat the edge of a word\n"
! "\\W\t[^a-zA-Z0-9]\t\t\t\\B\tnot at the edge of a word\n"
! "\\\tliteralize following special character\n"
! #endif
!           ,prog);
    exit(ERROR);
  }
***************
*** 562,570 ****
--- 614,632 ----
    char *regex_errmesg; /* Error message from regex routines. */
    char translate[_NOTCHAR]; /* Translate table for case conversion
                                 (needed by the backtracking matcher). */
+   int bmg_setup (); /* keep lint happy */
+ #ifdef __MSDOS__
+   if (prog = strrchr(argv[0], '\\')) {
+     char *p;
+     ++prog;
+     if (p = strrchr(prog, '.'))
+       *p = 0;
+   }
+ #else
    if (prog = strrchr(argv[0], '/'))
      ++prog;
+ #endif
    else
      prog = argv[0];
***************
*** 754,760 ****
    if (regex_errmesg = re_compile_pattern(the_regexp, regexp_len, &regex))
      regerror(regex_errmesg);
!   /* Find the longest metacharacter-free string which must occur in the
       regexpr, before short-circuiting regexecute() with Boyer-Moore-Gosper.
--- 816,822 ----
    if (regex_errmesg = re_compile_pattern(the_regexp, regexp_len, &regex))
      regerror(regex_errmesg);
!   /* Find the longest metacharacter-free string which must occur in the
       regexpr, before short-circuiting regexecute() with Boyer-Moore-Gosper.
***************
*** 860,866 ****
    char *match;
    char *start = begin;
    char save; /* regexecute() sentinel */
!   int len;
    char *bmg_search();
    if (!bmgexec) /* full automaton search */
--- 922,928 ----
    char *match;
    char *start = begin;
    char save; /* regexecute() sentinel */
!   ptrdiff_t len;
    char *bmg_search();
    if (!bmgexec) /* full automaton search */
***************
*** 867,873 ****
      return(regexecute(r, begin, end, newline, count, try_backref));
    else
      {
!       len = end - begin;
        while ((match = bmg_search((unsigned char *) start, len)) != NULL)
          {
            p = match; /* narrow search range to submatch line */
--- 929,935 ----
      return(regexecute(r, begin, end, newline, count, try_backref));
    else
      {
!       len = end - begin;
        while ((match = bmg_search((unsigned char *) start, len)) != NULL)
          {
            p = match; /* narrow search range to submatch line */
diff -c ./regex.c tc/regex.c
*** ./regex.c Fri Mar 3 12:44:58 1989
--- tc/regex.c Fri Jun 23 16:19:32 1989
***************
*** 1,5 ****
--- 1,6 ----
  /* Extended regular expression matching and search library.
     Copyright (C) 1985, 1989 Free Software Foundation, Inc.
+    MS-DOS/TurboC mods June, 1989 by Tapani Tarvainen
     This program is free software; you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
***************
*** 38,43 ****
--- 39,52 ----
  #else /* not emacs */
+ #ifdef __TURBOC__
+ #define USG
+ #include <stdlib.h>
+ #include <alloc.h>
+ #include <string.h>
+ void * alloca (unsigned);
+ #endif
+
  #ifdef USG
  #define bcopy(s,d,n) memcpy((d),(s),(n))
  #define bcmp(s1,s2,n) memcmp((s1),(s2),(n))
***************
*** 164,184 ****
  #define PATUNFETCH p--
  #define EXTEND_BUFFER \
!   { char *old_buffer = bufp->buffer; \
!     if (bufp->allocated == (1<<16)) goto too_big; \
      bufp->allocated *= 2; \
!     if (bufp->allocated > (1<<16)) bufp->allocated = (1<<16); \
      if (!(bufp->buffer = (char *) realloc (bufp->buffer, bufp->allocated))) \
        goto memory_exhausted; \
!     c = bufp->buffer - old_buffer; \
!     b += c; \
      if (fixup_jump) \
!       fixup_jump += c; \
      if (laststart) \
!       laststart += c; \
!     begalt += c; \
      if (pending_exact) \
!       pending_exact += c; \
    }
  static int store_jump (), insert_jump ();
--- 173,196 ----
  #define PATUNFETCH p--
  #define EXTEND_BUFFER \
!   { ptrdiff_t b_ofs = b - bufp->buffer, \
!       fixup_jump_ofs = fixup_jump - bufp->buffer, \
!       laststart_ofs = laststart - bufp->buffer, \
!       begalt_ofs = begalt - bufp->buffer, \
!       pending_exact_ofs = pending_exact - bufp->buffer; \
!     if (bufp->allocated == (1L<<16)) goto too_big; \
      bufp->allocated *= 2; \
!     if (bufp->allocated > (1L<<16)) bufp->allocated = (1L<<16); \
      if (!(bufp->buffer = (char *) realloc (bufp->buffer, bufp->allocated))) \
        goto memory_exhausted; \
!     b = bufp->buffer + b_ofs; \
      if (fixup_jump) \
!       fixup_jump = bufp->buffer + fixup_jump_ofs; \
      if (laststart) \
!       laststart = bufp->buffer + laststart_ofs; \
!     begalt = bufp->buffer + begalt_ofs; \
      if (pending_exact) \
!       pending_exact = bufp->buffer + pending_exact_ofs; \
    }
  static int store_jump (), insert_jump ();
***************
*** 199,205 ****
    /* address of the count-byte of the most recently inserted "exactn" command.
       This makes it possible to tell whether a new exact-match character
       can be added to that command or requires a new "exactn" command. */
!   char *pending_exact = 0;
    /* address of the place where a forward-jump should go
--- 211,217 ----
    /* address of the count-byte of the most recently inserted "exactn" command.
       This makes it possible to tell whether a new exact-match character
       can be added to that command or requires a new "exactn" command. */
!   char *pending_exact = 0;
    /* address of the place where a forward-jump should go
***************
*** 706,712 ****
       struct re_pattern_buffer *bufp;
  {
    unsigned char *pattern = (unsigned char *) bufp->buffer;
!   int size = bufp->used;
    register char *fastmap = bufp->fastmap;
    register unsigned char *p = pattern;
    register unsigned char *pend = pattern + size;
--- 718,724 ----
       struct re_pattern_buffer *bufp;
  {
    unsigned char *pattern = (unsigned char *) bufp->buffer;
!   ptrdiff_t size = bufp->used;
    register char *fastmap = bufp->fastmap;
    register unsigned char *p = pattern;
    register unsigned char *pend = pattern + size;
***************
*** 886,895 ****
  re_search (pbufp, string, size, startpos, range, regs)
       struct re_pattern_buffer *pbufp;
       char *string;
!      int size, startpos, range;
       struct re_registers *regs;
  {
!   return re_search_2 (pbufp, 0, 0, string, size, startpos, range, regs, size);
  }
  /* Like re_match_2 but tries first a match starting at index STARTPOS,
--- 898,908 ----
  re_search (pbufp, string, size, startpos, range, regs)
       struct re_pattern_buffer *pbufp;
       char *string;
!      ptrdiff_t size, startpos, range;
       struct re_registers *regs;
  {
!   return re_search_2 (pbufp, (char *)0, (ptrdiff_t)0,
!                       string, size, startpos, range, regs, size);
  }
  /* Like re_match_2 but tries first a match starting at index STARTPOS,
***************
*** 908,928 ****
  re_search_2 (pbufp, string1, size1, string2, size2, startpos, range, regs, mstop)
       struct re_pattern_buffer *pbufp;
       char *string1, *string2;
!      int size1, size2;
!      int startpos;
!      register int range;
       struct re_registers *regs;
!      int mstop;
  {
    register char *fastmap = pbufp->fastmap;
    register unsigned char *translate = (unsigned char *) pbufp->translate;
!   int total = size1 + size2;
    int val;
    /* Update the fastmap now if not correct already */
    if (fastmap && !pbufp->fastmap_accurate)
      re_compile_fastmap (pbufp);
!   /* Don't waste time in a long search for a pattern
       that says it is anchored. */
    if (pbufp->used > 0 && (enum regexpcode) pbufp->buffer[0] == begbuf
--- 921,941 ----
  re_search_2 (pbufp, string1, size1, string2, size2, startpos, range, regs, mstop)
       struct re_pattern_buffer *pbufp;
       char *string1, *string2;
!      ptrdiff_t size1, size2;
!      ptrdiff_t startpos;
!      register ptrdiff_t range;
       struct re_registers *regs;
!      ptrdiff_t mstop;
  {
    register char *fastmap = pbufp->fastmap;
    register unsigned char *translate = (unsigned char *) pbufp->translate;
!   ptrdiff_t total = size1 + size2;
    int val;
    /* Update the fastmap now if not correct already */
    if (fastmap && !pbufp->fastmap_accurate)
      re_compile_fastmap (pbufp);
!   /* Don't waste time in a long search for a pattern
       that says it is anchored. */
    if (pbufp->used > 0 && (enum regexpcode) pbufp->buffer[0] == begbuf
***************
*** 946,954 ****
      {
        if (range > 0)
          {
!           register int lim = 0;
            register unsigned char *p;
!           int irange = range;
            if (startpos < size1 && startpos + range >= size1)
              lim = range - (size1 - startpos);
--- 959,967 ----
      {
        if (range > 0)
          {
!           register ptrdiff_t lim = 0;
            register unsigned char *p;
!           ptrdiff_t irange = range;
            if (startpos < size1 && startpos + range >= size1)
              lim = range - (size1 - startpos);
***************
*** 1008,1017 ****
  re_match (pbufp, string, size, pos, regs)
       struct re_pattern_buffer *pbufp;
       char *string;
!      int size, pos;
       struct re_registers *regs;
  {
!   return re_match_2 (pbufp, 0, 0, string, size, pos, regs, size);
  }
  #endif /* emacs */
--- 1021,1031 ----
  re_match (pbufp, string, size, pos, regs)
       struct re_pattern_buffer *pbufp;
       char *string;
!      ptrdiff_t size, pos;
       struct re_registers *regs;
  {
!   return re_match_2 (pbufp, (char *)0, (ptrdiff_t)0,
!                      string, size, pos, regs, size);
  }
  #endif /* emacs */
***************
*** 1040,1047 ****
  re_match_2 (pbufp, string1, size1, string2, size2, pos, regs, mstop)
       struct re_pattern_buffer *pbufp;
       unsigned char *string1, *string2;
!      int size1, size2;
!      int pos;
       struct re_registers *regs;
       int mstop;
  {
--- 1054,1061 ----
  re_match_2 (pbufp, string1, size1, string2, size2, pos, regs, mstop)
       struct re_pattern_buffer *pbufp;
       unsigned char *string1, *string2;
!      ptrdiff_t size1, size2;
!      ptrdiff_t pos;
       struct re_registers *regs;
       int mstop;
  {
***************
*** 1591,1598 ****
  re_exec (s)
       char *s;
  {
!   int len = strlen (s);
!   return 0 <= re_search (&re_comp_buf, s, len, 0, len, 0);
  }
  #endif /* emacs */
--- 1605,1613 ----
  re_exec (s)
       char *s;
  {
!   ptrdiff_t len = strlen (s);
!   return 0 <= re_search (&re_comp_buf, s, len, (ptrdiff_t)0,
!                          len, (ptrdiff_t)0);
  }
  #endif /* emacs */
***************
*** 1681,1687 ****
        gets (pat); /* Now read the string to match against */
!       i = re_match (&buf, pat, strlen (pat), 0, 0);
        printf ("Match value %d.\n", i);
      }
  }
--- 1696,1703 ----
        gets (pat); /* Now read the string to match against */
!       i = re_match (&buf, pat, strlen (pat), (ptrdiff_t)0,
!                     (struct re_registers *)0);
        printf ("Match value %d.\n", i);
      }
  }
diff -c ./regex.h tc/regex.h
*** ./regex.h Fri Mar 3 12:44:58 1989
--- tc/regex.h Fri Jun 23 16:19:32 1989
***************
*** 1,5 ****
--- 1,6 ----
  /* Definitions for data structures callers pass the regex library.
     Copyright (C) 1985, 1989 Free Software Foundation, Inc.
+    MS-DOS/TurboC mods June, 1989 by Tapani Tarvainen
     This program is free software; you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
***************
*** 21,26 ****
--- 22,31 ----
     what you give them. Help stamp out software-hoarding! */
+ #ifndef __STDC__
+ typedef int ptrdiff_t;
+ #endif
+
  /* Define number of parens for which we record the beginnings and ends.
     This affects how much space the `struct re_registers' type takes up. */
  #ifndef RE_NREGS
***************
*** 72,79 ****
  struct re_pattern_buffer
    {
      char *buffer; /* Space holding the compiled pattern commands. */
!     int allocated; /* Size of space that buffer points to */
!     int used; /* Length of portion of buffer actually occupied */
      char *fastmap; /* Pointer to fastmap, if any, or zero if none. */
      /* re_search uses the fastmap, if there is one,
         to skip quickly over totally implausible characters */
--- 77,84 ----
  struct re_pattern_buffer
    {
      char *buffer; /* Space holding the compiled pattern commands. */
!     long allocated; /* Size of space that buffer points to */
!     long used; /* Length of portion of buffer actually occupied */
      char *fastmap; /* Pointer to fastmap, if any, or zero if none. */
      /* re_search uses the fastmap, if there is one,
         to skip quickly over totally implausible characters */
***************
*** 175,180 ****
--- 180,186 ----
  extern void re_compile_fastmap ();
  extern int re_search (), re_search_2 ();
  extern int re_match (), re_match_2 ();
+ extern int re_set_syntax ();
  /* 4.2 bsd compatibility (yuck) */
  extern char *re_comp ();
*** /dev/null Fri Jun 23 03:34:40 1989
--- tc/makefile.tcc Fri Jun 23 16:19:31 1989
***************
*** 0 ****
--- 1,38 ----
+ #
+ # Makefile for GNU e?grep
+ #
+ # TurboC version by Tapani Tarvainen June 1989
+
+ CC = TCC
+
+ !if $(DEBUG)
+ CFLAGS = -DLONGHELP $(MODEL) -O -A -f- -d -k -G -Z- -w-amb -w-pia -N -y -v
+ !else
+ CFLAGS = -DLONGHELP $(MODEL) -O -A -f- -d -k- -G -Z -w-amb -w-pia
+ !endif
+
+ #
+ # Add wildargs.obj (supplied with Turbo C), if TC hasn't been
+ # installed to link it in automatically (or e?grep won't
+ # understand wildcards in filenames).
+ #
+ OBJS = dfa.obj regex.obj getopt.obj alloca.obj
+ GOBJ = grep.obj
+ EOBJ = egrep.obj
+
+ .c.obj:
+ 	$(CC) $(CFLAGS) -c $<
+
+ all: egrep grep
+
+ egrep: $(OBJS) $(EOBJ)
+ 	$(CC) $(CFLAGS) -eegrep $(OBJS) $(EOBJ) $(LIBS)
+
+ egrep.obj: grep.c
+ 	$(CC) $(CFLAGS) -DEGREP -c -oegrep grep.c
+
+ grep: $(OBJS) $(GOBJ)
+ 	$(CC) $(CFLAGS) -egrep $(OBJS) $(GOBJ) $(LIBS)
+
+ #dfa.obj egrep.obj grep.obj: dfa.h
+ #egrep.obj grep.obj regex.obj: regex.h
--
Tapani Tarvainen                 BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi