cs106132@umbc5.umbc.edu (cs106132) (05/17/91)
Hi,
here is a less than optimized implementation of a VERY fast pattern
matching algorithm called Knuth-Morris-Pratt. It is a must in
almost everyone's library. Forget the good old "slide over one"
method to search for patterns. KMP is a screamer with linear time
complexity. The book I got this out of proves that, but theory is
not my favorite subject. Only if it was in 80x86 assembler!
/****************************************************************/
/* this function will build a next[] array for the pattern */
int build_next (signed char next[], unsigned char pat[], unsigned m) {
int found;
int j = 0;
int k = -1;
next[0] = -1;
while (j < m) {
found = 0;
while (k >= 0 && (! found)) {
if (pat[k] != pat[j])
k = next[k];
else
found = 1;
}
k++;
j++;
if (pat[j] == pat[k])
next[j] = next[k];
else
next[j] = k;
}
return (0);
}
/*********************************************************************
This function will looks for pat[] int text[] using next[]
It assumes that pat[] is less than 128 bytes long
********************************************************************/
int KMP (unsigned char text[], unsigned n, unsigned char pat[],
unsigned m, signed char next[]) {
int i = 0;
int j = 0;
int pos = 0;
while (i < m && j < n) {
if (pat[i] == text[j]) {
i++;
j++;
}
else {
pos += (i - next[i]);
if (next[i] >= 0))
i = next[i];
else {
i = 0;
j++;
}
}
}
if (i == m) {
cprintf ("%d\r\n", pos);
return (0); /* pos is where found start */
}
else
return (-1);
}
/*********************************************************************/
I need an 80x86 assembler version of the routines above, especially
KMP. If any assembly gurus out there take the challenge and post a
C-callable version of this, it would be much appreciated.
Regards,
Tarkan