[comp.lang.pascal] String matching - parallel/concurrent/randomized info needed

hsu_wh@jhunix.HCF.JHU.EDU (William H Hsu) (07/18/90)

     I am trying to implement some good partial string matching
functions for use with strictly binary files (specifically, the
resource forks of Macintosh files).  I have encountered several
compiler and language specific problems while trying to get two
algorithms (one generic dynamic pattern matching scheme and one
randomized version of the Boyer-Moore/BMG algorithm) to work in
C.  
     First, I'm looking for a way to parallelize my string
matching.  I have seen references to a concurrent version of
Boyer-Moore and dynamic algorithms that run in O(kn) time for k-
approximate matches in text length n (Landau and Vishkin, 18th
Annual ACM Symposium), but have been unable to get my hands on
real or pseudocode.  Since I am looking for 10-100 short, random
strings, it would boost my program efficiency greatly to be able
to search for multiple patterns at once.  Has anyone implemented
a concurrent version of BMG or a dynamic partial-matching
algorithm in C (without parallel hardware)?
     Second, I'm sure that random number generation is a
universally known problem, but I have had problems using srand()
in the right place to get truly "random" pattern strings.  rand()
is used to generate the starting location for the source string. 
But I frequently get consecutive duplicate strings (four or five
"two in a row" strings per hundred).  My randomization call is:
     srand((unsigned)time(NULL));
How can the "randomness" be improved?  I have observed that
moving the line deeper into my nested while/for loops causes a
decrease in randomness.
     Please E-mail responses to: hsu_wh@jhunix.hcf.jhu.edu
(BITNET), hsu@cs.jhu.edu (ARPAnet)