alter@ttidca.TTI.COM (Steve Alter) (12/21/88)
A couple of months ago I posted a request for public-domain (or at least legally copyable) versions of the "gmatch" or "glob" routine and got a very satisfactory number of responses. So many, in fact, that I decided to compare them all and pick the fastest for use in a program of my own, which will be posted. I've completed my tests of all those gmatch-lookalikes. As it turns out, only one routine had really crippling bugs in it; the rest only differed in handling of marginal or degenerate cases such as when the pattern and/or the subject-string are null. They also differed on the character used to indicate the negation of a character-class, but that's no big deal to change. As far as the time-tests are concerned, I really put them through their paces. I wrote a program that feeds a specified pattern and subject-string to each routine and repeating for a specified number of iterations. I then wrote a shell-script that runs this test program using 69 different pattern/string pairs, and 100000 (one hundred thousand) iterations of each. I then ran this shell script through 50 complete passes (that's a total of five million iterations) on each of two different varieties of Vax processors: 11/785 (4.3 BSD) and a VaxStation 3200 (Ultrix 1.2). The 3200 ran the whole set in two days whereas the 11/785 needed three WEEKS (it actually stayed up the whole time!) Disclaimer: I don't claim that my 69 test-pairs represent any kind of "normal" situation; I just tried to catch lots of the syntax combinations. Why five million iterations? To completely flatten out any discrepancies caused by fluctuations in system-load and obtain a statistically significant average. I was aiming for a very high signal-to-noise ratio. It worked because in each test, the standard deviation turned out to be a very tiny fraction of the mean (all less than 2% on the VaxStation, most around 0.7%) so I believe that I got very accurate overall timings. So who is the fastest? WE HAVE A WINNER! THE WINNER IS... (the envelope, please)... The Free Software Foundation with the "glob_match" function! Now I don't feel right in posting this routine because it's not really public-domain (we don't see source to gcc on the net, do we?) but I'll send it to anyone who wants it, as allowed by the FSF license. If anybody wants a copy of the test environment (I can't imagine why, but who knows?) then I'll send that out too. Great thanks to David MacKenzie <edf@rocky2.rockefeller.edu> for sending me the gnuglob package in the first place. -- Steve Alter <alter@tti.com> {csun,philabs,psivax,pyramid,quad1,rdlvax,retix}!ttidca!alter Citicorp/TTI, Santa Monica CA (213) 452-9191 x2541