claus@iesd.auc.dk (Claus S. Jensen) (12/13/90)
Does anybody out there have a routine that compares two strings, one of which contains wildcards (*,?). If so, I would appreciate to obtain a copy. Thanks in advance, Claus S.Jensen -- ######################################################################## # Claus S.Jensen , claus@iesd.auc.dk # Superman on Deep Trouble MUD # # University of Aalborg, Denmark # krikand.iesd.auc.dk 2000 # ########################################################################
staff@cadlab.sublink.ORG (Alex Martelli) (12/14/90)
claus@iesd.auc.dk (Claus S. Jensen) writes: ... > Does anybody out there have a routine that compares >two strings, one of which contains wildcards (*,?). It's real easy, thanks to recursion (no "stylistic" flames, please!-): int wildmatch(pat, str) char *pat, *str; { int pat_ch, str_ch; while(pat_ch = *pat++) { if(pat_ch == '*') { if(*pat == 0) return 1; for(;;) { if(wildmatch(pat, str)) return 1; if((str_ch = *str++)==0) return 0; } } else { if((str_ch = *str++)==0) return 0; if(pat_ch != '?' && str_ch != pat_ch) return 0; } } return *str == 0; } Optimizing this, and in particular eliminating the recursive call, is, on the other hand, a rather interesting exercise, but one I'd undertake only if wildcard-matching was proven-by-profiling to be a bottleneck in my application... which is rather unlikely, I think. -- Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 45, Bologna, Italia Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434; Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).
thorinn@rimfaxe.diku.dk (Lars Henrik Mathiesen) (12/19/90)
staff@cadlab.sublink.ORG (Alex Martelli) writes: >claus@iesd.auc.dk (Claus S. Jensen) writes: >> Does anybody out there have a routine that compares >>two strings, one of which contains wildcards (*,?). >It's real easy, thanks to recursion (no "stylistic" flames, please!-): >int wildmatch(pat, str) char *pat, *str; { > . . . >Optimizing this, and in particular eliminating the recursive call, >is, on the other hand, a rather interesting exercise, but one I'd >undertake only if wildcard-matching was proven-by-profiling to be >a bottleneck in my application... which is rather unlikely, I think. There's nothing wrong with recursive calls. The problem with most implementations of this is nested _loops_, and it doesn't matter if they occur in recursive instances of a function, or are simulated with a stack. Try "echo ********.c" in Bourne shell for a demonstration. X users may have noticed the problem using font specifications with too many asterisks. Eliminating these nested loops changes the worst-case time complexity from O(n^(m-1)) to O(n*m), where n and m are the lengths of the string and the pattern, respectively. The point is that once the part of a pattern between two asterisks matches at some position in the string, there is no need to retry it at any later position: The rest of the pattern (after the second asterisk) will already have been tried (and have failed) at every position after the first match, so it will only fail again. Implementation hint: Let wildmatch return one of SUCCESS, RETRY or FAIL. FAIL is returned when the string is exhausted before the pattern, RETRY when it otherwise doesn't match the pattern. Both SUCCESS and FAIL from the recursive call are returned to the caller immediately, and the loop runs only on RETRY. (Once the loop is entered, the function will only return SUCCESS or FAIL.) Once a recursive version is finished, you can easily construct an iterative one (Hint: you only need to save one (pattern,string) position pair). -- Lars Mathiesen, DIKU, U of Copenhagen, Denmark [uunet!]mcsun!diku!thorinn Institute of Datalogy -- we're scientists, not engineers. thorinn@diku.dk