gordon@prls.UUCP (Gordon Vickers) (12/12/89)
To ing@hades.OZ:

I tried to email you but it bounced. Please send me your email address relative to a well-known host, or a surface mail address. I believe I have just what you want. File size (shar format) is 8.2 Kbytes.

Gordon Vickers 408/991-5370 (Sunnyvale, CA); {mips|pyramid|philabs}!prls!gordon
------------------------------------------------------------------------------
Earth is a complex array of symbiotic relationships: Every extinction, whether animal, mineral, or vegetable, hastens our own demise.
ing@hades.OZ (Ian Gold) (12/12/89)
I am looking for a 'soundex' routine in C (or C++), that is, a routine capable of finding a substring of a target string that sounds like a given string. The call would look something like this:

    char *soundex(char *target, char *given);

P.S. If the routine you have is NOT in C, that's fine. I can always convert it.
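One way to read this request is word by word: compute a Soundex-style code for the given string, then return the first word of the target whose code matches. The following is only a minimal sketch under that assumption. The names `soundex_find` and `code4` are hypothetical, the code assumes ASCII, and it compares whole words rather than arbitrary substrings.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 4-character Soundex-style code of the first len
   bytes of word, written to out[5]. Assumes ASCII. */
static void code4(const char *word, size_t len, char out[5])
{
    /* Group digits for A..Z; '0' marks letters that are not coded. */
    static const char groups[] = "01230120022455012623010202";
    char last = '0';    /* group digit of the previous significant letter */
    size_t i, n = 0;

    for (i = 0; i < len && n < 4; i++) {
        unsigned char c = (unsigned char)word[i];
        char d;

        if (!isalpha(c))
            continue;
        c = (unsigned char)toupper(c);
        if (c < 'A' || c > 'Z')
            continue;               /* skip non-ASCII letters */
        d = groups[c - 'A'];

        if (n == 0)
            out[n++] = (char)c;     /* first letter is kept as-is */
        else if (d != '0' && d != last)
            out[n++] = d;           /* new digit, not a repeat */

        /* Vowels reset the repeat check; H and W leave it alone. */
        if (c != 'H' && c != 'W')
            last = d;
    }
    while (n < 4)
        out[n++] = '0';             /* zero-pad short codes */
    out[4] = '\0';
}

/* Hypothetical search: returns a pointer to the first word in target whose
   code equals the code of given, or NULL when no word matches. */
const char *soundex_find(const char *target, const char *given)
{
    char want[5], have[5];
    const char *p = target;

    assert(target != NULL && given != NULL);
    code4(given, strlen(given), want);

    while (*p != '\0') {
        const char *start;

        while (*p != '\0' && !isalpha((unsigned char)*p))
            p++;                    /* skip separators between words */
        start = p;
        while (isalpha((unsigned char)*p))
            p++;                    /* advance to the end of the word */
        if (p > start) {
            code4(start, (size_t)(p - start), have);
            if (strcmp(want, have) == 0)
                return start;
        }
    }
    return NULL;
}
```

A call such as `soundex_find("Please ask Mr. Robert about it", "Rupert")` would return a pointer to "Robert", since both names code to R163 under these rules.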
wew@naucse.UUCP (Bill Wilson) (12/13/89)
From article <488@hades.OZ>, by ing@hades.OZ (Ian Gold):
>
> I am looking for a 'soundex' routine in C (or C++). That is a routine
> capable of finding a substring of a target string that sounds like a given
> string. The call would look something like this.
>
> char *soundex(char *target, char *given);
>

I would be interested in the code as well...

--
Let sleeping dragons lie........ | The Bit Chaser
----------------------------------------------------------------
Bill Wilson (Bitnet: ucc2wew@nauvm | wilson@nauvax)
Northern AZ Univ Flagstaff, AZ 86011
john@riddle.UUCP (Jonathan Leffler) (12/16/89)
In article <1842@naucse.UUCP> wew@naucse.UUCP (Bill Wilson) writes:
>From article <488@hades.OZ>, by ing@hades.OZ (Ian Gold):
>> I am looking for a 'soundex' routine in C (or C++).
>>
>> char *soundex(char *target, char *given);
>>
>I would be interested in the code as well...

Will this do?

: "@(#)shar2.c 1.5"
#!/bin/sh
# shar: Shell Archiver (v1.22)
#
# This is a shell archive.
# Remove everything above this line and run sh on the resulting file
# If this archive is complete, you will see this message at the end
# "All files extracted"
#
# Created: Fri Dec 15 21:33:37 1989 by john at Sphinx Ltd.
# Files archived in this archive:
# soundex.c
#
if test -f soundex.c; then echo "File soundex.c exists"; else
echo "x - soundex.c"
sed 's/^X//' << 'SHAR_EOF' > soundex.c &&
X/*
X** SOUNDEX CODING
X**
X** Rules:
X** 1. Retain the first letter; ignore non-alphabetic characters.
X** 2. Replace second and subsequent characters by a group code.
X**        Group   Letters
X**        1       BFPV
X**        2       CGJKSXZ
X**        3       DT
X**        4       L
X**        5       MN
X**        6       R
X** 3. Do not repeat digits
X** 4. Truncate or zero-pad to 4-character result.
X**
X** Originally formatted with tabstops set at 4 spaces -- you were warned!
X**
X** Code by: Jonathan Leffler (john@sphinx.co.uk)
X** This code is shareware -- I wrote it; you can have it for free
X** if you supply it to anyone else who wants it for free.
X**
X** BUGS: Assumes ASCII
X*/
X
X#include <ctype.h>
Xstatic char lookup[] = {
X    '0',    /* A */
X    '1',    /* B */
X    '2',    /* C */
X    '3',    /* D */
X    '0',    /* E */
X    '1',    /* F */
X    '2',    /* G */
X    '0',    /* H */
X    '0',    /* I */
X    '2',    /* J */
X    '2',    /* K */
X    '4',    /* L */
X    '5',    /* M */
X    '5',    /* N */
X    '0',    /* O */
X    '1',    /* P */
X    '0',    /* Q */
X    '6',    /* R */
X    '2',    /* S */
X    '3',    /* T */
X    '0',    /* U */
X    '1',    /* V */
X    '0',    /* W */
X    '2',    /* X */
X    '0',    /* Y */
X    '2',    /* Z */
X};
X
X/*
X** Soundex for arbitrary number of characters of information
X*/
Xchar *nsoundex(str, n)
Xchar *str;  /* In: String to be converted */
Xint n;      /* In: Number of characters in result string */
X{
X    static char buff[10];
X    register char *s;
X    register char *t;
X    char c;
X    char l;
X
X    if (n <= 0)
X        n = 4;  /* Default */
X    if (n > sizeof(buff) - 1)
X        n = sizeof(buff) - 1;
X    t = &buff[0];
X
X    for (s = str; ((c = *s) != '\0') && t < &buff[n]; s++)
X    {
X        if (!isascii(c))
X            continue;
X        if (!isalpha(c))
X            continue;
X        c = toupper(c);
X        if (t == &buff[0])
X        {
X            l = *t++ = c;
X            continue;
X        }
X        c = lookup[c-'A'];
X        if (c != '0' && c != l)
X            l = *t++ = c;
X    }
X    while (t < &buff[n])
X        *t++ = '0';
X    *t = '\0';
X    return(&buff[0]);
X}
X
X/* Normal external interface */
Xchar *soundex(str)
Xchar *str;
X{
X    return(nsoundex(str, 4));
X}
X
X/*
X** Alternative interface:
X** void soundex(given, gets)
X** char *given;
X** char *gets;
X** {
X**     strcpy(gets, nsoundex(given, 4));
X** }
X*/
X
X
X#ifdef TEST
X#include <stdio.h>
Xmain()
X{
X    char buff[30];
X
X    while (fgets(buff, sizeof(buff), stdin) != (char *)0)
X        printf("Given: %s Soundex produces %s\n", buff, soundex(buff));
X}
X#endif
SHAR_EOF
chmod 0640 soundex.c || echo "$0: failed to restore soundex.c"
fi
echo All files extracted
exit 0
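A few properties of the posted routine are worth knowing before relying on it. It returns a pointer to a static buffer, so a second call overwrites the first result. Its table leaves Q uncoded, and it compares the second letter against the first letter itself rather than against that letter's group digit, so "Pfister" comes out as P123. Under the Soundex rules as described by the U.S. National Archives, Q belongs to group 2, a letter in the same group as the first letter is dropped, vowels separate repeated digits, and H and W do not, which gives P236. The following is a sketch of a variant along those lines, not a replacement endorsed by the author. The name `soundex_std`, its caller-supplied buffer, and the `SOUNDEX_DEMO` macro are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical variant, not the posted routine: writes an n-character code
 * plus a terminating NUL into out, which must hold at least n + 1 bytes.
 * Assumes ASCII, like the original.
 */
void soundex_std(const char *name, char *out, size_t n)
{
    /* Group digits for A..Z; Q is placed in group 2 here. */
    static const char groups[] = "01230120022455012623010202";
    char last = '0';    /* group digit of the previous significant letter */
    size_t len = 0;

    assert(name != NULL && out != NULL && n > 0);

    for (; *name != '\0' && len < n; name++) {
        unsigned char c = (unsigned char)*name;
        char d;

        if (!isalpha(c))
            continue;
        c = (unsigned char)toupper(c);
        if (c < 'A' || c > 'Z')
            continue;               /* skip non-ASCII letters */
        d = groups[c - 'A'];

        if (len == 0)
            out[len++] = (char)c;   /* first letter is kept as-is */
        else if (d != '0' && d != last)
            out[len++] = d;         /* new digit, not a repeat */

        /* Vowels (digit '0') reset the repeat check; H and W do not. */
        if (c != 'H' && c != 'W')
            last = d;
    }
    while (len < n)
        out[len++] = '0';           /* zero-pad short codes */
    out[len] = '\0';
}

#ifdef SOUNDEX_DEMO
#include <stdio.h>
int main(void)
{
    char code[5];

    soundex_std("Pfister", code, 4);
    printf("Pfister -> %s\n", code);
    return 0;
}
#endif
```

Passing the output buffer in, as the commented-out alternative interface in the archive also does, lets a caller hold two codes at once and compare them.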
exnirad@brolga.cc.uq.oz.au (Nirad Sharma) (08/31/90)
I have been using the SOUNDEX function supplied with Oracle V5 and have found it very convenient, except that it may be too ambiguous. I noticed that SOUNDEX only returns a 5 (or 4, I forget) character string. Is it possible that other soundex algorithms allow less ambiguity by making use of more characters? If so, and if the source (preferably C) exists, could someone please tell me how to get it?

While I'm at it, are there any FTP sites holding various Oracle bits, e.g. forms, scripts and the like?

Thanks for any help.

Nirad Sharma (exnirad@brolga.cc.uq.oz.au)
Continuing Education Unit
The University of Queensland
AUSTRALIA
buckland@cheddar.ucs.ubc.ca (Tony Buckland) (08/31/90)
In article <1990Aug31.020725.6451@brolga.cc.uq.oz.au> exnirad@brolga.cc.uq.oz.au (Nirad Sharma) writes:
>I have been using the SOUNDEX function supplied with Oracle V5 and have
>found it to be very convenient except that it may be too ambiguous. I
>noticed that SOUNDEX only returns a 5 (or 4 - I forget) character string.
>Is it possible that other soundex algorithms allow less ambiguity by making
>use of more characters ?

This goes *way* back, about 30 years to when I did payroll work, but if I recall correctly, the Soundex algorithm (pre-computer, of course) we used then always produced the same small number of characters so that codes could be compared. Varying length would defeat this purpose, and a longer fixed length would require progressively more useless padding of short codes for progressively more names as the fixed length increased.
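The trade-off between code length and ambiguity can be tried out with a length-parameterised encoder. The following is a rough sketch only: `soundex_n` and `same_code` are hypothetical names, the coding rules are the commonly published ones rather than whatever Oracle V5 implements, and ASCII input is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: n-character Soundex-style code of name, written to
   out, which must hold at least n + 1 bytes. Assumes ASCII. */
void soundex_n(const char *name, size_t n, char *out)
{
    /* Group digits for A..Z; '0' marks letters that are not coded. */
    static const char groups[] = "01230120022455012623010202";
    char last = '0';    /* group digit of the previous significant letter */
    size_t len = 0;

    assert(name != NULL && out != NULL && n > 0);

    for (; *name != '\0' && len < n; name++) {
        unsigned char c = (unsigned char)*name;
        char d;

        if (!isalpha(c))
            continue;
        c = (unsigned char)toupper(c);
        if (c < 'A' || c > 'Z')
            continue;               /* skip non-ASCII letters */
        d = groups[c - 'A'];

        if (len == 0)
            out[len++] = (char)c;   /* first letter is kept as-is */
        else if (d != '0' && d != last)
            out[len++] = d;         /* new digit, not a repeat */

        /* Vowels reset the repeat check; H and W leave it alone. */
        if (c != 'H' && c != 'W')
            last = d;
    }
    while (len < n)
        out[len++] = '0';           /* fixed length: pad short codes */
    out[len] = '\0';
}

/* Returns 1 when two names share the same n-character code, else 0. */
int same_code(const char *a, const char *b, size_t n)
{
    char ca[16], cb[16];

    assert(n < sizeof ca);
    soundex_n(a, n, ca);
    soundex_n(b, n, cb);
    return strcmp(ca, cb) == 0;
}
```

Under these rules, Roberts and Robertson share the code R163 at four characters but separate into R16320 and R16325 at six, while a short name such as Lee goes from L000 to L00000, which is the extra padding described above.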