[comp.sys.ibm.pc.programmer] Soundex

wooters@icsib13.berkeley.edu (Chuck Wooters) (04/10/90)

Hello-

I saw some articles a while back talking about 
a routine for doing searches based on the way a word
sounds.  I think it was called soundex.

Can someone point me to where I can get some code
to implement this?  Or, alternatively, can someone
mail it to me?

Thanks in advance.
Chuck Wooters
wooters@icsi.berkeley.edu

cs4g6ag@maccs.dcss.mcmaster.ca (Stephen M. Dunn) (04/11/90)

   Well, being a good computer science student, I feel obliged to point
out that this algorithm, like so many others, is in one of Donald
Knuth's _The Art of Computer Programming_ books, specifically
Volume 3, _Sorting and Searching_.  You should be able to find
it in your campus library system.  I'm sure there are many other
books with the soundex algorithm, or similar, in them, but I _have_
to mention Knuth on the net at least once before I graduate :-)

   For those not familiar with soundex, it takes a word and transforms
it into a short code: the first letter is kept, vowels and double
letters are removed, and the remaining letters are converted into
digits so that letters which have similar sounds (e.g. m and n, or t
and d) end up with the same digit.  Well, that's an approximate
description of it, but it's enough to give an idea of how it works.
Note that it assumes that words you feed it are from the English
language; for foreign words, it often doesn't work quite as well as
you would like it to.
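
   To make the description above concrete, here is a minimal sketch in
Python of one common Soundex variant.  It is not taken from Knuth or
from any article mentioned in this thread; the letter groups, the
treatment of H and W, and the names CODES and soundex are assumptions
chosen for illustration, and published versions differ in the details.

```python
# Sketch of one common Soundex variant (an assumption, not Knuth's text).
# Letters with similar sounds share a digit; vowels, H, W and Y get none.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the first letter plus three digits, or "" if no letters."""
    # Keep only the unaccented letters A-Z, folded to upper case.
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        # Emit a digit only when it differs from the previous code,
        # which collapses double letters and same-sounding neighbours.
        if code and code != prev:
            result += code
        # In this variant H and W do not separate equal codes,
        # but vowels do (they reset prev to "").
        if ch not in "HW":
            prev = code
    # Pad with zeros or truncate to exactly four characters.
    return (result + "000")[:4]
```

With this sketch, soundex("Robert") and soundex("Rupert") both give
"R163", and soundex("Smith") and soundex("Smyth") both give "S530", so
a lookup keyed on the code finds names that sound alike.
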
-- 
               More half-baked ideas from the oven of:
****************************************************************************
Stephen M. Dunn                               cs4g6ag@maccs.dcss.mcmaster.ca
     <std_disclaimer.h> = "\nI'm only an undergraduate ... for now!\n";

GMoretti@massey.ac.nz (Giovanni Moretti) (04/12/90)

Chuck
This very afternoon I copied an article (with source code in Turbo
Pascal) on the SOUNDEX algorithm.  It's only about 50 lines long so
converting it to another language shouldn't be a problem.

The reference is:

  Languages, edited by Tony Rizzo, PC Magazine, September 26, 1989, pp. 377-378

The article includes a discussion of how it works as well.

Cheers
Giovanni

-- 
-------------------------------------------------------------------------------
|   GIOVANNI MORETTI, Consultant     | EMail: G.Moretti@massey.ac.nz          |
|Computer Centre,  Massey University | Ph 64 63 69099 x8398, FAX 64 63 505607 |
|   Palmerston North, New Zealand    | QUITTERS NEVER WIN, WINNERS NEVER QUIT |
-------------------------------------------------------------------------------