mwang (03/01/83)
_D_E_P_A_R_T_M_E_N_T _O_F _C_O_M_P_U_T_E_R _S_C_I_E_N_C_E
_U_N_I_V_E_R_S_I_T_Y _O_F _W_A_T_E_R_L_O_O
_S_E_M_I_N_A_R _A_C_T_I_V_I_T_I_E_S
_T_H_E_O_R_Y/_S_Y_S_T_E_M_S _S_E_M_I_N_A_R
- Friday, March 11, 1983.
Mr. H. Johnson, a graduate student of this department,
will speak on ``Rational Equivalence Relations and
String Similarity''.
TIME: 3.30 PM
ROOM: MC 5158
_A_B_S_T_R_A_C_T
Soundex code is used to map surnames into codes in such
a way that similar names are mapped into the same value
and dissimilar names into different values. As Knuth
points out in ``Sorting and Searching'' (p. 392), it is
not completely successful since some similar names get
different values and some quite dissimilar names get
the same value. In spite of its imperfections, how-
ever, Soundex is often used in practical applications.
Although Soundex is based on linguistic arguments, its
form is that of a sub-sequential function as defined by
Schutzenberger. The theory of sub-sequential functions
and the more general theory of rational relations pro-
vide a natural theoretical framework for Soundex and
similar coding functions since they combine the advan-
tages of concise expression with good algorithms.
Equivalence of Soundex code partitions the space of
names into equivalence classes. It is of interest to
consider more general equivalence models. This leads
to a hierarchy of classes of rational equivalence rela-
tions which show a tradeoff between power and effi-
ciency.
March 1, 1983