lane@sumex-aim.stanford.edu (Christopher Lane) (09/08/90)
I've ftp'd the file Homophones.tar.Z to the submissions directory of the
cs.orst.edu archive. It is a homophone dictionary utility that builds and
interrogates a database of homophonic word sets derived from the
Merriam-Webster database.

    ho-mo-phone \'haHm-e-,foEn, 'hoE-me-\ n [ISV] (1843)
    1: one of two or more words pronounced alike but different in
       meaning or derivation or spelling
    2: a character or group of characters pronounced the same as
       another character or group

The homophone dictionary is a side effect of our speech recognition work
and contains just over 1200 homophonic word sets, e.g.:

    ['aE-bel]             abel able
    [ik-'sept]            accept except
    [,ak-le-'maE-shen]    acclamation acclimation
    ['oGl]                all awl
    ['berth]              berth birth
    ['boEl]               bole boll bowl

The program that extracts the dictionary from Webster is an example of
using:

  o Programmatic access to Webster -- this utility 'walks' the Webster
    database word by word (getting around at least one bug).

  o The 'Storage' object -- In all the example program sources we have
    online, I found only one (Tools) that used the Storage object. Now I
    know why.

  o The 'HashFile' object -- Yet another example of using my HashFile
    object (posted to the archive earlier), an interface to the 'db'
    routines. This program keeps objects out on a database file and
    accesses/loads them as needed.

Due to inconsistencies in the pronunciations in Webster, the homophonic
database does not contain all possible homophonic word sets. More could be
generated by further manipulation and filtering of the pronunciation
strings. I'd be interested in hearing from anyone who does so.

- Christopher
-------
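
For anyone who wants to experiment with the pronunciation filtering mentioned above, here is a minimal sketch in Python of the grouping idea: words that share a pronunciation string form a homophonic set. This is not the posted program (which uses the Webster, Storage and HashFile objects). The names homophone_sets and normalize are hypothetical, and the filter shown (dropping stress marks and syllable hyphens) is only an assumption about one kind of inconsistency that might be worth removing. The sample pronunciations are the ones quoted in this post.

```python
from collections import defaultdict


def normalize(pron):
    """Hypothetical filter: drop stress marks (' and ,) and syllable hyphens.

    This is an assumed example of 'filtering' a pronunciation string,
    not a rule taken from Webster or from the posted program.
    """
    return pron.translate(str.maketrans("", "", "',-"))


def homophone_sets(entries, key=lambda pron: pron):
    """Group (word, pronunciation) pairs by pronunciation.

    `key` is applied to each pronunciation before grouping; the default
    compares the strings exactly. Only groups of two or more words are kept.
    """
    groups = defaultdict(set)
    for word, pron in entries:
        groups[key(pron)].add(word)
    return {p: sorted(words) for p, words in groups.items() if len(words) > 1}


# Sample entries using pronunciation strings quoted in the post.
entries = [
    ("abel", "'aE-bel"), ("able", "'aE-bel"),
    ("accept", "ik-'sept"), ("except", "ik-'sept"),
    ("bole", "'boEl"), ("boll", "'boEl"), ("bowl", "'boEl"),
    ("homophone", "'haHm-e-,foEn"),
]

for pron, words in sorted(homophone_sets(entries).items()):
    print("[%s] %s" % (pron, " ".join(words)))
```

Passing key=normalize instead of the default would merge entries whose pronunciations differ only in stress marks or hyphenation. Whether that catches real homophones or creates false ones would have to be checked against the actual database.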