nels@astrovax.UUCP (02/13/87)
A misspelling has crept into the dictionary used by the spell program here (spell erroneously accepts a misspelling of a famous astronomer's name; somebody here got an angry letter from her when the misspelling appeared in a paper of his!). Is there an easy way to remove it? Unfortunately, we do NOT have an up-to-date ASCII list of all the words added to the spell database.

Nels Anderson
Princeton University, Astrophysics
UUCP: {allegra,akgua,cbosgd,decvax,ihnp4,noao,philabs,princeton,topaz}!astrovax!nels
ARPANET: nels%astrovax@rutgers.edu
BITNET: 6070106@pucc
-- or, if your mailer doesn't mind, nels@astrovax.princeton.edu
boykin@custom.UUCP (02/13/87)
In article <843@astrovax.PRINCETON.EDU>, nels@astrovax.PRINCETON.EDU (Nels Anderson) writes:
> A misspelling has crept into the dictionary used by the spell program here
> ... Is there an easy way to remove it?
> Unfortunately, we do NOT have an up-to-date ASCII list of all the words
> added to the spell database.

The only guaranteed way of doing this is to have the original list; however, if you're not using SVR2 or later, then you can remove the word. SVR2 uses a Huffman encoding, and I don't know of a way to remove an entry from it.

Other versions of spell use a 50K bitwise hash table. For each word in the original list, 12 (?) bits are set within the table. Removing a single word involves knowing which 12 bits are set and writing a program (or using a debugger) to reset those bits. One way of finding this out is to create a new database with only that one word in it. For PC/SPELL under DOS the program is SPINSHST; I don't remember the name of the UNIX program or its syntax, so check the UNIX manuals for details. Anyway, create the new database, dump it out to see which bits are set, and clear those bits within the original database.

That's the good news; now for the bad news! While you just removed that one entry, there is a definite possibility you've removed other words as well! For a word to be considered in the database, all 12 of its bits must be set; however, two words can have some of those bits in common. Clearing a shared bit removes every word that depends on it, so you will probably find that spell now flags a number of correctly spelled words as misspellings!

My recommendation would be to start over and maintain the list of words you need. Start with a reasonable list of known valid words (the distribution tape is a reasonable place to start!). SPELL keeps a log of misspelled words; look at it regularly and rebuild your database when people complain.

Good luck!

Joe Boykin
Custom Software Systems
{necntc, frog}!custom!boykin
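
For anyone who wants to see the shared-bit problem from the post above in action, here is a minimal sketch in Python. It is NOT the real spell hash: the hash function (SHA-256 with a per-bit salt), the tiny 512-bit table, and the names `bit_positions`, `BitTable`, and `remove_and_report` are all assumptions made up for illustration. Only the figure of 12 bits per word comes from the post. The table is kept deliberately small so that overlaps between words are likely to show up.

```python
import hashlib

TABLE_BITS = 512    # assumption: tiny on purpose; the real table is far larger
BITS_PER_WORD = 12  # the "12 (?)" figure quoted in the post


def bit_positions(word, table_bits=TABLE_BITS, k=BITS_PER_WORD):
    """Return the set of bit positions a word maps to (hypothetical hash)."""
    positions = set()
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
        positions.add(int.from_bytes(digest[:8], "big") % table_bits)
    return positions


class BitTable:
    """A bit table in which a word is present only if all of its bits are set."""

    def __init__(self, table_bits=TABLE_BITS):
        self.table_bits = table_bits
        self.bits = bytearray((table_bits + 7) // 8)

    def _test(self, pos):
        return bool(self.bits[pos // 8] & (1 << (pos % 8)))

    def add(self, word):
        for pos in bit_positions(word, self.table_bits):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, word):
        return all(self._test(p) for p in bit_positions(word, self.table_bits))

    def clear_word(self, word):
        """Reset every bit belonging to word, as the post describes."""
        for pos in bit_positions(word, self.table_bits):
            self.bits[pos // 8] &= ~(1 << (pos % 8)) & 0xFF


def remove_and_report(table, bad_word, known_words):
    """Clear bad_word's bits and list the known words lost as a side effect."""
    table.clear_word(bad_word)
    return [w for w in known_words if w != bad_word and not table.contains(w)]


if __name__ == "__main__":
    good = ["galaxy", "nebula", "quasar", "pulsar", "redshift",
            "spectrum", "photon", "orbit", "parallax", "corona"]
    bad = "recieve"  # stand-in for the misspelling to be removed
    table = BitTable()
    for w in good + [bad]:
        table.add(w)
    lost = remove_and_report(table, bad, good)
    print("still accepts the bad word:", table.contains(bad))
    print("valid words lost as collateral:", lost)
```

Note that `remove_and_report` can only tell you what you broke if you already have a list of the valid words, which is exactly what the original poster is missing. That is why rebuilding from a maintained word list is the safer route.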