scott@med.ge.com (Scott Woskoff) (01/10/90)
I'm posting this for a friend, so please reply directly to him at:
uunet!crdgw1!gemed!pete
I have a need to recognize one of a set of keywords, when given as
input string either the complete keyword, the shortest prefix of the
keyword which is unique in the set, or any string in between. The
actual application is a command recognizer which allows the user to
abbreviate the command names.
Example: given the keyword set {abel, absalom}, I should get
the following (input-result) pairs:
input result
abs absalom
absal absalom
absalom absalom
abe abel
ab error: ambiguous input
box error: no such keyword
The appropriate data structure seems to be a slightly modified trie.
Because the set of keywords never changes at runtime, but does usually
change with each release of the program, I want to build the trie at
compile-time, using C #defines, if possible. It's not clear to me,
however, that it is possible to do so. A less attractive alternative
is to build the table at runtime by inserting the keywords one at a
time into the trie during initialization.
Can anyone offer any help? Any alternative solutions? Good
advice?
Peter Schiltz
CSE, W-824
GE Medical Systems
3200 N. Grandview
Waukesha WI 53188
uunet!crdgw1!gemed!peteschmidt@crimee.ics.uci.edu (Doug Schmidt) (01/10/90)
In article <SCOTT.90Jan9153157@harpo.med.ge.com>, scott@med (Scott Woskoff) writes: >I have a need to recognize one of a set of keywords, when given as >input string either the complete keyword, the shortest prefix of the >keyword which is unique in the set, or any string in between. The >actual application is a command recognizer which allows the user to >abbreviate the command names. > >Example: given the keyword set {abel, absalom}, I should get >the following (input-result) pairs: > > input result > abs absalom > absal absalom > absalom absalom > abe abel > ab error: ambiguous input > box error: no such keyword > > >The appropriate data structure seems to be a slightly modified trie. > >Because the set of keywords never changes at runtime, but does usually >change with each release of the program, I want to build the trie at >compile-time, using C #defines, if possible. It's not clear to me, >however, that it is possible to do so. A less attractive alternative >is to build the table at runtime by inserting the keywords one at a >time into the trie during initialization. > >Can anyone offer any help? Any alternative solutions? Good >advice? If you've got access to anonymous ftp you should check out the file trie-gen-1.0.Z on ics.uci.edu (128.195.1.1). This tar file implements a minimal-prefix trie generator, which should do pretty much exactly what you are looking for here. The program is a beta-release, but there is sufficient documentation to make it usable! I solicit comments and suggestions for enhancements. Doug -- Douglas C. Schmidt
roy@prism.gatech.EDU (Roy Mongiovi) (01/10/90)
This doesn't use tries, but I did approximately the same thing in a
text editor I wrote many years ago. I kept a sorted table of all the
keywords I wanted to recognize. To search the table, I did a binary
search basically using "strncmp(keyword, table[i], strlen(keyword))".
If I didn't find a match, I knew the keyword wasn't in the table.
Otherwise I probed linearly backwards and forwards in the table
looking for other matches. If there weren't any the keyword is a
unique abbreviation, otherwise it's ambiguous. I also augmented
this to look for an exact match (strlen(keyword) == strlen(table[i]))
which allowed me to enter very short (i.e. single character) entries
in the table as a shorthand for certain frequently used commands.
It worked really well, it was relatively fast, and it's simple.
What more can you ask for?
--
Roy J. Mongiovi Systems Support Specialist Office of Computing Services
Georgia Institute of Technology Atlanta, Georgia 30332-0275 (404) 894-4660
uucp: ...!{allegra,amd,hplabs,ut-ngp}!gatech!prism!roy
ARPA: roy@prism.gatech.edubill@twwells.com (T. William Wells) (01/10/90)
In article <SCOTT.90Jan9153157@harpo.med.ge.com> <scott@gemed.med.ge.com> (Scott Woskoff) writes:
: Because the set of keywords never changes at runtime, but does usually
: change with each release of the program, I want to build the trie at
: compile-time, using C #defines, if possible. It's not clear to me,
: however, that it is possible to do so.
Write a program to convert your word list to source for a C data
structure.
---
Bill { uunet | novavax | ankh } !twwells!bill
bill@twwells.combrnstnd@stealth.acf.nyu.edu (01/11/90)
Speaking of tries, I've found a method of storing tries in constant space per node with the usual operations (searching down one level, inserting a new node) running in constant time, independent of the alphabet size. Has this been done before? I haven't found anything in the usual journals. ---Dan