[comp.compression] What is a trie ??

d88-jwa@byse.nada.kth.se (Jon W{tte) (05/27/91)

In article <24May91.206106.345@franklin.com> bill@franklin.com (bill) writes:

   From: bill@franklin.com (bill)

   BTW, the *correct* answer to the question of the "normal" way of
   storing the lexicon is this: store the lexicon as a trie. This is

What is a trie ? I've seen it mentioned over and again in this
group, but not being native english I haven't heard the word
elsewhere. I'm probably familiar with the data structure when
you explain it to me, but until then, I remain in the dark.

Any input gladly accepted,

--
						Jon W{tte
						h+@nada.kth.se
						- Power !

bill@franklin.com (bill) (05/27/91)

In article <D88-JWA.91May26212010@byse.nada.kth.se> d88-jwa@byse.nada.kth.se (Jon W{tte) writes:
: In article <24May91.206106.345@franklin.com> bill@franklin.com (bill) writes:
:
:    From: bill@franklin.com (bill)
:
:    BTW, the *correct* answer to the question of the "normal" way of
:    storing the lexicon is this: store the lexicon as a trie. This is
:
: What is a trie ? I've seen it mentioned over and again in this
: group, but not being native english I haven't heard the word
: elsewhere. I'm probably familiar with the data structure when
: you explain it to me, but until then, I remain in the dark.

I can't say what the "official" definition of trie is; Knuth, I
think, defines it somewhere, but the definition I use is this:
Think of an N-ary tree in which the arcs of the tree are labeled
with letters. A node of the tree may be labeled as "terminal". A
trie contains a word if there is a path which is marked with the
letters of the word from the root to a node marked terminal.

My copy of _Data Structures and Algorithms_ (Aho, Hopcroft,
Ullman) has a superficial section on them, which uses a
definition not too dissimilar to mine.

(BTW, I despise this neologism but we're stuck with it. A trie is
a tree!)

There are many ways of representing tries. One is as a
conventional tree. Another is "hash tries", which you can see
examples of in the TeX package (in the hyphenation pattern
generation code), and in one of the ACM Programming Pearls
columns, the one where Bently has Knuth showing off Web.

One favored trick with representing tries (and trees) is to merge
identical subtrees. One finds identical subtrees of the tree and
constructs a DAG (Directed Acyclic Graph) by making arcs to those
subtrees point instead to a single instance of the subtree. With a
little cleverness, it is possible to do this kind of compression
very quickly and in a small amount of memory over and above that
which is needed to store the trie itself. Doing this results in
dramatic space savings in the trie. I believe that Graham Toal's
stuff contains code for this.

brad@looking.on.ca (Brad Templeton) (05/27/91)

A trie is nothing fancy, it's a N-ary tree for re"trie"val, and is discussed
in Knuth, "Searching and Sorting."

What, you don't have a copy of Knuth?  While those pseudo-assembler
programs and certain other stuff do make the book a bit dated, it's still
one of the fundamental texts of CS.
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

d88-jwa@byse.nada.kth.se (Jon W{tte) (05/27/91)

brad@looking.on.ca (Brad Templeton) writes:

   From: brad@looking.on.ca (Brad Templeton)

   A trie is nothing fancy, it's a N-ary tree for re"trie"val, and is discussed
   in Knuth, "Searching and Sorting."

Okay, thanx ! (bill@franklin also had a more fleshed-out definition,
but as I suspected, I was familiar with the structure, only not the
word...)

   What, you don't have a copy of Knuth?  While those pseudo-assembler
   programs and certain other stuff do make the book a bit dated, it's still
   one of the fundamental texts of CS.

Yeah, there is a reference ex. somewhere here... but I don't have a
personal copy, though I imagine the bookstores here carry it as well...

--
						Jon W{tte
						h+@nada.kth.se
						- Speed !

jones@pyrite.cs.uiowa.edu (Douglas W. Jones,201H MLH,3193350740,3193382879) (05/28/91)

From article <D88-JWA.91May26212010@byse.nada.kth.se>,
by d88-jwa@byse.nada.kth.se (Jon W{tte):
> 
> What is a trie?

A trie is any tree where branches are labeled with the letters of some
alphabet and where the path label, treated as a string of these letters,
is used to encode or decode the use of that path.  Thus, the tree used
in a Huffman code is a trie, and the hash table used by the LZW compression
algorithm is also a funny but very useful representation of a trie.

I've never seen any reason to use the word -- it's a cute pun on the word
tree, but the problem is, too many people don't understand what you mean
when you say it.  Rather than take the time to tell them what I meant, I
prefer to avoid the problem by avoiding the word.
							Doug Jones
							jones@cs.uiowa.edu

mohan@nynexst.com (Mohan Arkalgud) (06/05/91)

In article <6228@ns-mx.uiowa.edu>, jones@pyrite.cs.uiowa.edu (Douglas W. Jones,201H MLH,3193350740,3193382879) writes:
|> From article <D88-JWA.91May26212010@byse.nada.kth.se>,
|> by d88-jwa@byse.nada.kth.se (Jon W{tte):
|> > 
|> > What is a trie?
|> 
|> A trie is any tree where branches are labeled with the letters of some
|> alphabet and where the path label, treated as a string of these letters,
|> is used to encode or decode the use of that path.  Thus, the tree used
|> in a Huffman code is a trie, and the hash table used by the LZW compression
|> algorithm is also a funny but very useful representation of a trie.
|> 
|> I've never seen any reason to use the word -- it's a cute pun on the word
|> tree, but the problem is, too many people don't understand what you mean
|> when you say it.  Rather than take the time to tell them what I meant, I
|> prefer to avoid the problem by avoiding the word.
|> 							Doug Jones
|> 							jones@cs.uiowa.edu

	The term 'trie' originated as letters extracted from the word 
               'retrieval'.

	I agree with you that there is no overwhelming reason for the choice of
 the word trie.

--- Mohan Arkalgud
    mohan@nynexst.com