ram@lynx.Berkeley.EDU (m.v.s. ramanath) (05/30/91)
There seems to be a serious need for a public domain dictionary. Since dictionary publishers are unlikely to make their materials available for free, I propose that we net folks create our own.

I assume that English words and their meanings are not copyrightable (though the specific wording of a meaning as published in an extant dictionary may be). So as long as we come up with our own definitions and don't lift stuff verbatim from published dictionaries, we should be OK. This won't, of course, be a scholarly and definitive effort, but it should be useful in an everyday context.

Here is the proposed plan of action: I estimate that a dictionary needs about 180,000 words to be reasonably useful. That means that if we get about 500 (actually 494) definitions typed in per day, we'll be done in less than a year. So we need about 100 volunteers who each undertake to come up with the definitions (in their own unique words) of 5 words per day, or 50 who can do 10 per day.

So how about it, folks? Volunteers? If there is enough interest in this idea (either from potential users of the dictionary or from potential volunteers) I'll flesh out the idea some more. If there is a fatal flaw in the idea, I'd like to hear about it.

Ram
======================================================================
Disclaimer: I speak for myself only and not for my employer
======================================================================
M.V.S. Ramanath                       | ram@imagen.com or
QMS/Imagen                            | imagen!ram@sun.com or
2650 San Tomas Expressway             | imagen!ram@decwrl.dec.com
Santa Clara, California, U.S.A. 95052 | Phone: (408) 986-9400 (ext. 431)
lee@sq.sq.com (Liam R. E. Quin) (05/31/91)
ram@lynx.Berkeley.EDU (m.v.s. ramanath) writes:
>There seems to be a serious need for a public domain dictionary.

I've been thinking about exactly this issue for some time... and had even got an article written (but not typed) along the same lines in the last few days!

>That means we need about 100 volunteers who each undertake to come up with
>the definitions (in their own unique words) of 5 words per day. Or
>50 who can do 10 per day.

Of course, quality control and checks for regional variations are very important. Dictionary definitions are extraordinarily hard to write well. Although perhaps we could do a plausible job, I doubt that Chambers or Oxford or Webster need worry... :-)

I had half-planned the following:

* For each person writing definitions, there would be at least two people reading definitions, with the ability to comment on them.
* A writer is sent n randomly-chosen words (for example, 30 words taken from random Usenet articles and other sources, subject to other checking). The software would keep a list of which words were sent to whom, of course; that's easy.
* When the writer returns some or all of the words, the software sends the same number as were returned, crosses the received words off the list, and re-sends the un-returned ones.
* A writer can work at any rate, and can "refuse" to do some or all of the words.
* The words received are put on a list to be sent out to readers, who check for typos and local variations (e.g. "momentarily" means different things to different people). The same sort of thing happens for words received from reader-people.

I'm even prepared to work on such software (as well as type words...). Much of the challenge is to automate enough that no one person has to see 500 words a day, as that would be (to say the least) a full-time job.
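[A minimal sketch of the bookkeeping described above, purely for illustration; the class and method names are invented, and the "refuse"/reader-checking steps are left out for brevity:]

```python
import random

class WordPool:
    """Tracks which words have been sent to which volunteer writer."""

    def __init__(self, words):
        self.unassigned = list(words)   # words nobody is working on yet
        self.out = {}                   # writer -> set of words sent out
        self.done = set()               # words whose definitions came back

    def send(self, writer, n):
        """Send up to n randomly chosen words to a writer."""
        batch = random.sample(self.unassigned, min(n, len(self.unassigned)))
        for w in batch:
            self.unassigned.remove(w)
        self.out.setdefault(writer, set()).update(batch)
        return batch

    def receive(self, writer, returned):
        """Cross returned words off the list and send the same number
        of new ones; un-returned words stay assigned to the writer."""
        pending = self.out.get(writer, set())
        accepted = set(returned) & pending
        self.done |= accepted
        self.out[writer] = pending - accepted
        return self.send(writer, len(accepted))
```

A writer who returns two of five words would automatically be sent two fresh ones, which is the "no one person sees 500 words a day" property the posting is after.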
For fun, by the way, I have some dictionary entries already, but they are mostly from seventeenth century dictionaries :-) Liam -- Liam Quin, lee@sq.com, SoftQuad, Toronto, +1 416 963 8337 the barefoot programmer
npn@cbnewsl.att.com (nils-peter.nelson) (06/01/91)
I've been holding on to these, not knowing what to do with them. Thanks for the opportunity. Now, for these never-before-published definitions:

A is a letter.
Di- is a two letter prefix.
Tri- is a three letter prefix.
Quad- is a four letter prefix.
Penta- is a five letter prefix.
Hexads are groups of 6 letters.
Heptads are groups of 7 letters.
Octogram is an 8 letter word.
Nonagraph is a sequence of 9 letters.
Decagraphs are sequences of 10 letters.
Undecagraph is a sequence of 11 letters.
Duodecagraph is a sequence of 12 letters.
Tredecagraphs are sequences of 13 letters.
Quattuordecads are groups of 14 letters.
Quindecagraphic describes 15 letter words.
Sexdecagrammaton is a 16 letter word.
Septendecagraphic describes 17 letter words.
Octodecagrammatons are 18 letter words.
Novemdecagrammatons are 19 letter words.
gtoal@tardis.computer-science.edinburgh.ac.uk (06/01/91)
In article <1991May29.235751.1362@imagen.com> ram@lynx.Berkeley.EDU (m.v.s. ramanath) writes: >There seems to be a serious need for a public domain dictionary. Not many people know this... but... there already is one -- on uk's UUG archive at uk.ac.ic.doc - but I don't know how people on internet can get at it. I think it's about 5Mb - it's called DICT.Z (rather unoriginally :) ) and I've no idea of its provenance, except that it is clearly American. It includes a phonetic representation of most of the words in it too, for those of you interested in that sort of thing. Maybe someone at ic.doc could say where it came from? Graham
gaynor@yoko.rutgers.edu (Silver) (06/01/91)
I don't have anything useful to contribute to this discussion yet, but mainly just wanted to acknowledge my interest and willingness to participate in a minor role. Regards, [Ag]
mh@awds23.imsd.contel.com (Mike Hoegeman) (06/02/91)
In article <1991May31.025805.24100@sq.sq.com> lee@sq.sq.com (Liam R. E. Quin) writes:
>ram@lynx.Berkeley.EDU (m.v.s. ramanath) writes:
>>There seems to be a serious need for a public domain dictionary.
>
>I've been thinking about exactly this issue for some time... and had even
>got an article written (but not typed) along the same lines in the last
>few days!
>
>Of course, quality control and checks for regional variations are very
>important. Dictionary definitions are extraordinarily hard to write well.
>Although perhaps we could do a plausible job, I doubt that Chambers or
>Oxford or Webster need worry... :-)
>
>I had half-planned the following:
>* for each person writing definitions, there would be at least two people
>  reading definitions with the ability to comment on them

This sounds good to me. It might even be a good idea to make a newsgroup, let the author post their definition, and let anyone who wishes to reply do so. The author can then, after a suitable period, revise their definition. I think it would be nice to allow accompanying articles (kind of like encyclopedia entries), for those who are ambitious.

>* a writer is sent n randomly-chosen words (for example, 30 words taken from
>  random usenet articles and other sources, subject to other checking)
>  The software would keep a list of which words were sent to whom, of
>  course; that's easy
>* when the writer returns some or all of the words, the software sends the
>  same number as were returned, crosses the received words off the list,
>  and re-sends the un-returned ones.

I can understand the reasons for issuing words randomly, but I would enjoy the project much more if I could pick some of the words I was to write entries for. Maybe have a policy that for every assigned word you write an entry for, you can write one of your own choosing. This would make the "word check out" software more complicated, but worth it in my opinion. It would also probably increase the quality of the dictionary.
>* a writer can work at any rate, and can "refuse" to do some or all of
>  the words.
>* the words received are put on the list to be sent out to readers to
>  check for typos, local variations (e.g. momentarily means different
>  things to different people)..
>The same sort of thing for words received from reader-people.
>I'm even prepared to work on such software (as well as type words...)

Me too...

>Much of the challenge is to automate enough that no one person has to
>see 500 words a day, as that would be (to say the least) a full-time
>job.
>For fun, by the way, I have some dictionary entries already, but they
>are mostly from seventeenth century dictionaries :-)

I'm one of those people who love reading obscure dictionary entries and other interesting lexical material. I think this could be a good piece of reading material as well as a good desk reference. I would love to have your 17th century entries just as much as something more run of the mill.
--------------------------------------------------------------------------
mike hoegeman, mh@awds.imsd.contel.com
jdc@naucse.cse.nau.edu (John Campbell) (06/03/91)
From article <1991Jun2.015314.5771@wlbr.imsd.contel.com>, by mh@awds23.imsd.contel.com (Mike Hoegeman): : In article <1991May31.025805.24100@sq.sq.com> lee@sq.sq.com (Liam R. E. Quin) writes: : ram@lynx.Berkeley.EDU (m.v.s. ramanath) writes: : >There seems to be a serious need for a public domain dictionary. : : - - This sounds good to me. It might even be a good idea make a newsgroup : and let the author post their definition and let anyone who wishes : reply do so. The author can then after a suitable period revise their : definition. I think it would be nice to allow accompanying articles : (kind of like encyclopedia entries). For those who are ambitious. : Isn't this how Douglas Adams wrote "Hitch Hikers Guide to the Galaxy"? -- John Campbell jdc@naucse.cse.nau.edu CAMPBELL@NAUVAX.bitnet unix? Sure send me a dozen, all different colors.
GONTER@awiwuw11.wu-wien.ac.at (Gerhard Gonter) (06/07/91)
There's clearly a need for a public domain online dictionary system, and there are already a lot of such things available on various ftp sites etc. Most of them are word lists; some of them contain part-of-speech or other information as well.

Before we start compiling yet another one, we should sit back and consider an encoding scheme which is flexible enough to meet a wide variety of needs and applications. It's also a good idea to think about a possible way to expand such an encoding scheme for applications we don't even have an idea about now.

In the last few months I've collected all sorts of dictionary material from virtually all over the world. This stuff was then used to create an experimental lexicon which currently has more than 384,000 entries, mostly English words. Many entries have part-of-speech information and data from a so-called psycholinguistic database included. There are still some more files to be processed. My problem with this lexicon is that it's already too large for my processing capabilities (== hard disk storage space).

I'm very interested in a `public lexicon project' and I'm willing to share the material that I've accumulated. I'm especially interested in an encoding scheme powerful enough to meet whatever the needs of interested users are. Such an encoding scheme would/should possibly be based on SGML. Any comments/ideas?

p.s. Douglas Adams's Hitchhiker's Guide could be a nice metaphor for such a project.
p.p.s: what about comp.text.lexicon?

best wishes, Gerhard Gonter
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Gerhard Gonter  <GONTER@AWIWUW11.BITNET>        Tel: +43/1/31336/4578
                <gonter@awiwuw11.wu-wien.ac.at>
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
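[As a purely hypothetical illustration of what an SGML-based entry encoding might look like, the tag names below are invented and do not come from any existing scheme:]

```sgml
<entry>
  <headword>lexicon</headword>
  <pos>n.</pos>
  <sense n="1">a dictionary, especially one of an ancient language</sense>
  <sense n="2">the vocabulary of a person, language, or field</sense>
</entry>
```

The point of such markup is that the structure (headword, part of speech, numbered senses) is explicit, so software can extract word lists, part-of-speech data, or full definitions from the same file.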
jkp@cs.HUT.FI (Jyrki Kuoppala) (06/08/91)
In article <91156.144339GONTER@awiwuw11.wu-wien.ac.at>, GONTER@awiwuw11 (Gerhard Gonter) writes: >Before we start compiling yet another one, we should sit back and >consider an encoding scheme which is flexible enough to meet a wide >variety of needs and applications. It's also a good idea to think >about a possible way to expand such an encoding scheme for applications >we don't even have an idea about now. I think it'd be a good idea to provide support for several languages in the format. Then of course there'll be some trouble with the character sets, etc, but the work would be useful all over the world, and perhaps it could be made into a multi-language freely distributable dictionary. Maybe it could also be useful as a dictionary for automatic translation tools (ah well, then the format would be quite complicated). //Jyrki
lee@sq.sq.com (Liam R. E. Quin) (06/09/91)
GONTER@awiwuw11.wu-wien.ac.at (Gerhard Gonter) writes:
> There's clearly a need for a public domain online dictionary system

Yes!

> Before we start compiling yet another one, we should sit back and
> consider an encoding scheme which is flexible enough to meet a wide
> variety of needs and applications. It's also a good idea to think
> about a possible way to expand such an encoding scheme for applications
> we don't even have an idea about now.

I agree...

> I'm very interested in a `public lexicon project' and I'm willing
> to share the material that I've accumulated. I'm especially interested
> in an encoding scheme powerful enough to meet whatever the needs
> of interested users are.
> - Such an encoding scheme would/should possibly be based on SGML.

Well, that's a good idea too. I think that the best way forward is to use a simple format that can easily be transmitted over networks. This implies a limited character set and fairly short (<= 72 character) lines for many people. The simple format can easily be converted to SGML. It isn't easy writing a DTD for a dictionary, so it is probably better not to try to do so at first, although that doesn't preclude SGML-style markup. Another alternative would be the Text Encoding Initiative DTD, but that's probably more general than is appropriate.

> p.p.s: what about comp.text.lexicon ?

I think I'd rather see actual progress before a newsgroup, although I suppose I could be persuaded to moderate such a thing.

Lee
--
Liam Quin, lee@sq.com, SoftQuad, Toronto, +1 416 963 8337
the barefoot programmer
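[A rough sketch of the "simple format, converted to SGML later" idea; the `field: value` record layout and tag names are invented for illustration, not part of any proposed standard:]

```python
def to_sgml(entry_lines):
    """Convert a simple one-field-per-line record, e.g.
        hw: aardvark
        pos: n.
    into SGML-style markup, using each field name as a tag name."""
    tags = []
    for line in entry_lines:
        field, _, value = line.partition(":")
        field = field.strip()
        tags.append("<%s>%s</%s>" % (field, value.strip(), field))
    return "<entry>" + "".join(tags) + "</entry>"
```

A plain-text format like this survives 7-bit mail gateways and 72-column terminals, while the mechanical conversion keeps the SGML option open for later.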
lark@greylock.tivoli.com (Lar Kaufman) (06/11/91)
lee@sq.sq.com (Liam R. E. Quin) writes:
>GONTER@awiwuw11.wu-wien.ac.at (Gerhard Gonter) writes:
>> There's clearly a need for a public domain online dictionary system
>Yes!
>> I'm very interested in a `public lexicon project' and I'm willing
>> to share the material that I've accumulated. I'm especially interested
>> in an encoding scheme powerful enough to meet whatever the needs
>> of interested users are.
>> - Such an encoding scheme would/should possibly be based on SGML.
>Well, that's a good idea too.

Agreed.

>I think that the best way forward is to use a simple format that can
>easily be transmitted over networks. This implies a limited character
>set and fairly short (<= 72 character) lines for many people.

Agreed.

>The simple format can easily be converted to SGML. It isn't easy writing
>a DTD for a dictionary, so it is probably better not to try and do so at
>first, although that doesn't preclude an SGML-style markup. Another
>alternative would be the Text Encoding Initiative DTD, but that's
>probably more general than is appropriate.

It would be interesting to see what the guys at U. of Waterloo did with the online O.E.D. project. I understand it is very SGML-like.

>> p.p.s: what about comp.text.lexicon ?
>I think I'd rather see actual progress before a newsgroup, although I
>suppose I could be persuaded to moderate such a thing.

That would be a noble thing. Perhaps a newsletter would be an appropriate dissemination method, though?

Lar Kaufman            I would feel more optimistic about a bright future
(voice) 512-794-9070   for man if he spent less time proving that he can
(fax) 512-794-0623     outwit Nature and more time tasting her sweetness
lark@tivoli.com        and respecting her seniority.  - E.B. White
drraymond@watdragon.waterloo.edu (Darrell Raymond) (06/12/91)
>It would be interesting to see what the guys at U. of Waterloo did with the
>online O.E.D. project. I understand it is very SGML-like.

The online OED is marked up with tags that are reminiscent of SGML. However, there is no DTD for the OED, or for many of the other markup projects that Oxford University Press has undertaken. Many existing dictionaries have too much variance in their structure to be completely captured by SGML. Even deciding what sort of information you want to capture in your markup is a subject of some controversy.

----------------

Maybe you guys could stand a few comments on your project in general. Basically, you've got three things to worry about:

  (i)   coverage
  (ii)  correctness
  (iii) finishing

Coverage implies you have to find a source of words that gives some confidence that your dictionary is comprehensive enough for whatever purpose you have in mind. This means more than just finding instances of every word; it means finding instances of most of the senses of the words. The strength of any dictionary is its underlying corpus, the collection of language from which the examples are drawn. In the case of the OED, this means 8 to 10 million quotations sent in by volunteer readers. In the case of the Collins COBUILD dictionary, it's a special online corpus of about 40 million words.

Correctness means that unless you put in place some mechanism that will give us confidence in how you obtained your results, no one will be using (or at least depending on) your dictionary. One such mechanism is that old scholarly tradition, accountability. For example, the OED provides you with the quotes used to define the entry, as well as bibliographic information, so you can go and check the quote in the original source if you like. Thus you can hold the OED and its editors accountable for the decisions they made, because you can look at the same evidence.
Finishing means that you ought to be aware that many a dictionary project takes decades longer than the original editors forecast. Dictionary-writing is not a part-time activity.

Some comments on statements made in various postings:

>There seems to be a serious need for a public domain dictionary.

My first question is: what for? I admit I didn't see the first posting in this thread. Is it really the definitions you want, or just a word list with correct spellings and parts of speech (which would be fine for a lot of automatic uses)? If you actually want to write a dictionary from scratch, good luck; you'll be at it a long time. If it's only a word list that you want, you stand a better chance of completing it.

>That means we need about 100 volunteers who each undertake to come up with
>the definitions (in their own unique words) of 5 words per day. Or
>50 who can do 10 per day.

Goodness gracious. 10 words per day? Just sit down and write me up a definition of the word "good". Make sure to cover as many senses and usages as you can think of. Go check a couple of dictionaries and see how many senses you missed. If it takes you less than an hour to do a good job I'd be surprised. Now multiply that by 10. Just as you cannot get twice the software production by doubling the number of programmers, you cannot get twice the dictionary by doubling the number of volunteers who write definitions.

>Of course, quality control and checks for regional variations are very
>important.

Whoops: add to that hour per word all the checks you're going to do for regional variations and quality control. Who has the final word on the quality of a definition, anyway?

>Dictionary definitions are extraordinarily hard to write well.

But you plan to do 5 to 10 a day?
>* a writer is sent n randomly-chosen words (for example, 30 words taken from
>  random usenet articles and other sources, subject to other checking)

Usenet is not exactly what I would call a broadly based source of words (especially if you want them spelled correctly).

>This sounds good to me. It might even be a good idea make a newsgroup
>and let the author post their definition and let anyone who wishes
>reply do so. The author can then after a suitable period revise their
>definition.

When there are disputes, who is the final authority? Since the author is basically chosen at random, he or she probably has no more claim to being the final authority than anyone else...

>I think it would be nice to allow accompanying articles
>(kind of like encyclopedia entries). For those who are ambitious.

It would be nice - who's going to check them for correctness? What if some of them are sexist or racist? Who decides what is permitted and what isn't? Are the people who decide such things then exposing themselves to liability for lawsuits?

>I can understand the reasons for issuing words randomly but I would
>enjoy the project much more if I could pick some of the words I were to
>write entries for.

No doubt. Who decides who gets the most popular words?

----------------

I'm not trying to throw a wet blanket on this project. But imagine if a bunch of lexicographers got together to rewrite Unix on a part-time basis ('cause we need a public domain one, don'tcha know)....
enag@ifi.uio.no (Erik Naggum) (06/13/91)
Lar Kaufman <lark@greylock.tivoli.com> writes:
|
| It would be interesting to see what the guys at U. of Waterloo did
| with the online O.E.D. project. I understand it is very SGML-like.

They have used a portion of SGML's syntax, which, I'm sorry to say, does not make it SGML-conformant. As I heard the story, there were too many inconsistencies in the original material to try to make a DTD for OED2. What I've seen of the OED2 is not pretty, so they obviously had a very hard job figuring out how to encode it and actually do the encoding.

We're creating a dictionary from scratch (aren't we? :-), so we could, perhaps, be better at consistency... Let's try to look at how other people did their dictionary entries, and what we would like to include, in what order, then test that by submitting numerous entries to these constraints before continuing. Changes in the structure are going to make a lot of people unhappy and create a lot of unnecessary work. (Converting between structures is possible, but generally difficult.)

</Erik>
--
Erik Naggum             Professional Programmer    +47-2-836-863
Naggum Software         Electronic Text            <ERIK@NAGGUM.NO>
0118 OSLO, NORWAY       Computer Communications    <enag@ifi.uio.no>
GONTER@awiwuw11.wu-wien.ac.at (Gerhard Gonter) (06/14/91)
jkp@cs.HUT.FI (Jyrki Kuoppala) writes in <1991Jun8.145123.23369@nntp.hut.fi>:
> I think it'd be a good idea to provide support for several languages
> in the format. Then of course there'll be some trouble with the
> character sets, etc, but the work would be useful all over the world,
> and perhaps it could be made into a multi-language freely
> distributable dictionary. Maybe it could also be useful as a
> dictionary for automatic translation tools (ah well, then the format
> would be quite complicated).
>
> //Jyrki

As a native German speaker I fully agree with you.

best wishes, Gerhard Gonter
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Gerhard Gonter  <GONTER@AWIWUW11.BITNET>        Tel: +43/1/31336/4578
                <gonter@awiwuw11.wu-wien.ac.at>
tbray@watsol.waterloo.edu (Tim Bray) (06/24/91)
enag@ifi.uio.no (Erik Naggum) writes:
>They have used a portion of SGML's syntax, which, I'm sorry to say,
>does not make it SGML-conformant.
It's not clear to me that an SGML-conformant dictionary is either necessary
or desirable. A dictionary should be a small model of a human language.
Not even SGML's strongest partisans claim for it an ability to model natural
language.
>As I heard the story, there were too many inconsistencies in the
>original material to try to make a DTD for OED2....
>What I've seen of the OED2 is not pretty, so they obviously had a very
>hard job figuring out how to encode it and actually do the encoding.
Indeed. Frank Tompa of the New OED project at Waterloo, who has had a lot
of experience with online dictionaries, in co-operation with Bob Amsler, then
of Bellcore, now of Mitre, put in a lot of time and came up with a proposed
SGML def for dictionaries. But it was tough, and even Tompa and Amsler
were left somewhat unsatisfied that they had covered the bases.
I have to disagree on the "not pretty" part. Challenging, complex, somewhat
irregular, yes, all of those are true. But this is no uglier than the
English language that the OED is trying to describe.
Tim Bray, U of W Centre for the New OED -and- Open Text Systems
enag@ifi.uio.no (Erik Naggum) (06/25/91)
Tim Bray <tbray@watsol.waterloo.edu> writes:
|
| It's not clear to me that an SGML-conformant dictionary is either
| necessary or desirable. A dictionary should be a small model of a
| human language. Not even SGML's strongest partisans claim for it
| an ability to model natural language.

I must have missed something really crucial. I have always thought that a dictionary _entry_ is a structured unit of information in a dictionary, containing other, smaller units of information, such as word class, etymology, pronunciation, inflection, and a number of definitions. What is the relevance of "natural languages" in this?

SGML is a language in which you express the structure of information, among other things, and _all_ information has _some_ structure, otherwise it's noise. SGML is suitable to express any kind of structure which has a hierarchical nature, i.e. every element is contained in toto in another element. There are some cases where this is not true, and SGML fails to handle those cases in the simplest way with attribute-less tags, yet it can be done with tags and time-space coordinates and reference points to describe the start and stop of any event, including overlapping spatial elements.

I don't think you have a good grip on what SGML is, but you're not alone. The only wish I have is that those who have nth-hand information and knowledge of SGML please try to verify it, especially as n approaches infinity.

| Indeed. Frank Tompa of the New OED project at Waterloo, who has
| had a lot of experience with online dictionaries, in co-operation
| with Bob Amsler, then of Bellcore, now of Mitre, put in a lot of
| time and came up with a proposed SGML def for dictionaries. But
| it was tough, and even Tompa and Amsler were left somewhat
| unsatisfied that they had covered the bases.

This is not totally relevant to the OED2 project. The OED2 project had some significant real-life constraints to work with, such as an existing dictionary.
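[The hierarchical entry structure described above could be sketched in an SGML DTD along these lines; the element names are invented for illustration and are not taken from any actual dictionary project:]

```sgml
<!ELEMENT entry     - - (headword, pron?, pos, etym?, infl?, sense+) >
<!ELEMENT headword  - - (#PCDATA) >
<!ELEMENT pron      - - (#PCDATA) >
<!ELEMENT pos       - - (#PCDATA) >
<!ELEMENT etym      - - (#PCDATA) >
<!ELEMENT infl      - - (#PCDATA) >
<!ELEMENT sense     - - (#PCDATA) >
```

Each entry is an instance of this type: optional pronunciation, etymology, and inflection, plus one or more senses, all contained in toto within the entry element.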
It's very unlikely that a very large number of dictionary entries will be consistent with any given structure, unless that structure is so large it becomes chaotic and useless. If you sat down to work out a DTD ("SGML def"?) for a dictionary, you would spend a large amount of time doing so, instead of randomly stuffing things into dictionary entries with only intuitive guidelines for structuring, so as not to confuse the poor user. Document analysis and design are truly _hard_ tasks, and require a lot more than people think. The complexity of the task of course grows with the complexity of the document under analysis. That doesn't mean it can't be done, as you seem to imply. Once it's defined, it should also capture the way we can best retrieve information from a given instance, such as a dictionary entry. Of course this is hard. What did you expect?

| I have to disagree on the "not pretty" part. Challenging,
| complex, somewhat irregular, yes, all of those are true. But this
| is no uglier than the English language that the OED is trying to
| describe.

I don't understand how you can put both a description and the object of a description into one big bag and get anything useful out of it. To me, it looks like you suffer from a severe layering confusion, wherein an abstraction (description) of an entity can be no different in complexity from the entity itself. This is a very naive view. It's also remarkably counter-productive, as the main objective of abstraction is to reduce complexity to a level where humans can comfortably deal with it. I'm utterly amazed that this comes from one who has worked with the OED2 dictionary project.

A description, or structure specification, or whatever, will necessarily have to extract the essential elements of what is described or specified. Otherwise it's useless, as one can simply turn to the described element itself and get a better idea of it.
"Essential", of course, requires (human) intelligence and creativity in discovering what is and is not essential. The whole task of writing a definition is centered around discarding the unimportant. A document type likewise requires that one extract the essentials, according to one or a few views, which have to be known explicitly by the designer. It so happens that SGML is a language in which one can express the interrelationships between elements of a hierarchical structure in such a way as to produce a consistent type, of which any given dictionary, dictionary entry, and on down, are instances.

I don't understand how you can claim that SGML can't model natural languages. It wasn't intended to, and the question is completely irrelevant to the structuring of dictionary entries. It's like claiming that TeX can't model emotions, or that the programming language C can't model sexual experiences.

</Erik>
--
Erik Naggum             Professional Programmer    +47-2-836-863
Naggum Software         Electronic Text            <erik@naggum.no>
0118 OSLO, NORWAY       Computer Communications    <enag@ifi.uio.no>