mike@turing.unm.edu (Michael I. Bushnell) (08/12/87)
I think the discussion about Chinese words would benefit from some interesting knowledge I gleaned from a book on writing systems and the alphabet.

In Chinese, every word is one syllable. Needless to say, there is lots of overloading here, but the multiple meanings of a word are usually quite different and can be easily distinguished from context. There is not one character per word(== syllable), rather, there is one character per word meaning.

The representation of Chinese in data processing is always assumed to work like written Chinese. In that case, you need enough bits to hold the large lexicon. But is it not possible to represent the syllables instead? There may be problems (many computer "things" have little context), but it might be workable.

As for the size of the lexicon, a recent article here said that the OED had about 1,000,000 words, and English slightly more than that. From this, the poster derived a figure of 1,000,000 for the size of the Chinese lexicon. But English is a remarkable language. For most types of things, we have TWO words, one Latinate, one Germanic. For example: teeth/dental, dead/mortal, car/automobile. The list of such pairs is huge. To my knowledge, no other language has such a phenomenon. My estimate, from this and other reasons for the large English lexicon, is about 500,000 words in Chinese.

Unfortunately, this means that each character would not fit in 16 bits. But the number of syllables is MUCH less. That could probably fit.

Michael I. Bushnell
a/k/a Bach II
mike@turing.UNM.EDU
---
Where do your SOCKS go when you lose them in th' WASHER?
-- Zippy the Pinhead
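The 16-bit argument in the post above can be checked with a little arithmetic. The following is only a sketch: the two lexicon sizes are the figures quoted in the post, while the syllable count of 1,300 is an assumed round number (the post gives none), and `bits_needed` is a helper name made up for this illustration.

```python
def bits_needed(n_symbols: int) -> int:
    """Smallest fixed code width, in bits, that gives each symbol a distinct code."""
    if n_symbols < 1:
        raise ValueError("need at least one symbol")
    # Codes run from 0 to n_symbols - 1, so size the width for the largest code.
    return max(1, (n_symbols - 1).bit_length())


CAPACITY_16_BIT = 2 ** 16  # 65,536 distinct codes

# Sizes taken from the post, plus one clearly labelled assumption.
inventories = {
    "lexicon of 1,000,000 (earlier poster's figure)": 1_000_000,
    "lexicon of 500,000 (this post's estimate)": 500_000,
    "1,300 syllables (ASSUMED, not from the post)": 1_300,
}

for label, size in inventories.items():
    verdict = "fits" if size <= CAPACITY_16_BIT else "does not fit"
    print(f"{label}: {bits_needed(size)} bits, {verdict} in 16 bits")
```

Under these numbers, a code per lexicon entry needs 19 or 20 bits, while a code per syllable fits comfortably within 16.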
stanwyck@drutx.UUCP (08/17/87)
in article <615@unmvax.UNM.EDU>, mike@turing.unm.edu (Michael I. Bushnell) says:
> In Chinese, every word is one syllable. Needless to say, there is
> lots of overloading here, but the multiple meanings of a word are
> usually quite different and can be easily distinguished from context.
> There is not one character per word(== syllable), rather, there is one
> character per word meaning.
>
> mike@turing.UNM.EDU

Au contraire! As a former Chinese translator, I can categorically state that the above statement is false. It contains two related errors:

1. In Chinese, many words are multi-syllable (e.g., "computer" is 2 syllables: dien nyau [dien = electric, nyau = brain]).

2. In Chinese, virtually every character is one syllable and has only one pronunciation. This is the primary advantage of learning Chinese rather than Japanese (which I also speak). Japanese kanji normally have at least two and as many as 23 different pronunciations (e.g., the nama of nama tamago), some of which are single syllables and others of which are multiple syllables.

The result is the opposite of the above statement: Chinese characters generally map one-to-one to syllables, and while many words are monosyllabic, there are many compound words (see the example above) that are polysyllabic. Most of the latter are of recent introduction.

--
AT&T            o  o    303-538-5004
Don Stanwyck     ||     ihnp4!drutx!stanwyck
Denver, CO USA  \__/    Telecom Standards
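The relationship described in this reply (one syllable per character, one or more characters per word) can be sketched as a small data structure. This is an illustration only: the `Character` type and the `syllables` helper are hypothetical names, and the romanization is copied from the post as written.

```python
from typing import NamedTuple, Sequence


class Character(NamedTuple):
    """One written character: a single syllable plus a rough English gloss."""
    syllable: str
    gloss: str


# The two characters from the "computer" example in the post.
DIEN = Character(syllable="dien", gloss="electric")
NYAU = Character(syllable="nyau", gloss="brain")

# A word is a sequence of characters; compound words simply have more than one.
COMPUTER = (DIEN, NYAU)


def syllables(word: Sequence[Character]) -> list[str]:
    """Return a word's syllables, one per character."""
    return [character.syllable for character in word]


print(syllables(COMPUTER))       # the syllables of the compound word
print(len(syllables(COMPUTER)))  # the syllable count equals the character count
```

Modelled this way, a word's syllable count always equals its character count, which is the one-to-one mapping the reply argues for.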