thomson@wasatch.UUCP (Rich Thomson) (12/13/88)
[ Please excuse the large newsgroup list, but also note that follow-ups are
directed to comp.graphics. ]

I'm interested in a scheme for entering Chinese characters via a keyboard.
I've come up with the idea on my own, but the scheme seems obvious. So
obvious that I imagine someone has already implemented it.

The basic problem is to design a user interface for input of Chinese
characters in a fashion that is analogous to the writing of the character
as a sequence of strokes. There are 24 different basic strokes that I
know of for Chinese calligraphy, although there may be more.

When someone writes a Chinese character, the basic strokes are always
written in accordance with a set of rules (left to right, top to bottom,
etc). The sequence of basic strokes comprising a character is consistent
from person to person. Similarly, when printing the letter 'h', we are
always taught to draw the stem '|' first, and then the tail to complete
the letter.

The user interface for input of the character should use the stroke
information (encoded on a key, for instance) in combination with the order
of the strokes to uniquely identify a given Chinese character, or perhaps
learn a new character. The Roman alphabet equivalent is already implemented
in real-time spelling checker/completion programs that currently run on
many machines.

I believe that this is a most natural scheme for entering the characters,
as it mimics the act of writing the character calligraphically. This means
the user need only adapt their current method of writing characters for
machine input. It is similar to learning to type English words by pressing
sequences of letter keys in conjunction with the SHIFT key.

There is also the subtle issue of size in conjunction with the stroke type
and sequence. The same stroke appears in many different characters but at
different sizes, so the user must be given some way of adjusting the size
of the stroke to fit the character; perhaps an ALT, SHIFT or META key can
serve to identify this modifier to the stroke.

Given this type of scheme, does anyone know of any implementations of
similar character entry systems, possibly for Japanese or other oriental
character sets? Are there any journals (again, possibly Japanese) devoted
to the problem of oriental native language I/O? Any references to
articles, journals, books, programs, etc., would be greatly appreciated.

Thanks in advance,
-- Rich
--
Rich Thomson thomson@cs.utah.edu {bellcore,hplabs}!utah-cs!thomson
"Tyranny, like hell, is not easily conquered; yet we have this consolation
with us, that the harder the conflict, the more glorious the triumph. What
we obtain too cheap, we esteem too lightly."
Thomas Paine, _The Crisis_, Dec. 23rd, 1776
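
The scheme in the post above (one key per basic stroke, with candidates
narrowed as strokes are typed in writing order, much like word completion)
might be sketched as follows. This is only an illustration under stated
assumptions: the stroke names, the six-character table, and the function
name candidates() are hypothetical and do not come from any system
mentioned in the thread.

```python
# Hypothetical sketch of stroke-sequence entry with completion.
# Stroke names (heng = horizontal, shu = vertical, pie = left-falling,
# na = right-falling) and the tiny table are illustrative assumptions.

# Each character is stored under the ordered tuple of its strokes.
STROKE_TABLE = {
    ("heng",): "yi1 (one)",
    ("heng", "heng"): "er4 (two)",
    ("heng", "heng", "heng"): "san1 (three)",
    ("heng", "shu"): "shi2 (ten)",
    ("pie", "na"): "ren2 (person)",
    ("heng", "pie", "na"): "da4 (big)",
}


def candidates(typed):
    """Return characters whose stroke sequence begins with `typed`."""
    typed = tuple(typed)
    hits = [
        (strokes, name)
        for strokes, name in STROKE_TABLE.items()
        if strokes[: len(typed)] == typed
    ]
    # Shortest sequences first, so an exact match leads the list.
    hits.sort(key=lambda pair: (len(pair[0]), pair[0]))
    return [name for _, name in hits]


if __name__ == "__main__":
    print(candidates(["heng"]))          # everything starting with a horizontal
    print(candidates(["heng", "pie"]))   # narrowed by the second stroke
```

A linear scan is used only to keep the sketch short; a real table of
several thousand characters would more likely use a trie keyed on strokes.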
bph@buengc.BU.EDU (Blair P. Houghton) (12/14/88)
In article <789@wasatch.UUCP> thomson@wasatch.utah.edu.UUCP (Rich Thomson) writes:
>[ Please excuse the large newsgroup list, but also note that follow-ups are
> directed to comp.graphics. ]
>
>I'm interested in a scheme for entering Chinese characters via a keyboard.
>I've come up with the idea on my own, but the scheme seems obvious. So
>obvious that I imagine someone has already implemented it.
>
>The basic problem is to design a user interface for input of Chinese
>characters in a fashion that is analogous to the writing of the character
>as a sequence of strokes. There are 24 different basic strokes that I
>know of for Chinese calligraphy, although there may be more.

Sounds simple enough, but you might try a digitizing pad and some sort
of character-recognition software; the numerous configurations of those
strokes in the thousands of Chinese symbols might be a source of error
in typing.

I've actually seen a photo of a Chinese keyboard: it had about a hundred
alphabetic keys, and a pad of nine (that's nine, one less than ten)
shift keys.

--Blair
"Sounds perfect for Emacs."
kinmonthprep@deneb.ucdavis.edu (Earl H. Kinmonth) (12/14/88)
In article <789@wasatch.UUCP> thomson@wasatch.utah.edu.UUCP (Rich Thomson) writes:
>[ Please excuse the large newsgroup list, but also note that follow-ups are
> directed to comp.graphics. ]
>
>I'm interested in a scheme for entering Chinese characters via a keyboard.

First, before you invent a wheel that has already been invented, why not
look at some of the commercial word processors that are available for
Chinese and Japanese? Even before you do that, think about your
terminology. Chinese characters for Chinese are one thing; characters of
(largely) Chinese origin used in Japanese are another.

>The basic problem is to design a user interface for input of Chinese
>characters in a fashion that is analogous to the writing of the character

If you had done a little research you would know that there are a variety
of methods already in use for "Chinese" characters, ranging from entering
raw HEX codes to fairly sophisticated context analysis schemes using
rudimentary AI techniques. Japanese vendors have experimented with a
variety of techniques including pressure sensitive tablets, stroke
classification schemes, etc.

A few of these are discussed in J. Marshall Unger, The Fifth Generation
Fallacy: Why Japan is Betting Its Future on Artificial Intelligence
(Oxford University Press, 1987). Overall, this is a shallow book, but it
does describe, in ENGLISH, some of the techniques used to handle
characters. To learn more, pick up the technical manuals for commercial
Japanese word processors.

Overall, Japanese seems best handled by table lookup from romanized input.
Of course, characters are only a fraction of the symbols needed for
writing Japanese. I make this generalization based on experimentation with
a number of input techniques, but the best argument for it is that it is
what people are buying in Japan. Every Japanese manufacturer seems to have
tried a proprietary input scheme, but the one that users seem to prefer is
translation from romaji.

[much cut]
sun@venus.ycc.yale.edu (12/14/88)
In article <789@wasatch.UUCP>, thomson@wasatch.UUCP (Rich Thomson) writes...
>
>The user interface for input of the character should use the stroke
>information (encoded on a key, for instance) in combination with the order
>of the strokes to uniquely identify a given Chinese character, or perhaps
                   ^^^^^^^^
>learn a new character.

This scheme doesn't solve the problem of ambiguity, which is one of the
major obstacles in Chinese character coding systems. For example, the
character Jia3 (as in Jia3, Yi3, Bing3, Ding1, i.e., 1, 2, 3, 4, you know
what I mean) and the character Shen1 (a family name) have the same number
and sequence of strokes, and the same size of strokes. The only difference
is the relative position of the last vertical stroke.

Besides, the number of keys pressed could be very large. Hence, even if
such an implementation exists, it is a very inefficient one.

>er sets? Are there any journals (again, possibly Japanese) devoted
>to the problem of oriental native language I/O? Any references to
>articles, journals, books, programs, etc., would be greatly appreciated.

I remember I read somewhere that there was a conference dedicated to
Chinese Word Processing. But I forgot where. Maybe you can look for it.
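
The ambiguity described in the post above can be made concrete by grouping
characters on their stroke sequence and reporting any sequence shared by
more than one character. This is a minimal sketch: the stroke names and
the listed stroke orders are assumptions based on the conventional writing
order, and ambiguous_groups() is a hypothetical helper, not part of any
real input system.

```python
# Hypothetical sketch: detect characters that share one stroke sequence.
# Stroke names (shu = vertical, hengzhe = horizontal-then-turn,
# heng = horizontal) and the stroke orders are illustrative assumptions.
from collections import defaultdict

ENTRIES = [
    ("jia3", ("shu", "hengzhe", "heng", "heng", "shu")),
    ("shen1", ("shu", "hengzhe", "heng", "heng", "shu")),
    ("shi2", ("heng", "shu")),
]


def ambiguous_groups(entries):
    """Map each stroke sequence used by 2+ characters to those characters."""
    by_sequence = defaultdict(list)
    for name, strokes in entries:
        by_sequence[strokes].append(name)
    return {seq: names for seq, names in by_sequence.items() if len(names) > 1}


if __name__ == "__main__":
    for seq, names in ambiguous_groups(ENTRIES).items():
        print(" ".join(seq), "->", ", ".join(names))
```

Any practical stroke-order scheme would need an extra step for such groups,
for example a selection menu or an additional key describing stroke
position.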
tex@wucc.waseda.JUNET (Kamiya Fumiaki) (12/14/88)
I don't know how it is done in other oriental countries, but at least I
can tell you how it is usually done in Japan.

The main idea is to deploy what is called a kana-to-kanji converter. Given
a string of kana, which represents the sound of the kanji he/she wants, it
displays a list of kanji and the user selects the one he/she wants. That's
all. In fact there are other features implemented in real kana-to-kanji
converters in general use, but the fundamental part is just what I have
said.

Of course, since there are about 50 kana characters, we can't enter a kana
in a single stroke from an ASCII keyboard. But fortunately, there is
so-called 'roma-ji' that assigns a string of letters, usually two, to
every kana. So if this convention is known by the kana-to-kanji converter,
one can obtain kanji documents from an ASCII keyboard. (We also have the
so-called 'JIS keyboard', on which one can enter a kana in a single
stroke.)

Kamiya Fumiaki
Department of Mathematics, Waseda University

NOTE: Please don't reply by mail, it will be rejected at the gateway.
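
The two steps described in the post above (romaji to kana, then kana to a
list of kanji candidates for the user to pick from) could look roughly like
this. It is a sketch under assumptions, not how any real converter works
internally: both tables are tiny hypothetical subsets, and the function
names are made up for the example.

```python
# Hypothetical sketch of romaji -> kana -> kanji candidate lookup.
# Both tables are tiny illustrative subsets, not a real dictionary.

ROMAJI_TO_KANA = {
    "ka": "か", "ki": "き", "ku": "く",
    "sa": "さ", "ji": "じ", "i": "い", "n": "ん",
}

# Reading (in kana) -> kanji spellings that share that reading.
KANA_TO_KANJI = {
    "かんじ": ["漢字", "感じ", "幹事"],
    "きかい": ["機械", "機会"],
}


def romaji_to_kana(text):
    """Convert romaji to kana by greedy longest-match on the table."""
    kana = []
    i = 0
    while i < len(text):
        for length in (3, 2, 1):
            chunk = text[i:i + length]
            if chunk in ROMAJI_TO_KANA:
                kana.append(ROMAJI_TO_KANA[chunk])
                i += length
                break
        else:
            raise ValueError(f"cannot convert {text[i:]!r}")
    return "".join(kana)


def candidates(romaji):
    """Return the kanji candidates for a romaji reading.

    Falls back to the plain kana when the reading is not in the table.
    """
    kana = romaji_to_kana(romaji)
    return KANA_TO_KANJI.get(kana, [kana])


if __name__ == "__main__":
    print(candidates("kanji"))
    print(candidates("kikai"))
```

The user would then choose one entry from the returned list, which is the
selection step the post describes.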
geoff@lloyd.camex.uucp (Geoffrey Knauth) (12/14/88)
In article <45616@yale-celray.yale.UUCP> sun@venus.ycc.yale.edu writes:
> Besides, the number of keys pressed could be very large. Hence,
>even if such an implementation exists, it is a very inefficient one.
>
>>er sets? Are there any journals (again, possibly Japanese) devoted
>>to the problem of oriental native language I/O? Any references to
>>articles, journals, books, programs, etc., would be greatly appreciated.
> I remember I read somewhere that there was a conference dedicated to
>Chinese Word Processing. But I forgot where. Maybe you can look for it.

I suggest you contact IBM, which has done a lot of work in China. You
should also read the 11/21/88 edition of the Seybold Report on
Publishing Systems, Vol. 18, No. 5, "IPEX, Part III: Non-Roman
Languages Take Center Stage." An excerpt from that article reads,
"HTS [High Technology Systems, an industry leader] uses the so-called
'Dr. Zhi' method of typing Chinese, whereby four basic elements (out
of a set of 180) are used to construct a character. Some common
characters can be entered with a single keystroke."

--
Geoffrey S. Knauth ARPA: geoff%lloyd@hcsfvax.harvard.edu
Camex, Inc. UUCP: geoff@lloyd.uucp or hcsfvax!lloyd!geoff
75 Kneeland St., Boston, MA 02111 Tel: (617)426-3577 Fax: 426-9285
I do not speak for Camex.
curtc@pogo.GPID.TEK.COM (Curtis Charles) (12/15/88)
In article <789@wasatch.UUCP>, thomson@wasatch.UUCP (Rich Thomson) writes...
>The user interface for input of the character should use the stroke
>information (encoded on a key, for instance) in combination with the order
>of the strokes to uniquely identify a given Chinese character, or perhaps

Several years ago I saw a prototype for a keyboard well suited to Chinese.
(I know very little about Chinese, so take this with a grain of salt...)

The keyboard was flat, lacked the tactile feel we've come to enjoy, and
was much like a membrane keyboard. The reason it was flat was that the
glyphs were projected from behind onto the keyboard. Apparently, the
Chinese alphabet can be thought of as tree structured, so getting a
character (glyph?) on the screen became a process of menu selection.
Several thousand characters were programmed in, and it took 3 to 5 (?)
"menu picks" to get to a glyph on the screen.

Have you thought about a graphics tablet with recognition software?
(Probably tougher than recognition for English...)

------------------------------------------------------------------------
Curt Charles             | "Let our swords run red with the blood of
curtc@pogo.GPID.TEK.COM  |  infidels..." Sean Connery
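
The "3 to 5 menu picks" figure in the post above is consistent with simple
tree arithmetic: if every menu offers the same number of choices, the
number of reachable glyphs grows as choices to the power of picks. A small
sketch, assuming 5,000 characters as a stand-in for "several thousand" and
a uniform number of choices per menu (both are assumptions, as is the
function name picks_needed):

```python
# Hypothetical sketch: how many menu picks a uniform selection tree needs.
# The character count and menu sizes below are illustrative assumptions.


def picks_needed(num_chars, choices_per_menu):
    """Smallest depth d such that choices_per_menu ** d >= num_chars."""
    if choices_per_menu < 2:
        raise ValueError("each menu needs at least two choices")
    picks = 0
    reachable = 1
    while reachable < num_chars:
        reachable *= choices_per_menu
        picks += 1
    return picks


if __name__ == "__main__":
    for menu_size in (6, 20, 100):
        print(menu_size, "choices per menu ->",
              picks_needed(5000, menu_size), "picks")
```

Under these assumptions, menus of about 6 to 20 choices land in the 3 to 5
pick range reported for the prototype.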
wu@sunybcs.uucp (Wan-Chung Wu) (12/15/88)
In article <45616@yale-celray.yale.UUCP> sun@venus.ycc.yale.edu writes:
> I remember I read somewhere that there was a conference dedicated to
>Chinese Word Processing. But I forgot where. Maybe you can look for it.

I know of at least one annual conference that covers all aspects of
Chinese processing. The name of the conference is "International
Conference on Chinese Computing". The proceedings of that conference
should give you some ideas about Chinese input methods. The one I attended
was held June 14-17, 1987, in Chicago, IL.

If anybody wants to know where to get the proceedings, please let me know
and I will try my best to give you a pointer.

=========================================================================
wu@cs.buffalo.edu
Graphics Group
University Computing Service
State University of New York at Buffalo
=========================================================================
jdm@h.cs.wvu.wvnet.edu (James D Mooney,205K,7,2913548) (12/15/88)
From article <283@lloyd.camex.uucp>, by geoff@lloyd.camex.uucp (Geoffrey Knauth):
> In article <45616@yale-celray.yale.UUCP> sun@venus.ycc.yale.edu writes:
>>>er sets? Are there any journals (again, possibly Japanese) devoted
>>>to the problem of oriental native language I/O? Any references to
>>>articles, journals, books, programs, etc., would be greatly appreciated.
>> I remember I read somewhere that there was a conference dedicated to
>>Chinese Word Processing. But I forgot where. Maybe you can look for it.
>
> I suggest you contact IBM, which has done a lot of work in China. You
> should also read the 11/21/88 edition of the Seybold Report on
> Publishing Systems, Vol. 18, No. 5, "IPEX, Part III: Non-Roman
> Languages Take Center Stage." An excerpt from that article reads,
> "HTS [High Technology Systems, an industry leader] uses the so-called
> 'Dr. Zhi' method of typing Chinese, whereby four basic elements (out
> of a set of 180) are used to construct a character. Some common
> characters can be entered with a single keystroke."

Another place this subject is discussed is at the annual PROTEXT
conferences organized by Professor J. Miller of Trinity College, Dublin,
Ireland. PROTEXT IV, held October 1987 in Boston, included some relevant
papers, including:

    Text Processing in Ideographic Languages,
    by Loh Shiu-Chang and Kong Luan

    Key Problems in Developing an Advanced Chinese Text Processing
    and Typesetting System, by Wang Xuan

Proceedings of all PROTEXT Conferences are available from

    Boole Press Limited
    P.O. Box 5
    Dun Laoghaire, Co. Dublin, Ireland

Jim Mooney
Dept. of Stat. & Computer Science (304) 293-3607
West Virginia University
Morgantown, WV 26506
USENET: {allegra,bellcore,cadre,idis,psuvax1}!pitt!wvucsb!wvucsa!jdm
asp@puck.UUCP (Andy Puchrik) (12/16/88)
In article <391@wucc.waseda.JUNET>, tex@wucc.waseda.JUNET (Kamiya Fumiaki) writes:
> I don't know how it is done in other oriental countries, but at
> least, I can tell you how it is usually done in Japan.

I've seen the NEC MS-DOS micro and some of the laptop Japanese word
processors. They all have the JIS character set in ROM. I suppose the
terminals have hardware assist also.

What kind of software is available for workstations? Surely there must be
terminal emulators and word processors for Sun and 386-class systems. Much
of the spread of computers in the States and Europe was due to public
domain editors and terminal emulators. Is there such a thing as public
domain Japanese software? Anything that would run on the larger systems?

--
Internet: asp@puck.UUCP                  Andy Puchrik
uucp: decvax!necntc!necis!puck!asp       Moonlight Systems
ARPA: puchrik@tops20.dec.com             Concord, MA 01742
wu@sunybcs.uucp (Wan-Chung Wu) (12/17/88)
To those who are interested in Chinese input schemes:

As I promised to "try my best to give you a pointer" to the proceedings of
the International Conference on Chinese Computing, here are the people you
should contact. (Too many people have requested the information, so I am
posting it here to save my tight schedule :-) )

    Prof. Shi-Kuo Chang
    Department of Computer Science
    University of Pittsburgh

    Dr. Patrick S.P. Wang
    Department of Computer Science
    Northeastern University
    Boston, Massachusetts

    Dr. An-Chi Liu
    Department of Electrical and Computer Engineering
    Illinois Institute of Technology
    Chicago, Illinois

The following is a list of papers in the proceedings of ICCC'87 that relate
to Chinese input:
-------------------------------------------------------------------
1. W.C.P. Yu, "Some New Advancement in High Speed Two-Stroke Chinese
   Input System".
2. H.L. Soo, "A Generic Chinese Input System".
3. W.H. Wu, "Chinese Characters Encoded in Stroke-Sequences".
4. J. Zhu and X. Liu, "A New Input System for Chinese Language
   Processing".
5. K.Y. Cheng and F.K. Yu, "On Disambiguous Chinese Phonetic Input".
6. A. Mathur and F. Fowler, "Design of a Dynamically Reconfigurable
   Keyboard".
7. A. MacDonald and Y.H. Ng, "Sequence Prediction for Chinese Language
   Input".
8. H.C. Tien, "PINXXIEE: The Chinese Computer Input Language".
9. V.C. Yeh, "The Phonetic Chinese Language Computer System".
10. T.Y. Kiang and T.H. Cheng, "Survey on the Establishment of Indexing
    System for Composed Chinese Characters".
11. T. Huang, "The Dai-E Chinese Encoding Method".
----------------------------------------------------------------

I am sure there are more interesting papers in the proceedings of ICCC'88
and of earlier conferences. If you still have questions, send me mail
again. Sorry to respond so late!

==========================================================================
wu@cs.buffalo.edu
Graphics Group
University Computing Service
State University of New York at Buffalo
==========================================================================
charette@edsews.EDS.COM (Mark A. Charette) (12/17/88)
In article <351@puck.UUCP>, asp@puck.UUCP (Andy Puchrik) writes:
> In article <391@wucc.waseda.JUNET>, tex@wucc.waseda.JUNET (Kamiya Fumiaki) writes:
> > I don't know how it is done in other oriental countries, but at
> > least, I can tell you how it is usually done in Japan.
> and terminal emulators. Is there such a thing as public domain
> Japanese software? Anything that would run on the larger systems?

If you're really ambitious you might want to take the X based kterm
program and modify it to become an editor. All the X systems I've seen
based on the distributed X tape have kterm and the kana and kanji fonts
(14x14).

If anyone is interested, I can send them the 24x24 fonts. The file is a
bit big (~2 MB) and is in pseudo-bdf format (I got the fonts to compile
into snf format with the X font compiler, but some work is necessary to
put them in the proper JIS 1 & 2 positions). I will mail it if that's the
only way, but I would prefer it if you sent a DEC, Sun, Apollo, or HP
tape, or either a high density PC floppy or enough low density ones to
fit the data.

-----
Mark Charette             "People only like me when I'm dumb!", he said.
Electronic Data Systems   "I like you a lot." was the reply.
750 Tower Drive           Voice: (313)265-7006 FAX: (313)265-5770
Troy, MI 48007-7019       charette@edsews.eds.com uunet!edsews!charette
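
As a rough plausibility check on the "~2 MB" figure in the post above, the
storage for a bitmap font can be estimated from the glyph size. This sketch
assumes BDF-style text storage (each bitmap row padded to whole bytes and
written as hex digits on its own line) and an assumed round figure of 7,000
glyphs for a JIS level 1 and 2 set; neither number comes from the post, and
per-glyph header lines are ignored.

```python
# Hypothetical sketch: estimate bitmap font sizes for fixed-size glyphs.
# The glyph count and the hex-text layout are illustrative assumptions.

ASSUMED_GLYPHS = 7000  # assumed round figure, not taken from the post


def bitmap_bytes(width, height):
    """Bytes per glyph when each row is padded to a whole byte."""
    bytes_per_row = (width + 7) // 8
    return bytes_per_row * height


def hex_text_chars(width, height):
    """Characters per glyph when each row is a line of hex digits."""
    bytes_per_row = (width + 7) // 8
    return (2 * bytes_per_row + 1) * height  # +1 for each newline


if __name__ == "__main__":
    for size in (14, 24):
        raw = bitmap_bytes(size, size) * ASSUMED_GLYPHS
        text = hex_text_chars(size, size) * ASSUMED_GLYPHS
        print(f"{size}x{size}: {raw} bytes raw, {text} chars as hex text")
```

Under these assumptions the 24x24 bitmaps alone come to a little over 1 MB
as hex text, so a file of about 2 MB including per-glyph headers is
plausible.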
tex@wucc.waseda.JUNET (Kamiya Fumiaki) (12/19/88)
In article <351@puck.UUCP>, asp@puck.UUCP (Andy Puchrik) writes:
> I've seen the NEC msdos micro and some of the laptop Japanese word
> processors. They all have the JIS character set in ROM. I suppose
> the terminals have hardware assist also. What kinds of software is
> available for workstations? Surely there must be terminal emulators
> and word processors for SUN and 386-class systems. Much of the spread
> of computers in the States and Europe was due to public domain editors
> and terminal emulators. Is there such a thing as public domain
> Japanese software? Anything that would run on the larger systems?

Yes, as far as I know, there are a few kana-to-kanji systems for UNIX
machines. Wnn is one such system and is said to be the most powerful tool
for this purpose. It was developed by Kyoto University, Tateishi
Electronics and ASTEC. Since I'm not sure how it is actually distributed,
anyone who wants to obtain a copy or more information should contact ASTEC
directly. Their address is:

    ASTEC, Inc.
    Nagashima-Daiichi Building
    1-22-12, Dougenzaka, Shibuya, Tokyo 150.

---
Kamiya Fumiaki
Department of Mathematics, Waseda University