dankg@volcano.Berkeley.EDU (Dan KoGai) (06/05/90)
In article <3137@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>By the way, "other Indo-European language character sets" would have to
>include many of the Indian languages, and the Devanagari scripts are not
>covered by ISO 646 or ISO 8859. Indo-European is a _large_ language family.
>As far as I'm concerned, the great thing about ISO 8859/1 is that at long
>last it is _almost_ possible to type English text on a computer.

Others have suggested that it will use up to 32 bits to include virtually all languages. But this is not the only problem. Local differences range from the sheer size of the character set to writing direction and justification: Arabic and Hebrew are written from right to left. Mongolian is written top to bottom. Japanese and Chinese allow all three: left to right, right to left, and top to bottom. Korean is written in an Indo-European-like way in the sense that it has primitives for both consonants and vowels (in Japanese kana each symbol is a consonant-vowel cluster), but each syllable has to fit in one "character".

And while the simple character set of English allows single-stroke typing (or WYTIWYG, what you type is what you get), Chinese and Japanese have so many characters that typing them on a "small" keyboard requires an on-line dictionary. Justification in alphabetic languages is word by word, with each word typically delimited by a space. Chinese and Japanese, on the other hand, need no word justification, since each character is so "dense" in meaning that it is almost a word in itself.

And I wonder how thorough and complete (or loose and incomplete) ISO's proposal is. Is it just a matter of mapping each character? Even so, it's tough to say whether a letter with a diacritic is one distinct character or two characters, a base character plus a diacritical character. The same applies to languages such as Korean, where each "block" contains multiple phonemes that form a syllable. If those are represented just as combinations, that saves space (this also applies to Chinese, where each character is made up of primitives).
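To make the "one character or two" question concrete, here is a minimal sketch. It is an illustration only: it assumes Unicode and Python's standard unicodedata module, neither of which appears in this thread, and it says nothing about how the ISO proposal under discussion handles the issue. It shows one accented letter and one Korean syllable block, each in a single-code precomposed form and in a decomposed form built from parts.

```python
import unicodedata

# An accented letter stored as ONE code (precomposed form).
precomposed = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The same letter stored as TWO codes: base letter + combining accent.
decomposed = unicodedata.normalize("NFD", precomposed)

# A Korean syllable block stored as ONE code...
syllable = "\ud55c"  # the syllable "han"

# ...and as its consonant/vowel/consonant primitives (jamo).
jamo = unicodedata.normalize("NFD", syllable)


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings."""
    return ["U+%04X" % ord(ch) for ch in text]


if __name__ == "__main__":
    print(code_points(precomposed), code_points(decomposed))
    print(code_points(syllable), code_points(jamo))
```

Either representation can be converted back to the other (here with normalize("NFC", ...)), so the choice is a trade-off between a larger code table and longer text, which is the space question raised above.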
And how about mixtures of languages? It's common among Japanese to use alphabets and sometimes even Hebrew characters (in math). And it's vital for such areas as foreign language education. I think Xerox's implementation is very elegant but it still has problems. The Xerox way was inherited by the Mac (does Xerox include this issue in its lawsuit against Apple?) and I have found the Mac very powerful in foreign language processing. I have KanjiTalk, the Japanese OS, and there you can have not only typed text in Japanese but also menu definitions, punctuation, date and unit definitions, and more. But it still lacks vertical typing capability (it's up to the application, not the OS), which is crucial for Japanese DTP. And I'd be stuck if I wanted to use more than Japanese and English. Most of all, it still takes a different OS to handle other languages (ArabicTalk and KanjiTalk won't run together).

So I ask my question again. How much of these local differences does ISO's new "Whole Earth Character Set" cover?

----------------
Dan The "<- That still needs ascii pic in Usenet" Man
E-mail: dankg@ocf.berkeley.edu
Voice: +1 415-549-6111
USnail: 1730 Laloma Berkeley, CA 94709 U.S.A
"What's the biggest U.S. export to Japan?"
"Bullshit. It makes the best fertilizer for their rice"
src@scuzzy.uucp (Source Admin) (06/08/90)
there is an article about computers using foreign character sets like chinese, hebrew, arabic etc. in the may issue of BYTE. quite interesting! they also talk about a program that displays your typing in hebrew, coptic and some other character sets simultaneously. have a look.
--
Heiko Blume        blume@scuzzy.UUCP   FAX   (+49 30) 882 50 65
Kottbusser Damm 28 blume@netmbx.UUCP   VOICE (+49 30) 691 88 93
D-1000 Berlin 61                       TELEX 184174 intro d