cdr@amdcad.AMD.COM (Carl Rigney) (03/08/90)
I'm playing with the kanji browser written by Stan Switzer
(sjs@ctt.bellcore.com) and I'm wondering what order the Kanji symbols
in the NeWS font are in. In his 88/10/04 posting to NeWS-makers he notes that

% It seems that Kanji is encoded by pairs of hyper-printable-ASCII
% characters. If we were to give each Kanji char a number "n"
% beginning at 0, then "n", in terms of the pair (a,b) is given by
%     n = 96*(a-160) + (b-160)
% The -160 term comes from subtracting 128 for hyperASCII and 32
% for printable ASCII. There are many ways to get the same character
% (when, for instance, b is not in the "normal" range of 160-255).
% This analysis is based on trial and error and error and ....

My question is:
Is there any mapping from, say, Nelson Index # to Ordinal value in the font?
For example, Ko (self) is 1462 in Nelson, and 370 in Hadamitzky & Spahn.
You can display it with

/Kanji findfont 24 scalefont setfont
72 72 moveto
(\270\312) show

Octal \270 is 184 and \312 is 202, so 96*(184-160) + (202-160) is 2346.
What I'm wondering is, is there any mapping, or are the Kanji in the font
in random order (that would be horrible!).

Also, does anyone out there have any tools, programs, or code examples
for working with Kanji under OpenWindows? I know about Kterm for X and
am planning to work on that, but I know almost nothing about X and would
much prefer a solution that uses NeWS.

What would be ideal is a Kanji hyper-dictionary to help learn it (I'm just
a beginner), but almost anything would be appreciated.

Thank you in advance for any hints, leads or help you can provide. And if
anyone's interested, I'll post a summary of what I find out to
comp.windows.news.

--
Carl Rigney
cdr@amdcad.AMD.COM
{ames att decwrl pyramid sun uunet}!amdcad!cdr
408-749-2453
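As a quick check of the arithmetic in the post above, here is a minimal Python sketch of the quoted trial-and-error formula. The function name `switzer_ordinal` is made up for illustration and is not part of NeWS or the kanji browser. The PostScript octal escapes \270 and \312 are the bytes 184 and 202 in decimal.

```python
def switzer_ordinal(a: int, b: int) -> int:
    """Ordinal n for byte pair (a, b), per the quoted n = 96*(a-160) + (b-160)."""
    return 96 * (a - 160) + (b - 160)


# The PostScript string (\270\312) holds two bytes written as octal escapes.
# Unpacking a Python bytes object yields the integer byte values.
a, b = b"\270\312"
print(a, b, switzer_ordinal(a, b))  # 184 202 2346

# "Many ways to get the same character": stepping a up by one and
# b down by 96 (outside the normal range) gives the same ordinal.
print(switzer_ordinal(a + 1, b - 96) == switzer_ordinal(a, b))  # True
```

The second check illustrates why the quoted note warns that the pair-to-ordinal mapping is not one-to-one once b leaves its normal range.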
thomson@hub.toronto.edu (Brian Thomson) (03/10/90)
In article <29422@amdcad.AMD.COM> cdr@amdcad.AMD.COM (Carl Rigney) writes:
>I'm playing with the kanji browser written by Stan Switzer
>(sjs@ctt.bellcore.com) and I'm wondering what order the Kanji symbols
>in the NeWS font are in.

Kanji characters are encoded as specified by Japanese standard JIS C 6226,
which comes in 1978 and 1983 flavours that are essentially identical.
The encoding is done such that each 16-bit character, when regarded as
a pair of 8-bit units, looks like a pair of printable ASCII characters.
Similar standard encodings exist for Chinese and Korean.

>My question is:
>Is there any mapping from say, Nelson Index # to Ordinal value in the font?
>For example, Ko (self) is 1462 in Nelson, and 370 in Hadamitzky & Spahn.
>You can display it with
>
>/Kanji findfont 24 scalefont setfont
>72 72 moveto
>(\270\312) show
>
>96*(184-160) + (202-160) is 2346. What I'm wondering is, is there any
>mapping, or are the Kanji in the font in random order (that would be horrible!).
>

The high-order (i.e. 128) bit being on is not required by the standard; it
is a convention often used to distinguish Kanji from Roman characters in
text that may contain both. The standard way to make that distinction is
to use escape sequences to switch from one character set to the other and
back again. These escape sequences are standardized by the ISO.

I have heard of Kanji dictionaries that indicate the JIS code, but I don't
know any specifics.

The character set actually contains more than just Kanji. It begins with
kana (Japanese phonetic alphabets), graphics symbols, the Roman, Greek, and
Cyrillic alphabets, and Arabic numerals. Then come the Kanji, which are in
two groups: first a group of (relatively) common characters ordered by their
commonest "on" (Chinese-derived) pronunciation in the usual Japanese
syllabary order (a-i-u-e-o-ka/ga-ki/gi- etc.), then a group of less common
ones ordered by radical.

The character you describe has the pronunciations "onore", "ki", "ko", and
sometimes "mi" in people's names. It is a common character, and its position
in the character set is determined by the "ko" reading. That puts it
near the front of the pack. It is immediately followed by (\270\313 as you
would put it) "ko" meaning a kind of storage shed, as in "reizouko" =
refrigerator.

--
Brian Thomson, CSRI Univ. of Toronto
utcsri!uthub!thomson, thomson@hub.toronto.edu
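To make the "pair of printable ASCII characters" point concrete, here is a small Python sketch that clears the 128 bit from the two byte pairs mentioned above. The helper name `strip_high_bit` is hypothetical. The final step uses Python's `euc_jp` codec, on the assumption that it follows the same high-bit convention; the thread itself does not name that encoding.

```python
def strip_high_bit(pair: bytes) -> bytes:
    """Clear the 128 bit of each byte, leaving the 7-bit JIS code."""
    return bytes(byte & 0x7F for byte in pair)


for pair in (b"\270\312", b"\270\313"):
    jis = strip_high_bit(pair)
    # Each 7-bit byte should land in the printable ASCII range 0x21..0x7E.
    printable = all(0x21 <= byte <= 0x7E for byte in jis)
    # Assumption: the euc_jp codec decodes the high-bit form of the pair.
    codepoint = ord(pair.decode("euc_jp"))
    print(pair.hex(), "->", jis.hex(), repr(jis.decode("ascii")),
          printable, f"U+{codepoint:04X}")
```

With the high bit removed, the two characters read as the ASCII pairs "8J" and "8K", which is why adjacent entries in the set differ only in the second byte.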
mleisher@nmsu.edu (Mark Leisher) (03/10/90)
Try 94*(a-161) + (b-161) instead of 96*(a-160) + (b-160).

I discovered this in the encoding of some Asian fonts in the new release
of X11.
--
-----------------------------------------------------------------------------
mleisher@nmsu.edu                      "I laughed.
Mark Leisher                            I cried.
Computing Research Lab                  I fell down.
New Mexico State University             It changed my life."
Las Cruces, NM                     - Rich [Cowboy Feng's Space Bar and Grille]
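The two formulas can be compared side by side with a short Python sketch. The function names are invented for illustration. The one outside fact relied on is that the printable range 161..254 (33..126 with the high bit cleared) contains exactly 94 values, which is where the multiplier 94 comes from.

```python
def ordinal_96(a: int, b: int) -> int:
    """The earlier trial-and-error formula: 96 slots per first-byte value."""
    return 96 * (a - 160) + (b - 160)


def ordinal_94(a: int, b: int) -> int:
    """The suggested formula: 94 slots per first-byte value, counting from 0."""
    return 94 * (a - 161) + (b - 161)


# The character discussed in this thread, (\270\312) = bytes 184, 202.
print(ordinal_96(184, 202), ordinal_94(184, 202))  # 2346 2203

# Last valid second byte for first byte 184, then first valid one for 185.
last_in_row, first_in_next = (184, 254), (185, 161)
print(ordinal_94(*first_in_next) - ordinal_94(*last_in_row))  # 1 (contiguous)
print(ordinal_96(*first_in_next) - ordinal_96(*last_in_row))  # 3 (gap of unused slots)
```

With 94 as the multiplier the ordinals run without holes across a change of first byte, whereas the 96-based numbering skips values at every such boundary.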