dhosek@jarthur.Claremont.EDU (D.A. Hosek) (11/30/89)
In article <20926@unix.cis.pitt.edu> jbw@unix.cis.pittsburgh.edu (Jingbai Wang) writes:
>I strongly suspect if MetaFont type of approach can successfully solve Chinese
>formatting with TeX. According to mainland China GB standard (equivalent to
>ASCII in USA), there are 87x94 Chinese characters. If each set of metafont
>can carry 127 of them, you need how many more of them? and they all have to
>be defined by \font. I am afraid TeX (especially LaTeX) memory will be blown
>up.
>JTeX used PK and TFM files which are not derived from Metafont, but they seem
>to have ways to reduce number of sets. By the way, JIS has 94x94 characters.
>I have completed a project of Chinese TeX, but it only supports PostScript
>for the time being. It has nothing to do with metafont. I have talked to your
>friend Nelson Beebe lately to have it installed in science.utah.edu for
>distribution. Of course, if metafont is available, I will modify it to use
>metafont files indirectly. I don't want to see TeX/LaTeX memory and printer
>VM blown up.

Actually, Metafont can generate up to 65536 distinct character codes, which
is sufficient for all existing character sets (although I've heard a proposed
24-bit Japanese set mentioned as a possibility for the future).

JTeX works by breaking down the JIS set into 256-character subfonts. I believe
that TeX retains the TFM organization of information in its own font info
tables, in which case a 256-character Kanji font would probably take no more
space than the info for a font like cmex10 (only one height, width, and depth
would be necessary). There is another version of jTeX which uses
65536-character fonts as well.

The printer VM problem really isn't one, because any decent DVI driver
downloads only those characters that are actually used, and if VM is cleared
after each page, it would be difficult to run out of VM.

The bigger problem is the sheer tedium of writing and debugging all the
individual character programs.
See my paper presented at the TUG conference in August (to appear in the
proceedings issue of TUGboat) for details of one approach.

-dh

--
"Odi et amo, quare id faciam, fortasse requiris?
nescio, sed fieri sentio et excrucior" -Catullus
D.A. Hosek.  UUCP: uunet!jarthur!dhosek
Internet: dhosek@hmcvax.claremont.edu
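
The subfont arithmetic in the exchange above can be checked with a short sketch. It uses only the figures quoted in the posts (an 87x94 GB grid, a 94x94 JIS grid, 127 or 256 characters per font). The function names and the assumption that characters are packed into subfonts sequentially are illustrative only; the posts do not say how JTeX actually assigns characters to subfonts.

```python
def subfonts_needed(total_chars, per_font):
    """Smallest number of subfonts that can hold total_chars characters."""
    return -(-total_chars // per_font)  # integer ceiling division


def locate(index, per_font=256):
    """Map a 0-based character index to (subfont number, slot in subfont).

    Assumes sequential packing, which is a guess made for illustration.
    """
    return divmod(index, per_font)


GB_CHARS = 87 * 94    # grid size quoted for the GB standard
JIS_CHARS = 94 * 94   # grid size quoted for JIS

print(subfonts_needed(GB_CHARS, 127))   # 127 characters per font, as in the quoted post
print(subfonts_needed(GB_CHARS, 256))   # 256 characters per font, as in the reply
print(subfonts_needed(JIS_CHARS, 256))
print(locate(JIS_CHARS - 1))            # where the last JIS grid position would land
```

Under these assumptions the count drops from 65 fonts at 127 characters each to 32 at 256 each for the GB grid, and the full 94x94 grid fits in 35 subfonts of 256.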
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (12/01/89)
From article <3313@jarthur.Claremont.EDU>, by dhosek@jarthur.Claremont.EDU (D.A. Hosek):
>In article <20926@unix.cis.pitt.edu> jbw@unix.cis.pittsburgh.edu (Jingbai Wang) writes:
>...

I've been working at printing Chinese, too. I'll be eager to use
JB's Chinese TeX (but where does the actual font come from?).
Here is what I have so far, in case it might be of interest to anyone:

1) A set of 34 TeX-compatible subfonts, in 4 sizes, derived from the
24x24 bit Chinese font available by ftp from hanauma.stanford.edu
in pub/zhongwen. The subfonts are pk and tfm files, meant to
be used with JTeX (or JTeX slightly modified).

2) A program p2ps, derived from the JTeX utility k2ps (which came
in turn from a2ps) for printing unformatted text with a mixture
of ordinary roman and Chinese on a PostScript printer. It uses
the fonts mentioned in 1).

3) A partially working modification of JTeX to use the Chinese
fonts in place of the JIS Japanese fonts. (At the moment, not
all the Chinese characters can be printed.)

Now, it may be I'll just give up my little project once I can
try out JB's Chinese TeX. I don't know -- my real interest in
all this is in working toward some generalized facilities for
composing and using large fonts -- not just Japanese and
Chinese. But now I have some questions:

Don Hosek mentions a variety of JTeX that uses one big font
instead of a bunch of subfonts (did I get that right?). That
interests me. Where can I get it?

What's the right convention for escaping Chinese text? I'm
just using the JIS conventions now. What about texts that
have roman + Japanese + Chinese and maybe other character
sets? Is there any agreed on convention?

What about editing? Is there any public domain editing
software for Chinese, like maybe a Chinese version of emacs?

Does anyone have good ways of extending character bit maps
to other sizes (e.g. 24x24 to 36x36)? (My way of doing this
has some problems.)

Greg, lee@uhccux.uhcc.hawaii.edu
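
For the bitmap question above (24x24 to 36x36), the simplest baseline is nearest-neighbor resampling, sketched below. This is not the method the poster was using, which the post does not describe. The function name and the list-of-rows bitmap representation are assumptions made for illustration.

```python
def scale_bitmap(bitmap, new_size):
    """Resize a square 0/1 bitmap (a list of rows) by nearest-neighbor sampling.

    Each target pixel (x, y) copies the source pixel at
    (x * old_size // new_size, y * old_size // new_size).
    """
    old_size = len(bitmap)
    return [
        [bitmap[y * old_size // new_size][x * old_size // new_size]
         for x in range(new_size)]
        for y in range(new_size)
    ]


# A tiny 2x2 example scaled to 4x4; a 24x24 glyph scales to 36x36 the same way.
small = [[1, 0],
         [0, 1]]
for row in scale_bitmap(small, 4):
    print("".join("#" if bit else "." for bit in row))
```

At a non-integer ratio such as 24 to 36, this mapping repeats every other source row and column and copies the rest once, so one-pixel strokes come out with uneven thickness. That is the usual weakness of plain resampling for glyphs.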
jbw@unix.cis.pitt.edu (Jingbai Wang) (12/01/89)
In article <5578@uhccux.uhcc.hawaii.edu> lee@uhccux.uhcc.hawaii.edu (Greg Lee) writes:
>From article <3313@jarthur.Claremont.EDU>, by dhosek@jarthur.Claremont.EDU (D.A. Hosek):
|>In article <20926@unix.cis.pitt.edu> jbw@unix.cis.pittsburgh.edu (Jingbai Wang) writes:
|>...
|Here is what I have so far, in case it might be of interest to anyone:
|...
|1) A set of 34 TeX-compatible subfonts, in 4 sizes, derived from the
|24x24 bit Chinese font available by ftp from hanauma.stanford.edu
|in pub/zhongwen. The subfonts are pk and tfm files, meant to
|be used with JTeX (or JTeX slightly modified).
|

Yeah, that's how jTeX fonts were built.

|2) A program p2ps, derived from the JTeX utility k2ps (which came
|in turn from a2ps) for printing unformatted text with a mixture
|of ordinary roman and Chinese on a PostScript printer. It uses
|the fonts mentioned in 1).

I am not impressed by k2ps. Try out my WStroff, which can not only print
unformatted text but can also format text with Chinese fonts of different
sizes and Adobe fonts in any family. The Chinese fonts come from a whole set
instead of subsets.

|Now, it may be I'll just give up my little project once I can
|try out JB's Chinese TeX. I don't know -- my real interest in
|all this is in working toward some generalized facilities for
|composing and using large fonts -- not just Japanese and
|Chinese. But now I have some questions:

Why? We are using totally different approaches. It is always good to have
different solutions to a problem, as in academic journals.

|Don Hosek mentions a variety of JTeX that uses one big font
|instead of a bunch of subfonts (did I get that right?). That
|interests me. Where can I get it?

I don't read TUGboat (because I was really a Scribe hacker and C programmer
rather than a TeX one), but I knew there were articles there about it.
Well, a font of more than 256 characters should not surprise anybody as
computer text evolves: 256 = 2^8 is just the one-byte (8-bit) limit, and JIS
(Japanese), GB (Chinese) and Big-5 (Taiwan Chinese) use two bytes, which gives
2^16 = 65536. However, we only use #161~#254 in both bytes, because there are
only some 7000 commonly used Chinese characters or Japanese Kanji (HanZi, in
Chinese PinYin), and we do want to distinguish Chinese bytes from standard
ASCII ones (#33~#126), remembering also not to use the control characters
(#0~#31 and #127~#159). #32 and #160 (128+32) are reserved for <space>. Thus,
65536 has only theoretical beauty. If METAFONT also uses 16-bit encoding
instead of 7-bit or 8-bit, a 65536-character font set should not scare anybody.

In my previous posting, I did not mean that it was not possible to generate
Chinese fonts with Metafont; of course you can, even with an 8-bit scheme.
Just break it up into subsets. The real problem is the efficiency of TeX and
printing. VM is a serious problem if you have many distinct characters in a
page. After each page you can flush it, as Hosek said, but not in the middle
of the page.

Adobe is working to support multi-byte encoding schemes (two to three bytes),
which will make programmers' lives easier. The only thing is that the
customers have to pay bigger bucks for memory expansion and Adobe PS
licensing (which is included in the printer price).

|
|What's the right convention for escaping Chinese text? I'm
|just using the JIS conventions now. What about texts that
|have roman + Japanese + Chinese and maybe other character
|sets? Is there any agreed on convention?

I developed ChTeX before I saw JTeX, and thus I do not stick to JIS, and I
don't think any Chinese from China, Hong Kong, Taiwan, Singapore and overseas
will. GB is the way to go as far as I can see.

|
|What about editing? Is there any public domain editing
|software for Chinese, like maybe a Chinese version of emacs?
I have it in ChTeX.tar.Z for mainframe systems. It is called ChText. It
follows the most natural way (and some completely new ideas) for you to type
in Chinese just like typing English. A Chinese emacs may not be too hard to
design for some particular hardware with graphic capability, and indeed for
the DOS PC somebody has already adopted emacs/epsilon commands in a Chinese
editor. The key issue in inputting Chinese, however, is how to make inputting
natural to the human mind and the international keyboard.

| Greg, lee@uhccux.uhcc.hawaii.edu

JB Wang
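
The byte ranges described above (both bytes of a Chinese character in #161~#254, everything else passed through as a single byte) can be sketched as a small scanner. The function name, the token format, and the error handling are illustrative assumptions and are not taken from ChTeX or JTeX. Row and cell numbers are obtained by subtracting 160 from each byte, so both run from 1 to 94.

```python
LOW, HIGH = 161, 254  # range used for both bytes of a two-byte character


def tokenize(data):
    """Split bytes into ('byte', value) and ('gb', row, cell) tokens.

    A byte in 161..254 starts a two-byte character and must be followed by
    another byte in the same range; any other byte stands alone.
    """
    tokens = []
    i = 0
    while i < len(data):
        b = data[i]
        if LOW <= b <= HIGH:
            if i + 1 >= len(data) or not (LOW <= data[i + 1] <= HIGH):
                raise ValueError("incomplete two-byte character at offset %d" % i)
            tokens.append(("gb", b - 160, data[i + 1] - 160))
            i += 2
        else:
            tokens.append(("byte", b))
            i += 1
    return tokens


# ASCII 'A' followed by the two-byte code B0 A1 (row 16, cell 1).
print(tokenize(b"A\xb0\xa1"))
print((HIGH - LOW + 1) ** 2)  # number of two-byte positions actually usable
```

With 94 usable values per byte, the scheme offers 94 x 94 = 8836 positions, which is the practical size behind the "65536 has only theoretical beauty" remark.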