jalbert@cs.ubc.ca (Francois Jalbert) (10/12/90)
Hallo TeXperts.

I have been working for a while on my own Japanese TeX system, but before I invest more time, I thought I would describe what I am doing and how I am doing it. Perhaps I am repeating someone else's work. Perhaps some of you have advice to give me.

The biggest problem seems to be the large number of symbols. I first decided I would limit myself to a few thousand, but which ones? The answer came with MOKE, the simple Japanese vi editor for MS-DOS machines. It includes a file called JIS24 which contains about 7802 Japanese symbols at a resolution of 24 by 24 pixels. JIS stands for Japanese Industrial Standard, and it could act for me as some sort of extended ASCII table. I decided to limit myself to these symbols.

I don't know where the file JIS24 comes from. The documentation seems to imply it was derived from some X Window file. Any information about that, and about possible copyright violation, is welcome.

I quickly wrote a few utilities in Turbo Pascal 5.0 which let me browse through JIS24, dump it all on my printer (about 50 pages), and manipulate the information for each individual symbol.

I decided to try the following approach: write a small picture environment for each Japanese symbol. The picture is a simple 24 by 24 matrix with \circle*{1} put at the right places. If one assumes 10 such Japanese characters per inch, that gives a density of 240 DPI. Of course, that is not "true" 240 DPI since the characters don't have a continuous boundary, but it might be enough for my simple needs.

So I wrote a utility which automatically translates JIS24 into a large number of small .tex files, each containing the right pattern of \circle*{1}. I can then use LaTeX as usual, and here and there use commands like \jap{3056} to make Japanese character #3056 appear. That works fine. It looks great with my screen previewer, and so-so with my lousy dot matrix printer.
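
The bitmap-to-picture translation described above might look roughly like the following Python sketch. The layout of the JIS24 file is not given in the post, so the sketch assumes each glyph has already been unpacked into rows of text with '#' marking a set pixel; the function name glyph_to_picture is hypothetical, and \unitlength is assumed to be set elsewhere in the LaTeX document.

```python
def glyph_to_picture(rows, size=24):
    """Return LaTeX picture source for a square bitmap.

    rows: list of strings, one per pixel row from top to bottom,
          where '#' marks a set pixel (an assumed input format).
    """
    lines = [r"\begin{picture}(%d,%d)" % (size, size)]
    for y, row in enumerate(rows):
        for x, cell in enumerate(row):
            if cell == "#":
                # The picture origin is bottom-left, while the first
                # bitmap row is the top one, so flip the y coordinate.
                lines.append(r"\put(%d,%d){\circle*{1}}" % (x, size - 1 - y))
    lines.append(r"\end{picture}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Tiny 2 by 2 demonstration glyph (a diagonal).
    print(glyph_to_picture(["#.", ".#"], size=2))
```

Each set pixel becomes its own \put, so a dense 24 by 24 glyph can expand to several hundred commands per character.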
My current complaints are the size of the generated .dvi file (typically 20 times the size of the main .tex document), the amount of memory used (big emTeX blows up with half a page of Japanese text), and the large CPU time required.

I thought it ought to be possible to use METAFONT to generate fonts made of small matrices of dots. After all, METAFONT must have primitive operators to draw outlines of symbols. There is probably some sort of \circle*{1} operator in there. I could automate the creation of these .mf files in the same way I did for my .tex files. That's no problem. Does anybody have examples of such fonts? I could just change the size to 24 by 24 and the dot patterns.

The problem now is the number of different fonts needed. At 128 Japanese symbols per font, I need around 60 fonts (7802/128 rounds up to 61), all of which might potentially be needed in a given document. Is that too much for LaTeX? Is it possible to load a font, grab a character, and then discard the font? That would slow things down, but would at least allow me to process the document. Each font could be numbered like JAP23.TFM, and it would be "easy" to deduce the font number and the symbol offset from something like \jap{3056}.

Anyway, I sure would appreciate any advice or information anyone has for me. I want to avoid \specials since they are PostScript dependent. I also know the quality won't be great, so please no flames about the spirit of TeX being violated. If this works fine, I may look at generating better fonts. But right now, I just want a bare-bones system running.

A million thanks in advance.

Franky, hacker at large.
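
The deduction of a font number and symbol offset from something like \jap{3056} can be sketched in Python as below. This is only an illustration under assumptions not stated in the post: symbols are numbered from zero and packed contiguously, 128 per font, and the helper names (locate, font_name, fonts_needed) are hypothetical.

```python
import math

SYMBOLS_PER_FONT = 128  # figure used in the post


def locate(symbol, per_font=SYMBOLS_PER_FONT):
    """Split a zero-based symbol number into (font index, offset in font)."""
    return divmod(symbol, per_font)


def font_name(symbol, per_font=SYMBOLS_PER_FONT):
    """Name of the .TFM file holding the symbol, in the JAPnn.TFM style."""
    font, _offset = locate(symbol, per_font)
    return "JAP%d.TFM" % font


def fonts_needed(total, per_font=SYMBOLS_PER_FONT):
    """Number of fonts required to hold `total` symbols."""
    return math.ceil(total / per_font)


if __name__ == "__main__":
    print(locate(3056), font_name(3056), fonts_needed(7802))
```

Under these assumptions symbol 3056 lands at offset 112 of font 23, and 7802 symbols need 61 fonts; with 256 symbols per font the count would drop to 31.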
mzw_t@hpujsda.HP.COM (Matsuzawa Takashi) (10/16/90)
----
There already exist two Japanese TeXs that are widely used in Japan. They are `jTeX', ported at an NTT (Nippon Telegraph and Telephone) lab, and Nihongo-TeX, ported by ASCII Co., a private Japanese company. Both are based on ctex 2.95 (or pre-3.0) UNIX implementations. jTeX first ran on TOPS-20 and was then ported to VAX/VMS and UNIX.

Both are publicly available, and you can obtain them free via anonymous FTP from the following Internet hosts:

    miki.cs.titech.ac.jp      (Tokyo Institute of Technology)
    utsun.is.s.u-tokyo.ac.jp  (Tokyo University)

Their archives are named `ASCII-jTeX' and `NTT-jTeX' there.

I believe there is no widely used public port of Japanese TeX to PCs yet. (ASCII Co. is already selling a commercial version of Nihongo-TeX for NEC's PC-9801 computers, the major force in the Japanese PC world.) So you are encouraged to work on your Japanese TeX!

---
The JIS kanji set (JIS X 0208) is a good choice for your character set, and it is sufficient. It includes hiragana, katakana, miscellaneous punctuation, and kanji, so you will not meet serious difficulties writing ordinary Japanese sentences.

I cannot be sure where your JIS24 data came from, but you can obtain public kanji fonts in X11 BDF format from the sites noted above. (You can find k14.tar.Z, etc.) They may not be large enough to meet your needs, but they are in the public domain.

Note: hereafter I will use the term `kanji' for the non-ASCII characters that appear in Japanese texts, although kanji are just a subset of Japanese characters, as you might know.

My only suggestion for your implementation is to use the Shift-JIS kanji code or the EUC (UJIS) kanji code for your input texts. (Or you can use the complicated ISO escape sequences to invoke kanji character sets from within ASCII texts.) They are standard multiple-byte encoding schemes for handling Japanese text as computer data.
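
As a rough illustration of what reading EUC-coded input involves, the Python sketch below walks through a byte string and separates ASCII bytes from two-byte kanji codes, converting each kanji code to its JIS X 0208 row and cell number. It is not taken from jTeX or Nihongo-TeX; the function names are hypothetical, and it deliberately rejects anything outside the plain two-byte JIS X 0208 range (such as half-width katakana).

```python
def euc_to_kuten(pair):
    """Map a two-byte EUC-JP code to its JIS X 0208 (row, cell), each 1..94."""
    b1, b2 = pair
    if not (0xA1 <= b1 <= 0xFE and 0xA1 <= b2 <= 0xFE):
        raise ValueError("not a two-byte JIS X 0208 code in EUC-JP")
    # EUC-JP stores each JIS byte with the high bit set; row and cell
    # numbers are the JIS bytes minus 0x20, hence minus 0xA0 here.
    return b1 - 0xA0, b2 - 0xA0


def split_euc(data):
    """Yield ('ascii', char) or ('kanji', (row, cell)) items from EUC-JP bytes."""
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            yield ("ascii", chr(data[i]))
            i += 1
        else:
            yield ("kanji", euc_to_kuten(data[i:i + 2]))
            i += 2


if __name__ == "__main__":
    sample = "TeX\u3042".encode("euc_jp")  # U+3042 is HIRAGANA LETTER A
    print(list(split_euc(sample)))
```

The same character is b"\x82\xa0" in Shift-JIS, which packs the row and cell differently, so a Shift-JIS reader would need its own conversion step.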
If your TeX accepts these character codes, you can print whatever Japanese-language text files you have obtained from elsewhere.

----
Here is a brief description of NTT's jTeX implementation. (Nihongo-TeX has made major enhancements to the TeX font file formats that are incompatible with ordinary TeX, and I think their approach is too drastic. They are planning to implement Nihongo-TeX with a vertical writing mode, though, which is itself a very interesting attempt.) In fact, I am currently using jTeX (jLaTeX) on my Apollo workstations, and I am a bit knowledgeable about it. I hope this will give you some hints for your Japanese TeX implementation.

The main reason that jTeX does not `blow up' is that it treats Japanese text as a series of character codes, not as a series of graphic patterns. (There does exist a limit on the number of loadable fonts, though.) It might be good to look into jtex.ch, the TeX change file which is the core of the jTeX implementation. You can also find working implementations of jLaTeX, jBibTeX, etc.

---
jTeX reads the input text, which is generally a mixture of ordinary ASCII codes and kanji codes. It detects the kanji in it and encodes them into special internal codes (a pair of bytes specifies one Japanese character). jTeX applies Japanese-language-specific formatting rules to them. For example, jTeX has the concept of a `current kanji-font' in addition to TeX's `current font', so you have two `current fonts' in jTeX. Kanji characters have special glue, etc.

----
jTeX's internal representation of a Japanese character is as follows:

    <sub-font#><char#-within-subfont>

As you wrote, because the Japanese language has so many characters, the limit of a TeX font file (256 glyphs) is not enough. jTeX uses multiple TeX font files for one font. That is, you need just one file for the 10pt Computer Modern font (cmr10.300pk, which contains the necessary 128 glyphs), but if you need the 10pt DNP-Mincho font, then you need the following files.
Each of them contains up to 255 glyphs, approximately seven thousand glyphs in total.

    dmjsy10.{tfm|300pk}       (punctuation)
    dmjroma10.{tfm|300pk}     (alphanumerics)
    dmjhira10.{tfm|300pk}     (hiragana)
    dmjkata10.{tfm|300pk}     (katakana)
    dmjgreek10.{tfm|300pk}    (Greek characters)
    dmjrussian10.{tfm|300pk}  (Cyrillic characters)
    dmjkeisen10.{tfm|300pk}   (line drawing characters)
    dmjka10.{tfm|300pk}       (kanji - 1st level)
    dmjkb10.{tfm|300pk}       (        "        )
    dmjkc10.{tfm|300pk}       (        "        )
    dmjkd10.{tfm|300pk}       (        "        )
    dmjke10.{tfm|300pk}       (        "        )
    dmjkf10.{tfm|300pk}       (        "        )
    dmjkg10.{tfm|300pk}       (        "        )
    dmjkh10.{tfm|300pk}       (        "        )
    dmjki10.{tfm|300pk}       (        "        )
    dmjkj10.{tfm|300pk}       (        "        )
    dmjkk10.{tfm|300pk}       (        "        )
    dmjkl10.{tfm|300pk}       (        "        )
    dmjkm10.{tfm|300pk}       (kanji - 2nd level)
    dmjkn10.{tfm|300pk}       (        "        )
    dmjko10.{tfm|300pk}       (        "        )
    dmjkp10.{tfm|300pk}       (        "        )
    dmjkq10.{tfm|300pk}       (        "        )
    dmjkr10.{tfm|300pk}       (        "        )
    dmjks10.{tfm|300pk}       (        "        )
    dmjkt10.{tfm|300pk}       (        "        )
    dmjku10.{tfm|300pk}       (        "        )
    dmjkv10.{tfm|300pk}       (        "        )
    dmjkw10.{tfm|300pk}       (        "        )
    dmjkx10.{tfm|300pk}       (        "        )
    dmjky10.{tfm|300pk}       (        "        )
    dmjkz10.{tfm|300pk}       (        "        )

Imagine needing several magnifications of this, and `Mincho' is just one of the Japanese font designs. (A resource hog!)

----
Please note that jTeX did not modify the format of TeX font files. You can use DVI-ware written for TeX, without modification, to process the DVI files that jTeX outputs. Even Imagen or LaserJet printers (which in general cannot output Japanese text) will be able to print beautiful Japanese text.

Unfortunately, there is no public jTeX kanji font of better quality. The jTeX distribution contains JIS 24x24 fonts (*.tfm and *.pk) at several magnifications, but I believe their quality is the same as what you already have. It *is* a hard task to develop a new kanji font from scratch (you have to design several thousand glyphs at a time!).

There is a proprietary jTeX kanji font called the DNP kanji font, which is widely used by jTeX users.
It is provided by Dai-Nippon-Printing, one of the largest printing companies in Japan. The data are provided in the form of *.pk and *.tfm files (no *.mf files). Because this font is generated directly from DNP's professional outline font data, its quality is good enough for professional publications. It comes in several magnifications and includes two fonts, `Mincho' and `Gothic', the two major Japanese font styles. Compared to them, JIS 24x24 is just a `courier'. (But they cost several tens of thousands of yen.)

For further information on jTeX, Nihongo-TeX, or the DNP fonts, you had better contact the authors of the Japanese TeXs. Here are their network addresses, from the software's README files:

    ryo-i@ascii.co.jp               (Nihongo-TeX)
    tony-o@ascii.co.jp              (     "     )
    isozaji@ntt-20.ntt.jp           (Nihongo-TeX)
    a87480@tansei.cc.u-tokyo.ac.jp  (     "     )

(Some of the above are JUNET addresses, not Internet addresses, so I am not sure whether your mail will arrive or not. Someone else on the net might be more knowledgeable than I am.)

---
Because jTeX's kanji fonts occupy several tens of megabytes of disk space, you will have difficulty installing them on PCs. One practical approach is to use the internal kanji fonts of Japanese printers. If you can provide appropriate *.tfm files and DVI-ware, you do not have to install the *.pk files on your disk. With kanji PostScript printers, you can get professional quality. Some public Japanese dvi2ps programs use this method and work fine. You can also obtain them from the Internet hosts noted above. (But kanji PostScript printers will cost you several hundred thousand yen...)

Good luck and best regards;

Takashi Matsuzawa. (Yokogawa-Hewlett-Packard)
Email: mzw_t@apollo.hp.com