[sci.lang.japan] Chinese/Japanese in TeX

haccme@milton.u.washington.edu (Thomas Ridgeway) (12/04/90)

The Humanities and Arts Computing Center of the University
of Washington is developing systems for printing modest
quality Chinese and Japanese with TeX.

Poor Man's Chinese and Poor Man's Japanese are packages
for using TeX, the typesetting language, to print text
in Chinese and Japanese (and other languages already
supported under TeX), using dot-matrix quality fonts.

The Poor Man's systems require TeX version 3, but
have no other special TeX requirements.  (You will
need a working copy of METAFONT to make the fonts.)

There are no known system dependencies.

Individuals with extensive background in TeX and/or METAFONT
are invited to take copies of pmC and pmJ for testing and
evaluation.  The test package is available now (see below).

Unless you are a reasonably experienced TeX user, a systems
analyst/maintainer, or other reasonably serious jock, don't
try now: we expect to make it easier in a few months.  The Poor Man's
systems are evolving sufficiently rapidly that we would normally
let them cool down a little before exposing them to public view,
but we have had a few requests, so here goes.  The contents of
the intro files are inconsistent since they are of different
ages.  Documentation is non-existent or inaccurate throughout.
(Oh, you know that already?)

Included below is the readme file for the package:

README    

This is the usual in-the-absence-of-real-documentation
readme file for Poor Man's Chinese and Japanese.

pmC and pmJ are less than ideal implementations of Chinese
and Japanese for TeX.  Less than ideal because they use fonts
based on 24x24 dot-matrix fonts, and don't do vertical format
typesetting and so forth.  However, they (seem to) work, are
free, and work with a standard TeX of version 3 with no known
system dependencies.  SEE THE END OF THIS FILE FOR A NOTE ABOUT
VERSION 3.0 OF TeX.

pmJ/pmC has two components:
    a) a font maker
    b) macros enabling TeX to recognize the character
       set encoding used.
       
    a) the fontmaker is a METAFONT code generator written in C
    which reads a JIS or CCLIB 24x24 bitmapped font, and writes
    METAFONT instructions to emulate the font.  (This is the same
    technique adapted by F. Jalbert for JEMTEX;  JEMTEX is in part
    based on an earlier version of this fontmaker).  The fonts are
    organized to reflect the character set encodings which pmC/pmJ
    support.  
    
    b)  pmC and pmJ support the GB and Shift-JIS encodings respectively.
    These schemes use two-byte codes for indicating single characters.
    [These encodings incorporate the standard one-byte ASCII as a subset, 
    thus allowing English TeX to proceed with no changes required]
    So far as pmC and pmJ are concerned, the first byte selects a font, 
    the second byte selects a character in the font.  pmC and pmJ accomplish
    this by making all characters in the range 160 to 254 \active; they
    are then defined as single-character macros . . . (and the details
    may be read in the respective pmJ.tex and pmC.tex macro files).
    Commands \beginJapanese . . . \endJapanese [ \beginChinese . . .
    \endChinese] define a local environment within which Japanese
    [or Chinese] text is expected.  Unless another language using
    a character set encoding in conflict with GB/Shift-JIS is also
    used, one may just say \beginJapanese [\beginChinese] at the beginning
    and leave it at that.
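
    A minimal sketch in C of the two-byte dispatch described in (b).
    This is not the pmJ/pmC macro code, which is written in TeX; the
    names pm_is_active, pm_glyph and pm_next are hypothetical.  The only
    facts taken from the text are that bytes 160 to 254 are \active, and
    that the first byte of a pair selects a font while the second selects
    a character in that font.  How a first byte maps to a font name is
    not shown, since the macro files define that.

```c
#include <stddef.h>

/* Hypothetical helper: bytes 160..254 are the ones pmC/pmJ make \active. */
int pm_is_active(unsigned char c)
{
    return c >= 160 && c <= 254;
}

/* One two-byte code, split the way the README describes. */
struct pm_glyph {
    unsigned char font_select;  /* first byte: selects a font */
    unsigned char char_select;  /* second byte: selects a character in it */
};

/*
 * Consume one character from s starting at *pos.
 * Returns 1 if a two-byte code was consumed (and *out filled in),
 *         0 if an ordinary one-byte character was consumed,
 *        -1 at end of input or if a pair is cut off.
 */
int pm_next(const unsigned char *s, size_t len, size_t *pos,
            struct pm_glyph *out)
{
    if (*pos >= len)
        return -1;                  /* nothing left */
    if (!pm_is_active(s[*pos])) {
        *pos += 1;                  /* plain one-byte character, e.g. ASCII */
        return 0;
    }
    if (*pos + 1 >= len)
        return -1;                  /* first byte with no second byte */
    out->font_select = s[*pos];
    out->char_select = s[*pos + 1];
    *pos += 2;
    return 1;
}
```

    One-byte text passes through untouched, which is why ordinary
    English TeX input needs no changes.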

NOTES ON USE
    The Poor Man's systems use A LOT of fonts; see the remarks on the
    subject in pm?intro.tex included in the package.  To partially
    alleviate the problem, pmfonts are made with a magstep2 magnification
    built into the base size font.  Whether the magnified or normal
    size of a character is printed depends on the state of an \ifbigJ
    or \ifbigC flag.  For example, to put a section head in larger type, you
    might say
        \centerline{{\bigJtrue ^^b8^^a9^^d4^^f2^^c4^^a1}}
    where the \bigJtrue would tell pmJ to use the larger size.

    The Chinese traditional font is much inferior to the other fonts; it
    was mechanically scaled up from 16x16, and that is just asking a little
    too much.  In the fullness of time, someone will make an improved version,
    we hope.  [Win fame and glory! You do it!].

HOW TO GET IT
    During the month of December 1990, early test versions of pmC and
    pmJ will be available for anonymous ftp on
              blackbox.hacc.washington.edu  (128.95.200.1)
    in directory pub/poorman.  We expect rapid evolution in the Poor Man's
    group, so if you take a copy of pmC or pmJ, please send me e-mail
    saying that you are testing so that I may maintain a list
    for future notifications.
    First get the (text file) MANIFEST for a current list of files, and
    classification as to binary/text.
    
LEGALITIES
     Portions of pmC and pmJ are copyrighted free software.  See the file 
     license for details.  The TeX portions are public domain.
     
INSTALLATION
     Take everything from the poorman directory; all files are text
     except the font files jis24, cclib.24 and cclibf.24, which are binary.
     [ confirm this information with whatever MANIFEST says ]
     Some of the text files include 8-bit text; be advised that they will
     likely ftp o.k., but may be damaged by passing through e-mail.
     Compile pmfont.c, and --- if you wish --- to_sjis.c [to_sjis
     is optional, converting some forms of encoding for Japanese to 
     the encoding used by pmJ].  Prepare a MakeFont program for your
     system; MakeFont is a system-dependent script which runs METAFONT
     at the behest of pmfont: preparing MakeFont is the principal point
     in the installation where you will need to know what's going on.
     A sample MakeFont for a Unix system is included in the test package.
     pmfont is invoked with two arguments: input-font-file and output-tag.
     The input-font-file is something like JIS24 or CCLIB.24; the output-tag
     will become the first part of the fontname METAFONT will work with.
     The expected output tags are `wjis' for Japanese, `wcct' for traditional
     Chinese, and `wccs' for simplified Chinese.  The font files which will
     be produced will be wjisa1, wjisa2, wjisa3 . . ./wccta1, wccta2, wccta3 
     . . ./wccsa1, wccsa2, wccsa3 . . .  [There is an optional third
     parameter, skipcount, for bypassing the early stages of a run;
     if you had a previous run, for example, which ran successfully until
     font wjisb9, and was then interrupted, you could restart without
     wasted effort by saying `pmfont jis24 wjis b9'].
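
     A minimal sketch in C of how the command line just described could
     be read.  This is an illustration, not the code in pmfont.c; the
     struct pm_args and the function pm_parse_args are hypothetical.
     It assumes only what is stated above: two required arguments
     (input-font-file, output-tag) and an optional third (skipcount).

```c
#include <stddef.h>

/* Hypothetical container for the arguments the README describes. */
struct pm_args {
    const char *input_font;  /* e.g. "jis24" */
    const char *output_tag;  /* e.g. "wjis" */
    const char *skip_to;     /* optional skipcount, e.g. "b9"; NULL if absent */
};

/*
 * Fill *out from argv.  Returns 0 on success, or -1 when the number of
 * arguments after the program name is not two or three.
 */
int pm_parse_args(int argc, char **argv, struct pm_args *out)
{
    if (argc < 3 || argc > 4)
        return -1;
    out->input_font = argv[1];
    out->output_tag = argv[2];
    out->skip_to = (argc == 4) ? argv[3] : NULL;
    return 0;
}
```

     With `pmfont jis24 wjis b9' this yields input_font "jis24",
     output_tag "wjis" and skip_to "b9".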
        Here's the scheme: pmfont reads in the bitmap font and starts
        writing METAFONT code for one font, then pauses and issues a
        system call to MakeFont, expecting that MakeFont will tell
        METAFONT to make the font (in an appropriate mode, etc.),
        copy the output files to wherever they should go, and remove
        no-longer-needed files (including the METAFONT source code).
        pmfont then resumes reading the input bitmap font and writing
        METAFONT code, until it is time for another system call to
        MakeFont.  Depending on which set of fonts we are working on,
        we go through this cycle eighty-odd or ninety-odd times.  I haven't
        tried it on a PC; MakeFont would probably need to just dump the 
        METAFONT source code (compressed) to a floppy for later processing
        since most PC METAFONTs will not want to run in whatever
        memory space is left when pmfont runs a system call [but I
        could be wrong, surprise me!].
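
        A minimal sketch in C of the hand-off to MakeFont in the scheme
        above.  It is not taken from pmfont.c.  The function names are
        hypothetical, and passing the font name as MakeFont's single
        argument is an ASSUMPTION; check the sample MakeFont in the
        package for the real convention.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build the shell command for one font, e.g. "MakeFont wjisa1".
 * ASSUMPTION: MakeFont takes the font name (tag + suffix) as its only
 * argument.  Returns 0 on success, -1 if buf is too small.
 */
int pm_build_command(char *buf, size_t size,
                     const char *tag, const char *suffix)
{
    int n = snprintf(buf, size, "MakeFont %s%s", tag, suffix);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Called after the METAFONT source for one font has been written:
 * pause and let MakeFont run METAFONT, file the output, and clean up.
 * Returns whatever system() returns, or -1 if the command did not fit.
 */
int pm_run_makefont(const char *tag, const char *suffix)
{
    char cmd[128];

    if (pm_build_command(cmd, sizeof cmd, tag, suffix) != 0)
        return -1;
    return system(cmd);
}
```

        Keeping the command in a script rather than in the C code is
        what lets the system-dependent part live entirely in MakeFont.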
      Ordinarily you will not keep the METAFONT code pmfont generates
      since it is purely a mechanical production and can be regenerated
      whenever you wish [It will also require somewhere around 20Mbytes
      of storage space if you keep it].  BE ADVISED THAT YOU WILL HAVE
      A LONG RUN IN PROCESSING THE FONTS: the method for emulating
      dot-matrix in a device-independent way uses a LARGE NUMBER of arithmetic
      calculations.  My most recent run to produce ONE set of the JIS fonts 
      took over 39 hours on a NeXT; :) fortunately you only have to make
      the fonts once.
      Once the fonts are made, put the pmJ.tex, pmC.tex and pmCs.tex files
      in places where TeX can find them.  Attempt to print the pmjintro/
      pmcintro/pmsintro files, and see what happens.
      
WARNING WARNING WARNING  The Poor Man's Japanese and Chinese systems are
      working, but by no means finished, systems.  It will not be a trivial
      task to install them and get them to work.  You really should not
      expect to be able to do it if you have minimal TeX experience, and
      it would be very helpful if you had run METAFONT before.  You may not
      want to try this at home.

TeX 3.0 VERSIONS -- IMPLEMENTATION BUG AFFECTING pmJ and pmC
      TeX 3.0 based on web2c, including UNIX TeX 3.0, needs to be fixed for
      pmJ or pmC to run.  Either define NONASCII in the Makefile, or
      modify file extra.c, commenting out the one section which unfortunately
      obliterates characters with the high bit set:

		boolean zinputln(f)
		FILE *f;
		{
		    register int i;
		
		    last = first;
		#ifdef	BSD
		    if (f == stdin) clearerr(stdin);
		#endif
		    while ( last < bufsize && ((i = getc(f)) != EOF) && i != '\n') {
		/* series of changes here, we seem to have forgotten about the 8-bit part */
		/* #ifdef	NONASCII */
			buffer[last++] = i;
		/* #else */
		/*	buffer[last++] = (i > 127 || i < 0)?' ':i; */
		/* oops! everything with high-bit set just turned into a space */
		/* #endif */
		    }

       Thanks, by the way, to the authors of web2c, for a nice pathway
       to TeX.  Versions of UNIX TeX 3.1 (and presumably onward) do not 
       need any fixing.
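
       To see why the unpatched code breaks pmJ and pmC, here is a small
       stand-alone sketch in C.  The function names are hypothetical;
       only the conditional expression is copied from the excerpt of
       extra.c above.

```c
#include <stddef.h>

/* The expression from the #else branch: anything outside 0..127
   is replaced by a space. */
int pm_seven_bit(int i)
{
    return (i > 127 || i < 0) ? ' ' : i;
}

/* The NONASCII branch: the byte is stored unchanged. */
int pm_eight_bit(int i)
{
    return i;
}

/* Count the bytes of s that the seven-bit path would alter. */
size_t pm_count_lost(const unsigned char *s, size_t len)
{
    size_t lost = 0;
    size_t k;

    for (k = 0; k < len; k++) {
        if (pm_seven_bit(s[k]) != pm_eight_bit(s[k]))
            lost++;
    }
    return lost;
}
```

       Every byte of a two-byte GB or Shift-JIS code that has the high
       bit set reaches TeX as a space on the seven-bit path, so the
       \active characters that pmJ and pmC depend on never arrive.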


TEST REPORTS
       Please send reports, bug reports, bug fixes, improvements,
       interesting examples, and so forth related to Poor Man's language
       versions to
           ridgeway@blackbox.hacc.washington.edu
       or
           Thomas Ridgeway, Director
           Humanities and Arts Computing Center, DR-10
           University of Washington,
           Seattle WA 98195 USA
           telephone (206)-543-4218


12/3/90
Tom Ridgeway
Seattle