[sci.lang.japan] Hershey Japanese characters

lee@uhccux.uhcc.hawaii.edu (Greg Lee) (10/09/88)

A short while ago Jon Greenblatt posted to comp.graphics a program
hersh.c to display character shapes from the Hershey data base.  Very
useful.  I used it to get a printout of the "oriental" (Japanese)
characters, and I thought others might be interested in what is there.
I found 758 characters:  603 kanji, 76 hiragana, 76 katakana, period,
comma, dash.

I would be interested in opinions about whether the data is useful,
and if so, how it can be used.  Maybe other systems for displaying
Japanese have been developed that would make it wasted effort to
do anything further with the Hershey data -- I don't know.  If the data
is used, how should the characters be represented in a text file?  Do
you give a number for each kanji character, or what?

Anyhow, here is some more detail about the contents of the oriental data
base.  The characters are numbered in the data base from 1 through 6202
-- not all numbers in this range are used.  Numbers 1-5445 are kanji,
and the numbers correspond to those assigned in Andrew N. Nelson, The
Modern Reader's Japanese-English Character Dictionary.  I cannot tell
what system has been used to select which kanji to include.  In appendix
12, Nelson gives the "Toyo Kanji Lists", a grouping by commonness of
usage, I gather.  All the characters in "grade one" are present in the
Hershey data, and all those in "grade two" are, except for 380, 381,
858, 1066, 2191, 2260, 2433, 2489, and 2995.  I didn't check any
further.

Numbers 6000-6050 are hiragana in AIUEO order, corresponding to the
chart in Nelson, app. 3, p. 1013, reading down the columns first.
Numbers 6055-6079 are hiragana with diacritics, in the order of Nelson's
chart on p. 1015.  Numbers 6100-6150 and 6155-6179 are the corresponding
katakana.

Numbers 6200-6202 are period, comma, and dash.
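
To make the numbering concrete, here is a rough sketch in Python of how a
program might sort a character number into the classes described above.
The names RANGES, classify and slot_count are made up for illustration
and do not come from hersh.c or from the Hershey data files; the ranges
are only the ones listed in this post.

```python
# Number ranges as described in the post. All names here are hypothetical.
RANGES = [
    (1, 5445, "kanji"),
    (6000, 6050, "hiragana"),
    (6055, 6079, "hiragana with diacritics"),
    (6100, 6150, "katakana"),
    (6155, 6179, "katakana with diacritics"),
    (6200, 6202, "punctuation"),
]

PUNCTUATION = {6200: "period", 6201: "comma", 6202: "dash"}


def classify(number):
    """Return the class name for a character number, or None if the
    number falls outside every range listed above.

    A "kanji" result only means the number lies in the Nelson range;
    it does not mean the data base actually contains that character.
    """
    for low, high, label in RANGES:
        if low <= number <= high:
            return label
    return None


def slot_count(prefix):
    """Count the numbers available in all ranges whose label starts
    with the given prefix (for example "hiragana")."""
    return sum(high - low + 1
               for low, high, label in RANGES
               if label.startswith(prefix))


if __name__ == "__main__":
    for n in (1, 6000, 6051, 6060, 6155, 6201, 7000):
        print(n, classify(n), PUNCTUATION.get(n, ""))
```

The hiragana and katakana ranges each hold 51 + 25 numbers, which agrees
with the 76 characters of each kind counted above.  The kanji range is
sparse, so a real program would still have to check whether a given
number is present in the data.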

		---------------

For reference, the Hershey data and related programs that have appeared
in mod.sources/comp.sources.unix are listed below.  'her2vfont' provides
for composing Unix vfont-format files from the Hershey data, and
'hershtools' provides for composing PostScript stroked fonts and TeX
.tfm files.

Name         Vol.  Description
hershey        4   Hershey Fonts (5 files)
hershey.f77    4   Hershey Fonts in Fortran 77 (2 files)
her2vfont      8   Hershey fonts to 'vfont' rasterizer
hershtools    12   Hershey font manipulation tools and data (5 parts)

		Greg, lee@uhccux.uhcc.hawaii.edu