[sci.lang] Resource name for Chinese sounds

lai@Apple.COM (Ed Lai) (12/20/88)

There are about a thousand sounds in Chinese. If someone
is willing to digitize each of them, the sounds can be
used from any stack. What we need is a naming convention
for the sounds. A name should be easy to derive from any
of the Romanization systems, so that the same names can
serve all of them. The names should also be short, so
that they do not take up a lot of room. Being able to
recognize a sound from its name is not an important
issue, since the name is mainly an internal
representation, but it would still be nice if the name
gave you a good idea of what the sound is.

When I first worked on the Chinese dictionary stack, I
came up with a naming convention. It was modified
slightly in version 0.2. There is an effort in Taiwan to
make a complete stack out of the CCDB database with
digitized professional pronunciations. I am in discussion
with them to try to adopt a common naming convention. If
that happens, this will become a standard, so I would
like to present it here to solicit comments.

The basic scheme is that a resource name contains three
letters: the first letter indicates the consonant, the
second letter indicates the vowel part, and the third
letter indicates the tone.

The first letter represents the consonant as follows:

      Code         Pinyin       Juiyin 2nd Form

      B            B            B
      P            P            P
      M            M            M
      F            F            F
      D            D            D
      T            T            T
      N            N            N
      L            L            L
      G            G            G
      K            K            K
      H            H            H
      J            J            J
      Q            Q            CH
      X            X            SH
      V            ZH           J
      W            CH           CH
      Y            SH           SH
      R            R            R
      Z            Z            TZ
      C            C            TS
      S            S            S
      A            Has no initial consonant
      E            Exceptions (see below).
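
As a rough sketch (in Python, not the HyperTalk used in the
stack), the Pinyin column of the table above could be applied
like this. The names PINYIN_INITIAL_CODE and split_initial are
made up for illustration, and Pinyin spellings that begin with
y or w are not handled here.

```python
# First-letter codes from the table above, keyed by Pinyin initial.
# Both names below are hypothetical, chosen only for this sketch.
PINYIN_INITIAL_CODE = {
    "b": "B", "p": "P", "m": "M", "f": "F",
    "d": "D", "t": "T", "n": "N", "l": "L",
    "g": "G", "k": "K", "h": "H",
    "j": "J", "q": "Q", "x": "X",
    "zh": "V", "ch": "W", "sh": "Y", "r": "R",
    "z": "Z", "c": "C", "s": "S",
}


def split_initial(syllable):
    """Return (first-letter code, remainder) for a toneless Pinyin syllable."""
    s = syllable.lower()
    # Try the two-letter initials (zh, ch, sh) before the one-letter ones.
    for length in (2, 1):
        head = s[:length]
        if head in PINYIN_INITIAL_CODE:
            return PINYIN_INITIAL_CODE[head], s[length:]
    # No initial consonant: the table assigns code A.
    return "A", s


print(split_initial("zhang"))  # ('V', 'ang')
print(split_initial("ma"))     # ('M', 'a')
print(split_initial("an"))     # ('A', 'an')
```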

The second letter indicates the vowels. Unless the first
letter is E, the second letter has the following meaning:

      Code         Pinyin       Juiyin 2nd Form

      0            A            A
      1            O            O
      2            E            E
      3            AI           AI
      4            EI           EI
      5            AO           AU
      6            OU           OU
      7            AN           AN
      8            EN           EN
      9            ANG          ANG
      A            ENG          ENG
      B            ER           ER
      C            I            I
      D            IA           IA
      E            IE           IE
      F            IAO          IAU
      G            IU(IOU)      IOU
      H            IAN          IAN
      I            IN           IN
      J            IANG         IANG
      K            ING          ING
      L            U            U
      M            UA           UA
      N            UO           UO
      O            UAI          UAI
      P            UI(UEI)      UEI
      Q            UAN          UAN
      R            UN(UEN)      UEN
      S            UANG         UANG
      T            UENG(ONG)    UNG
      U            U:           IU
      V            UE:          IUE
      W            UAN:         IUAN
      X            UN:          IUN
      Y            IONG         IUNG
      Z            Consonant only, no vowel
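
The codes in this table run 0-9 and then A-Z in row order, so
a lookup table can be built by pairing the rows with that
sequence. The following Python sketch does this for the Pinyin
column. The names are hypothetical, "u:" follows the table's
own spelling, and the empty string stands for the "consonant
only" row.

```python
import string

# Pinyin finals in table order; "" is the "consonant only" row (code Z).
PINYIN_FINALS = [
    "a", "o", "e", "ai", "ei", "ao", "ou", "an", "en", "ang",
    "eng", "er", "i", "ia", "ie", "iao", "iu", "ian", "in", "iang",
    "ing", "u", "ua", "uo", "uai", "ui", "uan", "un", "uang", "ong",
    "u:", "ue:", "uan:", "un:", "iong", "",
]

# Codes 0-9 then A-Z, in the same order as the rows of the table.
VOWEL_CODES = string.digits + string.ascii_uppercase

FINAL_TO_CODE = dict(zip(PINYIN_FINALS, VOWEL_CODES))
CODE_TO_FINAL = dict(zip(VOWEL_CODES, PINYIN_FINALS))

# Alternate spellings shown in parentheses in the table.
FINAL_TO_CODE.update({"iou": "G", "uei": "P", "uen": "R", "ueng": "T"})

print(FINAL_TO_CODE["ang"])   # 9
print(FINAL_TO_CODE["iong"])  # Y
print(CODE_TO_FINAL["U"])     # u:
```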

If the first letter is E, the second letter has a
different meaning:

      E1                         YO
      E3           (YAI)         YAI
      EH           (EH)          E

The third letter is one of 1, 2, 3, 4, or 5 for the tones.
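
Putting the three parts together, a name could be built and
taken apart as in the Python sketch below. This is only an
illustration of the scheme, not the HyperTalk conversion
script. The function and table names are hypothetical, the
syllable is assumed to be already split into a Pinyin initial
and final, and the E exception codes are left out.

```python
import string

# Tables transcribed from the Pinyin columns of the scheme above.
# All names here are hypothetical; the E exception codes are left out.
INITIALS = {
    "b": "B", "p": "P", "m": "M", "f": "F", "d": "D", "t": "T",
    "n": "N", "l": "L", "g": "G", "k": "K", "h": "H", "j": "J",
    "q": "Q", "x": "X", "zh": "V", "ch": "W", "sh": "Y", "r": "R",
    "z": "Z", "c": "C", "s": "S", "": "A",  # "" = no initial consonant
}
FINALS = ("a o e ai ei ao ou an en ang eng er i ia ie iao iu ian in iang "
          "ing u ua uo uai ui uan un uang ong u: ue: uan: un: iong"
          ).split() + [""]              # "" = consonant only (code Z)
FINAL_CODES = dict(zip(FINALS, string.digits + string.ascii_uppercase))


def resource_name(initial, final, tone):
    """Build the three-letter name from an already-split syllable."""
    if tone not in (1, 2, 3, 4, 5):
        raise ValueError("tone must be 1-5")
    return INITIALS[initial] + FINAL_CODES[final] + str(tone)


def parse_name(name):
    """Invert resource_name; returns (initial, final, tone)."""
    first, second, third = name
    if first == "E":
        raise ValueError("E exception codes are not handled in this sketch")
    initial = {code: ini for ini, code in INITIALS.items()}[first]
    final = {code: fin for fin, code in FINAL_CODES.items()}[second]
    return initial, final, int(third)


print(resource_name("m", "a", 1))     # M01
print(resource_name("zh", "ong", 1))  # VT1
print(resource_name("", "ai", 4))     # A34
print(parse_name("NU3"))              # ('n', 'u:', 3)
```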

There are still minor holes in the system; for example,
NG is not listed. NG can either be one of the E
exceptions or take one of the unused first letters
(I, O, U).
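
The letters I, O, and U are the ones that the first-letter
table leaves unassigned, which a quick check in Python
confirms (the variable names are only for illustration):

```python
import string

# First letters assigned in the consonant table, including
# A (no initial consonant) and E (exceptions).
USED_FIRST_LETTERS = set("BPMFDTNLGKHJQXVWYRZCSAE")

unused = sorted(set(string.ascii_uppercase) - USED_FIRST_LETTERS)
print(unused)  # ['I', 'O', 'U']
```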

This is the scheme as it stands now. It may not be
perfect, but HyperTalk script has been written to convert
from BoPoMoFo, PinYin, JuiYin 2nd Form, Wade-Giles, and
Yale into the internal representation, so it works.

A possible improvement to the scheme would be to remap
the vowels so that, on a QWERTY keyboard, the single
vowels are on one row (the home row?), the i double
vowels on another, the u vowels on another, and the u:
vowels on the bottom row. The columns should be arranged
so that all the ANs are in one column, the ANGs in
another, and so on. In this way a user may be able to
type in a sound directly without going through a
translation routine. However, so far I have not come up
with a completely satisfactory arrangement.

Even more ambitious is the possibility that we can come
up with a scheme that covers all the major Chinese
dialects. I am just not qualified to attempt this.

I am open to suggestions, but I must also convince
another party that has a deadline to meet, so I can only
act on comments that come in the near future.


/* Disclaimer: All statements and opinions expressed are my own */
/* Edmund K. Lai                                               */
/* Apple Computer, MS42-C                                      */
/* 20525 Mariani Ave,                                          */
/* Cupertino, CA 95014                                         */
/* (408)974-6272                                               */