[comp.text] ASCII digraphs for ISO 8859-1 requested

npn@cbnewsl.att.com (nils-peter.nelson) (01/08/91)

I've received a proposal from Keizer, Simonsen and Akkerhuis
for two-character representations of the additional 128 characters
in ISO 8859-1.  Dennis Ritchie had already implemented a competing
convention for Research UNIX for keyboard entry. For example,
for the open French quote, or left angle quote mark, position 10/11,
K/S/A suggest \(Fo, Ritchie suggests \(<<.
If you have such a key on your keyboard, this is not an issue; if
you don't, is there a common digraph in existence? Viz., does
everyone type "dead-key char1 char2" the same way, or is it
hopelessly local?
To the point, are there any more proposals out there for how to
represent 8 bit characters in a 7 bit subset?
Mail to npn@mhuxo.att.com.
My goal is to have the troff convention be the same as the
keyboard convention most familiar to people, if such a thing exists.
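
To make the two competing spellings concrete, here is a minimal sketch of
how a \(xx escape could be expanded to the ISO 8859-1 byte at position
10/11. It assumes the usual column/row notation (byte = 16 * column + row).
The tables hold only the one example given above, and the names
position_to_byte, KSA, RITCHIE and expand are hypothetical, not part of
any troff implementation.

```python
import re

def position_to_byte(column: int, row: int) -> int:
    """Convert ISO column/row notation (e.g. 10/11) to a byte value."""
    return 16 * column + row

# Position 10/11 is the left angle quote mark from the post.
LEFT_ANGLE_QUOTE = position_to_byte(10, 11)

# One-entry tables: only the example quoted in the post, nothing more.
KSA = {"Fo": LEFT_ANGLE_QUOTE}       # K/S/A proposal: \(Fo
RITCHIE = {"<<": LEFT_ANGLE_QUOTE}   # Ritchie's convention: \(<<

# A backslash, an open parenthesis, then a two-character name.
ESCAPE = re.compile(r"\\\((..)")

def expand(text: str, table: dict) -> str:
    """Replace known \\(xx escapes with the Latin-1 character; keep unknown ones."""
    def repl(match):
        code = table.get(match.group(1))
        if code is None:
            return match.group(0)
        return bytes([code]).decode("latin-1")
    return ESCAPE.sub(repl, text)
```

Under either table the same 8-bit character comes out; the proposals
differ only in the 7-bit name typed by the user.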

yfcw14@castle.ed.ac.uk (K P Donnelly) (01/09/91)

npn@cbnewsl.att.com (nils-peter.nelson) writes:

>To the point, are there any more proposals out there for how to
>represent 8 bit characters in a 7 bit subset?
>My goal is to have the troff convention be the same as the
>keyboard convention most familiar to people, if such a thing exists.

Surely by far the most common keyboard convention for ISO 8859-1 is that
used on VT320 terminals, in which for example you generate the 8 bit
character "half" by pressing "Compose-character" then "1" then "2".
Some other examples are:
                quarter              1 4
                a-acute              a '   or  ' a
                A-acute              A '   or  ' A
                pound sign           L -   or  - L  
                French quote marks   < <
                                     > >
                superscript 2        2 ^
                degree sign          0 ^
                plus or minus        + -
Any user's manual for a VT320 or compatible gives full details.
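
As an illustration, the compose pairs listed above can be written as a
small lookup table. This is only a sketch: it contains just the entries
shown in this post (not the full VT320 table), it accepts the reversed
order only where "or" is given above, and the names COMPOSE and compose
are hypothetical.

```python
# Compose pairs from the list above, mapped to ISO 8859-1 characters.
COMPOSE = {
    ("1", "2"): "\u00bd",   # half
    ("1", "4"): "\u00bc",   # quarter
    ("a", "'"): "\u00e1",   # a-acute
    ("'", "a"): "\u00e1",
    ("A", "'"): "\u00c1",   # A-acute
    ("'", "A"): "\u00c1",
    ("L", "-"): "\u00a3",   # pound sign
    ("-", "L"): "\u00a3",
    ("<", "<"): "\u00ab",   # French quote marks
    (">", ">"): "\u00bb",
    ("2", "^"): "\u00b2",   # superscript 2
    ("0", "^"): "\u00b0",   # degree sign
    ("+", "-"): "\u00b1",   # plus or minus
}

def compose(first: str, second: str):
    """Return the character for a compose pair, or None if it is not in the table."""
    return COMPOSE.get((first, second))

# Every entry fits in a single ISO 8859-1 byte.
assert all(len(ch.encode("latin-1")) == 1 for ch in COMPOSE.values())
```

For example, compose("1", "2") gives the "half" character, and
compose("a", "'") and compose("'", "a") give the same a-acute.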

   Kevin Donnelly

keld@login.dkuug.dk (Keld J|rn Simonsen) (01/12/91)

yfcw14@castle.ed.ac.uk (K P Donnelly) writes:

>npn@cbnewsl.att.com (nils-peter.nelson) writes:

>>To the point, are there any more proposals out there for how to
>>represent 8 bit characters in a 7 bit subset?
>>My goal is to have the troff convention be the same as the
>>keyboard convention most familiar to people, if such a thing exists.

>Surely by far the most common keyboard convention for ISO 8859-1 is that
>used on VT320 terminals, in which for example you generate the 8 bit
>character "half" by pressing "Compose-character" then "1" then "2".

Well, well. I did some other work on this, and also had a look
at the VT320 names. What I wanted to do was to have ASCII encodings
of all the ISO 8859 character sets and also other character sets.
I found that the VT320 codes were fine for ISO 8859-1, but when
all the parts of ISO 8859 (there are about 10 parts) were to be coded,
there were conflicts in the naming.

I now have a set of more than 1300 character names in 2 character
ASCII (or actually invariant ISO 646), which are used for defining
POSIX locales and in email. I also have tables of the encoding
of about 60 character sets with these two-char names.

Unfortunately these names are incompatible with Ossanna/Kernighan
titroff. And thus the names in K/S/A are incompatible with this extended
list. And I was involved in both lists.... The 1300 character list
was a lot bigger than the K/S/A list and therefore had to be designed
more consistently and carefully. One fundamental design decision was to
shift around the letter position in letter names, so titroff *a is
now a* in the 1300 char list. 
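
The letter-position shift can be sketched as follows. The only example
given here is *a becoming a*, so this assumes the rule is a plain swap
for two-character names that begin with "*"; whether other names follow
the same rule is not stated, and the function name is hypothetical.

```python
def troff_star_name_to_mnemonic(name: str) -> str:
    """Move a leading '*' to the end ('*a' -> 'a*'); other names pass through."""
    if len(name) == 2 and name[0] == "*":
        return name[1] + "*"
    return name
```

Names that do not start with "*" are returned unchanged, since nothing
above says how they map.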

The 1300 char list includes the following: extended Latin, Greek,
Cyrillic, Hebrew, Arabic, mathematics, hiragana, katakana, bopomofo.

Two ideas for this:
1. A titroff list of the 1300 characters (or the like) could be
   done - which could be as compatible as possible with the big list
   I have now.
2. The new titroff could have a specification of input character set
   - thus more markets could be opened up for this product.

The 1300 char list (and some code to handle it) is available by
anon ftp in dkuug.dk:pub/ch.shar*

Keld Simonsen