[comp.protocols.iso] ISO-LATIN-1?

jtkohl@athena.mit.edu (John T Kohl) (06/21/89)

Can someone point me to the ISO reference defining ISO-LATIN-1?

Thanks.

John Kohl <jtkohl@ATHENA.MIT.EDU> or <jtkohl@Kolvir.Brookline.MA.US>
Digital Equipment Corporation/Project Athena
(The above opinions are MINE.  Don't put my words in somebody else's mouth!)

jch@apollo.COM (Jan Hardenbergh) (06/27/89)

> Message-ID: <12108@bloom-beacon.MIT.EDU>
> Reply-To: jtkohl@athena.mit.edu (John T Kohl)

> Can someone point me to the ISO reference defining ISO-LATIN-1?

Here are some old articles that say quite a bit about ISO-LATIN-1,
alias ISO 8859/1. Note that this info is derived from an X include file.
It is followed by an enumeration of the other 8859/? codes & characters.

If you are stuck trying to get ISO documents you might try ANSI
in New York, (212)-642-4900.

-Jan (Yon) Hardenbergh - jch@apollo.com - (508)-256-6600
Apollo, a subsidiary of HP

----------------------------------------------------------------
Article 169 of comp.std.unix:
Path: apollo!ulowell!masscomp!uunet!longway!std-unix
From: guy@Sun.COM (Guy Harris)
Newsgroups: comp.std.unix
Subject: Re: 8-Bit ASCII Standard on UNIX-POSIX
Message-ID: <161@longway.TIC.COM>
Date: 8 Apr 88 05:38 GMT
Sender: std-unix@longway.TIC.COM
Reply-To: guy@Sun.COM (Guy Harris)
Lines: 136
Approved: jsq@longway.tic.com (Moderator, John S. Quarterman)

> To possibly add to the list, this sounds like the character set
> Microsoft Windows uses and terms (by no standard I know of) "ANSI".
> It has the vowels in acute, grave, circumflex, tilde, and umlaut.
> The high bit characters also include cent, pound, yen, and universal
> currency symbols, circle-R trademark and circle-C copyright symbols,
> inverted ? and !, section and paragraph symbols, << guillemets >>,
> several accents, 1/4, 1/2, and 3/4 characters, and superscripted 1, 2,
> and 3.  The last sound like a bad idea to me, so I actually hope this
> is something they threw together themselves.

> Sound like ISO 8859?

Yes.  The superscripted digits *do* come from ISO 8859 (see below).

> What I would also like to see is the ASCII 0..1F (31 dec.) graphic
> representations on new machines conform to the ANSI standard.  They
> might look impractical, but after setting up a font using them on my
> micro, it's amazing how much sense they make to me.

What "graphic representations" are you referring to?  The only ANSI standard I
know of for characters in the range 0x00 to 0x1f is ASCII, which says they're
*control* characters, not *printable* characters.

For your collective amusement, here is a chart of ISO 8859/1 or "ISO Latin
Alphabet #1".  This was derived by some quick hacking on the X11 include file
"keysymdef.h" - yes, X11 uses the ISO character sets as well.

non-breaking space		0xa0
inverted exclamation point	0xa1
cent sign			0xa2
pounds sterling			0xa3
"currency symbol"		0xa4
yen				0xa5
broken bar			0xa6
section mark			0xa7
diaeresis			0xa8
copyright			0xa9
feminine ordinal		0xaa
		(this is a superscripted lower-case "a", underlined)
left guillemet			0xab
		(French left quote, looks like small "<<")
not sign			0xac
soft hyphen			0xad
registered trademark		0xae
macron				0xaf
		(an elevated small horizontal bar)
degree symbol			0xb0
plus/minus			0xb1
superscript 2			0xb2
superscript 3			0xb3
acute accent			0xb4
mu				0xb5
paragraph symbol		0xb6
small centered dot		0xb7
cedilla				0xb8
superscript 1			0xb9
masculine ordinal		0xba
		(this is a superscripted lower-case "o", underlined)
right guillemet			0xbb
		(French right quote, looks like small ">>")
1/4				0xbc
1/2				0xbd
3/4				0xbe
inverted question mark		0xbf
A with grave accent		0xc0
A with acute accent		0xc1
A with circumflex accent	0xc2
A with tilde			0xc3
A with diaeresis		0xc4
A with ring			0xc5
		(as in "Angstrom")
AE diphthong			0xc6
C with cedilla			0xc7
E with grave accent		0xc8
E with acute accent		0xc9
E with circumflex accent	0xca
E with diaeresis		0xcb
I with grave accent		0xcc
I with acute accent		0xcd
I with circumflex accent	0xce
I with diaeresis		0xcf
upper-case eth			0xd0
		(eth is an Icelandic letter)
N with tilde			0xd1
O with grave accent		0xd2
O with acute accent		0xd3
O with circumflex accent	0xd4
O with tilde			0xd5
O with diaeresis		0xd6
multiply sign			0xd7
O with slash			0xd8
U with grave accent		0xd9
U with acute accent		0xda
U with circumflex accent	0xdb
U with diaeresis		0xdc
Y with acute accent		0xdd
upper-case thorn		0xde
		(thorn is an Icelandic letter)
German double-s			0xdf
a with grave accent		0xe0
a with acute accent		0xe1
a with circumflex accent	0xe2
a with tilde			0xe3
a with diaeresis		0xe4
a with ring			0xe5
		(lower-case "A with ring")
ae diphthong			0xe6
c with cedilla			0xe7
e with grave accent		0xe8
e with acute accent		0xe9
e with circumflex accent	0xea
e with diaeresis		0xeb
i with grave accent		0xec
i with acute accent		0xed
i with circumflex accent	0xee
i with diaeresis		0xef
lower-case eth			0xf0
n with tilde			0xf1
o with grave accent		0xf2
o with acute accent		0xf3
o with circumflex accent	0xf4
o with tilde			0xf5
o with diaeresis		0xf6
division sign			0xf7
o with slash			0xf8
u with grave accent		0xf9
u with acute accent		0xfa
u with circumflex accent	0xfb
u with diaeresis		0xfc
y with acute accent		0xfd
lower-case thorn		0xfe
y with diaeresis		0xff
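
The chart above can be spot-checked mechanically.  What follows is only a
minimal sketch, and it leans on things the chart itself does not mention:
Python's "latin-1" codec and the Unicode character names in the standard
"unicodedata" module.  The helper name describe() is made up for the example.
Each 8859/1 code from 0xa0 to 0xff decodes to the character with the same
numeric value.

```python
import unicodedata


def describe(byte_value: int) -> str:
    """Return one chart-style line for a single ISO 8859/1 code."""
    # Decode the single byte with Python's latin-1 codec.
    ch = bytes([byte_value]).decode("latin-1")
    # Show the code, the resulting code point, and its Unicode name.
    return f"0x{byte_value:02x}  U+{ord(ch):04X}  {unicodedata.name(ch)}"


# A few entries from the chart; any code in 0xa0..0xff works.
for code in (0xa0, 0xab, 0xd0, 0xff):
    print(describe(code))
```

The Unicode names will not match the informal names in the chart word for
word, but the codes line up one for one.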

Volume-Number: Volume 13, Number 49


Article 294 of comp.std.internat:
Path: apollo!ulowell!m2c!husc6!purdue!decwrl!video.dec.com!lasko
From: lasko@video.dec.com (Tim - DSG Terminals Architecture - 223-2186)
Newsgroups: comp.std.internat
Subject: RE: ISO 8859
Message-ID: <8805122042.AA21976@decwrl.dec.com>
Date: 12 May 88 23:19 GMT
Organization: Digital Equipment Corporation
Lines: 55

In response to: Richard Lee (rlee@ads.com)

ISO 8859 consists of several parts, each part specifying a set of up to 191
graphic characters and the coded representations thereof by means of a single
8-bit byte.  The use of control functions for the coded representation of
composite characters (like o [backspace] /) is prohibited. 

Another major feature of each of the parts is that the "left hand" part, or the
lower 94 graphic characters, is exactly the same as ASCII.  The "right hand"
part, or the higher-order 96 characters, is a mix of characters and symbols
(paragraph sign, fractions, special punctuation, etc.) useful for the region
covered by the part of the standard.  Note that bit combinations A0 hex and FF
hex constitute valid graphic characters (not control characters) in this
character code. 
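
The layout just described can be sketched as a small classifier.  This is an
illustration only: the function name and the region labels are made up for
the example, and treating 0x80 through 0x9F as reserved for control functions
is my reading of the eight-bit structure, not something stated above.  The
count at the end reproduces the "up to 191" figure: space, plus 94 left-hand
graphics, plus 96 right-hand graphics.

```python
def classify(byte_value: int) -> str:
    """Name the region of the 8-bit code table that a byte falls in."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("not an 8-bit value")
    if byte_value <= 0x1F:
        return "control"             # 0x00-0x1F, as in ASCII
    if byte_value == 0x20:
        return "space"
    if byte_value <= 0x7E:
        return "left-hand graphic"   # 0x21-0x7E, the 94 ASCII graphics
    if byte_value == 0x7F:
        return "delete"
    if byte_value <= 0x9F:
        return "reserved"            # 0x80-0x9F, assumed reserved for controls
    return "right-hand graphic"      # 0xA0-0xFF, including A0 and FF


graphic_regions = {"space", "left-hand graphic", "right-hand graphic"}
graphic_count = sum(classify(b) in graphic_regions for b in range(256))
print(graphic_count)
```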

The parts of 8859 are simply character sets, they don't define keyboards, code
extension mechanisms, or anything else.  The character codes are based on the
July 1986 version of ISO 4873 which specifies rules for eight-bit character
codes, similar to the ISO 646 standard, which specified basic rules for
seven-bit character codes. 

Each part of ISO 8859 is intended for use with a different set of languages or
scripts: 

Part Name                     Status*    Language/Region/Script

 1   Latin Alphabet No 1     IS Feb 87   "Western European" 
 2   Latin Alphabet No 2     IS Feb 87   "Eastern European" 
 3   Latin Alphabet No 3     IS Mar 88   "Southern European" + S. Africa
 4   Latin Alphabet No 4     IS Mar 88   Majority Scandinavian
 5   Latin-Cyrillic Alphabet tbp IS 88   ASCII + Cyrillic characters
 6   Latin-Arabic Alphabet   IS Aug 87   ASCII + Arabic characters*
 7   Latin-Greek Alphabet    IS Nov 87   ASCII + Greek characters              
 8   Latin-Hebrew Alphabet   tbp IS 88   ASCII + Hebrew characters
 9   Latin Alphabet No 5     proposed    modification of pt. 3 by Turkey

* Status key:   IS - approved international standard published at indicated date
            tbp IS - standard is approved, but not yet published
          proposed - draft text hasn't yet entered ISO ballot cycle
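
To see what "same left half, different right half" means for the parts listed
above, here is a rough sketch.  It assumes Python's bundled codecs
"iso8859_1" through "iso8859_9" correspond to parts 1 through 9 as finally
published, which may differ in detail from the drafts that were current when
the table was written.  The helper name is made up for the example.

```python
# Map part number -> Python codec name (an assumption about naming, see above).
PARTS = {n: f"iso8859_{n}" for n in range(1, 10)}


def decode_in_every_part(raw: bytes) -> dict[int, str]:
    """Decode the same bytes under each part of ISO 8859."""
    # errors="replace" keeps the sketch from failing on unassigned codes.
    return {n: raw.decode(codec, errors="replace") for n, codec in PARTS.items()}


left = decode_in_every_part(b"ISO 8859")   # only codes below 0x80
right = decode_in_every_part(b"\xe0")      # one code from the right-hand half

for n in PARTS:
    print(f"part {n}: {left[n]!r}   0xe0 -> U+{ord(right[n]):04X}")
```

The ASCII text comes out the same under every part; the single byte 0xe0 is a
Latin letter in some parts, a Cyrillic letter in part 5, and a Hebrew letter
in part 8.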

The repertoires of parts 5 through 8 have been worked out with relevant experts
in the affected countries, and in many cases form national standards as well. 

ISO 8859/1, Latin Alphabet No 1, is probably the one most U.S. manufacturers
will want to be concerned with, since it covers the repertoires for the
following languages:  Danish, Dutch, English, Faeroese, Finnish, French,
German, Icelandic, Irish, Italian, Norwegian, Portuguese, Spanish, and Swedish
[Flemish too, if you consider it a separate language; it doesn't include Welsh].
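
One way to test whether a given piece of text stays inside the 8859/1
repertoire is simply to try encoding it.  This is a sketch, assuming Python's
"latin-1" codec; the function name and the sample words are mine.  The Welsh
sample uses w with circumflex, which is the kind of letter Latin Alphabet
No 1 lacks.

```python
def fits_latin1(text: str) -> bool:
    """True if every character of text has a code in ISO 8859/1."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


# Non-ASCII letters are written as escapes so the file itself stays ASCII.
samples = {
    "Icelandic": "\u00deingvellir",   # upper-case thorn
    "Danish": "\u00c6r\u00f8",        # AE and o with slash
    "Spanish": "\u00bfni\u00f1o?",    # inverted ? and n with tilde
    "Welsh": "d\u0175r",              # w with circumflex, not in 8859/1
}

for language, word in samples.items():
    print(language, fits_latin1(word))
```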

That's probably more than you needed...let me know if you'd like more details.

==================
Tim Lasko,  Digital Equipment Corporation,  Maynard, MA
                   "There are no temporary workarounds..."
lasko@video.dec.com    lasko%video.dec@decwrl    decwrl!video.dec.com!lasko