[comp.std.internat] ISO 8859

rlee@deimos.ads.com (Richard Lee) (05/10/88)

Can someone tell me exactly what is defined by ISO standard 8859?
Thanks.


  ARPA:  rlee@ads.com            | RICHARD LEE |    Don't take life
  UUCP:  ...!{sri-spam, ames}!zodiac!rlee           so serious, son...
  USPS:  1500 Plymouth St, Mt. View CA 94043        it ain't *nohow*
  PHONE: 415-960-7300                               permanent.

glennw%noddy@Sun.COM (Glenn P. Wright) (05/11/88)

In article <3801@zodiac.UUCP>, rlee@deimos.ads.com (Richard Lee) writes:
> Can someone tell me exactly what is defined by ISO standard 8859?
> Thanks.

This is the ISO standard for single byte 8-bit encodement of graphical
character shapes. 
There are 6 (currently) subsets to this standard, the most popular
being IS-8859/1 which defines most characters required to support W.European
languages. One key feature of 8859 is that it includes 7-bit US ASCII 
representation in the bottom half of each and every subset of the standard. 

Glenn Wright. {..}glennw@sun or {..sun}!glennw
=====================
Sun Microsystems,  2550 Garcia Avenue,	Mountain View 
California 94043.	Tel (1) 415 691 6848
Glenn Wright
============
Sun Microsystems Inc, Mountain View, California, USA.
Tel: (415) 960 1300

lasko@video.dec.com.UUCP (05/13/88)

In response to: Richard Lee (rlee@ads.com)

ISO 8859 consists of several parts, each part specifying a set of up to 191
graphic characters and the coded representations thereof by means of a single
8-bit byte.  The use of control functions for the coded representation of
composite characters (like o [backspace] /) is prohibited. 
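
To make the composite-character point concrete, here is a minimal Python
sketch. It assumes Python's bundled iso8859_1 codec as a stand-in for part 1;
the helper name encode_part1 is made up for this example and is not part of
the standard.

```python
def encode_part1(text: str) -> bytes:
    """Encode text with Python's iso8859_1 codec (assumed stand-in for ISO 8859/1)."""
    return text.encode("iso8859_1")


# Precomposed o-with-stroke: one graphic character, one 8-bit byte.
precomposed = encode_part1("\u00f8")

# Overstrike sequence o, BACKSPACE, /: three bytes, one of them a control
# function. This is the style of composition the standard prohibits.
overstrike = bytes([ord("o"), 0x08, ord("/")])

print(precomposed.hex(), len(precomposed))
print(overstrike.hex(), len(overstrike))
```

The precomposed form occupies a single code position, so no control function
is needed to build the character.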

Another major feature of each of the parts is that the "left hand" part, or the
lower 94 graphic characters, is exactly the same as ASCII.  The "right hand"
part, or the higher-order 96 characters, is a mix of characters and symbols
(paragraph sign, fractions, special punctuation, etc.) useful for the region
covered by that part of the standard.  Note that bit combinations A0 hex and FF
hex are valid graphic characters (not control characters) in this
character code.
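
The layout just described can be sketched as a small classifier. This is only
an illustration in Python: the function name and region labels are invented
here, and the boundaries are the usual 8-bit code table ranges (20 hex for
SPACE, 21-7E for the left half, A0-FF for the right half).

```python
from collections import Counter


def classify(byte: int) -> str:
    """Return the region of the 8-bit code table that a byte value falls in."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not an 8-bit value")
    if byte <= 0x1F or byte == 0x7F:
        return "control"      # C0 controls and DELETE, not defined by ISO 8859
    if byte == 0x20:
        return "space"
    if byte <= 0x7E:
        return "left"         # 21-7E hex: the ASCII graphic characters
    if byte <= 0x9F:
        return "control"      # C1 area, not defined by ISO 8859
    return "right"            # A0-FF hex: graphic characters specific to a part


# Tally how many of the 256 code positions fall in each region.
region_sizes = Counter(classify(b) for b in range(256))
print(region_sizes)
```

Counting the regions gives 94 left-hand and 96 right-hand positions; adding
SPACE accounts for the "up to 191" figure quoted above.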

The parts of 8859 are simply character sets; they don't define keyboards, code
extension mechanisms, or anything else.  The character codes are based on the
July 1986 version of ISO 4873, which specifies rules for eight-bit character
codes, much as the ISO 646 standard specified basic rules for seven-bit
character codes.

Each part of ISO 8859 is intended for use with a different set of languages or
scripts: 

Part Name                     Status*    Language/Region/Script

 1   Latin Alphabet No 1     IS Feb 87   "Western European" 
 2   Latin Alphabet No 2     IS Feb 87   "Eastern European" 
 3   Latin Alphabet No 3     IS Mar 88   "Southern European" + S. Africa
 4   Latin Alphabet No 4     IS Mar 88   Majority Scandinavian
 5   Latin-Cyrillic Alphabet tbp IS 88   ASCII + Cyrillic characters
 6   Latin-Arabic Alphabet   IS Aug 87   ASCII + Arabic characters*
 7   Latin-Greek Alphabet    IS Nov 87   ASCII + Greek characters              
 8   Latin-Hebrew Alphabet   tbp IS 88   ASCII + Hebrew characters
 9   Latin Alphabet No 5     proposed    modification of pt. 3 by Turkey

* Status key:   IS - approved international standard published at indicated date
            tbp IS - standard is approved, but not yet published
          proposed - draft text hasn't yet entered ISO ballot cycle
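
As a rough illustration of how the parts differ, the sketch below decodes the
same two bytes under three of them. It assumes the codecs that Python ships
under the names iso8859_1 through iso8859_9; those names and tables are
Python's, and they may not match every draft listed above.

```python
# Part number -> Python codec name. The names are Python's, not the standard's.
CODECS = {part: f"iso8859_{part}" for part in range(1, 10)}


def decode_in_part(data: bytes, part: int) -> str:
    """Decode bytes using the codec Python ships for the given part."""
    return data.decode(CODECS[part])


# 41 hex is "A" in the left half; E6 hex lies in the right half.
sample = bytes([0x41, 0xE6])

for part in (1, 5, 7):
    print(part, decode_in_part(sample, part))
```

In these tables the first byte decodes to "A" under every part, while the
second comes out as a Latin, Cyrillic, or Greek letter depending on the part
chosen.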

The repertoires of parts 5 through 8 have been worked out with relevant experts
in the affected countries, and in many cases form national standards as well. 

ISO 8859/1, Latin Alphabet No 1, is probably the one most U.S. manufacturers
will want to be concerned with, since it covers the repertoires for the
following languages:  Danish, Dutch, English, Faeroese, Finnish, French,
German, Icelandic, Irish, Italian, Norwegian, Portuguese, Spanish, and Swedish
[Flemish too, if you consider it a separate language; the repertoire does not
cover Welsh].
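
One way to check whether a given piece of text fits the Latin Alphabet No 1
repertoire is simply to try encoding it. The sketch below assumes Python's
iso8859_1 codec; the function name and the sample words are chosen only for
illustration.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text has a code in Python's iso8859_1."""
    try:
        text.encode("iso8859_1")
    except UnicodeEncodeError:
        return False
    return True


# Sample words; the Welsh one contains w-circumflex, which has no code in part 1.
samples = {
    "Danish": "smørrebrød",
    "Icelandic": "Þingvellir",
    "German": "Straße",
    "Spanish": "señor",
    "Welsh": "dŵr",
}

for language, word in samples.items():
    print(language, word, fits_latin1(word))
```

The Welsh sample fails because of the w-circumflex, which is consistent with
the note above that Welsh is not covered.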

That's probably more than you needed...let me know if you'd like more details.

==================
Tim Lasko,  Digital Equipment Corporation,  Maynard, MA
                   "There are no temporary workarounds..."
lasko@video.dec.com    lasko%video.dec@decwrl    decwrl!video.dec.com!lasko

jch@apollo.uucp (Jan Hardenbergh) (05/14/88)

>In article <3801@zodiac.UUCP>, rlee@deimos.ads.com (Richard Lee) writes:
>> Can someone tell me exactly what is defined by ISO standard 8859?
>> Thanks.

> This is the ISO standard for single byte 8-bit encodement of graphical
> character shapes. 
> There are 6 (currently) subsets to this standard, the most popular
> being IS-8859/1 which defines most characters required to support W.European
> languages. One key feature of 8859 is that it includes 7-bit US ASCII 
> representation in the bottom half of each and every subset of the standard. 

> Glenn Wright. {..}glennw@sun or {..sun}!glennw

It is important to distinguish a character set from a font. ISO 8859/1 is
a character set. A "single byte 8-bit encodement of graphical character
shapes" means that a certain bit pattern stands for a certain character:
an "a" is an "a". It does not specify a particular graphical representation
the way a font, such as Helvetica, does. 8859/1 is also called ISO Latin
Alphabet #1.

Jan Hardenbergh    {decvax,mit-eddie,umix}!apollo!jch    Apollo Computer

frisk@rhi.hi.is (Fridrik Skulason) (05/15/88)

In article <52702@sun.uucp> glennw%noddy@Sun.COM (Glenn P. Wright) writes:
> One key feature of 8859 is that it includes 7-bit US ASCII 
>representation in the bottom half of each and every subset of the standard. 
>

Not every subset - at least one has the international currency sign instead
of the dollar sign at position 24 (hex).

-- 
         Fridrik Skulason          University of Iceland
         UUCP  frisk@rhi.uucp      BIX  frisk

     This line intentionally left blank ...................