[comp.std.internat] Questions about LATIN-1

ct@dde.dk (Claus Tondering) (04/16/91)

Could somebody please answer the following questions about
the ISO 8859-1 (LATIN-1) character set:

1) What is the currency symbol used for? And who uses it?
   (I am referring to the symbol in position 0xa4 that looks like
   a sun with four rays emanating from it.)

2) What are the feminine and masculine ordinal indicators used for?
   (The symbols in position 0xaa and 0xba that look like a small
   underlined a and o.) I believe they are used in Spanish, is that
   true and for what purpose?

3) What is the pilcrow sign used for? And who uses it?
   (The symbol in position 0xb6 that looks like an inverted P.)

4) Do the Germans consider the double-s (position 0xdf) a lower-case
   letter, or is it both upper and lower case?

5) Why does the standard contain the diacritical marks in separate symbol
   positions, when they are also found as parts of letters? When would
   you, for example, want to use the cedilla (character 0xb8) or the
   umlaut (character 0xa8) on its own?

-- 
Claus Tondering,         Dansk Data Elektronik A/S, Herlev, Denmark
E-mail: ct@dde.dk
------------------------------------------------------------------------------
1 ns < 2 cents < 10 V < 1 km < 100 degrees C < 20 mips < 4 MPa < 1,000,000,000

melby@daffy.yk.Fujitsu.CO.JP (John B. Melby) (04/17/91)

>1) What is the currency symbol used for? And who uses it?
Your guess is as good as mine....

>2) What are the feminine and masculine ordinal indicators used for?
>   (The symbols in position 0xaa and 0xba that look like a small
>   underlined a and o.) I believe they are used in Spanish, is that
>   true and for what purpose?
They are used to change a cardinal number into an ordinal
(e.g. 1o -> primo = "first").

>3) What is the pilcrow sign used for? And who uses it?
I have seen it used as a paragraph indicator.

>4) Do the Germans consider the double-s (position 0xdf) a lower-case
>   letter, or is it both upper and lower case?
I'm only three-eighths German, but my guess is that it is both.
It does not appear at the beginning of a word, so the only time it is
found in upper case is when a whole word is capitalized.  (No "ligature
wars," please... :-) )

>5) Why does the standard contain the diacritical marks in separate symbol
>   positions, when they are also found as parts of letters? ...
Perhaps for use in modern printer terminals with non-destructive backspaces?
:-)

-----
John B. Melby
Fujitsu Limited, Machida, Japan
melby%yk.fujitsu.co.jp@uunet.uu.net

dlv@cunyvms1.gc.cuny.edu (Dimitri Vulis, CUNY GC Math) (04/17/91)

In article <1991Apr16.130422.16607@dde.dk>, ct@dde.dk (Claus Tondering) writes:
>Could somebody please answer the following questions about
>the ISO 8859-1 (LATIN-1) character set:
>
>1) What is the currency symbol used for? And who uses it?
>   (I am referring to the symbol in position 0xa4 that looks like
>   a sun with four rays emanating from it.)
Well, I've seen a Soviet version of BASIC which used the generalized currency
symbol instead of the dollar sign to indicate string variables. :)
It wasn't in 0xA4, though...
>
>2) What are the feminine and masculine ordinal indicators used for?
>   (The symbols in position 0xaa and 0xba that look like a small
>   underlined a and o.) I believe they are used in Spanish, is that
>   true and for what purpose?
After numbers: e.g.

   a      o
10 -   12 -
This means 10th, 12th...

>3) What is the pilcrow sign used for? And who uses it?
>   (The symbol in position 0xb6 that looks like an inverted P.)
The New York Times occasionally uses it as a bullet in an itemized list.
>4) Do the Germans consider the double-s (position 0xdf) a lower-case
>   letter, or is it both upper and lower case?

>5) Why does the standard contain the diacritical marks in separate symbol
>   positions, when they are also found as parts of letters? When would
>   you, for example, want to use the cedilla (character 0xb8) or the
>   umlaut (character 0xa8) on its own?
Well, a while ago I concocted a .CPI file (for MS-DOS 3.3+ & EGA/VGA), but
never got around to writing keyboard drivers. My intention was to use these
characters for dead keys: you hit a certain key to get an umlaut, then you
hit, say, u to get u-umlaut, or space for just the umlaut (like you can get
dead circumflex and dead grave in pure ASCII :). If you hit something that
can't be combined with umlaut, the keyboard driver beeps and gets rid of
the umlaut.
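
The dead-key behaviour described above can be sketched roughly as follows.
This is only an illustration, not the driver that was planned: the function
name compose, the COMPOSE table, and the choice to still emit the key that
caused the beep are all assumptions made for the example.

```python
# Hypothetical dead-key composer; names and table are illustrative only.
DIAERESIS = "\u00a8"  # spacing diaeresis/umlaut, Latin-1 position 0xA8

# Letters that have a precomposed diaeresis form in ISO 8859-1.
COMPOSE = {
    "a": "\u00e4", "e": "\u00eb", "i": "\u00ef",
    "o": "\u00f6", "u": "\u00fc", "y": "\u00ff",
    "A": "\u00c4", "E": "\u00cb", "I": "\u00cf",
    "O": "\u00d6", "U": "\u00dc",
}


def compose(keys):
    """Turn a sequence of key presses into (output text, number of beeps)."""
    out = []
    beeps = 0
    pending = False  # True right after the dead key has been pressed
    for key in keys:
        if not pending:
            if key == DIAERESIS:
                pending = True      # remember the dead key, emit nothing yet
            else:
                out.append(key)
            continue
        pending = False
        if key == " ":
            out.append(DIAERESIS)   # dead key + space: the bare diacritic
        elif key in COMPOSE:
            out.append(COMPOSE[key])
        else:
            beeps += 1              # cannot combine: beep, drop the umlaut
            out.append(key)         # assumption: the key itself is kept
    return "".join(out), beeps


text, beeps = compose([DIAERESIS, "u", DIAERESIS, " ", DIAERESIS, "x"])
print(text.encode("latin-1"), beeps)
```

Every character the table can produce is a single Latin-1 byte, so no
overstriking or backspacing is involved in this scheme.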
>
>Claus Tondering,         Dansk Data Elektronik A/S, Herlev, Denmark
>E-mail: ct@dde.dk
Dimitri Vulis
CUNY GC Math
DLV@CUNYVMS1.BITNET DLV@CUNYVMS1.GC.CUNY.EDU

henry@zoo.toronto.edu (Henry Spencer) (04/18/91)

In article <1991Apr16.130422.16607@dde.dk> ct@dde.dk (Claus Tondering) writes:
>1) What is the currency symbol used for? And who uses it?

Nothing and nobody, but it's an international standard. :-(  In theory it
is used when you want a currency symbol but don't want to use the symbol
of a specific currency, e.g. the dollar sign.

>4) Do the Germans consider the double-s (position 0xdf) a lower-case
>   letter, or is it both upper and lower case?

My understanding is that it is lowercase only, and has no uppercase
equivalent.

>5) Why does the standard contain the diacritical marks in separate symbol
>   positions, when they are also found as parts of letters? ...

Presumably as an escape hatch so that languages not considered in the design
of Latin 1 can still be written, albeit clumsily.
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry

xcarey@cucstud.UUCP (Christian Carey) (04/18/91)

In article <1991Apr16.130422.16607@dde.dk>, ct@dde.dk (Claus Tondering) writes:
> Could somebody please answer the following questions about
> the ISO 8859-1 (LATIN-1) character set:
> 
> 1) What is the currency symbol used for? And who uses it?
>    (I am referring to the symbol in position 0xa4 that looks like
>    a sun with four rays emanating from it.)

It is a "generic" currency symbol; it's useful for making a one column header
representing money.

> 2) What are the feminine and masculine ordinal indicators used for?
>    (The symbols in position 0xaa and 0xba that look like a small
>    underlined a and o.) I believe they are used in Spanish, is that
>    true and for what purpose?

That's true, it's used in Spanish, and certain other Romance languages.
They have the function that "th" has in English, to "ordinalise" a number,
e.g. 20 -> 20th.  Depending on the noun's gender, either 0xAA or 0xBA is used.
I believe Spanish commonly ordinalises residence numbers in addresses.
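
As a small illustration of the gender-dependent choice between 0xAA and 0xBA,
here is a sketch in Python. The function name ordinalise and the "f"/"m"
gender codes are made up for the example, and real typographic conventions
(for instance whether a period precedes the indicator) are not modelled.

```python
import unicodedata

FEMININE = bytes([0xAA]).decode("latin-1")   # feminine ordinal indicator
MASCULINE = bytes([0xBA]).decode("latin-1")  # masculine ordinal indicator


def ordinalise(number, gender):
    """Append the ordinal indicator matching the noun's gender ('f' or 'm')."""
    indicator = {"f": FEMININE, "m": MASCULINE}[gender]
    return f"{number}{indicator}"


for gender in ("f", "m"):
    text = ordinalise(20, gender)
    # Show the Latin-1 bytes and the official character name.
    print(text.encode("latin-1"), unicodedata.name(text[-1]))
```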

> 3) What is the pilcrow sign used for? And who uses it?
>    (The symbol in position 0xb6 that looks like an inverted P.)

In (American) English, it represents a "paragraph break".  The King James
Version Bible has them to represent original breaks in the Scriptures.

> 4) Do the Germans consider the double-s (position 0xdf) a lower-case
>    letter, or is it both upper and lower case?

Only lower case.  Its upper case is SS.

> 5) Why does the standard contain the diacritical marks in separate symbol
>    positions, when they are also found as parts of letters? When would
>    you, for example, want to use the cedilla (character 0xb8) or the
>    umlaut (character 0xa8) on its own?

When representing characters not found in 8859-1, e.g. two character sequence
0x54 0xB8 (T cedilla) to represent Romanian "T with cedilla".
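
To make the two-byte sequence concrete, the sketch below decodes 0x54 0xB8 as
Latin-1 and inspects the result with Python's unicodedata module. It shows
that the sequence is two separate spacing characters rather than one composed
letter. The comparison with Unicode's combining cedilla (U+0327) is added
purely for contrast and is not something ISO 8859-1 itself provides.

```python
import unicodedata

raw = bytes([0x54, 0xB8])          # the sequence suggested above
text = raw.decode("latin-1")

# Two characters: the letter T followed by a spacing CEDILLA.
names = [unicodedata.name(ch) for ch in text]
print(len(text), names)

# Canonical combining class 0 means "not a combining mark".
print(unicodedata.combining(text[1]))

# For contrast only: Unicode's combining cedilla does compose with T.
composed = unicodedata.normalize("NFC", "T\u0327")
print(unicodedata.name(composed))

# The composed letter has no ISO 8859-1 code position.
try:
    composed.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```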
-- 
"It is a question of cubic capacity; a man with so large a brain must have
something inside it."--Sherlock Holmes, _The Adventure of the Blue Carbuncle_

Christian Carey (size 8 hat (USA))     uunet!cucstud!xcarey

ef@tools.uucp (Edgar Fuss) (04/18/91)

>1) What is the currency symbol used for? And who uses it?
Isn't that Swedish or Norwegian currency (Ore or Oere or whatever it's called)?

>4) Do the Germans consider the double-s (position 0xdf) a lower-case
>   letter, or is it both upper and lower case?
It's a lowercase letter. Since it never appears at the beginning of a word,
there's no reason to capitalize it (unless you capitalize the whole word, in
which case it converts to ``SS'').
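
The "converts to SS" rule can be seen in Python, whose str.upper() follows
the Unicode default case mappings. This is only a sketch of how one modern
runtime behaves, not a statement about German orthography; the sample word is
chosen just for illustration.

```python
sharp_s = bytes([0xDF]).decode("latin-1")  # the double-s at position 0xDF

word = "stra" + sharp_s + "e"
upper = word.upper()

print(sharp_s.islower())        # the letter is classified as lowercase
print(upper)                    # the whole word in capitals
print(len(word), len(upper))    # uppercasing makes the word one longer
print(upper.lower() == word)    # the mapping does not round-trip
```

Because one character becomes two, code that assumes uppercasing preserves
length (fixed-width fields, in-place buffers) needs care with this letter.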

enag@ifi.uio.no (Erik Naggum) (04/22/91)

In article <EF.91Apr18144752@fidel.tools.uucp> ef@tools.uucp (Edgar Fuss) writes:

   >1) What is the currency symbol used for? And who uses it?

   Isn't that Swedish or Norwegian currency (Ore or Oere or whatever
   it's called)?

The symbol looks like a "crown", and the Norwegian, Swedish, Danish
and Icelandic currency is indeed called "crown" (krone, krona, krone,
and kro'na, respectively), but only the Swedes use the currency symbol
in position 2/4 of their ISO 646 national variant.  (O/re is the
smaller unit, 1/100th krone.)
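
For readers unfamiliar with the "2/4" notation: ISO code tables are usually
given as column/row, each running from 0 to 15, so the byte value is
column * 16 + row. A minimal sketch of the conversion follows; the helper
name position_to_byte is made up for the example.

```python
def position_to_byte(position):
    """Convert ISO 'column/row' notation such as '2/4' to a byte value."""
    column, row = (int(part) for part in position.split("/"))
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError(f"column and row must be 0..15: {position!r}")
    return column * 16 + row


# 2/4 is where ASCII has the dollar sign.
print(hex(position_to_byte("2/4")))

# 10/4 is the currency sign's position in ISO 8859-1 (0xa4 in the
# original question).
print(hex(position_to_byte("10/4")))
```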

I believe the Swedes use it because they're such rabid America-haters,
and having an evil dollar sign show up on their terminals would just
be Too Much To Bear.  (Only half a smiley on this one, I'm sad to say.)

According to what I heard from NSF, the Norwegian ISO member body,
only Sweden and Germany insisted on the "currency symbol" at the time
of ISO 646, and most other countries didn't care enough to counter
their "need".  This stupidity has been corrected in ISO 8859-1, and a
correction is finding its way back to ISO 646.

--
[Erik Naggum]					     <enag@ifi.uio.no>
Naggum Software, Oslo, Norway			   <erik@naggum.uu.no>

hpa@casbah.acns.nwu.edu (H. Peter Anvin) (04/24/91)

In article <ENAG.91Apr22014328@maud.ifi.uio.no> enag@ifi.uio.no (Erik Naggum) writes:
>According to what I heard from NSF, the Norwegian ISO member body,
>only Sweden and Germany insisted on the "currency symbol" at the time
>of ISO 646, and most other countries didn't care enough to counter
>their "need".  This stupidity has been corrected in ISO 8859-1, and a
>correction is finding its way back to ISO 646.

And, oh boy, did we Swedish computer geeks have to pay for it... More
trouble than that caused I can't imagine... besides, Sweden doesn't use the
"sol" symbol for *anything*, so of what use is it?

I would much rather have seen the section sign (IBM Extended ASCII 0x15) in
the position of the $ or # sign (# and @ are quite worthless outside
America, really...)

                 - - -  Peter
-- 
IDENTITY:   Anvin, H. Peter           STATUS:    Student
INTERNET:   hpa@casbah.acns.nwu.edu   FIDONET:   1:115/989.4
HAM RADIO:  N9ITP, SM4TKN             RBBSNET:   8:970/101.4
EDITOR OF:  The Stillwaters BBS List  TEACHING:  Swedish

dlv@cunyvms1.gc.cuny.edu (Dimitri Vulis, CUNY GC Math) (04/27/91)

In article <1991Apr17.170515.2058@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <1991Apr16.130422.16607@dde.dk> ct@dde.dk (Claus Tondering) writes:
>>5) Why does the standard contain the diacritical marks in separate symbol
>>   positions, when they are also found as parts of letters? ...
>
>Presumably as an escape hatch so that languages not considered in the design
>of Latin 1 can still be written, albeit clumsily.
No.
These are _spacing_ diacritics and cannot be combined with letters in ISO 8859.
>-- 
>And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
>"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry
Dimitri Vulis
CUNY GC Math
DLV@CUNYVMS1.BITNET DLV@CUNYVMS1.GC.CUNY.EDU

sommar@enea.se (Erland Sommarskog) (04/29/91)

Also sprach Erik Naggum (enag@ifi.uio.no):
>The symbol looks like a "crown", and the Norwegian, Swedish, Danish
>and Icelandic currency is indeed called "crown" (krone, krona, krone,
>and kro'na, respectively), but only the Swedes use the currency symbol
>in position 2/4 of their ISO 646 national variant.  (O/re is the
>smaller unit, 1/100th krone.)

It appears on some Swedish terminals instead of the dollar character,
but is never used for marking currency. In common talk it's called
the "sun character". Few people know that it's supposed to be a
currency sign.

>I believe the Swedes use it because they're such rabid America-haters,
>and having an evil dollar sign show up on their terminals would just
>be Too Much To Bear.  (Only half a smiley on this one, I'm sad to say.)

Stupidities!
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
Le fils du maire est en Normandie avec beaucoup de medecins.

henry@zoo.toronto.edu (Henry Spencer) (04/30/91)

In article <1991Apr27.053523.17867@timessqr.gc.cuny.edu> dlv@cunyvms1.gc.cuny.edu writes:
>>>5) Why does the standard contain the diacritical marks ...
>>
>>Presumably as an escape hatch so that languages not considered in the design
>>of Latin 1 can still be written, albeit clumsily.
>No.
>These are _spacing_ diacritics and cannot be combined with letters in ISO 8859.

Uh, character code 010 is backspace, which makes such combination eminently
possible.
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry

lasko@regent.dec.com (Tim Lasko, Digital Equipment Corp., Westford, MA) (04/30/91)

In article <1991Apr29.184000.26077@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes...
>In article <1991Apr27.053523.17867@timessqr.gc.cuny.edu> dlv@cunyvms1.gc.cuny.edu writes:
>>>>5) Why does the standard contain the diacritical marks ...
>>These are _spacing_ diacritics and cannot be combined with letters in ISO 8859.
> 
>Uh, character code 010 is backspace, which makes such combination eminently
>possible.

Sorry, not within the scope of ISO 8859: that standard does not include control
characters.  Also, ISO 4873, the parent eight-bit code structure standard,
specifically prohibits treating combinations formed with BACKSPACE as graphic
characters.

Tim Lasko, Digital Equipment Corp., Westford MA  (lasko@regent.enet.dec.com)
Disclaimer: My opinions are my own; the facts can speak for themselves.

clive@x.co.uk (Clive Feather) (04/30/91)

In article <1991Apr29.184000.26077@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>>>>5) Why does the standard contain the diacritical marks ...
>>These are _spacing_ diacritics and cannot be combined with letters in ISO 8859.
>Uh, character code 010 is backspace, which makes such combination eminently
>possible.

Sorry, Henry, but character code 010 is, to quote ISO 8859/1, one of
several:
"bit combinations that do not represent graphic characters. Their use is
outside the scope of ISO 8859; it is specified in other International
Standards, for example ISO 646 or ISO 6429."
[This wording applies to 0x00 to 0x1F, and 0x7F to 0x9F, inclusive.]

In addition: "The use of control functions, such as BACKSPACE or
CARRIAGE RETURN for the coded representation of composite characters is
prohibited by ISO 8859."
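
The ranges quoted above can be turned into a simple check. The sketch below
is an illustration only: the function names are invented, and "conforming"
here means nothing more than "uses only positions that ISO 8859-1 assigns to
graphic characters", which is how the quoted wording is read.

```python
def is_graphic(byte):
    """True if ISO 8859-1 assigns a graphic character to this position.

    Positions 0x00-0x1F and 0x7F-0x9F are left to other standards.
    """
    return 0x20 <= byte <= 0x7E or 0xA0 <= byte <= 0xFF


def has_only_graphics(data):
    """True if every byte of data is a graphic position of ISO 8859-1."""
    return all(is_graphic(b) for b in data)


# T followed by the spacing cedilla: two graphic characters.
print(has_only_graphics(bytes([0x54, 0xB8])))

# T, BACKSPACE (octal 010), cedilla: relies on a control function.
print(has_only_graphics(bytes([0x54, 0o10, 0xB8])))

# Number of graphic positions in the 8-bit code table.
print(sum(is_graphic(b) for b in range(256)))
```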
-- 
Clive D.W. Feather     | IXI Limited         | If you lie to the compiler,
clive@x.co.uk          | 62-74 Burleigh St.  | it will get its revenge.
Phone: +44 223 462 131 | Cambridge   CB1 1OJ |   - Henry Spencer
(USA: 1 800 XDESK 57)  | United Kingdom      |

ronald@robobar.co.uk (Ronald S H Khoo) (04/30/91)

henry@zoo.toronto.edu (Henry Spencer) writes:
> dlv@cunyvms1.gc.cuny.edu writes:
> >These are _spacing_ diacritics and cannot be combined with letters in ISO 8859.
> Uh, character code 010 is backspace, which makes such combination eminently
> possible.

"The use of control functions for the coded representation of composite
 characters is prohibited by ISO 8859" -- [section 0: Introduction, ISO 8859/1]

"The use of control functions, such as BACKSPACE or CARRIAGE RETURN for the
 coded representation of composite characters is prohibited by ISO 8859"
	-- [section 7: Specification of the coded character set, ISO 8859/1]

Besides, the control characters, e.g. backspace, are not covered by 8859.

Aren't character set standards fun?  :-/
-- 
Ronald Khoo <ronald@robobar.co.uk> +44 81 991 1142 (O) +44 71 229 7741 (H)