[net.text] International UNIX

keld@diku.UUCP (Keld J|rn Simonsen) (07/22/85)

<>

A while ago there was some discussion in these groups on
international UNIX. I missed it due to faults in our news system
(I still wonder why). Here is my two cents' worth.

The EUUG also has a Standards Committee on International
UNIX. We are looking forward to cooperating with the /usr/group/UK
group. A meeting on this is scheduled for Thursday 12th September 1985,
in connection with the EUUG Copenhagen Conference.

As Leif Samuelson noted, some chars in what you think is ASCII,
but in reality is ISO 646-1983, are reserved for national use,
namely the twelve (12) chars:

             #$@[\]^`{|}~

Various European national standardisation boards have adopted
character representations different from ASCII in (between them)
all of the above-named positions. So these characters should not be
regarded as generally usable in international software: each of them
will generate weird output in at least one major European area.
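
To make the point concrete, here is a minimal Python sketch of how one
and the same 7-bit source line would display on terminals set up for
three national variants. The tables are partial (only the bracket and
brace positions, as commonly given for the Danish, Swedish and German
variants), and the names NATIONAL_GLYPHS and render are made up for this
illustration; check the national standards before relying on them.

```python
# Partial, illustrative tables: what six of the national-use positions
# display as under three national 7-bit variants (an assumption to be
# checked against the actual national standards).
NATIONAL_GLYPHS = {
    "danish":  {"[": "Æ", "\\": "Ø", "]": "Å", "{": "æ", "|": "ø", "}": "å"},
    "swedish": {"[": "Ä", "\\": "Ö", "]": "Å", "{": "ä", "|": "ö", "}": "å"},
    "german":  {"[": "Ä", "\\": "Ö", "]": "Ü", "{": "ä", "|": "ö", "}": "ü"},
}


def render(text, variant):
    """Show text as a terminal using the given national variant would."""
    table = str.maketrans(NATIONAL_GLYPHS[variant])
    return text.translate(table)


if __name__ == "__main__":
    source = "if (a || b) { x = foo[0]; }"
    for variant in NATIONAL_GLYPHS:
        print(f"{variant:8}", render(source, variant))
```

Characters outside the tables pass through unchanged, so only the
national-use positions come out "weird".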

Yes, we need to be able to have variable names with these characters.
ANSI C does not allow this, but it does allow nine of the
above-named chars to be written in *trigraph* form, with ?? as the
lead-in:

#       [       \       ]       ^       {       |       }       ~
??=     ??(     ??/     ??)     ??'     ??<     ??!     ??>     ??-       

$@` are not used (at the moment) in ANSI C.
Personally I do not like the choice of ? as lead-in char, as it
is graphically quite dominating; maybe .. would have been better.
Still, the trigraph scheme is quite general and OK with me.
If we could then use the national chars in variable names, C could
become a quite useful programming language :-)
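
The trigraph table above can be sketched as a simple left-to-right
replacement. This is a minimal Python illustration, not how any
particular compiler does it; the function names decode_trigraphs and
encode_trigraphs are hypothetical.

```python
# The nine trigraphs listed in the table above.
TRIGRAPHS = {
    "??=": "#", "??(": "[", "??/": "\\", "??)": "]", "??'": "^",
    "??<": "{", "??!": "|", "??>": "}", "??-": "~",
}


def decode_trigraphs(text):
    """Replace each trigraph with its single character, scanning left to right."""
    out = []
    i = 0
    while i < len(text):
        chunk = text[i:i + 3]
        if chunk in TRIGRAPHS:
            out.append(TRIGRAPHS[chunk])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def encode_trigraphs(text):
    """Rewrite the nine characters as trigraphs.

    Note: existing "??" sequences in text are not escaped, so a round trip
    is only guaranteed for input that contains no trigraphs to begin with.
    """
    reverse = {char: tri for tri, char in TRIGRAPHS.items()}
    return "".join(reverse.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(encode_trigraphs("a[i] |= 1;"))
    print(decode_trigraphs("char foo??(100??);"))
```

With this, a declaration such as char foo[100]; can be written and
read back without touching any of the national-use positions.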

minow@decvax.UUCP (Martin Minow) (07/23/85)

Keld Joern Simonsen suggests, probably with tongue in cheek,
that C would be a useful programming language if only European users
could use their full national character set in identifiers.

To my knowledge, no commercially available computer language --
including a few developed in Scandinavia such as Algol 60 (for
Trask and Besk), Algol-Genius (for the Datasaab machines) and
Simula (for DEC PDP-10s) -- permits national letters in variable
names, so the marketplace hasn't exactly mandated their inclusion.

I would also point out that national replacement character sets
are being superseded by the Draft ISO/ANSI/ECMA 8-bit character
set called Latin 1.  Latin 1 has a unique representation for the
national letters of the major European languages and, once the
initial problems of going from a seven-bit character set to
an eight-bit set have been solved, should prove to be a much
simpler representation to deal with for international products.
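
A small Python sketch of what "a unique representation" buys you: the
national letters get codes of their own with the eighth bit set, while
the brackets and braces keep their 7-bit positions. It uses Python's
"latin-1" codec, which corresponds to ISO 8859-1 as later published and
may differ in detail from the draft; the helper names latin1_code and
uses_eighth_bit are made up for this example.

```python
def latin1_code(ch):
    """Return the single-byte Latin-1 code of one character."""
    return ch.encode("latin-1")[0]


def uses_eighth_bit(ch):
    """True if the character's Latin-1 code lies above the 7-bit range."""
    return latin1_code(ch) > 0x7F


if __name__ == "__main__":
    # Punctuation that the national 7-bit sets replace, then Danish letters.
    for ch in "[\\]{|}" + "ÆØÅæøå":
        print(f"{ch!r}: 0x{latin1_code(ch):02X}  eighth bit: {uses_eighth_bit(ch)}")
```

The punctuation and the letters no longer compete for the same code
positions, which is what makes the seven-to-eight-bit transition worth
the initial trouble.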

Martin Minow (fil.kand. Stockholms Universitet)
decvax!minow

levy@ttrdc.UUCP (Daniel R. Levy) (07/25/85)

What do you do about the punctuation marks [], which are used in C to denote
arrays?  Wouldn't they come out screwy in some international ASCII dialects?
Something like char foo ??(100??) or what?
-- 
 -------------------------------    Disclaimer:  The views contained herein are
|       dan levy | yvel nad      |  my own and are not at all those of my em-
|         an engihacker @        |  ployer, my pets, my plants, my boss, or the
| at&t computer systems division |  s.a. of any computer upon which I may hack.
|        skokie, illinois        |
|          "go for it"           |  Path: ..!ihnp4!ttrdc!levy
 --------------------------------     or: ..!ihnp4!iheds!ttbcad!levy

trb@masscomp.UUCP (Andy Tannenbaum) (07/25/85)

I don't think it's necessary to hack up languages to allow funny
(uhm, international...) characters in variable names.  It is
important, though, to allow international character sets in the user
interface.  This is a totally different problem, and it's a problem
that Hewlett Packard seems to be addressing.  Within the past year,
there have been several articles in the HP Journal which address the
issues involved in the international software marketplace.  For
example, Wilson and Shaw, "Designing Software for the International Market,"
HP Journal Sept 1984 is an overview.  There have also been articles
which discussed sorting, hyphenation, spelling correction, maintaining
multi-language prompt-string databases, date formats, etc.

By the way, the HP Journal is free from HP, and often contains
interesting and timely information on their products and engineering.
It's nice to see a free publication which isn't useless.
I think you can get put on the list by mailing to

	HP Journal
	3000 Hanover Street
	Palo Alto, CA 94304 USA


	Andy Tannenbaum   Masscomp  Westford, MA   (617) 692-6200 x274

zap@ttds.UUCP (Svante Lindahl) (07/25/85)

["For you, for you, for you, I came for you" -- Bruce Springsteen, "For you"]

In article <93@decvax.UUCP> minow@decvax.UUCP (Martin Minow) writes:
>Keld Joern Simonsen suggests, probably with tongue in cheek,
>that C would be a useful programming language if only European users
>could use their full national character set in identifiers.
>
>To my knowledge, no commercially available computer language --
>including a few developed in Scandinavia such as Algol 60 (for
>Trask and Besk), Algol-Genius (for the Datasaab machines) and
>Simula (for DEC PDP-10s) -- permits national letters in variable
>names, so the marketplace hasn't exactly mandated their inclusion.

The PDP-10 Simula compiler does allow the lowercase national
characters { (a w/ umlaut, :a), | (o w/ umlaut, :o) and }
(a with a circle on top, Oa).

>Martin Minow (fil.kand. Stockholms Universitet)
>decvax!minow

Svante Lindahl (fil.kand. Stockholms Universitet)

-- 
Svante Lindahl, NADA, KTH (Dept of Numerical Analysis and Computer Science 
			   at the Royal Institute of Technology)
UUCP:	{decvax,philabs,seismo}!{mcvax,ukc,unido}!enea!ttds!zap
ARPA:	mcvax!enea!ttds!zap@seismo.ARPA
or 	Svante_Lindahl_NADA%QZCOM.MAILNET@MIT-MULTICS.ARPA

rjh@ihlpa.UUCP (Randolph J. Herber) (07/31/85)

> >To my knowledge, no commercially available computer language --
> >including a few developed in Scandinavia such as Algol 60 (for
> >Trask and Besk), Algol-Genius (for the Datasaab machines) and
> >Simula (for DEC PDP-10s) -- permits national letters in variable
> >names, so the marketplace hasn't exactly mandated their inclusion.
> 
> >Martin Minow (fil.kand. Stockholms Universitet)
> >decvax!minow

IBM PL/I does allow three "national alphabet" characters in variable
names: $ (dollar sign), @ (at sign), and # (pound or number sign).

Randolph J. Herber, Amdahl Senior Systems Engineer, 
   at AT&T Bell Labs, Naperville, IL, 312-979-6553
   or 800-843-7467 extension 1075