karl@haddock.ima.isc.com (Karl Heuer) (05/02/90)
In article <11071@cbmvax.commodore.com> valentin@cbmvax (Valentin Pepelea) writes:
[paraphrased --kwzh]
>[How should locale information be organized? The monetary information is
>usually specific to a country, while the collating information is specific to
>a language. A country may have multiple languages, or a language may span
>multiple countries.]

Seems like the locale name ought to mention both the country and the language,
e.g. "usa-english". There would be ample opportunity for the data to be
linked%: usa-english/LC_COLLATE could be the same as uk-english/LC_COLLATE and
can-english/LC_COLLATE, and likewise can-english/LC_MONETARY could be linked to
can-french/LC_MONETARY.

It would also be reasonable to support incompletely defined locales, e.g.
"english" could be a valid locale name when used in conjunction with LC_COLLATE
but invalid for LC_MONETARY (and hence invalid for LC_ALL).

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint
________
% The likely UNIX implementation is as a bunch of directories with cross-linked
files. An alternate scheme, less specific to quirks of UNIX, is to have a
single index file where a key like "usa-english/LC_COLLATE" is paired with a
file name containing the data. I mention this to demonstrate that my use of
the word "link" need not imply a property of the filesystem.
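The index-file scheme described in the footnote could look roughly like the following C sketch. Everything concrete in it is an assumption made for illustration: the table is compiled in rather than read from a file, and the data file names (`collate/english.dat` and so on) and the function name `locale_data_file` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One index entry: a "<locale>/<category>" key paired with a data file name.
 * All keys and file names below are made up for illustration. */
struct locale_index_entry {
    const char *key;
    const char *data_file;
};

static const struct locale_index_entry locale_index[] = {
    /* English collation data is shared ("linked") across countries. */
    { "usa-english/LC_COLLATE",  "collate/english.dat" },
    { "uk-english/LC_COLLATE",   "collate/english.dat" },
    { "can-english/LC_COLLATE",  "collate/english.dat" },
    { "can-french/LC_COLLATE",   "collate/french.dat"  },
    /* Canadian monetary data is shared across languages. */
    { "can-english/LC_MONETARY", "monetary/canada.dat" },
    { "can-french/LC_MONETARY",  "monetary/canada.dat" },
    { "usa-english/LC_MONETARY", "monetary/usa.dat"    },
    /* An incompletely defined locale: only LC_COLLATE exists. */
    { "english/LC_COLLATE",      "collate/english.dat" },
};

/* Return the data file for "<locale>/<category>", or NULL if that
 * combination is not defined (or the key would not fit the buffer). */
const char *locale_data_file(const char *locale, const char *category)
{
    char key[128];
    size_t i;

    assert(locale != NULL && category != NULL);

    /* Room is needed for both parts, the '/' and the terminating NUL. */
    if (strlen(locale) + strlen(category) + 2 > sizeof key)
        return NULL;
    strcpy(key, locale);
    strcat(key, "/");
    strcat(key, category);

    for (i = 0; i < sizeof locale_index / sizeof locale_index[0]; i++)
        if (strcmp(locale_index[i].key, key) == 0)
            return locale_index[i].data_file;
    return NULL;
}
```

With this table, `locale_data_file("english", "LC_MONETARY")` returns NULL, which is how an incompletely defined locale would be rejected for LC_MONETARY and therefore for LC_ALL.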
news@OSF.ORG (USENET News System) (05/03/90)
From: martin@osf.osf.org (Sandra Martin)
Path: osf!martin

You're right that the examples are confusing, and not entirely appropriate.

The problem is that there are no current standards for locale names or for
the way locale information should be organized. Most implementations that I
know of use some form of the X/Open naming recommendation, which consists of
three parts:

    language_territory.codeset

At this point, however, there is no agreement about the contents of the
individual parts. For example, some implementations might use "long" country
names for the territory segment (e.g., canada, germany), while others use
abbreviations (can, ger). Still others use the nationality rather than the
country name (e.g., using "swiss" rather than "switzerland"). There are many,
many other examples of different approaches.

As for your question about how the locales should be organized, again, it
isn't standardized, and so depends on the implementation. There are two fairly
popular approaches: flat and tiered. With the flat approach, information is
stored something like this:

    .../locale/<locale_name>/<locale_related_file(s)>

With the tiered approach, information is stored something like this:

    .../locale/<language>/<territory>/<codeset>/<locale_related_file(s)>

In the tiered approach, the territory and codeset directories are optional
and therefore might not exist.

You noted that some locale-related info is language-specific, while other
info is country-specific. Notice that neither the flat nor the tiered approach
makes these kinds of distinctions. Some implementations do have separate
files for language- and country-specific info, but they store them together
in the same directory.

Confused? I wouldn't be surprised if you were. I've thought for a long time
that it would be a good idea to have some standards for locale names, but
have been voted down in a couple of different groups. However, lately
there have been some rumblings about the confusion inherent in the current
chaotic system, so we may see some standards soon. Standards for the
organization of locale info also would be helpful.

Hope this helps.
--
Sandra Martin
Open Software Foundation
email: martin@osf.org
tel: (617) 621-8707
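As a rough illustration of the language_territory.codeset form and the tiered layout, here is a minimal C sketch that splits a name into its three parts and builds a tiered directory path from them. The struct, the function names, the 32-byte field sizes, and the sample name "french_canada.8859" are all assumptions made for this example; since the contents of each part are not standardized, the parser only looks for the `_` and `.` separators.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parts of a "language_territory.codeset" name. Territory and codeset
 * are optional and are left as empty strings when absent. */
struct locale_name {
    char language[32];
    char territory[32];
    char codeset[32];
};

/* Copy len characters of src into dst and terminate; -1 if it will not fit. */
static int copy_part(char *dst, size_t dstsize, const char *src, size_t len)
{
    if (len >= dstsize)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}

/* Split name into its parts. Returns 0 on success, -1 if the language
 * part is missing or a part is too long. */
int parse_locale_name(const char *name, struct locale_name *out)
{
    const char *end, *dot, *us;

    assert(name != NULL && out != NULL);
    end = name + strlen(name);
    dot = strchr(name, '.');
    if (dot == NULL)
        dot = end;
    /* Only look for '_' before the '.', so a codeset may contain '_'. */
    us = memchr(name, '_', (size_t)(dot - name));
    if (us == NULL)
        us = dot;
    if (us == name)
        return -1;                       /* the language part is required */

    if (copy_part(out->language, sizeof out->language,
                  name, (size_t)(us - name)) != 0)
        return -1;
    out->territory[0] = '\0';
    out->codeset[0] = '\0';
    if (us < dot && copy_part(out->territory, sizeof out->territory,
                              us + 1, (size_t)(dot - us - 1)) != 0)
        return -1;
    if (dot < end && copy_part(out->codeset, sizeof out->codeset,
                               dot + 1, (size_t)(end - dot - 1)) != 0)
        return -1;
    return 0;
}

/* Build "<root>/<language>[/<territory>][/<codeset>]" (tiered layout).
 * Returns 0 on success, -1 if buf is too small. */
int tiered_path(const struct locale_name *ln, const char *root,
                char *buf, size_t bufsize)
{
    int n = snprintf(buf, bufsize, "%s/%s%s%s%s%s", root, ln->language,
                     ln->territory[0] ? "/" : "", ln->territory,
                     ln->codeset[0] ? "/" : "", ln->codeset);
    return (n < 0 || (size_t)n >= bufsize) ? -1 : 0;
}
```

The flat layout needs no parsing at all: the whole locale name is used as a single directory name under the root.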
goudreau@larrybud.rtp.dg.com (Bob Goudreau) (05/04/90)
In article <7513@paperboy.OSF.ORG>, martin@osf.osf.org (Sandra Martin) writes:
>
> The problem is that there are no current standards for locale names or for
> the way locale information should be organized. Most implementations that I
> know of use some form of the X/Open naming recommendation which consists of
> three parts:
>
>     language_territory.codeset
>
> At this point, however, there is no agreement about the contents of the
> individual parts. For example, some implementations might use "long" country
> names for the territory segment (e.g., canada, germany), while others use
> abbreviations (can, ger). Still others use the nationality rather than the
> country name (e.g., using "swiss" rather than "switzerland"). There are many,
> many other examples of different approaches.
>
> ....
>
> Confused? I wouldn't be surprised if you were. I've thought for a long time
> that it would be a good idea to have some standards for locale names, but
> have been voted down in a couple of different groups. However, lately
> there have been some rumblings about the confusion inherent in the current
> chaotic system, so we may see some standards soon. Standards for the
> organization of locale info also would be helpful.

And to open a whole separate can of worms, what about the different ways to
name a country or a language? E.g., "germany" vs. "deutschland", or
"English" vs. "Englisch" vs. "Anglais" vs. "Ingles", etc. It would appear
to be necessary to introduce many separate standards of locale names (each
naming all locales), one for each language locale!

Of course, the character set(s) used to form such names is yet another
problem....

------------------------------------------------------------------------
Bob Goudreau                          +1 919 248 6231
Data General Corporation
62 Alexander Drive                    goudreau@dg-rtp.dg.com
Research Triangle Park, NC 27709      ...!mcnc!rti!xyzzy!goudreau
USA
Bob.Stout@p6.f506.n106.z1.fidonet.org (Bob Stout) (05/04/90)
Perhaps this is why ANSI uses the category as the first argument to
setlocale(). In my implementation, you could simulate a Quebec locale by
calling

    setlocale(LC_ALL, "USA");
    setlocale(LC_TIME, "FRANCE");

Once set this way, retrieving the locale using localeconv() would fetch a
locale that looked like American English, but with the weekdays, months, etc.
in French.

(Yeah, I know that Quebec is more complicated than that - I merely used it as
an example. I also support non-integer (second-specified) time zones, and
other oddball stuff folks requested from various parts of the world.)
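The per-category technique can be sketched in portable C as follows. The helper names `set_mixed_locale` and `month_name` are hypothetical. Locale names such as "USA" and "FRANCE" are specific to the implementation described above; ISO C only guarantees the names "C" and "", so the sketch checks each setlocale() result instead of assuming a name exists. Note also that in ISO C, localeconv() reports only numeric and monetary conventions, so the sketch uses strftime() to observe the effect of LC_TIME.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Set a base locale for all categories, then override one category.
 * Returns 0 if both calls succeed, 1 if only the base locale was applied
 * (the override name is unknown), -1 if the base locale is unknown. */
int set_mixed_locale(const char *base, int category, const char *override_name)
{
    assert(base != NULL && override_name != NULL);

    if (setlocale(LC_ALL, base) == NULL)
        return -1;
    if (setlocale(category, override_name) == NULL)
        return 1;       /* the base locale stays in effect for this category */
    return 0;
}

/* Write the full name of month (0 = January) as given by the current
 * LC_TIME locale into buf. Returns the length written, or 0 on failure. */
size_t month_name(char *buf, size_t bufsize, int month)
{
    struct tm t = {0};

    assert(buf != NULL && month >= 0 && month <= 11);
    t.tm_mon = month;
    t.tm_mday = 1;
    return strftime(buf, bufsize, "%B", &t);
}
```

On an implementation that defines those names, `set_mixed_locale("USA", LC_TIME, "FRANCE")` followed by `month_name()` would be the equivalent of the two calls quoted above.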
barr@frog.UUCP (Chris Barr) (05/05/90)
In article <11071@cbmvax.commodore.com>, valentin@cbmvax.commodore.com (Valentin Pepelea) writes:
> The ANSI C function setlocale() allows the programmer to set the locale to
> be used in localised functions. As examples we are given
>
> /usr/lib/locale/german/LC_MESSAGES/ contains message catalogues
>                       /LC_COLLATE   collation (sorting) information
>                       /LC_TIME      time & date information
>                       /LC_NUMERIC   number format information
>                       /LC_MONETARY  monetary symbol & format info
>
> But this is rather confusing. While messages and collation information varies
> according to language, time format and monetary information is country specific.
> So how are locale directories supposed to be organised?

Name directories for BOTH country and language.
Files which are the same for different 'locales' might be linked, e.g. messages
in switz_french & canada_french.
e.g.:
        /usr/lib/locale/switz_german/
        /usr/lib/locale/switz_french/
        /usr/lib/locale/canada_french/
        /usr/lib/locale/canada_english/
meissner@osf.org (Michael Meissner) (05/07/90)
In article <14535@frog.UUCP> barr@frog.UUCP (Chris Barr) writes:
| In article <11071@cbmvax.commodore.com>, valentin@cbmvax.commodore.com (Valentin Pepelea) writes:
|
| > The ANSI C function setlocale() allows the programmer to set the locale to
| > be used in localised functions. As examples we are given
| >
| > /usr/lib/locale/german/LC_MESSAGES/ contains message catalogues
| >                       /LC_COLLATE   collation (sorting) information
| >                       /LC_TIME      time & date information
| >                       /LC_NUMERIC   number format information
| >                       /LC_MONETARY  monetary symbol & format info
| >
| > But this is rather confusing. While messages and collation information varies
| > according to language, time format and monetary information is country specific.
| > So how are locale directories supposed to be organised?
|
| Name directories for BOTH country and language.
| Files which are the same for different 'locales' might be linked, e.g. messages
| in switz_french & canada_french.
| e.g.:
|         /usr/lib/locale/switz_german/
|         /usr/lib/locale/switz_french/
|         /usr/lib/locale/canada_french/
|         /usr/lib/locale/canada_english/

Nothing in the locale stuff mandates that a locale be a country, place, or
what have you (though that's how it mostly will be used). For example, you
could have a locale that is used for sorting things in American Library Order
(case insignificant, Mc and Mac at the beginning of words are considered the
same, insignificant words like 'the' not counting in collation), etc.
--
Michael Meissner   email: meissner@osf.org   phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA

Catproof is an oxymoron, Childproof is nearly so
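To make the library-order example concrete, here is a hand-rolled C sketch of such a comparison. It is not how a real LC_COLLATE locale would be implemented (that would go through the implementation's collation tables and strcoll()); it only illustrates the three rules mentioned. The function names are hypothetical, and the handling of insignificant words is simplified to dropping a single leading "the ".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build a sort key for s: lower-cased, without a leading "the ", and with
 * "Mc" at the start of a word rewritten as "mac".
 * Returns 0 on success, -1 if the key does not fit in keysize bytes. */
int library_key(const char *s, char *key, size_t keysize)
{
    size_t n = 0;
    int at_word_start = 1;

    assert(s != NULL && key != NULL && keysize > 0);

    /* Simplification: only a leading "the " is treated as insignificant. */
    if (tolower((unsigned char)s[0]) == 't' &&
        tolower((unsigned char)s[1]) == 'h' &&
        tolower((unsigned char)s[2]) == 'e' && s[3] == ' ')
        s += 4;

    while (*s != '\0') {
        unsigned char c = (unsigned char)*s;

        if (at_word_start && tolower(c) == 'm' &&
            tolower((unsigned char)s[1]) == 'c' &&
            isalpha((unsigned char)s[2])) {
            /* "Mc" followed by a letter files as "mac". */
            if (n + 3 >= keysize)
                return -1;
            memcpy(key + n, "mac", 3);
            n += 3;
            s += 2;
            at_word_start = 0;
            continue;
        }
        if (n + 1 >= keysize)
            return -1;
        key[n++] = (char)tolower(c);
        at_word_start = !isalnum(c);
        s++;
    }
    key[n] = '\0';
    return 0;
}

/* Compare two strings by their library-order keys. */
int library_compare(const char *a, const char *b)
{
    char ka[256], kb[256];

    /* Fall back to a plain comparison if a key does not fit. */
    if (library_key(a, ka, sizeof ka) != 0 || library_key(b, kb, sizeof kb) != 0)
        return strcmp(a, b);
    return strcmp(ka, kb);
}
```

Under these rules "McDonald" and "MacDonald" compare equal, as do "The Hobbit" and "hobbit".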
morten@modulex.dk (Morten Hastrup) (05/09/90)
barr@frog.UUCP (Chris Barr) writes:

>In article <11071@cbmvax.commodore.com>, valentin@cbmvax.commodore.com (Valentin Pepelea) writes:
>> The ANSI C function setlocale() allows the programmer to set the locale to
>> be used in localised functions. As examples we are given
>>
>> /usr/lib/locale/german/LC_MESSAGES/ contains message catalogues
>>                       /LC_COLLATE   collation (sorting) information
>>                       /LC_TIME      time & date information
>>                       /LC_NUMERIC   number format information
>>                       /LC_MONETARY  monetary symbol & format info
>>
>> But this is rather confusing. While messages and collation information varies
>> according to language, time format and monetary information is country specific.
>> So how are locale directories supposed to be organised?

>Name directories for BOTH country and language.
>Files which are the same for different 'locales' might be linked, e.g. messages
>in switz_french & canada_french.
>e.g.:
>       /usr/lib/locale/switz_german/
>       /usr/lib/locale/switz_french/
>       /usr/lib/locale/canada_french/
>       /usr/lib/locale/canada_english/

You might be right, but why organise it this way when there is no
recommendation in this area? I have tried to find one in the X/OPEN
Portability Guide. Besides, other companies/people (Digital, for example)
organise the locales this way:

        /usr/lib/intln/646/ENG_GB.646     /* English ISO646 */
                          /GER_DE.646     /* German ISO646 */
        /usr/lib/intln/8859/ENG_GB.8859   /* English ISO8859-1 */
                           /GER_DE.8859   /* German ISO8859-1 */

They also use the environment variable INTLINFO to specify this directory
structure (i.e. INTLINFO = /usr/lib/intln/%c/%L). The default path is
/usr/lib/intln.

I could not find INTLINFO in X/OPEN, so I would like to hear from others about
similar variables (and, of course, YOUR opinion on this area overall).

How do you avoid conflicts between your own locale and locales that belong
to another application? (I know that ideally they should be the same, but you
never know.)
--
Morten Hastrup <morten@modulex.dk>
A/S MODULEX             Phone:   +45 44 53 30 11
Lyskaer 15              Telefax: +45 44 53 30 74
DK-2730 Herlev
Denmark
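A template such as /usr/lib/intln/%c/%L could be expanded along the following lines. This C sketch rests on an assumption inferred only from the example paths above: that %c stands for the codeset (the part of the locale name after the '.') and %L for the whole locale name, much like the substitution fields of X/Open's NLSPATH. The post does not document the actual INTLINFO semantics, and the function name `expand_intlinfo` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expand tmpl into buf, replacing %c with the codeset part of locale
 * (the text after '.', empty if there is none), %L with the whole locale
 * name, and %% with a literal '%'. These meanings are assumptions.
 * Returns 0 on success, -1 if the result does not fit in bufsize bytes. */
int expand_intlinfo(const char *tmpl, const char *locale,
                    char *buf, size_t bufsize)
{
    const char *dot, *codeset;
    size_t n = 0;

    assert(tmpl != NULL && locale != NULL && buf != NULL && bufsize > 0);
    dot = strchr(locale, '.');
    codeset = (dot != NULL) ? dot + 1 : "";

    for (; *tmpl != '\0'; tmpl++) {
        const char *piece = NULL;
        size_t len = 1;

        if (tmpl[0] == '%' && tmpl[1] == 'c') {
            piece = codeset;
            tmpl++;
        } else if (tmpl[0] == '%' && tmpl[1] == 'L') {
            piece = locale;
            tmpl++;
        } else if (tmpl[0] == '%' && tmpl[1] == '%') {
            piece = "%";
            tmpl++;
        }
        if (piece != NULL)
            len = strlen(piece);
        else
            piece = tmpl;           /* copy one literal character */

        if (n + len >= bufsize)
            return -1;
        memcpy(buf + n, piece, len);
        n += len;
    }
    buf[n] = '\0';
    return 0;
}
```

Under that assumption, expanding the template quoted above for the locale name ENG_GB.8859 yields /usr/lib/intln/8859/ENG_GB.8859, which matches the directory listing. A caller would typically take the template from getenv("INTLINFO") and fall back to the default path when the variable is unset.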