neil@progress.COM (Neil Galarneau) (02/14/91)
I have heard of a new multi-lingual character set called Unicode.

It is supposed to give one all the character sets in the world in 16-bit characters. It is supposed to be backed by several Unix companies, Apple, and the DOS companies.

Other than that, I have no details. Does anyone know where I can get a spec? I need to evaluate this proposed "standard".

Thanks,
Neil
neil@progress.com
garry@ceco.ceco.com (Garry Garrett) (02/15/91)
In article <1991Feb14.001842.24415@progress.com>, neil@progress.COM (Neil Galarneau) writes:
> I have heard of a new multi-lingual character set called Unicode.
>
> It is supposed to give one all the character sets in the world in 16-bit
> characters. It is supposed to be backed by several Unix companies, Apple,
> and the DOS companies.
>
> ...

I hope not. I was working on this myself, and I was going to include several other features as well. Well, if they are working on it, I hope that they were smart enough to make character 0 = '0', char 1 = '1' ... char 9 = '9', char 10 = 'A', char 11 = 'B' ... This would make conversions to hexadecimal (or octal) much, much easier. You see, there is no real reason that the control characters HAD to occupy the first 32 characters in ASCII. (They could just as easily have been made the last 32, using positive logic circuits rather than negative logic.) IMHO, many of the problems that programmers face with character sets (ASCII & EBCDIC) arise because they were designed by engineers who were (naturally) more concerned with what was going to be easy to build hardware-wise. You only build hardware once, but people write software for it for years. If it's a little bit harder to build, but easier to program, it's worth it.

As for my thoughts on a NEW character set, there is no reason why all written languages could not be included. There is also a wealth of special characters that could be included, making the jobs of people in various professions who use computers easier. Symbols like less-than-or-equal-to and not-equal-to could make every programmer's job easier. Meteorology has a lot of symbols that could be included in the character set, which would simplify the exchange of weather data, for example.

We also need not limit a new character set to today's technology. (ASCII was designed for teletype machines.) What I mean by this is that we should include characters that represent colors and music. Granted, not everyone's computer has these capabilities today, but why limit the character set? If your computer doesn't have a speaker, ignore the music characters. If you don't have the capabilities that some given character implies, then take an appropriate action that is within your hardware's realm. I think that there are a lot of special characters that would help to unify word processing files (like a character for Boldface-on, Italics-off ...). If these characters existed in the character set, word processors would not need to make up their own representations for these things, and thus they could use "standard" Unicode files.

Imagine having a file of "music" characters: you could "print" it to your synthesizer and listen to it, or you could "print" it to your printer and get out sheet music. (I realize that this is a bit idealized, but I think that it is possible.) Joe Musician could write his new song on a computer, upload it to the studio, and record it. (Most likely his record label will sell the theme to a video game maker to include as background to a game.) The record company will put it on a CD ROM with its other Top 40 songs of the month to distribute to record stores, so that you can come in and buy a copy of the sheet music (which the music store prints off on its laser printer from the file on the CD ROM). I am not saying that this form of marketing is my goal; I am only trying to show how much time & effort can be saved for members of a certain profession if they are kept in mind when a new code is developed.

I certainly hope that if there is a Unicode, its makers have had such a far-reaching outlook on its possibilities. It would be a shame for a new "standard" to emerge that is outdated about the time that it is accepted. If any of you out there have some ideas for things that may be included in my character set, please e-mail them to me. I still plan on working on this unless I get some more info on Unicode and it turns out to have some forethought behind it.

Garry Garrett
garry@ceco.ceco.com
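The hex-conversion argument in the post above can be made concrete with a minimal sketch in Python. It compares decoding a hex digit under ASCII with decoding it under the layout the post wishes for, where codes 0 through 15 stand for '0' through 'F'. The function names and the restriction of the wished-for layout to codes 0 through 15 are assumptions made for illustration; nothing here describes how Unicode actually assigns codes.

```python
def hex_value_ascii(ch: str) -> int:
    """Value of one hex digit when the text is ASCII."""
    code = ord(ch)
    if ord("0") <= code <= ord("9"):
        return code - ord("0")        # '0'..'9' sit at codes 48..57
    if ord("A") <= code <= ord("F"):
        return code - ord("A") + 10   # 'A'..'F' sit at codes 65..70
    if ord("a") <= code <= ord("f"):
        return code - ord("a") + 10   # 'a'..'f' sit at codes 97..102
    raise ValueError(f"not a hex digit: {ch!r}")


def hex_value_proposed(code: int) -> int:
    """Value of one hex digit in the hypothetical layout from the post,
    where the character code *is* the digit value (0..15)."""
    if 0 <= code <= 15:
        return code
    raise ValueError(f"not a hex digit code: {code}")


def parse_hex_ascii(text: str) -> int:
    """Accumulate a multi-digit hex number from ASCII text."""
    value = 0
    for ch in text:
        value = value * 16 + hex_value_ascii(ch)
    return value


print(parse_hex_ascii("1F"))    # 31
print(hex_value_proposed(11))   # 11, i.e. the character the post calls 'B'
```

Under ASCII the decoder needs three range checks and three offsets; under the hypothetical layout a single range check is enough.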
yawei@bronze.ucs.indiana.edu (mr. yawei) (02/15/91)
In article <405@ceco.ceco.com> garry@ceco.ceco.com (Garry Garrett) writes:
>In article <1991Feb14.001842.24415@progress.com>, neil@progress.COM (Neil Galarneau) writes:
>> I have heard of a new multi-lingual character set called Unicode.
>>
>> It is supposed to give one all the character sets in the world in 16-bit
>> characters. It is supposed to be backed by several Unix companies, Apple,
>> and the DOS companies.

This probably doesn't belong here, but I don't think it is possible to include *ALL* the character sets in the world. For example, one cannot possibly include the entire Chinese character set, for two reasons: (1) its cardinality is huge, (2) the set is unbounded.

As far as Chinese characters are concerned, what Unicode may be able to do is include only the most frequently used ones, and then provide a composition mechanism to generate less frequently used ones when they are needed.

yawei
Norbert.Zacharias@arbi.informatik.uni-oldenburg.de (Norbert Zacharias) (02/15/91)
yawei@bronze.ucs.indiana.edu (mr. yawei) writes:
>In article <405@ceco.ceco.com> garry@ceco.ceco.com (Garry Garrett) writes:
>>In article <1991Feb14.001842.24415@progress.com>, neil@progress.COM (Neil Galarneau) writes:
>>> I have heard of a new multi-lingual character set called Unicode.
>>>
>>> It is supposed to give one all the character sets in the world in 16-bit
>>> characters. It is supposed to be backed by several Unix companies, Apple,
>>> and the DOS companies.
> This probably doesn't belong here, but I don't think it is possible
>to include *ALL* the character sets in the world. For example, one can
>not possibly include the entire Chinese character set for two reasons:
>(1) its cardinality is huge, (2) the set is unbounded.
> As far as Chinese characters are concerned, what unicode may be able
>to do is to include only the most frequently used ones, and then provide
>a composition mechanism to generate less frequently used ones when they
>are needed.
> yawei

Hi all,

I know that there is a code that contains every commonly used character, including the Chinese ones. It was developed by the GMD (Gesellschaft fuer Mathematik und Datenverarbeitung) for the Chinese/Japanese version of their OS EUMEL in 1985/86. If anyone is interested, I'll try to get a file that contains the definition from GMD. (I only have a chart showing the characters.)

Norbert
--
=============================================================================
Norbert Zacharias     Norbert.Zacharias@arbi.informatik.uni-oldenburg.de
FB Physik             148964@DOLUNI1.bitnet
Carl-von-Ossietzky-Universitaet  Tel. 0049-441-7983527
Was Du nicht willst das man Dir tu, das will auch nicht was willst denn Du?
                                                              Heinz Erhard
=============================================================================
einari@rhi.hi.is (Einar Indridason) (02/16/91)
In article <405@ceco.ceco.com> garry@ceco.ceco.com (Garry Garrett) writes:
>positive logic circuits rather than negative logic) IMHO, many of the
>problems that programmers face with character sets (ASCII & EBCDIC) are
>that they were designed by engineers who were (naturally) more concerned

And more often than not, they came from the USA, not from Europe (or Iceland, for that matter). Many 'standard' character sets give no thought to those who have to use characters outside the 7-bit range. So some programmers thought: "It is all right to mask the 8th bit. Nobody uses it!!" Well, they are wrong!!! (Please don't mask the 8th bit!)

> As for my thoughts on a NEW character set, there is no reason why
>all written languages could not be included. There are also a wealth of
>special characters that could be included, making people of various professions
>that use computers, jobs easier. Symbols like less-than-or-equal-to and
>not-equal-to, could make every programmer's job easier. Meteorology has a lot

A new character set? Fine with me, as long as we can use all our 36 (not 26, but 36) characters, plus the numerals and other 'non-letters'. Of those 36 letters that we use here in Iceland, there are 10 upper-case and 10 lower-case characters that *must* be placed in the upper half of an 8-bit character set. Perhaps you can understand our frustration when we must put a whole lot of people (who could be doing other things) to work 'icelandifying' a bunch of *badly* written software :-( (Think what will happen when the need for a 16-bit character set starts to spread.)

>accepted. If any of you out there have some ideas for things that may be
>included in my character set, please e-mail them to me. I still plan
>on working on this unless I get some more info on Unicode, and it does
>have some forethought to it.
>

Here is an idea (no offence meant): don't mask the 8th bit. (Or the 16th bit?)
--
Internet: einari@rhi.hi.is        | "Just give me my command line and drag
UUCP: ..!mcsun!isgate!rhi!einari  | the GUIs to the waste basket!!!!"

Surgeon Generals warning: Masking the 8th bit can seriously damage your brain!!
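As a rough illustration of the damage described in the post above, the Python sketch below clears the 8th bit of every byte in an Icelandic name. It assumes the text is stored in ISO 8859-1 (Latin-1), which the post does not specify, and the helper name strip_eighth_bit is hypothetical.

```python
def strip_eighth_bit(data: bytes) -> bytes:
    """Clear the high bit of every byte, as 7-bit-only software does."""
    return bytes(b & 0x7F for b in data)


original = "Þórður"                       # an Icelandic given name
encoded = original.encode("latin-1")      # assumption: one byte per character
damaged = strip_eighth_bit(encoded).decode("ascii")

print(original, "->", damaged)            # Þórður -> ^srpur

# Plain 7-bit ASCII text passes through unchanged, which is why the
# bug goes unnoticed by anyone who only ever tests with English text.
print(strip_eighth_bit(b"plain ASCII"))   # b'plain ASCII'
```

Each letter above code 127 is silently replaced by an unrelated ASCII character, and the original cannot be recovered from the result.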