blarson@oberon.UUCP (Bob Larson) (10/28/85)
[Let's demonstrate the need by cross-posting to something other than net.internat]

Some people seem to be under the mistaken impression that ASCII hasn't changed. Lower case letters were added (rather than the shift-in / shift-out kludge), _ was changed from left arrow to underline, ^ was changed from up arrow to caret, etc. I don't think adding an eighth bit would change it enough to consider it something other than ASCII.

Sorting order in ASCII really isn't correct either. Do you like all of your upper case words coming before your lower case ones? The sorting order problem is really one of replacing a case translator with a table lookup. Hopefully the table could be made easy to change for working in different languages.

--
Bob Larson
Arpa: Blarson@Usc-Ecl.Arpa
Uucp: {the (mostly unknown) world}!ihnp4!sdcrdcf!oberon!blarson
      {several select chunks}!sdcrdcf!oberon!blarson
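The complaint about upper case sorting before lower case, and the table-lookup remedy, can be illustrated with a minimal sketch. The table `COLLATE` and the helper `table_key` are hypothetical names invented for this example; the table simply gives each upper case letter the same sort weight as its lower case counterpart and leaves every other code unchanged.

```python
# Hypothetical collation table: one sort weight per 8-bit character code.
# By default every code keeps its own value as its weight.
COLLATE = list(range(256))

# Give 'A'..'Z' the same weight as 'a'..'z' (ASCII codes differ by 32).
for upper in range(ord("A"), ord("Z") + 1):
    COLLATE[upper] = upper + 32


def table_key(word):
    """Translate a word into its list of sort weights via the table."""
    return [COLLATE[code] for code in word.encode("ascii")]


words = ["banana", "Cherry", "apple", "Zebra"]

# Plain comparison uses the character codes directly,
# so every upper case word comes first.
ascii_order = sorted(words)

# Comparison through the table ignores case.
table_order = sorted(words, key=table_key)

print(ascii_order)  # ['Cherry', 'Zebra', 'apple', 'banana']
print(table_order)  # ['apple', 'banana', 'Cherry', 'Zebra']
```

Because the ordering lives entirely in the table, changing the order for another language would mean changing the table entries rather than the sorting code.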
guido@boring.UUCP (11/01/85)
In article <150@oberon.UUCP> blarson@oberon.UUCP (Bob Larson) writes:
>The sorting order
>problem is really one of replacing a case translator with a table lookup.
>Hopefully the table could be made easy to change for working in different
>languages.

YES! Decent sorting should always be done by table lookup. As an example, the Macintosh international utilities package sorts strings in this way, and the table can be customized to cope with national variations in the desired dictionary order. The Mac still uses the character set's native ordering to break ties between strings that compare equal using the table (e.g., AA equals aa but precedes it, while aa precedes AB), so the character set's ordering still matters.

I don't know whether the Macintosh character set (which is a superset of ASCII and contains most accented or otherwise slightly modified characters found in various Western European languages, but does not support different alphabets) would be acceptable as a standard, but at least it addresses the problems that are encountered most frequently, it fits in 8 bits and is compatible with ASCII.

(I'm afraid that there is another standard extension of ASCII which uses up the 8th bit for lots of control codes like cursor up. However this does not seem to have caught on very much.)

	Guido van Rossum, CWI, Amsterdam (guido@mcvax.UUCP)
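The two-level comparison described above (table lookup first, native character ordering only to break ties) might be sketched as follows. This is not the Macintosh package's actual code or API; `make_table`, `sort_key`, and the Swedish and German override tables are hypothetical, chosen only to show how a customizable table could express national variations.

```python
def make_table(overrides=None):
    """Build a hypothetical weight table: ASCII letters compare without
    regard to case; overrides supply language-specific weights."""
    table = {chr(code): code for code in range(128)}
    for code in range(ord("A"), ord("Z") + 1):
        table[chr(code)] = code + 32  # 'A' gets the weight of 'a'
    table.update(overrides or {})
    return table


def sort_key(text, table):
    """Primary key: table weights. Secondary key: native code points,
    consulted only when the primary keys are equal."""
    primary = [table.get(ch, ord(ch)) for ch in text]
    secondary = [ord(ch) for ch in text]
    return (primary, secondary)


default = make_table()
print(sorted(["AB", "aa", "AA"], key=lambda s: sort_key(s, default)))
# ['AA', 'aa', 'AB']

# Illustrative national variations for the same words:
# Swedish places o-umlaut after z; German dictionaries treat it like o.
swedish = make_table({"ö": ord("z") + 1, "Ö": ord("z") + 1})
german = make_table({"ö": ord("o"), "Ö": ord("o")})

words = ["zon", "öl", "ost"]
print(sorted(words, key=lambda s: sort_key(s, swedish)))  # ['ost', 'zon', 'öl']
print(sorted(words, key=lambda s: sort_key(s, german)))   # ['öl', 'ost', 'zon']
```

The first result reproduces the example in the post: AA and aa are equal under the table, the native ordering puts AA first, and both precede AB.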
franka@mmintl.UUCP (Frank Adams) (11/04/85)
In article <6672@boring.UUCP> guido@mcvax.UUCP (Guido van Rossum) writes:
>I don't know whether the Macintosh character set (which is a superset
>of ASCII and contains most accented or otherwise slightly modified
>characters found in various Western European languages, but does not
>support different alphabets) would be acceptable as a standard,
>but at least it addresses the problems that are encountered most
>frequently, it fits in 8 bits and is compatible with ASCII.
>
>(I'm afraid that there is another standard extension of ASCII which
>uses up the 8th bit for lots of control codes like cursor up.
>However this does not seem to have caught on very much.)

There is another standard extension of ASCII, the one used on the IBM PC. It has a fair number of modified characters; I don't know how it compares with the Macintosh set. (It does not have the eastern European c's, s's, or z's with curlicues; it does have the vaguely similar French c.) It also has a fair selection of special characters.

I am not actually recommending it, just putting it up for consideration. Given the source, I think it has to be taken into account.

Frank Adams                           ihpn4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108
jack@boring.UUCP (11/05/85)
In article <6672@boring.UUCP> guido@mcvax.UUCP (Guido van Rossum) writes:
>(I'm afraid that there is another standard extension of ASCII which
>uses up the 8th bit for lots of control codes like cursor up.
>However this does not seem to have caught on very much.)
>
>	Guido van Rossum, CWI, Amsterdam (guido@mcvax.UUCP)

As far as I remember, this 8 bit ASCII (which isn't called ASCII, by the way, but ISO-something-or-other) uses codes 0200-0240 for extra control functions, and 0241-0277 for extra characters. I even think that if you take a letter in normal ASCII, and add bit 8, you still have a letter (be it a different one, of course:-).

Since this code seems to have been more-or-less accepted (I know of at least two terminals that accept it, or part of it), I guess the Mac will probably use the same code.

If there is interest, I'll type in the code-table (more-or-less, of course).
--
	Jack Jansen, jack@mcvax.UUCP
	The shell is my oyster.