nukim@ndsuvax.UUCP (kyongsok kim) (04/12/89)
When 7-bit ASCII code is used on 8-bit machines, I guess that the MSB (most significant bit) is set to zero. For example, "A" is 100 0001 in 7-bit ASCII code and it will be represented as 0100 0001 on 8-bit machines.

In some book, I found that there is an 8-bit ASCII-8 code, which is different from the 7-bit code w/ a leading zero prefixed. The book says that, for example, "A" is 1010 0001 and "1" is 0101 0001 in 8-ASCII code.

My questions are:

1) What is ASCII-8 code? Is there a good reference or table?
2) Is ASCII-8 different from the 7-bit ASCII code w/ a leading zero prefixed?
3) Where is this code used?

Thanks in advance. Please send e-mail.

Kyongsok Kim
Dept. of Comp. Sci., North Dakota State University
e-mail address: nukim@plains.nodak.edu
                nukim@ndsuvax.bitnet
                uunet!ndsuvax!nukim
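
[For readers who want to check the arithmetic in the question: the following is a minimal Python sketch, not part of the original post. It zero-extends the 7-bit ASCII code to 8 bits and compares the result with the two values the book is quoted as giving. The names zero_extend and BOOK_ASCII8 are made up for this illustration, and the sketch makes no claim about how the book's code is actually derived.]

```python
def zero_extend(ch: str) -> str:
    """Return the 7-bit ASCII code of ch, padded to 8 bits with a leading zero."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    return format(code, "08b")


# The two "ASCII-8" values quoted from the book in the post above.
BOOK_ASCII8 = {"A": "10100001", "1": "01010001"}

for ch, book_bits in BOOK_ASCII8.items():
    ours = zero_extend(ch)
    # Print the character, its 7-bit code, the zero-extended code,
    # the book's value, and whether the two 8-bit forms agree.
    print(ch, format(ord(ch), "07b"), ours, book_bits, ours == book_bits)
```

[Both comparisons come out unequal, so the book's code cannot simply be 7-bit ASCII with a zero in front.]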
nukim@ndsuvax.UUCP (kyongsok kim) (04/18/89)
In article <2542@ndsuvax.UUCP> nukim@ndsuvax.UUCP (kyongsok kim) writes:
:
:     In some book, I found that there is a 8-bit ASCII-8 code,
:which is different from the 7-bit code w/ a leading zero prefixed.
:The book says that, for example, "A" is 1010 0001 and "1" is 0101 0001
:in 8-ASCII code.

Thanks to all who responded to my question. Here goes the summary:

> The original IBM System 360 had a special ASCII-8 mode ...
> It was never implemented...
>
> ... a form that IBM introduced with the 360 back in the 1960s.  It
> was not a superset of standard ASCII and died a quiet death.  That may
> be what your book was referring to.  If so, ignore it except for
> computer archeology purposes.
>
> I know of no systems where any such 8-bit ASCII code is used.

k kim
billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu (William Thomas Wolfe,2847,) (04/18/89)
From article <2568@ndsuvax.UUCP>, by nukim@ndsuvax.UUCP (kyongsok kim):
> In article <2542@ndsuvax.UUCP> nukim@ndsuvax.UUCP (kyongsok kim) writes:
> :     In some book, I found that there is a 8-bit ASCII-8 code,
> :which is different from the 7-bit code w/ a leading zero prefixed.
> :The book says that, for example, "A" is 1010 0001 and "1" is 0101 0001
> :in 8-ASCII code.
>
> Thanks to all who responded to my question. Here goes the summary:
>
>> The original IBM System 360 had a special ASCII-8 mode ...
>> It was never implemented...   (etc.)

8-bit ASCII is simply the American Standard corresponding to ISO Latin 1, ISO 8859/1-9. The statement of equivalence, and a table displaying the character set, appeared in Byte several years ago (circa 1985-1987); unfortunately, I don't remember the exact issue, nor have I ever gotten around to looking it up. (One of those things I've always meant to do, but never gotten done.)

At any rate, check Byte over roughly that time span, and post the *exact* reference for the rest of us, if you would...

(BTW, since 8-bit ASCII contains all the European characters, it is quite unfortunate that there is so much inertia in industry...)

Bill Wolfe, wtwolfe@hubcap.clemson.edu
wtwolfe@hubcap.clemson.edu (Bill Wolfe) (04/21/89)
[This followup was sent to me by Barry Siegfried, who requested that I post it to comp.std.internat...]

From: bs7086@wucs2.wustl.edu (Barry Siegfried)
Subject: Re: 7-bit ASCII vs. 8-bit ASCII
Summary: Byte article on 8-bit ASCII draft standard

In article <5153@hubcap.clemson.edu>, billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu (William Thomas Wolfe,2847,) writes:
>
> 8-bit ASCII is simply the American Standard corresponding to
> ISO Latin 1, ISO 8859/1-9.  The statement of equivalence, and
> a table displaying the character set, appeared in Byte several
> years ago (circa 1985-1987); unfortunately, I don't remember the
> exact issue, nor have I ever gotten around to looking it up. [...]
>
> At any rate, check Byte over roughly that time span, and post the
> *exact* reference for the rest of us, if you would...

The Byte article (August 1985, pp 24-25) was written by Thomas N. Hastings of Maynard, MA, and was titled "8-bit ASCII Draft Standard." It was a letter to the editor.

Please post this to comp.std.internat. I can read that group but can't post to it.

Thanks,

Barry Siegfried
bs7086@wucs2.wustl.edu
greger@ism780b (Greger Leijonhufvud) (04/25/89)
In article <Apr.19.10.41.28.1989.7554@paul.rutgers.edu> halldors@paul.rutgers.edu (Magnus M Halldorsson) writes:
>The ISO 8859 character sets specify sets for specific languages. Now
>what if one wants to use a combination of those? Is there any standard
>for storing, representing, and switching between various (ISO)
>character sets? What if one wants to allow for Japanese or Chinese as
>well?
>
>Magnus

There are several standardized (and several not yet blessed) techniques for "mixing codesets". The /usr/group Subcommittee on Internationalization has been studying several techniques for a while, and may even propose something to POSIX (or whoever the appropriate forum is).

The AT&T "EUC" (Extended UNIX Codes) method is the only one so far implemented within UNIX for "internal use". This was done in Japan, because the Japanese language typically is written with 3 different script systems (Kanji, Katakana and Hiragana).

The EUC scheme is based on the ISO 2022 single-shift coding:

7-bit ASCII is always present as code set 0. All other code sets must have the high-order bit set in all bytes.

Code set 1 is distinguished by the high-order bit set.

Code set 2 has the high-order bit set, and each character is preceded by the ISO 2022 SS2 (8e) character.

Code set 3 has the high-order bit set, and each character is preceded by the ISO 2022 SS3 (8f) character.

This scheme supports (in theory) 4 different code sets. For 8859 compatible code sets, of course, it only supports 3 (as ASCII is part of each code set), and it does not support code sets that do not conform to ISO 2022 (such as the IBM Extended ASCII used on PCs, or the Shift-JIS code set).

A more generalized scheme is the "Compound String" method, also endorsed by ISO. It may very well be the X Windows encoding scheme for interchange or internal representation.

There are also other encoding schemes, by Sun, Xerox and other companies.

There is, however, no standard as yet. Unfortunately. But, from V.4, you should be able to mix Icelandic with Bulgarian, and get your Greek quotations OK, too.

Greger Leijonhufvud
Interactive Systems Corp.
Sunny Santa Monica, Ca.
uunet!ism780c!greger
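
[The EUC rules described above can be sketched in a few lines of Python. This is a hedged illustration added for readers, not AT&T's implementation. The function name split_euc and the bytes-per-character table are assumptions: the widths shown (2, 1 and 2 bytes for code sets 1, 2 and 3) are only an example loosely modelled on Japanese usage, and a real locale defines its own.]

```python
SS2 = 0x8E  # ISO 2022 single-shift two
SS3 = 0x8F  # ISO 2022 single-shift three


def split_euc(data: bytes, widths=(1, 2, 1, 2)):
    """Split an EUC byte string into (code_set, character_bytes) pairs.

    widths[n] is the assumed number of bytes per character in code set n,
    not counting the SS2/SS3 prefix byte.
    """
    result = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:        # high-order bit clear: code set 0 (7-bit ASCII)
            code_set, start = 0, i
        elif b == SS2:      # SS2 prefix: code set 2
            code_set, start = 2, i + 1
        elif b == SS3:      # SS3 prefix: code set 3
            code_set, start = 3, i + 1
        else:               # high-order bit set, no prefix: code set 1
            code_set, start = 1, i
        end = start + widths[code_set]
        char = data[start:end]
        # Every byte of a non-ASCII character must have the high-order bit set.
        if len(char) != widths[code_set] or (
            code_set != 0 and any(c < 0x80 for c in char)
        ):
            raise ValueError(f"malformed character at offset {i}")
        result.append((code_set, char))
        i = end
    return result


# One character from each code set; the byte values are arbitrary examples.
print(split_euc(b"A\xb0\xa1\x8e\xb1\x8f\xb0\xa1"))
```

[Because code set 0 never has the high-order bit set and the other sets always do, plain ASCII text passes through such a scanner unchanged, which is the property that makes the scheme convenient for internal use.]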
rja@edison.GE.COM (rja) (04/25/89)
In article <Apr.19.10.41.28.1989.7554@paul.rutgers.edu>, halldors@paul.rutgers.edu (Magnus M Halldorsson) writes:
> The ISO 8859 character sets specify sets for specific languages. Now
> what if one wants to use a combination of those? Is there any standard
> for storing, representing, and switching between various (ISO)
> character sets? What if one wants to allow for Japanese or Chinese as
> well?
>

The Chinese standard is reportedly going to reserve the characters (decimal) 0 thru 255 for romanised characters. I've forgotten what the Japanese standard says, but it is possible that 128-255 are used for either Hiragana or Katakana.

SS2 and SS3 are frequently used to shift character sets. A good place to look for European usage is X/OPEN. For Asian character sets, you'll have to acquire the standards.
deh0654@sjfc.UUCP (Dennis E. Hamilton) (04/28/89)
In article <26644@ism780c.isc.com> greger@ism780b.UUCP (Greger Leijonhufvud) writes:
>In article <Apr.19.10.41.28.1989.7554@paul.rutgers.edu> halldors@paul.rutgers.edu (Magnus M Halldorsson) writes:
>>The ISO 8859 character sets specify sets for specific languages. Now
>>what if one wants to use a combination of those? Is there any standard
>>for storing, representing, and switching between various (ISO)
>>character sets? What if one wants to allow for Japanese or Chinese as
>>well?
>[discussion of EUC and other Unix-flavored proposals]
>There are also other encoding schemes, by Sun, Xerox and other
>companies.
>
>There is, however, no standard as yet. Unfortunately. But, from V.4,
>you should be able to mix Icelandic with Bulgarian, and get your
>Greek quotations OK, too.

There has been an ISO scheme for mixing code sets for some time now. ISO 2022-1973 specified basically unlimited code-extension techniques, and you can use either 7-bit or 8-bit ASCII to carry it in. (The 7-bit scheme has a shifting scheme for getting to the other codes that would normally have bit 8 set.) Although it can be a little painful, there are alphabet registration systems that allow international identification of the code used in the code stream itself. You can use the code identification procedure to switch the 7/8-bit "window" over those codes you want to use at any particular time.

When the special 8-bit character codes that are talked about here become approved, there will presumably also be internationally approved "announcement sequences" for shifting in and out of them.

This works for communication better than for internal processing, of course. For what you want to see *internally* in a particular computer system, I suppose POSIX and other standards will have to make provision (and the C Language will become interesting, too). But for interchange purposes via 7/8-bit data streams, all of the machinery has been defined for some time, including the procedure for international registration of special code tables.

--
Dennis E. Hamilton {uucp: ... !rochester!cci632!sjfc!deh0654}
Robert Anson Heinlein, 1907-1988
May the First Muster always answer to your names.
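
[As an illustration of the 7-bit shifting mentioned above, here is a rough Python sketch added for readers. It uses the SHIFT OUT (0E) and SHIFT IN (0F) control characters to carry bytes that would normally have bit 8 set over a 7-bit channel. The function names shift_encode and shift_decode are made up, and the sketch is deliberately simplified: it handles only the bytes A1 through FE, and it leaves out the escape sequences that designate and announce which code set is in use.]

```python
SO, SI = 0x0E, 0x0F  # SHIFT OUT / SHIFT IN control characters


def shift_encode(data: bytes) -> bytes:
    """Rewrite 8-bit data as 7-bit data, bracketing high-half bytes with SO/SI."""
    out = bytearray()
    shifted = False
    for b in data:
        if b in (SO, SI):
            raise ValueError("input already contains shift controls")
        if b < 0x80:
            want_shift = False
        elif 0xA1 <= b <= 0xFE:
            want_shift = True
        else:
            raise ValueError(f"byte {b:#04x} is outside the scope of this sketch")
        if want_shift != shifted:
            out.append(SO if want_shift else SI)
            shifted = want_shift
        out.append(b & 0x7F)  # drop bit 8; the shift state records it instead
    if shifted:
        out.append(SI)        # always end in the unshifted state
    return bytes(out)


def shift_decode(data: bytes) -> bytes:
    """Inverse of shift_encode: restore bit 8 on bytes received while shifted out."""
    out = bytearray()
    shifted = False
    for b in data:
        if b == SO:
            shifted = True
        elif b == SI:
            shifted = False
        elif b >= 0x80:
            raise ValueError("not 7-bit data")
        elif shifted and 0x21 <= b <= 0x7E:
            out.append(b | 0x80)
        else:
            out.append(b)
    return bytes(out)


original = b"caf\xe9!"  # E9 is e-acute in ISO 8859-1
encoded = shift_encode(original)
print(encoded, shift_decode(encoded) == original)
```

[The receiver has to track the shift state across the whole stream, which is one reason such a scheme suits interchange better than internal processing.]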