COMSAT@MIT-MC@sri-unix (11/27/82)
From: Communications Satellite <COMSAT at MIT-MC>

FAILED: Cory at MIT-DSPG; I gave up on sending this after 31 "temporary" errors.
Failed message follows:
-------
Date: 22 Nov 1982 0028-EST
From: Mel Pleasant <WORKS at RUTGERS>
Subject: WORKS Digest V2 #88
Sender: PLEASANT at RUTGERS at MIT-MC
To: WorkS: ;
Reply-To: WORKS at RUTGERS

Works Digest        Monday, 22 November 1982        Volume 2 : Issue 88

Today's Topics:
    Notes - Communications Breakthrough
    Hardware - The Corvus Concept
    Programming - Extending the ASCII character set (6 msgs)
----------------------------------------------------------------------

Mail-from: ARPANET site PARC-MAXC rcvd at 10-Nov-82 2233-EST
Date: 10-Nov-82 19:36:21 PST (Wednesday)
From: Hamilton.es at PARC-MAXC
Subject: Communications Breakthrough
To: Human-nets @ Rutgers

Mail-from: Arpanet host CMU-10A rcvd at 10-NOV-82 0826-PST
Date: 10 November 1982 1126-EST (Wednesday)
From: James.Morris at CMU-10A
To: csl^ at PARC-MAXC, isl^ at PARC-MAXC, junk^ at PARC-MAXC
Subject: Communications Breakthrough
Message-Id: <10Nov82 112614 JM90@CMU-10A>

Because you can't see the person who is sending you electronic mail you are sometimes uncertain whether they are serious or joking. Recently, Scott Fahlman at CMU devised a scheme for annotating one's messages to overcome this problem. If you turn your head sideways to look at the three characters :-) they look sort of like a smiling face. Thus, if someone sends you a message that says "Have you stopped beating your wife?:-)" you know they are joking. If they say "I need to talk to you :-(", be prepared for trouble.
Since Scott's original proposal, many further symbols have been proposed here:

    (:-)     for messages dealing with bicycle helmets
    @=       for messages dealing with nuclear war
    <:-)     for dumb questions
    oo       for somebody's head-lights are on messages
    o>-<|=   for messages of interest to women
    ~=       a candle, to annotate flaming messages

So you see, bit-map displays are really quite unnecessary :->

------------------------------

Date: 21 November 1982 05:51-EST
From: "James Lewis Bean, Jr." <BEAN at MIT-MC>
Subject: The Corvus Concept

Does anyone have anything good to say about one of these machines? I went to see one at a "Computer Store" and seemed to know more about the machine than the sales people did, oh well. It looks great! The price is great. The problem seems to be there is no documentation on the software that exists. And there doesn't appear to be much software for it.

For those of you who do not know what a Corvus Concept is...

The Concept is an under $5K workstation.
    68000 based with 256K standard.
    15 inch 35mhz video
    Bit mapped 720x560 display.
        120x56 in landscape mode
        90x72 in portrait mode
    Two serial ports.
    One omninet interface (1mbps serial rs422)
    Detached Keyboard with lots of extra keys.

Anyone know where I can get unix for it?

lewis
bean at mit-mc

------------------------------

Date: 13 Nov 1982 09:12:38-PST
From: mo at LBL-UNIX (Mike O'Dell [system])
Subject: SUPER ASCII

Flame on!

The folks who are grossly misspeaking about what ASCII is and is not should look at the full specification. A while back there was a series of articles in IEEE Computer written by one of the prime movers in the ASCII effort talking about just such things. All the (serious?) extensions people have mentioned [mathematics, other (non-English!!) languages, some graphics symbology] are included in the ISO code structure, of which 7-bit ASCII is only a TINY subset.
There are defined codes for escaping to a great many alphabets, and if you layer NAPLS on top (which does so quite cleanly as it was designed with full knowledge of the ISO code structure) you can even define the glyphs represented by new codes.

There are in fact TWO issues here - do you seriously want to define a discrete character code for every character in every font in every point size?? If so, 32-bit will barely suffice. The ISO code structure provides "escape" mechanisms for getting to other protocols. One of these other protocols would be a presentation protocol like NAPLS or something similar which will specify glyph representation, font, point size, etc. The base alphabet would only represent the character which is presented in the form "current" in the presentation protocol. This means there is a tight coupling between the layers when handling real data, but a cross-product structure is surely more desirable than a flat enumeration of "all symbols needed for man's knowledge!" (excuse possible mis-quote.)

A good friend of mine has a quote which I dearly love:

    "Creativity is no substitute for knowing what you're doing."

Flame off!

    -Mike

------------------------------

Date: 13 November 1982 15:22-EST
From: "Marvin A. Sirbu, Jr." <SIRBU at MIT-MC>
Subject: Character codes

ISO standard 222 (?) specifies a standard method for extending the 8 bit Teletex code by swapping in an alternate 128 character set. Already 45 different alternate character sets have been registered with the ISO including Greek, Russian, Arabic, etc.

Efforts are now underway within the CCITT Study Group VIII to define a new extension technique which would allow for two bytes to specify a single character -- i.e. a 16-bit code. The primary reason for going to 16 bits is to encode Kanji (Japanese and Chinese) characters.
Marvin Sirbu

------------------------------

Date: 15 Nov 1982 at 1100-PST
Subject: Re: Supercodes
From: chesley.tsca at SRI-Unix

One very simple way to allow extended codes is to define the 8th bit as a "select character set" bit. If on, the lower 7 bits select a new character set (a sort of super shift-out). The default could be ASCII, thus keeping compatibility with existing standards.

A couple of additional features allow even more expansion:

(1) Reserve one of the character set numbers to mean "select second level character set"; i.e., next byte has character set number. This can be repeated indefinitely to allow an arbitrary number of character sets.

(2) Divide the character sets into groups, each with a different number of bits per character: 8, 16, 24, 32. This allows arbitrary size character sets.

Bucky bits can be handled by having "ASCII", "Meta-ASCII", etc. And it's quite simple at the keyboard interface to translate top bit set (i.e., meta) into "change-set,character,change-set". So this could be added to an existing system with very little work, and people can go off and start defining new character sets right away...

--Harry...

------------------------------

Date: 16 Nov 82 8:35:40-PST (Tue)
To: works at Rutgers
From: hplabs!hao!csu-cs!bentson at Ucb-C70
Subject: Re: Supercodes

See ANSI standard X3.41 "...Code Extension Techniques..." for a general structure upon which to hang an extensible character set. There's also a proposed ISO/DP 6937 "Coded Character Sets for Text Communication".

I would like to see more of the "technocrats" in the standards committees; the last meeting I attended, there was a strong push towards restricting a "page image format" character set to a great deal LESS than the 128 char ASCII set because of the number of dumb terminals in existence! Fortunately it was finally recognized that the standard wouldn't be published for years so we would be able to (and were obliged to) look ahead to that time.
If you have a real interest in all this, contact:

    Thomas N. Hastings
    Chairman, X3L2 Character Sets and Coding
    Digital Equipment Corp
    Mailstop MK1-2/J05
    Continental Boulevard
    Merrimack, N.H. 03054
    603 884-6767

He should be able to point you to the proper working group. I have been out of the "establishment of standards" activity for some time now, so I hope the above address is still correct.

Randy Bentson
Colo State U - Comp Sci
{ucbvax!hplabs, menlo70!hao}!csu-cs!bentson

------------------------------

Date: 17 Nov 1982 0657-EST
From: Marc Shapiro <Shapiro at MIT-XX>

A recent message to WorkS discussed the CCITT/Teletex extensions to the Ascii alphabet, correctly noting that it allows all diaeresis combinations (i.e. accents and pronunciation marks on letters) plus a lot of new symbols (arrows, greek letters, math symbols), colors, underlining, etc.

- the standard does *not* require a code per diaeresis combination. Rather, there is a code for the letter + a code for the accent mark. The same is of course true for underlined text, colored text, etc.

- the message stated that the standard needs 8 bits/char., and hence is unsuitable for systems where the 8th bit is reserved for parity or doesn't exist (DEC equipment). Not quite true. The standard provides both for 8-bit and 7-bit encoding, the latter with longer escape sequences. The standard is compatible with Ascii.

Reference:
    CCITT yellow book Volume VII - Fascicule VII.2
    Telegraph and Telematic Services Terminal Equipment
    Recommendations of the S and T series.
or:
    Presentation Level Protocol
    Videotex Standard
    Bell System, May 1981

PS The standard also defines mosaic graphics "a la" bit-map.
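
The "code for the letter + a code for the accent mark" idea in the message above can be illustrated with a small sketch. This is only an analogy, not the CCITT/Teletex encoding itself: it assumes Python's standard `unicodedata` module and uses Unicode combining marks, which follow the base letter. The helper names `compose` and `decompose` are made up for this illustration.

```python
import unicodedata

# Combining marks: each accent gets ONE code, shared by every letter.
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT
DIAERESIS = "\u0308"  # COMBINING DIAERESIS


def compose(letter, accent):
    """Hypothetical helper: join a base letter and one accent code.

    NFC normalization folds the pair into a single precomposed
    character when one exists; otherwise the pair is kept as is.
    """
    return unicodedata.normalize("NFC", letter + accent)


def decompose(text):
    """Hypothetical helper: list the names of the separate codes
    (letter, then accent) that make up the given text."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", text)]


e_acute = compose("e", ACUTE)
u_diaeresis = compose("u", DIAERESIS)

print(e_acute, decompose(e_acute))
print(u_diaeresis, decompose(u_diaeresis))
```

The design point is the same one the message makes: with a separate code per accent, an alphabet of N letters and M accents needs N + M codes rather than one code for each of the N x M combinations.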
------------------------------

Date: 18 Nov 1982 1603-PST
From: Pierre MacKay <MACKAY at WASHINGTON>
Subject: Range of ASCII, alias ISO 646-1973
To: LES at SU-AI
cc: Furuta at WASHINGTON, Binding at WASHINGTON,

Your 8 bit ASCII message of 10 Nov 1982 found its way to me by a somewhat roundabout route, since I am not on the WorkS list, and, given the size of my mail file as it is, I am hesitant to get there.

You underestimate the range of even 7-bit ASCII. In conjunction with the appropriate escape sequences from ISO 2022-1973, alias (for all practical purposes) ANSI X3.41-1974, the good old 7-bit table speaks several languages. For instance:

Greek---ISO 5428-1980 (I haven't actually seen this yet.)

Japanese---National standard C6220-1969 (katakana only, of course, and this, in the form JISCII is a true 8-bit code, with ASCII residing in columns 0..7 and katakana in columns 10..13.)

Russian---GOST 13052-67, a dreadful aberration set up for the use of SO and SI coding, with the Cyrillic alphabet scrambled to match the visually similar Latin letters. Why even a Commissar would want to do that to his own language is beyond me, but it is AUTHORITATIVE, under the circumstances.

The Arabic case is chaos. There is no reason why a good, efficient Arabic script coding table cannot be included in a 7-bit range. I am working with one now, but it is rather my own invention. It resembles some of the work done by ISO TC-46 and similar work done at the Library of Congress. There was a fine suggestion put forward at Riyadh, Saudi Arabia, about two and a half years ago, but it came to nothing, and a dreadful Moroccan notion, cobbled up out of a set of linotype matrices now has a certain currency, in that it has been registered, whatever that means, as Number 59, dated June 1, 1982 with ISO. It includes 4 ISO 2022 escape sequences to identify G0, G1, G2, and G3 graphic sets, but does not say what is to be done with all these alternatives.
ECMA has plunged into the same waters with an entirely different proposal, which may even be worse. They all seem to assume that all Arabic ligature forms must be shown in the coding table, rather as if Don Knuth's TeX were to require the elimination of the open and close brace character positions so that you could code the double-f ligatures directly. The implications of microprocessor technology have not yet got through.

Urdu, Pashto and Sindhi would probably overload a 7-bit table, since you are really dealing with two incompatible alphabets mashed into one in those cases. Malay and Chinese-Turkish (as seen on the lower right corner of PRC banknotes) will fit. Persian, of course, will fit easily, as will Ottoman Turkish, a language for which I have a bizarre atavistic affection.

Western Europe and Hungary have national versions of ISO 646 to account for heavily used diacriticals. I don't know about Czech, which is a bit overloaded. Modern Turkish is a nice problem too.

I believe the Sanskrit-derived Indian languages would fit, and the Tamil family would certainly fit in a 7-bit table.

Chinese, and Japanese Kanji would not. The Japanese use a manageable subset of Chinese ideographs, and have already established a multi-bit code. One proposal for Chinese uses the 94 cells available in the Graphic area of ISO 646 in a three level code. There are 94 books of 94 pages each of 94 characters each, or 94 to the third power possible characters. That should suffice even for Chinese.

--Pierre MacKay

------------------------------

Date: 18 Nov 82 15:30:53-PST (Thu)
From: hplabs!intelqa!omsvax!bc at Ucb-C70
Subject: Re: Supercodes

The address given for Tom Hastings of the ANSI X3L2 character codes committee was out of date. The correct address is:

Thomas N. Hastings
Chairman, X3L2 Character Sets and Codings
Digital Equipment Corp.
Mailstop ML1-2/H26
146 Main St.
Maynard, Mass.
01754

As a member of the standards community (X3H3, computer graphics, and liaison to X3L2), I implore anyone planning on extending the ASCII (or any other) character set to contact Tom and get information from him. There are far too many partly or wholly incompatible "standards" in the world now; more are not needed. Besides which, someone out there may have already solved your problem, and saved you a lot of work. I think it was Robert Heinlein who said that a good engineer is adept at recognizing good work, and using it, with the serial numbers filed off as appropriate.

Bruce Cohen
...{pur-ee,hplabs}!intelqa!omsvax!bc

------------------------------

End of WorkS Digest
*******************
-------