gs732@uxe.cso.uiuc.edu (04/01/88)
8-BIT ASCII STANDARD ON UNIX-POSIX

Hello, everyone,

Have you ever dreamed that TeX were more WYSIWYG, or that you could type Greek characters directly in text mode? If we had an extended 256-character 8-bit ASCII set such as the IBM PC's (see the Appendix of the PC DOS Manual), things would be much easier.

Then why not use WordPerfect or MS Word? First, they do not support all the Greek characters and math symbols unless we buy extra software and hardware. In IBM's extended ASCII, there is no 'Greek tau', 'Greek nu', or inverted Greek capital delta symbol for partial differential equations. Even the registered trade mark sign 'R in a circle' does not exist.

Then you might ask, why not ChiWriter or T-cube? Simply, they are not portable! They are graphics programs. They are not public-domain. One of them is really expensive.

So TeX has been thought to be a better choice for technical writers. But that is not because it is easy to use; it is because the text is portable and it is sometimes more versatile than PC word processors. In fact, without a laser printer, or VorTeX and a graphics workstation, TeX is not as useful as PC word processors.

So what we need should be 1) ASCII text files, for portability, and 2) easy to use. Then how about having Greek characters and math symbols in the ASCII character set itself?

I've got an idea for all of us, and I wish to write a letter to the ANSI people about a new 256-character 8-bit extended ASCII standard. But I don't know ANSI's address. So if you agree with my idea, please forward this message to ANSI with your opinion.

Let's have an Ext key on our keyboards at the same location as the 'Alt' key on IBM's Enhanced keyboard. (I am using the term 'Ext' to distinguish it from GNU-Emacs' meta editing keys. However, the real name of the latter half of this extended ASCII set should be 'meta', since that is what it is called in the termcap files.)
Ext-p (F0-hexdec) will give us a printable character Greek-pi, and Ext-shift-p (D0-hexdec) will give us a printable character Greek-Pi (capital pi) directly. IBM's Greek-pi is at E3 in hexadecimal, which matches 'c' (63-hexdec) among the 128 7-bit ASCII codes. So every word processor has its own way of producing pi, which lowers the portability of word-processed texts. At the end of this posting, I propose my draft. Please see and examine it.

One may oppose this draft because the existing printers might not be usable. We can still use them with a mere printer driver plus translator software, as long as we do not type into the original text any character not supported by the printer.

I understand that the standardization of 8-bit extended ASCII is too late. However, I know that once this is implemented in a new version of UNIX or POSIX, everyone will slowly follow. Now people are gathering to standardize UNIX, POSIX, SVID, or whatever. Now is the time to express our opinion to the ANSI people. If we lose this chance we will never have a standard 8-bit ASCII. If you agree with my idea, write a letter to ANSI, the POSIX committee (IEEE CS/P1003), and the acting System V.4 committee members of AT&T-Sun-Unisys(?) immediately for their prompt action. Unfortunately, I do not know any of those addresses.

I really do not know whether I am the first to make this effort. Nor do I know whether such an extended ASCII made by ANSI already exists. Since no text-mode terminal has built-in math fonts, I think there is no such standard so far.

More than one half of college graduates in the world are engineers, scientists, or medical doctors. They need English with Greek characters and math fonts to write reports, homework, and papers. The need sometimes exceeds that of their own language support.
They need a knowledge bank that can save some great idea like
    E = h-bar.omega : Einstein's photoelectric effect,
    E = mc^2        : Einstein's relativistic energy
without backslashes or $'s, and yet portable.

Thank you for your attention.

G. Hugh Song
Coordinated Science Lab.
Univ. of Illinois at Urbana-Champaign
1101 W. Springfield Av.
Urbana, IL 61801
song@uispg.csl.uiuc.edu

============================================================

Here is my draft of the new 256-character 8-bit ASCII set. I place the second half of the 8-bit characters (128-255) next to the first half.

I am undecided on what to assign to the following Ext-control keys (80-hexdec to 9f-hexdec). There are two options:

1) We can assign new control keys which have become necessary as computer science evolves. Some examples are shown below. I wish that someone in the field would rearrange the assignment and complete this, since I do not have enough knowledge of the current implementation status of i/o utilization.

2) Or we may give some freedom to the manufacturers of keyboards and terminals.

Even though these (00-hexdec to 1f-hexdec and 80-hexdec to 9f-hexdec) are not legitimately printable while editing a text file, I wish there were corresponding printable characters, such as the graphical framing characters of the IBM PC or a left-pointing triangle for ^H, not just the current '^', which cannot be distinguished from 5e-hexdec. It would ease debugging communication problems.

hex key name    hex name  function

00  ^@  nul     80  sml   decreases character size and increases back
01  ^a  soh     81
02  ^b  stx     82  bld   boldifies and unboldifies (toggle)
03  ^c  etx     83
04  ^d  eot     84  dwn   steps down one half line spacing
05  ^e  enq     85
06  ^f  ack     86
07  ^g  bel     87  grp   enters and exits graphics mode

08  ^h  bs      88  hlp   invokes help universally
09  ^i  ht      89  itl   italicize and deitalicize from now
0a  ^j  nl      8a
0b  ^k  vt      8b  mlm   mouse left movement
0c  ^l  np      8c  mlb   mouse left button
0d  ^m  cr      8d  mmb   mouse middle button
0e  ^n  so      8e  mdm   mouse downward movement
0f  ^o  si      8f

10  ^p  dle     90  mum   mouse upward movement
11  ^q  dc1     91  mrm   mouse right movement
12  ^r  dc2     92  mrb   mouse right button
13  ^s  dc3     93  scr   scripticizes or unscripticizes (toggle)
14  ^t  dc4     94
15  ^u  nak     95  up    steps up one half line spacing
16  ^v  syn     96  rev   reverses or reverses back characters' black and white
17  ^w  etb     97

18  ^x  can     98
19  ^y  em      99
1a  ^z  sub     9a
1b  ^[  esc     9b  atn   escapes during communication, calling attention of the local control
1c  ^\  fs      9c
1d  ^]  gs      9d
1e  ^^  rs      9e
1f  ^_  us      9f

(8b to 92, the mouse codes: Important! no matter whether these are meta-control keys or escape sequences.)

Now in the following we have printable characters, except the 'DEL' key at the end of the lower 7-bit codes. The Alt key may be used to send the 8-bit code to the host computer by simulating this key with Kermit's 'set key' command, such as in MSKERMIT version 2.30.

For the 7-bit terminal environment, in which 8-bit signals are not generated or received by the terminal (such as the VT100), it is desirable for the C-shell or the editor to have a key which tells the host computer that the next key is one of the upper 8-bit codes (128-255). This key should not conflict with a control key of the existing editor programs. The 'esc' key might be thought the best choice. However, most editor programs use this key heavily for other purposes. To avoid conflict, the 'cr (Cntrl-m)' key, which is redundant both in vi and in GNU-Emacs (you might have noticed that 'C-m' is changed to 'nl (C-j)' automatically by both editors), may be used.

This will limit the use of the Meta key in our (or Stallman's) GNU-Emacs. This actually means no revision in GNU-Emacs. We just use the ESC key to invoke the Meta editing keys, although the keyboard has the Meta key. This is the price we pay for those Greek characters and the math symbols. If we use 'Cntrl-h' for the real backspace, we have to choose another key for invoking 'help' in GNU-Emacs.
How about 'Ext-Cntrl-h' ('88-hexdec') (or 'C-m C-h' on the 7-bit terminal) as a key for invoking help in the future version (Ver. 19) of GNU-Emacs? This is the only change which is not compatible with the present version (Ver. 18). I'd like to suggest that 'Ext-Cntrl-h (88-hexdec)', or 'Cntrl-m Cntrl-h' on the 7-bit terminal, be a new standard key invoking help in every software package in the future. Isn't it a good idea?

hex char    hex  character

20  sp      a0   a horizontal bar longer than just '-'
21  !       a1   a black square
22  "       a2   the starting double quotation mark
23  #       a3   \neq : not-equal sign '/=' in one character site
24  $       a4   the Pound symbol (U.K. money unit)
25  %       a5   \div : the division symbol, ':-' in one character site
26  &       a6   \cap : the common set in set theory, the inverted 'U'
27  '       a7   the starting single quotation mark

28  (       a8   the top portion of the left parenthesis
29  )       a9   the top portion of the right parenthesis
2a  *       aa   a small circle at the ' level that usually represents degree
2b  +       ab   \pm : '+-' in one character site with + up and - down
2c  ,       ac   the cedilla symbol without c, s, or C
2d  -       ad   \mp : '-+' in one character site with - up and + down
2e  .       ae   \cdot : a dot at the center
2f  /       af   $\dot $ : a dot at the top

30  0       b0   the bottom portion of the right parenthesis
31  1       b1   \propto : the proportionality symbol, 'oc' in one character site
32  2       b2   \bigcirc : a big circle
33  3       b3   \prime \prime \prime : triple-prime
34  4       b4   a vertical line with a wart in the middle as in '{'
35  5       b5   a vertical line with a wart in the middle as in '}'
36  6       b6   \partial : the mirror image of '6'
37  7       b7   \cup : the symbol in the set theory, that looks like 'U'

38  8       b8   \infty : the infinity symbol, 'oo' in one character site
39  9       b9   the bottom portion of the left parenthesis
3a  :       ba   $\ddot $ : the umlaut, two dots overhead
3b  ;       bb   \prime \prime : the double-prime
3c  <       bc   \le : '_<' in one character site
3d  =       bd   \equiv : '=_' in one character site for the defining equality
3e  >       be   \ge : '_>' in one character site
3f  ?       bf   \supset : superset symbol

40  @       c0   the registered trademark sign, a small capital R in a circle
41  A       c1   angstrom, a small circle on top of 'A'
42  B       c2   \rightarrow : an arrow heading east
43  C       c3   \copyright : a small capital 'C' in a circle
44  D       c4   \Delta
45  E       c5   \in : 'an element of' symbol in set theory
46  F       c6   \Phi
47  G       c7   \Gamma

48  H       c8   \hbar : accented italic h for the Planck constant
49  I       c9   the top portion of the integral symbol
4a  J       ca   the bottom portion of the integral symbol
4b  K       cb   \simeq : a set symbol (obtained from U by rotating it 90 deg CW)
4c  L       cc   \Lambda
4d  M       cd   \subset : symbol in the set theory
4e  N       ce   \nabla : inverted Greek-capital-Delta
4f  O       cf   \Omega

50  P       d0   \Pi
51  Q       d1   \Theta
52  R       d2   \surd : also makes a \sqrt if combined with underlines (__)
53  S       d3   \Sigma
54  T       d4   the trade mark sign, the superscripted 'TM'
55  U       d5   \Upsilon
56  V       d6   \leftarrow : an arrow heading west
57  W       d7   \ddag : the double dagger symbol used for a footnote

58  X       d8   \Xi
59  Y       d9   \Psi
5a  Z       da   \downarrow : an arrow heading south
5b  [       db   \lceil : a vertical line whose top is clamped to the right
5c  \       dc   \times : 'x' without serif, math symbol for a multiplication
5d  ]       dd   \rceil : a vertical line whose top is clamped to the left
5e  ^       de   $\check $ : an accent symbol inverted from '^'
5f  _       df   $\overline $ : a long bar on top

60  `       e0   \prime (60-hexdec is a back-prime)
61  a       e1   \alpha
62  b       e2   \beta
63  c       e3   \chi
64  d       e4   \delta
65  e       e5   \epsilon
66  f       e6   \phi
67  g       e7   \gamma

68  h       e8   \eta
69  i       e9   \iota
6a  j       ea   \smallint : the integral symbol, elongated s
6b  k       eb   \kappa
6c  l       ec   \lambda
6d  m       ed   \mu
6e  n       ee   \nu
6f  o       ef   \omega

70  p       f0   \pi
71  q       f1   \theta
72  r       f2   \rho
73  s       f3   \sigma
74  t       f4   \tau
75  u       f5   a wiggle positioned at the underline(_) level
76  v       f6   \vec : the short arrow symbol that represents a vector
77  w       f7   $\dagger $ : the dagger symbol used for a Hermitian conjugate

78  x       f8   \xi
79  y       f9   \psi
7a  z       fa   \zeta
7b  {       fb   \lfloor : a vertical line whose bottom is clamped to the right
7c  |       fc   \| : two vertical lines in one character site
7d  }       fd   \rfloor : a vertical line whose bottom is clamped to the left
7e  ~       fe   \sim : a wiggle positioned at the center level
- - - - - - - - - - - - - - - - - - -
7f  del     ff   erh : erase the character at the current cursor position
-------------------------------------------------------------

These can all reside in the text mode in 8-bit mode, so that any text-mode terminal can display them directly on the text-mode screen.

The possible benefits of this extension are:

1. If every typesetting program is revised according to the new standard, they will become more WYSIWYG. It means we do not need to type '\alpha' while typing a TeX file.
2. Word processors and typesetting programs will be cheaper, since they do not need to include soft-font files or hard-font ROM.
3. Word processor files can easily be exported and imported from one word processor to another without losing special characters, as long as those characters reside in the 256-character set.

In addition to this new extended ASCII, I think that some of the present ASCII characters should be redesigned as follows:

"   22   should be designed to look more like the closing double quotation mark as in typeset books.
'   27   the closing single quotation mark or apostrophe; same comment as above (" 22-hexdec)
*   2a   position this a little higher than the present height so that it looks like a footnoting symbol, not like a multiplication symbol.
/   2f   stretch this so that two of these can be connected without breaking to make a long slanted line.
\   5c   the same comment as above
_   5f   the same comment as above, so that it should be \underbar{ }
|   7c   make this a single long vertical line rather than the present one broken at the middle.

The current ANSI standard for erasing the previous character is DEL, not backspace! Let us encourage everyone to observe this standard. I know that the troublemaker IBM does not follow this standard. Let them go their way. We do not care for IBM. We are talking about UNIX and GNU-Emacs and TeX. Then backspace will do the following job in GNU-Emacs and vi.

^h  08  bs    a backspace key without erasing the previously typed character, making an overprinted image when printed.

I think this special function of this key is actually in the present ANSI standard. You might have noticed that the UNIX 'man'ual pages contain '^H' in their text files for underlining. It seems now fully supported by most ANSI terminals (but not on IBM's). Nevertheless, it is not supported by vi or GNU-Emacs. Let's encourage Mr. Stallman to support this in his new version of GNU-Emacs. It will display every accented vowel for foreign alphabets, the cent (money unit), some foreign money units, the C-cedilla ('Ext-,-backspace-c'), and the null set symbol ('0/' in one character site). To edit this backspace we need a special character for it. A hollow triangle directed to the left is good enough. Also, Emacs (maybe not vi) will have a mode to view this while editing.

^m  0d  met   In due consideration, the mnemonic should be changed from 'cr' to 'met'a.

==========================================
KEYBOARD
-----------------

This part is not part of my proposal. I just wish that the new ANSI ASCII keyboard had the following keys. One may assign some function keys for the following purposes. But it goes without saying that separate keys at the space bar level are more desirable.

For text/graphics terminals

Italic key : italicizes the normal character. This key should be active only on the alphabetic characters and Greek capital characters, but not on numeric characters or symbols like '%', '+', '"', etc. On a black-and-white text-mode-only terminal which does not have ROM to support various fonts (such as VT100), it would be desirable if this key reverses white and black of those characters between the two italic keys. Black becomes white, white becomes black. (Toggle)

Bold key : boldens or highlights a character. (Toggle)

For graphics terminals

Step-up key : moves the position 1/2-line higher; then use the step-down key to go back to the original line height.
Step-down key : moves the position 1/2-line lower; then use the step-up key to go back.
Script key : displays the scripted characters. (Toggle)
Small character key : displays small characters from now and restores the size back. (Toggle)

As to the keyboard layout, we do not need to have the editing keypad on the right. Why don't we move it to the left, leaving more space for the mouse?

=====================================End of draft=======

P.S. At first, I did not intend to do this as a project. However, it turned out to be a big project. Now I want to drop this project and release it to the public by posting it on the news system here. I hope everybody will express their opinions and have a fruitful discussion here. And finally I hope to see the ANSI or POSIX committee act. Please start this project and act, ANSI.
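The key-to-code rule running through the draft (Ext-p is F0-hexdec, Ext-shift-p is D0-hexdec, Ext-Cntrl-h is 88-hexdec, and 'C-m' as a prefix on 7-bit terminals) amounts to setting the eighth bit of the ordinary ASCII code. Here is a minimal sketch of that rule in Python, under some assumptions: the function names are hypothetical, only five entries of the draft table are copied in, and the "<xx>" fallback for unlisted codes is an invention for illustration, not part of the draft.

```python
EXT_BIT = 0x80   # Ext sets the eighth bit of the 7-bit code
PREFIX = 0x0D    # C-m, the 7-bit stand-in for Ext suggested in the post

# A few entries copied from the draft table (code -> TeX name); not the full set.
DRAFT_NAMES = {
    0xF0: r"\pi", 0xD0: r"\Pi", 0xE1: r"\alpha",
    0xCE: r"\nabla", 0xC8: r"\hbar",
}

def ext(key: str) -> int:
    """Code sent by Ext+key: the key's 7-bit ASCII code with bit 7 set."""
    code = ord(key)
    if code > 0x7F:
        raise ValueError("key must be 7-bit ASCII")
    return code | EXT_BIT

def decode_7bit(stream: bytes) -> bytes:
    """Expand 'C-m <key>' pairs from a 7-bit terminal into 8-bit codes."""
    out = bytearray()
    it = iter(stream)
    for b in it:
        if b == PREFIX:
            nxt = next(it, None)
            if nxt is None:
                out.append(PREFIX)          # trailing prefix: pass through
            else:
                out.append(nxt | EXT_BIT)   # prefixed key becomes its Ext code
        else:
            out.append(b)
    return bytes(out)

def to_tex(data: bytes) -> str:
    """Translate draft codes to TeX names, as a translator program might."""
    parts = []
    for b in data:
        if b in DRAFT_NAMES:
            parts.append(DRAFT_NAMES[b] + " ")
        elif b < 0x80:
            parts.append(chr(b))
        else:
            parts.append(f"<{b:02x}>")      # hypothetical marker for unlisted codes
    return "".join(parts)

print(hex(ext("p")), hex(ext("P")))          # 0xf0 0xd0
print(decode_7bit(b"\x0d\x08").hex())        # 88
print(to_tex(decode_7bit(b"2\x0dpr")))       # 2\pi r
```

Note that this sketch treats every C-m as the prefix, so a plain carriage return can no longer be sent on its own; that is the trade-off the post accepts when it calls 'cr' redundant.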
ken@cs.rochester.edu (Ken Yap) (04/09/88)
You need more than just an extended font. You also need ways to construct extended integral signs and large sigmas, etc. I sometimes wish I didn't have to type $\Sigma$, etc., but this is made up for by the excellent (no, make that fantastic) job TeX does on character placement without manual intervention. In any case, the time required to type this stuff in is but a fraction of the time required to come up with something worth publishing.

Ken
gs732@uxe.cso.uiuc.edu (04/17/88)
Re: 8-bit ASCII on UNIX
Reply to ken@cs.rochester.edu

I agree with you on every point on which you commented. But where do you keep your everyday notes while you are working? Isn't it in your notebook, at best one with three holes for binding? I wish I could keep everyday notes on a computer, in a file (a kind of hypertext file) which is really portable.

As long as the 8th bit is not being utilized, it is not a big deal to have a new standard for it. Let's make it to the point. Suppose there are two OS worlds. Everything is the same between the two OS's except the utilization of the 8th bit. There is no compatibility problem in one direction in plain text mode. Now, given the two OS worlds with the same number of implementations, which world will you choose?

Hugh
gs732@uxe.cso.uiuc.edu
song@uispg.csl.uiuc.edu
ken@cs.rochester.edu (Ken Yap) (04/18/88)
| As long as the 8th bit is not being utilized, it is not a big deal to have a
|new standard for it. Let's make it to the point. Suppose there are two

You may be too late grabbing that 8th bit. There are standards for character sets with 256 entries. Try for the next byte. You need it for oriental languages.

Ken
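Ken does not name the standards he has in mind. As an illustration only, here is a small Python sketch that looks up two codes from the thread in two 8-bit sets that Python's standard codecs happen to ship: IBM code page 437 (the PC set the original post refers to) and ISO 8859-1 (Latin-1). The choice of these two sets and the function name are assumptions of this sketch, not something stated in the thread.

```python
def meanings(code: int) -> tuple:
    """Return the character a byte value denotes in CP437 and in Latin-1."""
    raw = bytes([code])
    return raw.decode("cp437"), raw.decode("latin-1")

# E3 is where the post says IBM puts Greek pi; F0 is where the draft puts it.
for code in (0xE3, 0xF0):
    ibm, latin1 = meanings(code)
    print(f"{code:02x}: cp437={ibm!r} latin-1={latin1!r}")
```

Both byte values already carry different characters in each of these sets, which is the sense in which the 8th bit is "taken".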
ken@cs.rochester.edu (Ken Yap) (04/18/88)
Actually what I have wanted for some time is keycaps with LCDs. This way you can change the symbols on the keyboard depending on whether you are typing APL, TeX or Sanskrit. Software would do all the necessary mapping to whatever your word processor wants. Yeah, I know touchscreens will do this, but I doubt that an LCD screen can give me nice tactile feedback. I suppose it is still too expensive...

Ken