arndt@zyx.SE (Arndt Jonasson) (06/21/88)
Suggestions for how Gnu Emacs can be made to handle non-ASCII.

["non-ASCII characters" below refer to those characters in the set of
8-bit characters (of which set ASCII is a subset) that have codes > 127.
Thus, they don't include EBCDIC or 16-bit characters.]

With the advent of the ISO Latin-1 standard, non-ASCII characters in
text files are going to be increasingly common, and already there are
manufacturers who support non-ASCII characters in their operating
systems. Therefore, a few suggestions on how support for them can be
accomplished in Gnu Emacs with only minor effort.

1) Display. With fairly minor changes to the C code, Emacs can be made
to display characters with codes > 127 not in the usual way (e.g. \314),
but as themselves, assuming that the virtual terminal can handle such
characters. The changes involve a half dozen tests in xdisp.c and
indent.c, including a Lisp flag to toggle the new functionality on and
off.

2) Input. Assuming that the virtual terminal possesses the capability
to let the user enter non-ASCII characters from the keyboard, support
for easy input of them in Emacs (easy = without needing C-Q) can be
implemented in Lisp alone, with no C changes.

3) Character syntax. This is affected by the Lisp function
'modify-syntax-entry' and presents no problems.

4) Upper/lower-case conversion. This is not available as a
user-settable table. I suggest that it be made user-settable, either by
making the tables available as strings, or through Lisp functions.

These are the areas that have come to my mind; are there any that I have
forgotten?

I am using Gnu Emacs 18.49. If there is interest among the Gnu Emacs
developers to implement the above suggestions, I will gladly supply the
code that I have (which implements 1 and 2).
--
Arndt Jonasson, ZYX Sweden AB, Styrmansgatan 6, 114 54 Stockholm, Sweden
email address: arndt@zyx.SE or <backbone>!mcvax!enea!zyx!arndt
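
Suggestion 4 in the post above could look roughly like the sketch below:
a 256-entry lower-case table that starts out with the ASCII and ISO
Latin-1 mappings and can be changed one entry at a time. The table name
and both function names are hypothetical and are not taken from the
Emacs 18 sources; the code also assumes an ASCII-compatible execution
character set, so that 'A' + 32 is 'a'.

```c
/* Sketch only: downcase_table and both function names are hypothetical,
   not identifiers from the Emacs 18 sources. */

unsigned char downcase_table[256];

/* Fill the table with the ASCII and ISO Latin-1 lower-case mappings. */
void init_latin1_downcase(void)
{
    int c;

    for (c = 0; c < 256; c++)
        downcase_table[c] = (unsigned char) c;        /* default: unchanged */
    for (c = 'A'; c <= 'Z'; c++)
        downcase_table[c] = (unsigned char) (c + 32); /* assumes ASCII */
    /* Latin-1 capitals occupy 0xC0..0xDE, except 0xD7, which is the
       multiplication sign and has no case. */
    for (c = 0xC0; c <= 0xDE; c++)
        if (c != 0xD7)
            downcase_table[c] = (unsigned char) (c + 32);
}

/* The kind of setter a Lisp primitive could wrap.
   Returns 0 on success, -1 if either code is outside 0..255. */
int set_downcase_entry(int from, int to)
{
    if (from < 0 || from > 255 || to < 0 || to > 255)
        return -1;
    downcase_table[from] = (unsigned char) to;
    return 0;
}
```

Keeping the table as plain data is what would make either of the two
proposed interfaces (tables exposed as strings, or Lisp accessor
functions) a thin layer on top.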
janssen@titan.SW.MCC.COM (Bill Janssen) (06/23/88)
In article <2641@zyx.SE>, arndt@zyx.SE (Arndt Jonasson) writes:
> Suggestions for how Gnu Emacs can be made to handle non-ASCII.
...
> 1) Display.
...
> characters. The changes involve a half dozen tests in xdisp.c and
> indent.c, including a Lisp flag to toggle the new functionality on and
> off.

It isn't quite this easy. A lot of the code that figures out "what line
is where" in the window uses the knowledge that certain character codes
take up 2 or 4 character positions. This knowledge seems to be scattered
through the code, and might require some rooting to eliminate cleanly.

Bill
karl@haddock.ISC.COM (Karl Heuer) (06/25/88)
In article <807@titan.SW.MCC.COM> janssen@titan.SW.MCC.COM (Bill Janssen) writes:
>In article <2641@zyx.SE>, arndt@zyx.SE (Arndt Jonasson) writes:
>> Suggestions for how Gnu Emacs can be made to handle non-ASCII. ...
>> The [display] changes involve a half dozen tests in xdisp.c and indent.c,
>It isn't quite this easy. A lot of the code that figures out "what line
>is where" in the window uses the knowledge that certain character codes
>take up 2 or 4 character positions. This knowledge seems to be scattered
>through the code, and might require some rooting to eliminate cleanly.

It seems to me that any such knowledge, if it correctly handles control
characters, must test the ctl-arrow variable. A grep on the 18.41
sources revealed five places where it's being used in this way. (As
Arndt said, it's confined to xdisp.c and indent.c.) Can you give a
specific example of something else that would need to be changed?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
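
The width rule under discussion in this thread can be sketched as a
single function. This is an illustration, not code from xdisp.c or
indent.c: the function name and the display_8bit flag are hypothetical
(the flag stands in for the Lisp toggle proposed in the first post), and
the sketch assumes the caller has already dealt with tab and newline,
whose widths depend on the current column.

```c
/* Sketch only: char_width and display_8bit are hypothetical names.
   ctl_arrow mirrors the Emacs variable ctl-arrow.
   Returns the number of screen columns one 8-bit character occupies.
   Tab and newline are assumed to be handled by the caller. */
int char_width(unsigned char c, int ctl_arrow, int display_8bit)
{
    if (c >= 040 && c < 0177)
        return 1;                 /* printing ASCII: shown as itself */
    if (c >= 0200)
        return display_8bit ? 1   /* proposed: shown as itself */
                            : 4;  /* current: octal escape, e.g. \314 */
    /* Control characters and DEL: ^X takes 2 columns when ctl-arrow is
       set, the octal escape \ooo takes 4 otherwise. */
    return ctl_arrow ? 2 : 4;
}
```

If every place that computes columns went through one routine like
this, the new flag would need to be tested in only one spot; the five
ctl-arrow tests mentioned above are the places where that logic is
currently written out inline.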