henke@qt.ipa.fhg.de (Juergen Henke) (11/08/90)
Since patchlevel 7 there is support for 8-bit characters (ISO 8859) in the ATK (so they told us... :-)). But how can I get my a-umlaut, u-umlaut and so on in ez (for example)?

Thanks in advance,
Juergen.
_________________________________________________________________________
Juergen Henke, e-mail juh@qt.IPA.FhG.de, PSI-mail PSI%4571109306::JUH_IPA
Fraunhofer-Institut f. Produktionstechnik u. Automatisierung
Eierstrasse 46, D-7000 Stuttgart 1
bernerus@CS.CHALMERS.SE (Christer Bernerus) (11/08/90)
Excerpts from info-andrew: 7-Nov-90 8-bit characters, how to use ? Juergen Henke@qt.ipa.fhg (446+0)

> Since patchlevel 7 there is support for 8-bit characters (ISO 8859) in
> the ATK (they told...:-)).
> But how can i get my a-umlaut, u-umlaut and so on in ez (for example) ?
> Thanks in advance,
> Juergen.

\Warning{\German{
J|rgen. Du kannst mit "help compchar" hilfe finden. Ich brauche 8-bits um Schwedisch zu schreiben, und es funktzioniert ganz fein. Um d zu schreiben, versuch mit ^X-v a:<ret>. Es scheint etwa kompliziert, aber es gibt mvglichkeiten um ein mehr einfach verwendergrenzschnitt zu spezifizieren. Bitte entschuldigen Sie mir f|r meinen schlecten Deutschen Grammatik, aber es war unwiederstdndlich zu beweisen das es wirklich funkzioniert. D.h. wenn Du dieses Brief mit "messages" lest.
}
}

Please excuse me for writing the above in German; it is probably bad grammar, but I couldn't resist the temptation of showing some of the possibilities.

Those of you who read this on Usenet or as unformatted mail probably saw strange characters instead of u-umlaut, a-umlaut etc. This is because unformatting ATK mail just strips the 8th bit, so u-umlaut becomes |, a-umlaut becomes d and o-umlaut becomes v. There are routines in ATK to make saner conversions, but I've never figured out where the unformatting of mail is done, or how the compchar routines could be used there. Maybe some of the ATK gurus could tell me.

I think an 8-bit RFC822 would help a bit.

Chris.
-------------------------------------------------------
Christer Bernerus                  ! E-mail: bernerus@cs.chalmers.se
Chalmers University of Technology  ! Phone: +46 31 721000
Department of Computer Science     ! Ham radio: SM6FBQ 144.3 MHz
S-412 96 Gothenburg, SWEDEN
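
The bit stripping Christer describes is easy to reproduce. What follows is only a minimal Python sketch, not the code ATK uses for unformatting; the function name strip_high_bit is invented for illustration, and the text is assumed to be ISO 8859-1, where each of these letters is a single byte.

```python
def strip_high_bit(text: str) -> str:
    """Encode as ISO 8859-1, clear the 8th bit of every byte, decode as ASCII.

    Hypothetical helper: it only imitates the effect described in the
    message above; it is not taken from the ATK sources.
    """
    raw = text.encode("latin-1")
    return bytes(b & 0x7F for b in raw).decode("ascii")


# u-umlaut, a-umlaut and o-umlaut are 0xFC, 0xE4 and 0xF6 in ISO 8859-1.
for ch in "üäö":
    print(f"0x{ord(ch):02X} -> {strip_high_bit(ch)!r}")
# 0xFC -> '|'
# 0xE4 -> 'd'
# 0xF6 -> 'v'

print(strip_high_bit("Jürgen"))         # J|rgen
print(strip_high_bit("möglichkeiten"))  # mvglichkeiten
```

Clearing the bit maps two different bytes to one (0xFC and 0x7C both end up as |), so the original text cannot be recovered from the stripped form.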
tpn+@ANDREW.CMU.EDU (Tom Neuendorffer) (11/09/90)
Many people reading Christer's reply may wonder why they didn't see the advertised umlauts. As he mentioned, we don't currently have a good solution for those reading unformatted mail. But if you normally get formatted mail and are having problems, the following should help you fix things up.

If the second letter in J|rgen appears as a u-umlaut, you are in good shape; ignore this message.

If it appears as a vertical bar (J|rgen), then you are running a fairly old version of ATK. If you expose styles, you will note an undefined style around the bar with the name '@' (i.e. \@{|}). This is how we specified that the high bit should be turned on. It is backward compatible in that, if this file is rewritten, the old ATK preserves the information. Patches are available to upgrade you to a more recent version; see the recent post by Susan (Re: What is andrew (CMU, I know that!) ?).

If it appears as an octal escape (J\374rgen), this indicates that you have the right version of ATK, but are using the wrong fonts. If you have all of the fonts that came with the X11 R4 distribution from MIT, this can be fixed by installing the non-andrew font alias file, either as the standard font alias file for your system or for yourself on an individual basis. To make it standard for your system, just copy <ANDREW_SOURCE_DIR>/xmkfontd/non-andrew.fonts.alias to $ANDREWDIR/X11fonts/fonts.alias, and either restart X or run 'xset fp rehash'. Once it is installed and working, you will be able to delete the cou*, hel* and tim* font files from $ANDREWDIR/X11fonts, since they will be replaced by their ISO counterparts from the R4 distribution. I would recommend that all sites that expect to use this feature install this alias file. We will try to get something put in the next patch that will set this up automatically according to a site.mcr file variable.

To install it on an individual basis, do something like

    mkdir ~/myxfonts
    mkfontdir ~/myxfonts
    cp ANDREW_SOURCE_DIR/xmkfontd/non-andrew.fonts.alias ~/myxfonts/fonts.alias
    xset +fp ~/myxfonts
    xset fp rehash

The xset +fp call can be added to your .xinitrc file to add the directory when you start up X.

Note: Once this alias file is installed, users will notice greater space between lines of text in ez, messages, etc. This is not a bug; it simply reflects the fact that the height of these fonts has to be greater in order to allow room for accents over capital letters.

At this point, you should be able to view files containing ISO characters. For help in entering these characters, see the help file on cpchar (run help cpchar). Other information is given in my 'ATK + 8859 = Multi-lingual Text and Mail' paper in the recent EUUG (now EurOpen) proceedings.

While I am at it, I would like to thank Rob Ryan for all of his work in getting the ISO stuff together. Thanks Rob!

If you have more problems or questions, please let us know.

Regards,
Tom N.
---------------------------
Tom Neuendorffer (tpn@andrew.cmu.edu)
Manager-ATK Group
Information Technology Center
Carnegie Mellon University
4910 Forbes Ave.
Pittsburgh, Pa. 15213-3890
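
The three symptoms Tom lists are all views of one byte, 0xFC, the ISO 8859-1 code for u-umlaut. The Python sketch below only illustrates that relationship and makes no claim about how ATK or the X server produce each display; the function name renderings and the dictionary keys are made up for this note.

```python
def renderings(byte: int) -> dict:
    """Return three ways one 8-bit character code might be shown.

    Hypothetical helper for illustration only.
    """
    return {
        "iso_font": bytes([byte]).decode("latin-1"),  # shown as the real letter
        "high_bit_lost": chr(byte & 0x7F),            # 7-bit character left over
        "octal_escape": "\\%03o" % byte,              # backslash plus octal code
    }


for label, shown in renderings(0xFC).items():
    print(f"{label:14} J{shown}rgen")
# iso_font       Jürgen
# high_bit_lost  J|rgen
# octal_escape   J\374rgen
```

Note that \374 is an octal number: 3*64 + 7*8 + 4 = 252, which is FC in hex.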
bernerus@CS.CHALMERS.SE (Christer Bernerus) (11/09/90)
Excerpts from mail: 8-Nov-90 Re: 8-bit characters, how t.. Craig_Everhart@transarc. (389)

> Mail unformatting is done by the andrew/overhead/mail/lib/unscribe.c
> module. Is it always obvious how to turn accented characters into
> non-accented ones? I know some of the rules in German (what turns into
> (e.g.) oe, ae, ue, ss), but what rules apply to other languages?
> Swedish, for instance?
> Certainly the unscribe.c module pre-dates any consideration of 8-bit
> characters.
> Craig

Thanks for pointing out unscribe.c to me. I had a look at it, but it doesn't seem trivial to enhance it the way I wanted. What I had in mind was to use the compchar character table, which allows for "customary local replacements", preferably through the ATKToASCII function in textaux/compchar.c. But unscribe.c doesn't seem to be part of the object-oriented stuff in ATK, so I'm very unsure how to do it in a proper way. It can of course be done as a "hack", but I feel that's a bit dangerous if e.g. the lib/compchar/comps format changes.

Regarding the way conversions should be done, there are usually many ways of doing this, even within a country, institution, group etc. So the problem isn't trivial, especially not for a mail gateway which does the unformatting. E.g. in mail from Sweden, å, ä, ö and even ü should probably be replaced with }, {, | and u, but if the letter came from Germany, maybe the replacements should be (nothing, since there is no å in Germany), ae, oe and ue respectively. Converting the other way round is definitely non-trivial, especially if the latter replacements are used.

In my opinion, the only thing that really helps for the (nearest) future is an 8-bit extension to RFC822, which would make it "legal" to write mailers that support 8-bit mail transparently. It doesn't solve the whole world's problems though.

Chris.
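
The "customary local replacements" idea can be pictured as one table per convention. This is a rough Python sketch under loud assumptions: the table names "sv" and "de", the function to_ascii and the lowercase-only coverage are all invented here, the stripped letters e, d, v and | in the message above are read as a-ring, a-umlaut, o-umlaut and u-umlaut, and none of this reflects the actual compchar table format or the ATKToASCII interface.

```python
# Hypothetical tables built only from the replacements named in this thread.
REPLACEMENTS = {
    "sv": {"å": "}", "ä": "{", "ö": "|", "ü": "u"},
    "de": {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"},
}


def to_ascii(text: str, convention: str) -> str:
    """Replace each listed letter; anything not in the table passes through."""
    table = REPLACEMENTS[convention]
    return "".join(table.get(ch, ch) for ch in text)


print(to_ascii("Jürgen", "sv"))  # Jurgen
print(to_ascii("Jürgen", "de"))  # Juergen

# Two different inputs give the same output, so the German-style
# replacements cannot be undone without extra knowledge.
print(to_ascii("Müller", "de") == to_ascii("Mueller", "de"))  # True
```

The last line shows why converting "the other way round" is hard: once ue has been written, a gateway cannot tell whether it stood for u-umlaut or for a plain u followed by e.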
Craig_Everhart@TRANSARC.COM (11/09/90)
Indeed, unscribe.c is not part of ATK at all, but has been a wart on the side. It is used by several non-ATK programs (the AMS message server, AMDS, CUI, VUI) that don't need the overhead (distribution-time, build-time, and execution-time) of getting involved in dynamic loading.

Fortunately, unscribe makes no pretensions at being able to invert its transformations. It does enough interpretations that doing so would be impossible. Thus, a run through UnScribe is a known way to lose information.

As you suggest, the issue for a mail gateway is non-trivial, and the reason is the same reason that the ``customary local replacements'' are important. Does 8-bit RFC822 mail really solve any problems? What are recipients in Germany supposed to do with Swedish å, since their displays can't handle it? What are they supposed to do with upside-down question marks (¿)? We would always have the problem of the ``local extensions,'' no?

I could imagine doing worse things than reading the local-extensions table in unscribe.

Craig
henke@qt.ipa.fhg.de (Juergen Henke) (11/10/90)
Excerpts from mail: 9-Nov-90 Re: 8-bit characters, how t.. Craig_Everhart@transarc. (1041)

> Does 8-bit RFC822 mail really solve any problems? What are recipients
> in Germany supposed to do with Swedish å, since their displays can't
> handle it? What are they supposed to do with upside-down question marks
> (¿)? We would always have the problem of the ``local extensions,'' no?
> I could imagine doing worse things than reading the local-extensions
> table in unscribe.
> Craig

Craig, there is of course a problem with the Swedish å (e| ?), but most of the special (country-specific) characters are in ISO 8859. So an 8-bit RFC 822 would help a lot for those outside the (native) English-speaking world...

J|rgen

P.S.: You notice the u-umlaut in my name ? :-) or :-( ?