satya@ssdc.honeywell.com (Satya Prabhakar) (06/20/91)
Hi, I am involved in developing the ISO/RDA protocol between CDC 920 and IBM 3090. CDC 920 uses ASCII representation and the IBM uses EBCDIC. I am looking for C routines that convert ASCII strings into EBCDIC strings and vice-versa. We need these DESPERATELY and asap. If you have some information re: these, please let me know asap. I would REALLY appreciate your help. Very many thanks in advance. Satya Prabhakar (satya@ssdc.honeywell.com) (Office Phone: 612-782-7134)
rickert@mp.cs.niu.edu (Neil Rickert) (06/20/91)
In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:
>uses EBCDIC. I am looking for C routines that convert ASCII strings
>into EBCDIC strings and vice-versa. We need these DESPERATELY and

I don't understand. What is wrong with a simple loop replacing c with EBCDIC[c] for each char c in the string of unsigned chars. Something like:

    while(*p) { *p = EBCDIC[*p]; p++;}

You do need to initialize your table. Pick up the tables used on the 3090 and use those. If you don't like that choice build your own in the grand tradition of mutual inconsistency which currently exists in the ASCII <-> EBCDIC translation world.

--
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
Neil W. Rickert, Computer Science <rickert@cs.niu.edu>
Northern Illinois Univ.
DeKalb, IL 60115 +1-815-753-6940
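The loop above can be fleshed out roughly as follows. This is only a sketch under stated assumptions: the function names translate_in_place and translate_buffer are made up for illustration, and the 256-entry table still has to be filled from whichever mapping you settle on. The pointer is declared unsigned char so that bytes of 0x80 and above index the table correctly even where plain char is signed.

```c
#include <stddef.h>

/* Translate a NUL-terminated string in place through a 256-entry table.
 * Names and signatures are illustrative, not from any library.
 * Returns the number of bytes translated. */
static size_t translate_in_place(unsigned char *p, const unsigned char table[256])
{
    size_t n = 0;
    while (*p) {
        *p = table[*p];   /* unsigned index: never negative */
        p++;
        n++;
    }
    return n;
}

/* Length-counted variant for records that may contain 0x00 bytes. */
static void translate_buffer(unsigned char *buf, size_t len,
                             const unsigned char table[256])
{
    for (size_t i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}
```

The counted variant is probably the safer one for protocol data, since a table that maps some nonzero byte to 0x00 would silently shorten a NUL-terminated string.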
exspes@gdr.bath.ac.uk (P E Smee) (06/20/91)
In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:
>I am involved in developing the ISO/RDA protocol between CDC 920
>and IBM 3090. CDC 920 uses ASCII representation and the IBM
>uses EBCDIC. I am looking for C routines that convert ASCII strings
>into EBCDIC strings and vice-versa. We need these DESPERATELY and
>asap. If you have some information re: these, please let me know
>asap. I would REALLY appreciate your help.

There's a basic problem here, which is that there is (still) no such thing as EBCDIC. At present, for example (and I may have missed some possibilities) the following codings might be used for 5 example special chars:

Char  ASCII  EBCDIC
 {    0x7b   0x43, 0x44, 0x51, 0x63, 0x9c, or 0xc0
 }    0x7d   0x47, 0x54, 0xd0, or 0xdc
 [    0x5b   0x4a, 0x90, 0x9e, 0xb1, 0xb5, 0xba
 ]    0x5d   0x51, 0x5a, 0x9f, 0xb5, 0xbb
 |    0x7c   0x4f, 0xbb

and, there are some characters in each set which don't exist in the other.

Apropos those codes above, the problem is that original EBCDIC did not define codes for a lot of the special chars, but they DID exist on various IBM printer belts (in different places). So, as a pragmatic approach, various packages used, internally, the encodings which they thought most likely to work with the types of print belts that they thought their type of users would use.
The UK's BlueBook file transfer protocol suggested using the following mapping, which seems to work pretty well most of the time, which is as good as you're gonna be able to do (has the advantage of being 1-1 and reversible as long as the chars involved exist in the ASCII set):

     0x 1x 2x 3x 4x 5x 6x 7x 8x 9x Ax Bx Cx Dx Ex Fx
x0   00 10 40 F0 7C D7 79 97
x1   01 11 5A F1 C1 D8 81 98
x2   02 12 7F F2 C2 D9 82 99
x3   03 13 7B F3 C3 E2 83 A2
x4   37 3C 5B F4 C4 E3 84 A3
x5   2D 3D 6C F5 C5 E4 85 A4
x6   2E 32 50 F6 C6 E5 86 A5
x7   2F 26 7D F7 C7 E6 87 A6
x8   16 18 4D F8 C8 E7 88 A7
x9   05 19 5D F9 C9 E8 89 A8
xA   25 3F 5C 7A D1 E9 91 A9
xB   0B 27 4E 5E D2 AD 92 C0
xC   0C 1C 6B 4C D3 E0 93 4F
xD   0D 1D 60 7E D4 BD 94 D0
xE   0E 1E 4B 6E D5 71 95 5F
xF   0F 1F 61 6F D6 6D 96 07

This is EBCDIC on the inside, ANSI (IA5) on the edges. So, to turn ASCII 0x21 into EBCDIC, you look in the column headed 2x on the row labelled x1, and come up with EBCDIC 0x5a. To find out what EBCDIC 0xD1 is, you find D1 in the table, in column 4x, row xA, so it is ASCII 0x4a. (Note this assumes 0-parity ASCII, as well.)

Warning, EBCDIC is bigger than ASCII. If you turn this into a set of translation tables, you have to decide what you are going to do with EBCDIC chars which do not appear in the table (with no corresponding char in ASCII). IBM's normal response is to just map them all to one printing char (forgotten which, something like $ or &).

The above mapping seems to do the right sort of things with most IBM applications -- e.g. ASCII C programs moved to an IBM and translated this way will probably compile with the IBM C compiler. Ditto other language sources.

For a real real definitive answer, you'll have to get friendly with someone in IBM. I know THEY are trying to come up with a definitive ASCII<->EBCDIC mapping as part of their SAA project; to dig themselves out of the hole caused by the fact that IBM/PCs speak ASCII, while IBM mainframes speak EBCDIC. God only knows what mapping they'll end up with.
-- Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132
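For anyone who wants the BlueBook mapping above as C tables, here is a rough sketch. The forward table is transcribed column by column from the posted grid (index is the 7-bit ASCII code). The names bluebook_a2e, build_e2a and ASCII_FALLBACK are invented for this example, and mapping every unlisted EBCDIC code to '?' is just one assumed policy, not something BlueBook or IBM prescribes.

```c
#include <string.h>

/* ASCII (0x00-0x7F) to EBCDIC, per the BlueBook grid. Index = ASCII code. */
static const unsigned char bluebook_a2e[128] = {
    /* 0x */ 0x00,0x01,0x02,0x03,0x37,0x2D,0x2E,0x2F,
             0x16,0x05,0x25,0x0B,0x0C,0x0D,0x0E,0x0F,
    /* 1x */ 0x10,0x11,0x12,0x13,0x3C,0x3D,0x32,0x26,
             0x18,0x19,0x3F,0x27,0x1C,0x1D,0x1E,0x1F,
    /* 2x */ 0x40,0x5A,0x7F,0x7B,0x5B,0x6C,0x50,0x7D,
             0x4D,0x5D,0x5C,0x4E,0x6B,0x60,0x4B,0x61,
    /* 3x */ 0xF0,0xF1,0xF2,0xF3,0xF4,0xF5,0xF6,0xF7,
             0xF8,0xF9,0x7A,0x5E,0x4C,0x7E,0x6E,0x6F,
    /* 4x */ 0x7C,0xC1,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,
             0xC8,0xC9,0xD1,0xD2,0xD3,0xD4,0xD5,0xD6,
    /* 5x */ 0xD7,0xD8,0xD9,0xE2,0xE3,0xE4,0xE5,0xE6,
             0xE7,0xE8,0xE9,0xAD,0xE0,0xBD,0x71,0x6D,
    /* 6x */ 0x79,0x81,0x82,0x83,0x84,0x85,0x86,0x87,
             0x88,0x89,0x91,0x92,0x93,0x94,0x95,0x96,
    /* 7x */ 0x97,0x98,0x99,0xA2,0xA3,0xA4,0xA5,0xA6,
             0xA7,0xA8,0xA9,0xC0,0x4F,0xD0,0x5F,0x07
};

/* Assumed policy: EBCDIC codes absent from the grid become ASCII '?'. */
#define ASCII_FALLBACK 0x3F

/* Fill e2a[] with the reverse mapping derived from bluebook_a2e. */
static void build_e2a(unsigned char e2a[256])
{
    memset(e2a, ASCII_FALLBACK, 256);
    for (int a = 0; a < 128; a++)
        e2a[bluebook_a2e[a]] = (unsigned char)a;
}
```

Deriving the reverse table from the forward one, rather than typing both, keeps the pair consistent by construction. Note that the fallback makes the EBCDIC-to-ASCII direction lossy for the 128 codes outside the grid.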
lnds@sherlock.mmid.ualberta.ca (Mark Israel) (06/21/91)
In article <1991Jun20.160334.11278@gdr.bath.ac.uk>, P.Smee@bristol.ac.uk (Paul Smee) writes:
> There's a basic problem here, which is that there is (still) no such
> thing as EBCDIC.... For a real real definitive answer, you'll have to get
> friendly with someone in IBM.

I extracted the following from local documentation.

Mark Israel                      I have heard the Wobble!
userisra@mts.ucs.ualberta.ca
------------------------------------------------------------------------------

In February 1987, a new eight-bit ISO character set standard, 8859/1, was ratified. Also in 1987, IBM published an EBCDIC standard called Code Page 37, based on the ISO 8859/1 standard. Both of these standards contain identical character graphics. The ISO 8859/1 character set contains such EBCDIC characters as the logical-not sign and the cent sign, and the new EBCDIC character set contains the ISO tilde and circumflex, among other ASCII characters. The ISO 8859/1 standard is supported by many large computer manufacturers, including DEC and IBM.

As we deal more and more with other machines using ISO-based rather than EBCDIC-based character coding schemes, it becomes imperative that we be able to move data from one machine to another and back again without loss of information. The mapping that results from using the ISO 8859/1 standard and the IBM Code Page 37 EBCDIC will allow us to move information back and forth between ISO- and EBCDIC-based machines with none of the problems we have had in the past.

EBCDIC  ASCII   GRAPHIC  DESCRIPTION
---------------------------------------------------------------------
X'00'   X'00'   NUL      null
X'01'   X'01'   SOH      start of heading (Ctrl-A)
X'02'   X'02'   STX      start of text (Ctrl-B)
X'03'   X'03'   ETX      end of text (Ctrl-C)
X'04'   X'9C'   ?        ...
X'05'   X'09'   HT       horizontal tabulation (Ctrl-I)
X'06'   X'86'   ?        ...
X'07'   X'7F'   DEL      delete (rubout, DEL control char)
X'08'   X'97'   ?        ...
X'09'   X'8D'   ?        ...
X'0A'   X'8E'   ?        ...
X'0B'   X'0B'   VT       vertical tabulation (Ctrl-K)
X'0C'   X'0C'   FF       form feed (Ctrl-L)
X'0D'   X'0D'
X'0E'   X'0E'   SO       shift-out (Ctrl-N)
X'0F'   X'0F'   SI       shift-in (Ctrl-O)
X'10'   X'10'   DLE      data link escape (Ctrl-P)
X'11'   X'11'   DC1      device control 1 (X-On, Ctrl-Q)
X'12'   X'12'   DC2      device control 2 (Ctrl-R)
X'13'   X'13'   DC3      device control 3 (X-Off, Ctrl-S)
X'14'   X'9D'   ?        ...
X'15'   X'85'   ?        ...
X'16'   X'08'   BS       backspace (Ctrl-H)
X'17'   X'87'   ?        ...
X'18'   X'18'   CAN      cancel (Ctrl-X)
X'19'   X'19'   EM       end of medium (Ctrl-Y)
X'1A'   X'92'   ?        ...
X'1B'   X'8F'   ?        ...
X'1C'   X'1C'   FS       file separator
X'1D'   X'1D'   GS       group separator
X'1E'   X'1E'   RS       record separator
X'1F'   X'1F'   US       unit separator
X'20'   X'80'   ?        ...
X'21'   X'81'   ?        ...
X'22'   X'82'   ?        ...
X'23'   X'83'   ?        ...
X'24'   X'84'   ?        ...
X'25'   X'0A'   LF       line feed (Ctrl-J)
X'26'   X'17'   ETB      end of transmission block (Ctrl-W)
X'27'   X'1B'   ESC      escape (Escape)
X'28'   X'88'   ?        ...
X'29'   X'89'   ?        ...
X'2A'   X'8A'   ?        ...
X'2B'   X'8B'   ?        ...
X'2C'   X'8C'   ?        ...
X'2D'   X'05'   ENQ      enquiry (Ctrl-E)
X'2E'   X'06'   ACK      acknowledge (Ctrl-F)
X'2F'   X'07'   BEL      bell (Ctrl-G)
X'30'   X'90'   ?        ...
X'31'   X'91'   ?        ...
X'32'   X'16'   SYN      synchronous idle (Ctrl-V)
X'33'   X'93'   ?        ...
X'34'   X'94'   ?        ...
X'35'   X'95'   ?        ...
X'36'   X'96'   ?        ...
X'37'   X'04'   EOT      end of transmission (Ctrl-D)
X'38'   X'98'   ?        ...
X'39'   X'99'   ?        ...
X'3A'   X'9A'   ?        ...
X'3B'   X'9B'   ?        ...
X'3C'   X'14'   DC4      device control 4 (Ctrl-T)
X'3D'   X'15'   NAK      negative acknowledge (Ctrl-U)
X'3E'   X'9E'   ?        ...
X'3F'   X'1A'   SUB      substitute character (Ctrl-Z)
X'40'   X'20'            space (blank)
X'41'   X'A0'   ?        no-break space
X'42'   X'E2'   ?        small a with circumflex accent
X'43'   X'E4'   ?        small a with diaeresis
X'44'   X'E0'   ?        small a with grave accent
X'45'   X'E1'   ?        small a with acute accent
X'46'   X'E3'   ?        small a with tilde
X'47'   X'E5'   ?        small a with ring above
X'48'   X'E7'   ?        small c with cedilla
X'49'   X'F1'   ?        small n with tilde
X'4A'   X'A2'   ?        cent sign
X'4B'   X'2E'   .        period, full stop
X'4C'   X'3C'   <        less-than sign
X'4D'   X'28'   (        left parenthesis
X'4E'   X'2B'   +        plus sign
X'4F'   X'7C'   |        vertical line (bar, "or" sign)
X'50'   X'26'   &        ampersand (and sign)
X'51'   X'E9'   ?        small e with acute accent
X'52'   X'EA'   ?        small e with circumflex accent
X'53'   X'EB'   ?        small e with diaeresis
X'54'   X'E8'   ?        small e with grave accent
X'55'   X'ED'   ?        small i with acute accent
X'56'   X'EE'   ?        small i with circumflex accent
X'57'   X'EF'   ?        small i with diaeresis
X'58'   X'EC'   ?        small i with grave accent
X'59'   X'DF'   ?        small sharp s, German
X'5A'   X'21'   !        exclamation mark
X'5B'   X'24'   $        dollar sign
X'5C'   X'2A'   *        asterisk (star)
X'5D'   X'29'   )        right parenthesis
X'5E'   X'3B'   ;        semicolon
X'5F'   X'AC'   ?        not sign
X'60'   X'2D'   -        minus sign or hyphen
X'61'   X'2F'   /        solidus (slash)
X'62'   X'C2'   ?        capital A with circumflex accent
X'63'   X'C4'   ?        capital A with diaeresis
X'64'   X'C0'   ?        capital A with grave accent
X'65'   X'C1'   ?        capital A with acute accent
X'66'   X'C3'   ?        capital A with tilde
X'67'   X'C5'   ?        capital A with ring
X'68'   X'C7'   ?        capital C with cedilla
X'69'   X'D1'   ?        capital N with tilde
X'6A'   X'A6'   ?        broken bar
X'6B'   X'2C'   ,        comma
X'6C'   X'25'   %        percent sign
X'6D'   X'5F'   _        low line (underscore)
X'6E'   X'3E'   >        greater-than sign
X'6F'   X'3F'   ?        question mark
X'70'   X'F8'   ?        small o with slash
X'71'   X'C9'   ?        capital E with acute accent
X'72'   X'CA'   ?        capital E with circumflex accent
X'73'   X'CB'   ?        capital E with diaeresis
X'74'   X'C8'   ?        capital E with grave accent
X'75'   X'CD'   ?        capital I with acute accent
X'76'   X'CE'   ?        capital I with circumflex accent
X'77'   X'CF'   ?        capital I with diaeresis
X'78'   X'CC'   ?        capital I with grave accent
X'79'   X'60'   `        grave accent
X'7A'   X'3A'   :        colon
X'7B'   X'23'   #        number sign (hash mark, sharp sign)
X'7C'   X'40'   @        commercial at
X'7D'   X'27'   '        apostrophe (single quote)
X'7E'   X'3D'   =        equals sign
X'7F'   X'22'   "        quotation mark (double quote)
X'80'   X'D8'   ?        capital O with slash
X'81'   X'61'   a        small a
X'82'   X'62'   b        small b
X'83'   X'63'   c        small c
X'84'   X'64'   d        small d
X'85'   X'65'   e        small e
X'86'   X'66'   f        small f
X'87'   X'67'   g        small g
X'88'   X'68'   h        small h
X'89'   X'69'   i        small i
X'8A'   X'AB'   ?        angle quotation mark left (<< mark)
X'8B'   X'BB'   ?        angle quotation mark right (>> mark)
X'8C'   X'F0'   ?        small eth, Icelandic
X'8D'   X'FD'   ?        small y with acute accent
X'8E'   X'DE'   ?        small thorn, Icelandic
X'8F'   X'B1'   ?        plus or minus sign
X'90'   X'B0'   ?        degree sign
X'91'   X'6A'   j        small j
X'92'   X'6B'   k        small k
X'93'   X'6C'   l        small l
X'94'   X'6D'   m        small m
X'95'   X'6E'   n        small n
X'96'   X'6F'   o        small o
X'97'   X'70'   p        small p
X'98'   X'71'   q        small q
X'99'   X'72'   r        small r
X'9A'   X'AA'   ?        ordinal indicator feminine
X'9B'   X'BA'   ?        ordinal indicator, masculine
X'9C'   X'E6'   ?        small ae diphthong
X'9D'   X'B8'   ?        cedilla
X'9E'   X'C6'   ?        capital AE diphthong
X'9F'   X'A4'   ?        currency sign (lozenge)
X'A0'   X'B5'   ?        micro sign (small mu)
X'A1'   X'7E'   ~        tilde (wavy line)
X'A2'   X'73'   s        small s
X'A3'   X'74'   t        small t
X'A4'   X'75'   u        small u
X'A5'   X'76'   v        small v
X'A6'   X'77'   w        small w
X'A7'   X'78'   x        small x
X'A8'   X'79'   y        small y
X'A9'   X'7A'   z        small z
X'AA'   X'A1'   ?        inverted exclamation mark
X'AB'   X'BF'   ?        inverted question mark
X'AC'   X'D0'   ?        capital D with stroke, Icelandic eth
X'AD'   X'DD'   ?        capital Y with acute accent
X'AE'   X'FE'   ?        capital thorn, Icelandic
X'AF'   X'AE'   ?        registered sign (circled capital R)
X'B0'   X'5E'   ^        circumflex accent
X'B1'   X'A3'   ?        pound sign (Sterling currency)
X'B2'   X'A5'   ?        yen sign (Nipponese currency)
X'B3'   X'B7'   ?        middle dot (scalar product)
X'B4'   X'A9'   ?        copyright sign (circled capital C)
X'B5'   X'A7'   ?        section sign (S-half-above-S sign)
X'B6'   X'B6'   ?        pilcrow (paragraph, double-barred P)
X'B7'   X'BC'   ?        fraction one-quarter (1/4)
X'B8'   X'BD'   ?        fraction one-half (1/2)
X'B9'   X'BE'   ?        fraction three-quarters (3/4)
X'BA'   X'5B'   [        left square bracket
X'BB'   X'5D'   ]        right square bracket
X'BC'   X'AF'   ?        macron
X'BD'   X'A8'   ?        diaeresis or umlaut
X'BE'   X'B4'   ?        acute accent
X'BF'   X'D7'   ?        multiply sign (vector product)
X'C0'   X'7B'   {        left curly bracket (left brace)
X'C1'   X'41'   A        capital A
X'C2'   X'42'   B        capital B
X'C3'   X'43'   C        capital C
X'C4'   X'44'   D        capital D
X'C5'   X'45'   E        capital E
X'C6'   X'46'   F        capital F
X'C7'   X'47'   G        capital G
X'C8'   X'48'   H        capital H
X'C9'   X'49'   I        capital I
X'CA'   X'AD'   ?        soft hyphen
X'CB'   X'F4'   ?        small o with circumflex accent
X'CC'   X'F6'   ?        small o with diaeresis
X'CD'   X'F2'   ?        small o with grave accent
X'CE'   X'F3'   ?        small o with acute accent
X'CF'   X'F5'   ?        small o with tilde
X'D0'   X'7D'   }        right curly bracket (right brace)
X'D1'   X'4A'   J        capital J
X'D2'   X'4B'   K        capital K
X'D3'   X'4C'   L        capital L
X'D4'   X'4D'   M        capital M
X'D5'   X'4E'   N        capital N
X'D6'   X'4F'   O        capital O
X'D7'   X'50'   P        capital P
X'D8'   X'51'   Q        capital Q
X'D9'   X'52'   R        capital R
X'DA'   X'B9'   ?        superscript one
X'DB'   X'FB'   ?        small u with circumflex accent
X'DC'   X'FC'   ?        small u with diaeresis
X'DD'   X'F9'   ?        small u with grave accent
X'DE'   X'FA'   ?        small u with acute accent
X'DF'   X'FF'   ?        small y diaeresis
X'E0'   X'5C'   \        reverse solidus (backslash)
X'E1'   X'F7'   ?        divide sign (dot over line over dot)
X'E2'   X'53'   S        capital S
X'E3'   X'54'   T        capital T
X'E4'   X'55'   U        capital U
X'E5'   X'56'   V        capital V
X'E6'   X'57'   W        capital W
X'E7'   X'58'   X        capital X
X'E8'   X'59'   Y        capital Y
X'E9'   X'5A'   Z        capital Z
X'EA'   X'B2'   ?        superscript two (squared)
X'EB'   X'D4'   ?        capital O with circumflex accent
X'EC'   X'D6'   ?        capital O with diaeresis
X'ED'   X'D2'   ?        capital O with grave accent
X'EE'   X'D3'   ?        capital O with acute accent
X'EF'   X'D5'   ?        capital O with tilde
X'F0'   X'30'   0        digit zero
X'F1'   X'31'   1        digit one
X'F2'   X'32'   2        digit two
X'F3'   X'33'   3        digit three
X'F4'   X'34'   4        digit four
X'F5'   X'35'   5        digit five
X'F6'   X'36'   6        digit six
X'F7'   X'37'   7        digit seven
X'F8'   X'38'   8        digit eight
X'F9'   X'39'   9        digit nine
X'FA'   X'B3'   ?        superscript three (cubed)
X'FB'   X'DB'   ?        capital U with circumflex accent
X'FC'   X'DC'   ?        capital U with diaeresis
X'FD'   X'D9'   ?        capital U with grave accent
X'FE'   X'DA'   ?        capital U with acute accent
X'FF'   X'9F'   ?        ...
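Moving data "back and forth without loss of information" comes down to the 256-entry table being a permutation: every output byte is produced by exactly one input byte. A small sketch of how one might check that and derive the reverse table in the same pass follows. The function name invert_table is hypothetical, and nothing here is specific to Code Page 37; you would feed it whatever forward table you typed in.

```c
#include <stdbool.h>

/* If forward[] maps 0..255 onto 0..255 one-to-one, fill inverse[] and
 * return true. Return false at the first collision, in which case the
 * contents of inverse[] are incomplete and should not be used. */
static bool invert_table(const unsigned char forward[256],
                         unsigned char inverse[256])
{
    bool seen[256] = { false };

    for (int i = 0; i < 256; i++) {
        unsigned char out = forward[i];
        if (seen[out])
            return false;          /* two inputs share one output */
        seen[out] = true;
        inverse[out] = (unsigned char)i;
    }
    return true;
}
```

Running a check like this once at start-up is a cheap way to catch a transcription slip in a hand-typed table before it corrupts data in transit.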
gary@sci34hub.sci.com (Gary Heston) (06/21/91)
In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:
>I am involved in developing the ISO/RDA protocol between CDC 920
>and IBM 3090. CDC 920 uses ASCII representation and the IBM
>uses EBCDIC. I am looking for C routines that convert ASCII strings
>into EBCDIC strings and vice-versa. We need these DESPERATELY and
>asap. If you have some information re: these, please let me know
>asap. I would REALLY appreciate your help.

Well, let's see--this is a hard one, seeing as how it's a C programming problem, of course it's posted to comp.lang.c, isn't it?

>Newsgroups: comp.protocols.ibm,comp.protocols.iso,comp.protocols.tcp-ip,comp.unix.questions

Oops, guessed wrong on that one...

Try building a 256 by 2 char array, initializing [x][0] with the EBCDIC code set and [x][1] with the ANSI (it hasn't been ASCII for many years) code set, so that you can use the character as the offset to find out what it converts to. Typing in the initialization strings is left as an exercise for the programmer. (You could do this with pointers, but that's a little more complicated, and you said you were in a hurry.)

>Very many thanks in advance.

Anytime.

--
Gary Heston   System Mismanager and technoflunky   uunet!sci34hub!gary or
My opinions, not theirs.   SCI Systems, Inc.   gary@sci34hub.sci.com
I support drug testing. I believe every public official should be given a shot of sodium pentothal and asked "Which laws have you broken this week?".
dag@fciva.FRANKCAP.COM (Daniel A. Graifer) (06/22/91)
In article <1991Jun20.115613.13073@mp.cs.niu.edu> rickert@mp.cs.niu.edu (Neil Rickert) writes:
>In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:
>>uses EBCDIC. I am looking for C routines that convert ASCII strings
>>into EBCDIC strings and vice-versa. We need these DESPERATELY and
>
> I don't understand. What is wrong with a simple loop replacing
>c with EBCDIC[c] for each char c in the string of unsigned chars.
>Something like:
> while(*p) { *p = EBCDIC[*p]; p++;}
>
> You do need to initialize your table. Pick up the tables used on the
>3090 and use those. If you don't like that choice build your own in
>the grand tradition of mutual inconsistency which currently exists in
>the ASCII <-> EBCDIC translation world.

I wrote a two line C program to output chars 0-255, and piped this through dd -ascii and od -c to create my table. This way, the table is at least consistent with what dd produces.

Note that the table produced this way will not necessarily be invertible. (But then neither are the mainframe translate tables. If I recall my Burroughs B[67]000 days correctly (now Unisys A-Series), the translation of ASCII "!" was not invertible.) And you may have a real problem with the >100 EBCDIC chars that aren't meaningful.

Dan
--
Daniel A. Graifer                       Coastal Capital Funding Corp.
Sr. Vice President, Financial Systems   7900 Westpark Dr. Suite A-130
(703)821-3244                           McLean, VA 22102
uunet!fciva!dag                         fciva.FRANKCAP.COM!dag@uunet.uu.net
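The byte generator Dan describes might look something like the sketch below. It is a guess at the idea, not his program: the function name emit_all_bytes is invented here, and the pipeline in the note after it assumes that his "dd -ascii" refers to dd's standard conversion operands (conv=ebcdic for ASCII to EBCDIC, conv=ascii for the reverse).

```c
#include <stdio.h>

/* Write each byte value 0x00..0xFF exactly once, in ascending order.
 * Returns the number of bytes written (256 on success). */
static int emit_all_bytes(FILE *out)
{
    int written = 0;

    for (int c = 0; c < 256; c++) {
        if (putc(c, out) == EOF)
            break;                 /* stop on the first write error */
        written++;
    }
    return written;
}
```

Called from main with stdout and built as, say, allbytes, the output can be run through something like `./allbytes | dd conv=ebcdic | od -An -tx1`. The Nth byte of the dump is then the translation of input byte N, ready to paste into a table initializer.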
dandrews@bilver.uucp (Dave Andrews) (06/26/91)
In article <1991Jun20.160334.11278@gdr.bath.ac.uk> P.Smee@bristol.ac.uk (Paul Smee) writes:
>For a real real definitive answer, you'll have to get friendly with
>someone in IBM. I know THEY are trying to come up with a definitive
>ASCII<->EBCDIC mapping as part of their SAA project; to dig themselves
>out of the hole caused by the fact that IBM/PCs speak ASCII, while IBM
>mainframes speak EBCDIC. God only knows what mapping they'll end up
>with.

I believe the IBM effort is leading toward Unicode, a double-byte set. As I recall, a recent SHARE-wide survey indicated that SHARE members were not happy with the Unicode idea (which will STILL be unusable in parts of the world).

I don't think the ASCII/EBCDIC dichotomy will ever be resolved.

- Dave