noren@dinl.uucp (Charles Noren) (10/03/90)
We are communicating between a Sun 3 (with SunOS 4.0.3) and an IBM Mainframe (don't know the model, we're not IBM jocks) via TCP/IP. Our question: is there any program in Netland that converts back and forth between EBCDIC and ASCII (preferably in C, but we will take any example)? If you have some knowledge about the conversion process, such as the bit/byte ordering of an IBM vs. a Sun, any comments would be very helpful. Thanks, -- Chuck Noren NET: dinl!noren@ncar.ucar.edu US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260, Denver, CO 80201-1260 Phone: (303) 971-7930
jik@athena.mit.edu (Jonathan I. Kamens) (10/03/90)
In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes: |> We are communicating between Sun 3 (with SunOS 4.0.3) and an |> IBM Mainframe (don't know the model, we're not IBM jocks) |> via TCP/IP. Our question, is there any program in Netland |> that converts back and forth between EBCDIC and ASCII |> (preferrably in C, but we will take any example)? The Unix program "dd" does this. In particular, the "conv=ascii" option converts EBCDIC to ASCII, and the "conv=ebcdic" option goes the other way. See the man page for more information. -- Jonathan Kamens USnail: MIT Project Athena 11 Ashford Terrace jik@Athena.MIT.EDU Allston, MA 02134 Office: 617-253-8495 Home: 617-782-0710
jeffb@blia.BLI.COM (Jeff Beard) (10/03/90)
dd conv=ebcdic is in error for some codes, which is why dd also supplies conv=ibm. However, this too is in error frequently due to ambiguities in the EBCDIC table(s) ... it depends on which table you read and how one needs to map the ASCII to EBCDIC set difference. The following two tables will allow you to define your own translation. /* This routine contains only the two tables needed to convert ASCII to EBCDIC and EBCDIC to ASCII. The conversion is according to BTL character set standards. There are some anomalies in a one/one mapping: not all characters are in both character sets eg: PL/1 not '~' ascii caret '^' tilde '~' not all devices can display all characters in the set eg: square brackets '[]' braces '{}' we accept ASCII data in a 1/1 mapping we translate EBCDIC ~ to ASCII ~ Table used to convert ascii to ebcdic. ~ */ static char dummy[] = {0}; /* for EOF index */ char atoe[] = { 0x00 , 0x01 , 0x02 , 0x03 , 0x37 , 0x2d , 0x2e , 0x2f , 0x16 , 0x05 , 0x25 , 0x0b , 0x0c , 0x0d , 0x0e , 0x0f , 0x10 , 0x11 , 0x12 , 0x13 , 0x3c , 0x3d , 0x32 , 0x26 , 0x18 , 0x19 , 0x1a , 0x27 , 0x1c , 0x1d , 0x1e , 0x1f , 0x40 /* ' ' */, 0x5a /* '!' */, 0x7f /* '"' */, 0x7b /* '#' */, 0x5b /* '$' */, 0x6c /* '%' */, 0x50 /* '&' */, 0x7d /* ''' */, 0x4d /* '(' */, 0x5d /* ')' */, 0x5c /* '*' */, 0x4e /* '+' */, 0x6b /* ',' */, 0x60 /* '-' */, 0x4b /* '.' */, 0x61 /* '/' */, 0xf0 /* '0' */, 0xf1 /* '1' */, 0xf2 /* '2' */, 0xf3 /* '3' */, 0xf4 /* '4' */, 0xf5 /* '5' */, 0xf6 /* '6' */, 0xf7 /* '7' */, 0xf8 /* '8' */, 0xf9 /* '9' */, 0x7a /* ':' */, 0x5e /* ';' */, 0x4c /* '<' */, 0x7e /* '=' */, 0x6e /* '>' */, 0x6f /* '?' 
*/, 0x7c /* '@' */, 0xc1 /* 'A' */, 0xc2 /* 'B' */, 0xc3 /* 'C' */, 0xc4 /* 'D' */, 0xc5 /* 'E' */, 0xc6 /* 'F' */, 0xc7 /* 'G' */, 0xc8 /* 'H' */, 0xc9 /* 'I' */, 0xd1 /* 'J' */, 0xd2 /* 'K' */, 0xd3 /* 'L' */, 0xd4 /* 'M' */, 0xd5 /* 'N' */, 0xd6 /* 'O' */, 0xd7 /* 'P' */, 0xd8 /* 'Q' */, 0xd9 /* 'R' */, 0xe2 /* 'S' */, 0xe3 /* 'T' */, 0xe4 /* 'U' */, 0xe5 /* 'V' */, 0xe6 /* 'W' */, 0xe7 /* 'X' */, 0xe8 /* 'Y' */, 0xe9 /* 'Z' */, 0xad /* '[' */, 0xe0 /* '\' */, 0xbd /* ']' */, 0x9a /* '^' */, 0x6d /* '_' */, 0x79 /* '`' */, 0x81 /* 'a' */, 0x82 /* 'b' */, 0x83 /* 'c' */, 0x84 /* 'd' */, 0x85 /* 'e' */, 0x86 /* 'f' */, 0x87 /* 'g' */, 0x88 /* 'h' */, 0x89 /* 'i' */, 0x91 /* 'j' */, 0x92 /* 'k' */, 0x93 /* 'l' */, 0x94 /* 'm' */, 0x95 /* 'n' */, 0x96 /* 'o' */, 0x97 /* 'p' */, 0x98 /* 'q' */, 0x99 /* 'r' */, 0xa2 /* 's' */, 0xa3 /* 't' */, 0xa4 /* 'u' */, 0xa5 /* 'v' */, 0xa6 /* 'w' */, 0xa7 /* 'x' */, 0xa8 /* 'y' */, 0xa9 /* 'z' */, 0xc0 /* '{' */, 0x4f /* '|' */, 0xd0 /* '}' */, 0xa1 /* '~' */, 0x07 , 0x04 , 0x06 , 0x08 , 0x09 , 0x0a , 0x14 , 0x15 , 0x17 , 0x1b , 0x20 , 0x21 , 0x22 , 0x23 , 0x24 , 0x28 , 0x29 , 0x2a , 0x2b , 0x2c , 0x30 , 0x31 , 0x33 , 0x34 , 0x35 , 0x36 , 0x38 , 0x39 , 0x3a , 0x3b , 0x3e , 0x3f , 0x41 , 0x42 , 0x43 , 0x44 , 0x45 , 0x46 , 0x47 , 0x48 , 0x49 , 0x4a , 0x51 , 0x52 , 0x53 , 0x54 , 0x55 , 0x56 , 0x57 , 0x58 , 0x59 , 0x62 , 0x63 , 0x64 , 0x65 , 0x66 , 0x67 , 0x68 , 0x69 , 0x6a , 0x70 , 0x71 , 0x72 , 0x73 , 0x74 , 0x75 , 0x76 , 0x77 , 0x78 , 0x80 , 0x8a , 0x8b , 0x8c , 0x8d , 0x8e , 0x8f , 0x90 , 0x9b , 0x9c , 0x9d , 0x9e , 0x9f , 0xa0 , 0xa1 , 0xaa , 0xab , 0xac , 0xae , 0xaf , 0xb0 , 0xb1 , 0xb2 , 0xb3 , 0xb4 , 0xb5 , 0xb6 , 0xb7 , 0xb8 , 0xb9 , 0xba , 0xbb , 0xbc , 0xbe , 0xbf , 0xca , 0xcb , 0xcc , 0xcd , 0xce , 0xcf , 0xda , 0xdb , 0xdc , 0xdd , 0xde , 0xdf , 0xe1 , 0xea , 0xeb , 0xec , 0xed , 0xee , 0xef , 0xfa , 0xfb , 0xfc , 0xfd , 0xfe , 0xff , }; /* Table used to convert ebcdic to ascii. 
*/ static char dummy2[] = {0}; /* for EOF index */ char etoa[] = { /* 0 1 2 3 */ 0000 , 0001 , 0002 , 0003 , 0200 , 0011 , 0201 , 0177 , 0202 , 0203 , 0204 , 0013 , 0014 , 0015 , 0016 , 0017 , 0020 , 0021 , 0022 , 0023 , 0205 , 0206 , 0010 , 0207 , 0030 , 0031 , 0032 , 0210 , 0034 , 0035 , 0036 , 0037 , 0211 , 0212 , 0213 , 0214 , 0215 , 0012 , 0027 , 0033 , 0216 , 0217 , 0220 , 0221 , 0222 , 0005 , 0006 , 0007 , 0223 , 0224 , 0026 , 0225 , 0226 , 0227 , 0230 , 0004 , 0231 , 0232 , 0233 , 0234 , 0024 , 0025 , 0235 , 0236 , 0040 /* ' ' */, 0237 , 0240 , 0241 , 0242 , 0243 , 0244 , 0245 , 0246 , 0247 , 0250 , 0056 /* '.' */, 0074 /* '<' */, 0050 /* '(' */, 0053 /* '+' */, 0174 /* '|' */, 0046 /* '&' */, 0251 , 0252 , 0253 , 0254 , 0255 , 0256 , 0257 , 0260 , 0261 , 0041 /* '!' */, 0044 /* '$' */, 0052 /* '*' */, 0051 /* ')' */, 0073 /* ';' */, 0176 /* '~' */, 0055 /* '-' */, 0057 /* '/' */, 0262 , 0263 , 0264 , 0265 , 0266 , 0267 , 0270 , 0271 , 0272 , 0054 /* ',' */, 0045 /* '%' */, 0137 /* '_' */, 0076 /* '>' */, 0077 /* '?' 
*/, 0273 , 0274 , 0275 , 0276 , 0277 , 0300 , 0301 , 0302 , 0303 , 0140 /* '`' */, 0072 /* ':' */, 0043 /* '#' */, 0100 /* '@' */, 0047 /* ''' */, 0075 /* '=' */, 0042 /* '"' */, 0304 , 0141 /* 'a' */, 0142 /* 'b' */, 0143 /* 'c' */, 0144 /* 'd' */, 0145 /* 'e' */, 0146 /* 'f' */, 0147 /* 'g' */, 0150 /* 'h' */, 0151 /* 'i' */, 0305 , 0306 , 0307 , 0310 , 0311 , 0312 , 0313 , 0152 /* 'j' */, 0153 /* 'k' */, 0154 /* 'l' */, 0155 /* 'm' */, 0156 /* 'n' */, 0157 /* 'o' */, 0160 /* 'p' */, 0161 /* 'q' */, 0162 /* 'r' */, 0136 /* '^' */, 0314 , 0315 , 0316 , 0317 , 0320 , #ifdef OLDC 0321 , 0322 /* ~ */, 0163 /* 's' */, 0164 /* 't' */, #else OLDC 0321 , 0176 /* ~ */, 0163 /* 's' */, 0164 /* 't' */, #endif OLDC 0165 /* 'u' */, 0166 /* 'v' */, 0167 /* 'w' */, 0170 /* 'x' */, 0171 /* 'y' */, 0172 /* 'z' */, 0323 , 0324 , 0325 , 0133 /* '[' */, 0326 , 0327 , 0330 , 0331 , 0332 , 0333 , 0334 , 0335 , 0336 , 0337 , 0340 , 0341 , 0342 , 0343 , 0344 , 0135 /* ']' */, 0345 , 0346 , 0173 /* '{' */, 0101 /* 'A' */, 0102 /* 'B' */, 0103 /* 'C' */, 0104 /* 'D' */, 0105 /* 'E' */, 0106 /* 'F' */, 0107 /* 'G' */, 0110 /* 'H' */, 0111 /* 'I' */, 0347 , 0350 , 0351 , 0352 , 0353 , 0354 , 0175 /* '}' */, 0112 /* 'J' */, 0113 /* 'K' */, 0114 /* 'L' */, 0115 /* 'M' */, 0116 /* 'N' */, 0117 /* 'O' */, 0120 /* 'P' */, 0121 /* 'Q' */, 0122 /* 'R' */, 0355 , 0356 , 0357 , 0360 , 0361 , 0362 , 0134 /* '\' */, 0363 , 0123 /* 'S' */, 0124 /* 'T' */, 0125 /* 'U' */, 0126 /* 'V' */, 0127 /* 'W' */, 0130 /* 'X' */, 0131 /* 'Y' */, 0132 /* 'Z' */, 0364 , 0365 , 0366 , 0367 , 0370 , 0371 , 0060 /* '0' */, 0061 /* '1' */, 0062 /* '2' */, 0063 /* '3' */, 0064 /* '4' */, 0065 /* '5' */, 0066 /* '6' */, 0067 /* '7' */, 0070 /* '8' */, 0071 /* '9' */, 0372 , 0373 , 0374 , 0375 , 0376 , 0377 , };
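For readers wondering how tables like these get applied, here is a minimal sketch, assuming the data arrives as raw bytes in a buffer. The names translate_buf and make_demo_table are hypothetical and not part of the posted code. To keep the example short, the demo table is an identity map with only three entries (space, 'A', '1') copied from atoe[] above; in real use the full 256-entry table would be passed instead.

```c
#include <stddef.h>

/* Translate len bytes in place through a 256-entry lookup table.
 * The buffer is unsigned char, so every byte is a valid index 0..255. */
void translate_buf(unsigned char *buf, size_t len, const unsigned char *table)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}

/* Hypothetical stand-in for a full ASCII-to-EBCDIC table: identity
 * everywhere except three entries taken from the posted atoe[]. */
void make_demo_table(unsigned char table[256])
{
    int i;

    for (i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
    table[0x20] = 0x40;   /* ASCII ' ' -> EBCDIC 0x40 */
    table[0x41] = 0xc1;   /* ASCII 'A' -> EBCDIC 0xc1 */
    table[0x31] = 0xf1;   /* ASCII '1' -> EBCDIC 0xf1 */
}
```

The same function serves both directions: pass an ASCII-to-EBCDIC table one way and an EBCDIC-to-ASCII table the other.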
luke@modus.sublink.ORG (Luciano Mannucci) (10/04/90)
In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes: > We are communicating between Sun 3 (with SunOS 4.0.3) and an > IBM Mainframe (don't know the model, we're not IBM jocks) > via TCP/IP. Our question, is there any program in Netland > that converts back and forth between EBCDIC and ASCII > (preferrably in C, but we will take any example)? Apologies for posting C code in the wrong newsgroup. There are two very simple functions converting ASCII into EBCDIC and vice-versa having been working for many years in many programs: --- cut --- cut --- cut --- cut --- cut --- cut --- cut --- cut --- static char _tob[] = { 0x00,0x01,0x02,0x03,0x37,0x2d,0x2e,0x2f,0x16,0x05,0x25,0x0b,0x0c,0x0d,0x0e,0x0f, 0x10,0x11,0x12,0x13,0x3c,0x3d,0x32,0x26,0x18,0x19,0x3f,0x27,0x22,0x40,0x35,0x40, 0x40,0x5a,0x7f,0x7b,0x5b,0x6c,0x50,0x7d,0x4d,0x5d,0x5c,0x4e,0x6b,0x60,0x4b,0x61, 0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0x7a,0x5e,0x4c,0x7e,0x6e,0x6f, 0x7c,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6, 0xd7,0xd8,0xd9,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0x4f,0xe1,0x5f,0x40,0x6d, 0x40,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x91,0x92,0x93,0x94,0x95,0x96, 0x97,0x98,0x99,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xc0,0x6a,0xd0,0xa1,0x07, }; static char _toa[] = { 0x00,0x01,0x02,0x03,0x00,0x09,0x00,0x7f,0x00,0x00,0x00,0x0b,0x0c,0x0d,0x0e,0x0f, 0x10,0x11,0x12,0x13,0x00,0x0a,0x08,0x00,0x18,0x19,0x00,0x00,0x00,0x00,0x00,0x00, 0x00,0x00,0x1c,0x00,0x00,0x0a,0x17,0x1b,0x00,0x00,0x00,0x00,0x00,0x05,0x06,0x07, 0x00,0x00,0x16,0x00,0x00,0x1e,0x00,0x04,0x00,0x00,0x00,0x00,0x14,0x15,0x00,0x1a, 0x20,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x2e,0x3c,0x28,0x2b,0x5b, 0x26,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x21,0x24,0x2a,0x29,0x3b,0x5d, 0x2d,0x2f,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x7c,0x2c,0x25,0x5f,0x3e,0x3f, 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x3a,0x23,0x40,0x27,0x3d,0x22, 
0x00,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x00,0x00,0x00,0x00,0x00,0x00, 0x00,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,0x70,0x71,0x72,0x00,0x00,0x00,0x00,0x00,0x00, 0x00,0x7e,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x00,0x00,0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, 0x7b,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x00,0x00,0x00,0x00,0x00,0x00, 0x7d,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,0x50,0x51,0x52,0x00,0x00,0x00,0x00,0x00,0x00, 0x5c,0x00,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0x00,0x00,0x00,0x00,0x00,0x00, 0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x00,0x00,0x00,0x00,0x00,0x00, }; char btoa(c) char c; { if (c < 0) return -1; return _toa[c]; } char atob(c) char c; { if (c < 0) return -1; if (c <= 0x7f) return _tob[c]; else return 0; } /* " @(#)btoa.c 1.1 (lkdb) - 87/02/11 \n"; */ --- cut --- cut --- cut --- cut --- cut --- cut --- cut --- cut --- luke. - -- _ _ __ Via Aleardo Aleardi, 12 - 20154 Milano (Italy) | | | _ _| (__ PHONE : +39 2 3315328 FAX: +39 2 3315778 | | |(_)(_||_|___) Srl E-MAIL: luke@modus.sublink.ORG ______________________________ Software & Services for Advertising & Marketing
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (10/09/90)
In article <12609@blia.BLI.COM>, jeffb@blia.BLI.COM (Jeff Beard) writes: > static char dummy[] = {0}; /* for EOF index */ > char atoe[] = { > }; This isn't going to work. A compiler may insert any amount of padding after dummy[]. It may even put atoe[] at a lower address than dummy[]. (There is nothing to stop a compiler sorting top-level variables into alphabetic order...) 'static' and 'extern' variables might well go into different sections. And so on. Even struct { char eof_code; char atoe[256]; } = { 0, /* atoe values as before */ } isn't going to work in general because a compiler may insert padding between fields. The only method that is going to work is char RAWatoe[] = { 0, /* atoe values as before */ }; char *atoe = RAWatoe+1; /* OR #define atoe(x) RAWatoe[1+(x)] */ > static char dummy2[] = {0}; /* for EOF index */ > char etoa[] = { > #ifdef OLDC > 0321 , 0322 /* ~ */, 0163 /* 's' */, 0164 /* 't' */, > #else OLDC > 0321 , 0176 /* ~ */, 0163 /* 's' */, 0164 /* 't' */, > #endif OLDC > }; This one isn't going to work for an additional reason: the ANSI C standard doesn't accept tokens after #else or #endif, and the ANSI standard doesn't accept it because it wasn't universal practice. For example, the C compiler for UNIX V.3 chokes on them. The two following 'ed' commands may be useful to people who still have to fix this in their code. (My code used to be _full_ of this stuff, but I don't blame ANSI, it really wasn't portable.) 1,$ s:^\([ \t]*#[ \t]*else[ \t][ \t]*\)\([^ \t/].*\)$:\1/*\2*/: 1,$ s:^\([ \t]*#[ \t]*endif[ \t][ \t]*\)\([^ \t/].*\)$:\1/*\2*/: -- Fear most of all to be in error. -- Kierkegaard, quoting Socrates.
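The single-array approach described above might be sketched as follows. This is an illustration under stated assumptions, not the poster's code: the names raw_table, xlate, init_xlate and xlate_char are hypothetical, the element type is int rather than char so that EOF itself can be stored, the table is filled with an identity placeholder instead of real codes, and the sketch assumes EOF is -1 (the standard only promises a negative int).

```c
#include <stdio.h>   /* EOF */

#if EOF != -1
#error "this sketch assumes EOF == -1"
#endif

/* One array, so the layout is guaranteed: slot 0 is the EOF entry,
 * slots 1..256 hold the translations for codes 0..255. */
static int raw_table[257];

/* xlate[-1] is raw_table[0]; xlate[0..255] are raw_table[1..256]. */
static int *const xlate = raw_table + 1;

/* Fill the table. Identity is only a placeholder; a real program
 * would store its ASCII/EBCDIC codes here. */
void init_xlate(void)
{
    int c;

    raw_table[0] = EOF;          /* EOF translates to EOF */
    for (c = 0; c < 256; c++)
        xlate[c] = c;
}

/* c must be EOF or a value in 0..255, as returned by getc(). */
int xlate_char(int c)
{
    return xlate[c];
}
```

Because the EOF slot and the 256 code slots live in one array, no assumption about how the compiler places separate objects is needed.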
bakke@plains.NoDak.edu (Jeffrey P. Bakke) (10/11/90)
In article <661@modus.sublink.ORG> luke@modus.sublink.ORG (Luciano Mannucci) writes: > In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes: > > We are communicating between Sun 3 (with SunOS 4.0.3) and an > > IBM Mainframe (don't know the model, we're not IBM jocks) > > via TCP/IP. Our question, is there any program in Netland > > that converts back and forth between EBCDIC and ASCII > > (preferably in C, but we will take any example)? > > Apologies for posting C code in the wrong newsgroup. That's alright, it's always interesting to me. Anyway, if you're on a Sun system, an easier way would be to send the file over from the IBM to the Sun, and then use the dd program. This program allows file copies with translation. There are options to translate from EBCDIC to ASCII and vice versa. You'd have to look through the man pages. But this will probably do what you need. The 'dd' program is part of the standard SunOS installation tape, I believe. It should be located in the /usr/bin directory. No need to write your own conversion program if the utilities already exist. -- Jeffrey P. Bakke Internet: bakke@plains.NoDak.edu UUCP : ...!uunet!plains!bakke BITNET : bakke@plains.bitnet
jc@atcmp.nl (Jan Christiaan van Winkel) (10/12/90)
From article <661@modus.sublink.ORG>, by luke@modus.sublink.ORG (Luciano Mannucci): | In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes: |> via TCP/IP. Our question, is there any program in Netland |> that converts back and forth between EBCDIC and ASCII | | static char _tob[] = { | 0x00,0x01,0x02,0x03,0x37,0x2d,0x2e,0x2f,0x16,0x05,0x25,0x0b,0x0c,0x0d,0x0e,0x0f, . . | 0x97,0x98,0x99,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xc0,0x6a,0xd0,0xa1,0x07, | }; | static char _toa[] = { | 0x00,0x01,0x02,0x03,0x00,0x09,0x00,0x7f,0x00,0x00,0x00,0x0b,0x0c,0x0d,0x0e,0x0f, . . | 0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x00,0x00,0x00,0x00,0x00,0x00, | }; | | char btoa(c) | char c; | { | if (c < 0) return -1; | return _toa[c]; | } This is *THE* case where you need unsigned chars instead of plain chars. The problem is that EBCDIC uses the full range from 0 to 255 for its characters. For example, the EBCDIC code for '3' is 0xf3. On a machine that uses signed characters for plain chars, the number in c will be interpreted as a negative number, fooling your test in btoa... JC -- ___ __ ____________________________________________________________________ |/ \ Jan Christiaan van Winkel Tel: +31 80 566880 jc@atcmp.nl | AT Computing P.O. Box 1428 6501 BK Nijmegen The Netherlands __/ \__/ ____________________________________________________________________
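The signedness problem described above can be illustrated with a small sketch. It is not the posted btoa(): the names digits_toa, btoa_fixed and rejected_as_signed are hypothetical, and the table covers only the EBCDIC digits 0xf0..0xf9 so that the example stays short. The second function imitates a machine where plain char is signed; converting 0xf3 to signed char is implementation-defined in older C standards, and the sketch assumes the usual two's-complement result.

```c
/* EBCDIC-to-ASCII lookup for the digits only (0xf0..0xf9 -> '0'..'9').
 * All other entries are 0; a real table would fill every slot. */
static const unsigned char digits_toa[256] = {
    [0xf0] = 0x30, [0xf1] = 0x31, [0xf2] = 0x32, [0xf3] = 0x33,
    [0xf4] = 0x34, [0xf5] = 0x35, [0xf6] = 0x36, [0xf7] = 0x37,
    [0xf8] = 0x38, [0xf9] = 0x39
};

/* Taking unsigned char means the index is always 0..255, so no
 * "c < 0" test is needed and 0xf3 is looked up correctly. */
int btoa_fixed(unsigned char c)
{
    return digits_toa[c];
}

/* Mirrors the "c < 0" test from the quoted btoa() on a machine where
 * plain char is signed: returns 1 if the byte would be rejected. */
int rejected_as_signed(unsigned char byte)
{
    signed char c = (signed char)byte;

    return c < 0;
}
```

With these definitions, the EBCDIC digit 0xf3 is rejected by the signed test but translated to ASCII '3' (0x33) by the unsigned version.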
adw@otter.hpl.hp.com (Dave Wells) (10/16/90)
Charles Noren at Martin Marietta I&CS, Denver CO: >We are communicating between Sun 3 (with SunOS 4.0.3) and an >IBM Mainframe (don't know the model, we're not IBM jocks) >via TCP/IP. Our question, is there any program in Netland >that converts back and forth between EBCDIC and ASCII >(preferably in C, but we will take any example)? Jonathan I. Kamens at Massachusetts Institute of Technology: | The Unix program "dd" does this. In particular, the "conv=ascii" option |converts EBCDIC to ASCII, and the "conv=ebcdic" option goes the other way. | See the man page for more information. It's particularly worth noting this (from that dd man page): ASCII and EBCDIC conversion tables are taken from the 256-character ACM standard, Nov, 1968. The ibm conversion, while less widely accepted as a standard, corresponds better to certain IBM print train conventions. *There is no universal solution.* If you're translating "ordinary text files", dd will probably do the trick. If you're hoping to translate files or streams containing "unusual" characters (e.g. control codes for a graphics terminal), the exact translation table may well vary on a per-site basis. Dave Wells
johncore@compnect.UUCP (John Core ) (10/18/90)
Why write code to convert ASCII to EBCDIC? It comes with Unix. It's called dd. Wizard Systems | UUCP: uunet!wa3wbu!compnect!johncore P.O. Box 6269 |INTERNET: johncore@compnect.wa3wbu Harrisburg, Pa. 17112-6269 |a public bbs since 1978. Data(717)657-4992 & 4997 John Core, SYSOP |------------------------------------------------- ----------------------------| No matter where you go, there you are! a woman is just a woman, but a good cigar is a smoke. -R. Kipling
exspes@gdr.bath.ac.uk (P E Smee) (10/22/90)
In article <28020001@otter.hpl.hp.com> adw@otter.hpl.hp.com (Dave Wells) writes: >Jonathan I. Kamens at Massachusetts Institute of Technology: > >| The Unix program "dd" does this. In particular, the "conv=ascii" option >|converts EBCDIC to ASCII, and the "conv=ebcdic" option goes the other way. >| See the man page for more information. > >It's particularly worth noting this (from that dd man page): > > ASCII and EBCDIC conversion tables are taken from the 256-character > ACM standard, Nov, 1968. The ibm conversion, > while less widely accepted as a standard, corresponds better > to certain IBM print train conventions. *There is no > universal solution.* > >If you're translating "ordinary text files", dd will probably do the trick. >If you're hoping to translate files or streams containing "unusual" >characters (e.g. control codes for a graphics terminal), the exact >translation table may well vary on a per-site basis. Actually, it *can* be even worse than this, and you don't need to get into very complicated characters. Even 'ordinary text' can pose problems. Under VM/CMS (for example) there are at least 3 possible EBCDIC mappings for the square brackets [ and ]. Which you need may vary not only per-site, but even according to which package you used to produce the file on a single machine. Since commercial packages tend to arrive 'object only', there's not even much you can do about it. There are other similar problem characters. []'s come instantly to mind as a result of having spent some time trying to move a portable C program into EBCDIC. Generally, the problem is that such characters do not exist in 'formal' EBCDIC; but do exist (with varying codings) on different IBM printer belts. As a pragmatic solution, package writers have used the printer belt codes for them; and it appears that their results vary depending on which belts (and printer models) their development machine had. 
-- Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132
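The two tables posted earlier in this thread already show the bracket problem: one maps ASCII '[' and ']' to EBCDIC 0xad and 0xbd (with '|' at 0x4f), the other maps them to 0x4f and 0x5f (with '|' moved to 0x6a). One way to cope, sketched here under the assumption that a site keeps a base ASCII-to-EBCDIC table and patches the disputed entries, is shown below. The enum and function names are hypothetical, and only the three characters just mentioned are patched; a real override list would have to cover every character on which the site's convention differs.

```c
#include <string.h>

/* Hypothetical labels for the two conventions seen in this thread. */
enum bracket_style {
    BRACKETS_AD_BD,   /* '[' -> 0xad, ']' -> 0xbd, '|' -> 0x4f */
    BRACKETS_4F_5F    /* '[' -> 0x4f, ']' -> 0x5f, '|' -> 0x6a */
};

/* Copy a 256-entry ASCII-to-EBCDIC base table into out, then patch
 * the entries for '[' (0x5b), ']' (0x5d) and '|' (0x7c). */
void make_site_atoe(unsigned char out[256], const unsigned char base[256],
                    enum bracket_style style)
{
    memcpy(out, base, 256);
    switch (style) {
    case BRACKETS_AD_BD:
        out[0x5b] = 0xad;
        out[0x5d] = 0xbd;
        out[0x7c] = 0x4f;
        break;
    case BRACKETS_4F_5F:
        out[0x5b] = 0x4f;
        out[0x5d] = 0x5f;
        out[0x7c] = 0x6a;
        break;
    }
}
```

The reverse (EBCDIC-to-ASCII) table would need matching patches so that a round trip returns the original text.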
meissner@osf.org (Michael Meissner) (10/25/90)
In article <831@compnect.UUCP> johncore@compnect.UUCP (John Core ) writes: | why write code do convert ascii to EBCDIC. it comes with | Unix. it's called dd However, unlike say ISO646 or ASCII, there is no one standard EBCDIC. There are various EBCDICs which share a lot of characters in common, but have some different translations. Also, you have the fun in most EBCDICs in that there are two representations for '[' and ']': one that your printer will print, and one that your terminal will display correctly. -- Michael Meissner email: meissner@osf.org phone: 617-621-8861 Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142 Do apple growers tell their kids money doesn't grow on bushes?
schafer@devils.rice.edu (Richard A. Schafer) (10/26/90)
In article <MEISSNER.90Oct24150859@osf.osf.org>, meissner@osf.org (Michael Meissner) writes: ||> However, unlike say ISO646 or ASCII, there is no one standard EBCDIC. To be fair, there is no *one* standard ASCII, either, if you consider ASCII to include any of the several European versions of ASCII available with ISO numbers which I don't remember off the top of my head. That's why ISO has been spending so much time over the past several years working up new code point standards.
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (10/26/90)
In article <1990Oct25.140442@devils.rice.edu>, schafer@devils.rice.edu (Richard A. Schafer) writes: > In article <MEISSNER.90Oct24150859@osf.osf.org>, meissner@osf.org > (Michael Meissner) writes: > ||> However, unlike say ISO646 or ASCII, there is no one standard > EBCDIC. > To be fair, there is no *one* standard ASCII, either There is one and only one ASCII (well, there was an old version, but there has been only one for many years). ASCII stands for *AMERICAN* Standard Code for Information Interchange. ASCII is one particular national variant of the ISO 646 standard. The European versions of ISO 646 aren't versions of ASCII. The new standard (ISO 8859) is a family of 8-bit codes. Every member of the family has the same graphic characters in the lower half (32..126) as ASCII; this is a compatible extension of ISO 646. If you want to draw a parallel between ISO 646 and EBCDIC, there are *lots* of versions of EBCDIC. There's a French one and a Spanish one and a Hebrew one and ... Undeniably commendable. The thing that people complain about is having several incompatible versions within the same "locale" (to use an ANSI-C-ism). -- Fear most of all to be in error. -- Kierkegaard, quoting Socrates.
henry@zoo.toronto.edu (Henry Spencer) (10/26/90)
In article <1990Oct25.140442@devils.rice.edu> schafer@devils.rice.edu (Richard A. Schafer) writes: >||> However, unlike say ISO646 or ASCII, there is no one standard >EBCDIC. >To be fair, there is no *one* standard ASCII, either, if you consider >ASCII to include any of the several European versions of ASCII... There are no, repeat *no*, European versions of ASCII. ASCII is a single precisely-specified character code with no versions or ambiguities. It is one of a family of codes derived from ISO646. There are a number of other 646-derived codes in use in Europe; they are not ASCII. It is true that the existence of a variety of 7-bit codes has turned out to be a major nuisance, which is why there has been considerable work on unified codes like ISO Latin. -- The type syntax for C is essentially | Henry Spencer at U of Toronto Zoology unparsable. --Rob Pike | henry@zoo.toronto.edu utzoo!henry