barryg@sdcrdcf.UUCP (05/25/83)
Relay-Version:version B 2.10 5/3/83; site harpo.UUCP
Posting-Version:version B 2.10 beta 3/9/83; site sdcrdcf.UUCP
Message-ID:<275@sdcrdcf.UUCP>
Date:Wed, 25-May-83 11:09:40 EDT
Organization:System Development Corporation--a Burroughs Company
dd gets several characters wrong when translating from EBCDIC to ASCII
(conv=ascii). Among these are !, [, and ], as well as several control
characters you couldn't care less about.
The problem is in the array etoa, at line 39 in dd.c (at least in our copy).
The corrected version (which is also commented for readability) follows.
char etoa[] = {
/*NUL SOH STX ETX HT DEL*/
/*00*/ 0000,0001,0002,0003,0234,'\t',0206,0177,
/* VT FF CR SO SI*/
0227,0215,0216,0013,'\f','\r',0016,0017,
/*DLE DC1 DC2 DC3 NL BS*/
/*10*/ 0020,0021,0022,0023,0235,'\n','\b',0207,
/* EM SUB FS GS RS US*/
0030,0031,0222,0217,0034,0035,0036,0037,
/*LF ETB ESC*/
/*20*/ 0200,0201,0202,0203,0204,'\n',0027,0033,
/*ENQ ACK BEL*/
0210,0211,0212,0213,0214,0005,0006,0007,
/* SYN EOT*/
/*30*/ 0220,0221,0026,0223,0224,0225,0226,0004,
/* DC4 NAK SUB*/
0230,0231,0232,0233,0024,0025,0236,0032,
/*40*/ ' ',0240,0241,0242,0243,0244,0245,0246,
/*cent*/
0247,0250,0323, '.', '<', '(', '+', '^',
/*50*/ '&',0251,0252,0253,0254,0255,0256,0257,
0260,0261, '!', '$', '*', ')', ';', '^',
/*60*/ '-', '/',0262,0263,0264,0265,0266,0267,
0270,0271, '|', ',', '%', '_', '>', '?',
/*70*/ 0272,0273,0274,0275,0276,0277,0300,0301,
0302, '`', ':', '#', '@','\'', '=','\"',
/*80*/ 0303, 'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i',0304,0305,0306,0307,0310,0311,
/*90*/ 0312, 'j', 'k', 'l', 'm', 'n', 'o', 'p',
'q', 'r',0313,0314,0315,0316,0317,0320,
/*A0*/ 0321, '~', 's', 't', 'u', 'v', 'w', 'x',
'y', 'z',0322,0324,0325, '[',0326,0327, /*AD = [*/
/*B0*/ 0330,0331,0332,0333,0334,0335,0336,0337,
0340,0341,0342,0344,0345, ']',0346,0347, /*BD = ]*/
/*C0*/ '{', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I',0350,0351,0352,0353,0354,0355,
/*D0*/ '}', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
'Q', 'R',0356,0357,0360,0361,0362,0363,
/*E0*/ '\\',0237, 'S', 'T', 'U', 'V', 'W', 'X',
'Y', 'Z',0364,0365,0366,0367,0370,0371,
/*F0*/ '0', '1', '2', '3', '4', '5', '6', '7',
'8', '9',0372,0373,0374,0375,0376,0377,
};
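A table this size is easy to get subtly wrong when edited by hand. What follows is a minimal sketch of a sanity check, assuming you want the table to be one-to-one (the post itself does not say it must be). The helper name missing_outputs is made up for illustration and is not part of dd.c.

```c
/*
 * Count the byte values that never appear as an output of a 256-entry
 * translation table.  A one-to-one table has none missing; each missing
 * value means some other value appears more than once.
 * (Hypothetical helper, not taken from dd.c.)
 */
int missing_outputs(const unsigned char table[256])
{
    unsigned char seen[256] = {0};
    int i, missing = 0;

    /* Mark every value the table can produce. */
    for (i = 0; i < 256; i++)
        seen[table[i]] = 1;

    /* Anything left unmarked is never produced. */
    for (i = 0; i < 256; i++)
        if (!seen[i])
            missing++;

    return missing;
}
```

Since etoa is declared as plain char, it would have to be passed as (const unsigned char *)etoa.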
barryg@sdcrdcf.UUCP (05/25/83)
Relay-Version:version B 2.10 5/3/83; site harpo.UUCP
Posting-Version:version B 2.10 beta 3/9/83; site sdcrdcf.UUCP
Message-ID:<278@sdcrdcf.UUCP>
Date:Wed, 25-May-83 11:31:39 EDT
Organization:System Development Corporation--a Burroughs Company
Urr... I forgot to mention that the codes I chose for relatively obscure
(in EBCDIC) characters like square brackets are those that are provided on
IBM's SN and TN text-printing print trains (and more modern printers with
96 or more graphics in their vocabulary).
The PrOOp lists different values for the brackets, but I don't know any
running IBM system that uses THOSE values. Certainly our 4331 VM and
AMDAHL MVS systems use AD and BD for [,] (which is what I coded in table
etoa).
pdl@root44.UUCP (06/01/83)
Oh dear, when will people realise that `EBCDIC' is meaningless?
See the BUGS section in dd(1) for where those tables come from, and for
why not to use them unless you're SURE that they ARE correct.
(I know of at least 3 different manufacturers' `EBCDIC's which are ALL
different (DEC, Burroughs & IBM all seem to disagree on minor points),
and that's without going into the funny business of printers, which
generally seem to disagree even more.)
There are two solutions to the problem:
1) roll your own translator (possibly using tr)
2) defenestrate everybody who even thinks about EBCDIC, to ensure
   non-recurrence of the problem (IBM, I hope you're listening).
Dave Lukes ...!vax135!ukc!root44!pdl (in ASCII not EBCDIC, please)
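For the first of the two solutions (roll your own translator), here is a minimal sketch in C. It assumes you start from some base table you already trust and patch in the codes your own site actually uses. The names fixup, patch_table and translate are made up for illustration; the AD and BD bracket codes in the usage note are the ones reported earlier in this thread, not a universal truth.

```c
#include <stddef.h>

/* One site-specific correction: input code and the byte it should become. */
struct fixup {
    unsigned char from;
    unsigned char to;
};

/* Overwrite selected entries of an existing 256-entry translation table. */
void patch_table(unsigned char table[256], const struct fixup *fixes, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        table[fixes[i].from] = fixes[i].to;
}

/* Translate a buffer in place, one byte at a time, through the table. */
void translate(unsigned char *buf, size_t len, const unsigned char table[256])
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}
```

A site that sees brackets at AD and BD would pass fixes such as {0xAD, '['} and {0xBD, ']'}; a site whose printers disagree would pass its own list and leave the base table alone.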