rob (06/11/82)
Our MIS facility has a tape in IBM EBCDIC format. We can read the tape with dd. However, they use a packed format in which numeric data is packed two digits (4 bits each) per byte, and dd translates each such byte as if it were a character. Has anyone ever run into this problem? Not all bytes have to be broken down; some really are characters. Please mail replies to decvax!genradbolton!rob. Thanks. Rob Wood.
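For readers following along, here is a minimal sketch of decoding one such field. It assumes the tape uses standard IBM packed decimal: two decimal digits per byte, with the low nibble of the last byte holding the sign. The function name is hypothetical. The field has to come from the raw tape bytes, since any character translation applied first (for example by dd's conv=ascii) would scramble the nibbles.

```python
def unpack_packed_decimal(field: bytes) -> int:
    """Decode one packed-decimal field (assumed IBM layout) into an int."""
    if not field:
        raise ValueError("empty packed-decimal field")

    # Split every byte into its high and low 4-bit halves.
    nibbles = []
    for byte in field:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    # Assumption: the final nibble is the sign, not a digit.
    sign = nibbles.pop()
    if sign < 0x0A:
        raise ValueError(f"invalid sign nibble {sign:#x}")

    value = 0
    for digit in nibbles:
        if digit > 9:
            raise ValueError(f"invalid digit nibble {digit:#x}")
        value = value * 10 + digit

    # 0xB and 0xD mark negative values; 0xA, 0xC, 0xE, 0xF are positive.
    return -value if sign in (0x0B, 0x0D) else value
```

A record reader would call this only on the byte ranges that the record layout marks as packed, and pass the remaining character fields through an EBCDIC-to-ASCII table.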
mo@LBL-UNIX@sri-unix (07/29/82)
Date: 27 Jun 1982 11:41:23-PDT

Ah yes.... The wonders of packed decimal. Since the file contains records with mixed types, "dd" can't be made to do the job. You will have to write a special-purpose program which "knows about" the incoming record layout to hack it. You can steal the ascii-ebcdic conversion table from "dd" and maybe some of the logic (pretty trivial though). I assume you understand you are dealing with packed-decimal data.

One other point - the ascii-ebcdic tables in "dd" are highly tuned for the Bell Labs sites. I have a translation table which is 1-1 and onto. By that, I mean the following:

1) ascii-96 characters go onto their "most similar" glyph in the TN print-train graphics assignment. This means you can print conversion output with a TN print train and see everything you would see on an ASCII printing terminal. A couple of ascii characters don't have identical graphics, but they are similar and, most importantly, UNIQUE!!

2) ascii control characters go onto the ebcdic equivalent where possible (almost all cases), and onto something similar where equivalence isn't possible.

3) everything else goes to something unique. The upper 128 go into holes in the ebcdic code table, which are generally a reflection of assigned graphics about some code.

4) my version of "dd" which uses this table contains an algorithm which computes and verifies the inverse function of this table to get the ebcdic-ascii table.

This table has the very nice property that if you translate something and don't like what you get, you can translate it the other way and get it back. It also means that all 256 ebcdic characters are preserved in translation; you may have to do a little work to find them, but they are not dropped on the floor!! This is of great use when cracking tapes written at random IBM-ish sites (or worse, non-IBM sites that think they understand ebcdic!).

If there is enough interest in this, I will post it to Unix-wizards.

-Mike
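The idea of computing and verifying the inverse of a 1-1 and onto table can be sketched as follows. This is not Mike's table or his "dd" code, neither of which appears in the post. DEMO_TABLE is a stand-in permutation of the 256 byte values, not a real ascii-ebcdic mapping, and invert_table is a hypothetical name.

```python
def invert_table(forward: bytes) -> bytes:
    """Return the inverse of a 256-entry table, or raise if it is not 1-1."""
    if len(forward) != 256:
        raise ValueError("translation table must have exactly 256 entries")

    inverse = [None] * 256
    for src, dst in enumerate(forward):
        if inverse[dst] is not None:
            # Two inputs share one output, so the mapping cannot be undone.
            raise ValueError(
                f"{inverse[dst]:#04x} and {src:#04x} both map to {dst:#04x}"
            )
        inverse[dst] = src

    # 256 distinct outputs filling 256 slots means the table is also onto.
    return bytes(inverse)


# Stand-in table (assumption, for illustration only): multiplying by an odd
# number modulo 256 permutes the byte values.
DEMO_TABLE = bytes((i * 7 + 3) % 256 for i in range(256))
DEMO_INVERSE = invert_table(DEMO_TABLE)

# Round trip: translating one way and then the other restores every byte.
original = bytes(range(256))
restored = original.translate(DEMO_TABLE).translate(DEMO_INVERSE)
```

With a table that maps two inputs to the same output, invert_table raises instead of silently losing a character, which is the failure the verification step guards against.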