[net.unix-wizards] Packed EBCDIC converter needed for dd

rob (06/11/82)

Our MIS facility has a tape in IBM EBCDIC format.  We can read
the tape with dd.  However, they use a packed format in which numeric
data is stored two digits (4 bits each) per byte, and dd translates
each such byte as if it were a character.  Has anyone ever run into
this problem?  Not all bytes have to be broken down; some really are
characters.  Please mail replies to decvax!genradbolton!rob.  Thanks.
Rob Wood.

mo@LBL-UNIX@sri-unix (07/29/82)

Date: 27 Jun 1982 11:41:23-PDT
Ah yes....  The wonders of packed decimal.  Since the file contains records
with mixed types, "dd" can't be made to do the job.   You will have
to write a special-purpose program which "knows about" the
incoming record layout to hack it.  You can steal the ascii-ebcdic
conversion table from "dd" and maybe some of the logic (pretty
trivial though).  I assume you understand you are dealing with
packed-decimal data.
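
A minimal sketch of such a layout-aware converter, in Python for brevity.
Everything named here is an assumption for illustration: the record layout
(LAYOUT) is hypothetical, the text fields are assumed to be in the cp037
EBCDIC code page, and the numeric fields are assumed to follow the usual IBM
packed-decimal convention (two digits per byte, with the low nibble of the
last byte holding the sign: 0xB or 0xD negative, the other values 0xA-0xF
positive).  It also assumes the tape was copied with no character
conversion, so the packed bytes arrive intact.

```python
def unpack_packed_decimal(field: bytes) -> int:
    """Decode one packed-decimal field into an integer."""
    if not field:
        raise ValueError("empty packed-decimal field")
    # Split every byte into its high and low 4-bit halves.
    nibbles = []
    for b in field:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    # Assumption: the final nibble is the sign, not a digit.
    sign = nibbles.pop()
    if sign < 0x0A:
        raise ValueError("last nibble is not a sign code")
    if any(d > 9 for d in nibbles):
        raise ValueError("non-decimal digit nibble in field")
    value = 0
    for d in nibbles:
        value = value * 10 + d
    return -value if sign in (0x0B, 0x0D) else value


# Hypothetical layout: (field name, kind, length in bytes).
LAYOUT = [("name", "char", 8), ("amount", "packed", 3)]


def decode_record(record: bytes, layout=LAYOUT, codec: str = "cp037") -> dict:
    """Walk one fixed-length record, converting each field by its kind."""
    out = {}
    pos = 0
    for name, kind, length in layout:
        raw = record[pos:pos + length]
        if len(raw) != length:
            raise ValueError("record shorter than its layout")
        if kind == "char":
            # Only character fields go through the EBCDIC translation.
            out[name] = raw.decode(codec).rstrip()
        elif kind == "packed":
            out[name] = unpack_packed_decimal(raw)
        else:
            raise ValueError("unknown field kind: " + kind)
        pos += length
    return out
```

The point of the design is that the translation table is applied only to
the fields the layout marks as characters; the packed fields are never run
through it, which is exactly what a blanket dd conversion cannot do.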

One other point - the ascii-ebcdic tables in "dd" are highly-tuned
for the Bell Labs sites.  I have a translation table which is 1-1
and onto.  By that, I mean the following:

	1) ascii-96 characters go onto their "most similar" glyph
	   in the TN print-train graphics assignment.  This means
	   you can print conversion output with a TN print train
	   and see everything you would see on an ASCII printing
	   terminal.  A couple of ascii characters don't have
	   identical graphics, but they are similar and, most
	   importantly, UNIQUE!!

	2) ascii control characters go onto the ebcdic equivalent
	   where possible (almost all cases), or onto something similar
	   where equivalence isn't possible

	3) everything else goes to something unique.  The upper 128
	   codes go into holes in the ebcdic code table, which are generally
	   a reflection of assigned graphics about some code

	4) my version of "dd" which uses this table contains an algorithm
	   which computes and verifies the inverse function of this
	   table to get the ebcdic-ascii table 

This table has the very nice property that if you translate something
and don't like what you get, you can translate it the other way
and get it back.  It also means that all 256 ebcdic characters
are preserved in translation; you may have to do a little work to find
them, but they are not dropped on the floor!!  This is of great
use when cracking tapes written at random IBM-ish sites (or worse,
non-IBM sites that think they understand ebcdic!).
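
The inverse computation in item 4 and the round-trip property can be
sketched as follows.  This is not the code from the modified "dd"; it is a
small Python illustration, and STAND_IN_TABLE is an arbitrary made-up
permutation of the 256 byte values standing in for the real ascii-ebcdic
table, which is not reproduced here.

```python
def invert_table(table: bytes) -> bytes:
    """Return the inverse of a 256-entry table, verifying it is 1-1 and onto."""
    if len(table) != 256:
        raise ValueError("translation table must have exactly 256 entries")
    inverse = [None] * 256
    for src, dst in enumerate(table):
        if inverse[dst] is not None:
            # Two source codes collide on one target, so no inverse exists.
            raise ValueError(
                "codes %#04x and %#04x both map to %#04x"
                % (inverse[dst], src, dst)
            )
        inverse[dst] = src
    # 256 entries with no collisions means every target code was hit once.
    return bytes(inverse)


# Stand-in permutation (an assumption, NOT a real ascii-ebcdic table).
# Multiplying by an odd number modulo 256 is a bijection on 0..255.
STAND_IN_TABLE = bytes((i * 7 + 3) % 256 for i in range(256))
STAND_IN_INVERSE = invert_table(STAND_IN_TABLE)


def round_trips(data: bytes) -> bool:
    """Translate forward, then back, and report whether the data survived."""
    there = data.translate(STAND_IN_TABLE)
    return there.translate(STAND_IN_INVERSE) == data
```

With a table that passes this check, round_trips() holds for any input,
including all 256 byte values.  A table that sends two codes to the same
place is rejected when the inverse is built, rather than silently losing
one of them.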

If there is enough interest in this, I will post it to Unix-wizards.

	-Mike