[comp.protocols.tcp-ip] SOS: C Routines for ASCII to EBCDIC Conversion and Vice-versa

satya@ssdc.honeywell.com (Satya Prabhakar) (06/20/91)

Hi,

I am involved in developing the ISO/RDA protocol between CDC 920
and IBM 3090.  CDC 920 uses ASCII representation and the IBM
uses EBCDIC. I am looking for C routines that convert ASCII strings
into EBCDIC strings and vice-versa.  We need these DESPERATELY and
asap. If you have some information re: these, please let me know
asap. I would REALLY appreciate your help.

Very many thanks in advance.

Satya Prabhakar  (satya@ssdc.honeywell.com)
(Office Phone: 612-782-7134)

rickert@mp.cs.niu.edu (Neil Rickert) (06/20/91)

In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:
>uses EBCDIC. I am looking for C routines that convert ASCII strings
>into EBCDIC strings and vice-versa.  We need these DESPERATELY and

  I don't understand.  What is wrong with a simple loop replacing
c with EBCDIC[c] for each char c in the string of unsigned chars.
Something like:
	while(*p) { *p = EBCDIC[*p]; p++;}

 You do need to initialize your table.  Pick up the tables used on the
3090 and use those.  If you don't like that choice build your own in
the grand tradition of mutual inconsistency which currently exists in
the ASCII <-> EBCDIC translation world.
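
The loop above can be fleshed out as a minimal sketch.  The function name
translate_buffer and the explicit length argument are assumptions added
here, not part of the original suggestion; a length lets the routine cope
with zero bytes in the data, where the while(*p) form would stop early.
The table contents are whichever mapping you settle on.

```c
#include <stddef.h>

/* Replace each byte of buf with its entry in a 256-byte lookup table.
 * The table is supplied by the caller; this routine does not care
 * which ASCII <-> EBCDIC mapping it holds.  The buffer is unsigned
 * char so that bytes above 0x7F never become negative indexes. */
void translate_buffer(unsigned char *buf, size_t len,
                      const unsigned char table[256])
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}
```

Calling translate_buffer(msg, n, ascii_to_ebcdic) converts n bytes of msg
in place, where ascii_to_ebcdic is a hypothetical 256-entry table you
have filled in.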


-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

exspes@gdr.bath.ac.uk (P E Smee) (06/20/91)

In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:
>I am involved in developing the ISO/RDA protocol between CDC 920
>and IBM 3090.  CDC 920 uses ASCII representation and the IBM
>uses EBCDIC. I am looking for C routines that convert ASCII strings
>into EBCDIC strings and vice-versa.  We need these DESPERATELY and
>asap. If you have some information re: these, please let me know
>asap. I would REALLY appreciate your help.

There's a basic problem here, which is that there is (still) no such
thing as EBCDIC.  At present, for example (and I may have missed some
possibilities) the following codings might be used for 5 example
special chars:

    Char   ASCII    EBCDIC
    {      0x7b     0x43, 0x44, 0x51, 0x63, 0x9c, or 0xc0
    }      0x7d     0x47, 0x54, 0xd0, or 0xdc
    [      0x5b     0x4a, 0x90, 0x9e, 0xb1, 0xb5, 0xba
    ]      0x5d     0x51, 0x5a, 0x9f, 0xb5, 0xbb
    |      0x7c     0x4f, 0xbb

and, there are some characters in each set which don't exist in the
other.  Apropos those codes above, the problem is that original EBCDIC
did not define codes for a lot of the special chars, but they DID exist
on various IBM printer belts (in different places).  So, as a pragmatic
approach, various packages used, internally, the encodings which they
thought most likely to work with the types of print belts that they
thought their type of users would use.

The UK's BlueBook file transfer protocol suggested using the following
mapping, which seems to work pretty well most of the time, which is as
good as you're gonna be able to do (has the advantage of being 1-1 and
reversible as long as the chars involved exist in the ASCII set):

    0x  1x  2x  3x  4x  5x  6x  7x  8x  9x  Ax  Bx  Cx  Dx  Ex  Fx
x0  00  10  40  F0  7C  D7  79  97
x1  01  11  5A  F1  C1  D8  81  98
x2  02  12  7F  F2  C2  D9  82  99
x3  03  13  7B  F3  C3  E2  83  A2
x4  37  3C  5B  F4  C4  E3  84  A3
x5  2D  3D  6C  F5  C5  E4  85  A4
x6  2E  32  50  F6  C6  E5  86  A5
x7  2F  26  7D  F7  C7  E6  87  A6
x8  16  18  4D  F8  C8  E7  88  A7
x9  05  19  5D  F9  C9  E8  89  A8
xA  25  3F  5C  7A  D1  E9  91  A9
xB  0B  27  4E  5E  D2  AD  92  C0
xC  0C  1C  6B  4C  D3  E0  93  4F
xD  0D  1D  60  7E  D4  BD  94  D0
xE  0E  1E  4B  6E  D5  71  95  5F
xF  0F  1F  61  6F  D6  6D  96  07

This is EBCDIC on the inside, ANSI (IA5) on the edges.  So, to turn
ASCII 0x21 into EBCDIC, you look in the column headed 2x on the row
labelled x1, and come up with EBCDIC 0x5a.  To find out what EBCDIC
0xD1 is, you find D1 in the table, in column 4x, row xA, so it is ASCII
0x4a.  (Note this assumes 0-parity ASCII, as well.)
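
As a rough sketch, the table above can be typed into C as follows, one
source row per column of the table.  The names ascii_to_ebcdic,
ebcdic_to_ascii and build_ebcdic_to_ascii are made up for illustration,
and the choice of '?' (0x3F) for EBCDIC codes that have no ASCII
counterpart is an assumption, not something the Blue Book table specifies.

```c
#include <stddef.h>

/* ASCII 0x00-0x7F to EBCDIC, transcribed from the table quoted above. */
static const unsigned char ascii_to_ebcdic[128] = {
    /* 0x */ 0x00,0x01,0x02,0x03,0x37,0x2D,0x2E,0x2F,
             0x16,0x05,0x25,0x0B,0x0C,0x0D,0x0E,0x0F,
    /* 1x */ 0x10,0x11,0x12,0x13,0x3C,0x3D,0x32,0x26,
             0x18,0x19,0x3F,0x27,0x1C,0x1D,0x1E,0x1F,
    /* 2x */ 0x40,0x5A,0x7F,0x7B,0x5B,0x6C,0x50,0x7D,
             0x4D,0x5D,0x5C,0x4E,0x6B,0x60,0x4B,0x61,
    /* 3x */ 0xF0,0xF1,0xF2,0xF3,0xF4,0xF5,0xF6,0xF7,
             0xF8,0xF9,0x7A,0x5E,0x4C,0x7E,0x6E,0x6F,
    /* 4x */ 0x7C,0xC1,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,
             0xC8,0xC9,0xD1,0xD2,0xD3,0xD4,0xD5,0xD6,
    /* 5x */ 0xD7,0xD8,0xD9,0xE2,0xE3,0xE4,0xE5,0xE6,
             0xE7,0xE8,0xE9,0xAD,0xE0,0xBD,0x71,0x6D,
    /* 6x */ 0x79,0x81,0x82,0x83,0x84,0x85,0x86,0x87,
             0x88,0x89,0x91,0x92,0x93,0x94,0x95,0x96,
    /* 7x */ 0x97,0x98,0x99,0xA2,0xA3,0xA4,0xA5,0xA6,
             0xA7,0xA8,0xA9,0xC0,0x4F,0xD0,0x5F,0x07
};

/* Assumption: EBCDIC codes with no ASCII counterpart become '?'. */
#define NO_ASCII_EQUIVALENT 0x3F

static unsigned char ebcdic_to_ascii[256];

/* Build the reverse table by inverting the forward one. */
void build_ebcdic_to_ascii(void)
{
    size_t i;

    for (i = 0; i < 256; i++)
        ebcdic_to_ascii[i] = NO_ASCII_EQUIVALENT;
    for (i = 0; i < 128; i++)
        ebcdic_to_ascii[ascii_to_ebcdic[i]] = (unsigned char)i;
}
```

Call build_ebcdic_to_ascii() once at start-up.  With this fallback,
ASCII '?' coming back from the mainframe is ambiguous: it may have been
a real question mark (EBCDIC 0x6F) or any unmapped code.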

Warning, EBCDIC is bigger than ASCII.  If you turn this into a set of
translation tables, you have to decide what you are going to do with
EBCDIC chars which do not appear in the table (with no corresponding
char in ASCII).  IBM's normal response is to just map them all to one
printing char (forgotten which, something like $ or &).

The above mapping seems to do the right sort of things with most IBM
applications -- e.g. ASCII C programs moved to an IBM and translated
this way will probably compile with the IBM C compiler.  Ditto other
language sources.

For a real real definitive answer, you'll have to get friendly with
someone in IBM.  I know THEY are trying to come up with a definitive
ASCII<->EBCDIC mapping as part of their SAA project; to dig themselves
out of the hole caused by the fact that IBM/PCs speak ASCII, while IBM
mainframes speak EBCDIC.  God only knows what mapping they'll end up
with.

-- 
Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK
 P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132

lnds@sherlock.mmid.ualberta.ca (Mark Israel) (06/21/91)

In article <1991Jun20.160334.11278@gdr.bath.ac.uk>, P.Smee@bristol.ac.uk 
(Paul Smee) writes:

> There's a basic problem here, which is that there is (still) no such
> thing as EBCDIC....   For a real real definitive answer, you'll have to get
> friendly with someone in IBM.

   I extracted the following from local documentation.

				Mark Israel
I have heard the Wobble!	userisra@mts.ucs.ualberta.ca

------------------------------------------------------------------------------

   In February 1987, a new eight-bit ISO character set standard,
8859/1, was ratified.  Also in 1987, IBM published an EBCDIC 
standard called Code Page 37, based on the ISO 8859/1 standard. 
Both of these standards contain identical character graphics.  The
ISO 8859/1 character set contains such EBCDIC characters as the
logical-not sign and the cent sign, and the new EBCDIC character
set contains the ISO tilde and circumflex, among other ASCII
characters.  The ISO 8859/1 standard is supported by many large
computer manufacturers, including DEC and IBM.  

   As we deal more and more with other machines using ISO-based
rather than EBCDIC-based character coding schemes, it becomes
imperative that we be able to move data from one machine to
another and back again without loss of information.  The mapping 
that results from using the ISO 8859/1 standard and the IBM
Code Page 37 EBCDIC will allow us to move information back and
forth between ISO- and EBCDIC-based machines with none of the
problems we have had in the past. 

    EBCDIC     ASCII     GRAPHIC  DESCRIPTION 
--------------------------------------------------------------------- 
    X'00'      X'00'       NUL   null 
    X'01'      X'01'       SOH   start of heading           (Ctrl-A) 
    X'02'      X'02'       STX   start of text              (Ctrl-B) 
    X'03'      X'03'       ETX   end of text                (Ctrl-C) 
    X'04'      X'9C'     ?        ... 
    X'05'      X'09'       HT    horizontal tabulation      (Ctrl-I) 
    X'06'      X'86'     ?        ... 
    X'07'      X'7F'       DEL   delete (rubout, DEL control char) 
    X'08'      X'97'     ?        ... 
    X'09'      X'8D'     ?        ... 
    X'0A'      X'8E'     ?        ... 
    X'0B'      X'0B'       VT    vertical tabulation        (Ctrl-K) 
    X'0C'      X'0C'       FF    form feed                  (Ctrl-L) 
    X'0D'      X'0D'       CR    carriage return            (Ctrl-M) 
    X'0E'      X'0E'       SO    shift-out                  (Ctrl-N) 
    X'0F'      X'0F'       SI    shift-in                   (Ctrl-O) 
    X'10'      X'10'       DLE   data link escape           (Ctrl-P) 
    X'11'      X'11'       DC1   device control 1    (X-Off, Ctrl-Q) 
    X'12'      X'12'       DC2   device control 2           (Ctrl-R) 
    X'13'      X'13'       DC3   device control 3     (X-On, Ctrl-S) 
    X'14'      X'9D'     ?        ... 
    X'15'      X'85'     ?        ... 
    X'16'      X'08'       BS    backspace                  (Ctrl-H) 
    X'17'      X'87'     ?        ... 
    X'18'      X'18'       CAN   cancel                     (Ctrl-X) 
    X'19'      X'19'       EM    end of medium              (Ctrl-Y) 
    X'1A'      X'92'     ?        ... 
    X'1B'      X'8F'     ?        ... 
    X'1C'      X'1C'       FS    file separator 
    X'1D'      X'1D'       GS    group separator 
    X'1E'      X'1E'       RS    record separator 
    X'1F'      X'1F'       US    unit separator 
    X'20'      X'80'     ?        ... 
    X'21'      X'81'     ?        ... 
    X'22'      X'82'     ?        ... 
    X'23'      X'83'     ?        ... 
    X'24'      X'84'     ?        ... 
    X'25'      X'0A'       LF    line feed                  (Ctrl-J) 
    X'26'      X'17'       ETB   end of transmission block  (Ctrl-W) 
    X'27'      X'1B'       ESC   escape                     (Escape) 
    X'28'      X'88'     ?        ... 
    X'29'      X'89'     ?        ... 
    X'2A'      X'8A'     ?        ... 
    X'2B'      X'8B'     ?        ... 
    X'2C'      X'8C'     ?        ... 
    X'2D'      X'05'       ENQ   enquiry                    (Ctrl-E) 
    X'2E'      X'06'       ACK   acknowledge                (Ctrl-F) 
    X'2F'      X'07'       BEL   bell                       (Ctrl-G) 
    X'30'      X'90'     ?        ... 
    X'31'      X'91'     ?        ... 
    X'32'      X'16'       SYN   synchronous idle           (Ctrl-V) 
    X'33'      X'93'     ?        ... 
    X'34'      X'94'     ?        ... 
    X'35'      X'95'     ?        ... 
    X'36'      X'96'     ?        ... 
    X'37'      X'04'       EOT   end of transmission        (Ctrl-D) 
    X'38'      X'98'     ?        ... 
    X'39'      X'99'     ?        ... 
    X'3A'      X'9A'     ?        ... 
    X'3B'      X'9B'     ?        ... 
    X'3C'      X'14'       DC4   device control 4           (Ctrl-T) 
    X'3D'      X'15'       NAK   negative acknowledge       (Ctrl-U) 
    X'3E'      X'9E'     ?        ... 
    X'3F'      X'1A'       SUB   substitute character       (Ctrl-Z) 
    X'40'      X'20'              space (blank) 
    X'41'      X'A0'     ?        no-break space 
    X'42'      X'E2'     ?        small a with circumflex accent 
    X'43'      X'E4'     ?        small a with diaeresis 
    X'44'      X'E0'     ?        small a with grave accent 
    X'45'      X'E1'     ?        small a with acute accent 
    X'46'      X'E3'     ?        small a with tilde 
    X'47'      X'E5'     ?        small a with ring above 
    X'48'      X'E7'     ?        small c with cedilla 
    X'49'      X'F1'     ?        small n with tilde 
    X'4A'      X'A2'     ?        cent sign 
    X'4B'      X'2E'     .        period, full stop 
    X'4C'      X'3C'     <        less-than sign 
    X'4D'      X'28'     (        left parenthesis 
    X'4E'      X'2B'     +        plus sign 
    X'4F'      X'7C'     |        vertical line (bar, "or" sign) 
    X'50'      X'26'     &        ampersand (and sign) 
    X'51'      X'E9'     ?        small e with acute accent 
    X'52'      X'EA'     ?        small e with circumflex accent 
    X'53'      X'EB'     ?        small e with diaeresis 
    X'54'      X'E8'     ?        small e with grave accent 
    X'55'      X'ED'     ?        small i with acute accent 
    X'56'      X'EE'     ?        small i with circumflex accent 
    X'57'      X'EF'     ?        small i with diaeresis 
    X'58'      X'EC'     ?        small i with grave accent 
    X'59'      X'DF'     ?        small sharp s, German 
    X'5A'      X'21'     !        exclamation mark 
    X'5B'      X'24'     $        dollar sign 
    X'5C'      X'2A'     *        asterisk (star) 
    X'5D'      X'29'     )        right parenthesis 
    X'5E'      X'3B'     ;        semicolon 
    X'5F'      X'AC'     ?        not sign 
    X'60'      X'2D'     -        minus sign or hyphen 
    X'61'      X'2F'     /        solidus (slash) 
    X'62'      X'C2'     ?        capital A with circumflex accent 
    X'63'      X'C4'     ?        capital A with diaeresis 
    X'64'      X'C0'     ?        capital A with grave accent 
    X'65'      X'C1'     ?        capital A with acute accent 
    X'66'      X'C3'     ?        capital A with tilde 
    X'67'      X'C5'     ?        capital A with ring 
    X'68'      X'C7'     ?        capital C with cedilla 
    X'69'      X'D1'     ?        capital N with tilde 
    X'6A'      X'A6'     ?        broken bar 
    X'6B'      X'2C'     ,        comma 
    X'6C'      X'25'     %        percent sign 
    X'6D'      X'5F'     _        low line (underscore) 
    X'6E'      X'3E'     >        greater-than sign 
    X'6F'      X'3F'     ?        question mark 
    X'70'      X'F8'     ?        small o with slash 
    X'71'      X'C9'     ?        capital E with acute accent 
    X'72'      X'CA'     ?        capital E with circumflex accent 
    X'73'      X'CB'     ?        capital E with diaeresis 
    X'74'      X'C8'     ?        capital E with grave accent 
    X'75'      X'CD'     ?        capital I with acute accent 
    X'76'      X'CE'     ?        capital I with circumflex accent 
    X'77'      X'CF'     ?        capital I with diaeresis 
    X'78'      X'CC'     ?        capital I with grave accent 
    X'79'      X'60'     `        grave accent 
    X'7A'      X'3A'     :        colon 
    X'7B'      X'23'     #        number sign (hash mark, sharp sign) 
    X'7C'      X'40'     @        commercial at 
    X'7D'      X'27'     '        apostrophe (single quote) 
    X'7E'      X'3D'     =        equals sign 
    X'7F'      X'22'     "        quotation mark (double quote) 
    X'80'      X'D8'     ?        capital O with slash 
    X'81'      X'61'     a        small a 
    X'82'      X'62'     b        small b 
    X'83'      X'63'     c        small c 
    X'84'      X'64'     d        small d 
    X'85'      X'65'     e        small e 
    X'86'      X'66'     f        small f 
    X'87'      X'67'     g        small g 
    X'88'      X'68'     h        small h 
    X'89'      X'69'     i        small i 
    X'8A'      X'AB'     ?        angle quotation mark left (<< mark) 
    X'8B'      X'BB'     ?        angle quotation mark right (>> mark) 
    X'8C'      X'F0'     ?        small eth, Icelandic 
    X'8D'      X'FD'     ?        small y with acute accent 
    X'8E'      X'DE'     ?        small thorn, Icelandic 
    X'8F'      X'B1'     ?        plus or minus sign 
    X'90'      X'B0'     ?        degree sign 
    X'91'      X'6A'     j        small j 
    X'92'      X'6B'     k        small k 
    X'93'      X'6C'     l        small l 
    X'94'      X'6D'     m        small m 
    X'95'      X'6E'     n        small n 
    X'96'      X'6F'     o        small o 
    X'97'      X'70'     p        small p 
    X'98'      X'71'     q        small q 
    X'99'      X'72'     r        small r 
    X'9A'      X'AA'     ?        ordinal indicator feminine 
    X'9B'      X'BA'     ?        ordinal indicator, masculine 
    X'9C'      X'E6'     ?        small ae diphthong 
    X'9D'      X'B8'     ?        cedilla 
    X'9E'      X'C6'     ?        capital AE diphthong 
    X'9F'      X'A4'     ?        currency sign (lozenge) 
    X'A0'      X'B5'     ?        micro sign (small mu) 
    X'A1'      X'7E'     ~        tilde (wavy line) 
    X'A2'      X'73'     s        small s 
    X'A3'      X'74'     t        small t 
    X'A4'      X'75'     u        small u 
    X'A5'      X'76'     v        small v 
    X'A6'      X'77'     w        small w 
    X'A7'      X'78'     x        small x 
    X'A8'      X'79'     y        small y 
    X'A9'      X'7A'     z        small z 
    X'AA'      X'A1'     ?        inverted exclamation mark 
    X'AB'      X'BF'     ?        inverted question mark 
    X'AC'      X'D0'     ?        capital D with stroke, Icelandic eth 
    X'AD'      X'DD'     ?        capital Y with acute accent 
    X'AE'      X'FE'     ?        capital thorn, Icelandic 
    X'AF'      X'AE'     ?        registered sign (circled capital R) 
    X'B0'      X'5E'     ^        circumflex accent 
    X'B1'      X'A3'     ?        pound sign (Sterling currency) 
    X'B2'      X'A5'     ?        yen sign (Nipponese currency) 
    X'B3'      X'B7'     ?        middle dot (scalar product) 
    X'B4'      X'A9'     ?        copyright sign (circled capital C) 
    X'B5'      X'A7'     ?        section sign (S-half-above-S sign) 
    X'B6'      X'B6'     ?        pilcrow (paragraph, double-barred P) 
    X'B7'      X'BC'     ?        fraction one-quarter (1/4) 
    X'B8'      X'BD'     ?        fraction one-half (1/2) 
    X'B9'      X'BE'     ?        fraction three-quarters (3/4) 
    X'BA'      X'5B'     [        left square bracket 
    X'BB'      X'5D'     ]        right square bracket 
    X'BC'      X'AF'     ?        macron 
    X'BD'      X'A8'     ?        diaeresis or umlaut 
    X'BE'      X'B4'     ?        acute accent 
    X'BF'      X'D7'     ?        multiply sign (vector product) 
    X'C0'      X'7B'     {        left curly bracket (left brace) 
    X'C1'      X'41'     A        capital A 
    X'C2'      X'42'     B        capital B 
    X'C3'      X'43'     C        capital C 
    X'C4'      X'44'     D        capital D 
    X'C5'      X'45'     E        capital E 
    X'C6'      X'46'     F        capital F 
    X'C7'      X'47'     G        capital G 
    X'C8'      X'48'     H        capital H 
    X'C9'      X'49'     I        capital I 
    X'CA'      X'AD'     ?        soft hyphen 
    X'CB'      X'F4'     ?        small o with circumflex accent 
    X'CC'      X'F6'     ?        small o with diaeresis 
    X'CD'      X'F2'     ?        small o with grave accent 
    X'CE'      X'F3'     ?        small o with acute accent 
    X'CF'      X'F5'     ?        small o with tilde 
    X'D0'      X'7D'     }        right curly bracket (right brace) 
    X'D1'      X'4A'     J        capital J 
    X'D2'      X'4B'     K        capital K 
    X'D3'      X'4C'     L        capital L 
    X'D4'      X'4D'     M        capital M 
    X'D5'      X'4E'     N        capital N 
    X'D6'      X'4F'     O        capital O 
    X'D7'      X'50'     P        capital P 
    X'D8'      X'51'     Q        capital Q 
    X'D9'      X'52'     R        capital R 
    X'DA'      X'B9'     ?        superscript one 
    X'DB'      X'FB'     ?        small u with circumflex accent 
    X'DC'      X'FC'     ?        small u with diaeresis 
    X'DD'      X'F9'     ?        small u with grave accent 
    X'DE'      X'FA'     ?        small u with acute accent 
    X'DF'      X'FF'     ?        small y diaeresis 
    X'E0'      X'5C'     \        reverse solidus (backslash) 
    X'E1'      X'F7'     ?        divide sign (dot over line over dot) 
    X'E2'      X'53'     S        capital S 
    X'E3'      X'54'     T        capital T 
    X'E4'      X'55'     U        capital U 
    X'E5'      X'56'     V        capital V 
    X'E6'      X'57'     W        capital W 
    X'E7'      X'58'     X        capital X 
    X'E8'      X'59'     Y        capital Y 
    X'E9'      X'5A'     Z        capital Z 
    X'EA'      X'B2'     ?        superscript two (squared) 
    X'EB'      X'D4'     ?        capital O with circumflex accent 
    X'EC'      X'D6'     ?        capital O with diaeresis 
    X'ED'      X'D2'     ?        capital O with grave accent 
    X'EE'      X'D3'     ?        capital O with acute accent 
    X'EF'      X'D5'     ?        capital O with tilde 
    X'F0'      X'30'     0        digit zero 
    X'F1'      X'31'     1        digit one 
    X'F2'      X'32'     2        digit two 
    X'F3'      X'33'     3        digit three 
    X'F4'      X'34'     4        digit four 
    X'F5'      X'35'     5        digit five 
    X'F6'      X'36'     6        digit six 
    X'F7'      X'37'     7        digit seven 
    X'F8'      X'38'     8        digit eight 
    X'F9'      X'39'     9        digit nine 
    X'FA'      X'B3'     ?        superscript three (cubed) 
    X'FB'      X'DB'     ?        capital U with circumflex accent 
    X'FC'      X'DC'     ?        capital U with diaeresis 
    X'FD'      X'D9'     ?        capital U with grave accent 
    X'FE'      X'DA'     ?        capital U with acute accent 
    X'FF'      X'9F'     ?        ...
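
Moving data "back again without loss of information" amounts to the
256-entry translate table being one-to-one.  If you type a table in from
a listing like the one above, a check along these lines can catch
transcription slips.  This is only a sketch, and the name
table_is_reversible is hypothetical.

```c
#include <string.h>

/* Return 1 if the 256-entry table maps every byte to a distinct byte
 * (a permutation, so an inverse table exists), 0 otherwise. */
int table_is_reversible(const unsigned char table[256])
{
    unsigned char seen[256];
    int i;

    memset(seen, 0, sizeof seen);
    for (i = 0; i < 256; i++) {
        if (seen[table[i]])
            return 0;          /* two inputs share one output */
        seen[table[i]] = 1;
    }
    return 1;
}
```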

gary@sci34hub.sci.com (Gary Heston) (06/21/91)

In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:

>I am involved in developing the ISO/RDA protocol between CDC 920
>and IBM 3090.  CDC 920 uses ASCII representation and the IBM
>uses EBCDIC. I am looking for C routines that convert ASCII strings
>into EBCDIC strings and vice-versa.  We need these DESPERATELY and
>asap. If you have some information re: these, please let me know
>asap. I would REALLY appreciate your help.

Well, let's see--this is a hard one, seeing as how it's a C programming
problem, of course it's posted to comp.lang.c, isn't it?

>Newsgroups: comp.protocols.ibm,comp.protocols.iso,comp.protocols.tcp-ip,comp.unix.questions

Oops, guessed wrong on that one...

Try building a 256 by 2 char array, initializing [x,0] with the EBCDIC
code set and [x,1] with the ANSI (it hasn't been ASCII for many years) 
code set, so that you can use the character as the offset to find out
what it converts to.

Typing in the initialization strings is left as an exercise to the
programmer. (You could do this with pointers, but that's a little
more complicated, and you said you were in a hurry.)

>Very many thanks in advance.

Anytime.

-- 
Gary Heston   System Mismanager and technoflunky   uunet!sci34hub!gary or
My opinions, not theirs.    SCI Systems, Inc.       gary@sci34hub.sci.com
I support drug testing. I believe every public official should be given a
shot of sodium pentathol and ask "Which laws have you broken this week?".

dag@fciva.FRANKCAP.COM (Daniel A. Graifer) (06/22/91)

In article <1991Jun20.115613.13073@mp.cs.niu.edu> rickert@mp.cs.niu.edu (Neil Rickert) writes:
>In article <1991Jun19.190752.28034@ssdc.honeywell.com> satya@ssdc.honeywell.com (Satya Prabhakar) writes:
>>uses EBCDIC. I am looking for C routines that convert ASCII strings
>>into EBCDIC strings and vice-versa.  We need these DESPERATELY and
>
>  I don't understand.  What is wrong with a simple loop replacing
>c with EBCDIC[c] for each char c in the string of unsigned chars.
>Something like:
>	while(*p) { *p = EBCDIC[*p]; p++;}
>
> You do need to initialize your table.  Pick up the tables used on the
>3090 and use those.  If you don't like that choice build your own in
>the grand tradition of mutual inconsistency which currently exists in
>the ASCII <-> EBCDIC translation world.

I wrote a two-line C program to output chars 0-255, and
piped this through dd -ascii and od -c to create my
table. This way, the table is at least consistent with
what dd produces.  Note that the table produced this way
will not necessarily be invertible.  (But then neither
are the mainframe translate tables.  If I recall my
Burroughs B[67]000 days correctly (now Unisys A-Series),
the translation of ASCII "!" was not invertible.)  And
you may have a real problem with the >100 EBCDIC chars
that aren't meaningful.
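
A sketch of the kind of generator described above; this is a guess at
its shape, not the original program, and emit_all_bytes is a made-up
name.

```c
#include <stdio.h>

/* Write the 256 byte values 0x00..0xFF, in order, to the given stream.
 * Returns 0 on success, -1 on a write error. */
int emit_all_bytes(FILE *out)
{
    int c;

    for (c = 0; c < 256; c++)
        if (putc(c, out) == EOF)
            return -1;
    return 0;
}
```

Wrap it in a main() that calls emit_all_bytes(stdout) and pipe the
output through dd and od.  On POSIX systems the dd operand is spelled
conv=ebcdic for ASCII to EBCDIC and conv=ascii for the reverse, so a
pipeline such as "./allbytes | dd conv=ebcdic | od -c" (with allbytes
being whatever you name the binary) dumps the table dd uses.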

Dan
-- 
Daniel A. Graifer			Coastal Capital Funding Corp.
Sr. Vice President, Financial Systems	7900 Westpark Dr. Suite A-130
(703)821-3244				McLean, VA  22102
uunet!fciva!dag				fciva.FRANKCAP.COM!dag@uunet.uu.net

gwilliam@SH.CS.NET (George Williams) (06/24/91)

[ Disclaimer: Views and opinions, expressed or implied my own ]

I agree with that which you state below. There is no question it's best to
go with a 'network-neutral' code set between disparate systems (such
as those in question). There are a couple of points of note, however:

() The mechanics of the operation are straightforward.

   Declare a static array of 256 ( size of EBCDIC code set ) bytes.
   Then use the current character as an offset to this array to obtain
   the translated value. This is a simple, quick, and widely used technique.

   The same table can be used in both directions as long as array initialization
   is consistent across your compute environment, i.e. all nodes use the
   same ( agreed on) 'network data' character sets. 


() There is the problem, however (and not just with 8859/1), when character
   defaults, "?", are used for non-display or print characters. If the original
   data is retrieved from the target system it is no longer intact, at
   least for presentation purposes. One way I have gotten around this problem
   in the past was to encode these characters as transparent entries in 
   any table used. In other words the entries for these characters match
   the array index.

There are no doubt more clever and eloquent methods, but this proved very
flexible. 
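
A minimal sketch of the transparent-entry idea, with hypothetical helper
names (table_init_transparent, table_set): start every entry equal to
its own index, then override only the characters for which a translation
has been agreed.

```c
/* Start from a "transparent" table: every byte maps to itself. */
void table_init_transparent(unsigned char table[256])
{
    int i;

    for (i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
}

/* Override one entry with an agreed translation. */
void table_set(unsigned char table[256],
               unsigned char from, unsigned char to)
{
    table[from] = to;
}
```

One caveat with this approach: a transparent entry can collide with the
target of an override (after table_set(t, 0x41, 0xC1), both 0x41 and
0xC1 map to 0xC1), so the resulting table is not automatically
one-to-one.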

BTW:

What happened to SCS data ( codes '00'-'3f', the SNA character string for
the old LU type 1 ) with this..is that what Code Page 37 addresses ?

 George Williams


    Date:    21 Jun 91 09:11:19 GMT
    From:    Mark Israel <aunro!alberta!sherlock.mmid.ualberta.ca!lnds@lll-winken.llnl.gov>
    Subject: Re: SOS: C Routines for ASCII to EBCDIC Conversion and Vice-versa

    In article <1991Jun20.160334.11278@gdr.bath.ac.uk>, P.Smee@bristol.ac.uk 
    (Paul Smee) writes:

    > There's a basic problem here, which is that there is (still) no such
    > thing as EBCDIC....   For a real real definitive answer, you'll have to get
    > friendly with someone in IBM.

       I extracted the following from local documentation.

    				Mark Israel
    I have heard the Wobble!	userisra@mts.ucs.ualberta.ca

    ------------------------------------------------------------------------------

       In February 1987, a new eight-bit ISO character set standard,
    8859/1, was ratified.  Also in 1987, IBM published an EBCDIC 
    standard called Code Page 37, based on the ISO 8859/1 standard. 
    Both of these standards contain identical character graphics.  The
    ISO 8859/1 character set contains such EBCDIC characters as the
    logical-not sign and the cent sign, and the new EBCDIC character
    set contains the ISO tilde and circumflex, among other ASCII
    characters.  The ISO 8859/1 standard is supported by many large
    computer manufacturers, including DEC and IBM.  

       As we deal more and more with other machines using ISO-based
    rather than EBCDIC-based character coding schemes, it becomes
    imperative that we be able to move data from one machine to
    another and back again without loss of information.  The mapping 
    that results from using the ISO 8859/1 standard and the IBM
    Code Page 37 EBCDIC will allow us to move information back and
    forth between ISO- and EBCDIC-based machines with none of the
    problems we have had in the past. 

        EBCDIC     ASCII     GRAPHIC  DESCRIPTION 
    --------------------------------------------------------------------- 
        X'00'      X'00'       NUL   null 
        X'01'      X'01'       SOH   start of heading           (Ctrl-A) 
        X'02'      X'02'       STX   start of text              (Ctrl-B) 
        X'03'      X'03'       ETX   end of text                (Ctrl-C) 
        X'04'      X'9C'     ?        ... 
        X'05'      X'09'       HT    horizontal tabulation      (Ctrl-I) 
        X'06'      X'86'     ?        ... 
        X'07'      X'7F'       DEL   delete (rubout, DEL control char) 
        X'08'      X'97'     ?        ... 
        X'09'      X'8D'     ?        ... 
        X'0A'      X'8E'     ?        ... 
        X'0B'      X'0B'       VT    vertical tabulation        (Ctrl-K) 
        X'0C'      X'0C'       FF    form feed                  (Ctrl-L) 
        X'0D'      X'0D'       CR    carriage return            (Ctrl-M) 
        X'0E'      X'0E'       SO    shift-out                  (Ctrl-N) 
        X'0F'      X'0F'       SI    shift-in                   (Ctrl-O) 
        X'10'      X'10'       DLE   data link escape           (Ctrl-P) 
        X'11'      X'11'       DC1   device control 1    (X-Off, Ctrl-Q) 
        X'12'      X'12'       DC2   device control 2           (Ctrl-R) 
        X'13'      X'13'       DC3   device control 3     (X-On, Ctrl-S) 
        X'14'      X'9D'     ?        ... 
        X'15'      X'85'     ?        ... 
        X'16'      X'08'       BS    backspace                  (Ctrl-H) 
        X'17'      X'87'     ?        ... 
        X'18'      X'18'       CAN   cancel                     (Ctrl-X) 
        X'19'      X'19'       EM    end of medium              (Ctrl-Y) 
        X'1A'      X'92'     ?        ... 
        X'1B'      X'8F'     ?        ... 
        X'1C'      X'1C'       FS    file separator 
        X'1D'      X'1D'       GS    group separator 
        X'1E'      X'1E'       RS    record separator 
        X'1F'      X'1F'       US    unit separator 
        X'20'      X'80'     ?        ... 
        X'21'      X'81'     ?        ... 
        X'22'      X'82'     ?        ... 
        X'23'      X'83'     ?        ... 
        X'24'      X'84'     ?        ... 
        X'25'      X'0A'       LF    line feed                  (Ctrl-J) 
        X'26'      X'17'       ETB   end of transmission block  (Ctrl-W) 
        X'27'      X'1B'       ESC   escape                     (Escape) 
        X'28'      X'88'     ?        ... 
        X'29'      X'89'     ?        ... 
        X'2A'      X'8A'     ?        ... 
        X'2B'      X'8B'     ?        ... 
        X'2C'      X'8C'     ?        ... 
        X'2D'      X'05'       ENQ   enquiry                    (Ctrl-E) 
        X'2E'      X'06'       ACK   acknowledge                (Ctrl-F) 
        X'2F'      X'07'       BEL   bell                       (Ctrl-G) 
        X'30'      X'90'     ?        ... 
        X'31'      X'91'     ?        ... 
        X'32'      X'16'       SYN   synchronous idle           (Ctrl-V) 
        X'33'      X'93'     ?        ... 
        X'34'      X'94'     ?        ... 
        X'35'      X'95'     ?        ... 
        X'36'      X'96'     ?        ... 
        X'37'      X'04'       EOT   end of transmission        (Ctrl-D) 
        X'38'      X'98'     ?        ... 
        X'39'      X'99'     ?        ... 
        X'3A'      X'9A'     ?        ... 
        X'3B'      X'9B'     ?        ... 
        X'3C'      X'14'       DC4   device control 4           (Ctrl-T) 
        X'3D'      X'15'       NAK   negative acknowledge       (Ctrl-U) 
        X'3E'      X'9E'     ?        ... 
        X'3F'      X'1A'       SUB   substitute character       (Ctrl-Z) 
        X'40'      X'20'              space (blank) 
        X'41'      X'A0'     ?        no-break space 
        X'42'      X'E2'     ?        small a with circumflex accent 
        X'43'      X'E4'     ?        small a with diaeresis 
        X'44'      X'E0'     ?        small a with grave accent 
        X'45'      X'E1'     ?        small a with acute accent 
        X'46'      X'E3'     ?        small a with tilde 
        X'47'      X'E5'     ?        small a with ring above 
        X'48'      X'E7'     ?        small c with cedilla 
        X'49'      X'F1'     ?        small n with tilde 
        X'4A'      X'A2'     ?        cent sign 
        X'4B'      X'2E'     .        period, full stop 
        X'4C'      X'3C'     <        less-than sign 
        X'4D'      X'28'     (        left parenthesis 
        X'4E'      X'2B'     +        plus sign 
        X'4F'      X'7C'     |        vertical line (bar, "or" sign) 
        X'50'      X'26'     &        ampersand (and sign) 
        X'51'      X'E9'     ?        small e with acute accent 
        X'52'      X'EA'     ?        small e with circumflex accent 
        X'53'      X'EB'     ?        small e with diaeresis 
        X'54'      X'E8'     ?        small e with grave accent 
        X'55'      X'ED'     ?        small i with acute accent 
        X'56'      X'EE'     ?        small i with circumflex accent 
        X'57'      X'EF'     ?        small i with diaeresis 
        X'58'      X'EC'     ?        small i with grave accent 
        X'59'      X'DF'     ?        small sharp s, German 
        X'5A'      X'21'     !        exclamation mark 
        X'5B'      X'24'     $        dollar sign 
        X'5C'      X'2A'     *        asterisk (star) 
        X'5D'      X'29'     )        right parenthesis 
        X'5E'      X'3B'     ;        semicolon 
        X'5F'      X'AC'     ?        not sign 
        X'60'      X'2D'     -        minus sign or hyphen 
        X'61'      X'2F'     /        solidus (slash) 
        X'62'      X'C2'     ?        capital A with circumflex accent 
        X'63'      X'C4'     ?        capital A with diaeresis 
        X'64'      X'C0'     ?        capital A with grave accent 
        X'65'      X'C1'     ?        capital A with acute accent 
        X'66'      X'C3'     ?        capital A with tilde 
        X'67'      X'C5'     ?        capital A with ring above 
        X'68'      X'C7'     ?        capital C with cedilla 
        X'69'      X'D1'     ?        capital N with tilde 
        X'6A'      X'A6'     ?        broken bar 
        X'6B'      X'2C'     ,        comma 
        X'6C'      X'25'     %        percent sign 
        X'6D'      X'5F'     _        low line (underscore) 
        X'6E'      X'3E'     >        greater-than sign 
        X'6F'      X'3F'     ?        question mark 
        X'70'      X'F8'     ?        small o with slash 
        X'71'      X'C9'     ?        capital E with acute accent 
        X'72'      X'CA'     ?        capital E with circumflex accent 
        X'73'      X'CB'     ?        capital E with diaeresis 
        X'74'      X'C8'     ?        capital E with grave accent 
        X'75'      X'CD'     ?        capital I with acute accent 
        X'76'      X'CE'     ?        capital I with circumflex accent 
        X'77'      X'CF'     ?        capital I with diaeresis 
        X'78'      X'CC'     ?        capital I with grave accent 
        X'79'      X'60'     `        grave accent 
        X'7A'      X'3A'     :        colon 
        X'7B'      X'23'     #        number sign (hash mark, sharp sign) 
        X'7C'      X'40'     @        commercial at 
        X'7D'      X'27'     '        apostrophe (single quote) 
        X'7E'      X'3D'     =        equals sign 
        X'7F'      X'22'     "        quotation mark (double quote) 
        X'80'      X'D8'     ?        capital O with slash 
        X'81'      X'61'     a        small a 
        X'82'      X'62'     b        small b 
        X'83'      X'63'     c        small c 
        X'84'      X'64'     d        small d 
        X'85'      X'65'     e        small e 
        X'86'      X'66'     f        small f 
        X'87'      X'67'     g        small g 
        X'88'      X'68'     h        small h 
        X'89'      X'69'     i        small i 
        X'8A'      X'AB'     ?        angle quotation mark left (<< mark) 
        X'8B'      X'BB'     ?        angle quotation mark right (>> mark) 
        X'8C'      X'F0'     ?        small eth, Icelandic 
        X'8D'      X'FD'     ?        small y with acute accent 
        X'8E'      X'FE'     ?        small thorn, Icelandic 
        X'8F'      X'B1'     ?        plus or minus sign 
        X'90'      X'B0'     ?        degree sign 
        X'91'      X'6A'     j        small j 
        X'92'      X'6B'     k        small k 
        X'93'      X'6C'     l        small l 
        X'94'      X'6D'     m        small m 
        X'95'      X'6E'     n        small n 
        X'96'      X'6F'     o        small o 
        X'97'      X'70'     p        small p 
        X'98'      X'71'     q        small q 
        X'99'      X'72'     r        small r 
        X'9A'      X'AA'     ?        ordinal indicator, feminine 
        X'9B'      X'BA'     ?        ordinal indicator, masculine 
        X'9C'      X'E6'     ?        small ae diphthong 
        X'9D'      X'B8'     ?        cedilla 
        X'9E'      X'C6'     ?        capital AE diphthong 
        X'9F'      X'A4'     ?        currency sign (lozenge) 
        X'A0'      X'B5'     ?        micro sign (small mu) 
        X'A1'      X'7E'     ~        tilde (wavy line) 
        X'A2'      X'73'     s        small s 
        X'A3'      X'74'     t        small t 
        X'A4'      X'75'     u        small u 
        X'A5'      X'76'     v        small v 
        X'A6'      X'77'     w        small w 
        X'A7'      X'78'     x        small x 
        X'A8'      X'79'     y        small y 
        X'A9'      X'7A'     z        small z 
        X'AA'      X'A1'     ?        inverted exclamation mark 
        X'AB'      X'BF'     ?        inverted question mark 
        X'AC'      X'D0'     ?        capital D with stroke, Icelandic eth 
        X'AD'      X'DD'     ?        capital Y with acute accent 
        X'AE'      X'DE'     ?        capital thorn, Icelandic 
        X'AF'      X'AE'     ?        registered sign (circled capital R) 
        X'B0'      X'5E'     ^        circumflex accent 
        X'B1'      X'A3'     ?        pound sign (Sterling currency) 
        X'B2'      X'A5'     ?        yen sign (Nipponese currency) 
        X'B3'      X'B7'     ?        middle dot (scalar product) 
        X'B4'      X'A9'     ?        copyright sign (circled capital C) 
        X'B5'      X'A7'     ?        section sign (S-half-above-S sign) 
        X'B6'      X'B6'     ?        pilcrow (paragraph, double-barred P) 
        X'B7'      X'BC'     ?        fraction one-quarter (1/4) 
        X'B8'      X'BD'     ?        fraction one-half (1/2) 
        X'B9'      X'BE'     ?        fraction three-quarters (3/4) 
        X'BA'      X'5B'     [        left square bracket 
        X'BB'      X'5D'     ]        right square bracket 
        X'BC'      X'AF'     ?        macron 
        X'BD'      X'A8'     ?        diaeresis or umlaut 
        X'BE'      X'B4'     ?        acute accent 
        X'BF'      X'D7'     ?        multiply sign (vector product) 
        X'C0'      X'7B'     {        left curly bracket (left brace) 
        X'C1'      X'41'     A        capital A 
        X'C2'      X'42'     B        capital B 
        X'C3'      X'43'     C        capital C 
        X'C4'      X'44'     D        capital D 
        X'C5'      X'45'     E        capital E 
        X'C6'      X'46'     F        capital F 
        X'C7'      X'47'     G        capital G 
        X'C8'      X'48'     H        capital H 
        X'C9'      X'49'     I        capital I 
        X'CA'      X'AD'     ?        soft hyphen 
        X'CB'      X'F4'     ?        small o with circumflex accent 
        X'CC'      X'F6'     ?        small o with diaeresis 
        X'CD'      X'F2'     ?        small o with grave accent 
        X'CE'      X'F3'     ?        small o with acute accent 
        X'CF'      X'F5'     ?        small o with tilde 
        X'D0'      X'7D'     }        right curly bracket (right brace) 
        X'D1'      X'4A'     J        capital J 
        X'D2'      X'4B'     K        capital K 
        X'D3'      X'4C'     L        capital L 
        X'D4'      X'4D'     M        capital M 
        X'D5'      X'4E'     N        capital N 
        X'D6'      X'4F'     O        capital O 
        X'D7'      X'50'     P        capital P 
        X'D8'      X'51'     Q        capital Q 
        X'D9'      X'52'     R        capital R 
        X'DA'      X'B9'     ?        superscript one 
        X'DB'      X'FB'     ?        small u with circumflex accent 
        X'DC'      X'FC'     ?        small u with diaeresis 
        X'DD'      X'F9'     ?        small u with grave accent 
        X'DE'      X'FA'     ?        small u with acute accent 
        X'DF'      X'FF'     ?        small y with diaeresis 
        X'E0'      X'5C'     \        reverse solidus (backslash) 
        X'E1'      X'F7'     ?        divide sign (dot over line over dot) 
        X'E2'      X'53'     S        capital S 
        X'E3'      X'54'     T        capital T 
        X'E4'      X'55'     U        capital U 
        X'E5'      X'56'     V        capital V 
        X'E6'      X'57'     W        capital W 
        X'E7'      X'58'     X        capital X 
        X'E8'      X'59'     Y        capital Y 
        X'E9'      X'5A'     Z        capital Z 
        X'EA'      X'B2'     ?        superscript two (squared) 
        X'EB'      X'D4'     ?        capital O with circumflex accent 
        X'EC'      X'D6'     ?        capital O with diaeresis 
        X'ED'      X'D2'     ?        capital O with grave accent 
        X'EE'      X'D3'     ?        capital O with acute accent 
        X'EF'      X'D5'     ?        capital O with tilde 
        X'F0'      X'30'     0        digit zero 
        X'F1'      X'31'     1        digit one 
        X'F2'      X'32'     2        digit two 
        X'F3'      X'33'     3        digit three 
        X'F4'      X'34'     4        digit four 
        X'F5'      X'35'     5        digit five 
        X'F6'      X'36'     6        digit six 
        X'F7'      X'37'     7        digit seven 
        X'F8'      X'38'     8        digit eight 
        X'F9'      X'39'     9        digit nine 
        X'FA'      X'B3'     ?        superscript three (cubed) 
        X'FB'      X'DB'     ?        capital U with circumflex accent 
        X'FC'      X'DC'     ?        capital U with diaeresis 
        X'FD'      X'D9'     ?        capital U with grave accent 
        X'FE'      X'DA'     ?        capital U with acute accent 
        X'FF'      X'9F'     ?        ...
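
For anyone who wants to turn the table above into C, here is a rough
sketch of the lookup-table approach suggested earlier in the thread.
It is only an illustration, not a tested production routine. The names
(tables_init, translate, ebcdic_to_iso, iso_to_ebcdic, map_run) are
made up for this example, and the table is deliberately partial: it
fills in only space, a few punctuation marks, letters and digits, using
values from the rows above. Every other code point falls back to SUB,
which is an assumption of the sketch; a real routine would enter all
256 rows. The reverse table is derived by inverting the forward one, so
the two directions cannot disagree with each other.

```c
#include <assert.h>
#include <stddef.h>

#define ISO_SUB    0x1A  /* the table maps EBCDIC X'3F' to ISO X'1A' (SUB) */
#define EBCDIC_SUB 0x3F

static unsigned char e2a[256];  /* EBCDIC -> ISO 8859-1 */
static unsigned char a2e[256];  /* ISO 8859-1 -> EBCDIC, built by inversion */

/* Map n consecutive EBCDIC codes onto n consecutive ISO codes. */
static void map_run(unsigned char ebcdic_first, unsigned char iso_first, int n)
{
    for (int i = 0; i < n; i++)
        e2a[ebcdic_first + i] = (unsigned char)(iso_first + i);
}

/* Fill both tables. Only a subset of the rows above is entered here;
 * everything else defaults to SUB (an assumption of this sketch). */
void tables_init(void)
{
    for (int i = 0; i < 256; i++) {
        e2a[i] = ISO_SUB;
        a2e[i] = EBCDIC_SUB;
    }

    e2a[0x40] = 0x20;        /* space */
    e2a[0x4B] = 0x2E;        /* period */
    e2a[0x5A] = 0x21;        /* exclamation mark */
    e2a[0x6B] = 0x2C;        /* comma */

    map_run(0x81, 0x61, 9);  /* a-i */
    map_run(0x91, 0x6A, 9);  /* j-r */
    map_run(0xA2, 0x73, 8);  /* s-z */
    map_run(0xC1, 0x41, 9);  /* A-I */
    map_run(0xD1, 0x4A, 9);  /* J-R */
    map_run(0xE2, 0x53, 8);  /* S-Z */
    map_run(0xF0, 0x30, 10); /* 0-9 */

    /* Invert every entry that was explicitly mapped. */
    for (int i = 0; i < 256; i++)
        if (e2a[i] != ISO_SUB)
            a2e[e2a[i]] = (unsigned char)i;
}

/* Translate len bytes in place through a 256-entry table. */
void translate(unsigned char *buf, size_t len, const unsigned char table[256])
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}

void ebcdic_to_iso(unsigned char *buf, size_t len) { translate(buf, len, e2a); }
void iso_to_ebcdic(unsigned char *buf, size_t len) { translate(buf, len, a2e); }
```

Call tables_init() once at startup, then pass each buffer with its
length to ebcdic_to_iso() or iso_to_ebcdic(). The routines take an
explicit length rather than stopping at a zero byte, so binary records
with embedded NULs are not cut short.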

dandrews@bilver.uucp (Dave Andrews) (06/26/91)

In article <1991Jun20.160334.11278@gdr.bath.ac.uk> P.Smee@bristol.ac.uk (Paul Smee) writes:
>For a real real definitive answer, you'll have to get friendly with
>someone in IBM.  I know THEY are trying to come up with a definitive
>ASCII<->EBCDIC mapping as part of their SAA project; to dig themselves
>out of the hole caused by the fact that IBM/PCs speak ASCII, while IBM
>mainframes speak EBCDIC.  God only knows what mapping they'll end up
>with.

I believe the IBM effort is leading toward Unicode, a double-byte set.   
As I recall, a recent SHARE-wide survey indicated that SHARE members
were not happy with the Unicode idea (which will STILL be unusable in
parts of the world).

I don't think the ASCII/EBCDIC dichotomy will ever be resolved.

- Dave