[comp.unix.misc] EBCDIC <--> ASCII conversion

noren@dinl.uucp (Charles Noren) (10/03/90)

We are communicating between Sun 3 (with SunOS 4.0.3) and an
IBM Mainframe (don't know the model, we're not IBM jocks)
via TCP/IP.  Our question: is there any program in Netland
that converts back and forth between EBCDIC and ASCII
(preferably in C, but we will take any example)?

If you have some knowledge about the conversion process, such
as the bit/byte ordering of an IBM vs. a Sun, any comments
would be very helpful.

Thanks,


-- 
Chuck Noren
NET:     dinl!noren@ncar.ucar.edu
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone:   (303) 971-7930

jik@athena.mit.edu (Jonathan I. Kamens) (10/03/90)

In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes:
|> We are communicating between Sun 3 (with SunOS 4.0.3) and an
|> IBM Mainframe (don't know the model, we're not IBM jocks)
|> via TCP/IP.  Our question: is there any program in Netland
|> that converts back and forth between EBCDIC and ASCII
|> (preferably in C, but we will take any example)?

  The Unix program "dd" does this.  In particular, the "conv=ascii" option
converts EBCDIC to ASCII, and the "conv=ebcdic" option goes the other way.

  See the man page for more information.
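
  If the conversion has to happen inside your own program (say, on data
read from a socket) rather than on a file, the per-byte translation can be
sketched as a table lookup.  This is only an illustration of the idea, not
dd's source: the name translate_bytes is made up, and the caller is assumed
to supply a 256-entry table suited to the EBCDIC variant on the mainframe.

```c
#include <stddef.h>

/* Translate n bytes in place through a 256-entry lookup table.
 * translate_bytes is a hypothetical helper, not part of any library.
 * The buffer is unsigned char so every byte is a valid index 0..255. */
void translate_bytes(unsigned char *buf, size_t n,
                     const unsigned char table[256])
{
    size_t i;

    for (i = 0; i < n; i++)
        buf[i] = table[buf[i]];
}
```

  The same function serves both directions; only the table passed in
changes.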

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8495			      Home: 617-782-0710

zebr360@emx.utexas.edu (Jerry Heyman) (10/03/90)

In article <1756@dinl.mmc.UUCP> noren@dinl.UUCP (Charles Noren) writes:
>We are communicating between Sun 3 (with SunOS 4.0.3) and an
>IBM Mainframe (don't know the model, we're not IBM jocks)
>via TCP/IP.  Our question: is there any program in Netland
>that converts back and forth between EBCDIC and ASCII
>(preferably in C, but we will take any example)?
>

Why would you be looking for such a program?  I have an IBM RT on my desk
(and access to a RISC System/6000) which is ASCII based, and use 'ftp' from
our VM host all day.  I don't even think about doing a conversion of the
files as that is taken care of automatically.  If you're really paranoid
about it, then you can, as another poster suggested, use dd.

>Thanks,
>
> Chuck Noren

I think you're trying to solve a problem that doesn't exist.

jerry

luke@modus.sublink.ORG (Luciano Mannucci) (10/04/90)

In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes:
> We are communicating between Sun 3 (with SunOS 4.0.3) and an
> IBM Mainframe (don't know the model, we're not IBM jocks)
> via TCP/IP.  Our question: is there any program in Netland
> that converts back and forth between EBCDIC and ASCII
> (preferably in C, but we will take any example)?

Apologies for posting C code in the wrong newsgroup.

Here are two very simple functions that convert ASCII to EBCDIC and
vice versa; they have been working for many years in many programs:

--- cut --- cut --- cut --- cut --- cut --- cut --- cut --- cut ---
static char _tob[] = {
0x00,0x01,0x02,0x03,0x37,0x2d,0x2e,0x2f,0x16,0x05,0x25,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x3c,0x3d,0x32,0x26,0x18,0x19,0x3f,0x27,0x22,0x40,0x35,0x40,
0x40,0x5a,0x7f,0x7b,0x5b,0x6c,0x50,0x7d,0x4d,0x5d,0x5c,0x4e,0x6b,0x60,0x4b,0x61,
0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0x7a,0x5e,0x4c,0x7e,0x6e,0x6f,
0x7c,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,
0xd7,0xd8,0xd9,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0x4f,0xe1,0x5f,0x40,0x6d,
0x40,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x91,0x92,0x93,0x94,0x95,0x96,
0x97,0x98,0x99,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xc0,0x6a,0xd0,0xa1,0x07,
};
static char _toa[] = {
0x00,0x01,0x02,0x03,0x00,0x09,0x00,0x7f,0x00,0x00,0x00,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x00,0x0a,0x08,0x00,0x18,0x19,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x1c,0x00,0x00,0x0a,0x17,0x1b,0x00,0x00,0x00,0x00,0x00,0x05,0x06,0x07,
0x00,0x00,0x16,0x00,0x00,0x1e,0x00,0x04,0x00,0x00,0x00,0x00,0x14,0x15,0x00,0x1a,
0x20,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x2e,0x3c,0x28,0x2b,0x5b,
0x26,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x21,0x24,0x2a,0x29,0x3b,0x5d,
0x2d,0x2f,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x7c,0x2c,0x25,0x5f,0x3e,0x3f,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x3a,0x23,0x40,0x27,0x3d,0x22,
0x00,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,0x70,0x71,0x72,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x7e,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x7b,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x00,0x00,0x00,0x00,0x00,0x00,
0x7d,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,0x50,0x51,0x52,0x00,0x00,0x00,0x00,0x00,0x00,
0x5c,0x00,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0x00,0x00,0x00,0x00,0x00,0x00,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x00,0x00,0x00,0x00,0x00,0x00,
};

char btoa(c)
char c;
{
	if (c < 0) return -1;
	return _toa[c];
}

char atob(c)
char c;
{
	if (c < 0) return -1;
	if (c <= 0x7f)
		return _tob[c];
	else
		return 0;
}
/*
	" @(#)btoa.c	1.1 (lkdb) - 87/02/11 \n";
	*/
--- cut --- cut --- cut --- cut --- cut --- cut --- cut --- cut ---

luke.
-
-- 
  _ _           __             Via Aleardo Aleardi, 12 - 20154 Milano (Italy)
 | | | _  _|   (__             PHONE : +39 2 3315328 FAX: +39 2 3315778
 | | |(_)(_||_|___) Srl        E-MAIL: luke@modus.sublink.ORG
______________________________ Software & Services for Advertising & Marketing

bakke@plains.NoDak.edu (Jeffrey P. Bakke) (10/11/90)

In article <661@modus.sublink.ORG> luke@modus.sublink.ORG (Luciano Mannucci) writes:
> In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes:
> > We are communicating between Sun 3 (with SunOS 4.0.3) and an
> > IBM Mainframe (don't know the model, we're not IBM jocks)
> > via TCP/IP.  Our question: is there any program in Netland
> > that converts back and forth between EBCDIC and ASCII
> > (preferably in C, but we will take any example)?
> 
> Apologies for posting C code in the wrong newsgroup.
That's all right, it's always interesting to me.  Anyway, if you're on a Sun
system, an easier way would be to send the file over from the IBM to the
Sun, and then use the dd program.  This program copies files with optional
translation.  There are options to translate from EBCDIC to ASCII and vice
versa; you'd have to look through the man pages for the details.  But this
will probably do what you need.  The 'dd' program is part of the standard
SunOS installation tape, I believe.  It should be located in the /usr/bin
directory.

No need to write your own conversion program if the utilities already
exist.

-- 
Jeffrey P. Bakke               Internet: bakke@plains.NoDak.edu 
                      UUCP    : ...!uunet!plains!bakke
           BITNET  : bakke@plains.bitnet  

jc@atcmp.nl (Jan Christiaan van Winkel) (10/12/90)

From article <661@modus.sublink.ORG>, by luke@modus.sublink.ORG (Luciano Mannucci):
| In article <1756@dinl.mmc.UUCP>, noren@dinl.uucp (Charles Noren) writes:
|> via TCP/IP.  Our question: is there any program in Netland
|> that converts back and forth between EBCDIC and ASCII
| 
| static char _tob[] = {
| 0x00,0x01,0x02,0x03,0x37,0x2d,0x2e,0x2f,0x16,0x05,0x25,0x0b,0x0c,0x0d,0x0e,0x0f,
.
.
| 0x97,0x98,0x99,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xc0,0x6a,0xd0,0xa1,0x07,
| };

| static char _toa[] = {
| 0x00,0x01,0x02,0x03,0x00,0x09,0x00,0x7f,0x00,0x00,0x00,0x0b,0x0c,0x0d,0x0e,0x0f,
.
.
| 0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x00,0x00,0x00,0x00,0x00,0x00,
| };
| 
| char btoa(c)
| char c;
| {
| 	if (c < 0) return -1;
| 	return _toa[c];
| }
This is *THE* case where you need unsigned chars instead of plain chars.
The problem is that EBCDIC uses the full range from 0 to 255 for its chars.
For example, the EBCDIC code for '3' is 0xf3.  On a machine
that uses signed characters for plain chars, the number in c will be
interpreted as a negative number, fooling your test in btoa...
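
A minimal sketch of that change, assuming a C99 compiler (for the designated
initializers).  The names digit_toa, btoa_unsigned and btoa_from_char are
made up for this example, and the table holds only the digits 0xF0..0xF9,
taken from the _toa[] table quoted above, to keep it short; a real
converter would carry the full 256 entries.

```c
/* Sparse EBCDIC-to-ASCII table holding only the digits 0xF0..0xF9.
 * Entries not listed default to 0, as unmapped codes do in the posted
 * _toa[].  ASCII values are written in hex so the table does not depend
 * on the character set of the machine compiling it. */
static const unsigned char digit_toa[256] = {
    [0xF0] = 0x30, [0xF1] = 0x31, [0xF2] = 0x32, [0xF3] = 0x33,
    [0xF4] = 0x34, [0xF5] = 0x35, [0xF6] = 0x36, [0xF7] = 0x37,
    [0xF8] = 0x38, [0xF9] = 0x39
};

/* c is unsigned, so it is always a valid index in 0..255 and no
 * "c < 0" test is needed. */
unsigned char btoa_unsigned(unsigned char c)
{
    return digit_toa[c];
}

/* Wrapper for callers whose data sits in plain char buffers: the cast
 * turns a negative value such as -13 back into 0xF3 (243) before the
 * table lookup. */
unsigned char btoa_from_char(char c)
{
    return btoa_unsigned((unsigned char)c);
}
```
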
JC
-- 
___  __  ____________________________________________________________________
   |/  \   Jan Christiaan van Winkel      Tel: +31 80 566880  jc@atcmp.nl
   |       AT Computing   P.O. Box 1428   6501 BK Nijmegen    The Netherlands
__/ \__/ ____________________________________________________________________

johncore@compnect.UUCP (John Core ) (10/18/90)

Why write code to convert ASCII to EBCDIC?  It comes with
Unix.  It's called dd.


Wizard Systems              |    UUCP:   uunet!wa3wbu!compnect!johncore
P.O. Box 6269               |INTERNET:   johncore@compnect.wa3wbu
Harrisburg, Pa. 17112-6269  |a public bbs since 1978. Data(717)657-4992 & 4997
John Core, SYSOP            |-------------------------------------------------
----------------------------| No matter where you go, there you are!
a woman is just a woman, but a good cigar is a smoke.   -R. Kipling

meissner@osf.org (Michael Meissner) (10/25/90)

In article <831@compnect.UUCP> johncore@compnect.UUCP (John Core )
writes:

| Why write code to convert ASCII to EBCDIC?  It comes with
| Unix.  It's called dd.

However, unlike say ISO646 or ASCII, there is no one standard EBCDIC.
There are various EBCDICs which share a lot of characters in common,
but have some different translations.  Also, you have the fun in most
EBCDICs that there are two representations for '[' and ']': one
that your printer will print, and one that your terminal will display
correctly.
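
One way to cope with this in a table-driven converter is to start from a
base table and patch the code points that differ.  The sketch below is an
illustration only: the enum and function names are invented, and the
bracket positions are those commonly documented for IBM code pages 037
and 1047, which should be checked against whatever your host really uses.

```c
#include <string.h>

/* Hypothetical variant identifiers for this sketch. */
enum ebcdic_variant { EBCDIC_CP037, EBCDIC_CP1047 };

/* Copy a base EBCDIC-to-ASCII table into out, then set the entries for
 * '[' (ASCII 0x5B) and ']' (ASCII 0x5D) according to the chosen variant.
 * All other entries are left exactly as they are in base. */
void make_toa_table(unsigned char out[256], const unsigned char base[256],
                    enum ebcdic_variant v)
{
    memcpy(out, base, 256);

    switch (v) {
    case EBCDIC_CP037:
        out[0xBA] = 0x5B;   /* '[' */
        out[0xBB] = 0x5D;   /* ']' */
        break;
    case EBCDIC_CP1047:
        out[0xAD] = 0x5B;   /* '[' */
        out[0xBD] = 0x5D;   /* ']' */
        break;
    }
}
```
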
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

schafer@devils.rice.edu (Richard A. Schafer) (10/26/90)

In article <MEISSNER.90Oct24150859@osf.osf.org>, meissner@osf.org
(Michael Meissner) writes:
||> However, unlike say ISO646 or ASCII, there is no one standard EBCDIC.
To be fair, there is no *one* standard ASCII, either, if you consider
ASCII to include any of the several European versions of ASCII available
with ISO numbers which I don't remember off the top of my head.  That's
why ISO has been spending so much time over the past several years
working up new code point standards.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (10/26/90)

In article <1990Oct25.140442@devils.rice.edu>, schafer@devils.rice.edu (Richard A. Schafer) writes:
> In article <MEISSNER.90Oct24150859@osf.osf.org>, meissner@osf.org
> (Michael Meissner) writes:
> ||> However, unlike say ISO646 or ASCII, there is no one standard EBCDIC.
> To be fair, there is no *one* standard ASCII, either

There is one and only one ASCII (well, there was an old version, but
there has been only one for many years).  ASCII stands for
	*AMERICAN* Standard Code for Information Interchange.
ASCII is one particular national variant of the ISO 646 standard.
The European versions of ISO 646 aren't versions of ASCII.

The new standard (ISO 8859) is a family of 8-bit codes.  Every member
of the family has the same graphic characters in the lower half
(32..126) as ASCII; this is a compatible extension of ISO 646.

If you want to draw a parallel between ISO 646 and EBCDIC, there
are *lots* of versions of EBCDIC.  There's a French one and a Spanish
one and a Hebrew one and ...  Undeniably commendable.  The thing
that people complain about is having several incompatible versions
within the same "locale" (to use an ANSI-C-ism).
-- 
Fear most of all to be in error.	-- Kierkegaard, quoting Socrates.

henry@zoo.toronto.edu (Henry Spencer) (10/26/90)

In article <1990Oct25.140442@devils.rice.edu> schafer@devils.rice.edu (Richard A. Schafer) writes:
>||> However, unlike say ISO646 or ASCII, there is no one standard EBCDIC.
>To be fair, there is no *one* standard ASCII, either, if you consider
>ASCII to include any of the several European versions of ASCII...

There are no, repeat *no*, European versions of ASCII.  ASCII is a single
precisely-specified character code with no versions or ambiguities.  It
is one of a family of codes derived from ISO646.  There are a number of
other 646-derived codes in use in Europe; they are not ASCII.

It is true that the existence of a variety of 7-bit codes has turned out
to be a major nuisance, which is why there has been considerable work on
unified codes like ISO Latin.
-- 
The type syntax for C is essentially   | Henry Spencer at U of Toronto Zoology
unparsable.             --Rob Pike     |  henry@zoo.toronto.edu   utzoo!henry

malc@iconsys.icon.com (Malcolm Weir) (10/31/90)

In article <4089@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>  ASCII stands for
>	*AMERICAN* Standard Code for Information Interchange.

Gee, and all these years I thought it stood for:

	*ALMOST* Standard Code for Information Interchange.

	:-)

Malc.