[comp.mail.misc] ASCII/EBCDIC Translation

skl@van-bc.UUCP (Samuel Lam) (12/13/87)

In article <425@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
)It would be interesting to learn just what characters (i.e., hex values)
)can be safely transferred through ASCII/EBCDIC interfaces.  An encoding
)scheme like uuencode could be written using translation tables, if there
)are 64 character codes that can be guaranteed reliable in all ASCII/EBCDIC
)interfaces.
)
)...  Are there 64 codes that can be trusted to any ASCII/EBCDIC translators,
)and will come out the same when fed to any other EBCDIC/ASCII translator?

The upper *and* lower case alphabet and the digits together make up 62 of
these characters.  Anyone have suggestions for the two more to go?  (I think
'*' and '#', among others, could do it.)

...Sam
-- 
Samuel Lam     <Samuel.Lam@van-bc.UUCP> or
{ihnp4!alberta,watmath,uw-beaver}!ubc-vision!van-bc!skl

rayan@ai.toronto.edu (Rayan Zachariassen) (12/15/87)

In article <1622@van-bc.UUCP> skl@van-bc.UUCP (Samuel Lam) writes:
# In article <425@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
# )...  Are there 64 codes that can be trusted to any ASCII/EBCDIC translators,
# )and will come out the same when fed to any other EBCDIC/ASCII translator?
# 
# The upper *and* lower case alphabet and the digits together make up 62 of
# these characters.  Anyone have suggestions for the two more to go?

Yes, + and -. Why? Because it has already been done that way. A couple of
programs called rscs{en,de}code were written by Reg Quinton & Ken Lalonde
to protect news batches flowing over BITNET/NetNorth a year or two ago.
In my experience, even though uuencoded data can be mangled by the random
machines a message travels through, I've never had any problem when using
rscs*code. In addition, it is just as efficient as uuencode on BITNET,
because the LRECL of 80 is partially wasted by the 60-odd linelength used
by uu*code, but rscs*code uses 79 characters per line, which translates into
about the same number of card images for both encodings. It is in use on
one of the news backbone links (watmath--utgpu).

rayan

meissner@xyzzy.UUCP (Michael Meissner) (12/15/87)

In article <1622@van-bc.UUCP> skl@van-bc.UUCP (Samuel Lam) writes:
| In article <425@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
| )...  Are there 64 codes that can be trusted to any ASCII/EBCDIC translators,
| )and will come out the same when fed to any other EBCDIC/ASCII translator?
| 
| The upper *and* lower case alphabet and the digits together make up 62 of
| these characters.  Anyone have suggestions for the two more to go?  (I think
| '*' and '#', among others, could do it.)

If you really want to be portable, don't use '#', since it is a national
character set symbol.  I would recommend something like '+' and
'-' (or '.' and ',') myself.
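
One concrete reading of "national character set symbol", assumed here rather
than stated in the post, is the set of ISO 646 code positions that national
variants are allowed to reassign.  A minimal Python sketch that screens the
candidate pairs suggested in this thread against that set (the names
NATIONAL_USE and risky are made up for illustration):

```python
# ISO 646 code positions that national variants may reassign
# (for example, 0x23 '#' is the pound sterling sign in the UK variant).
NATIONAL_USE = set("#$@[\\]^`{|}~")

def risky(candidates):
    """Return the candidate characters that fall in national-use positions."""
    return sorted(c for c in candidates if c in NATIONAL_USE)

if __name__ == "__main__":
    for pair in ("*#", "+-", ".,", "*$"):
        print(repr(pair), "->", risky(pair) or "no national-use characters")
```

This only checks the ASCII side; whether a given EBCDIC translator preserves a
character is a separate question.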
-- 
Michael Meissner, Data General.		Uucp: ...!mcnc!rti!xyzzy!meissner
					Arpa/Csnet:  meissner@dg-rtp.DG.COM

jc@minya.UUCP (John Chambers) (12/25/87)

> # In article <425@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
> # )...  Are there 64 codes that can be trusted to any ASCII/EBCDIC translators,
> # )and will come out the same when fed to any other EBCDIC/ASCII translator?
> # 
> # The upper *and* lower case alphabet and the digits together make up 62 of
> # these characters.  Anyone have suggestions for the two more to go?
> 
> Yes, + and -. Why? Because it has already been done that way. A couple of
> programs called rscs{en,de}code were written by Reg Quinton & Ken Lalonde
> to protect news batches flowing over BITNET/NetNorth a year or two ago.

This sounds like a useful pair of programs; is the source PD?  If not, it
would be trivial to write them, if the authors would tell us just what the
mapping is.  I could do my own version in maybe 15 minutes, I guess, but
I'd likely use a different mapping, and their rscsdecode would mistranslate
my rscsencoded file.  I'd make the guess that the order is "0-9A-Za-z+-",
but it'd be nice to know fer shur.

-- 
John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)

dboyes@uoregon.UUCP (David Boyes) (12/27/87)

In article <1622@van-bc.UUCP> skl@van-bc.UUCP (Samuel Lam) writes:
>In article <425@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>[stuff about ASCII-EBCDIC translation]
>
>The upper *and* lower case alphabet and the digits together make up 62 of
>these characters.  Anyone have suggestions for the two more to go?  (I think
>'*' and '#', among others, could do it.)

Careful with the '#' sign. Some IBM systems define that symbol as a
delete character symbol. You'd probably be better off using '*' and
'$' -- both of which have defined meanings in most IBM languages, so
they don't play fast and loose with them.

>Samuel Lam     <Samuel.Lam@van-bc.UUCP> or
>{ihnp4!alberta,watmath,uw-beaver}!ubc-vision!van-bc!skl

-- 
David Boyes         | ARPA: 556%OREGON1.BITNET@CUNYVM.CUNY.EDU
Systems Division    | BITNET: 556@OREGON1
UO Computing Center | UUCP: dboyes@uoregon.UUCP
'How long d'ya think it'll be before just us oldtimers remember WISCVM?'      

rayan@ai.toronto.edu (Rayan Zachariassen) (12/29/87)

In article <439@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
# This sounds like a useful pair of programs; is the source PD?  If not, it
# would be trivial to write them, if the authors would tell us just what the
# mapping is.

They are; it is. The mapping is "A-Za-z0-9+-". I asked about the availability
of the code, and got a shar back from one of the authors. As I type this, it
is on its way to the comp.sources.misc submission address. The latest rev. has
a CRC check in it. Paranoia is worth the extra resources.
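
For illustration only, here is a minimal Python sketch of a 6-bit encoding
over the alphabet order given above.  It is not the rscsencode format: the
thread does not describe the line framing, length handling, or CRC, so those
are left out, and the most-significant-bit-first packing is an assumption.
The names encode, decode and ALPHABET are hypothetical.

```python
import string

# Alphabet order as stated in the post: "A-Za-z0-9+-".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+-"

def encode(data: bytes, alphabet: str = ALPHABET) -> str:
    """Pack bytes into 6-bit groups (most significant bit first, an assumption)."""
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % 6)  # pad the final group with zero bits
    return "".join(alphabet[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))

def decode(text: str, nbytes: int, alphabet: str = ALPHABET) -> bytes:
    """Inverse of encode(); nbytes is the original length, carried out of band here."""
    index = {c: i for i, c in enumerate(alphabet)}
    bits = "".join(f"{index[c]:06b}" for c in text)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, nbytes * 8, 8))

if __name__ == "__main__":
    msg = b"news batch"
    enc = encode(msg)
    print(enc)
    print(decode(enc, len(msg)) == msg)
    # Decoding with the earlier guess "0-9A-Za-z+-" yields different bytes,
    # which is why the exact order matters.
    guess = string.digits + string.ascii_uppercase + string.ascii_lowercase + "+-"
    print(decode(enc, len(msg), guess) == msg)
```

Both sides must agree on the alphabet order; only '+' and '-' sit at the same
index in the two orders compared here.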

rayan

KEN@ORION.BITNET (Kenneth Ng) (01/04/88)

>From: jc@minya.UUCP (John Chambers)
>This sounds like a useful pair of programs; is the source PD?  If not, it
>would be trivial to write them, if the authors would tell us just what the
>mapping is.  I could do my own version in maybe 15 minutes, I guess, but
>I'd likely use a different mapping, and their rscsdecode would mistranslate
>my rscsencoded file.  I'd make the guess that the order is "0-9A-Za-z+-",
>but it'd be nice to know fer shur.
     
I've got one developed independently; it assumes "a..zA..Z0..9,.".
It won't be much good unless you can use REXX, though.  Also, it's
got a prefix code in front of all the lines.
     
>John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)
-------
Kenneth Ng: ken@orion.bitnet
Also: ken@argus.uucp, ken@njitsc1.uucp, ken@njit-eies.mailnet
     
WISCVM.WISC.EDU is history, does your signature still use it?
     

lindsay@dscatl.UUCP (Lindsay Cleveland) (01/11/88)

Perhaps I missed some earlier part of this discussion, but whenever
I've needed to do ASCII <-> EBCDIC conversion, I've found that the
"dd" command works just fine, especially since it also takes
care of the IBM need for fixed-length records (punched-card
images), or will strip off trailing blanks when going from
fixed-length IBM records to UNIX files.  Likewise, there are some
IBM applications which require that the input is in upper-case
characters, which "dd" also handles.

Case 1:
  dd if=ascii.file of=card.image conv=ebcdic cbs=80
 or
  dd if=ascii.file of=card.image conv=ibm,ucase cbs=80


Case 2:
  dd if=card.image of=ascii.file conv=ascii cbs=80
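
For readers without "dd" at hand, the record handling in the two cases above
can be roughly sketched in Python.  This is an approximation under stated
assumptions: it uses Python's cp037 codec as the EBCDIC table, which may not
match the tables built into any particular "dd", and the function names are
made up for illustration.

```python
CBS = 80  # record length, matching cbs=80 in the dd commands above

def to_card_images(text: str, codec: str = "cp037") -> bytes:
    """Pad (or truncate) each line to CBS columns, then translate to EBCDIC."""
    records = [line[:CBS].ljust(CBS) for line in text.splitlines()]
    return "".join(records).encode(codec)

def from_card_images(data: bytes, codec: str = "cp037") -> str:
    """Translate back to text, strip trailing blanks, and restore newlines."""
    text = data.decode(codec)
    lines = [text[i:i + CBS].rstrip(" ") for i in range(0, len(text), CBS)]
    return "".join(line + "\n" for line in lines)

if __name__ == "__main__":
    cards = to_card_images("HELLO\nWORLD 123\n")
    print(len(cards))
    print(from_card_images(cards), end="")
```

Note that lines longer than 80 columns are silently truncated in this sketch,
and trailing blanks that were in the original text are lost on the way back.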

Excuse me if I am restating that which has already been said.

Cheers,
  Lindsay

Lindsay Cleveland         Digital Systems Co.   Atlanta, Ga
  gatech!dscatl!lindsay     (404) 497-1902
                         (U.S. Mail:  PO Box 1140, Duluth, GA  30136)

jbeard@quintus.UUCP (01/12/88)

In article <3089@dscatl.UUCP>, lindsay@dscatl.UUCP (Lindsay Cleveland) writes:
> I've needed to do ASCII <-> EBCDIC conversion, I've found that the
> "dd" command works just fine, ....

However, EBCDIC contains a cent sign and a PL/1 NOT sign that ASCII lacks, and
ASCII characters like
	[]{}`\~^
are often mistranslated.
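
One way to see this kind of mistranslation is to translate out with one EBCDIC
table and back with another.  The Python sketch below uses the cp037 and cp500
codecs purely as stand-ins for two translators that disagree; the choice of
those two code pages and the function name mangled are assumptions, not
something taken from this thread.

```python
import string

def mangled(chars: str, out_codec: str = "cp037", back_codec: str = "cp500") -> dict:
    """Map each character that changes on the round trip to what it becomes."""
    result = {}
    for c in chars:
        back = c.encode(out_codec).decode(back_codec)
        if back != c:
            result[c] = back
    return result

if __name__ == "__main__":
    printable = string.digits + string.ascii_letters + string.punctuation
    print(mangled(printable))
    # The 64-character alphabet discussed earlier in the thread:
    print(mangled(string.ascii_letters + string.digits + "+-"))
```

With these two code pages the letters, digits, '+' and '-' come back
unchanged, while some of the bracket-like punctuation does not.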

Most offensive are the incompatible representations of \n (newline), which has
a history of:

	cr, lf pair for old tty-35 devices

and a later development of

	nl

dd typically maps \n to the lf, which isn't correct.
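
Assuming the complaint is that the receiving side wants the EBCDIC NL control
(0x15) while the translation produces EBCDIC LF (0x25), the effect and one
possible workaround can be sketched in Python.  Python's cp037 codec is used
as a stand-in for the translation table, and encode_with_nl is a hypothetical
helper, not part of any tool mentioned here.

```python
EBCDIC_NL = 0x15  # EBCDIC "new line" control
EBCDIC_LF = 0x25  # EBCDIC "line feed" control

def encode_with_nl(text: str, codec: str = "cp037") -> bytes:
    """Encode text, then rewrite EBCDIC LF bytes as EBCDIC NL."""
    raw = text.encode(codec)
    return raw.replace(bytes([EBCDIC_LF]), bytes([EBCDIC_NL]))

if __name__ == "__main__":
    print("\n".encode("cp037").hex())      # what the plain codec produces
    print(encode_with_nl("a\nb").hex())    # with the LF -> NL rewrite applied
```

Whether NL is actually what a given EBCDIC system expects depends on that
system; record-oriented files may not use a newline character at all.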

Posting C source to an EBCDIC environment can be fun ...