[net.micro.atari16] New way to post binaries - discussion on format

csc@watmath.UUCP (Computer Sci Club) (10/21/86)

I mentioned in a previous article that I had written some arbitrary 8 bit
encoding programs that I could easily turn into an archiver specifically
designed for network transfer.  Since there seems to be some interest in
the topic in general, I'll tell you what my programs do.

In the most general case, my encoding routines are given a list of
characters that the transmission medium does not want to see.  They will
then never generate those characters.  They accept a stream of arbitrary
8 bit bytes and produce an encoded or decoded stream.

The actual encoder operates in two modes:  eight bit encoding (M8) and
seven bit encoding (M7).  The encoder decides which mode to run in
depending on which one is cheaper for the data at hand.

In seven bit encoding mode, any seven bit character can be transmitted.
Any character the encoder has been told is pathological is mapped into a
two character escape code.  Any transmittable character is sent as itself,
so this mode is obviously very cheap for text.
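
Roughly, the seven bit mode looks like this in C (just a sketch of the
idea, not the actual program; the escape character and the substitution
table here are arbitrary choices):

	/* Sketch of the M7 idea (not the actual program).  The caller
	 * marks the pathological bytes in forbidden[]; each of them is
	 * sent as a two character escape, ESC plus a safe substitute.
	 * Everything else is sent as itself. */
	#include <stdio.h>

	#define ESC '='                       /* must itself be transmittable */

	static unsigned char forbidden[256];  /* nonzero = never transmit this byte */
	static unsigned char subst[256];      /* forbidden byte -> safe substitute */

	/* Give every forbidden byte a distinct substitute, skipping any
	 * candidate that is itself forbidden. */
	void m7_init(void)
	{
	    int c, next = '!';

	    forbidden[ESC] = 1;               /* ESC must be escaped too */
	    for (c = 0; c < 256; c++) {
	        if (forbidden[c]) {
	            while (forbidden[next])
	                next++;
	            subst[c] = next++;
	        }
	    }
	}

	void m7_encode(FILE *in, FILE *out)
	{
	    int c;

	    while ((c = getc(in)) != EOF) {
	        if (forbidden[c]) {
	            putc(ESC, out);
	            putc(subst[c], out);
	        } else
	            putc(c, out);
	    }
	}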

In eight bit mode, the encoder can accept any 8 bit value.  Each group of
three input bytes is turned into four output 6 bit values.  These six bit
values are then mapped onto the ASCII characters "a-zA-Z,.".
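
The packing step itself is trivial; roughly (again just a sketch; the 64
character output table below is only a placeholder until the exact
character set is pinned down):

	/* Sketch of the M8 packing: three 8-bit bytes become four 6-bit
	 * values, each indexing a table of 64 transmittable characters.
	 * The table here is a placeholder. */
	#include <stdio.h>

	static const char m8_chars[65] =
	    "abcdefghijklmnopqrstuvwxyz"
	    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	    "0123456789,.";

	void m8_pack(unsigned char b0, unsigned char b1, unsigned char b2,
	             FILE *out)
	{
	    unsigned long bits = ((unsigned long)b0 << 16)
	                       | ((unsigned long)b1 << 8)
	                       |  (unsigned long)b2;

	    putc(m8_chars[(bits >> 18) & 0x3f], out);
	    putc(m8_chars[(bits >> 12) & 0x3f], out);
	    putc(m8_chars[(bits >>  6) & 0x3f], out);
	    putc(m8_chars[ bits        & 0x3f], out);
	}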

Run length encoding is performed in both modes.
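
The run length encoding is nothing special; something like this, applied
to the byte stream (only a sketch, with an arbitrary marker value):

	/* Run length encoding sketch (not the final format).  Runs of
	 * four or more identical bytes, and any literal occurrence of
	 * the marker, are sent as: marker, byte, count. */
	#define RLE_MARK 0x90                 /* arbitrary marker value */

	void rle_put(unsigned char byte, int len, void (*out)(unsigned char))
	{
	    if (len < 4 && byte != RLE_MARK) {
	        while (len--)
	            out(byte);                /* short runs go out literally */
	    } else {
	        out(RLE_MARK);                /* marker, byte, count */
	        out(byte);
	        out((unsigned char)len);      /* assumes len <= 255 */
	    }
	}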

The archive format I was considering would have a special archive control
character (something non-controversial) which would never be generated
by the encoder.  The archive control character would signal the beginning
of easily parsed text strings that would describe the beginning and end
of archived files, their CRCs and lengths.  It would also be possible to
generate checkpoints in the archive.  The archiver could extract all
undamaged files from a damaged archive.
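
Just to make that concrete, a file entry might look something like this,
where <AC> stands for the archive control character (the exact syntax and
field names are completely up for grabs):

	<AC>FILE name=demo.prg length=24576 crc=1A2B
	 ... encoded data ...
	<AC>ENDFILE name=demo.prg crc=1A2B
	<AC>CHECKPOINT 3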

The checkpoints could be used for retransmitting parts of the archive.  If
an archive was damaged, the archiver could tell the user which parts of the
archive needed to be replaced.  Anyone else who had a complete archive could
use that information to generate just the data needed to repair the broken
archive.

The archiver would treat a set of unordered files as an archive.  Each
file would be searched for a header.  The header would identify the archive
and be used to order the parts.  The archiver would then read the files in
the correct order.  The archive creation command would automatically
generate a numbered set of files for posting, according to a maximum size
constraint.  No more editing and cat'ing of news articles.

My experiments show that this program encodes a.out files slightly more
cheaply than uuencode does, and that text files are very cheap.  The
experimental version does not transmit any of the following characters:
all control characters, "<>{}[]^|\\~", and del.

If there is any interest, another fellow here, Mike Gore, and I will
put this together out of code we already have (for encoding and CRC
checking), and we will post the source (for UNIX and Atari) and uuencoded
versions for the Atari ST.

It would be written to be portable.

Comments?

Tracy Tims
mail to ihnp4!watmath!unit36!tracy

csc@watmath.UUCP (Computer Sci Club) (10/22/86)

As a followup to my previous article, I have come across a new encoding
technique (courtesy of the Math Faculty Computing Facility here at
the university) which has the following properties:

	- eight bit input data
	- very restricted character set
	- generally compresses files rather than expanding them
	  (including binaries)
	- no bit level twiddling needed
	- involves only table lookup

The encoding system is easy to implement.  I am whipping one up now.

This should make the transmittable archive format very easy to implement.
My co-developer, Mike Gore, is planning to do a Basic (gack spew) version
after I write one in C.

Still:  comments?

Tracy Tims
mail to ihnp4!watmath!unit36!tracy

braner@batcomputer.TN.CORNELL.EDU (braner) (10/24/86)

[]

I've been thinking recently about transferring ST screen-dumps (32K
bit-maps, one bit deep in "hires" (monochrome) mode) over the modem
(to a VAX running UNIX, to print the graphics on a laser-printer).

Since my applications are concerned with B&W line drawings which are
mostly white space, it would be very inefficient to send the whole
32K.  Also, I'd like to compress the image before I ever save it on
the ST disk.

I guess I could use ARC or some other standard, general purpose
compression program.  But it seems that an algorithm designed
especially for this purpose should beat general ones handily!
I am thinking about an algorithm that would compare bytes (or words)
VERTICALLY DOWN THE SCREEN, rather than along consecutive RAM addresses.
Such an algorithm would find more runs of identical bytes.  It should
also work for color.
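
For concreteness, the core of what I have in mind is just a column-wise
scan over the 32000 byte bit-map (80 bytes per scan line, 400 lines in
hires); a rough sketch in C, not an actual encoder:

	/* Rough sketch of the column-wise run finder (not an encoder):
	 * walk DOWN each column of the ST hires bit-map and count runs
	 * of identical bytes.  A real encoder would emit (value, count)
	 * pairs instead of just counting them. */
	#define BYTES_PER_LINE 80
	#define LINES          400

	long count_vertical_runs(unsigned char *screen)
	{
	    long runs = 0;
	    int col, row;

	    for (col = 0; col < BYTES_PER_LINE; col++) {
	        unsigned char prev = screen[col];
	        int len = 1;

	        for (row = 1; row < LINES; row++) {
	            unsigned char cur = screen[row * BYTES_PER_LINE + col];

	            if (cur == prev)
	                len++;
	            else {
	                runs++;               /* close the run (prev, len) */
	                prev = cur;
	                len = 1;
	            }
	        }
	        runs++;                       /* last run in this column */
	    }
	    return runs;
	}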

Is such a thing available (preferably PD)?  Is anybody working on one?
Or should I do it myself?

- Moshe Braner

ref0070@ritcv.UUCP (Robert E. Fortin) (10/27/86)

In article <1275@batcomputer.TN.CORNELL.EDU> braner@batcomputer.UUCP (braner) writes:
>[]
>
>I've been thinking recently about transferring ST screen-dumps (32K
>bit-maps, one bit deep in "hires" (monochrome) mode) over the modem
>(to a VAX running UNIX, to print the graphics on a laser-printer).
>
>Since my applications are concerned with B&W line drawings which are
>mostly white space, it would be very inefficient to send the whole
>32K.  Also, I'd like to compress the image before I ever save it on
>the ST disk.
>

I have the C source for a compression algorithm that reduces files to
about 20-30% of their original size. It uses the LZW algorithm (if you
know what that is - I don't). It would be great to compress your files,
but you might want to uuencode them before transmitting them. The only
problem is that I don't have a C compiler yet. If anyone is interested,
I could get it to work on Unix 4.3 and then someone could translate it
to the ST. You would need to keep the uncompress algorithm on your host
system anyway.


Bob Fortin
{allegra seismo decvax}!rochester!ref0070

braner@batcomputer.TN.CORNELL.EDU (braner) (10/29/86)

[]

By now I have a working AL RAM-resident program that, upon Alt-Help,
saves the screen in a file in a format that is BOTH compressed and
in a text (modem-able) form.  Doing both in one algorithm
is not only convenient, it is essential for getting the most compact
final product.  A typical desktop yields a TEXT FILE of about 7600
chars: 25% of the length of the bit map!

(Details of the coding algorithm do not belong here.  It is similar
to uuencode, but does not use space chars, nor is it sensitive to
added/deleted control chars or spaces.)

I am now working on a decoding program, to view such files, and on
a translator from the compressed format to PostScript (my intended
use of this whole mess is to send ST graphics to be printed on a
remote LaserWriter...).  I'll post it all when done.

Problems:

The coding program (I call it scode, or perhaps sencode?)
works fine from the desktop, but hangs if Alt-Help is pressed inside
Micro-C-Shell.  Why?  (When first run, the program replaces the screen
dump vector at $502, then does Ptermres() to stay in RAM.)
(I also bite the bullet and do OS TRAP calls from my code, even though it
is called from the Alt-Help interrupt handler.  I don't really have
a choice, do I?)
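
In outline, the install part does roughly this (a sketch using the
osbind.h names, not my actual code; the keep size is a placeholder that
would have to cover the basepage, text, data and bss):

	/* Sketch of the install sequence (not my actual code).  The
	 * dump vector at $502 is a system variable, so it has to be
	 * written in supervisor mode, hence Supexec(). */
	#include <osbind.h>

	#define KEEP_SIZE 0x2000L             /* placeholder: basepage+text+data+bss */

	static void my_dump(void)
	{
	    /* ... write the 32K screen out in the compressed text format ... */
	}

	static long install(void)
	{
	    *(long *)0x502L = (long)my_dump;  /* screen dump vector */
	    return 0;
	}

	int main(void)
	{
	    Supexec(install);                 /* patch the vector in supervisor mode */
	    Ptermres(KEEP_SIZE, 0);           /* terminate but stay resident */
	    return 0;                         /* never reached */
	}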

Also:  At the end of the ROM screen dump routine, it has:

	ADDQ.L	#4,A7
	RTS

If I end my dump routine with the same sequence, or with just a plain RTS,
it works the same either way.  What's going on?

Any advice would be appreciated.

- Moshe Braner