[comp.sys.hp] HP28S memory enhancement thru software- file compression.

peraino@gmu90x.UUCP ( ) (09/18/88)

                ARC - An HP28S File Compression/Archival System
                -----------------------------------------------
                     Ver. 1.0  Copyright September 12, 1988
                                  Bob Peraino









Introduction
------------
     There are two ways to get more memory on an HP28S. One way is to add
more physical memory. This is not a viable solution. The other way is to
use LESS memory- through data compression.
     Considering the size and complexity of current compression schemes, is
compression on a 28S feasible? With 32k, one could write a very robust
compression algorithm implementation. But how useful would it be if it took
up half, or even more, of existing memory? The real trick is to develop
a compression scheme that not only works, but is usable. The implementation
presented here uses only about 3k bytes. One might think that even 3k bytes
is a lot to pay. But what one pays is not important, as long as a net gain
can be realized. After the initial version of ARC was written, a quick test
was performed to see whether the software would produce a good return on the
3k investment. A collection of various 28S programs, totalling 6,839 bytes,
was compressed. The compressed code took up only 2,890 bytes. This is a
3,949 byte savings, which already pays for ARC.
     How much "effective" memory can one have on an HP28S with ARC?
In the above example, we see code compressed down to 42% of its original
size. Even if only a 50% compression rate is achieved, the "effective"
memory on a 28S is 48k bytes. Tests have shown that compression rates better
than 50% are easily achieved with ARC.
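
     The arithmetic behind these figures can be checked with a few lines.
This is a sketch in Python for a desktop machine, not 28S code. The variable
names are made up for the illustration, and the cost of ARC is taken as the
sum of the three module sizes given in the section headings of this document.

```python
# Figures quoted in the text for the test collection of 28S programs.
original_bytes = 6839
compressed_bytes = 2890

# Module sizes from the section headings (PAK/XPAK varies with the dictionary).
arc_cost = 1457 + 795 + 773      # PAK/XPAK + MASH/XMASH + ARC/XARC/ARCL

saved = original_bytes - compressed_bytes
ratio = compressed_bytes / original_bytes
net_gain = saved - arc_cost

print(f"saved {saved} bytes, compressed to {ratio:.0%} of original")
print(f"net gain after paying for ARC: {net_gain} bytes")
```
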
     This document and accompanying software represent months of work in
my spare time, in the search for an effective compression implementation
for the HP28S. Many tests were done to find an optimal method to implement,
taking into consideration code size, execution speed, and compression
robustness. I consider this software to be Shareware. Feel free to use,
redistribute, or modify this code for your own special needs; the code is
copiously documented for that reason. All I ask is that this document be
redistributed intact, and the code not be republished for personal monetary
gain. If you find this software to be very useful, I would most appreciate
it if you would send me 5 bucks (U.S. Dollars) or whatever you think the
software is worth to you, to let me know that I am writing stuff that people
like, and to give me some incentive to work on version 2.0. Enjoy!

                                  Bob Peraino
                            George Mason University
                 University Computing and Information Systems
                             Thompson Hall, rm. 2
                             4400 University Drive
                              Fairfax, VA  22030
                                (703)-323-2549



Brief description of the parts.
-------------------------------

     The ARC compression system is broken into 3 main areas. The first
area is PAK/XPAK, and is used for compressing programs. The second area
is MASH/XMASH, and is used for compressing matrices and vectors. The third
area is ARC/XARC/ARCL, and is used for archiving many files into one file, as
well as managing the use of the above two areas, so that the user does not
have to worry about which things are compressed, and which are not. ARC is
broken up in this fashion so that the user may take advantage of only the
code needed. For example, if you personally never work with vectors or
matrices, then you might want to skip the MASH/XMASH code. The documentation
will tell you how to maintain ARC/XARC so that it will know about the presence
or absence of code like MASH/XMASH. Or you might decide that you do not want
to waste space on the ARC/XARC/ARCL archiving system, and just manage your
compressed files yourself. The memory needs of each code area will be listed
separately, so that you may weigh the options. The code areas will be
presented in the above mentioned order. Be VERY CAREFUL when keying in the
code, to make sure you don't introduce a bug. You wouldn't want ARC to
bomb after compressing something! In an effort to minimize typing errors, I
am reposting a checksum program here, which was posted to this newsgroup
earlier. Just put the NAME of a variable on the stack and invoke CHK. A hex
number will be returned which is the checksum. All program listings presented
here have a line above the code which says CHK[] with the hex checksum
number between the brackets. You should get this number when CHKing the
code.

CHK - Calculate a checksum for an object
      name : checksum
      CHK[129]

<< RCLF HEX 16 STWS                         ;set up word size.
   SWAP RCL ->STR DUP                       ;set up loop.
   #0h 1 ROT SIZE FOR j
     OVER j j SUB NUM
     R->B XOR RL                            ;work char into checksum.
   NEXT
   ->STR 3 OVER SIZE 1 - SUB                ;Restore things.
   ROT STOF SWAP DROP
>>

     Also take note of how characters that have no ASCII equivalent are
presented in the listings:
The less-than-or-equal-to sign above "S" is presented as "<=".
The greater-than-or-equal-to sign above "T" is presented as ">=".
The not-equal-to sign above "=" is presented as "<>".

     If you do not get a checksum match, you should first make sure
that uppercase and lowercase characters are specified properly. This was
a prime problem during beta-testing.
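
     For readers who want to see what CHK computes, here is a rough model of
its loop in Python (a desktop sketch, not 28S code). As listed, CHK sets a
16-bit word size, then for each character XORs the character code into the
running value and rotates the result left by one bit. The function names are
made up for this sketch. It will NOT reproduce the CHK[] values printed with
the listings, because those depend on the exact string that ->STR produces on
the 28S (character codes, spacing and line breaks), which is not modelled.

```python
def rotl16(value):
    """Rotate a 16-bit word left by one bit (models RL after 16 STWS)."""
    return ((value << 1) | (value >> 15)) & 0xFFFF

def chk(text):
    """XOR each character code into a 16-bit word, rotating after each one."""
    checksum = 0
    for ch in text:
        checksum = rotl16(checksum ^ ord(ch))
    return format(checksum, "X")    # hex digits only, as CHK returns them

print(chk("AB"))    # 180
```

Because of the rotation, the result depends on the order of the characters
and on their case, which is why an uppercase/lowercase slip changes the
checksum.
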
     As a final note, I have been using ARC on all of my stuff for quite a
while now, and have found it to be EXTREMELY stable. In other words, it has
never trashed anything on me. I would be most interested in hearing from
anyone about potential problems with the ARC system.



PAK/XPAK   Memory- 1457 bytes as listed. (Varies)
--------

     Place a program on the stack and invoke PAK. After a while, the program
will be replaced on the stack by a string, which is the compressed program.
It's that simple. How PAK works is detailed after the code listing. Note
that PAK will not run without its token dictionary, which is listed further
on. So don't immediately try running PAK after keying it in.


PAK- compress a program object.
      program object : compressed string
      CHK[FECB]

<< ->STR {" "} 10 CHR + -> obj eoo      ;  Set things up.
   << "" DUP 1 obj SIZE                 ;  Loop from 1 to end of string
      FOR i                             ;+-Set up loop.
        i 3 ROLLD obj i i SUB eoo OVER  ;|  Check for end of object.
        IF POS DUP THEN                 ;|+-If end,
          IF 2 SAME THEN                ;||+-If end of object was a newline,
            4 ROLL                      ;|||  Get object out of the way.
            DO 1 + UNTIL                ;|||+-Count until...
              obj OVER DUP SUB " " <>   ;||||  past the newline indentation.
            END                         ;|||+-Enddo
            1 - 4 ROLLD                 ;|||  Restore object.
          END                           ;||+-Endif
          DROP dict OVER                ;||  Check for obj in dict.
          IF POS DUP THEN               ;||+-If in dict,
            CHR SWAP DROP +             ;|||  Add token to output string.
          ELSE                          ;|||-else
            DROP DUP SIZE 128 + CHR     ;|||  Set token for unknown string,
            SWAP + +                    ;|||  and append to output string.
          END                           ;||+-Endif.
          ""                            ;||  Restore empty string for next obj.
        ELSE                            ;||-else
          DROP +                        ;||   add char to obj.
        END                             ;|+-Endif
        3 ROLL i - 1 +                  ;|  Calculate amount to step loop.
      STEP                              ;+-Loop back.
      DROP                              ;  Drop left-over empty string.
      WHILE DUP DUP SIZE DUP            ;+-While last token is ">>"...
      SUB NUM 2 SAME REPEAT             ;|
        1 OVER SIZE 1 - SUB             ;|  Strip off unneeded ">>"
      END                               ;+-Endwhile.
      2 OVER SIZE SUB                   ;  Strip 1st "<<". Who needs it?
   >>
>>                                      ;That's all folks...


     Next is XPAK, which will uncompress a compressed program string. Place
a compressed program string on the stack, and invoke XPAK. In a few seconds,
that string will be replaced by the original program object.


XPAK - uncompress a program object
         compressed string : program object
         CHK[5A25]

<< -> ob                            ;  Pick up object.
   << "<<" 1 ob SIZE                ;  Set up initial output string,
      FOR i                         ;+-and loop.
        " " + ob i DUP SUB NUM DUP  ;|  Add delimiter, get next byte.
        IF 127 > THEN               ;|+-If high bit set, it's a string.
          ob SWAP 128 - i 1 +       ;||  Strip high bit, add count to index.
          SWAP DUP 1 + 5 ROLLD      ;||  Roll new loop step value out of way.
          i + SUB + SWAP            ;||  Extract the string.
        ELSE                        ;||-Else
          dict SWAP GET + 1         ;||  It's a token. Add keyword to output.
        END                         ;|+-Endif
      STEP                          ;+-Loop back.
      STR->                         ;  Convert to real object.
   >>                               ;
>>                                  ;  That's all, folks...


     PAK compresses program objects through a process of tokenization.
PAK parses the program into its individual words (a word being a string of
characters delimited by a space or newline) and looks those words up in a
token dictionary. If the word is found, then the entire word and its
delimiters are replaced by a one byte token. If the word is not in the
dictionary, then the word cannot be tokenized, and the word is put into the
compressed string as-is, with a 1-byte header representing its size. The
format of a header/token byte is as follows:

               High bit----+             +----Low bit
                           |             |
                           v             v
                           7 6 5 4 3 2 1 0     <--- One 8-bit byte.
                           ^ ^           ^
                           | |           |
              +------------+ +-----------+
              |                    |
         Sub-header                If sub-header=0, 7-bit token.
         0=Token byte              If sub-header=1, length of upcoming string.
         1=String header

This of course means that a string within a program can be a maximum of 127
bytes (it's unlikely that a string within a program would be longer than that),
and there can be a maximum of 127 words in the token dictionary.
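
     The byte format can be modelled in a few lines of Python (a desktop
sketch, not 28S code). The names pak, xpak and DICT are made up for the
sketch, DICT holds only the first 16 entries of the token dictionary listed
in this document, and words are split on whitespace, which is simpler than
what the real PAK does with newline indentation.

```python
# First 16 entries of the token dictionary (token numbers start at 1).
DICT = ["<<", ">>", "+", "DUP", "STO", "END", "SWAP", "->", "THEN", "IF",
        "NEXT", "FOR", "GET", "-", "DROP", "SIZE"]

def pak(source):
    """Tokenize a program string into a list of byte values."""
    words = source.split()
    if words and words[0] == "<<":          # the leading "<<" is dropped
        words = words[1:]
    while words and words[-1] == ">>":      # so are all trailing ">>"
        words.pop()
    out = []
    for word in words:
        if word in DICT:
            out.append(DICT.index(word) + 1)     # high bit clear: token
        else:
            if len(word) > 127:
                raise ValueError("word longer than 127 bytes")
            out.append(128 + len(word))          # high bit set: length header
            out.extend(ord(c) for c in word)
    return out

def xpak(data):
    """Expand byte values back into words, restoring the leading "<<"."""
    words, i = ["<<"], 0
    while i < len(data):
        byte = data[i]
        if byte > 127:                           # string header
            n = byte - 128
            words.append("".join(chr(b) for b in data[i + 1:i + 1 + n]))
            i += 1 + n
        else:                                    # dictionary token
            words.append(DICT[byte - 1])
            i += 1
    return words

packed = pak("<< DUP SIZE FOO >>")
print(packed)          # [4, 16, 131, 70, 79, 79]
print(xpak(packed))    # ['<<', 'DUP', 'SIZE', 'FOO']
```

Like XPAK as listed, the model does not put the trailing ">>" back.
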
     As an example, consider this small sample program:

<< SWAP DROP PROP >>

This program on the stack needs 35.5 bytes. If we were to compress it, how
much space would it need? First of all, PAK throws out the beginning "<<"
and all trailing ">>"'s. They are unnecessary. SWAP and DROP can be tokenized,
but PROP cannot. Assuming the use of the upcoming token dictionary, this
program can be compressed into the following 7 bytes:

7   -Token for SWAP
15  -Token for DROP
132 -Header for unknown word. Subtract high bit (128) = 4 bytes.
80  -"P"
82  -"R"
79  -"O"
80  -"P"

The resulting string might look like this (non-printable chars are "."):

"...PROP"


     Next is the dictionary of tokens, which PAK/XPAK use for en/de-coding
programs. Technical aspects of the dictionary follow.

dict- dictionary for PAK/XPAK
      CHK[657E]

{ "<<" ">>" "+" "DUP" "STO" "END" "SWAP" "->" "THEN" "IF" "NEXT" "FOR"
  "GET" "-" "DROP" "SIZE" "->STR" "*" "SUB" "{" "}" "PURGE" "SAME"
  "ELSE" "PATH" "RCL" "STR->" "CHR" "TYPE" "NUM" "PUT" "->LIST"
  "EVAL" "/" "^" "R->B" "DISP" "POS" "MENU" "NOT" ">" "DO" "UNTIL"
  "START" "HOME" "PIXEL" "[" "]" "CLLCD" "STO+" "R->C" "RDM" "B->R"
  "SQ" "ROT" "PICK" "SF" "REPEAT" "WHILE" "ROLL" "ROLLD" "OR" "IP" ">="
  "MOD" "DROP2" "AND" "FS?" "#" "OVER" "IFERR" "KEY" "LIST->" "STOF"
  "CRDIR" "VARS" "FC?" "PMIN" "PMAX" "RAND" "SIN" "COS" "C->R" "FP"
  "<" "<=" "DGTIZ" "DEPTH" "DRAX" "BEEP" "ROLL2" "CLEAR" "RCLF" "HALT"
  "STO-" "CF" "<>" "INV" "SYSEVAL" "STWS" "XOR" }

     One might wonder how I came up with this dictionary. There are certainly
more than 127 possible commands on the 28S. I analyzed every 28S program I
could get my hands on, and added every unique 28S command to the dictionary.
I then did a frequency analysis on the commands, and sorted the dictionary
from most often to least often referenced, to optimize command lookup
time. As a side note, after sorting, the difference in speed was negligible.
     It was mentioned earlier that the size of the PAK/XPAK segment varies.
It varies depending on the size of your dictionary. The above dictionary
may be optimal for me, but it may by no means be optimal for you. It is
presented as a starting point, and not a bad one at that: even if you were
to use an empty dictionary (so that nothing could be tokenized), the
resulting compressed string is guaranteed to still be smaller than
just taking the program object and performing ->STR. I suggest that you use
this dictionary to start, and if there are other commands you use heavily,
add them to the end. There are 101 words in the above dictionary. This means
that you can add another 26 words to this dictionary before it is full.
When building a personal dictionary, if you so desire, there are two
VERY IMPORTANT caveats:

1. "<<" and ">>" MUST BE THE FIRST TWO TOKENS, RESPECTIVELY. PAK depends on
   this.

2. ONCE YOU START COMPRESSING THINGS, YOU CAN ADD WORDS TO THE END OF THE
   DICTIONARY, BUT YOU CANNOT DELETE OR RE-ARRANGE THEM. UPON XPAKing,
   YOUR PRECIOUS CODE MAY COME OUT AS GARBAGE, IF YOU DO THIS. If for some
   reason, you must do this, XPAK everything first, or throw it all out. It
   is for this reason that I suggest that you use the above dictionary, add
   up to another 26 words if you need them, and forget about it.
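
     The two caveats amount to a simple rule: a new dictionary is safe only
if the old one is an unchanged prefix of it. The check below is a Python
sketch for a desktop machine, not 28S code; safe_extension is a made-up
helper, and the short lists stand in for a real dictionary.

```python
MAX_TOKENS = 127    # the token field is 7 bits wide and tokens start at 1

def safe_extension(old_dict, new_dict):
    """Return True if new_dict keeps every existing token number intact."""
    return (
        len(new_dict) <= MAX_TOKENS
        and new_dict[:2] == ["<<", ">>"]             # caveat 1
        and new_dict[:len(old_dict)] == old_dict     # caveat 2: append only
    )

old = ["<<", ">>", "+", "DUP"]
print(safe_extension(old, old + ["ABS"]))               # True
print(safe_extension(old, ["<<", ">>", "DUP", "+"]))    # False
print(MAX_TOKENS - 101)                                 # 26
```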

     The size of the dictionary affects PAKing time, since every word must
be looked up in it, and lookups take longer as the dictionary grows. This is
why PAKing takes a while and XPAK, which indexes the dictionary directly, is
extremely fast by comparison. As an example, the code for PAK is 405 bytes.
It takes
168 seconds to compress the code to a string of 168 bytes (41% of original
size), but only 19 seconds to XPAK, using the above dictionary. This works
out quite well, since you will most likely only PAK code once, and XPAK a
copy of it when needed. (Actually, you will most likely use ARC/XARC to
manage it, but more on that later).
     You should have noticed by now that PAK is just a parser, with
modifications for optimal program compression. With a few simple changes, you
could use the PAK code and any home-made dictionary to parse and tokenize
ANY language or keyword set. As they say in the textbooks, this shall be left
as an exercise for the reader.



MASH/XMASH   Memory - 795 bytes.
----------

     Place a vector or matrix on the stack and invoke MASH. MASH will compress
the matrix and return the compressed string to the stack. Again, how it works
is detailed after the code. MASH will not run without the mlex variable
installed, so do not try to invoke it until then.


MASH - Compress a matrix object
       matrix object : compressed string
       CHK[2C9]

<< 1 OVER TYPE                           ;  Get type.
   IF 4 SAME THEN                        ;+-Is it a complex matrix?
     SF                                  ;|  Yes, set flag 1.
   ELSE                                  ;|-Else
     CF                                  ;|  Clear flag 1.
   END                                   ;+-Endif
   ARRY-> DUP ->STR 3 OVER SIZE 2 - SUB  ;  Push array to stack. Get dimensions
   "}" + 1 FS? CHR SWAP + DEPTH ROLLD    ;  Add matrix type, throw on top.
   LIST->                                ;  Break up dimensions.
   IF 2 SAME THEN                        ;+-If 2-dimensional,
     *                                   ;|  Calc total # of elements.
   END                                   ;+-Endif
   1                                     ;  Add loop end,
   FOR j                                 ;+-And loop through the elements.
     j ROLL ->STR                        ;|  Pull off top (next) element.
     IF 1 FS? THEN                       ;|+-If it was a complex matrix,
       2 OVER SIZE 1 - SUB               ;||  Remove the parenthesis.
     END                                 ;|+-Endif
     DUP SIZE DUP CHR 3 ROLLD -> ob sz   ;|  Throw size on top, pass obj, size.
     << 1 sz                             ;|  Set up loop.
        FOR i                            ;|+-Loop thru chars in matrix element.
          mlex ob i i SUB POS 16 *       ;||  Find token nybble, shift left.
          IF sz i > THEN                 ;||+-Size could end on odd number...
            mlex ob i 1 + DUP SUB        ;|||  So get next char and put it in
            POS +                        ;|||  the low nybble of the byte.
          END                            ;||+-Endif
          CHR +                          ;||  Change to char, add to buffer.
        2 STEP                           ;|+-End of matrix element.
     >>                                  ;|
     DEPTH ROLL SWAP + DEPTH ROLLD       ;|  Add buffer to output string.
   -1 STEP                               ;+-End of matrix.
   DEPTH ROLL                            ;  In case it's on top of something,
>>                                       ;  Pull it down. That's all folks...


     Place a compressed vector/matrix string on the stack, and invoke XMASH.
The original vector/matrix will be returned.


XMASH - Uncompress a matrix object
        compressed string : matrix object
        CHK[7169]

<< DUP 1 1 SUB NUM "(" "" IFTE -> ob t   ;  If complex matrix, prefix is paren.
   << ob "}" POS 1 + ob SIZE             ;  Find size of compress string.
      FOR i                              ;+-Loop to pick up the bytes.
        t i 1 + ob i i SUB NUM 2 /       ;|  Get size of next element, and
        DUP 4 ROLLD i + .5 +             ;|  throw a copy on top of the stack.
        FOR j                            ;|+-Loop through the element.
          mlex ob j IP DUP SUB NUM       ;||  Get next byte from input string,
          DUP 16 / IP                    ;||  And get high nybble token.
          IF j FP .5 SAME THEN           ;||+-If we're talking about low nybble
            16 * -                       ;|||  Then grab low nybble instead.
          ELSE                           ;|||-Else
            SWAP DROP                    ;|||  Get rid of unneeded copy of byte
          END                            ;||+-Endif
          DUP SUB +                      ;||  Get char for token, add to output
        .5 STEP                          ;|+-Get next char in matrix element.
        STR-> SWAP DUP                   ;|  Convert element to #, recall size.
        IF FP .5 SAME THEN               ;|+-If size was odd,
          .5 +                           ;||  Then round up to next byte.
        END                              ;|+-Endif
        1 +                              ;|  +1 to get 1st byte of next element
      STEP                               ;+-Loop and get size of next element.
      "{" ob 2 ob "}" POS SUB +          ;  Rebuild dimensions,
      STR-> ->ARRY                       ;  And pull stack back to matrix.
   >>                                    ;
>>                                       ;  That's all folks...


     MASH/XMASH also works by a process of tokenization, by reducing the
number of bits needed to store each character, due to the limited number
of characters that could possibly appear in a vector/matrix. Only 4 bits are
needed to store each character, which means that two characters can be stored
in one byte. You might immediately think that this means a 50% reduction in
size, but that is not the case, as you will soon see.
     MASH first pushes all of the elements of the matrix onto the stack
( ARRY-> ). This does away with the need to store [ and ]. The dimensions are
stored with the packed matrix. MASH then compresses each individual element,
by packing two characters to a byte. This packed element is preceded in the
output string by a one byte header which specifies the length of the element.
For complex vectors/matrices, MASH also removes the unneeded parentheses.
Elements are compressed on byte-boundaries. This means that if an element has
an odd number of characters, a nybble will be wasted on the end of the
compressed element. This decision was made after taking into consideration
the increased execution time and code complexity (more of it) necessary to
do away with byte boundaries.
     It is then a simple matter to uncompress the matrix. XMASH reads headers
and uncompresses elements, pushing them onto the stack. Then, it computes the
original dimensions and converts the stacked elements back into the original
vector/matrix ( ->ARRY ).
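
     The packing of a single element can be modelled as follows. This is a
Python sketch for a desktop machine, not 28S code; the function names are
made up, and MLEX is the mlex string listed in this section. Each character
is replaced by its 1-based position in MLEX, the first character of a pair
goes in the high nybble, and the header byte holds the character count.

```python
MLEX = "0123456789,.E-"     # same characters as the mlex variable

def mash_element(text):
    """Pack one element: a length byte, then two characters per byte."""
    out = [len(text)]                              # header: character count
    for i in range(0, len(text), 2):
        byte = (MLEX.index(text[i]) + 1) * 16      # high nybble
        if i + 1 < len(text):
            byte += MLEX.index(text[i + 1]) + 1    # low nybble
        out.append(byte)                           # odd length: low nybble 0
    return out

def xmash_element(data):
    """Unpack one element produced by mash_element."""
    count, chars = data[0], []
    for byte in data[1:]:
        for nybble in (byte >> 4, byte & 0x0F):
            if len(chars) < count:                 # skip the wasted nybble
                chars.append(MLEX[nybble - 1])
    return "".join(chars)

print(mash_element("3.14"))                      # [4, 76, 37]
print(xmash_element(mash_element("-1.5E-3")))    # -1.5E-3
```
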
     MASH/XMASH use the following string to encode the vector/matrix
characters.


mlex - Token list for MASH/XMASH
CHK[456]

"0123456789,.E-"


     Now, on to performance considerations. As an example of what MASH will
do, a 10 X 10 real matrix, each element = 3.14, needs 820.5 bytes. MASH
takes 36 seconds to reduce it to a 307 byte string, which is a 62% reduction
in the array size. XMASH needs 52 seconds to uncompress it. As you may
recall, I
mentioned earlier that memory savings will not be a flat 50%. When I talk
about the "length" of a number, I am referring to the number of characters
necessary to display the number. In a real vector/matrix, each element takes
8 bytes, regardless of the display length of the element. (For a complex
vector/matrix, the element cost is 16 bytes each).  This is not the case
where MASH is concerned. The longer the display length of a number, the more
information MASH has to store. As a result, compression capability depends on
the overall length of the numbers. In the above matrix example, if the matrix
were all zeroes, the compressed matrix would be 207 bytes, for a 75% reduction
in size.
     All of this indicates that there is a break-even point, at which MASH is
not saving you anything. For a real vector/matrix, that point is an average
length of 14. The 14 characters can be packed into 7 bytes, plus a 1 byte
header makes 8 bytes, which is the same number of bytes needed to store the
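element in a real matrix. (For a complex matrix, the break-even point would
be 28 characters, 14 per part, excluding the parentheses. With the comma
between the two parts that is 29 characters, which pack into 15 bytes; adding
the 1 byte header makes 16 bytes).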
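
     The sizes quoted above can be reproduced with a small size model. This
is a Python sketch for a desktop machine, not 28S code. The function names
are made up, and the header layout (one type byte, the dimensions as text,
and a closing "}") is a reading of the MASH listing rather than something
stated in the text. It assumes every element has the same display length.

```python
import math

def element_bytes(chars):
    """One length byte plus two characters per packed byte."""
    return 1 + math.ceil(chars / 2)

def mashed_size(dims, element_chars):
    """Estimated MASH output size for an array of equal-length elements."""
    count = math.prod(dims)
    # One type byte, the dimensions as text (e.g. "10 10"), and a "}".
    header = 1 + len(" ".join(str(d) for d in dims)) + 1
    return header + count * element_bytes(element_chars)

print(mashed_size((10, 10), 4))    # every element "3.14": 307
print(mashed_size((10, 10), 1))    # every element "0":    207
print(element_bytes(14))           # 8, the size of a real array element
```
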
    All of this also indicates that the data can be "massaged" to improve
compression performance. For example, I normally do all of my work in
STD format. While doing some statistical analysis, I filled a size 50 vector
with some percentages. The vector takes 418 bytes. The results of the
calculations were numbers to an accuracy of 11 digits past the decimal point.
I can only MASH this vector down to 411 bytes, because I am close to the
break-even point. This is only a savings of 7 bytes. But I don't really need
that much precision for the work I am doing, so I FIXed the 28S at 2, (this
immediately reformats the vector on the stack), because I only want to save
them out to 2 digits past the decimal point, and that same vector returns
a string size of 159 bytes, for a reduction to 38% of its original size!
You may find that data massaging can help you in this way, as well. The
moral of this story is, don't store more precision than you need.



ARC/XARC/ARCL   Memory - 773 bytes
-------------

     ARC/XARC/ARCL manages the use of PAK/XPAK and MASH/XMASH so that you don't
have to. When I first started using PAK/XPAK, it was very easy to compress
just a few modules, forget that they were compressed, and later try to
execute code which called those modules. ARC performs two important
functions: it invokes the proper compression/decompression routine by keeping
track of what's what, and it stores compressed things all in one file. In this
manner, you can ARC an entire directory to one file for storage. It is then
an easy matter to move that one file wherever you want. When you are ready to
use those files again, XARC the ARC-file into a temporary directory. XARC
will decompress as needed. What about object types other than programs and
vectors/matrices? They are archived as well, but not compressed. How this is
handled is coming up.
     To archive a bunch of files, put the file names into a list on the
stack, then push the name of an ARC-file onto the stack, and invoke ARC. If
the ARC-file doesn't exist, it will be created. If the file list duplicates
some names in the ARC-file, those entries will be replaced. If they are not
duplicates, they are added. This gives you the ability to update objects in
the ARC-file.


ARC - Archive a list of files to one file.
      {Filelist},arcfilename :
      CHK[D385]

<< -> ls af                            ;  Pick up filelist and arcfile name.
   << 31 CF af                         ;  Force system to drop args on error.
      IFERR RCL THEN                   ;+-If arcfile doesn't already exist,
        {}                             ;|  Start a new one.
      END                              ;+-Endif
      1 ls SIZE                        ;  Calc number of files to arc.
      FOR i                            ;+-Loop through the files.
        ls i GET DUP "ARCing "         ;|  Get file name,
        OVER ->STR + 1 DISP            ;|  Update display.
        IFERR RCL THEN                 ;|+-If file doesn't exist,
          DROP                         ;||  Then skip it.
        ELSE                           ;||-Else
          DUP TYPE -> t                ;||  Get object type.
          << arck t 1 + GET            ;||  Get commands from action list.
             STR-> -> on ob            ;||  Exec cmds, pass object name,
             << DUP                    ;||  and resulting object.
                IF on POS DUP THEN     ;||+-If object already exists,
                  SWAP OVER 1 - ob PUT ;|||  Replace the existing object
                  SWAP 1 + t PUT       ;|||  and type.
                ELSE                   ;|||-Else
                  DROP ob + on + t +   ;|||  Append new object parts.
                END                    ;||+-Endif
             >>                        ;||
          >>                           ;||
        END                            ;|+-Endif
      NEXT                             ;+-Get next filename.
      af STO CLMF                      ;  Store updated arcfile, restore
   >>                                  ;  Display.
>>                                     ;  That's all folks...


     ARC loops through the file list specified. It determines the object type
and invokes the proper compression routine, if any. (How this is done is coming
up). It then looks for a duplicate file name in the ARC-file, which is a list
object. If the name is found, the object and its type number are replaced. If
it is not found, then the object, its name, and its type are appended to the
ARC-file.
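
     The ARC-file layout described here is a flat list of triples: object,
name, type. The replace-or-append step can be modelled as below. This is a
Python sketch for a desktop machine, not 28S code; arc_store and the sample
values are made up. Unlike the real ARC, which searches the whole list with
POS, the model looks only at the name slots.

```python
def arc_store(archive, name, obj, obj_type):
    """Replace an existing entry or append a new (object, name, type) triple."""
    names = archive[1::3]                    # names sit in every third slot
    if name in names:
        pos = 3 * names.index(name) + 1
        archive[pos - 1] = obj               # object sits just before the name
        archive[pos + 1] = obj_type          # type sits just after the name
    else:
        archive += [obj, name, obj_type]
    return archive

arc = []
arc_store(arc, "A", "packed-v1", 8)
arc_store(arc, "B", "mashed", 3)
arc_store(arc, "A", "packed-v2", 8)
print(arc)    # ['packed-v2', 'A', 8, 'mashed', 'B', 3]
```
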
     To extract some files from an ARC-file, put the selected file names into
a list, push the ARC-file name onto the stack, and invoke XARC. Those selected
files will be extracted and created in the current directory. If a selected
file already exists in the current directory, it is replaced. The ARC-file
itself is not changed.


XARC - Extract objects from an archive file.
       {filename list},arcfilename :
       CHK[C82B]

<< -> ls af                     ;  Grab namelist and arcfile name.
   << 1 ls SIZE                 ;  Get number of filenames.
      FOR i                     ;+-Loop through filenames.
        ls i GET -> on          ;|  Get object name.
        << "XARCing " on ->STR  ;|  Update display.
           + 1 DISP af RCL DUP  ;|  Get arc file,
           DUP on               ;|  And object name.
           IF POS DUP THEN      ;|+-If name is in arcfile,
             SWAP OVER 1 - GET  ;||  Get the object.
             3 ROLLD 1 + GET    ;||  Get the object type.
             12 + arck SWAP GET ;||  Add 12 offset, get commands for that
             STR-> on STO       ;||  type, exec, store resulting output.
           ELSE                 ;||-Else
             3 DROPN            ;||  Not found, skip.
           END                  ;|+-Endif.
        >>                      ;|
      NEXT                      ;+-Get next file name.
      CLMF                      ;  Restore display.
   >>                           ;
>>                              ;  That's all folks...


     How do you recall what is in an ARC-file? With ARCL. Push the ARC-file
name onto the stack and invoke ARCL, and a list of the contents will be
returned.


ARCL- Arclist. List contents of an archive file.
      arcfilename : {contents list}
      CHK[3664]

<< RCL -> ls         ;  Get archive file.
   << {} 2 ls SIZE   ;  Calc number of entries.
      FOR i          ;+-Loop through entries.
        ls i GET +   ;|  Append entry name to list.
      3 STEP         ;+-Go to next entry.
   >>                ;
>>                   ;  That's all, folks...


     As an example, let's say we want to archive an entire directory of
stat work, to later retrieve it:

VARS              ;Get all file names.
'stat.a'          ;Specify ARC-file name.
ARC               ;Invoke ARC.


Now let's say we want to work on that stuff and want to extract EVERYTHING:

'stat.a'          ;Specify ARC-file name.
ARCL              ;Get a list of everything in it.
'stat.a'          ;Specify ARC-file name again.
XARC              ;And extract all of it.

And as a variation on the extraction to avoid specifying the ARC-file name
twice, we could have done:  'stat.a' DUP ARCL SWAP XARC.
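
     The list-then-extract sequence can be modelled like this. It is a Python
sketch for a desktop machine, not 28S code; the function names and the sample
archive are made up. The archive is the flat list of object, name, type
triples that ARC builds.

```python
def arcl(archive):
    """Names are the 2nd, 5th, 8th, ... entries of the flat archive list."""
    return archive[1::3]

def xarc_lookup(archive, name):
    """Return (object, type) for a stored name, or None if it is absent."""
    names = arcl(archive)
    if name not in names:
        return None                      # XARC skips names it cannot find
    pos = 3 * names.index(name) + 1
    return archive[pos - 1], archive[pos + 1]

arc = ["packed-prog", "PROG", 8, "mashed-matrix", "M1", 3]
print(arcl(arc))                   # ['PROG', 'M1']
print(xarc_lookup(arc, "M1"))      # ('mashed-matrix', 3)
print(xarc_lookup(arc, "NOPE"))    # None
```
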
     If you recall, we previously questioned what ARC/XARC does with object
types other than programs and vectors/matrices. ARC/XARC determines what to
do with each object type by using what I call an "action list". The action
list is a list object with a command string for every object type, which
specifies what to do with that object type upon ARCing and XARCing.


arck - Arc key. Action list for ARC/XARC
       CHK[DE75]

{"" "" "" "MASH" "MASH" "1 ->LIST" "->STR" "" "PAK" "->STR"
 "" "" "" "" "XMASH" "XMASH" "" "STR->" "" "XPAK" "STR->" ""}


     As an example, keeping in mind that object types start at 0, we see that
for object type 3, which is a vector/matrix object, we invoke "MASH" when
ARCing, and we invoke "XMASH" when XARCing. You should now be able to go
through the action list and determine what is being done for each object type.
     It was mentioned earlier that you could leave out PAK/XPAK or MASH/XMASH
if you did not need one or the other. This is where you maintain ARC, if you
wish to do so. For instance, if you aren't interested in the MASH/XMASH code,
find the action-list entries for compressing/uncompressing real and complex
vectors/matrices, and change them to empty strings. (The entry numbers would
be 4,5,15,16).
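
     The lookup into the action list, and the edit that drops MASH/XMASH, can
be modelled as follows. This is a Python sketch for a desktop machine, not
28S code; the function names are made up. ARCK is the arck list from this
document, and the offsets follow the ARC and XARC listings (type + 1 when
ARCing, type + 12 when XARCing, counting entries from 1).

```python
# The arck action list; Python counts from 0, the 28S counts from 1.
ARCK = ["", "", "", "MASH", "MASH", "1 ->LIST", "->STR", "", "PAK", "->STR",
        "", "", "", "", "XMASH", "XMASH", "", "STR->", "", "XPAK", "STR->", ""]

def arc_action(obj_type):
    """Command string run when ARCing: 28S entry number obj_type + 1."""
    return ARCK[obj_type]

def xarc_action(obj_type):
    """Command string run when XARCing: 28S entry number obj_type + 12."""
    return ARCK[obj_type + 11]

def without_mash(actions):
    """Blank out entries 4, 5, 15 and 16 (counted from 1)."""
    return ["" if i + 1 in (4, 5, 15, 16) else a
            for i, a in enumerate(actions)]

print(arc_action(3), xarc_action(3))    # MASH XMASH
print(arc_action(8), xarc_action(8))    # PAK XPAK
print(without_mash(ARCK)[3:5])          # ['', '']
```
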
     So what about nonexistent files? They are ignored. My opinion is, an
archive or extraction process shouldn't bomb if one file was wrongly
specified. It can be specified later. And what about sub-directories? Version
1.0 of ARC ignores them. Version 2.0 of ARC just might do sub-directories as
well, such that archiving one directory recursively archives the entire
file structure below, into one file. If there is interest, I may pursue this.



Conclusion.
-----------

     All 10 ARC variables listed above (everything except the CHK utility)
should be installed in your HOME directory, or in a directory where they are
accessible to every directory that uses them. If you are a bit paranoid about
not being able to retrieve something that was compressed, then put your mind
at rest by compressing a copy of something, then uncompressing it and
comparing the result with the original. (The SAME function will suffice).
     I would be most interested in getting feedback, for better or worse.
I'll entertain any change suggestion, and I especially want to hear about
bugs, and overall impressions. I have other projects in the pipeline, so
stay tuned to this newsgroup!