iverson@cory.BERKELEY.EDU (Tim Iverson) (02/07/86)
-------- This here is the real thing; cut here and download --------

ARC File Archive Utility
Version 5.0

(C) COPYRIGHT 1985, 1986 by System Enhancement Associates; ALL RIGHTS RESERVED

This document describes the ARC file utility, version 5.0, which was created by System Enhancement Associates in January of 1986.

Table of Contents

Introduction
Using ARC
ARC commands
    Adding files
    Extracting files
    Deleting files
    Listing archive entries
    Printing files
    Running files
    Testing an archive
    Converting an archive
ARC options
    Suppressing compression
    Backup retention
    Message suppression
    Encryption/decryption
RAMdisk support
MARC
XARC
Version numbers
Program update service
Common questions and answers
Revision history
    Changes in version 3
    Changes in version 4
    Changes in version 4.1
    Changes in version 4.3
    Changes in version 4.4
    Changes in version 4.5
    Changes in version 5.0
Program history and credits
Site license
Order form

INTRODUCTION

ARC is the copyrighted property of System Enhancement Associates. You are granted a limited license to use ARC, and to copy it and distribute it, provided that the following conditions are met:

    1) No fee may be charged for such copying and distribution.

    2) ARC may ONLY be distributed in its original, unmodified state.

Any voluntary contributions for the use of this program will be appreciated, and should be sent to:

    System Enhancement Associates
    21 New Street
    Wayne, NJ 07470

You may not use this product in a commercial environment or a governmental organization without paying a license fee of $35. Site licenses and commercial distribution licenses are available. A program disk and printed documentation are available for $50. See the order form in the back of this manual for more details.

A word about user supported software:

The user supported software concept (usually referred to as freeware) is an attempt to provide software at low cost. The cost of offering a new product by conventional means is staggering, and hence dissuades many independent authors and small companies from developing and promoting their ideas. User supported software is an attempt to develop a new marketing channel, where products can be introduced at low cost.

If user supported software works, then everyone will benefit.
The user will benefit by receiving quality products at low cost, and by being able to "test drive" software thoroughly before purchasing it. The author benefits by being able to enter the commercial software arena without first needing large sources of venture capital.

But it can only work with your support. We're not just talking about ARC here, but about all user supported software. If you find that you are still using a program after a couple of weeks, then pretty obviously it is worth something to you, and you should send in a contribution.

ARC Page 1

And now, back to ARC:

ARC is used to create and maintain file archives. An archive is a group of files collected together into one file in such a way that the individual files may be recovered intact.

ARC is different from other archive and library utilities in that it automatically compresses the files being archived, so that the resulting archive takes up a minimum amount of space.

When ARC is used to add a file to an archive it analyzes the file to determine which of four storage methods will result in the greatest savings. These four methods are:

    1) No compression; the file is stored as is.

    2) Repeated-character compression; repeated sequences of the same byte value are collapsed into a three-byte code sequence.

    3) Huffman squeezing; the file is compressed into variable length bit strings, similar to the method used by the SQ programs.

    4) Dynamic Lempel-Zev compression; the file is stored as a series of variable size bit codes which represent character strings, and which are created "on the fly".

Note that since one of the four methods involves no compression at all, the resulting archive entry will never be larger than the original file.

An interesting note: It has been brought to our attention that BASIC programs compress to a smaller size when they are not tokenized.
If you are more concerned with space than speed, you may wish to convert your BASIC programs to ASCII form before adding them to an archive. Your BASIC manual should give instructions on how to do this.

USING ARC

ARC is invoked with a command of the following format:

    ARC <x> <arcname> [<template> . . .]

Where:

    <x> is an ARC command letter (see below), in either upper or lower case.

    <arcname> is the name of the archive to act on, with or without an extension. If no extension is supplied, then ".ARC" is assumed. The archive name may include path and drive specifiers.

    <template> is one or more file name templates. The "wildcard" characters "*" and "?" may be used. A file name template may include a path or drive specifier, though it isn't always meaningful.

If ARC is invoked with no arguments (by typing "ARC", and pressing "enter"), then a brief command summary is displayed.

Following is a brief summary of the available ARC commands:

    a   = add files to archive
    m   = move files to archive
    u   = update files in archive
    f   = freshen files in archive
    d   = delete files from archive
    x,e = extract files from archive
    r   = run files from archive
    p   = copy files from archive to standard output
    l   = list files in archive
    v   = verbose listing of files in archive
    t   = test archive integrity
    c   = convert entry to new packing method

Following is a brief summary of the available ARC options, which may alter how a command works:

    b   = retain backup copy of archive
    s   = suppress compression (store only)
    w   = suppress warning messages
    n   = suppress notes and comments
    g   = encode or decode archive entry

ARC COMMANDS

This section describes each of the commands. ARC will accept any one command at a time. If no commands are given, then a brief command list is displayed.

ADDING FILES
____________

Files are added to an archive using the "A" (Add), "M" (Move), "U" (Update), or "F" (Freshen) commands. Add always adds the file.
Move differs from Add in that the source file is deleted once it has been added to the archive. Update differs from Add in that the file is only added if it is not already in the archive, or if it is newer than the corresponding entry in the archive. Freshen is similar to Update, except that new files are not added to the archive; only files already in the archive are updated.

For example, if you wish to add a file named "TEST.DAT" to an archive named "MY.ARC", you would use a command of the form:

    ARC a my test.dat

If you wanted to add all files with a ".C" extension, and all files named "STUFF" to an archive named "JUNK.ARC", you could type:

    ARC a junk *.c stuff.*

If you wanted to move all files in your current directory into an archive named "SUM.ARC", you could use a command of the form:

    ARC m sum *.*

If you have an archive named "TEXT.ARC", and you wanted to add to it all of your files with an extension of ".TXT" which have been created or changed since they were last archived, then you would type:

    ARC u text *.txt

If you have a bunch of files in your current directory, with backup copies being stored in an archive named "SAFE.ARC", then if you wanted to make sure that every file in the archive is the latest version of that file, you would type:

    ARC f safe

A word about Update and Freshen: These are similar in that they look at the date and time of last change on the file, and only add it if the file has been changed since it was last archived. They differ in that Update will add new files, while Freshen will not. In other words, Update looks for the files on disk, and adds them if they are new or have changed, while Freshen looks in the archive, and tries to update the files which are already there.

Archive entries are always maintained in alphabetic order. Archive entries may not have duplicate names. If you add a file to an archive that already contains a file by that name, then the existing entry in the archive is replaced.
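
The difference between Add, Move, Update, and Freshen comes down to one yes/no decision per file. The following Python sketch restates the rules given above; it is only an illustration, not ARC's actual code, and the function name and the use of plain comparable timestamps are assumptions made for the example.

```python
def should_add(command, disk_time, archived_time):
    """Decide whether a file on disk should be written into the archive.

    command       -- "a" (Add), "m" (Move), "u" (Update) or "f" (Freshen)
    disk_time     -- date/time of last change of the file on disk
    archived_time -- date/time stored with the archive entry of the same
                     name, or None if the archive has no such entry
    (Hypothetical helper; the names are not taken from ARC itself.)
    """
    if command not in ("a", "m", "u", "f"):
        raise ValueError("not an adding command: %r" % command)
    if command in ("a", "m"):
        # Add and Move always add; Move also deletes the source afterwards.
        return True
    if archived_time is None:
        # New file: Update takes it, Freshen ignores it.
        return command == "u"
    # Entry already present: both commands want only newer files.
    return disk_time > archived_time
```

With this sketch, a file that is not yet in the archive is accepted by "u" and skipped by "f", while an unchanged file is skipped by both.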
Note that the archive itself and its backup will not be added.

You may also add a file which is in a directory other than your current directory. For example, it is perfectly legal to type:

    ARC a junk c:\dustbin\stuff.txt

You cannot add two files with the same name. In other words, if you have a file named "C:\DUSTBIN\STUFF.TXT" and another file named "C:\BUCKET\STUFF.TXT", then typing:

    arc a junk c:\dustbin\*.* c:\bucket\*.*

will not work.

ARC does not save the path name. In other words, if you specify a drive and/or path when adding a file, only the actual file name is stored in the archive.

ARC will never add an archive to itself, nor will it add the temporary copy or a backup copy of the archive.

EXTRACTING FILES
________________

Archive entries are extracted with the "E" (Extract) and "X" (eXtract) commands. For example, if you had an archive named "JUNK.ARC", and you wanted all files in it with an extension of ".TXT" or ".DOC" to be recreated on your disk, you could type:

    ARC x junk *.txt *.doc

If you wanted to extract all of the files in an archive named "JUNK.ARC", you could simply type:

    ARC x junk

Whatever method of file compression was used in storing the files is reversed, and uncompressed copies are created in the current directory.

You can also specify a path name, in which case the decompressed copy is placed in the specified directory. For example, if you had an archive named "JUNK.ARC", and you wanted all files in it with an extension of ".TXT" to be placed in the directory "C:\WASTE\LAND", then you could type:

    ARC x junk c:\waste\land\*.txt

If you wanted to put the file "TRASH.TXT" on your A: drive, and the file "LITTER.TXT" on your B: drive, you could type:

    ARC x junk a:trash.txt b:litter.txt

If you give more than one path for a file, then only the first one is used. For example, if you typed:

    ARC x junk a:trash.txt b:trash.txt

then TRASH.TXT will be placed on your A: drive.
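
The "first path wins" rule can be pictured as a small lookup. The Python sketch below is an illustration only: the function name is hypothetical, and it uses Python's fnmatch patterns, which are assumed to be close enough to MS-DOS wildcards for these examples (they are not identical in every case).

```python
import fnmatch
import ntpath


def plan_extraction(entries, templates):
    """Map each matching archive entry to the directory it would land in.

    entries   -- file names stored in the archive (no paths are stored)
    templates -- templates as typed on the command line, each optionally
                 prefixed with a drive and/or path
    An empty directory string stands for the current directory.
    """
    plan = {}
    for name in entries:
        if not templates:
            # No template given: every entry goes to the current directory.
            plan[name] = ""
            continue
        for template in templates:
            directory, pattern = ntpath.split(template)
            # File names are compared without regard to case.
            if fnmatch.fnmatchcase(name.upper(), pattern.upper()):
                plan[name] = directory
                break  # only the first matching template is used
    return plan
```

Calling plan_extraction(["TRASH.TXT"], ["a:trash.txt", "b:trash.txt"]) pairs TRASH.TXT with "a:" only, which mirrors the last example above.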
DELETING FILES
______________

Archive entries are deleted with the "D" (Delete) command. For example, if you had an archive named "JUNK.ARC", and you wished to delete all entries in it with a filename extension of ".C", you could type:

    ARC d junk *.c

LISTING ARCHIVE ENTRIES
_______________________

You can obtain a list of the contents of an archive by using the "L" (List) command or the "V" (Verbose list) command. For example, to see what is in an archive named "JUNK.ARC", you could type:

    ARC l junk

If you are only interested in files with an extension of ".DOC", then you could type:

    ARC l junk *.doc

ARC prints a short listing of an archive's contents like this:

    Name          Length    Date
    ============  ========  =========
    ALPHA.TXT         6784  16 May 85
    BRAVO.TXT         2432  16 May 85
    COCO.TXT           256  16 May 85
            ====  ========
    Total      3      9472

"Name" is simply the name of the file.

"Length" is the unpacked file length. In other words, it is the number of bytes of disk space which the file would take up if it were extracted.

"Date" is the date on which the file had last been modified, as of the time when it was added to the archive.

"Total" is pretty obvious, I think.

ARC prints a verbose listing of an archive's contents like this:

    Name          Length    Stowage    SF   Size now  Date       Time    CRC
    ============  ========  ========  ====  ========  =========  ======  ====
    ALPHA.TXT         6784  Squeezed   35%      4413  16 May 85  11:53a  8708
    BRAVO.TXT         2432  Squeezed   41%      1438  16 May 85  11:53a  5BD6
    COCO.TXT           256  Packed      5%       244  16 May 85  11:53a  3AFB
            ====  ========            ====  ========
    Total      3      9472             27%      6095

"Name", "Length", and "Date" are the same as for a short listing.

"Stowage" is the compression method used. The following compression methods are currently employed:

    --        No compression.
    Packed    Runs of repeated byte values are collapsed.
    Squeezed  Huffman squeeze technique employed.
    Crunched  Lempel-Zev compression technique employed.

"SF" is the stowage factor.
In other words, it is the percentage of the file length which was saved by compression. The total stowage factor is the stowage factor for the archive as a whole, not counting archive overhead.

"Size now" is the number of bytes the file is occupying while in the archive.

"Time" is the time of last modification, and is associated with the date of last modification.

"CRC" is the CRC check value which has been stored with the file. Another CRC value will be calculated when the file is extracted or tested to ensure data integrity. There is no especially good reason for displaying this value.

PRINTING FILES
______________

Archive entries may be examined with the "P" (Print) command. This works the same as the Extract command, except that the files are not created on disk. Instead, the contents of the files are written to standard output. For example, if you wanted to see the contents of every ".TXT" file in an archive named "JUNK.ARC", but didn't want them saved on disk, you could type:

    ARC p junk *.txt

If you wanted them to be printed on your printer instead of on your screen, you could type:

    ARC p junk *.txt >prn

RUNNING FILES
_____________

Archive entries may be run without being extracted by use of the "R" (Run) command. For example, if you had an archive named "JUNK.ARC" which contained a file named "LEMON.COM", which you wished to run, you could type:

    ARC r junk lemon

You can run any file from an archive which has an extension of ".COM", ".EXE", ".BAT", or ".BAS". You do not have to specify the extension, but all matching files are run if you do not. In other words, if you had an archive named "JUNK.ARC" which contained the files "LEMON.COM", "LEMON.EXE", and "LEMON.BAS", and you typed:

    ARC r junk lemon

Then all three programs will be run. You can avoid this by specifying an extension in this case.

You cannot give arguments to a program you are running from an archive. Also, you will need a fair amount of memory to run a program from an archive.
It probably cannot be done with less than 256k.

In practice, the file to be run is extracted, run, and then deleted.

TESTING AN ARCHIVE
__________________

The integrity of an archive may be tested by use of the "T" (Test) command. This checks to make sure that all of the file headers are properly placed, and that all of the files are in good shape.

This can be very useful for critical archives, where data integrity must be assured. When an archive is tested, all of the entries in the archive are unpacked (without saving them anywhere) so that a CRC check value may be calculated and compared with the recorded CRC value.

For example, if you just received an archive named "JUNK.ARC" over a phone line, and you want to make sure that you received it properly, you could type:

    ARC t junk

It defeats the purpose of the T command to combine it with N or W.

CONVERTING AN ARCHIVE
_____________________

The "C" (Convert) command is used to convert an archive entry to take advantage of newer compression techniques. This is occasionally desirable when a new version of ARC is released. Please refer to the revision history section for details on when new compression methods were implemented.

For example, if you had an archive named "JUNK.ARC", and you wanted to make sure that all files with an extension of ".DOC" were encoded using the very latest methods, you could type:

    ARC c junk *.doc

Or if you wanted to convert every file in the archive, you could type:

    ARC c junk

ARC OPTIONS

This section describes the options which are available to modify how ARC works. Any of these options can be combined with any of the commands, though the result may not always be something you'd want to do.

SUPPRESSING COMPRESSION
_______________________

The "S" (Suppress compression) option can be combined with any command that updates archive entries. These include Add, Move, Update, Freshen, and Convert.
The effect of the S option is to prevent any compression techniques from being employed. This is intended to allow you to add a few files at a time to an archive quickly, and then later convert the archive to compress everything at once.

For example, over the course of a day you might give each of the following commands:

    ARC as junk *.txt
    ARC as junk *.mac
    ARC as junk *.doc

At the end of the day, when you have finished adding things to the archive, you could have all of the archive entries compressed at once by typing:

    ARC c junk

You could also decompress the archive by typing:

    ARC cs junk

though I can't imagine why you'd want to.

BACKUP RETENTION
________________

When ARC changes an archive (during an Add, Move, Update, Freshen, Delete, or Convert) it creates a new archive with the same name, but with an extension of ".$$$". For example, if you add a file to an archive named STUFF.ARC, then ARC will create a new archive named STUFF.$$$. ARC will read from your existing archive and write out the new archive with any changes to the ".$$$" copy.

Normally when ARC is finished it deletes the original and renames the new archive to the original name (i.e., STUFF.ARC goes away, and STUFF.$$$ becomes the new STUFF.ARC). Among other things, this means that if anything goes wrong and ARC is unable to finish, then your original archive will still be intact.

In some circumstances you may wish to retain the original version of the archive as a backup copy. You can do this easily by using the Backup option. Add the letter "B" to your command, and ARC will rename your original archive to have an extension of ".BAK" instead of deleting it.

In other words, if you wanted to add "WASTE.TXT" to an archive named "JUNK.ARC", but wanted to keep a backup copy, then you would type:

    ARC ab junk waste.txt

Your original archive would become "JUNK.BAK", while "JUNK.ARC" would contain the new "WASTE.TXT" file.
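
The write-then-rename sequence described in this section can be sketched in a few lines of Python. This is a minimal illustration of the idea, not ARC's own code: the function name and its arguments are hypothetical, and the whole new archive is passed in as a byte string purely to keep the example short.

```python
import os


def commit_archive(arc_path, new_contents, keep_backup=False):
    """Replace an archive by writing a ".$$$" copy and renaming it.

    arc_path     -- path of the archive, e.g. "JUNK.ARC"
    new_contents -- bytes of the updated archive (assumed already built)
    keep_backup  -- True behaves like the "B" option
    """
    root, _ext = os.path.splitext(arc_path)
    tmp_path = root + ".$$$"
    bak_path = root + ".BAK"

    # The original is untouched while the new copy is being written, so a
    # failure at this point leaves the old archive intact.
    with open(tmp_path, "wb") as tmp:
        tmp.write(new_contents)

    if os.path.exists(arc_path):
        if keep_backup:
            # An older backup copy is deleted to make room for the new one.
            if os.path.exists(bak_path):
                os.remove(bak_path)
            os.rename(arc_path, bak_path)
        else:
            os.remove(arc_path)

    os.rename(tmp_path, arc_path)
```

The point of the ordering is that the original archive is only deleted or renamed after the new copy has been written in full.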
If you keep a backup of an archive which already has a backup, then the older backup copy is deleted.

MESSAGE SUPPRESSION
___________________

ARC prints three types of messages: warnings, comments, and errors.

Warnings are messages about suspected error conditions, such as when a file to be extracted already exists, or when an extracted file fails the CRC error check. Warnings may be suppressed by use of the "W" (Warn) option. You should use this option sparingly. In fact, you should probably not use this option at all.

Comments (or notes) are informative messages, such as naming each file as it is added to the archive. Comments and notes may be suppressed by use of the "N" (Note) option.

Errors are actual system problems, such as running out of disk space. You cannot suppress errors.

For example, suppose you extracted all files with an extension of ".BAS" from an archive named "JUNK.ARC". Then, after making some changes which you decide not to keep, you decide that you want to extract them all again, but you don't want to be asked to confirm every one. In this case, you could type:

    ARC xw junk *.bas

Or, if you are going to add a hundred files with an extension of ".MSG" to an archive named "TRASH.ARC", and you don't want ARC to list them as it adds them, you could type:

    ARC an trash *.msg

Or, if you want to extract the entire contents of an archive named "JUNK.ARC", and you don't want to hear anything, then type:

    ARC xnw junk

ENCRYPTION/DECRYPTION
_____________________

Archive entries may be encrypted and decrypted by using the "G" (Garble) option. The Garble option takes the remainder of the command string as the password to use, so it must be the last option.
For example, if you wanted to add a file named "WASTE.TXT" to an archive named "JUNK.ARC", and you wanted to encrypt it using the password "DEBRIS", then you would type:

    ARC agdebris junk waste.txt

Later on, when you want to extract it again, you would type:

    ARC xgdebris junk waste.txt

The password you supply is used to encrypt (or decrypt) the archive entry by performing an exclusive OR between each byte of the packed data and each byte of the password. The password can be any length, and each of its bytes is used in rotation. The password is converted to uppercase before it is used, so it is not case sensitive.

Since the encryption is performed on the packed data, it has no effect on stowage factors. This is not a particularly sophisticated means of encryption, and it is theoretically possible to crack. Still, since it is performed on the packed data, the result should be quite sufficient for casual use.

You can, if you wish, use different passwords for different files in an archive, but we advise against it. If you are going to encrypt an archive, we suggest you use the same password for every file, and give the password whenever you do anything at all with the archive. It is possible to list the entries in an encrypted archive using the "L" and "V" commands without giving the password, but nothing else will work properly.

We advise that you use this option sparingly, if at all. If you should forget or mistype your password, it is highly unlikely that you will ever recover your data.

RAMDISK SUPPORT

If you have a RAMdisk, or other high-speed storage, then you can speed up ARC somewhat by telling it to put its temporary files on the RAMdisk. You do this by setting the ARCTEMP environment string with the MS-DOS SET command. For example, if drive B: is your RAMdisk, then you would type:

    set ARCTEMP=B:

Refer to the MS-DOS manual for more details about the SET command.
You need only set the ARCTEMP string once, and ARC will use it from then on until you change its value or reboot your system.

If ARC does not find an environment string named ARCTEMP, then it looks for one named TEMP to use instead. Several packages already use the TEMP string for exactly this purpose. If you have need of an environment string named TEMP for something else, then you should be sure to define ARCTEMP.

There are a limited number of temporary files created by ARC. The one most often used is "$ARCTEMP.CRN", which is created (if possible) when adding a file to an archive. The Convert command uses a file named "$ARCTEMP.CVT" to hold each file as it is being converted. The Run command also creates a temporary file, which has the name "$ARCTEMP", and whose extension matches that of the file being run.

MARC

MARC is a separate program which is used to merge archives created by ARC. MARC moves files from one archive to another without unpacking them.

MARC is used as follows:

    MARC <target> <source> [<template> . . .]

Where:

    <target> is the name of the archive to add files to.

    <source> is the name of the archive to read files from.

    <template> is one or more file name templates. The wildcard characters "*" and "?" may be used.

If no template is supplied, then all of the files in <source> are added to <target>.

It is not necessary for the target to exist. If it does not exist, then it is created. Thus, MARC can be used as an "extractor" as well as a "merger".
For example, if you wanted to create an archive named "JUNK.ARC", which is to contain all of the files with an extension of ".TXT" which are currently contained in another archive named "WASTE.ARC", then you could type:

    MARC junk waste *.txt

If you wanted to create an archive named "JUNK.ARC", which is to contain all of the files currently in the archives "WASTE.ARC" and "TRASH.ARC", you could type:

    MARC junk waste
    MARC junk trash

Though it would probably be faster to type:

    COPY waste.arc junk.arc
    MARC junk trash

If MARC is invoked with no arguments, then it gives brief directions in its use.

XARC

XARC is a separate program which is used to extract all files from one or more archives. It doesn't do anything that ARC doesn't do, and it isn't any faster, but it may be preferred in certain cases, as it is much smaller than ARC.

XARC is used as follows:

    XARC <arcname> . . .

Where <arcname> is the name of one or more archives. The wildcard characters "*" and "?" may be used. All files are extracted from the named archives.

For example, if you wanted to extract everything from two archives named "WASTE.ARC" and "JUNK.ARC", you could type:

    XARC waste junk

If you wanted to extract every file from every archive in a subdirectory named "TRASH", you could type:

    XARC trash\*.arc

If XARC is invoked with no arguments, then it gives brief directions in its use.

VERSION NUMBERS

There seems to be some confusion about our version numbering scheme. All of our version numbers are given as a number with two decimal places.

The units indicate a major revision, such as adding a new packing algorithm.

The first decimal place (tenths) indicates a minor revision that is not essential, but which may be desired.

The second decimal place (hundredths) indicates a trivial revision that will probably only be desired by specific individuals or by die-hard "latest version" fanatics.

ARC also displays its date and time of last edit.
A change of the date and time without a corresponding change in version number indicates a truly trivial change, such as fixing a spelling error.

To sum up: If the units change, then you should get the newer version as soon as you can. If the tenths change, then you may want to get the newer version, but there's no hurry. If anything else changes, then you probably shouldn't bother.

This is reflected by our own habit of referring to "version 4.5" instead of "version 4.52".

PROGRAM UPDATE SERVICE

A license to ARC entitles you to use all future versions of ARC. New versions are generally available through normal freeware distribution channels, and we prefer that you obtain them that way. However, many users of ARC have written to ask us about an update service.

A program disk containing the latest version is returned on every order of $50 or more. If you wish to purchase a single-user license and want an update disk, please enclose a check or money order for $50 instead of $35.

For a fee of $50 per year you can subscribe to our program update service. Subscribers get up to five program updates per year mailed to them as new versions come out. This does not include trivial releases.

COMMON QUESTIONS AND ANSWERS

Here are some of the more common questions we've received about ARC, along with their answers:

Q: Why do you bother with squeezing when crunching is so much faster?

A: Because crunching isn't really as fast as it looks. Crunching is a one pass operation, while squeezing requires two passes. ARC actually does the crunching during the analysis pass, and puts the crunched output in a file named "$ARCTEMP.CRN". If crunching turns out to be the best method, then this temporary file is copied into the new archive. In other words, when ARC says "crunching" it isn't really crunching, it's just copying a file.
Also, there are a lot of files out there that squeeze much better than they crunch.

Q: Why does ARC run out of room if I make an archive bigger than about 180k?

A: Because you are working on a floppy disk. ARC creates a copy of your archive, incorporating any new files as it goes. When it is done, it deletes the original and renames the new one. There are a number of reasons for doing it this way, one being that your original archive is still intact if anything happens while ARC is running.

ARC also needs room for the temporary file used in crunching. You can save some space by using the ARCTEMP environment string to put the temporary file on another disk, as well as using drive specifiers and having the archive and the files to add on separate disks, but you still won't be able to make an archive larger than about 180k. If you need to make a larger archive, and if you have a fixed disk, then you can create the archive on the fixed disk and then copy it to the floppy.

Q: I've seen an ARC.COM and an ARC.EXE. Which one is the right one?

A: ARC.EXE. One or more people have been running ARC through a utility that converts an ".EXE" file to a ".COM" file. But this utility is designed to save space, not speed. On ARC it saves about 250 bytes, and makes no measurable difference in program speed. We've decided that the savings are not worth the extra step in development in this case.

Q: Can I use ARC to distribute my public domain or freeware program?

A: Yes, of course.

Q: Can I use ARC to distribute my commercial software package?

A: Yes. Please contact us for a commercial distribution license.

Q: I'm a commercial user. Why should I pay for freeware that others get for free?

A: Because you cannot credibly plead poverty. Freeware, all freeware, is an attempt to develop a new marketing channel to the benefit of everyone.
You can still "test drive" freeware for a short period, but if you decide to use it in your business, then you should pay for it.

Q: Why not allow me to select which method of compression I want ARC to use?

A: It would needlessly complicate ARC, both internally and in use. The compression methods used are complex, and quite different from one another. The only sure way to tell which will be best in any given case is to analyze the data, as ARC does. The method chosen may not always be what you expect.

Q: How can I get the latest version of ARC?

A: ARC updates are distributed through normal freeware channels, and by FidoNet. We also ship a program update disk on every order of $50 or more. In addition, we offer an update subscription service. See the previous section for more details.

REVISION HISTORY

CHANGES IN VERSION 3
____________________

The function used to calculate the CRC check value in previous versions has been found to be in error. It has been replaced with the proper function. ARC will still read archives created with earlier versions of ARC, but it will report a warning that the CRC value is in error. All archives created prior to version 3.0 should be unpacked and repacked with the latest version of ARC.

Transmitting a file with XMODEM protocol rounds the size up to the next multiple of 128 bytes, adding garbage to the end of the file. This used to confuse ARC, causing it to think that the end of the archive was invalidly formatted. This has been corrected in version 3.01. Older archives may still be read, but ARC may report them to be improperly formatted. All files can be extracted, and no data is lost. In addition, ARC will automatically correct the problem when it is encountered.

CHANGES IN VERSION 4
____________________

ARC is adding another data compression technique in this version. We have been looking for some technique that could improve on Huffman squeezing in at least a few cases.
So far, Lempel-Ziv compression seems to be fulfilling our fondest hopes, often achieving compression rates as much as 20% better than squeezing, and sometimes even better.

Huffman squeezing depends on some bytes being more "popular" than others, taking the file as a whole. Lempel-Ziv compression is instead looking for strings of bytes which are repeated at various points (such as an end of line followed by spaces for indentation). Lempel-Ziv compression is therefore looking for repetition at a more "macro" level, often achieving impressive packing rates.

In the typical case a file is added to an archive once and then extracted many times, so the increased time for an update should more than pay for itself in saved disk space and reduced file transmission time.

As usual, ARC version 4.0 is completely upward compatible. That is, it can deal properly with any archive created by any earlier version of ARC. It is NOT reverse compatible. Archives created by ARC 4.0 will generally not be usable by earlier versions of ARC.

CHANGES IN VERSION 4.1
______________________

Lempel-Ziv coding has been improved somewhat by performing non-repeat compression on the data before it is coded (as was already done with Huffman squeezing). This has the twofold advantage of (a) reducing to some extent the amount of data to be encoded, and (b) increasing the time it takes for the string table to fill up. Performance gains are small, but noticeable.

The primary changes are in internal organization. ARC is now much "cleaner" inside. In addition to the aesthetic benefits to the author, this should make life easier for the hackers out there. There is also a slight, but not noticeable, improvement in overall speed when doing an update.

Version 4.1 is still fully upward compatible. But regretfully, it is again not downward compatible. Version 4.1 can handle any existing archive, but creates archives which older versions (including 4.0) cannot unpack.
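
The non-repeat compression pass mentioned under version 4.1 can be pictured with a small sketch. This is only an illustration in Python (ARC itself is written in C), and it makes assumptions that this manual does not confirm: the marker byte 0x90, the three-byte (byte, marker, count) form for a run, and the names pack_runs and unpack_runs are all hypothetical. The idea is simply that runs of identical bytes are collapsed before the data reaches the string-table coder, so the coder sees less data and its table fills more slowly.

```python
MARKER = 0x90  # assumed repeat marker; a hypothetical choice for this sketch


def pack_runs(data: bytes) -> bytes:
    """Collapse runs of 3 or more identical bytes into (byte, MARKER, count)."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        # Measure the run length, capped at 255 so the count fits in one byte.
        while i + run < len(data) and data[i + run] == byte and run < 255:
            run += 1
        if byte == MARKER:
            # A literal marker byte is escaped as (MARKER, 0), once per occurrence.
            out += bytes([MARKER, 0]) * run
        elif run >= 3:
            out += bytes([byte, MARKER, run])
        else:
            # Runs of 1 or 2 are cheaper to store as they are.
            out += bytes([byte]) * run
        i += run
    return bytes(out)


def unpack_runs(packed: bytes) -> bytes:
    """Reverse pack_runs."""
    out = bytearray()
    i = 0
    while i < len(packed):
        byte = packed[i]
        if byte == MARKER:
            count = packed[i + 1]
            if count == 0:
                out.append(MARKER)  # escaped literal marker
            else:
                # The run's first byte is already in out; add the rest.
                out += bytes([out[-1]]) * (count - 1)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)
```

Under these assumptions, packing the four bytes "aaaa" gives the three bytes "a", 0x90, 0x04, and a line indented with eight spaces shrinks by five bytes. Runs shorter than three bytes are left alone, since encoding them would not save anything.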
CHANGES IN VERSION 4.3
______________________

Version 4.3 adds the much-demanded feature of using pathnames when adding files to an archive. For obscure technical reasons, files being extracted still go in the current directory on the current drive. Pathnames are also not supported for any of the other commands, because it would make no sense.

Version 4.3 is also using a slightly different approach when adding a file to an archive. The end result is twofold:

1) Slightly more disk space is required on the drive containing the archive. This should only be noticeable to those creating very large archives on a floppy based system.

2) A 30% reduction in packing time has been achieved in most cases. This should be noticeable to everyone.

As always, version 4.3 is still fully upwards compatible, and is backwards compatible as far as version 4.1.

CHANGES IN VERSION 4.4
______________________

The temporary file introduced in version 4.3 occasionally caused problems for people who had not added a FILES= statement to their CONFIG.SYS file. This has now been corrected. Also, support of the ARCTEMP environment string was added to allow placing of the temporary file on a RAMdisk.

A bug was reported in the Run command, which has been fixed. From the nature of the bug, and the extreme time required before the bug was reported, it is deduced that the Run command is probably the least used feature of ARC.

The Update command was changed. It is no longer a straight synonym for Add. Instead, Update now only adds a file if it is newer than the version already in the archive, as shown by the MS-DOS date/time stamp.

CHANGES IN VERSION 4.5
______________________

The Convert command was not making use of RAMdisk support. Now it is.

The Freshen command was added. Our first choice for a name was Refresh, but we already had a Run command.
Assuming that you have an archive which already contains everything you want in it (for software distribution, perhaps), then Freshen would be used to update the archive. It was pointed out to us that ARC already knows what is in the archive, so it should be able to look on disk for newer versions. Now it can.

The Suppress compression option was added by popular demand. It allows files to be added quickly to an archive, since the files are not analyzed or compressed, but merely stored straight. The intent is to allow users to build an archive "in pieces", and then compress all of the entries at once with the Convert command. The conversion is much faster if you take advantage of RAMdisk support.

A minor bug was detected in our handling of date/time stamps which occasionally resulted in stamping an archive with the wrong date and time. This has been corrected.

CHANGES IN VERSION 5.0
______________________

Several users of ARC have written us to suggest that we should put an order form in the documentation. It seems that several types of organizations require something like that or they cannot pay no matter how much they'd like to. While doing that, we also went over the documentation from top to bottom and "slicked it up". It's now a bit more of an actual manual. We think you'll appreciate the table of contents as well; we sure do!

The Move command used to delete the files as it went. It now waits until it is finished updating the archive, and deletes them all at once. (You did know that Move is just an Add where the file gets deleted, didn't you?) This, along with the changes made in version 4.5, means that it is now much safer to interrupt ARC while it is working.

The Print command no longer prints the name of each file. Instead, it prints a formfeed after each file.

The Run command now supports BASICA programs. Also, the filename extension is no longer required on the Run command.

The Garble option was added.
It provides a convenient means of low level data security for the casual user. Use it sparingly, if at all.

ARC no longer tests for the presence of $ARCTEMP.CRN before creating a new one. If you interrupt ARC a lot, you'll find this much more convenient. If you happen to have a file named $ARCTEMP.CRN which you want to keep, too bad.

Improved error recovery was added when reading an archive. ARC now has a good chance of recovering the data from a corrupted archive (the corrupted entry is still lost, of course).

Path support has been added for all commands, though it doesn't do anything on most of them. For example, there isn't much we can do with a path in the List command. But many users will be glad to know that a path can be used when extracting a file, and specifies where the file is to be placed.

Support for the TEMP environment string was added. If ARC doesn't find an environment string named ARCTEMP, then it looks for one named TEMP to use instead. Several packages already use the TEMP string for exactly this purpose. With any luck, maybe we can get a standard going.

ARC is now using a different variation of Lempel-Ziv coding, courtesy of Kent Williams, who found it on USENET and adapted it to the IBM PC. The new method differs from the old in several respects. The most significant differences are:

1) Where our previous implementation used a fixed code size of twelve bits, the new one starts with a code size of nine bits and increases it as it needs to.

2) The earlier method tended to "choke" on large files when its string table filled up. The new method has a rather ingenious scheme its authors call adaptive reset. When it notices that its string table has filled, and its compression ratio is starting to suffer, it clears the table and starts defining new strings.

Our benchmarks show an improvement in compression on the order of 10% when crunching is used.
Additionally, ARC 5.0 is on the order of 23% faster at adding a file when crunching is used, or 13% faster when squeezing is used. Extracting a file crunched with the new method is 27% faster than it is with the old method. Extraction of any other type of file (including those crunched with the older method) is no faster than before. These figures are based on our own benchmark tests; your results may vary.

The previous implementation of Lempel-Ziv coding is no longer used to pack files. The "V" (Verbose listing) command distinguishes between the two by referring to the older method as "crunched" (with a lowercase "c"), and the newer method as "Crunched" (with a capital "C").

ARC 5.0 can still read archives created by earlier versions of ARC, but once again it creates archives which older versions cannot read.

PROGRAM HISTORY AND CREDITS

In its short life thus far, ARC has astounded us with its popularity. We first wrote it in March of 1985 because we wanted an archive utility that used a distributive directory approach, since this has certain advantages over the more popular central directory approach. We added automatic squeezing in version 2 at the prompting of a friend. In version 2.1 we added the code to test for the best compression method. Now (in October of 1985) we find that our humble little program has spread across the country, and seems to have become a new institution.

We are thankful for the support and appreciation we have received. We hope that you find this program of use.

If we have achieved greatness, it is because we have stood upon the shoulders of giants. Nothing is created as a thing unto itself, and ARC is no exception. Therefore, we would like to give credit to the following people, without whose efforts ARC could not exist:

Brian W. Kernighan and P. J. Plauger, whose book "Software Tools" provided many of the ideas behind the distributive directory approach used by ARC.
Dick Greenlaw, who wrote the public domain SQ and USQ programs, in which the Huffman squeezing algorithm was first developed.
Robert J. Beilstein, who adapted SQ and USQ to Computer Innovations C86 (the language we use), thus providing us with important parts of our squeezing logic.
Kent Williams, who graciously allowed us to use his LZWCOM and LZWUNC programs as a basis for our Lempel-Ziv compression logic, and who continues to make valuable contributions.
David Schwaderer, whose article in the April 1985 issue of PC Tech Journal provided us with the logic for calculating the CRC 16 bit polynomial.
Terry A. Welch, whose article "A Technique for High Performance Data Compression", IEEE Computer Vol 17 No 6 (June 1984) seems to have started all the research on Lempel-Ziv coding.
Spencer W. Thomas, Jim McKie, Steve Davies, Ken Turkowski, James A. Woods, and Joe Orost, who are the authors of the UNIX compress utility.
And many, many others whom we could not identify.

SITE LICENSE

Corporate users may wish to obtain a site license for the use of ARC. Please use the order form in this manual to order a site license. Site licenses are granted as of when we receive your payment. License fees vary depending on the number of computers on which ARC will be used, as follows:

     1 to  9 copies     $35 each
    10 to 24 copies     $25 each
    25 to 49 copies     $20 each
    50 to 99 copies     $15 each
    over  99 copies     $1500 one time fee

The following page is a site license agreement, which should be signed and sent with your payment when ordering a commercial site license.

The use of ARC in a commercial environment or government organization is granted under the following terms:

1. Payment of the license fee must be made to System Enhancement Associates. The fee is based on the number of computers which will be used to run ARC, as follows:

     1 to  9 copies     $35 each
    10 to 24 copies     $25 each
    25 to 49 copies     $20 each
    50 to 99 copies     $15 each
    over  99 copies     $1500 one time fee

2.
You may use ARC on the number of computers included in the license fee. If you have paid the fee for over 99 copies, then you may use ARC on any number of computers within your organization.

3. You may make copies of the program, in its original, unmodified form, without restriction. You may distribute these copies of the program without restriction.

4. If these copies are distributed outside of your organization, you have no obligation to control the use of those copies which are outside of your organization.

5. You may make copies of the program documentation, in both its printed form and machine readable form, without restriction.

6. You may use future versions of ARC under this license. The latest version is available from System Enhancement Associates for a small service charge.

7. You may NOT modify the program or charge a fee for copying or distributing the program.

8. It is your responsibility to make the necessary copies and to deliver them to the computers which they will be used on.

I agree to abide by the terms and conditions of this license.

_____________________________    __________________________
Signature                        Date

_____________________________
Name (please print or type)

_____________________________
Title

_____________________________
Company

ORDER FORM

Check which items you wish to purchase:

(_) Noncommercial license for the use of ARC.
(_) Commercial license for the use of ARC on ___ computers (see price schedule and terms on preceding page).
(_) Program disk and documentation (only on orders of $50 or more).
(_) Program update subscription service (not more than five updates, does not include trivial changes), $50/year.
(_) Payment of $_____ is enclosed (check or money order).
(_) Please charge $_____ to my (_) Visa or (_) MasterCard:

    Card number: ______________________________

    Expiration date: __________________________

_______________________________________________
Name

_______________________________________________

_______________________________________________
Address

______________________   ________   ____________
City                     State      Zip

_______/_______
FidoNet address

Send this completed form to:

    System Enhancement Associates
    21 New Street
    Wayne, NJ 07470

For program disk orders outside the U.S., please add an additional $5, and enclose an international money order payable in U.S. currency.

For commercial site license orders, please enclose a signed copy of the site license agreement.