tgcpwd@rc3.urc.tue.nl (Wim van Dorst) (07/07/90)
Hello *,
I have tested the newly posted lharc and comic and compared them with
the old compress. I took a large binary file (kermit), a large source
file (three concatenated floppy.c) and an archive file (ar q wmail/*,
thus bin source object doc etc). I noted down the comparative sizes,
the time it took to compress and to decompress. Lo and behold:
Original compress lharc comic
binary
size (bytes) 63920 50539 34875 34222
size (%) 100 79 55 54
relative (%) 100 69 68
compr time (sec) 56 135 1207
decompr time (sec) 34 66 50
source
size (bytes) 114114 48324 40625 37542
size (%) 100 42 36 33
relative (%) 100 84 78
compr time (sec) 71 224 1781
decompr time (sec) 40 76 65
archive
size (bytes) 151896 95007 64762 61669
size (%) 100 63 43 41
relative (%) 100 68 65
compr time (sec) 128 290 2353
decompr time (sec) 71 121 98
The conclusions are obvious (to me :-):
When you're keen on speed and have plenty of disk media:
do not compress at all.
When you're keen on speed, but rather low on disk media:
use oldy-but-goody compress
When you have hardly any space on disk, when you've to use these very
small 360kB floppy for archiving like I, but you do have some computer
time to spare (e.g.in the small hours), like I:
use comic
When you want a reasonable inbetween:
use lharc
_My_ choice is definitively comic, but I must say _each_ program
has its own advantages.
ps. Due to official regulations which I cannot sufficiently influence,
I seem to have trouble sending and receiving mail. I assume though
that the .urc.tue.nl domain should work properly for me now.
Met vriendelijke groeten, Wim 'Blue Baron' van Dorst
------------------------------------------------------------------
Blue Baron = Wim van Dorst, voice (+31) 074-443937 and 02152-42319
baron@wiesje.mug.hobby.nl or tgcpwd@eutrc3.urc.tue.nl
"This sentence have three erors."williams@umaxc.weeg.uiowa.edu (Kent Williams) (07/08/90)
I compiled comic on an Encore, and it appears to work. It is V E R Y slow compressing tars -- I didn't have the patience to let it run to completion while I waited. File sizes are impressive: -rw------- 1 williams 81920 Jul 7 00:56 comic.tar -rw------- 1 williams 31119 Jul 7 00:47 comic.tar.Z -rw------- 1 williams 21696 Jul 7 01:06 comic.tar-X I suspect the slowness comes in compressing the piles of nulls that tar pads everything with. Can the author summarize the algorithm used? It looks as though too much time is spent in looking for stuff -- maybe hashing could be used for one or another step to speed things up. -- Kent Williams 'Look, I can understand "teenage mutant ninja williams@umaxc.weeg.uiowa.edu turtles", but I can't understand "mutually williams@herky.cs.uiowa.edu recursive inline functions".' - Paul Chisholm
evans@ditsyda.oz (Bruce.Evans) (07/10/90)
In article <1815@ns-mx.uiowa.edu> williams@umaxc.weeg.uiowa.edu.UUCP (Kent Williams) writes: > >Can the author summarize the algorithm used? It looks as though too much >time is spent in looking for stuff -- maybe hashing could be used for >one or another step to speed things up. Profiling the 16-bit version gave the following (for 18.8 sec to compress comic.c): 10.042 10042 53.31% _memrchr ******************************************* 4.801 4801 25.48% _get2pai ********************* 1.735 1735 9.21% _getp1 ******* 0.524 524 2.78% _cdsret ** 0.238 238 1.26% _eqlen * ... 18.093 18093 96.06% TOTAL IN RANGE 0.742 742 3.93% OTHER 18.835 18835 100.00% TOTAL That's with memrchr in assembler! memrchr is used to find an anchor for a string lookup. Hashing just on the first char would reduce this step considerably. I don't know how expensive maintaining the hash table would be. Here are some times for the compression programs available to me on a 20MHz 386. I/O times are significant for the faster programs (5 to 10 sec). --- Times for packing the Minix dictionary (407564 bytes). Program compression time un-time O/S compiler ------------------------------------------------------------------------ pkzip1.10 -es 60% 9 6 DOS ? pkpak3.61 59% 9 6 DOS ? compress4.01 -b 13 57% 14 12 minix-3 gcc1.36 compress4.01 -b 16 58% 15 12 minix-3 gcc1.36 zoo2.01 59% 20 17 minix-3 gcc1.37 compress4.01 -b 13 57% 27 14 DOS msc 4.0 compress-minix1.3 57% 41 23 minix cc1.3 compress4.01 -b 13 57% 45 23 minix bcc1.0 pkzip1.01 66% 48 6 DOS ? pak2.10 67% 52 9 DOS ? compress-??? -b 16 58% 83 43 DOS ? lharc1.13c 67% 89 13 DOS ? lharc1.00 67% 168 17 minix-3 gcc1.37 lharc1.00 67% 275 36 minix bcc1.0 comic 67% 1504 20 minix-3 gcc1.37 comic 67% 1782 34 minix bcc1.0 -- Bruce Evans evans@ditsyda.syd.dit.csiro.au