tgcpwd@rc3.urc.tue.nl (Wim van Dorst) (07/07/90)
Hello *,

I have tested the newly posted lharc and comic and compared them with the old compress. I took a large binary file (kermit), a large source file (three concatenated floppy.c) and an archive file (ar q wmail/*, thus bin, source, object, doc etc.). I noted down the comparative sizes and the time it took to compress and to decompress. Lo and behold:

                        Original  compress     lharc     comic
binary
  size (bytes)             63920     50539     34875     34222
  size (%)                   100        79        55        54
  relative (%)                         100        69        68
  compr time (sec)                      56       135      1207
  decompr time (sec)                    34        66        50
source
  size (bytes)            114114     48324     40625     37542
  size (%)                   100        42        36        33
  relative (%)                         100        84        78
  compr time (sec)                      71       224      1781
  decompr time (sec)                    40        76        65
archive
  size (bytes)            151896     95007     64762     61669
  size (%)                   100        63        43        41
  relative (%)                         100        68        65
  compr time (sec)                     128       290      2353
  decompr time (sec)                    71       121        98

The conclusions are obvious (to me :-):

When you're keen on speed and have plenty of disk media: do not compress at all.

When you're keen on speed, but rather low on disk media: use oldy-but-goody compress.

When you have hardly any space on disk, when you have to use these very small 360kB floppies for archiving like I do, but you do have some computer time to spare (e.g. in the small hours), like I do: use comic.

When you want a reasonable in-between: use lharc.

_My_ choice is definitely comic, but I must say _each_ program has its own advantages.

ps. Due to official regulations which I cannot sufficiently influence, I seem to have trouble sending and receiving mail. I assume though that the .urc.tue.nl domain should work properly for me now.

With kind regards,
Wim 'Blue Baron' van Dorst

------------------------------------------------------------------
Blue Baron = Wim van Dorst, voice (+31) 074-443937 and 02152-42319
baron@wiesje.mug.hobby.nl or tgcpwd@eutrc3.urc.tue.nl
"This sentence have three erors."
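
The "size (%)" and "relative (%)" rows in the table above appear to be plain ratios: each compressed size as a percentage of the original, and each compressed size as a percentage of what compress achieved. A minimal Python sketch that reproduces them from the posted byte counts follows; the helper name `pct` and the `sizes` dictionary are only illustrative, and whole-number rounding is an assumption that happens to match every posted figure.

```python
def pct(part, whole):
    """Return part as a whole-number percentage of whole."""
    return round(100 * part / whole)

# Byte counts copied from the table: (original, compress, lharc, comic)
sizes = {
    "binary":  (63920, 50539, 34875, 34222),
    "source":  (114114, 48324, 40625, 37542),
    "archive": (151896, 95007, 64762, 61669),
}

for name, (original, compress, lharc, comic) in sizes.items():
    # "size (%)": every column measured against the uncompressed file
    size_row = [pct(n, original) for n in (original, compress, lharc, comic)]
    # "relative (%)": lharc and comic measured against compress's output
    relative_row = [pct(n, compress) for n in (compress, lharc, comic)]
    print(name, "size (%):", size_row, "relative (%):", relative_row)
```

Read this way, the "relative" row says that comic's output is roughly two thirds the size of compress's output on the binary and archive files, and a little over three quarters on the source file.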
williams@umaxc.weeg.uiowa.edu (Kent Williams) (07/08/90)
I compiled comic on an Encore, and it appears to work. It is V E R Y slow compressing tars -- I didn't have the patience to let it run to completion while I waited. File sizes are impressive:

-rw-------  1 williams    81920 Jul  7 00:56 comic.tar
-rw-------  1 williams    31119 Jul  7 00:47 comic.tar.Z
-rw-------  1 williams    21696 Jul  7 01:06 comic.tar-X

I suspect the slowness comes in compressing the piles of nulls that tar pads everything with.

Can the author summarize the algorithm used? It looks as though too much time is spent in looking for stuff -- maybe hashing could be used for one or another step to speed things up.

--
Kent Williams                   'Look, I can understand "teenage mutant ninja
williams@umaxc.weeg.uiowa.edu   turtles", but I can't understand "mutually
williams@herky.cs.uiowa.edu     recursive inline functions".' - Paul Chisholm
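
One way to check the suspicion about tar padding is to measure how much of an archive really is NUL bytes. The sketch below is only an illustration of that measurement: it builds a tiny archive in memory with Python's standard `tarfile` module (not the tar used in the post), and the helper name `null_fraction` and the file name `hello.txt` are made up for the example. It says nothing about how comic itself handles runs of NULs.

```python
import io
import tarfile

def null_fraction(data):
    """Return the fraction of bytes in data that are NUL (0x00)."""
    return data.count(0) / len(data) if data else 0.0

# Build a small tar archive in memory holding one short file.
payload = b"hello, world\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo(name="hello.txt")  # hypothetical member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

archive = buf.getvalue()
# tar stores everything in 512-byte blocks and pads the remainder with NULs.
print(len(archive), "bytes,", round(100 * null_fraction(archive)), "% NUL")
```

For a real archive, reading the file with `open(path, "rb").read()` and passing the bytes to `null_fraction` gives the same measurement; an archive of many small files will show a much higher share of padding than one large file.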
evans@ditsyda.oz (Bruce.Evans) (07/10/90)
In article <1815@ns-mx.uiowa.edu> williams@umaxc.weeg.uiowa.edu.UUCP (Kent Williams) writes:
>
>Can the author summarize the algorithm used? It looks as though too much
>time is spent in looking for stuff -- maybe hashing could be used for
>one or another step to speed things up.

Profiling the 16-bit version gave the following (for 18.8 sec to compress comic.c):

  10.042   10042   53.31%  _memrchr  *******************************************
   4.801    4801   25.48%  _get2pai  *********************
   1.735    1735    9.21%  _getp1    *******
   0.524     524    2.78%  _cdsret   **
   0.238     238    1.26%  _eqlen    *
  ...
  18.093   18093   96.06%  TOTAL IN RANGE
   0.742     742    3.93%  OTHER
  18.835   18835  100.00%  TOTAL

That's with memrchr in assembler! memrchr is used to find an anchor for a string lookup. Hashing just on the first char would reduce this step considerably. I don't know how expensive maintaining the hash table would be.

Here are some times for the compression programs available to me on a 20MHz 386. I/O times are significant for the faster programs (5 to 10 sec).

--- Times for packing the Minix dictionary (407564 bytes).

Program                compression   time  un-time  O/S      compiler
------------------------------------------------------------------------
pkzip1.10 -es                  60%      9        6  DOS      ?
pkpak3.61                      59%      9        6  DOS      ?
compress4.01 -b 13             57%     14       12  minix-3  gcc1.36
compress4.01 -b 16             58%     15       12  minix-3  gcc1.36
zoo2.01                        59%     20       17  minix-3  gcc1.37
compress4.01 -b 13             57%     27       14  DOS      msc 4.0
compress-minix1.3              57%     41       23  minix    cc1.3
compress4.01 -b 13             57%     45       23  minix    bcc1.0
pkzip1.01                      66%     48        6  DOS      ?
pak2.10                        67%     52        9  DOS      ?
compress-??? -b 16             58%     83       43  DOS      ?
lharc1.13c                     67%     89       13  DOS      ?
lharc1.00                      67%    168       17  minix-3  gcc1.37
lharc1.00                      67%    275       36  minix    bcc1.0
comic                          67%   1504       20  minix-3  gcc1.37
comic                          67%   1782       34  minix    bcc1.0

--
Bruce Evans
evans@ditsyda.syd.dit.csiro.au