root@rdb1.UUCP (Robert Barrell) (10/10/90)
Is there any way, or any library of routines, which would automatically
compress data when it is stored into a file, and uncompress it when it is
retrieved? I'd like to be able to read and write compressed files directly
without having to run them through uncompress or zcat all the time, and
still have file positioning work relative to the uncompressed data.

Is such a thing possible? It may be necessary to write a separate library
to handle it, such as Zopen(), Zclose(), Zprintf(), Zscanf(), Zread(),
Zwrite(), Zseek(), etc. This may be a difficult or impossible task if one
only uses the existing library calls, and pipes to compress, but would
someone with a thorough knowledge of compression algorithms be able to do
it?

--
Robert Barrell      | rbarrell@rdb1.UUCP          | Phillips Consulting Group
Milo's Meadow BBS   | uunet!pcgbase!rdb1!rbarrell | 282 North Shore Drive
login: bbs or nuucp | "... Pooh just IS."         | Ormond Beach, FL 32176
(904) 441-5028      |    -- The Tao of Pooh       | (904) 672 - 3856
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/11/90)
In article <5@rdb1.UUCP> root@rdb1.UUCP (Robert Barrell) writes:
> Is there any way, or any library of routines, which would automatically
> compress data when it is stored into a file, and uncompress it when it is
> retrieved?

It shouldn't be hard to stick this into any filesystem implemented
outside the kernel. You store all files with a compression type: either
no compression or a choice of available methods. You keep an MRU cache
of uncompressed files, including all the files open at the moment. You
might keep a priority queue of LRU files to switch from uncompressed to
compressed, or you might have all files compressed immediately upon
close().

Anyone want to try to stick this into RFS?

Note that making this sort of transparent change becomes very, very
difficult if the filesystem is hidden inside the kernel. comp.std.unix
readers know what I'm referring to.

---Dan
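[Editorial sketch, not part of the original post.] The bookkeeping Dan
describes could look roughly like the following: each file records how it is
currently stored, and the least recently used file that is both uncompressed
and closed is picked as the next one to compress. All names here (struct
zfile, Z_LZW, z_touch, z_pick_victim) are hypothetical, a linear scan stands
in for the priority queue, and no actual compression is performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compression types; Z_NONE means "stored uncompressed". */
enum ztype { Z_NONE, Z_LZW };

/* Hypothetical per-file record kept by a user-level filesystem. */
struct zfile {
    const char   *name;
    enum ztype    stored_as;   /* how the file is currently stored */
    int           open_count;  /* > 0 while someone has the file open */
    unsigned long last_use;    /* logical clock value of the last access */
};

static unsigned long zclock;   /* logical clock, bumped on every access */

/* Record an access so the file counts as most recently used. */
void z_touch(struct zfile *f)
{
    assert(f != NULL);
    f->last_use = ++zclock;
}

/*
 * Pick the file to compress next: the least recently used entry that is
 * currently uncompressed and not open. Returns NULL if there is none.
 * A real implementation would likely use a priority queue instead of
 * scanning the whole table.
 */
struct zfile *z_pick_victim(struct zfile *tab, size_t n)
{
    struct zfile *victim = NULL;
    for (size_t i = 0; i < n; i++) {
        struct zfile *f = &tab[i];
        if (f->stored_as != Z_NONE || f->open_count > 0)
            continue;          /* already compressed, or still in use */
        if (victim == NULL || f->last_use < victim->last_use)
            victim = f;
    }
    return victim;
}
```

The "compress immediately upon close()" variant would skip the scan and
compress a file as soon as its open_count drops to zero.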
mju@mudos.ann-arbor.mi.us (Marc Unangst) (10/12/90)
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
[automatically compressing files transparently]
> Anyone want to try to stick this into RFS?

This is a bad, bad idea, for the same reason that compressing backups
before writing them to tape is a bad idea. The difficulty of recovering
a trashed filesystem increases by several orders of magnitude when you
need to reconstruct a compressed file; in fact, I'd say it's almost
impossible to recover the undamaged portions of a compressed file
(especially if it was the key table that got trashed).

Maybe your disks never fail. But mine do on occasion, and I like to
have at least half a chance of bringing my data back.

--
Marc Unangst               |
mju@mudos.ann-arbor.mi.us  | "Bus error: passengers dumped"
...!umich!leebai!mudos!mju |
root@rdb1.UUCP (Robert Barrell) (10/12/90)
In article <23653:Oct1019:40:1190@kramden.acf.nyu.edu>,
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> It shouldn't be hard to stick this into any filesystem implemented
> outside the kernel. You store all files with a compression type: either
> no compression or a choice of available methods. You keep an MRU cache
> of uncompressed files, including all the files open at the moment. You
> might keep a priority queue of LRU files to switch from uncompressed to
> compressed, or you might have all files compressed immediately upon
> close().

Dan,

I understand the gist of what you said, but also realize that such a
filesystem implementation is far beyond my current knowledge of *nix. Even
so, MUST such a thing be incorporated into a filesystem? Is it not possible
for appropriate library routines to handle the [un]compression then hand
the data to other, lower-level file i/o routines? Or is that, by
definition, a "filesystem implemented outside the kernel?"

The whole key to what I want is to eliminate the need for an entire file
to exist in its uncompressed form on the disk at any time. Rather than
taking the time and disk space to uncompress a file before accessing it,
then just compressing it again when finished, I'd like to see routines
where I'd be able to say:

        Zfseek(fp,1000L,0);
        Zgets(string,101,fp);

and have the file position to the 1000th byte of the uncompressed file
instead of the compressed file.

At the moment, when I only need to read information from compressed
files, without having to seek or write, I use:

        sprintf(cmd,"zcat %s",filename);
        fp = popen(cmd,"r");
        ...

which works fine. The problem arises when I want to try to seek,
especially if I wish to seek backwards into the file.

Your ideas sound wonderful, and such an implementation would mean that
all the regular library commands, dbm functions, etc. could be used
directly. Still, isn't there possibly a simpler interim solution?

Of course, if anyone out there CAN and DOES implement something which
works either way, I'd like to see it.

--
Robert Barrell      | rbarrell@rdb1.UUCP          | Phillips Consulting Group
Milo's Meadow BBS   | uunet!pcgbase!rdb1!rbarrell | 282 North Shore Drive
login: bbs or nuucp | "... Pooh just IS."         | Ormond Beach, FL 32176
(904) 441-5028      |    -- The Tao of Pooh       | (904) 672 - 3856
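[Editorial sketch, not part of the original post.] One possible interim
approach, staying with the popen() technique above, is to emulate seeking on
the read side: a forward seek reads and discards decompressed bytes, and a
backward seek restarts the decompressor and reads forward again. The names
ZR, zr_open, zr_seek, zr_gets and zr_close are hypothetical, and the code
assumes a POSIX system with popen()/pclose(). It is read-only and a backward
seek costs a full re-read from the start, so treat it as an illustration of
the idea rather than the Zfseek()/Zgets() library asked for.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for popen()/pclose() */
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical read-only handle over a decompressing pipe. */
typedef struct {
    FILE *fp;         /* pipe from the decompressor, or NULL */
    char  cmd[1024];  /* shell command, kept so it can be restarted */
    long  pos;        /* current offset in the UNCOMPRESSED data */
} ZR;

/* Start (or restart) the decompressor at offset 0; 0 on success. */
static int zr_restart(ZR *z)
{
    if (z->fp != NULL)
        pclose(z->fp);
    z->fp = popen(z->cmd, "r");
    z->pos = 0;
    return z->fp != NULL ? 0 : -1;
}

/* tool is e.g. "zcat". The filename is single-quoted for the shell,
 * which is not safe for names that themselves contain a quote. */
int zr_open(ZR *z, const char *tool, const char *filename)
{
    assert(z != NULL && tool != NULL && filename != NULL);
    z->fp = NULL;
    int n = snprintf(z->cmd, sizeof z->cmd, "%s '%s'", tool, filename);
    if (n < 0 || (size_t)n >= sizeof z->cmd)
        return -1;
    return zr_restart(z);
}

/* Position at an absolute uncompressed offset; 0 on success, -1 on
 * error or if the data ends before the target is reached. */
int zr_seek(ZR *z, long target)
{
    char buf[4096];
    if (z->fp == NULL || target < 0)
        return -1;
    if (target < z->pos && zr_restart(z) != 0)
        return -1;                     /* backward: start over */
    while (z->pos < target) {          /* forward: read and discard */
        long want = target - z->pos;
        if (want > (long)sizeof buf)
            want = (long)sizeof buf;
        size_t got = fread(buf, 1, (size_t)want, z->fp);
        if (got == 0)
            return -1;
        z->pos += (long)got;
    }
    return 0;
}

/* Like fgets(), but keeps pos up to date (text data without NUL bytes). */
char *zr_gets(char *s, int size, ZR *z)
{
    if (z->fp == NULL || fgets(s, size, z->fp) == NULL)
        return NULL;
    z->pos += (long)strlen(s);
    return s;
}

void zr_close(ZR *z)
{
    if (z->fp != NULL)
        pclose(z->fp);
    z->fp = NULL;
}
```

With these, the two-line example in the post would read roughly as
zr_open(&z, "zcat", filename); zr_seek(&z, 1000L); zr_gets(string, 101, &z);
Writing, and seeking without re-reading, would need a different on-disk
layout and are not addressed by this sketch.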