[comp.sys.sgi] compressed files

trohling@uceng.UC.EDU (tom rohling) (01/17/90)

     I have recently discovered the compress utility on UNIX machines and 
it has turned out to be a wonderful way of keeping down the disk usage and
the file transfer times on our 120GTX.  Now for the question:

     Can a compressed file be accessed through Fortran (or C) much in the 
same way that 'zcat' uncompresses the file to stdout but leaves the file 
in its compressed state?  I.e., can I read the contents of a compressed file
from a program without having to uncompress it first?  Sort of like zcat'ing 
it into RAM, where my program can get at it, without creating a file out 
of it and taking up all that space.

     We have these rather large files (20 Meg uncompressed) we are using 
in some CFD calculations, and a lot of the time there isn't enough room on 
the disk to uncompress all the files and run the program for a while and 
still leave enough disk space for other users.  Now I know we should just 
forget all this and go buy another disk, but if this can be done it could 
save a lot of space for a lot of people on other machines where you can't 
'just go buy another disk' (like a Cray where they charge you for the
space you use).

By the way, was my audio suggestion for the Power and Pro series taken 
seriously anywhere?

Tom Rohling
trohling@uceng.uc.edu

"Nothing is impossible, it's just not possible yet"  -Myself

ciemo@bananapc.wpd.sgi.com (Dave Ciemiewicz) (01/18/90)

In article <3339@uceng.UC.EDU>, trohling@uceng.UC.EDU (tom rohling) writes:
> 
>      I have recently discovered the compress utility on UNIX machines and 
> it has turned out to be a wonderful way of keeping down the disk usage and
> the file transfer times on our 120GTX.  Now for the question:
> 
>      Can a compressed file be accessed through Fortran (or C) much in the 
> same way that 'zcat' uncompresses the file to stdout but leaves the file 
> in its compressed state?  I.e., can I read the contents of a compressed file
> from a program without having to uncompress it first?  Sort of like zcat'ing 
> it into RAM, where my program can get at it, without creating a file out 
> of it and taking up all that space.
> 

No, FORTRAN and C do not support a zcat file type for reading or writing.
However, you should be able to use the system(3F) (FORTRAN version) or
system(3S) (C version) call to 'uncompress' the file before opening the
file:

	call system('uncompress file')

Of course, what you really want to do is create a string which has the
uncompress command and your filename in it.  My FORTRAN is too rusty
to try to illustrate the concatenation.

Just after you close the file, you may want to compress the file again:

	call system('compress file')
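
For the C side, the string building mentioned above might look like the following.  This is only a minimal sketch: build_cmd and run_on_file are hypothetical names, not library routines, and it assumes a C compiler that provides snprintf.  The file name is wrapped in single quotes for the shell, but a name that itself contains a single quote is not handled.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write "<tool> '<file>'" into buf.
 * Returns 0 on success, -1 if the result would not fit in size bytes. */
int build_cmd(char *buf, size_t size, const char *tool, const char *file)
{
    int n = snprintf(buf, size, "%s '%s'", tool, file);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: run tool on file through the shell.
 * Returns whatever system() returns, or -1 if the command was too long. */
int run_on_file(const char *tool, const char *file)
{
    char cmd[512];

    if (build_cmd(cmd, sizeof cmd, tool, file) != 0)
        return -1;
    return system(cmd);
}
```

With these, the program would call run_on_file("uncompress", "file") before fopen and run_on_file("compress", "file") after fclose.  The disk still has to hold the uncompressed copy while the file is open.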

>      We have these rather large files (20 Meg uncompressed) we are using 
> in some CFD calculations, and a lot of the time there isn't enough room on 
> the disk to uncompress all the files and run the program for a while and 
> still leave enough disk space for other users.  Now I know we should just 
> forget all this and go buy another disk, but if this can be done it could 
> save a lot of space for a lot of people on other machines where you can't 
> 'just go buy another disk' (like a Cray where they charge you for the
> space you use).
> 

Of course, the proposal I have presented only works if you don't have all
data sets open at once; it assumes you are going to open and close them
in sequence or at least only use a few at a time.

						--- Ciemo

robert@victoria.esd.sgi.com (Robert Skinner) (01/18/90)

In article <3339@uceng.UC.EDU>, trohling@uceng.UC.EDU (tom rohling) writes:
> 
>      Can a compressed file be accessed through Fortran (or C) much in the 
> same way that 'zcat' uncompresses the file to stdout but leaves the file 
> in its compressed state?  I.e., can I read the contents of a compressed file
> from a program without having to uncompress it first?  Sort of like zcat'ing 
> it into RAM....

You can use popen to get a file pointer to the output of zcat.  The routine
below acts just like fopen, but if the file isn't there, it looks for the
compressed version (with the .Z extension) and opens a pipe that zcat's it
into your program.

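	#include	<stdio.h>
	#include	<string.h>
	#include	<errno.h>

	FILE	*zopen( name )
	char	*name;
	{
		FILE	*fp;
		char	cmd[256];

		fp = fopen( name, "r" );

		/* ENOENT: plain file doesn't exist, so try name.Z */
		if( !fp && errno == ENOENT
		    && strlen( name ) + sizeof "zcat .Z" <= sizeof cmd ) {
			sprintf( cmd, "zcat %s.Z", name );
			fp = popen( cmd, "r" );	/* close this one with pclose() */
		}

		return fp;
	}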

(No, this isn't debugged, and I should check whether the compressed
file is there, but you get the idea.)  One drawback is that you can't 
seek on it.  

I think you lose if you HAVE to seek on the file.  Seeking is very tricky
when a (de)compression scheme is involved.
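
If the program only ever needs to move forward in the data, that much can be emulated on a pipe by reading and throwing away bytes.  A minimal sketch, assuming forward-only access is enough; skip_forward is a hypothetical helper name, and nothing here makes backward seeks possible:

```c
#include <stdio.h>

/* Hypothetical helper: advance a non-seekable stream by n bytes by reading
 * and discarding them.  Returns the number of bytes actually skipped, which
 * is less than n if end of file (or a read error) comes first. */
long skip_forward(FILE *fp, long n)
{
    char buf[4096];
    long done = 0;

    while (done < n) {
        size_t want = sizeof buf;
        size_t got;

        if ((long)want > n - done)
            want = (size_t)(n - done);
        got = fread(buf, 1, want, fp);
        if (got == 0)           /* EOF or error: stop early */
            break;
        done += (long)got;
    }
    return done;
}
```

The skipped data still has to be decompressed by zcat, so this costs time in proportion to the distance skipped.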

good luck,
Robert Skinner
robert@sgi.com

		Which is worse, ignorance or apathy?
		Who knows?  Who cares?

tps@chem.ucsd.edu (Tom Stockfisch) (01/19/90)

In article <3339@uceng.UC.EDU> trohling@uceng.UC.EDU (tom rohling) writes:
>     Can a compressed file be accessed through Fortran (or C) much in the 
>same way that 'zcat' uncompresses the file to stdout but leaves the file 
>in its compressed state?  I.e., can I read the contents of a compressed file
>from a program without having to uncompress it first?  Sort of like zcat'ing 
>it into RAM, where my program can get at it, without creating a file out 
>of it and taking up all that space.
>     We have these rather large files (20 Meg uncompressed) we are using 
>... and a lot of the time there isn't enough room on 
>the disk to uncompress all the files and run the program for a while and 
>still leave enough disk space for other users.

Use the following routine in place of fopen( "bigFile", "r" ):

/* zopen():
 *	open a compressed file for reading, filtering it thru zcat.
 */

# include <stdio.h>

static void	defaultErrHndlr();

void	(*zopenErrHndlr)() =	defaultErrHndlr;

FILE *
zopen(name)
	char	*name;
{
	FILE	*stream;
	int	piped[2];
#		define READ	0
#		define WRITE	1

	if ( pipe(piped) == -1 )
	{
		(*zopenErrHndlr)( "pipe failure" );
		return	NULL;	/* only reached if the handler returns */
	}
	switch ( fork() )
	{
	case -1:
		(*zopenErrHndlr)( "fork failure" );
		return	NULL;
	case 0:	/* child: make the pipe's write end stdout, then run zcat */
		close( piped[READ] );
		close(1);
		if ( dup( piped[WRITE] ) != 1 )
			(*zopenErrHndlr)( "dup screwup" );
		close( piped[WRITE] );
		execlp( "zcat", "zcat", name, (char *)0 );
		(*zopenErrHndlr)( "cannot start zcat" );
		_exit(1);	/* never fall through into the parent's code */
	default:	/* parent: read zcat's output from the pipe's read end */
		close( piped[WRITE] );
		stream =	fdopen( piped[READ], "r" );
		if (stream == NULL)
			(*zopenErrHndlr)( "cannot open pipe" );
		break;
	}
	return	stream;
}

static void
defaultErrHndlr(diagnostic)
	char	*diagnostic;
{
	fprintf( stderr, "zopen(): %s\n", diagnostic );
	exit(1);
}
-- 

|| Tom Stockfisch, UCSD Chemistry	tps@chem.ucsd.edu