[comp.lang.c] Need a faster replacement for fscanf

jap7g@mendel.acc.Virginia.EDU (Jim A. Pisano) (06/04/90)

I need some ideas on reading a matrix of ASCII data quickly.  I'm
developing a statistics package which reads ASCII numbers in row &
column format.  The numbers can be in integer, floating point, or
scientific notation, e.g.:

1   13  82  10.4    11.0e-2
.3  0.1 4   11      12.

I've used these two bits of code, but the conversion time from the ASCII
format to an internal binary format is very high, and since I'm reading
thousands of rows of data, this can really slow things down.

/* Example 1:  using fscanf().  The simplest approach.
*/
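#include <stdio.h>
#include <stdlib.h>	/* malloc(), exit() */

	FILE *infp;
	char *file_buf, *tbuf;
	int cur_row, cur_col, rows, cols, err;	/* rows, cols are set elsewhere */
	double **temp;				/* rows x cols, allocated elsewhere */
	infp = fopen( "ascii.dat","r");
	if( infp == NULL )
	{
		perror("ascii.dat");
		exit(1);
	}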
/*
*  Read "rows" number of rows of data or until we get to the end of the
*  file.  There are "cols" number of columns in each row.  Convert a row
*  at a time to an internal double matrix (which gets allocated space).
*/
	cur_row = 0;
	err = 1;
	while( (cur_row < rows) && (err == 1) )
	{
		for( cur_col = 0; cur_col < cols; cur_col++ )	/* read 1 row */
		{
			err = fscanf(infp,"%lg",&(temp[cur_row][cur_col]));
			if( err != 1 )		/* end of file or unreadable field */
			{
				fprintf(stderr,"%d elements read.\n",cur_row * cols + cur_col);
				break;
			}
		}
		if( err == 1 )
			cur_row++;
	}
/* Example 2:  This is a minor modification.  Use fgets() to read a row
*  of data to an internal temporary buffer & then use sscanf() to parse
*  the numbers into the array.  It goes just a little bit faster.
*/
	file_buf = malloc(1025);
	if( file_buf == NULL )
		exit(1);

	cur_row = 0;
	while( (cur_row < rows) && (fgets(file_buf,1025,infp) != NULL) )
	{
		tbuf = file_buf;
		for( cur_col = 0; cur_col < cols; cur_col++ )	/* read 1 row */
		{
/*
*   Skip over blanks and/or commas in data file to next number to
*   process using sscanf
*/
			while( (*tbuf == ' ') || (*tbuf == '\t') || (*tbuf == ',') )
				tbuf++;
			err = sscanf(tbuf,"%lg",&(temp[cur_row][cur_col]));
			if( err != 1 )
			{
				fprintf(stderr,"%d elements read.\n",cur_row * cols + cur_col);
				break;
			}
/*
*   Step past the number just converted, up to the next separator
*/
			while( *tbuf && (*tbuf != ' ') && (*tbuf != '\t') &&
			       (*tbuf != ',') && (*tbuf != '\n') )
				tbuf++;
		}
		cur_row++;
	}
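
A possible variation on Example 2, sketched here and not timed: strtod()
hands back a pointer to the first character after the number it converted,
so the hand-written skip loop and the format parsing done by sscanf() both
go away.  The helper name parse_row() is made up for illustration, and the
sketch assumes fields are separated only by blanks, tabs, or commas.

```c
#include <assert.h>
#include <stdlib.h>	/* strtod() */

/*
*  parse_row() is a hypothetical helper, not part of any library.
*  Convert up to "cols" numbers from one text line into out[].
*  Returns the number of values actually converted.
*/
int parse_row(const char *line, double *out, int cols)
{
	const char *p = line;
	char *end;
	double v;
	int n;

	assert(line != NULL && out != NULL);
	for( n = 0; n < cols; n++ )
	{
		/* strtod() skips white space itself; commas are skipped here */
		while( (*p == ' ') || (*p == '\t') || (*p == ',') )
			p++;
		v = strtod(p, &end);
		if( end == p )		/* nothing converted: end of line or junk */
			break;
		out[n] = v;
		p = end;		/* resume right after the number just read */
	}
	return n;
}
```

In the loop of Example 2 this would be called once per line, e.g.
parse_row(file_buf, temp[cur_row], cols), and a return value smaller than
cols would mean a short or malformed row.  Whether it is actually faster
than sscanf() on a given system would have to be measured.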

What I'd like to do is read a bunch of rows (~500) in at once and assign
them quickly to my 2-D double matrix "temp", but I can't figure out how.

Thanks for your help.

-Jim
 ____________________________________________________________________________
|                                                                            |
| Jim Pisano                      jap7g@Virginia.EDU (jap7g@virginia.bitnet) |
| Department of Psychology                             uunet!virginia!jap7g  |
| University of Virginia, Charlottesville, VA 22903    (804) 924-4282        |
|____________________________________________________________________________|