[comp.sys.ibm.pc.programmer] Help needed

fang@dukempd.phy.duke.edu (Fang Zhong) (04/09/90)

	I am doing some image processing on an IBM PC-AT.
I use subroutines from ImageTool by Werner Frei Associates
to grab images.  I want to average several frames of an
image to increase the S/N ratio.  I wrote a program with MS
QuickC 2.0 (enclosed below).  It takes 75 seconds to just
average over two frames.  Normally I want to average at
least eight frames.  That would take ten minutes during
which my image pattern may have changed.  
	Can some experts on the net tell me how to improve
the speed of the averaging through better programming?
	Thanks in advance.

Fang


-----------------------------------------------------
#include <stdio.h>
#include <malloc.h>
#include "imtool.h"

#define row 480
#define col 512

int **imatrix();
void free_imatrix();

int **ival;

void main()
{
	long ms, crg;
	int mode = 1, num, nframe, i, j, k, pval;

	ival = imatrix(1, row, 1, col);  /* allocate memory for ival */

	for(k = 1; k <= row; k++) {          /* initialize ival */
		for(j = 1; j <= col; j++) {
			ival[k][j] = 0;
		}
	}

	/* initialize imtool routines */
	ms = 0xd000;
	crg = 0x300;
	num = 1;
	INITIM(&num,&ms,&crg);    /* initialize the Frame Grabber Board */
	BRDSEL(&num);             /* initialize selected board */
	CLKSEL(&num);             /* initialize Clock mode to be PLL */

	printf("How many frames do you want to average?\n");
	scanf("%d", &nframe);

	for(i = 0; i < nframe; ++i) {
		DIGITZ(&mode);
		printf("\aCollecting frame #%d\n", i+1);
		for(k = 0; k < row; ++k) {
			for(j = 0; j < col; ++j) {
				RDPXL(&j, &k, &pval);
				ival[k+1][j+1] += pval;
			}
		}
	}

	printf("\aDisplay averaged image\n");

	for(k = 0; k < row; ++k) {
		for(j = 0; j < col; ++j) {
			pval = ival[k+1][j+1] / nframe;
			WRPXL(&j, &k, &pval);
		}
	}
	free_imatrix(ival, 1, row, 1, col);
}

int **imatrix(nrl,nrh,ncl,nch)
int nrl,nrh,ncl,nch;
{
	int i,**m;

	m=(int **)malloc((unsigned) (nrh-nrl+1)*sizeof(int*));
	m -= nrl;

	for(i=nrl;i<=nrh;i++) {
		m[i]=(int *)malloc((unsigned) (nch-ncl+1)*sizeof(int));
		m[i] -= ncl;
	}
	return m;
}

void free_imatrix(m,nrl,nrh,ncl,nch)
int **m;
int nrl,nrh,ncl,nch;
{
	int i;

	for(i=nrh;i>=nrl;i--) free((char*) (m[i]+ncl));
	free((char*) (m+nrl));
}
-- 
	Fang Zhong				1-919-684-8247
	Duke University Dept. of Physics	fang@phy.duke.edu
	Durham, N.C.      27706

cs4g6ag@maccs.dcss.mcmaster.ca (Stephen M. Dunn) (04/11/90)

   I'm not familiar with QuickC, but I don't believe it does much
in the way of optimization.  If that's correct, you may be able to
speed things up a bit by using MSC or other optimizing compilers.

Now on to the program itself:

In article <866@dukempd.phy.duke.edu> fang@dukempd.phy.duke.edu (Fang Zhong) writes:

$		for(k = 0; k < row; ++k) {
$			for(j = 0; j < col; ++j) {
$				RDPXL(&j, &k, &pval);
$				ival[k+1][j+1] += pval;
$			}
$		}

   The k+1 and j+1 in the final statement above are time-wasters.
The k+1 only needs to be performed when k is incremented, and the
j+1 could be changed so that it actually updates j, since the next
thing that is done is the ++j in the for statement:

	for (k = 0; k < row; ++ k)
	{
		register int	temp = k + 1;

		for (j = 0; j < col;)
		{
			RDPXL (&j, &k, &pval);
			ival [temp] [++ j] += pval;
		}
	}

   Any optimizing compiler should be able to remove the loop invariant
k+1 from the inner loop.  I don't know how many will change the
code involving j, though.
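
   The same idea can be taken one step further by hoisting the row
lookup itself, so the inner loop indexes through a plain pointer.
What follows is only a sketch under assumptions: ROWS and COLS are
small stand-in sizes (the post uses 480 x 512), rdpxl_stub is a
made-up replacement for the board's RDPXL so the code can run without
the hardware, and the array is 0-based, which removes the +1 offsets
altogether.

```c
#include <assert.h>

#define ROWS 4   /* stand-in sizes; the original post uses 480 x 512 */
#define COLS 6

/* Hypothetical stand-in for the board's RDPXL: it fabricates a pixel
   value from the coordinates instead of reading the frame grabber. */
static void rdpxl_stub(int *x, int *y, int *val)
{
    *val = (*y) * COLS + (*x);
}

/* Add one frame into sum[][] (0-based indices). */
static void add_frame(int sum[ROWS][COLS])
{
    int j, k, pval;

    assert(sum != 0);
    for (k = 0; k < ROWS; ++k) {
        int *rowp = sum[k];          /* row lookup done once per row */
        for (j = 0; j < COLS; ++j) {
            rdpxl_stub(&j, &k, &pval);
            rowp[j] += pval;         /* no k+1 or j+1 in the inner loop */
        }
    }
}
```

   Calling add_frame once per grabbed frame and dividing each sum by
the frame count afterwards gives the average.  This only trims the
indexing work; the per-pixel call is still there.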

   In any case, though, the biggest time-waster in the code you have
is the overhead involved in calling RDPXL (row * col) times.  Each
time, the arguments have to be pushed onto the stack before the call
and popped afterwards.  If these are calls to subroutines provided
with the board, you have one (maybe two) choices:

- write an assembly language routine that you call once for each
  column that does the calls itself and returns a whole column at
  once

- see if there's such a call provided for you

   If you wrote RDPXL yourself, you should seriously consider writing
a new routine that returns a whole column or row at a time.
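
   To show what a row-at-a-time routine might look like from the
caller's side, here is a rough sketch.  Everything board-related in it
is an assumption: read_row is a hypothetical routine (nothing in the
thread says ImageTool provides one), the frame[] array merely simulates
the grabber's memory, pixels are assumed to be 8 bits wide, and ROWS
and COLS are small stand-in sizes.

```c
#include <assert.h>
#include <string.h>

#define ROWS 4   /* stand-in sizes; the original post uses 480 x 512 */
#define COLS 6

/* Simulated frame memory.  How the real board's memory is laid out and
   accessed is not described in the thread, so this is a placeholder. */
static unsigned char frame[ROWS][COLS];

/* Hypothetical row-read routine: copies n pixels of row y into buf
   with a single call instead of n separate pixel reads. */
static void read_row(int y, unsigned char *buf, int n)
{
    assert(y >= 0 && y < ROWS && n >= 0 && n <= COLS);
    memcpy(buf, frame[y], (size_t)n);
}

/* Add the current frame to the running sums, one call per row. */
static void add_frame_by_rows(long sum[ROWS][COLS])
{
    unsigned char line[COLS];
    int j, k;

    for (k = 0; k < ROWS; ++k) {
        read_row(k, line, COLS);     /* ROWS calls per frame in total */
        for (j = 0; j < COLS; ++j)
            sum[k][j] += line[j];
    }
}
```

   With the sizes in the original post this would mean 480 calls per
frame rather than 480 * 512, which is where most of the saving would
come from if the per-call overhead really is the bottleneck.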

   So the first suggestion basically involves optimizing the code
yourself instead of hoping the compiler will do so (you'd be
surprised at how much faster a program can run if you remove stuff
from loops that only needs to be done once!), and the second
involves trying to cut down the overhead of function calls.
These are both techniques that are generally applicable, especially
the first one when using non-optimizing compilers.
-- 
               More half-baked ideas from the oven of:
****************************************************************************
Stephen M. Dunn                               cs4g6ag@maccs.dcss.mcmaster.ca
     <std_disclaimer.h> = "\nI'm only an undergraduate ... for now!\n";