[net.graphics] convolution vs. FFT's

nitin@ur-laser.uucp (Nitin Sampat) (08/01/85)

Does anybody know at what point Fourier operations
become more practical to use than convolution?

Example: let's say we are dealing with a 512 X 512, 8-bit image.

Most image processing systems effect a convolution by using a 3 X 3
kernel (for example, in smoothing).  However, we know from linear
system theory that we can take the FFT of the image and multiply
it by the transfer function, which is nothing more than the FFT of
the impulse response (in our case, the convolution kernel).  FFT operations
give us much more control over the individual frequencies.  However, being
computationally demanding, they are avoided and most operations are done
in the spatial domain.  Hardware and time limitations dictate that 3 X 3
kernels be used most of the time.  For increased control over the frequencies,
larger kernel sizes are sometimes used.
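(A small sketch of my own, not from the original post, to illustrate the
linear-system-theory point above: filtering with a 3 X 3 smoothing kernel in
the spatial domain gives the same result as multiplying the image's FFT by
the kernel's transfer function.  Circular, i.e. wrap-around, boundaries are
assumed so the two results agree exactly; the 16 X 16 image is a stand-in
for 512 X 512.)

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16                                # small stand-in for a 512 X 512 image
image = rng.random((N, N))
kernel = np.full((3, 3), 1.0 / 9.0)   # 3 X 3 box smoothing kernel

# Spatial domain: direct circular convolution, kernel centered on each pixel.
spatial = np.zeros_like(image)
for di in (-1, 0, 1):
    for dj in (-1, 0, 1):
        spatial += kernel[di + 1, dj + 1] * np.roll(image, (di, dj), axis=(0, 1))

# Frequency domain: zero-pad the kernel to image size, center it at (0, 0),
# and multiply the image's FFT by the resulting transfer function.
pad = np.zeros((N, N))
pad[:3, :3] = kernel
pad = np.roll(pad, (-1, -1), axis=(0, 1))      # put the kernel center at (0, 0)
freq = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(pad)).real

# The two routes agree to machine precision.
print(np.max(np.abs(spatial - freq)))
```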

The question is that there must be a transition point above which it becomes
more practical to use FFTs.  Someone told me that this "magic number"
is 7 X 7.  So, if one were using a kernel of size 7 X 7 or larger, it would
be more feasible to multiply the image's spectrum by the transfer function
in the frequency domain than to convolve the image with the impulse response.


Has anyone had any experience with this?  Is this 7 X 7 figure approximately correct?


						nitin 

shep@datacube.UUCP (08/03/85)

> net.graphics / ur-laser!nitin /  9:39 am  Aug  1, 1985 ****/
>Does anybody know at what point do Fourier operations
>become more practical to use than convolution ?
>Example : lets say we are dealing with a 512 X 512 , 8 bit image.

Good question. However, it is ill-defined. Since we are dealing with
the "practicality" of the problem, we must first recognize the number
of computrons required for each approach. Two-dimensional image convolution
is easily performed by most pipelined architectures. FFTs, because of the
butterfly addressing and bit reversal they require, are more difficult
(less practical, require more hardware, etc.) to perform.

The MaxVideo image processing architecture can perform a 10*10 convolution
on a 512*512 image in real time (32 ms). This requires a substantial amount
of hardware. The key here is that this particular architecture allows
a linear increase in performance as additional hardware is added. 
(Double the hardware, half the process time; half the hardware, double
the time.)

At least one crazed engineer here is obsessed with transformations in
the frequency domain. His problem though, is that his circuit board
for performing these transformations is still vaporware and thus not
too "practical" for the application, at least not yet.

My point is that the "crossover point" you seek to determine is a function
of the architecture being used. Pipelined, scan-sequential architectures
can perform an m*n convolution in a frame time, with hardware requirements
doubling as m*n doubles. (Or, for a fixed hardware configuration, with
process time doubling as m*n doubles.) How do single- and multi-processor
von Neumann architectures for the transform compare?

If we compare 2-d convolution against performing a 2-d FFT, and neglect
the pointwise multiplication in the frequency domain, we do
see that the FFT requires, at least beyond some point, fewer multiplies.
(Unfortunately, they'll be complex multiplies!)

convolution: N^2 * M^2            2-d FFT (per transform):  N^2
                                                            --- log2(N^2)
                                                             2

(for an N X N image and an M X M kernel)
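(A rough crossover estimate of my own, not from the post, using the counts
above: direct convolution of an N X N image with an M X M kernel takes
N^2 * M^2 real multiplies, while the FFT route takes three transforms of
(N^2/2)*log2(N^2) complex multiplies each, two forward and one inverse,
plus N^2 pointwise products; one complex multiply is counted as 4 real
multiplies.  The cost model is an assumption, not a measurement.)

```python
import math

def direct_mults(N, M):
    # Direct spatial convolution: one multiply per kernel tap per pixel.
    return N * N * M * M

def fft_mults(N):
    # Radix-2 estimate: (N^2 / 2) * log2(N^2) complex multiplies per 2-d FFT.
    per_transform = (N * N // 2) * math.log2(N * N)
    # Two forward transforms, one inverse, plus the pointwise product.
    complex_mults = 3 * per_transform + N * N
    return 4 * complex_mults          # 4 real multiplies per complex multiply

N = 512
M = next(m for m in range(1, N) if direct_mults(N, m) > fft_mults(N))
print(M)   # smallest kernel size at which the FFT route wins -> 11
```

Under this particular accounting the crossover for a 512 X 512 image lands
around an 11 X 11 kernel, the same ballpark as the 7 X 7 folklore; charging
for butterfly addressing, bit reversal, or pipelined convolution hardware
moves the number around, which is exactly the architecture dependence
argued above.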

Again, good question! I'm anxious to hear other USENETers' interpretations
of the problem. I chose to play realist.

Shep Siegel                               ihnp4!datacube!shep
Datacube Inc.                  ima!inmet!mirror!datacube!shep
617-535-6644                  decvax!cca!mirror!datacube!shep
4 Dearborn Rd.       decvax!genrad!wjh12!mirror!datacube!shep
Peabody, Ma. 01960   {mit-eddie,cyb0vax}!mirror!datacube!shep