[comp.graphics] The closest point. Suggestions.

antek@binoc.tamu.edu (Antek Laczkowski) (02/22/91)

Hi!   Here are my two cents on the "How to find the point closest to a given
      one" discussion.

1) I wouldn't use one and only one algorithm - rather, vary the method
depending on: How many points do you have to deal with?  Are the points
fixed, or may they change their locations?  How fast is FP arithmetic compared
to INTEGER?  etc.

2) Let me first give a quick algorithm for points in 2-D or 3-D. If you can
keep their coordinates as integers, it should be quick, because only "+ - if
abs" are involved. The "Manhattan distance" has its usual meaning.

   a) Let the current point be "P". 
   b) Find the closest point(s) in the sense of the "Manhattan" distance.
      Note, you may compare the x,y,z coordinate differences separately to the
      previous "minimum" to speed up the process.
   c) If there is ONLY ONE "closest" point - it is also the closest in
      terms of the "normal" (i.e. Pythagorean) distance. End.
      If there are MORE points, store them in a list. Go to "d".
   d) For each point from the list (for which you also stored the differences
      in coordinates to "P": (dx,dy,dz)), calculate: 
        In 2-D space: ABS(dx - dy)
        In 3-D space: ABS(dx - dy) + ABS(dx - dz) + ABS(dy - dz)
      The point(s) for which the above is minimal is the closest one. End.

(Write to Antek @ Bioch.Tamu.Edu for a proof.) As you can see, NO
MULTIPLICATIONS are involved, so in integer math it should run fast.
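
Here is a minimal C sketch of the above for the 2-D case (the type and
function names are mine, not from the post, and steps (b)-(d) are folded
into a single pass):

    typedef struct { int x, y; } Pt;

    static int iabs(int v) { return v < 0 ? -v : v; }

    /* Returns the index of the point closest to p according to the
       posted method: minimal Manhattan distance, with ties broken by
       minimal ABS(dx - dy).  Only + - if abs are used.             */
    int closest_posted(const Pt *pts, int n, Pt p)
    {
        int i, best = -1, best_man = 0, best_tie = 0;
        for (i = 0; i < n; i++) {
            int dx = iabs(pts[i].x - p.x);
            int dy = iabs(pts[i].y - p.y);
            int man = dx + dy;           /* step (b): Manhattan distance */
            int tie = iabs(dx - dy);     /* step (d): tie-breaker        */
            if (best < 0 || man < best_man ||
                (man == best_man && tie < best_tie)) {
                best = i; best_man = man; best_tie = tie;
            }
        }
        return best;
    }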

3) If you have a fixed set of points and need the "closest one" frequently,
maybe it is better to pre-calculate a list of "closest point(s)" for each one?
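
For instance, if queries only ever ask for the nearest neighbour of a point
already in the set, the pre-calculation is one brute-force O(N^2) pass, and
every later query is a table lookup.  A rough C sketch (names are mine):

    #include <limits.h>

    typedef struct { int x, y; } Pt;

    /* Fill nearest[i] with the index of the point closest to pts[i];
       done once up front, so each later query is just nearest[i].  */
    void precompute_nearest(const Pt *pts, int n, int *nearest)
    {
        int i, j;
        for (i = 0; i < n; i++) {
            long best = LONG_MAX;
            nearest[i] = -1;
            for (j = 0; j < n; j++) {
                long dx, dy, d2;
                if (j == i) continue;
                dx = pts[j].x - pts[i].x;
                dy = pts[j].y - pts[i].y;
                d2 = dx * dx + dy * dy;
                if (d2 < best) { best = d2; nearest[i] = j; }
            }
        }
    }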

4) If the number of points is huge, I would suggest grouping them in
"clusters" - you don't need a separate array for it; just run once through the
points and sort them into separate lists, one per "cluster", based on their
xyz coordinates. Then you need only search a few clusters for the closest
point. Of course, if you are searching for the "closest one" only a few times
(like log(N) times, N = the number of points), the "ordering" will take longer
than an ordinary search. The "clustering process" makes even more sense if you
can control the coordinates of each point from the beginning - the "ordering"
operation then requires only ~N time.
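
One concrete reading of the "clusters" is a uniform grid of buckets; here is
a C sketch of that reading (the cell size, grid shape, and ring-by-ring
search are my own assumptions, not from the post):

    #define SIZE  1024             /* coords assumed in [0, SIZE)      */
    #define CELLS 16               /* CELLS x CELLS buckets            */
    #define CW    (SIZE / CELLS)   /* width of one cell                */

    typedef struct { int x, y; } Pt;

    /* head[cy][cx] holds the first point index in that cell, next[]
       chains the rest; initialize all head[][] entries to -1.       */
    void bucket(const Pt *pts, int i, int *next, int head[CELLS][CELLS])
    {
        next[i] = head[pts[i].y / CW][pts[i].x / CW];
        head[pts[i].y / CW][pts[i].x / CW] = i;
    }

    /* Search cells in growing "rings" around the query's cell.  Once a
       candidate is found, a ring at cell-distance r can still hold a
       closer point only while (r-1)*CW is below the best distance.  */
    int grid_nearest(const Pt *pts, const int *next,
                     int head[CELLS][CELLS], Pt q)
    {
        int qx = q.x / CW, qy = q.y / CW, best = -1, r, cx, cy, i;
        long bestd2 = 0, reach;

        for (r = 0; r < CELLS; r++) {
            reach = (long)(r - 1) * CW;
            if (best >= 0 && reach > 0 && reach * reach > bestd2)
                break;
            for (cy = qy - r; cy <= qy + r; cy++)
                for (cx = qx - r; cx <= qx + r; cx++) {
                    if (cx < 0 || cy < 0 || cx >= CELLS || cy >= CELLS)
                        continue;
                    if (cy != qy - r && cy != qy + r &&
                        cx != qx - r && cx != qx + r)
                        continue;   /* visit only the ring's border */
                    for (i = head[cy][cx]; i >= 0; i = next[i]) {
                        long dx = pts[i].x - q.x, dy = pts[i].y - q.y;
                        long d2 = dx * dx + dy * dy;
                        if (best < 0 || d2 < bestd2) {
                            bestd2 = d2; best = i;
                        }
                    }
                }
        }
        return best;
    }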

5) For some particular problems, where you KNOW where the points should be
   (for example on a circle, in one plane, etc.), it is really worth using
   some math to reduce the dimension of the problem - it should speed up the
   searching.
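
To make the circle example concrete: if all points lie on a known circle with
center c = (cx,cy) and radius r, the 2-D search collapses to a 1-D search over
angles, because for a query q the squared distance to a circle point,
|q-c|^2 + r^2 - 2*|q-c|*r*cos(dtheta), grows with the angular difference
dtheta.  A C sketch under that assumption (all names are mine):

    #include <math.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    /* angs[] holds the points' angles atan2(y-cy, x-cx), sorted
       ascending in (-pi, pi].  Returns the index of the stored angle
       nearest (around the circle) to the query angle 'a', which is
       atan2(qy-cy, qx-cx) for a query point (qx,qy).               */
    int nearest_angle(const double *angs, int n, double a)
    {
        int lo = 0, hi = n, i, j;
        double di, dj;

        while (lo < hi) {               /* lower bound: angs[lo] >= a */
            int mid = (lo + hi) / 2;
            if (angs[mid] < a) lo = mid + 1; else hi = mid;
        }
        i = lo % n;                     /* neighbour at or after 'a'  */
        j = (lo + n - 1) % n;           /* neighbour before 'a'       */
        di = fabs(angs[i] - a); if (di > M_PI) di = 2 * M_PI - di;
        dj = fabs(angs[j] - a); if (dj > M_PI) dj = 2 * M_PI - dj;
        return di < dj ? i : j;
    }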

Antek @ Bioch.Tamu.Edu  ("R" may NOT work - Binoc.Tamu.Edu has problems with
                          mail !!)
=============================================================================

baer@cs.mcgill.ca (Lawrence BAER) (02/22/91)

In article <12449@helios.TAMU.EDU> antek@binoc.tamu.edu (Antek Laczkowski) 
gives the following algorithm:
>
>   a) Let the current point be "P". 
>   b) Find the closest point(s) in the sense of the "Manhattan" distance.
>      Note, you may compare the x,y,z coordinate differences separately to the
>      previous "minimum" to speed up the process.
>   c) If there is ONLY ONE "closest" point - it is also the closest in
>      terms of the "normal" (i.e. Pythagorean) distance. End.
>      If there are MORE points, store them in a list. Go to "d".
>   d) For each point from the list (for which you also stored the differences
>      in coordinates to "P": (dx,dy,dz)), calculate: 
>        In 2-D space: ABS(dx - dy)
>        In 3-D space: ABS(dx - dy) + ABS(dx - dz) + ABS(dy - dz)
>      The point(s) for which the above is minimal is the closest one. End.

As a counterexample, suppose we find that in our planar point set S, the only
closest point to the origin using the Manhattan or 1-norm is the point P1=
(0,1).  We stop at step (c).  Suppose S also contains the point P2=(1/2,3/4).
The 1-norm of P2 is ||P2||_1=5/4 > 1 BUT the 2-norm is ||P2||_2=sqrt(13)/4 < 1.
Thus, P2 is closer than P1 to the origin under the Euclidean norm.
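
(A throwaway C check of the arithmetic, for anyone following along at home:)

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* P1 = (0,1): both norms are exactly 1.                   */
        /* P2 = (1/2, 3/4): prints 1.250000 and about 0.901388.    */
        printf("||P2||_1 = %f\n", 0.5 + 0.75);
        printf("||P2||_2 = %f\n", sqrt(0.5 * 0.5 + 0.75 * 0.75));
        return 0;
    }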

I believe that Keith Blackwell's algorithm (given a few articles previous to
the one cited above):

| 	1)  For all of the points in the set, compute the Manhattan
| 	    distance (delta x + delta y + delta z).
| 	2)  Choose the one with the smallest Manhattan distance.
| 	3)  Then compute x^2 + y^2 + z^2 for all points whose 
| 	    Manhattan distance is within (2^(1/2)) times the
|	    smallest Manhattan distance (found in #2).
| 	4)  Choose the smallest from the set in #3.

is correct except for one detail: in 3-D, replace 2^(1/2) with 3^(1/2)
(the example he gave was for 2-D, which works fine).  The reason for this
is given in Golub and Van Loan's Matrix Computations, p. 54:

	if x is in R^n then
		||x||_2 <= ||x||_1 <= sqrt(n) * ||x||_2
	or
		(1/sqrt(n))*||x||_1 <= ||x||_2 <= ||x||_1

Note that we can do this sort of trick with any norms on R^n, including
the infinity norm:

	||x||_inf <= ||x||_2 <= sqrt(n) * ||x||_inf

If you wish to attack the closest-point problem using these kinds of
algorithms, then the last norm relation may be helpful, since the infinity
norm (just the largest coordinate of x in absolute value) is so easy to
calculate.  The drawback is that in step (3) of Keith Blackwell's algorithm,
you may be left with more points than if you had used the 1-norm.
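
For completeness, here is a C sketch of Blackwell's two-pass filter with the
3^(1/2) correction for 3-D (the names are mine; the filter test is squared so
everything stays in integer math, and the same sqrt(n) factor applies if you
substitute the infinity norm):

    #include <limits.h>

    typedef struct { long x, y, z; } Pt3;

    static long labs_(long v) { return v < 0 ? -v : v; }

    /* Two passes: (1) find the smallest Manhattan (1-norm) distance,
       (2) compute the exact squared Euclidean distance, but only for
       points passing the filter L1 <= sqrt(3)*L1min, tested in the
       squared form L1^2 <= 3*L1min^2 to stay in integer arithmetic.
       (Assumes coordinates small enough that the squares fit.)    */
    int closest_euclid(const Pt3 *pts, int n, Pt3 p)
    {
        long l1min = LONG_MAX, d2min = LONG_MAX;
        int i, best = -1;

        for (i = 0; i < n; i++) {             /* pass 1: steps 1-2 */
            long l1 = labs_(pts[i].x - p.x) + labs_(pts[i].y - p.y)
                    + labs_(pts[i].z - p.z);
            if (l1 < l1min) l1min = l1;
        }
        for (i = 0; i < n; i++) {             /* pass 2: steps 3-4 */
            long dx = pts[i].x - p.x, dy = pts[i].y - p.y,
                 dz = pts[i].z - p.z;
            long l1 = labs_(dx) + labs_(dy) + labs_(dz);
            if (l1 * l1 <= 3 * l1min * l1min) {
                long d2 = dx * dx + dy * dy + dz * dz;
                if (d2 < d2min) { d2min = d2; best = i; }
            }
        }
        return best;
    }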

For an excellent discussion of this problem and efficient solutions to it,
I suggest 'Computational Geometry' by Preparata & Shamos.

Larry Baer

spencer@eecs.umich.edu (Spencer W. Thomas) (02/22/91)

Is it my imagination, or has nobody mentioned K-d trees?  This is the
"classic" way to solve this problem.

--
=Spencer W. Thomas 		EECS Dept, U of Michigan, Ann Arbor, MI 48109
spencer@eecs.umich.edu		313-936-2616 (8-6 E[SD]T M-F)