[mod.ai] 3D Clustering algorithms

DAVIS@EMBL.BITNET.UUCP (03/09/87)

The subject just about sums it up... is anyone out there in the 'lectronic
village overly proud of, overly knowledgeable about, or even just familiar
with clustering algorithms for use in three dimensions? That is to say, I
have a bunch of points in a 3D space, and I want to cluster them. Simple,
huh? Tell me how, or tell me how to find out how. Reply directly to me,
or post to the list.

with thanks,

Paul Davis

European Molecular Biology Laboratory,
Postfach 10.2209
6900 Heidelberg
West Germany

bitnet: davis@embl.bitnet
uucp: ...psuvax!embl.bitnet!davis
petnet: homing pigeons to....

"a time for dreams, a time for sleep, a time for love .... it's now!"


  [What makes three-space special?  Any similarity or dissimilarity
  metric that works in three dimensions should work in N dimensions.
  The really interesting cases are those where no reasonable weighting
  exists for combining distances in the different dimensions.
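
  As a rough illustration of the point about dimensions and weighting,
  here is a minimal Python sketch of a weighted Euclidean distance that
  is indifferent to the number of coordinates.  The function name and
  the per-dimension weights are assumptions made for this example, not
  taken from any particular package; choosing the weights is exactly
  the hard part described above.

```python
import math

def weighted_distance(p, q, weights=None):
    """Weighted Euclidean distance between two points of equal dimension.

    `weights` is a hypothetical per-dimension scale factor; with no
    weights given, this reduces to the ordinary Euclidean distance.
    """
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    if weights is None:
        weights = [1.0] * len(p)
    if len(weights) != len(p):
        raise ValueError("need exactly one weight per coordinate")
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))

# The same call works for 2-D, 3-D, or N-D points.
d3 = weighted_distance((0, 0, 0), (1, 2, 2))
d5 = weighted_distance((0, 0, 0, 0, 0), (1, 1, 1, 1, 0))
```

  Nothing here is specific to three-space; only the weights encode any
  knowledge about how the dimensions relate to one another.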

  Any of the major subroutine packages -- BMD, SPSS, etc. -- have
  clustering routines and associated documentation.  Euclidean space
  is generally assumed, which causes problems with circular scales
  such as hue in a color space.  (One heuristic for color spaces is
  to linearize the usual 256^3 cells by tracing through the space with
  a space-filling fractal curve, such as a Peano or Hilbert curve, then
  search for clusters in the 1-D result.)
  Other 3-D spaces are best analyzed in terms of direction cosines
  for vectors to the points from some origin.  Statistical metrics
  based on within-cluster and between-cluster variances are optimal
  for some applications, but gravitational or potential-based models
  are better in others.  ISODATA is a time-honored heuristic method
  for growing and splitting clusters, but is only suitable for
  circular clusters in isometric spaces.  Zahn's method of analyzing
  minimal spanning trees is one way of overcoming the common faults
  (e.g., chaining or lack thereof) of heuristic approaches.
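
  To give a feel for the minimal-spanning-tree approach, here is a
  simplified Python sketch, not Zahn's method itself.  It builds the
  tree with Prim's algorithm, then cuts every edge longer than a
  multiple of the mean edge length and treats the remaining connected
  pieces as clusters.  Zahn's method judges an edge against its
  neighboring edges; the global cutoff used here, the `factor`
  parameter, and the function names are all assumptions made to keep
  the example short.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete graph; returns (length, i, j) edges."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # shortest known link from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest point that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def mst_clusters(points, factor=2.0):
    """Cut MST edges longer than factor * mean edge length; label the pieces.

    The global cutoff is an assumed simplification, not Zahn's local test.
    """
    edges = mst_edges(points)
    link = list(range(len(points)))   # union-find parent pointers

    def find(i):
        while link[i] != i:
            link[i] = link[link[i]]   # path halving
            i = link[i]
        return i

    if edges:
        mean_len = sum(e[0] for e in edges) / len(edges)
        for length, i, j in edges:
            if length <= factor * mean_len:
                link[find(i)] = find(j)
    # Renumber the component roots as 0, 1, 2, ... in order of appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]

# Two well-separated groups of three points each.
sample = [(0, 0, 0), (1, 0, 0), (0, 1, 0),
          (10, 10, 10), (11, 10, 10), (10, 11, 10)]
labels = mst_clusters(sample)
```

  Because distances come from math.dist, the same code runs unchanged
  on points of any dimension.  The cutoff rule is the part most worth
  experimenting with on real data.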

  The book Pattern Classification and Scene Analysis by Duda and
  Hart offers an easy introduction to some of the statistical and
  heuristic methods.  Other pattern recognition books are more
  thorough.  Clustering is still a black art, though, and you are
  probably best off getting a commercial package and trying a few
  of the options to get a feel for what works with your data.  -- KIL]