[comp.databases] Graphics data structure info request

burton@nicmad.UUCP (Kevin Burton) (11/22/89)

I have been considering for quite some time an efficient data structure
for 2-D objects. By efficient I mean that the closest object or objects
to a given object can be found very quickly and objects can be added and
deleted easily. There are many structures that lend themselves very well
to points (eg binary trees, B-trees, Delaunay triangulation ...) but I
am unaware of any that explicitly deal with two dimensional objects. The
quad-tree has the general flavor of the structure I am looking for but
it seems too "raster" oriented to be adapted. Does anybody have any infor-
mation, opinions, advice  ?

If I get enough response I will post a summary. Thanks in advance !!

		Kevin Burton

burton@nicmad.UUCP (Kevin Burton) (11/22/89)

I am sorry but postnews left my signature off.

================================================================================
          att!terminus-----\		           |
     harvard-\         att--\                      | Kevin Burton (608) 276-6166
        ucbvax!uwvax!astroatc!nicmad!burton        | Nicolet Instrument Corp.
    rutgers--/       rolls--/                      | 5225-2 Verona Rd
      ames--/       decvax-/                       | Madison, WI 53711-0288 
================================================================================

flynn@pixel.cps.msu.edu (Patrick J. Flynn) (11/22/89)

[my apologies if you see this twice; the first posting was truncated and
 I am trying to cancel & repost...]

In article <3905@nicmad.UUCP> burton@nicmad.UUCP (Kevin Burton) writes:
>I have been considering for quite some time an efficient data structure
>for 2-D objects. By efficient I mean that the closest object or objects
>to a given object can be found very quickly and objects can be added and
>deleted easily. There are many structures that lend themselves very well
>to points (eg binary trees, B-trees, Delaunay triangulation ...) but I
>am unaware of any that explicitly deal with two dimensional objects. The
>quad-tree has the general flavor of the structure I am looking for but
>it seems too "raster" oriented to be adapted. Does anybody have any infor-
>mation, opinions, advice  ?

Hanan Samet from Maryland has a new book out entitled... [rustle,
rustle...] "The Design and Analysis of Spatial Data Structures",
published by Addison-Wesley, copyright 1990 (!).  I don't have the ISBN
number handy.  There is a companion volume on applications that hasn't
shown up in our bookstore yet.  The "Design" book discusses data
structures for 2D and 3D objects, focusing on tree-structures of
various sorts.  While I haven't done a thorough reading, it seems like
a very nice treatment of a variety of useful techniques.  There is a
HUGE bibliography.
--
Patrick Flynn, CS, Mich. State U., flynn@cps.msu.edu

rick@hanauma.stanford.edu (Richard Ottolini) (11/22/89)

>In article <3905@nicmad.UUCP> burton@nicmad.UUCP (Kevin Burton) writes:
>>I have been considering for quite some time an efficient data structure
>>for 2-D objects. By efficient I mean that the closest object or objects
>>to a given object can be found very quickly and objects can be added and
>>deleted easily.
A related question.  Have there been attempts to coerce model data structures
into standard databases, e.g. normal relational form, to take advantage of
high speed database engines.  I remember some people doing this for AI
knowledge bases.  It wasn't very creative, but faster.

jack@odi.com (Jack Orenstein) (11/22/89)

In article <3905@nicmad.UUCP> burton@nicmad.UUCP (Kevin Burton)
writes:
> I have been considering for quite some time an efficient data structure
> for 2-D objects. By efficient I mean that the closest object or objects
> to a given object can be found very quickly and objects can be added and
> deleted easily. There are many structures that lend themselves very well
> to points (eg binary trees, B-trees, Delaunay triangulation ...) but I
> am unaware of any that explicitly deal with two dimensional objects. The
> quad-tree has the general flavor of the structure I am looking for but
> it seems too "raster" oriented to be adapted. Does anybody have any infor-
> mation, opinions, advice ?

There are a few approaches to dealing with non-point spatial data. The
simplest is to place each data object in a container, such as a circle
or box. Then describe the container parametrically, yielding a point
in a higher-dimension space. Many spatial search problems can be
restated in the parameter space. For example, a 2-d box is represented
by the point (xlo, xhi, ylo, yhi) in 4-d space, and a circle whose
center is at (x, y) is represented by the point (x, y, radius) in 3-d
space.  Following the search, weed out data objects that don't really
satisfy the query. This refinement step is necessary because the
container is not an exact representation of the data object (usually).

Some spatial search structures have been designed to deal with
non-spatial objects. These also involve some kind of approximation and
also require a refinement step. For example, there's the R-tree, a
variation of it - the R+-tree, the hB-tree, and there are variations
of the EXCELL and grid file search structures which were originally
proposed for point data.

I have done research in this area for several years. The basic idea of
my work is to transform the data objects and search problem into an
equivalent data set and search problem in 1-d.  The transformation is
based on the use of space-filling curves. Each data object and query
region is represented by a sequence of consecutive points on this
curve, which can be concisely encoded as an integer key. My algorithms
work on top of any data structure supporting random and sequential
access. I.e., you can put the algorithms on top of a btree, binary
tree, or even just a sorted array (if you don't care about updates).
Queries are run by generating random accesses based on the integer
keys, and sequential accesses.

Quadtrees can be adapted to non-point objects also. The "raster"
orientation means that there must be a refinement step, as with other
structures. (Is this why you ruled it out?)

The work on non-point data is all pretty recent. There hasn't been a
lot of work on performance comparisons. The performance results, for
the most part, examine a single data structure.

In terms of implementation, the methods that use existing data
structures are the simplest, (the parameter-space method and the
space-filling curve method). These are obviously dynamic, since the
underlying data structures are dynamic.  The other structures are
dynamic also, but the insertion and deletion algorithms are fairly
complex, and in some cases, there is no lower bound on storage
utilization.

I could go on for pages. Send email if you want more information.


Jack Orenstein
Object Design, Inc.


References:

Parameter space method: Ph.D. thesis by Hinrichs, Swiss Federal
Institute of Technology; various tech reports by Faloutsos, Rego,
Sellis, University of Maryland.

R-tree: Guttman, SIGMOD 84.

R+-tree: University of Maryland, again. Also, see VLDB 87 (Sellis,
Roussopoulos, Faloutsos).

Grid file: ACM Trans. on Database Systems, 9, 1 (1984), Nievergelt,
Hinterberger, Sevcik.

EXCELL for non-point data: 19th ACM Design Automation Conference,
Tamminen & Sulonen.

hB-tree: Data Engineering 1989, Lomet & Salzberg.

Using space-filling curves: papers by Orenstein in SIGMOD 86, IEEE
Trans. on Soft. Eng. May 1988, SIGMOD 89.

Survey of the area: Recent ACM surverys article by Hanan Samet, who
also has a new book surveying the field.