
A Data-Driven Multidimensional Indexing Method for Data Mining in Astrophysical Databases


Large archives and digital sky surveys already hold very large volumes of data, and their sizes are expected to keep growing in the near future. Numerical simulations are also producing comparable volumes of information. Data mining tools are needed to extract information from such large datasets. In this work, we propose a multidimensional indexing method, based on a static R-tree data structure, to efficiently query and mine large astrophysical datasets. We follow a top-down construction method, called VAMSplit, which recursively splits the dataset on a near-median element along the dimension with maximum variance. The resulting index partitions the dataset into nonoverlapping bounding boxes whose volumes follow the local data density, so that dense regions are covered by small boxes and sparse regions by large ones. Finally, we show an application of this method to the detection of point sources in a gamma-ray photon list.
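The recursive split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `bounding_box` and `vamsplit_leaves`, the `capacity` parameter, and the rule of rounding the split position to a multiple of the leaf capacity are assumptions made here for illustration. The sketch builds only the leaf level and omits the upper R-tree nodes.

```python
from statistics import pvariance


def bounding_box(points):
    """Return (mins, maxs) of the axis-aligned box enclosing the points."""
    dims = range(len(points[0]))
    mins = tuple(min(p[d] for p in points) for d in dims)
    maxs = tuple(max(p[d] for p in points) for d in dims)
    return mins, maxs


def vamsplit_leaves(points, capacity=4):
    """Recursively split `points`; return a list of (bounding_box, leaf_points).

    Hypothetical helper: `capacity` is the assumed maximum number of
    points stored in one leaf.
    """
    if len(points) <= capacity:
        return [(bounding_box(points), list(points))]

    dims = range(len(points[0]))
    # Pick the dimension along which the data has maximum variance.
    split_dim = max(dims, key=lambda d: pvariance([p[d] for p in points]))
    ordered = sorted(points, key=lambda p: p[split_dim])

    # Near-median split (assumption): round the median position to a
    # multiple of the leaf capacity so that leaves on the left fill up.
    n = len(ordered)
    split = capacity * max(1, round(n / (2 * capacity)))

    return (vamsplit_leaves(ordered[:split], capacity)
            + vamsplit_leaves(ordered[split:], capacity))


if __name__ == "__main__":
    # Toy two-dimensional dataset with distinct coordinates.
    sample = [((i * 37) % 101, (i * 53) % 103) for i in range(20)]
    for box, leaf in vamsplit_leaves(sample, capacity=4):
        print(box, len(leaf))
```

Because each split sorts the points along one dimension and cuts the sorted list in two, the boxes on either side of a cut cannot overlap as long as the coordinates along that dimension are distinct. Where the data are dense, the same number of points occupies a smaller box, which is how the box volumes come to track the local density.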

Author information



Corresponding author

Correspondence to Marco Frailis.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Frailis, M., De Angelis, A. & Roberto, V. A Data-Driven Multidimensional Indexing Method for Data Mining in Astrophysical Databases. EURASIP J. Adv. Signal Process. 2005, 841610 (2005).


Keywords and phrases

  • multidimensional indexing
  • VAMSplit R-tree
  • nearest-neighbor query
  • one-class SVM
  • point sources