D 2019

BM-index: Balanced Metric Space Index based on Weighted Voronoi Partitioning

ANTOL, Matej and Vlastislav DOHNAL

Basic information

Original name

BM-index: Balanced Metric Space Index based on Weighted Voronoi Partitioning

Authors

ANTOL, Matej (703 Slovakia, belonging to the institution) and Vlastislav DOHNAL (203 Czech Republic, guarantor, belonging to the institution)

Edition

LNCS 11695. Cham, Advances in Databases and Information Systems, 23rd East European Conference, ADBIS 2019, p. 337-353, 17 pp. 2019

Publisher

Springer International Publishing

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Netherlands

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

Impact factor

Impact factor: 0.402 in 2005

RIV identification code

RIV/00216224:14330/19:00109747

Organization unit

Faculty of Informatics

ISBN

978-3-030-28729-0

ISSN

UT WoS

000558104700024

Keywords in English

Indexing structure;k-nearest neighbor query;Approximate search;Metric space;Voronoi partitioning

Tags

International impact, Reviewed
Changed: 30/9/2020 12:42, doc. RNDr. Vlastislav Dohnal, Ph.D.

Abstract

In the original

Processing large volumes of various data needs index structures that can efficiently organize them on secondary memory. Methods based on so-called pivot permutations have become popular in addressing these requirements because of their tremendous querying performance. They localize data objects by ordering preselected anchor objects by their distances to the data objects, and so no coordinate system is exploited to partition the data. This represents a generic solution for unstructured and high-dimensional data. In principle, pivot permutations implement recursive Voronoi tessellation. Also, due to the fixed preselected anchors, such partitioning cannot adapt to the data distribution and leads to very unbalanced cells. In this paper, we address this issue and propose a novel schema called BM-index. It exploits weighted Voronoi partitioning to create pivot permutations that adapt to data distribution. Secondary memory is then accessed efficiently with respect to the existing disk-oriented structures, such as M-index. We present an algorithm to balance the data partitions, and we show its correctness. In experiments on a real-life image collection CoPhIR, we show superior performance in I/O costs when evaluating k-nearest neighbors queries.
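
As an illustration of the idea described in the abstract, the following minimal Python sketch orders preselected pivots by their distance to a data object (a pivot permutation) and assigns each object to the cell of its closest pivot. It is not the authors' implementation: the function names, the Euclidean distance, and the multiplicative form of the per-pivot weights are assumptions made only for this example, and the weights are set by hand rather than by the balancing algorithm the paper presents.

```python
import math


def euclidean(a, b):
    """Distance function for the example; any metric could be substituted."""
    return math.dist(a, b)


def pivot_permutation(obj, pivots, dist=euclidean, weights=None):
    """Return pivot indices ordered by weighted distance to obj.

    Assumption: the weight of a pivot multiplies its distance, so a larger
    weight makes the pivot look farther away and shrinks its cell.
    """
    if weights is None:
        weights = [1.0] * len(pivots)  # unweighted: plain Voronoi ordering
    keyed = [(dist(obj, p) * w, i) for i, (p, w) in enumerate(zip(pivots, weights))]
    return [i for _, i in sorted(keyed)]


def partition(data, pivots, weights=None):
    """Assign every object to the cell of the first pivot in its permutation."""
    cells = {i: [] for i in range(len(pivots))}
    for obj in data:
        closest = pivot_permutation(obj, pivots, weights=weights)[0]
        cells[closest].append(obj)
    return cells


pivots = [(0.0, 0.0), (10.0, 0.0)]
data = [(1.0, 0.0), (2.0, 0.0), (4.0, 0.0), (9.0, 0.0)]

unweighted = partition(data, pivots)
weighted = partition(data, pivots, weights=[2.0, 1.0])  # hand-picked weights
```

With equal weights, three of the four objects fall into the cell of the first pivot; raising that pivot's weight moves the boundary and yields two objects per cell, which is the kind of rebalancing the abstract refers to.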

Links

EF16_019/0000822, research and development project
Name: Centre of Excellence for Cybercrime, Cybersecurity and Protection of Critical Information Infrastructures
MUNI/A/1411/2019, internal MU code
Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, big data visualization and augmented reality.
Investor: Masaryk University, Category A