2019
BM-index: Balanced Metric Space Index based on Weighted Voronoi Partitioning
ANTOL, Matej a Vlastislav DOHNALZákladní údaje
Originální název
BM-index: Balanced Metric Space Index based on Weighted Voronoi Partitioning
Autoři
ANTOL, Matej (703 Slovensko, domácí) a Vlastislav DOHNAL (203 Česká republika, garant, domácí)
Vydání
LNCS 11695. Cham, Advances in Databases and Information Systems, 23th East European Conference, ADBIS 2019, od s. 337-353, 17 s. 2019
Nakladatel
Springer International Publishing
Další údaje
Jazyk
angličtina
Typ výsledku
Stať ve sborníku
Obor
10201 Computer sciences, information science, bioinformatics
Stát vydavatele
Nizozemské království
Utajení
není předmětem státního či obchodního tajemství
Forma vydání
tištěná verze "print"
Odkazy
Impakt faktor
Impact factor: 0.402 v roce 2005
Kód RIV
RIV/00216224:14330/19:00109747
Organizační jednotka
Fakulta informatiky
ISBN
978-3-030-28729-0
ISSN
UT WoS
000558104700024
Klíčová slova anglicky
Indexing structure;k-nearest neighbor query;Approximate search;Metric space;Voronoi partitioning
Příznaky
Mezinárodní význam, Recenzováno
Změněno: 30. 9. 2020 12:42, doc. RNDr. Vlastislav Dohnal, Ph.D.
Anotace
V originále
Processing large volumes of various data needs index structures that can efficiently organize them on secondary memory. Methods based on so-called pivot permutations have become popular in addressing these requirements because of their tremendous querying performance. They localize data objects by ordering preselected anchor objects by their distances to the data objects, and so no coordinate system is exploited to partition the data. This represents a generic solution for unstructured and high-dimensional data. In principle, pivot permutations implement recursive Voronoi tessellation. Also, due to the fixed preselected anchors, such partitioning cannot adapt to the data distribution and leads to very unbalanced cells. In this paper, we address this issue and propose a novel schema called BM-index. It exploits weighted Voronoi partitioning to create pivot permutations that adapt to data distribution. Secondary memory is then accessed efficiently with respect to the existing disk-oriented structures, such as M-index. We present an algorithm to balance the data partitions, and we show its correctness. In experiments on a real-life image collection CoPhIR, we show superior performance in I/O costs when evaluating k-nearest neighbors queries.
Návaznosti
EF16_019/0000822, projekt VaV |
| ||
MUNI/A/1411/2019, interní kód MU |
|