D 2010

On Locality-sensitive Indexing in Generic Metric Spaces

NOVÁK, David, Martin KYSELÁK and Pavel ZEZULA

Basic information

Original name

On Locality-sensitive Indexing in Generic Metric Spaces

Name in Czech

O indexovaní respektujícím vzdálenosti objektů v obecných metrických prostorech.

Authors

NOVÁK, David (203 Czech Republic, guarantor, belonging to the institution), Martin KYSELÁK (203 Czech Republic) and Pavel ZEZULA (203 Czech Republic, belonging to the institution)

Edition

New York, 3rd International Conference on Similarity Search and Applications, p. 59-66, 8 pp. 2010

Publisher

ACM Press

Other information

Language

English

Type of outcome

Stať ve sborníku

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Turkey

Confidentiality degree

není předmětem státního či obchodního tajemství

Publication form

printed version "print"

RIV identification code

RIV/00216224:14330/10:00044857

Organization unit

Faculty of Informatics

ISBN

978-1-4503-0420-7

Keywords in English

locality-sensitive hashing; metric space; similarity search; approximation; scalability

Tags

Tags

International impact, Reviewed
Změněno: 17/9/2013 08:44, RNDr. David Novák, Ph.D.

Abstract

V originále

The concept of Locality-sensitive Hashing (LSH) has been successfully used for searching in high-dimensional data and a number of locality-preserving hash functions have been introduced. In order to extend the applicability of the LSH approach to a general metric space, we focus on a recently presented Metric Index (M-Index), we redefine its hashing and searching process in the terms of LSH, and perform extensive measurements on two datasets to verify that the M-Index fulfills the conditions of the LSH concept. We widely discuss "optimal" properties of LSH functions and the efficiency of a given LSH function with respect to kNN queries. The results also indicate that the M-Index hashing and searching is more efficient than the tested standard LSH approach for Euclidean distance.

Links

GAP103/10/0886, research and development project
Name: Vizuální vyhledávání obrázků na Webu (Acronym: VisualWeb)
Investor: Czech Science Foundation, Content-based Image Retrieval on the Web Scale
GA201/09/0683, research and development project
Name: Vyhledávání v rozsáhlých multimediálních databázích
Investor: Czech Science Foundation, Similarity Searching in Very Large Multimedia Databases
GPP202/10/P220, research and development project
Name: Podobnostní vyhledávání s konstantní škálovatelností (Acronym: SIM-SCALE)
Investor: Czech Science Foundation
1M0545, research and development project
Name: Institut Teoretické Informatiky
Investor: Ministry of Education, Youth and Sports of the CR, Institute for Theoretical Computer Science