NOVÁK, David and Pavel ZEZULA. Rank Aggregation of Candidate Sets for Efficient Similarity Search. In 25th International Conference on Database and Expert Systems Applications (DEXA 2014). Heidelberg: Springer International Publishing Switzerland, 2014, p. 42-58. ISBN 978-3-319-10084-5. Available from: https://dx.doi.org/10.1007/978-3-319-10085-2_4.
Basic information
Original name Rank Aggregation of Candidate Sets for Efficient Similarity Search
Authors NOVÁK, David (203 Czech Republic, guarantor, belonging to the institution) and Pavel ZEZULA (203 Czech Republic, belonging to the institution).
Edition Heidelberg, 25th International Conference on Database and Expert Systems Applications (DEXA 2014), p. 42-58, 17 pp. 2014.
Publisher Springer International Publishing Switzerland
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 20201 Electrical and electronic engineering
Country of publisher Switzerland
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
Impact factor 0.402 in 2005
RIV identification code RIV/00216224:14330/14:00073743
Organization unit Faculty of Informatics
ISBN 978-3-319-10084-5
ISSN 0302-9743
Doi http://dx.doi.org/10.1007/978-3-319-10085-2_4
Keywords in English Similarity Search; Metric Space; Approximation; Scalability
Tags DISA, firank_B
Changed by RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 27/4/2015 05:47.
Abstract
Many current applications need to organize data with respect to mutual similarity between data objects. Generic similarity retrieval in large data collections is a tough task that has been drawing researchers’ attention for two decades. A typical general strategy to retrieve the objects most similar to a given example is to access and then refine a candidate set of objects; the overall search costs (and search time) then typically correlate with the candidate set size. We propose a generic approach that combines several independent indexes by aggregating their candidate sets in such a way that the resulting candidate set can be one or two orders of magnitude smaller (while keeping the answer quality). This achievement comes at the expense of higher computational costs of the ranking algorithm, but experiments on two real-life datasets and one artificial dataset indicate that the overall gain can be significant.
Links
GBP103/12/G084, research and development project
Name: Centrum pro multi-modální interpretaci dat velkého rozsahu
Investor: Czech Science Foundation