Detailed Information on Publication Record
2008
Combining Metric Features in Large Collections
BATKO, Michal, Petra KOHOUTKOVÁ and Pavel ZEZULA
Basic information
Original name
Combining Metric Features in Large Collections
Name in Czech
Kombinování metrických charakteristik ve velkých kolekcích dat
Authors
BATKO, Michal (203 Czech Republic, guarantor), Petra KOHOUTKOVÁ (203 Czech Republic) and Pavel ZEZULA (203 Czech Republic)
Edition
Los Alamitos CA, Washington, Tokyo, 1st International Workshop on Similarity Search and Applications (SISAP 2008), p. 79-86, 8 pp. 2008
Publisher
IEEE Computer Society
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Mexico
Confidentiality degree
is not subject to a state or trade secret
References:
RIV identification code
RIV/00216224:14330/08:00024185
Organization unit
Faculty of Informatics
ISBN
978-0-7695-3101-4
UT WoS
000255509900009
Keywords in English
similarity search; complex query; p2p network; approximation
Tags
International impact, Reviewed
Changed: 19/6/2009 16:21, RNDr. Michal Batko, Ph.D.
In the original
Current information systems are required to process complex digital objects, which are typically characterized by multiple descriptors. Since the values of many descriptors belong to non-sortable domains, they are effectively comparable only by some form of similarity. Moreover, scalability is very important in the current age of digital explosion. Therefore, we propose a distributed extension of the well-known threshold algorithm for the peer-to-peer paradigm. The technique allows answering similarity queries that combine multiple similarity measures, and due to its peer-to-peer nature it is highly scalable. We also explore approximate evaluation strategies, where some relevant results can be lost in exchange for increasing efficiency by an order of magnitude. To reveal the strengths and weaknesses of our approach, we have experimented with a database of 1.6 million images from Flickr, comparing the content of the images by five similarity measures from the MPEG-7 standard. To the best of our knowledge, experience with such a huge real-life dataset is quite unique.
In Czech
Článek popisuje rozšíření existujícího "prahovacího" algoritmu pro prostředí peer-to-peer sítí. Technika umožňuje řešit podobnostní dotazy kombinující několik podobnostních měřítek a díky využití peer-to-peer technologie je vysoce škálovatelná. Dále jsou v článku rozebírány přínosy aproximativní strategie. Výsledky jsou ověřeny na databázi s 1,6 miliony obrázků ze systému Flickr.
Links
GP201/08/P507, research and development project
1ET100300419, research and development project