BATKO, Michal, Petra BUDÍKOVÁ and David NOVÁK. CoPhIR Image Collection under the Microscope. In Proceedings of the 2009 Second International Workshop on Similarity Search and Applications. Washington, DC, USA: IEEE Computer Society, 2009, p. 47-54. ISBN 978-0-7695-3765-8.
Basic information
Original name CoPhIR Image Collection under the Microscope
Name in Czech Kolekce obrázků CoPhIR pod drobnohledem
Authors BATKO, Michal (203 Czech Republic, belonging to the institution), Petra BUDÍKOVÁ (203 Czech Republic, belonging to the institution) and David NOVÁK (203 Czech Republic, guarantor, belonging to the institution).
Edition Washington, DC, USA, Proceedings of the 2009 Second International Workshop on Similarity Search and Applications, p. 47-54, 8 pp. 2009.
Publisher IEEE Computer Society
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher United States of America
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
RIV identification code RIV/00216224:14330/09:00029662
Organization unit Faculty of Informatics
ISBN 978-0-7695-3765-8
UT WoS 000282087600006
Keywords in English metric space; MPEG-7; visual descriptors; CoPhIR dataset; dataset analysis
Tags DISA
Tags International impact, Reviewed
Changed by RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 14/3/2016 14:49.
Abstract
The Content-based Photo Image Retrieval (CoPhIR) dataset is the largest available database of digital images with corresponding visual descriptors. It contains five MPEG-7 global descriptors extracted from more than 106 million images from the Flickr photo-sharing system. In this paper, we analyze this dataset focusing on 1) the efficiency of similarity-based indexing and searching and 2) the expressiveness of a combination of the descriptors with respect to the subjective perception of visual similarity. We treat the descriptors as metric spaces and then combine them into a multi-metric space. We analyze the distance distributions of individual descriptors, measure the intrinsic dimensionality of these datasets and statistically evaluate the correlation between these descriptors. Further, we use two methods to assess the subjective accuracy and satisfaction of similarity retrieval based on a combination of descriptors that is recommended for CoPhIR, and we compare these results on databases of 10 and 100 million CoPhIR images. Finally, we suggest, explore and evaluate two approaches to improve the accuracy: 1) applying logarithms in order to weaken the influence of a single descriptor's contribution if it deviates from the rest, and 2) categorizing the dataset and identifying the visual characteristics important for individual categories.
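The quantities named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption, not the paper's definition: intrinsic dimensionality is estimated with the common formula mu^2 / (2 * sigma^2) over a sample of pairwise distances, the multi-metric space is modelled as a weighted sum of per-descriptor distances, and the logarithmic dampening is written as log(1 + d). The function names, weights, and sample values are hypothetical.

```python
import math
from statistics import mean, pvariance


def intrinsic_dimensionality(distances):
    """Estimate mu^2 / (2 * sigma^2) from a sample of pairwise distances.

    Assumption: this is a commonly used estimate for metric spaces;
    the record does not state which formula the paper uses.
    """
    mu = mean(distances)
    var = pvariance(distances, mu)
    return mu * mu / (2.0 * var)


def combined_distance(per_descriptor, weights, dampen=False):
    """Weighted sum of per-descriptor distances (hypothetical aggregation).

    With dampen=True each distance d is replaced by log(1 + d), so a single
    descriptor that deviates strongly from the rest contributes less.
    """
    total = 0.0
    for d, w in zip(per_descriptor, weights):
        total += w * (math.log1p(d) if dampen else d)
    return total


# Hypothetical distances from three descriptors; the third one is an outlier.
sample = [1.0, 1.0, 50.0]
unit_weights = [1.0, 1.0, 1.0]
plain = combined_distance(sample, unit_weights)
damped = combined_distance(sample, unit_weights, dampen=True)
```

In this toy input the outlying descriptor accounts for about 96% of the plain sum but a noticeably smaller share of the dampened sum, which is the effect the first proposed improvement aims for.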
Abstract (in Czech)
CoPhIR (Content-based Photo Image Retrieval) je největší dostupná databáze...
Links
GA201/09/0683, research and development project
Name: Vyhledávání v rozsáhlých multimediálních databázích
Investor: Czech Science Foundation, Similarity Searching in Very Large Multimedia Databases