J 2015

Multi-modal Similarity Retrieval with Distributed Key-value Store

NOVÁK, David

Basic information

Original name

Multi-modal Similarity Retrieval with Distributed Key-value Store

Authors

NOVÁK, David (203 Czech Republic, guarantor, belonging to the institution)

Edition

MOBILE NETWORKS & APPLICATIONS, DORDRECHT, SPRINGER, 2015, 1383-469X

Other information

Language

English

Type of outcome

Article in a specialist periodical

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Netherlands

Confidentiality degree

is not subject to a state or trade secret

Impact factor

Impact factor: 1.538

RIV identification code

RIV/00216224:14330/15:00081691

Organization unit

Faculty of Informatics

UT WoS

000360003900013

Keywords in English

Similarity search; Multi-modal search; Big Data; Scalability; Distributed hash table

Tags

International impact, Reviewed
Changed: 6/4/2016 14:13, RNDr. David Novák, Ph.D.

Abstract

In the original

We propose a system architecture for large-scale similarity search in various types of digital data. The architecture combines contemporary highly scalable distributed data stores with recent efficient similarity indexes and also with other types of search indexes. The system enables various types of data access by distance-based similarity queries, standard term and attribute queries, and advanced queries combining several search aspects (modalities). The first part of this work describes the generic architecture and the PPP-Codes similarity index, which is suitable for our system. In the second part, we describe two specific instances of this architecture that manage two large collections of digital images and provide content-based visual search, keyword search, attribute-based access, and their combinations. The first collection is the CoPhIR benchmark with 106 million images accessed by MPEG7 visual descriptors, and the second collection contains 20 million images with complex features obtained from a deep convolutional neural network.

Links

GAP103/10/0886, research and development project
Name: Visual Image Search on the Web (original name: Vizuální vyhledávání obrázků na Webu; Acronym: VisualWeb)
Investor: Czech Science Foundation, Content-based Image Retrieval on the Web Scale