Similarity Searching: Towards Bulk-loading Peer-to-Peer Networks

DOHNAL, Vlastislav, Jan SEDMIDUBSKÝ, Pavel ZEZULA and David NOVÁK. Similarity Searching: Towards Bulk-loading Peer-to-Peer Networks. In 1st International Workshop on Similarity Search and Applications (SISAP 2008). Los Alamitos CA, Washington, Tokyo: IEEE Computer Society, 2008, p. 87-94. ISBN 978-0-7695-3101-4.


Basic information
Original name	Similarity Searching: Towards Bulk-loading Peer-to-Peer Networks
Name in Czech	Podobnostní vyhledávání: směrem k efektivnímu budování P2P sítí
Authors	DOHNAL, Vlastislav (203 Czech Republic, guarantor, belonging to the institution), Jan SEDMIDUBSKÝ (203 Czech Republic, belonging to the institution), Pavel ZEZULA (203 Czech Republic) and David NOVÁK (203 Czech Republic, belonging to the institution).
Edition	Los Alamitos CA, Washington, Tokyo, 1st International Workshop on Similarity Search and Applications (SISAP 2008), p. 87-94, 8 pp. 2008.
Publisher	IEEE Computer Society

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Mexico
Confidentiality degree	is not subject to a state or trade secret
Publication form	printed version "print"
RIV identification code	RIV/00216224:14330/08:00024136
Organization unit	Faculty of Informatics
ISBN	978-0-7695-3101-4
UT WoS	000255509900010
Keywords in English	similarity search; p2p network; peer split; index structure
Tags	DISA, index structure, p2p network, peer split, similarity search
Tags	International impact, Reviewed
Changed by	RNDr. David Novák, Ph.D., učo 4335. Changed: 17/9/2013 08:52.

Abstract

Due to the exponential growth of digital data and its complexity, we need a technique which allows us to search such collections efficiently. A suitable solution is based on the peer-to-peer (P2P) network paradigm and the metric-space model of similarity. When a large volume of data is being inserted, the P2P network must expand to new peers in order to maintain its efficiency. Thus, many peers must be split. During a peer split, the data is halved and one half is migrated to a new peer. In this paper, we study the problem of peer splits and propose a specialized algorithm for speeding it up. In particular, we use the structured P2P network called the M-Chord. Search performance within a single peer is enhanced by the M-tree. In experimental evaluation, we compare the proposed algorithm with several straightforward solutions on a real network organizing 10 million images. Our algorithm provides a significant performance boost.
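The peer split described in the abstract (halve the data, migrate one half to a new peer) can be illustrated with a minimal sketch. It assumes every stored object already carries a one-dimensional key; the `Peer` class and `split_peer` function are hypothetical names introduced for illustration only. This is a straightforward split of the kind the paper uses for comparison, not the specialized algorithm it proposes, and it does not model the M-tree index inside a peer.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer holding (key, payload) pairs; not the M-Chord API."""
    name: str
    items: list = field(default_factory=list)


def split_peer(peer: Peer, new_name: str) -> Peer:
    """Halve the peer's data by key and migrate the upper half to a new peer."""
    peer.items.sort(key=lambda kv: kv[0])        # order objects by their key
    mid = len(peer.items) // 2                   # position of the median key
    new_peer = Peer(new_name, peer.items[mid:])  # upper half migrates
    peer.items = peer.items[:mid]                # lower half stays in place
    return new_peer


# Assumed toy data: six objects with integer keys.
old = Peer("p1", [(k, f"img{k}") for k in (7, 1, 5, 3, 9, 2)])
new = split_peer(old, "p2")
print([k for k, _ in old.items])  # [1, 2, 3]
print([k for k, _ in new.items])  # [5, 7, 9]
```

In this sketch the cost of a split is dominated by sorting and copying the data; any local index on either peer would have to be rebuilt afterwards, which is the kind of overhead a specialized split algorithm aims to reduce.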

Abstract (translated from Czech)

Due to the exponential growth of data and its complexity, we need to find a technique that allows us to search such data collections efficiently. A suitable solution is based on P2P networks and the metric approach to modeling similarity. When a large volume of data is inserted, the P2P network must gradually expand to a larger number of peers in order to maintain the required performance. During this process, many peers must be split. When a peer is split, the data is divided in half and one half is then moved to a newly created peer. In this paper, we study the problem of splitting a single peer and propose suitable techniques for speeding up this process. In particular, we use the P2P network called M-Chord. Search performance within a single peer is improved by a local index structure called the M-tree. In the experimental part, we compare the proposed algorithm with several straightforward solutions on a real network indexing 10 million images.

Links
GP201/07/P240, research and development project	Name: Distribuované indexační struktury pro podobnostní hledání
GP201/07/P240, research and development project	Investor: Czech Science Foundation, Distributed Index Structures for Similarity Searching
1ET100300419, research and development project	Name: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu
1ET100300419, research and development project	Investor: Academy of Sciences of the Czech Republic, Intelligent Models, Algorithms, Methods and Tools for the Semantic Web (realization)
