Other formats:
BibTeX
LaTeX
RIS
@inproceedings{858703,
  author = {Novák, David and Batko, Michal},
  address = {Washington, DC, USA},
  booktitle = {Proceedings of the 2009 Second International Workshop on Similarity Search and Applications},
  keywords = {metric space; similarity search; data structure; approximation; scalability},
  howpublished = {printed version "print"},
  language = {eng},
  location = {Washington, DC, USA},
  isbn = {978-0-7695-3765-8},
  pages = {65--73},
  publisher = {IEEE Computer Society},
  title = {Metric Index: An Efficient and Scalable Solution for Similarity Search},
  url = {http://portal.acm.org/citation.cfm?id=1637863.1638184},
  year = {2009}
}
TY  - JOUR
ID  - 858703
AU  - Novák, David
AU  - Batko, Michal
PY  - 2009
TI  - Metric Index: An Efficient and Scalable Solution for Similarity Search
PB  - IEEE Computer Society
CY  - Washington, DC, USA
SN  - 9780769537658
KW  - metric space
KW  - similarity search
KW  - data structure
KW  - approximation
KW  - scalability
UR  - http://portal.acm.org/citation.cfm?id=1637863.1638184
N2  - Metric space as a universal and versatile model of similarity can be applied in various areas of non-text information retrieval. However, a general, efficient and scalable solution for metric data management is still a resisting research challenge. We introduce a novel indexing and searching mechanism called Metric Index (M-Index), that employs practically all known principles of metric space partitioning, pruning and filtering. The heart of the M-Index is a general mapping mechanism that enables to actually store the data in well-established structures such as the B+-tree or even in a distributed storage. We have implemented the M-Index with B+-tree and performed experiments on a combination of five MPEG-7 descriptors in a database of hundreds of thousands digital images. The experiments put under test several M-Index variants and compare them with two orthogonal approaches - the PM-Tree and the iDistance. The trials show that the M-Index outperforms the others in terms of efficiency of search-space pruning, I/O costs, and response times for precise similarity queries. Furthermore, the M-Index demonstrates an excellent ability to keep similar data close in the index which makes its approximation algorithm very efficient - maintaining practically constant response times while preserving a very high recall as the dataset grows.
ER  -
NOVÁK, David and Michal BATKO. Metric Index: An Efficient and Scalable Solution for Similarity Search. In \textit{Proceedings of the 2009 Second International Workshop on Similarity Search and Applications}. Washington, DC, USA: IEEE Computer Society, 2009, pp.~65--73. ISBN~978-0-7695-3765-8.