D 2024

Towards Personalized Similarity Search for Vector Databases

MAHRÍK, Marek, Matúš ŠIKYŇA, Vladimír MÍČ and Pavel ZEZULA

Basic information

Original name

Towards Personalized Similarity Search for Vector Databases

Authors

MAHRÍK, Marek, Matúš ŠIKYŇA, Vladimír MÍČ and Pavel ZEZULA

Edition

Cham, 17th International Conference on Similarity Search and Applications (SISAP 2024), p. 126-139, 14 pp. 2024

Publisher

Springer

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10200 1.2 Computer and information sciences

Country of publisher

Switzerland

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

Organization unit

Faculty of Informatics

ISBN

978-3-031-75822-5

Keywords in English

Similarity search;Personalized similarity;Vector databases

Tags

International impact, Reviewed
Changed: 31/10/2024 22:21, Mgr. et Mgr. Matúš Šikyňa

Abstract

In the original

The importance of similarity search has become prominent in the fast-evolving vector databases, which apply content embedding techniques on complex data to produce and manage large collections of high-dimensional vectors. Processing of such data is only possible by using a similarity function for storage, structure, and retrieval. However, if multiple users access the collection, their views on similarity can differ as similarity, in general, is subjective and context-dependent. In this article, we elaborate on the problem of a similarity search engine implementation, where users use a common index but search with personalised views of similarity, implemented by a possibly different similarity model. Specifically, we define a foundational theoretical framework and conduct experiments on real-life data to confirm the viability of such an approach. The experiments also indicate future research directions needed to propose and implement an effective and efficient personalised similarity search engine.

Links

MUNI/A/1590/2023, internal MU code
Name: Využití technik umělé inteligence pro zpracování dat, komplexní analýzy a vizualizaci rozsáhlých dat
Investor: Masaryk University, Using artificial intelligence techniques for data processing, complex analysis and visualization of large-scale data