2026
On the evaluation and optimization of LabeledPAM
JÁNOŠOVÁ, Miriama; Andreas LANG; Petra BUDÍKOVÁ; Erich SCHUBERT; Vlastislav DOHNAL et al.
Basic information
Original title
On the evaluation and optimization of LabeledPAM
Authors
Edition
Information Systems, 2026, 0306-4379
Additional information
Language
English
Type of result
Article in a scholarly journal
Field of study
10200 1.2 Computer and information sciences
Country of publisher
Netherlands
Confidentiality
Not subject to state or trade secrecy
Links
Impact factor
Impact factor: 3.400 in 2024
Marked for transfer to RIV
Yes
Organizational unit
Faculty of Informatics
UT WoS
EID Scopus
Keywords in English
semi-supervised clustering; k-medoids; partitioning around medoids; FasterPAM; semi-supervised classification
Tags
International impact, Reviewed
Changed: 1 April 2026, 11:04, RNDr. Pavel Šmerk, Ph.D.
Abstract
In the original
The analysis of complex and weakly labeled data is increasingly popular. Traditional unsupervised clustering aims to uncover interrelated sets of objects based on feature-based similarity. This approach often reaches its limits when dealing with complex multimedia data due to the curse of dimensionality, presenting unique challenges. Semi-supervised clustering, which leverages small amounts of labeled data, has the potential to cope with this problem. In this work, we delve into LabeledPAM, a semi-supervised clustering method, which extends FasterPAM, a state-of-the-art 𝑘-medoids clustering algorithm. Our algorithm is designed for both semi-supervised classification, where labels are assigned to clusters with minimal labeled data, and semi-supervised clustering, where new clusters with unknown labels are identified. We propose an optimization to the original LabeledPAM algorithm that reduces its computational complexity. Additionally, we provide an implementation in Rust, which integrates seamlessly with Python libraries. To assess LabeledPAM’s performance, we empirically evaluate its properties by comparing it against a range of semi-supervised clustering algorithms, including density-based ones. We conduct experiments on a collection of real-world datasets. Our results demonstrate that LabeledPAM achieves competitive clustering quality while maintaining efficiency across various scenarios, showing its versatility for real-world applications.
Related projects
| GF23-07040K, R&D project |
| MUNI/A/1638/2024, MU internal code |