D 2021

FIMSIM: Discovering Communities By Frequent Item-Set Mining and Similarity Search

PESCHEL, Jakub, Michal BATKO, Jakub VALČÍK, Jan SEDMIDUBSKÝ, Pavel ZEZULA et al.

Basic information

Original name

FIMSIM: Discovering Communities By Frequent Item-Set Mining and Similarity Search

Authors

PESCHEL, Jakub (203 Czech Republic, guarantor, belonging to the institution), Michal BATKO (203 Czech Republic, belonging to the institution), Jakub VALČÍK (203 Czech Republic), Jan SEDMIDUBSKÝ (203 Czech Republic, belonging to the institution) and Pavel ZEZULA (203 Czech Republic, belonging to the institution)

Edition

Cham, 14th International Conference on Similarity Search and Applications (SISAP), p. 372-383, 12 pp. 2021

Publisher

Springer International Publishing

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10200 1.2 Computer and information sciences

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

Impact factor

Impact factor: 0.402 in 2005

RIV identification code

RIV/00216224:14330/21:00119128

Organization unit

Faculty of Informatics

ISBN

978-3-030-89656-0

ISSN

UT WoS

000722252200028

Keywords in English

community mining;frequent item-set mining;similarity search;network analysis

Tags

International impact, Reviewed
Changed: 28/4/2022 09:57, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

With the growth of structured graph data, the analysis of networks is an important topic. Community mining is one of the main analytical tasks of network analysis. Communities are dense clusters of nodes, possibly containing additional information about a network. In this paper, we present a community-detection approach, called FIMSIM, which is based on principles of frequent item-set mining and similarity search. The frequent item-set mining is used to extract cores of the communities, and a proposed similarity function is applied to discover suitable surroundings of the cores. The proposed approach outperforms the state-of-the-art DB-Link Clustering algorithm while enabling the easier selection of parameters. In addition, possible modifications are proposed to control the resulting communities better.

Links

GA19-02033S, research and development project
Name: Vyhledávání, analytika a anotace datových toků lidských pohybů (Search, analytics, and annotation of human motion data streams)
Investor: Czech Science Foundation