PESCHEL, Jakub, Michal BATKO and Pavel ZEZULA. On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data. Online. In 2022 IEEE International Symposium on Multimedia (ISM). Place not specified: IEEE, 2022, p. 202-205. ISBN 978-1-6654-7173-2. Available from: https://dx.doi.org/10.1109/ISM55400.2022.00044.
Basic information
Original name On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data
Authors PESCHEL, Jakub (203 Czech Republic, guarantor, belonging to the institution), Michal BATKO (203 Czech Republic, belonging to the institution) and Pavel ZEZULA (203 Czech Republic, belonging to the institution).
Edition Place not specified, 2022 IEEE International Symposium on Multimedia (ISM), p. 202-205, 4 pp. 2022.
Publisher IEEE
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10200 1.2 Computer and information sciences
Confidentiality degree is not subject to a state or trade secret
Publication form electronic version available online
RIV identification code RIV/00216224:14330/22:00127166
Organization unit Faculty of Informatics
ISBN 978-1-6654-7173-2
Doi http://dx.doi.org/10.1109/ISM55400.2022.00044
UT WoS 000964457800037
Keywords in English Sequential Pattern Mining; GSP; SPAM; Prefix-span
Tags DISA, firank_B
Tags International impact, Reviewed
Changed by RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 16/8/2023 13:26.
Abstract
Sequential pattern mining, which is one of the core tasks in data mining, makes it possible to gain insight into datasets with complex sequential data. As the task is computationally intensive, there are many different approaches that are suitable for various types of data. We explore the possibility of optimising the analysis of sequences based on the characteristic (quickly obtainable) properties of the analysed data. In this paper, we propose five such characteristics and explore the efficiency of three algorithms that are representatives of the three main approaches to sequential pattern mining. We discovered that it is possible to save up to 21% of the search time compared to the best-performing representative. We trained a decision tree model that chooses the best algorithm for selected data based on these characteristics with 87% accuracy.
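As a rough illustration of what "quickly obtainable" characteristics of a sequence database might look like, the sketch below computes a few descriptive statistics in a single pass. The abstract does not name the five characteristics the paper proposes, so the four statistics here, the function name `dataset_characteristics`, and the input layout (a list of sequences, each a list of itemsets) are all assumptions chosen for illustration only.

```python
from statistics import mean


def dataset_characteristics(sequences):
    """Compute a few cheap descriptive statistics of a sequence database.

    `sequences` is a list of sequences; each sequence is a list of itemsets
    (sets of hashable items). The statistics chosen here are illustrative
    assumptions, not the five characteristics proposed in the paper.
    """
    if not sequences:
        raise ValueError("sequence database is empty")

    # Collect every distinct item that occurs anywhere in the database.
    distinct_items = set()
    for seq in sequences:
        for itemset in seq:
            distinct_items.update(itemset)

    # Size of every itemset, flattened across all sequences.
    itemset_sizes = [len(itemset) for seq in sequences for itemset in seq]

    return {
        "num_sequences": len(sequences),
        "num_distinct_items": len(distinct_items),
        "avg_sequence_length": mean(len(seq) for seq in sequences),
        "avg_itemset_size": mean(itemset_sizes) if itemset_sizes else 0.0,
    }


# Hypothetical toy database with three sequences.
example_db = [
    [{"a"}, {"b", "c"}, {"a"}],
    [{"a", "b"}, {"c"}],
    [{"b"}],
]
stats = dataset_characteristics(example_db)
print(stats)
```

A feature dictionary of this kind could serve as the input to a classifier, such as the decision tree mentioned in the abstract, that predicts which of GSP, SPAM, or PrefixSpan to run. The actual features and model configuration are described in the paper itself.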
Links
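EF16_019/0000822, research and development project. Name: Centrum excelence pro kyberkriminalitu, kyberbezpečnost a ochranu kritických informačních infrastruktur (Centre of Excellence for Cybercrime, Cybersecurity and Protection of Critical Information Infrastructures)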