On Selection of Efficient Sequential Pattern Mining Algorithm
Based on Characteristics of Data

D 2022

PESCHEL, Jakub, Michal BATKO and Pavel ZEZULA

Basic information

Original name

On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data

Authors

PESCHEL, Jakub (203 Czech Republic, guarantor, belonging to the institution), Michal BATKO (203 Czech Republic, belonging to the institution) and Pavel ZEZULA (203 Czech Republic, belonging to the institution)

Edition

Place not specified, 2022 IEEE International Symposium on Multimedia (ISM), p. 202-205, 4 pp. 2022

Publisher

IEEE

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10200 1.2 Computer and information sciences

Confidentiality degree

is not subject to a state or trade secret

Publication form

electronic version available online

References:

URL

RIV identification code

RIV/00216224:14330/22:00127166

Organization unit

Faculty of Informatics

ISBN

978-1-6654-7173-2

DOI

http://dx.doi.org/10.1109/ISM55400.2022.00044

UT WoS

000964457800037

Keywords in English

Sequential Pattern Mining; GSP; SPAM; Prefix-span

Abstract

In the original

Sequential pattern mining, which is one of the core tasks in data mining, makes it possible to gain insight into datasets with complex sequential data. As the task is computationally intensive, there are many different approaches that are suitable for various types of data. We explore the possibility of optimising the analysis of sequences based on the characteristic (quickly obtainable) properties of the analysed data. In this paper, we propose five such characteristics and explore the efficiency of three algorithms that are representatives of the three main approaches to sequential pattern mining. We discovered that it is possible to save up to 21% of the search time compared to the best-performing representative. We trained a decision tree model that chooses the best algorithm for selected data based on these characteristics with 87% accuracy.
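
To make the idea of "quickly obtainable" characteristics concrete, the following is a minimal Python sketch that computes a few cheap, single-pass statistics of a sequence database. The abstract does not list the five characteristics the authors propose, so the statistics below, the function name `dataset_characteristics`, and the toy database are illustrative assumptions only, not the paper's feature set. Values like these could serve as input features to a classifier (such as the decision tree mentioned in the abstract) that picks among GSP, SPAM and PrefixSpan.

```python
from statistics import mean


def dataset_characteristics(db):
    """Compute cheap summary statistics of a sequence database.

    `db` is a list of sequences; each sequence is a list of itemsets
    (Python sets of hashable items). The chosen statistics are
    illustrative assumptions, not the characteristics from the paper.
    """
    if not db:
        raise ValueError("empty sequence database")

    # Number of itemsets in each sequence.
    seq_lengths = [len(seq) for seq in db]
    # Number of items in each itemset, over the whole database.
    itemset_sizes = [len(itemset) for seq in db for itemset in seq]
    # Alphabet: all distinct items that occur anywhere.
    distinct_items = {item for seq in db for itemset in seq for item in itemset}

    return {
        "num_sequences": len(db),
        "avg_sequence_length": mean(seq_lengths),
        "max_sequence_length": max(seq_lengths),
        "avg_itemset_size": mean(itemset_sizes) if itemset_sizes else 0.0,
        "num_distinct_items": len(distinct_items),
    }


# Hypothetical toy database with three sequences.
example_db = [
    [{"a"}, {"b", "c"}, {"a"}],
    [{"a", "b"}, {"c"}],
    [{"b"}],
]
features = dataset_characteristics(example_db)
```

Each statistic is gathered in one pass over the data, so computing them costs far less than running any of the mining algorithms themselves, which is the property the abstract relies on.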

Links

EF16_019/0000822, research and development project

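Name: Centrum excelence pro kyberkriminalitu, kyberbezpečnost a ochranu kritických informačních infrastruktur (Centre of Excellence for Cybercrime, Cybersecurity and Protection of Critical Information Infrastructures)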

Cite

PESCHEL, Jakub, Michal BATKO and Pavel ZEZULA. On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data. Online. In 2022 IEEE International Symposium on Multimedia (ISM). Place not specified: IEEE, 2022, p. 202-205. ISBN 978-1-6654-7173-2. Available from: https://dx.doi.org/10.1109/ISM55400.2022.00044.

@inproceedings{2232098,
   author = {Peschel, Jakub and Batko, Michal and Zezula, Pavel},
   address = {Place not specified},
   booktitle = {2022 IEEE International Symposium on Multimedia (ISM)},
   doi = {10.1109/ISM55400.2022.00044},
   keywords = {Sequential Pattern Mining; GSP; SPAM; Prefix-span},
   howpublished = {electronic version "online"},
   language = {eng},
   location = {Place not specified},
   isbn = {978-1-6654-7173-2},
   pages = {202-205},
   publisher = {IEEE},
   title = {On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data},
   url = {https://ieeexplore.ieee.org/abstract/document/10019622},
   year = {2022}
}

TY  - CONF
ID  - 2232098
AU  - Peschel, Jakub
AU  - Batko, Michal
AU  - Zezula, Pavel
PY  - 2022
TI  - On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data
PB  - IEEE
CY  - Place not specified
SN  - 9781665471732
KW  - Sequential Pattern Mining
KW  - GSP
KW  - SPAM
KW  - Prefix-span
UR  - https://ieeexplore.ieee.org/abstract/document/10019622
N2  - Sequential pattern mining, which is one of the core tasks in data mining, makes it possible to gain insight into datasets with complex sequential data. As the task is computationally intensive, there are many different approaches that are suitable for various types of data. We explore the possibility of optimising the analysis of sequences based on the characteristic (quickly obtainable) properties of the analysed data. In this paper, we propose five such characteristics and explore the efficiency of three algorithms that are representatives of the three main approaches to sequential pattern mining. We discovered that it is possible to save up to 21% of the search time compared to the best-performing representative. We trained a decision tree model that chooses the best algorithm for selected data based on these characteristics with 87% accuracy.
ER  -

PESCHEL, Jakub, Michal BATKO and Pavel ZEZULA. On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data. Online. In \textit{2022 IEEE International Symposium on Multimedia (ISM)}. Place not specified: IEEE, 2022, p.~202-205. ISBN~978-1-6654-7173-2. Available from: https://dx.doi.org/10.1109/ISM55400.2022.00044.
