2022
On Usefulness of Outlier Elimination in Classification Tasks
HETLEROVIĆ, Dušan, Lubomír POPELÍNSKÝ, P. BRAZDIL, C. SOARES, F. FREAITAS et al.
Basic information
Original title
On Usefulness of Outlier Elimination in Classification Tasks
Title in Czech
On Usefulness of Outlier Elimination in Classification Tasks
Authors
HETLEROVIĆ, Dušan (703 Slovakia, belonging to the institution), Lubomír POPELÍNSKÝ (203 Czech Republic, belonging to the institution), P. BRAZDIL, C. SOARES and F. FREAITAS
Edition
Rennes, International Symposium on Intelligent Data Analysis 2022, pp. 143-156, 14 pp. 2022
Publisher
Springer
Other information
Language
English
Type of outcome
Proceedings paper
Field of study
10201 Computer sciences, information science, bioinformatics
Confidentiality degree
is not subject to a state or trade secret
Publication form
electronic version available online
Impact factor
Impact factor: 0.402 in 2005
RIV identification code
RIV/00216224:14330/22:00126186
Organization unit
Faculty of Informatics
ISBN
978-3-031-01332-4
ISSN
UT WoS
000937256100012
Keywords (in Czech)
Outlier elimination; Metalearning; Average ranking; Reduction of portfolios
Keywords in English
Outlier elimination; Metalearning; Average ranking; Reduction of portfolios
Tags
International impact, Reviewed
Changed: 28. 3. 2023 12:48, RNDr. Pavel Šmerk, Ph.D.
In the original
Although outlier detection/elimination has been studied before, few comprehensive studies exist on when exactly this technique is useful as a preprocessing step in classification tasks. The objective of our study is to fill this gap. We performed experiments with 12 outlier elimination methods (OEMs) and 10 classification algorithms on 50 different datasets. The results were then processed by the proposed reduction method, whose aim is to identify the most useful workflows for a given set of tasks (datasets). The reduction method identified just three OEMs that are generally useful for the given set of tasks. We have shown that including these OEMs is indeed useful, as it leads to a lower loss in accuracy; the difference is quite significant (0.5%) on average.
In Czech
Although outlier detection/elimination has been studied before, few comprehensive studies exist on when exactly this technique is useful as a preprocessing step in classification tasks. The objective of our study is to fill this gap. We performed experiments with 12 outlier elimination methods (OEMs) and 10 classification algorithms on 50 different datasets. The results were then processed by the proposed reduction method, whose aim is to identify the most useful workflows for a given set of tasks (datasets). The reduction method identified just three OEMs that are generally useful for the given set of tasks. We have shown that including these OEMs is indeed useful, as it leads to a lower loss in accuracy; the difference is quite significant (0.5%) on average.
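
The keywords above mention average ranking, but this record does not describe how the authors computed it or how their reduction method works. As a minimal sketch only, the following Python snippet shows one common way to compute an average rank for workflows (an outlier elimination method combined with a classifier) across several datasets. The workflow names, the accuracy values, and the helper functions `ranks_on_dataset` and `average_ranking` are all hypothetical and are not taken from the paper.

```python
from statistics import mean

# Hypothetical accuracies (illustrative numbers, not results from the paper):
# workflow name -> one accuracy value per dataset.
accuracies = {
    "no_OEM + classifier_A": [0.80, 0.90, 0.70],
    "OEM_1 + classifier_A": [0.85, 0.88, 0.75],
    "OEM_2 + classifier_A": [0.78, 0.91, 0.72],
}


def ranks_on_dataset(scores):
    """Rank workflows on one dataset: 1 is best, ties share the mean rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j over the run of workflows tied with the one at position i.
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of the tied positions
        for name in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


def average_ranking(acc):
    """Average each workflow's per-dataset rank over all datasets."""
    n_datasets = len(next(iter(acc.values())))
    per_dataset = [
        ranks_on_dataset({w: values[d] for w, values in acc.items()})
        for d in range(n_datasets)
    ]
    return {w: mean(r[w] for r in per_dataset) for w in acc}


avg = average_ranking(accuracies)
# Print workflows from best (lowest average rank) to worst.
for name, rank in sorted(avg.items(), key=lambda kv: kv[1]):
    print(f"{name}: {rank:.2f}")
```

A lower average rank means the workflow tended to place higher across the datasets. Choosing which workflows to keep in a reduced portfolio would need a further criterion, which this sketch does not attempt to reproduce.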