Avoiding Anomalies in Data Stream Learning

GAMA, Joao, Petr KOSINA and Ezilda ALMEIDA. Avoiding Anomalies in Data Stream Learning. In Johannes Furnkranz, Eyke Hullermeier, Tomoyuki Higuchi. Discovery Science, Proceedings of 16th International Conference DS 2013. Berlin Heidelberg: Springer, 2013, pp. 49-63. ISBN 978-3-642-40896-0. Available from: https://dx.doi.org/10.1007/978-3-642-40897-7_4.


Basic information
Original title	Avoiding Anomalies in Data Stream Learning
Authors	GAMA, Joao (620 Portugal), Petr KOSINA (203 Czech Republic, guarantor, belonging to the institution) and Ezilda ALMEIDA (620 Portugal).
Edition	Berlin Heidelberg, Discovery Science, Proceedings of 16th International Conference DS 2013, pp. 49-63, 15 pp. 2013.
Publisher	Springer

Other information
Original language	English
Type of outcome	Proceedings paper
Field of study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Germany
Confidentiality	is not subject to a state or trade secret
Publication form	printed version "print"
Impact factor	Impact factor: 0.402 in 2005
RIV identification code	RIV/00216224:14330/13:00070032
Organization unit	Faculty of Informatics
ISBN	978-3-642-40896-0
ISSN	0302-9743
Doi	http://dx.doi.org/10.1007/978-3-642-40897-7_4
UT WoS	000340562100004
Keywords in English	Data Streams; Rule Learning; Anomaly Detection
Labels	firank_B
Tags	International impact, Reviewed
Changed by	Changed by: RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 7. 1. 2019 14:02.

Abstract

The presence of anomalies in data compromises data quality and can reduce the effectiveness of learning algorithms. Standard data mining methodologies refer to data cleaning as a pre-processing before the learning task. The problem of data cleaning is exacerbated when learning in the computational model of data streams. In this paper we present a streaming algorithm for learning classification rules able to detect contextual anomalies in the data. Contextual anomalies are surprising attribute values in the context defined by the conditional part of the rule. For each example we compute the degree of anomaliness based on the probability of the attribute-values given the conditional part of the rule covering the example. The examples with high degree of anomaliness are signaled to the user and not used to train the classifier. The experimental evaluation in real-world data sets shows the ability to discover anomalous examples in the data. The main advantage of the proposed method is the ability to inform the context and explain why the anomaly occurs.
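The idea in the abstract (score each example by how improbable its attribute values are among the examples covered by the same rule, report high scorers, and keep them out of training) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the class names, the Gaussian tail probability used as the per-attribute score, the 0.99 threshold, the 30-example warm-up, and the use of the most surprising attribute as the example's score are all assumptions made for illustration, and only numeric attributes are handled.

```python
import math


class RunningStats:
    """Incremental mean and variance (Welford's method) for one attribute."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


class RuleContext:
    """Statistics of the examples covered by one rule (hypothetical helper)."""

    def __init__(self, n_attributes, threshold=0.99, warmup=30):
        self.stats = [RunningStats() for _ in range(n_attributes)]
        self.threshold = threshold  # assumed cut-off, not taken from the paper
        self.warmup = warmup        # examples to see before scoring starts

    def attribute_surprise(self, i, x):
        """Return a value in [0, 1]; higher means x is less expected here."""
        s = self.stats[i]
        sd = s.std()
        if s.n < self.warmup or sd == 0.0:
            return 0.0
        z = abs(x - s.mean) / sd
        # Two-sided Gaussian tail P(|Z| >= z) = erfc(z / sqrt(2)); assumed model.
        return 1.0 - math.erfc(z / math.sqrt(2.0))

    def process(self, example):
        """Return indices of surprising attributes; train only if there are none."""
        surprises = [self.attribute_surprise(i, x) for i, x in enumerate(example)]
        flagged = [i for i, s in enumerate(surprises) if s > self.threshold]
        if flagged:
            return flagged  # signal to the user, do not update the statistics
        for s, x in zip(self.stats, example):
            s.update(x)
        return []


if __name__ == "__main__":
    ctx = RuleContext(n_attributes=2)
    for k in range(100):
        ctx.process([10 + (k % 5), 50 + (k % 3)])
    # The second attribute is far outside what this rule has seen so far.
    print(ctx.process([12, 80]))
```

Returning the indices of the offending attributes, together with the rule whose statistics were used, is one simple way to report the context in which an example looks anomalous.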

Links
LG13010, research and development project	Name: Representation of the Czech Republic in the European Research Consortium for Informatics and Mathematics (Acronym: ERCIM-CZ)
LG13010, research and development project	Investor: Ministry of Education, Youth and Sports of the Czech Republic, Representation of the Czech Republic in the European Research Consortium for Informatics and Mathematics
MUNI/A/0758/2011, internal MU code	Name: Involvement of students of the Faculty of Informatics in the international scientific community (Acronym: SKOMU)
MUNI/A/0758/2011, internal MU code	Investor: Masaryk University, Involvement of students of the Faculty of Informatics in the international scientific community, until 2020_Category A - Specific research - Student research projects

