HORÁK, Aleš, Vít BAISA and Ondřej HERMAN. Benchmark Dataset for Propaganda Detection in Czech Newspaper Texts. In Proceedings of Recent Advances in Natural Language Processing, RANLP 2019. Varna, Bulgaria: INCOMA Ltd., 2019, p. 77-83. ISBN 978-954-452-055-7. Available from: https://dx.doi.org/10.26615/978-954-452-056-4_010.
Basic information
Original name Benchmark Dataset for Propaganda Detection in Czech Newspaper Texts
Authors HORÁK, Aleš (203 Czech Republic, guarantor, belonging to the institution), Vít BAISA (203 Czech Republic, belonging to the institution) and Ondřej HERMAN (203 Czech Republic, belonging to the institution).
Edition Varna, Bulgaria, Proceedings of Recent Advances in Natural Language Processing, RANLP 2019, p. 77-83, 7 pp. 2019.
Publisher INCOMA Ltd.
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Bulgaria
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
RIV identification code RIV/00216224:14330/19:00110579
Organization unit Faculty of Informatics
ISBN 978-954-452-055-7
ISSN 1313-8502
Doi http://dx.doi.org/10.26615/978-954-452-056-4_010
Keywords in English propaganda detection; manipulative techniques; benchmark dataset
Tags International impact, Reviewed
Changed by RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 3/5/2020 12:49.
Abstract
Propaganda of various pressure groups, ranging from big economies to ideological blocs, is often presented in the form of objective newspaper texts. However, real objectivity is here undermined by support for imbalanced views and distorted attitudes, conveyed through various manipulative stylistic techniques. In the project Manipulative Propaganda Techniques in the Age of Internet, a new resource for automatic analysis of stylistic mechanisms for influencing the readers’ opinion is being developed. In its current version, the resource consists of 7,494 newspaper articles from four selected Czech digital news servers annotated for the presence of specific manipulative techniques. In this paper, we present the current state of the annotations and describe the structure of the dataset in detail. We also offer an evaluation of bag-of-words classification algorithms for the annotated manipulative techniques.
Links
MUNI/A/1018/2018, internal MU code
Name: Rozsáhlé výpočetní systémy: modely, aplikace a verifikace VIII (Large-Scale Computing Systems: Models, Applications and Verification VIII)
Investor: Masaryk University, Category A
MUNI/G/0872/2016, internal MU code
Name: Manipulativní techniky propagandy v době internetu (Manipulative Propaganda Techniques in the Age of Internet)
Investor: Masaryk University, INTERDISCIPLINARY - Interdisciplinary research projects