BLAŤÁK, Jan, Lubomír POPELÍNSKÝ and Eva MRÁKOVÁ. Fragments and Text Categorization. In The Companion Volume to the Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics. Barcelona (Spain): Association for Computational Linguistics, 2004. p. 226-229. ISBN 1-932432-33-7.
Basic information
Original name Fragments and Text Categorization
Name in Czech Fragmenty a kategorizace textů
Authors BLAŤÁK, Jan (203 Czech Republic), Lubomír POPELÍNSKÝ (203 Czech Republic, guarantor) and Eva MRÁKOVÁ (203 Czech Republic).
Edition Barcelona (Spain), The Companion Volume to the Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, p. 226-229, 4 pp. 2004.
Publisher Association for Computational Linguistics
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Spain
Confidentiality degree is not subject to a state or trade secret
RIV identification code RIV/00216224:14330/04:00010203
Organization unit Faculty of Informatics
ISBN 1-932432-33-7
Keywords in English text classification; fragments
Tags fragments, text classification
Changed by: RNDr. Jan Blaťák, Ph.D., učo 2978. Changed: 3/2/2005 16:57.
Abstract
We introduce two novel methods of text categorization in which documents are split into fragments. We conducted experiments on English, French and Czech documents. In all cases, the task was binary document classification. Both methods increase the accuracy of text categorization; for the Naive Bayes classifier the increase is statistically significant.
Abstract (in Czech)
Prezentujeme dvě nové metody pro kategorizaci dokumentů za použití fragmentů. Uvádíme výsledky experimentů binární klasifikace anglických, francouzských a českých dokumentů. Obě metody poskytují zlepšení přesnosti, přičemž pro naivní bayesovský klasifikátor je zlepšení statisticky významné.
Links
MSM 143300003, plan (intention)
Name: Interakce člověka s počítačem, dialogové systémy a asistivní technologie
Investor: Ministry of Education, Youth and Sports of the CR, Human-computer interaction, dialog systems and assistive technologies