Detailed Information on Publication Record
2016
DSL Shared task 2016: Perfect Is The Enemy of Good Language Discrimination Through Expectation-Maximization and Chunk-based Language Model
HERMAN, Ondřej, Vít SUCHOMEL, Vít BAISA and Pavel RYCHLÝ
Basic information
Original name
DSL Shared task 2016: Perfect Is The Enemy of Good Language Discrimination Through Expectation-Maximization and Chunk-based Language Model
Authors
HERMAN, Ondřej (203 Czech Republic, guarantor, belonging to the institution), Vít SUCHOMEL (203 Czech Republic, belonging to the institution), Vít BAISA (203 Czech Republic, belonging to the institution) and Pavel RYCHLÝ (203 Czech Republic, belonging to the institution)
Edition
Osaka, Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), p. 114-118, 5 pp. 2016
Publisher
Association for Natural Language Processing (ANLP), Osaka, Japan
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Publication form
electronic version available online
RIV identification code
RIV/00216224:14330/16:00092557
Organization unit
Faculty of Informatics
ISBN
978-4-87974-716-7
Keywords in English
language discrimination;expectation maximization;language model
Tags
International impact, Reviewed
Changed: 1/11/2017 12:13, RNDr. Vít Suchomel, Ph.D.
Abstract
In the original
In this paper we investigate two approaches to the discrimination of similar languages: the expectation-maximization algorithm for estimating the conditional probability P(word|language), and byte-level language models similar to compression-based language modelling methods. The accuracy of these methods reached 86.6 % and 88.3 %, respectively, on set A of the DSL Shared task 2016 competition.
Links
MUNI/A/0945/2015, MU internal code
7F14047, research and development project