Detailed Information on Publication Record
2020
Current Challenges in Web Corpus Building
JAKUBÍČEK, Miloš, Vojtěch KOVÁŘ, Pavel RYCHLÝ and Vít SUCHOMEL
Basic information
Original name
Current Challenges in Web Corpus Building
Authors
JAKUBÍČEK, Miloš (203 Czech Republic, guarantor, belonging to the institution), Vojtěch KOVÁŘ (203 Czech Republic, belonging to the institution), Pavel RYCHLÝ (203 Czech Republic, belonging to the institution) and Vít SUCHOMEL (203 Czech Republic, belonging to the institution)
Edition
Marseille, France, Proceedings of the 12th Web as Corpus Workshop, p. 1-4, 4 pp. 2020
Publisher
European Language Resources Association
Other information
Language
English
Type of outcome
Paper in proceedings
Field of Study
10200 1.2 Computer and information sciences
Country of publisher
France
Confidentiality degree
is not subject to a state or trade secret
Publication form
electronic version available online
RIV identification code
RIV/00216224:14330/20:00114153
Organization unit
Faculty of Informatics
ISBN
979-10-95546-68-9
Keywords in English
Web corpora; corpus building
Tags
International impact, Reviewed
Changed: 28/5/2020 13:06, RNDr. Vít Suchomel, Ph.D.
Abstract
In the original
In this paper we discuss some of the current challenges in web corpus building that we have faced in recent years when expanding the corpora in Sketch Engine. The purpose of the paper is to provide an overview and raise discussion on possible solutions, rather than to offer ready-made solutions to the readers. For every issue we try to assess its severity and briefly discuss possible mitigation options.
Links
GA18-23891S, research and development project
LM2018101, research and development project