HANČAR, Pavel. Building Big Czech Corpus : Collecting and Converting Czech Corpora. In RASLAN 2008. Brno: Masaryk University, 2008, p. 94-97, 100 pp. ISBN 978-80-210-4741-9.
Basic information
Original name Building Big Czech Corpus : Collecting and Converting Czech Corpora
Name in Czech Budování velkého českého korpusu : shromáždění a konverze českých korpusů
Authors HANČAR, Pavel (203 Czech Republic, guarantor, belonging to the institution).
Edition Masaryk University, Brno, RASLAN 2008, p. 94-97, 100 pp. 2008.
Publisher Masaryk University, Brno
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 60200 6.2 Languages and Literature
Country of publisher Czech Republic
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
RIV identification code RIV/00216224:14330/08:00024361
Organization unit Faculty of Informatics
ISBN 978-80-210-4741-9
UT WoS 000302212600015
Keywords in English corpus; desamb; vertjoin;
Tags corpus, desamb, vertjoin
Changed by RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 7/6/2021 22:14.
Abstract
This paper describes the creation of a big Czech corpus from many Czech corpora kept on the NLP Centre server. It describes the new tools developed for this purpose, the difficulties that may arise, and ways to solve them.
Abstract (in Czech)
Tento článek popisuje vytváření velkého českého korpusu z mnoha českých korpusů uložených na serveru centra NLP. Popisuje nástroje vytvořené k tomuto účelu, potíže, které se mohou objevit, a cesty jejich řešení.
Links
1ET200610406, research and development project
Name: Jazyková poradna na internetu
Investor: Academy of Sciences of the Czech Republic, Internet Language Consulting Service