Detailed Information on Publication Record
2016
Between Comparable and Parallel: English-Czech Corpus from Wikipedia
ŠTROMAJEROVÁ, Adéla, Vít BAISA and Marek BLAHUŠ
Basic information
Original name
Between Comparable and Parallel: English-Czech Corpus from Wikipedia
Authors
ŠTROMAJEROVÁ, Adéla (203 Czech Republic, guarantor, belonging to the institution), Vít BAISA (203 Czech Republic, belonging to the institution) and Marek BLAHUŠ (203 Czech Republic, belonging to the institution)
Edition
Brno, RASLAN 2016 Recent Advances in Slavonic Natural Language Processing, p. 3-8, 6 pp. 2016
Publisher
Tribun EU
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Publication form
printed version "print"
References:
RIV identification code
RIV/00216224:14330/16:00091974
Organization unit
Faculty of Informatics
ISBN
978-80-263-1095-2
ISSN
UT WoS
000466886400001
Keywords (in Czech)
paralelní korpus; srovnatelný korpus; Wikipedie
Keywords in English
parallel corpora; comparable corpora; Wikipedia
Tags
International impact, Reviewed
Changed: 27/5/2021 09:10, Mgr. et Mgr. Vít Baisa, Ph.D.
Abstract
In the original
We describe the process of creating a parallel corpus from Czech and English Wikipedias using methods which are language independent. The corpus consists of Czech and English Wikipedia articles, the Czech ones being translations of the English ones, is aligned on sentence level and is accessible in Sketch Engine corpus manager.
Links
LM2015071, research and development project
MUNI/A/0863/2015, internal MU code