JAKUBÍČEK, Miloš, Pavel RYCHLÝ and Adam KILGARRIFF. Effective Corpus Virtualization. In Marc Kupietz, Hanno Biber, Harald Lüngen, Piotr Bański, Evelyn Breiteneder, Karlheinz Mörth, Andreas Witt, Jani Takhsha. Challenges in the Management of Large Corpora (CMLC-2). Reykjavik: EUROPEAN LANGUAGE RESOURCES ASSOCIATION-ELRA. pp. 7-9. ISBN 978-2-9517408-8-4. 2014.
Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1186171,
  author = {Jakubíček, Miloš and Rychlý, Pavel and Kilgarriff, Adam},
  address = {Reykjavik},
  booktitle = {Challenges in the Management of Large Corpora (CMLC-2)},
  editor = {Marc Kupietz and Hanno Biber and Harald Lüngen and Piotr Bański and Evelyn Breiteneder and Karlheinz Mörth and Andreas Witt and Jani Takhsha},
  keywords = {corpus; corpus linguistics; virtualization; indexing; database},
  howpublished = {electronic version "online"},
  language = {eng},
  location = {Reykjavik},
  isbn = {978-2-9517408-8-4},
  pages = {7--9},
  publisher = {EUROPEAN LANGUAGE RESOURCES ASSOCIATION-ELRA},
  title = {Effective Corpus Virtualization},
  url = {http://corpora.ids-mannheim.de/cmlc.html},
  year = {2014}
}
TY  - CONF
ID  - 1186171
AU  - Jakubíček, Miloš
AU  - Rychlý, Pavel
AU  - Kilgarriff, Adam
PY  - 2014
TI  - Effective Corpus Virtualization
PB  - EUROPEAN LANGUAGE RESOURCES ASSOCIATION-ELRA
CY  - Reykjavik
SN  - 9782951740884
KW  - corpus
KW  - corpus linguistics
KW  - virtualization
KW  - indexing
KW  - database
UR  - http://corpora.ids-mannheim.de/cmlc.html
N2  - In this paper we describe an implementation of corpus virtualization within the Manatee corpus management system. Under corpus virtualization we understand logical manipulation with corpora or their parts grouping them into new (virtual) corpora. We discuss the motivation for such a setup in detail and show space and time efficiency of this approach evaluated on an 11 billion word corpus of Spanish.
ER  -
JAKUBÍČEK, Miloš, Pavel RYCHLÝ and Adam KILGARRIFF. Effective Corpus Virtualization. In Marc Kupietz, Hanno Biber, Harald Lüngen, Piotr Ba\'nski, Evelyn Breiteneder, Karlheinz Mörth, Andreas Witt, Jani Takhsha. \textit{Challenges in the Management of Large Corpora (CMLC-2)}. Reykjavik: EUROPEAN LANGUAGE RESOURCES ASSOCIATION-ELRA. pp.~7--9. ISBN~978-2-9517408-8-4. 2014.