@inproceedings{560586,
  author    = {Rychlý, Pavel and Smrž, Pavel},
  address   = {Saint-Petersburg},
  booktitle = {Proceedings of the Second International Conference on Corpus Linguistics},
  keywords  = {corpora; corpus management; statistics; word sketches},
  language  = {eng},
  location  = {Saint-Petersburg},
  isbn      = {5-288-03531-8},
  pages     = {124--132},
  publisher = {Saint-Petersburg State University Press},
  title     = {Manatee, Bonito and Word Sketches for Czech},
  url       = {http://nlp.fi.muni.cz/publications/corpora2004_pary_smrz/},
  year      = {2004}
}
TY  - JOUR
ID  - 560586
AU  - Rychlý, Pavel
AU  - Smrž, Pavel
PY  - 2004
TI  - Manatee, Bonito and Word Sketches for Czech
PB  - Saint-Petersburg State University Press
CY  - Saint-Petersburg
SN  - 5288035318
KW  - corpora
KW  - corpus management
KW  - statistics
KW  - word sketches
UR  - http://nlp.fi.muni.cz/publications/corpora2004_pary_smrz/
N2  - This paper deals with a newly designed and developed system Manatee that can be employed to manage corpora, especially extremely large ones with billions of words, and enables the efficient evaluation of complex queries and the computation of advanced statistics. The main functions of the tool are presented here, together with the introduction of its web-based graphical user interface, Bonito. The sophisticated statistical processing is demonstrated in an example of computing of Word Sketches. Special attention is paid to the definition of the word sketches for Czech and problems connected to its free word order.
ER  -
RYCHLÝ, Pavel and Pavel SMRŽ. Manatee, Bonito and Word Sketches for Czech. In \textit{Proceedings of the Second International Conference on Corpus Linguistics}. Saint-Petersburg: Saint-Petersburg State University Press, 2004, p.~124--132. ISBN~5-288-03531-8.