RYCHLÝ, Pavel and Pavel SMRŽ. Manatee, Bonito and Word Sketches for Czech. In Proceedings of the Second International Conference on Corpus Linguistics. Saint-Petersburg: Saint-Petersburg State University Press, 2004, p. 124-132. ISBN 5-288-03531-8.
Basic information
Original name Manatee, Bonito and Word Sketches for Czech
Name in Czech Manatee, Bonito a Word Sketches pro češtinu
Authors RYCHLÝ, Pavel (203 Czech Republic, guarantor) and Pavel SMRŽ (203 Czech Republic).
Edition Saint-Petersburg, Proceedings of the Second International Conference on Corpus Linguistics, p. 124-132, 9 pp. 2004.
Publisher Saint-Petersburg State University Press
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Russian Federation
Confidentiality degree is not subject to a state or trade secret
RIV identification code RIV/00216224:14330/04:00009665
Organization unit Faculty of Informatics
ISBN 5-288-03531-8
Keywords in English corpora; corpus management; statistics; word sketches
Tags corpora, Corpus Management, statistics, word sketches
Changed by doc. RNDr. Pavel Smrž, Ph.D., učo 1297. Changed: 18/1/2005 11:22.
Abstract
This paper presents Manatee, a newly designed and developed system for managing corpora, especially extremely large ones with billions of words. It enables efficient evaluation of complex queries and the computation of advanced statistics. The main functions of the tool are presented, together with its web-based graphical user interface, Bonito. The sophisticated statistical processing is demonstrated on the example of computing Word Sketches. Special attention is paid to the definition of word sketches for Czech and to problems connected with the free word order of Czech.
Abstract (translated from Czech)
The paper presents the newly designed and developed system Manatee, which can be used for corpus management, especially for large corpora, e.g. those with a billion words. It also introduces the new web user interface Bonito and the Word Sketches system for Czech.
Links
MSM 143300003, plan (intention). Name: Interakce člověka s počítačem, dialogové systémy a asistivní technologie
Investor: Ministry of Education, Youth and Sports of the CR, Human-computer interaction, dialog systems and assistive technologies
1ET100300419, research and development project. Name: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu
Investor: Academy of Sciences of the Czech Republic, Intelligent Models, Algorithms, Methods and Tools for the Semantic Web (realization)