D 2004

Manatee, Bonito and Word Sketches for Czech

RYCHLÝ, Pavel and Pavel SMRŽ

Basic information

Original name

Manatee, Bonito and Word Sketches for Czech

Name in Czech

Manatee, Bonito a Word Sketches pro češtinu

Authors

RYCHLÝ, Pavel (203 Czech Republic, guarantor) and Pavel SMRŽ (203 Czech Republic)

Edition

Saint-Petersburg, Proceedings of the Second International Conference on Corpus Linguistics, p. 124-132, 9 pp. 2004

Publisher

Saint-Petersburg State University Press

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Russian Federation

Confidentiality degree

is not subject to a state or trade secret

References:

RIV identification code

RIV/00216224:14330/04:00009665

Organization unit

Faculty of Informatics

ISBN

5-288-03531-8

Keywords in English

corpora; corpus management; statistics; word sketches
Changed: 18/1/2005 11:22, doc. RNDr. Pavel Smrž, Ph.D.

Abstract

In the original

This paper presents Manatee, a newly designed and developed system for managing corpora, especially extremely large ones with billions of words. It enables the efficient evaluation of complex queries and the computation of advanced statistics. The main functions of the tool are presented here, together with an introduction to its web-based graphical user interface, Bonito. The sophisticated statistical processing is demonstrated on the example of computing Word Sketches. Special attention is paid to the definition of word sketches for Czech and to the problems connected with the free word order of Czech.

In Czech (translated)

The paper deals with the newly designed and developed system Manatee, which can be used for corpus management, especially of large corpora, e.g. with a billion words. It also presents the new web-based user interface Bonito and the Word Sketches system for Czech.

Links

MSM 143300003, plan (intention)
Name: Human-computer interaction, dialog systems and assistive technologies
Investor: Ministry of Education, Youth and Sports of the CR, Human-computer interaction, dialog systems and assistive technologies
1ET100300419, research and development project
Name: Intelligent models, algorithms, methods and tools for building the semantic web
Investor: Academy of Sciences of the Czech Republic, Intelligent Models, Algorithms, Methods and Tools for the Semantic Web (realization)