Other formats:
BibTeX
LaTeX
RIS
@misc{346746,
  author    = {Popelínský, Lubomír and Pavelek, Tomáš and Ptáčník, Tomáš},
  address   = {Brno (CZE)},
  keywords  = {Lemma disambiguation; Corpus; Natural language processing; Machine learning},
  language  = {eng},
  location  = {Brno (CZE)},
  publisher = {FI MU},
  title     = {On Disambiguation in Czech Corpora},
  year      = {2000}
}
TY  - GEN
ID  - 346746
AU  - Popelínský, Lubomír
AU  - Pavelek, Tomáš
AU  - Ptáčník, Tomáš
PY  - 2000
TI  - On Disambiguation in Czech Corpora
PB  - FI MU
CY  - Brno (CZE)
KW  - Lemma disambiguation
KW  - Corpus
KW  - Natural language processing
KW  - Machine learning
N2  - Lemma disambiguation means finding the basic word form, typically nominative singular for nouns or infinitive for verbs. We developed a multistrategy method for lemma disambiguation of unannotated text. The method is based on a combination of inductive logic programming and instance-based learning. We present results of the most important subtasks of lemma disambiguation for Czech language. Although no expert knowledge on Czech grammar has been used the accuracy reaches 90% with a fraction of words remaining ambiguous. We also display first results of tag disambiguation.
ER  -
POPELÍNSKÝ, Lubomír, Tomáš PAVELEK and Tomáš PTÁČNÍK. \textit{On Disambiguation in Czech Corpora}. Brno (CZE): FI MU, 2000, 12 pp.