D 2011

Syntactic Analysis Using Finite Patterns: A New Parsing System for Czech

KOVÁŘ, Vojtěch, Aleš HORÁK and Miloš JAKUBÍČEK

Basic information

Original name

Syntactic Analysis Using Finite Patterns: A New Parsing System for Czech

Authors

KOVÁŘ, Vojtěch (203 Czech Republic, guarantor, belonging to the institution), Aleš HORÁK (203 Czech Republic, belonging to the institution) and Miloš JAKUBÍČEK (203 Czech Republic, belonging to the institution)

Edition

Berlin/Heidelberg, Human Language Technology. Challenges for Computer Science and Linguistics, p. 161-171, 11 pp. 2011

Publisher

Springer

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

References:

RIV identification code

RIV/00216224:14330/11:00049734

Organization unit

Faculty of Informatics

ISBN

978-3-642-20094-6

Keywords in English

syntactic analysis; free-word-order languages; an alternative approach; natural language processing

Tags

International impact, Reviewed
Changed: 20/6/2013 11:12, doc. RNDr. Aleš Horák, Ph.D.

Abstract

In the original

Syntactic analysis of natural languages is considered to be one of the basic steps toward advanced natural language processing, such as logical analysis or information retrieval with natural language texts. The Czech language can be characterized as a morphologically rich language with a relatively free word order, which further complicates the problem of syntactic analysis. Current parsing systems for Czech face many problems, including low precision or high ambiguity of the parser output. In this paper, we show a new approach to syntactic analysis of free-word-order languages based on the idea of pattern-matching linking rules. The system, named SET, is currently being developed and tested with the Czech language as a representative of free-word-order languages with a very rich morphological system. We briefly mention current approaches and parsing systems for Czech. Then we describe the basic ideas as well as details of SET’s prototype implementation of the pattern-matching approach to syntactic analysis.

In Czech (translated)

The article presents a new method for the syntactic analysis of languages with free word order, based on searching for finite patterns. The method is implemented in the SET system.

Links

GAP401/10/0792, research and development project
Name: Temporální aspekty znalostí a informací
Investor: Czech Science Foundation
LC536, research and development project
Name: Centrum komputační lingvistiky
Investor: Ministry of Education, Youth and Sports of the CR, Centrum komputační lingvistiky
2C06009, research and development project
Name: Prostředky tvorby komplexní báze znalostí pro komunikaci se sémantickým webem v přirozeném jazyce (Acronym: COT-SEWing)
Investor: Ministry of Education, Youth and Sports of the CR
248307, MU internal code
Name: Pattern Recognition-based Statistically Enhanced MT (Acronym: PRESEMT)
Investor: European Union, Pattern Recognition-based Statistically Enhanced MT, Cooperation