How Many Dots Are Really Needed for Head-Driven Chart Parsing?

KADLEC, Vladimír and Pavel SMRŽ. How Many Dots Are Really Needed for Head-Driven Chart Parsing? Lecture Notes in Artificial Intelligence. Berlin: Springer, 2006, vol. 3831/2006, no. 1, pp. 483-492, 9 pp. ISSN 0302-9743.

Basic information
Original title	How Many Dots Are Really Needed for Head-Driven Chart Parsing?
Title in Czech	Kolik skutečně potřebujeme teček pro syntaktickou analýzu řízenou hlavou pravidla?
Authors	KADLEC, Vladimír (203 Czech Republic, guarantor) and Pavel SMRŽ (203 Czech Republic).
Edition	Lecture Notes in Artificial Intelligence, Berlin, Springer, 2006, 0302-9743.

Further information
Original language	English
Result type	Article in a scholarly periodical
Field	10201 Computer sciences, information science, bioinformatics
Publisher country	Czech Republic
Confidentiality	not subject to state or trade secrecy
Impact factor	0.302 in 2005
RIV code	RIV/00216224:14330/06:00015577
Organizational unit	Faculty of Informatics
UT WoS	000235805500046
Keywords in English	nlp; CFG; parsing
Tags	CFG, NLP, parsing
Changed by	RNDr. Vladimír Kadlec, Ph.D., učo 3541. Changed: 3. 2. 2006 15:21.

Abstract

This paper presents an improved form of head-driven chart parser suited to large context-free grammars. The basic method, HDddm (Head-Driven dependent dot move), is introduced first. Two variants then improve on the basic approach; both rest on the same idea: reducing the number of chart edges by modifying the form of the items (dotted rules). The first variant "unifies" items that share the analyzed part of the relevant rule, so only one dot is needed to mark the position before and after the covered part. The second applies the inverse strategy: it "eliminates" the parts that have not yet been covered, so no dot is needed. All the alternatives discussed are described in the form of parsing schemata. We also briefly mention a technique, based on a special trie-like data structure originally developed for Scrabble, that minimizes the extra information the algorithms need. We demonstrate the advantages of the described methods by significant decreases in the number of chart edges. Results are given for the standard set of testing grammars (and their respective inputs) as well as for a large and highly ambiguous Czech grammar.
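
To make the idea of reducing chart items more concrete, here is a minimal Python sketch. It is not the HDddm algorithm or the parsing schemata from the paper. It assumes a toy grammar, a hypothetical `head` index per rule, and dots that may move outward freely in either direction. It only compares how many distinct items exist when each item records its rule together with two dot positions, and how many remain when items are grouped by the covered part alone. All names (`Rule`, `TwoDotItem`, `all_items`) are illustrative.

```python
from typing import NamedTuple, Tuple


class Rule(NamedTuple):
    lhs: str
    rhs: Tuple[str, ...]
    head: int  # index of the head symbol in rhs (assumed, toy annotation)


class TwoDotItem(NamedTuple):
    """Head-driven dotted rule: rhs[left:right] is the part covered so far."""
    rule: Rule
    left: int
    right: int

    def covered(self) -> Tuple[str, ...]:
        return self.rule.rhs[self.left:self.right]


def initial_item(rule: Rule) -> TwoDotItem:
    # Start with only the head symbol between the two dots.
    return TwoDotItem(rule, rule.head, rule.head + 1)


def expansions(item: TwoDotItem):
    """Yield the items reached by moving one dot outward by one symbol."""
    if item.left > 0:
        yield TwoDotItem(item.rule, item.left - 1, item.right)
    if item.right < len(item.rule.rhs):
        yield TwoDotItem(item.rule, item.left, item.right + 1)


def all_items(rules):
    """Enumerate every two-dot item reachable from the initial items."""
    seen = set()
    stack = [initial_item(r) for r in rules]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        stack.extend(expansions(item))
    return seen


# Toy grammar (an assumption for illustration): three VP rules headed by V.
RULES = [
    Rule("VP", ("ADV", "V", "NP"), 1),
    Rule("VP", ("ADV", "V", "PP"), 1),
    Rule("VP", ("ADV", "V"), 1),
]

two_dot = all_items(RULES)
# Group items by covered part only; rules sharing that part collapse together.
unified = {item.covered() for item in two_dot}

print(len(two_dot), "two-dot items")
print(len(unified), "distinct covered parts")
```

For this toy grammar the sketch enumerates 10 two-dot items but only 6 distinct covered parts, because the three rules share the sequences `V` and `ADV V`. A real parser would additionally have to track span positions in the input and which rules each shared item stands for.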

Abstract (translated from Czech)

The article presents an improved head-driven parsing algorithm for context-free grammars. The basic method, HDddm, is described first. The two further variants, which improve the speed of this basic method, are based on a similar idea: reducing the number of edges generated by the chart parser by modifying the items (dotted rules). The first method unifies items that share the same already-analyzed part of the rule. The second variant, conversely, eliminates the part of the item that has not yet been analyzed. A structure originally designed for implementing the game Scrabble is proposed as the data structure for the algorithm. The advantages of the described techniques are demonstrated on standard test data for English and on a highly ambiguous context-free grammar of Czech.

Links
GA201/05/2781, R&D project	Name: Překlad českých vět do konstrukcí transparentní intenzionální logiky (Translation of Czech sentences into constructions of transparent intensional logic)
GA201/05/2781, R&D project	Investor: Czech Science Foundation (Grantová agentura ČR), Překlad českých vět do konstrukcí transparentní intenzionální logiky
