2006
Language Resources for Intelligent Processing of Dialogues about Electrical Networks
HORÁK, Aleš; Lukáš SVOBODA; Vladimír KADLEC and Pavel CENEK
Basic information
Original title
Language Resources for Intelligent Processing of Dialogues about Electrical Networks
Title in Czech
Jazykové zdroje pro inteligentní zpracování dialogů o elektrických sítích
Authors
HORÁK, Aleš; Lukáš SVOBODA; Vladimír KADLEC and Pavel CENEK
Edition
Ostrava, Proceedings of ElNet 2005, pp. 42-49, 7 pp. 2006
Publisher
VŠB TU Ostrava
Further information
Language
English
Result type
Proceedings paper
Field
10201 Computer sciences, information science, bioinformatics
Publisher's country
Czech Republic
Confidentiality
not subject to state or trade secrecy
Marked for transfer to RIV
Yes
RIV code
RIV/00216224:14330/06:00015281
Organizational unit
Faculty of Informatics
ISBN
80-248-0975-3
Keywords in English
corpora; question answering; disambiguation; electrical networks
Flags
Peer-reviewed
Changed: 9. 1. 2007 11:21, doc. RNDr. Aleš Horák, Ph.D.
In the original
The paper describes the process of designing a natural language dialogue interface for querying large databases with time data about electrical power network failures. The first stage of implementing such a dialogue interface consists of creating and preparing several auxiliary resources required for natural language processing of texts in this specific domain. All modern methods of automatic analysis of input texts covering a domain with special terminology are based on a large collection of texts from the field, a so-called text corpus. We describe the process and statistical results of creating a corpus of electrical power network texts consisting of more than 100,000 positions (words and marks). We also offer some preliminary results of the syntactic analysis of these texts. In the last part of the paper, we present the design of a dialogue system, based on analysis techniques that use the corpus data, which will allow natural language queries (in Czech) over the database of power network failures.
Links
| 1ET100300414, R&D project |