Language Resources for Intelligent Processing of Dialogues
about Electrical Networks

D 2006

HORÁK, Aleš; Lukáš SVOBODA; Vladimír KADLEC and Pavel CENEK

Basic information

Original title

Language Resources for Intelligent Processing of Dialogues about Electrical Networks

Title in Czech

Jazykové zdroje pro inteligentní zpracování dialogů o elektrických sítích

Authors

HORÁK, Aleš; Lukáš SVOBODA; Vladimír KADLEC and Pavel CENEK

Edition

Ostrava, Proceedings of ElNet 2005, pp. 42-49, 7 pp., 2006

Publisher

VŠB TU Ostrava

Other information

Language

English

Type of outcome

Proceedings paper

Field of study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality

is not subject to a state or trade secret

Marked for transfer to RIV

Yes

RIV code

RIV/00216224:14330/06:00015281

Organization unit

Faculty of Informatics

ISBN

80-248-0975-3

Keywords in English

corpora; question answering; disambiguation; electrical networks

Tags

corpora, disambiguation, electrical networks, question answering

Flags

Reviewed

Changed: 9. 1. 2007 11:21, doc. RNDr. Aleš Horák, Ph.D.

Abstract

In the original

The paper describes the design of a natural language dialogue interface for querying large databases containing temporal data about electrical power network failures. The first stage of implementing such an interface is the creation and preparation of several auxiliary resources required for natural language processing of texts in this specific domain. All modern methods for automatic analysis of texts from a domain with specialized terminology rely on a large collection of texts from the field, a so-called text corpus. We describe the process and statistical results of building a corpus of electrical power network texts containing more than 100,000 positions (words and marks). We also offer some preliminary results of syntactic analysis of these texts. In the last part of the paper, we present the design of a dialogue system, based on analysis techniques that use the corpus data, that will allow natural language queries (in Czech) over the database of power network failures.

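Links

1ET100300414, research and development project

Name: Inteligentní metody pro zvýšení spolehlivosti elektrických sítí (Intelligent methods for increasing the reliability of electrical networks)

Investor: Academy of Sciences of the Czech Republic, Intelligent methods for increasing the reliability of electrical networks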
