Information Extraction for Czech Based on Syntactic Analysis

BAISA, Vít and Vojtěch KOVÁŘ. Information Extraction for Czech Based on Syntactic Analysis. In Zygmunt Vetulani. Human Language Technologies as a Challenge for Computer Science and Linguistics, Proceedings of 5th Language and Technology Conference. Poznań: Fundacja Uniwersytetu im. A. Mickiewicza, 2011, p. 466-470. ISBN 978-83-932640-1-8.


Basic information
Original name	Information Extraction for Czech Based on Syntactic Analysis
Name in Czech	Extrakce informací pro češtinu založená na syntaktické analýze
Authors	BAISA, Vít (203 Czech Republic, guarantor, belonging to the institution) and Vojtěch KOVÁŘ (203 Czech Republic, belonging to the institution).
Edition	Poznań, Human Language Technologies as a Challenge for Computer Science and Linguistics, Proceedings of 5th Language and Technology Conference, p. 466-470, 5 pp. 2011.
Publisher	Fundacja Uniwersytetu im. A. Mickiewicza

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Czech Republic
Confidentiality degree	is not subject to a state or trade secret
RIV identification code	RIV/00216224:14330/11:00050162
Organization unit	Faculty of Informatics
ISBN	978-83-932640-1-8
UT WoS	000345651500013
Keywords (in Czech)	extrakce informací;syntaktická analýza;sémantická klasifikace;morfologická desambiguace
Keywords in English	information extraction; syntactic analysis; semantic classification; morphological disambiguation
Tags	International impact, Reviewed
Changed by	Mgr. et Mgr. Vít Baisa, Ph.D., učo 139654. Changed: 28/6/2012 12:45.

Abstract

We present a complex pipeline of natural language processing tools for Czech that extracts the basic facts presented in a text. The input to the tool is plain text; the output contains verb and noun phrases with a basic semantic classification. Automatic syntactic analysis of Czech plays a crucial role in the pipeline. In this paper, we describe the particular tools used in the system, give an example of its usage, and conclude with a basic evaluation of the overall system accuracy.

Abstract (in Czech)
Článek popisuje postupnou aplikaci několika nástrojů pro zpracování češtiny, jejímž výsledkem je extrakce základních faktů z textu. Vstupem nástroje je volný text, výstupem jsou jmenné a slovesné fráze spolu se základní sémantickou klasifikací. Důležitou roli hrají nástroje pro automatickou syntaktickou analýzu češtiny.

Links
GAP401/10/0792, research and development project	Name: Temporální aspekty znalostí a informací
GAP401/10/0792, research and development project	Investor: Czech Science Foundation
GA407/07/0679, research and development project	Name: Právní e-slovník - PES
GA407/07/0679, research and development project	Investor: Czech Science Foundation, Legal e-dictionary - PES
VF20102014003, research and development project	Name: Analýza přirozeného jazyka v prostředí internetu (Acronym: APJI)
VF20102014003, research and development project	Investor: Ministry of the Interior of the CR
