D 2013

Semi-automatic Theme-Rheme Identification

PALA, Karel and Ondřej SVOBODA

Basic information

Original name

Semi-automatic Theme-Rheme Identification

Authors

PALA, Karel (203 Czech Republic, guarantor, belonging to the institution) and Ondřej SVOBODA (203 Czech Republic, belonging to the institution)

Edition

Brno, Seventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2013, p. 39-48, 10 pp. 2013

Publisher

Tribun EU

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to state or trade secret

Publication form

printed version "print"

RIV identification code

RIV/00216224:14330/13:00070352

Organization unit

Faculty of Informatics

ISBN

978-80-263-0520-0

Keywords in English

theme-rheme; Functional Sentence Perspective; topic-focus articulation

Tags

International impact, Reviewed
Changed: 2/12/2013 15:19, Mgr. Lucia Kocincová

Abstract

In the original

In this paper we start from the theory of Functional Sentence Perspective, developed primarily by Firbas [1], Svoboda [2], and Sgall and Hajičová [3], and attempt to formulate a procedure that semi-automatically recognizes which sentence constituents carry information that is contextually dependent and thus known to the addressee (theme), which constituents contain new information (rheme), and which bear information that is neither thematic nor rhematic (transition). Once themes and rhemes are recognized as reliably as possible, we hope to investigate thematic progression (the thematic line) in texts in the future. The paper describes the core of the procedure and its experimental implementation for Czech, using the bushbank corpus CBB.Blog [4] as a data source. Because the task is complex, we offer only a basic evaluation, which, in our view, shows that the task is feasible.

Links

LM2010013, research and development project
Name: LINDAT-CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat (Acronym: LINDAT-Clarin)
Investor: Ministry of Education, Youth and Sports of the CR