D 2014

Partial Grammar Checking for Czech Using the SET Parser

KOVÁŘ, Vojtěch

Basic information

Original name

Partial Grammar Checking for Czech Using the SET Parser

Authors

KOVÁŘ, Vojtěch (203 Czech Republic, guarantor, belonging to the institution)

Edition

First edition. Berlin Heidelberg, 17th International Conference, TSD 2014, p. 308-314, 7 pp. 2014

Publisher

Springer Verlag

Other information

Language

English

Type of outcome

Article in proceedings

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

Impact factor

Impact factor: 0.402 in 2005

RIV identification code

RIV/00216224:14330/14:00077608

Organization unit

Faculty of Informatics

ISBN

978-3-319-10815-5

ISSN

Keywords in English

parser; SET; Czech; grammar checking; punctuation detection; syntactic analysis

Tags

International impact, Reviewed
Changed: 27/4/2015 06:15, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

Checking people’s writing for correctness is one of the prominent language technology applications. In the Czech language, punctuation errors and mistakes in subject-predicate agreement are among the most severe and most frequent errors people make, as there are complex and non-intuitive rules for both of these phenomena. At the same time, they involve numerous syntactic, semantic and pragmatic aspects, which makes them very difficult to formalize for automatic checking. In this paper, we present an automatic method for fixing errors in commas and subject-predicate agreement, using pattern-matching rule-based syntactic analysis provided by the SET parsing system. We explain the method and present a first evaluation of the overall accuracy.

Links

LM2010013, research and development project
Name: LINDAT-CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat (Acronym: LINDAT-Clarin)
Investor: Ministry of Education, Youth and Sports of the CR