Enlargement of the Czech Question-Answering Dataset to SQAD v2.0

D 2017

ŠULGANOVÁ, Terézia, Marek MEDVEĎ and Aleš HORÁK

Basic information

Original name

Enlargement of the Czech Question-Answering Dataset to SQAD v2.0

Authors

ŠULGANOVÁ, Terézia (703 Slovakia, guarantor, belonging to the institution), Marek MEDVEĎ (703 Slovakia, belonging to the institution) and Aleš HORÁK (203 Czech Republic)

Edition

Brno, Proceedings of the Eleventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2017, p. 79-84, 6 pp. 2017

Publisher

Tribun EU

Other information

Language

English

Type of outcome

Article in proceedings

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

not subject to state or trade secrets

Publication form

printed version "print"

RIV identification code

RIV/00216224:14330/17:00095303

Organization unit

Faculty of Informatics

ISBN

978-80-263-1340-3

UT WoS

000426613500009

Keywords in English

question answering; QA dataset; SQAD

Abstract

In the original

In this paper, we present the second version of the Czech question-answering dataset, SQAD v2.0 (Simple Question Answering Database). The new version is a large extension of our original SQAD database. In the current release, the dataset contains nearly 9,000 question-answer pairs, complemented with manual annotation of question and answer types. All texts in the dataset (the source documents, the questions, and the respective answers) are provided with complete morphological annotation in plain-text format. We offer detailed statistics of the SQAD v2.0 dataset based on the new QA annotation.

Links

GA15-13277S, research and development project

Name: Hyperintensionální logika pro analýzu přirozeného jazyka (Hyperintensional Logic for Natural Language Analysis)

Investor: Czech Science Foundation

LM2015071, research and development project

Name: Jazyková výzkumná infrastruktura v České republice (Language Research Infrastructure in the Czech Republic; Acronym: LINDAT-Clarin)

Investor: Ministry of Education, Youth and Sports of the CR
