An Architecture for Scientific Document Retrieval Using Textual
and Math Entailment Modules

PAKRAY, Partha and Petr SOJKA. An Architecture for Scientific Document Retrieval Using Textual and Math Entailment Modules. In Eighth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2014. Brno: Tribun EU, 2014, pp. 107-117. ISSN 2336-4289. Available from: https://dx.doi.org/10.13140/2.1.4036.2561.

Basic information
Original title	An Architecture for Scientific Document Retrieval Using Textual and Math Entailment Modules
Authors	PAKRAY, Partha (356 India, belonging to the institution) and Petr SOJKA (203 Czech Republic, guarantor, belonging to the institution).
Edition	Brno, Eighth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2014, pp. 107-117, 11 pp., 2014.
Publisher	Tribun EU

Other information
Original language	English
Type of outcome	Paper in proceedings
Field of study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Czech Republic
Confidentiality	not subject to a state or trade secret
Publication form	printed version "print"
WWW	article preprint, DOI
RIV code	RIV/00216224:14330/14:00077458
Organizational unit	Faculty of Informatics
ISSN	2336-4289
DOI	http://dx.doi.org/10.13140/2.1.4036.2561
UT WoS	000374560500014
Keywords in Czech	reprezentace jazyka; výběr významu; výběr významového slova; výběr významu slova; diskretizace reprezentace; reprezentace významu; empirická lingvistika
Keywords in English	natural language representation; priming; lexical priming; semantic priming; data discretization; language modelling; representation of meaning; personal mental lexicon; empirical linguistics
Tags	International impact
Changed by	doc. RNDr. Petr Sojka, Ph.D., učo 2378. Changed: 11 January 2017, 09:50.

Abstract

We present an architecture for scientific document retrieval. An existing system for textual and math-aware retrieval, Math Indexer and Searcher (MIaS), is designed to be extended by modules for textual and math-aware entailment. The goal is to increase the quality of retrieval (precision and recall) by handling natural language variations that express the same meaning in texts and/or formulae. The entailment modules are designed to use several ordered layers of processing on the lexical, syntactic and semantic levels, using natural language processing tools adapted for handling tree structures such as mathematical formulae. If these tools are not able to decide on the entailment, generic knowledge databases are used, deploying distributional semantics methods and tools. It is shown that using distributional semantics alone for semantic textual entailment decisions at the sentence level performs surprisingly well. Finally, further research plans to deploy the results in digital mathematical libraries are outlined.
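
The layered decision flow described in the abstract can be pictured with a minimal Python sketch. It is an illustration only, not the authors' implementation: the names `lexical_layer`, `cosine_fallback` and `entails`, the token-overlap rule, and the 0.5 threshold are all assumptions made here, and a plain bag-of-words cosine stands in for a real distributional semantics model. Each layer either decides or abstains (returns `None`), and the fallback is consulted only when every layer abstains.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Optional, Sequence

# A layer returns True/False when it can decide, or None to abstain.
Layer = Callable[[str, str], Optional[bool]]


def lexical_layer(text: str, hypothesis: str) -> Optional[bool]:
    """Hypothetical lexical layer: decide True when every hypothesis
    token occurs in the text; otherwise abstain."""
    text_tokens = set(text.lower().split())
    hyp_tokens = set(hypothesis.lower().split())
    return True if hyp_tokens and hyp_tokens <= text_tokens else None


def cosine_fallback(text: str, hypothesis: str, threshold: float = 0.5) -> bool:
    """Placeholder for a distributional similarity score: cosine of
    token-count vectors compared against an assumed threshold."""
    a = Counter(text.lower().split())
    b = Counter(hypothesis.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return norm > 0 and dot / norm >= threshold


def entails(text: str, hypothesis: str,
            layers: Sequence[Layer],
            fallback: Callable[[str, str], bool]) -> bool:
    """Try the ordered layers; use the fallback only if all abstain."""
    for layer in layers:
        decision = layer(text, hypothesis)
        if decision is not None:
            return decision
    return fallback(text, hypothesis)
```

Syntactic and semantic layers would be appended to the `layers` list in the order in which they should be tried; a call such as `entails(t, h, [lexical_layer], cosine_fallback)` exercises the lexical layer and the fallback only.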

Links
LG13010, R&D project	Name: Representation of the Czech Republic in the European Research Consortium for Informatics and Mathematics (Acronym: ERCIM-CZ)
LG13010, R&D project	Investor: Ministry of Education, Youth and Sports of the Czech Republic, Representation of the Czech Republic in the European Research Consortium for Informatics and Mathematics
250503, internal MU code	Name: The European Digital Mathematics Library (Acronym: EuDML)
250503, internal MU code	Investor: European Union, The European Digital Mathematics Library
