D 2013

Similarity Search for Mathematics: Masaryk University team at the NTCIR-10 Math Task

LÍŠKA, Martin, Petr SOJKA and Michal RŮŽIČKA

Basic information

Original name

Similarity Search for Mathematics: Masaryk University team at the NTCIR-10 Math Task

Authors

LÍŠKA, Martin (703 Slovakia, belonging to the institution), Petr SOJKA (203 Czech Republic, guarantor, belonging to the institution) and Michal RŮŽIČKA (203 Czech Republic, belonging to the institution)

Edition

Tokyo, Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, p. 686-691, 6 pp. 2013

Publisher

National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430 Japan

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to a state or trade secret

Publication form

storage medium (CD, DVD, flash disk)

RIV identification code

RIV/00216224:14330/13:00068654

Organization unit

Faculty of Informatics

ISBN

978-4-86049-062-1

Keywords (in Czech)

MIaS;MathML;indexování;vyhledávání;kanonické MathML;EuDML;digitální knihovny;informační systémy;indexování hledání matematického obsahu včetně formulí;hodnocení relevance a podobnosti matematických článků;dolování v textech;DML-CZ;digitální matematická knihovna;sémantika

Keywords in English

math indexing and retrieval; mathematical digital libraries; information systems; information retrieval; mathematical content search; document ranking of mathematical papers; math text mining; WebMIaS; MIaS; TeX; Lucene

Tags

International impact
Changed: 28/4/2014 06:26, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

This paper describes and summarizes the experience of the Masaryk University team MIRMU with the mathematical search performed for the NTCIR pilot Math Task. Our approach is similarity search based on enhanced full-text search utilizing attested state-of-the-art techniques and implementations. The variability of the Math Indexer and Searcher (MIaS) system in terms of math query notation was tested by submitting multiple runs with the four query notations provided. The analysis of the evaluation results shows that the system performs best using TeX queries that are translated to combined Presentation-Content MathML.

Links

LG13010, research and development project
Name: Representation of the Czech Republic in the European Research Consortium for Informatics and Mathematics (Acronym: ERCIM-CZ)
Investor: Ministry of Education, Youth and Sports of the CR
250503, internal MU code
Name: The European Digital Mathematics Library (Acronym: EuDML)
Investor: European Union
