RŮŽIČKA, Michal, Petr SOJKA and Martin LÍŠKA. Math Indexer and Searcher under the Hood: History and Development of a Winning Strategy. In Noriko Kando, Hideo Joho, Kazuaki Kishida. Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies. Tokyo: National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430 Japan, 2014, p. 127-134. ISBN 978-4-86049-065-2.
Basic information
Original name Math Indexer and Searcher under the Hood: History and Development of a Winning Strategy
Authors RŮŽIČKA, Michal (203 Czech Republic, belonging to the institution), Petr SOJKA (203 Czech Republic, guarantor, belonging to the institution) and Martin LÍŠKA (703 Slovakia, belonging to the institution).
Edition Tokyo, Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, p. 127-134, 8 pp. 2014.
Publisher National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430 Japan
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Japan
Confidentiality degree is not subject to a state or trade secret
Publication form storage medium (CD, DVD, flash disk)
RIV identification code RIV/00216224:14330/14:00076746
Organization unit Faculty of Informatics
ISBN 978-4-86049-065-2
Keywords (translated from Czech) MIaS; MathML; indexing; searching; canonical MathML; EuDML; digital libraries; information systems; indexing and searching of mathematical content including formulae; relevance and similarity ranking of mathematical papers; text mining; DML-CZ; digital mathematical library; semantics
Keywords in English MIaS; MathML; math indexing and retrieval; canonical MathML; EuDML; mathematical digital libraries; information systems; information retrieval; mathematical content search; document ranking of mathematical papers; math text mining; WebMIaS; TeX; Lucene
Tags best1, firank_B
Tags International impact
Changed by: RNDr. Michal Růžička, Ph.D., učo 143424. Changed: 2/6/2016 11:38.
Abstract
This paper describes and summarizes the experience of the Masaryk University team MIRMU with the mathematical search performed for the NTCIR pilot Math Task. Our approach is similarity search based on MathML canonicalization and the second generation of the scalable full-text search engine Math Indexer and Searcher (MIaS), with attested state-of-the-art information retrieval techniques. The capabilities of the MIaS system in terms of math query notation, normalization, and combining math with textual query tokens were exercised by submitting multiple runs with the four query notations provided, and with results merged from multiple queries. The analysis of the evaluation results shows that the system performs best using TeX queries that are translated and canonicalized to Content MathML.
Links
LG13010, research and development project. Name: Zastoupení ČR v European Research Consortium for Informatics and Mathematics (Acronym: ERCIM-CZ)
Investor: Ministry of Education, Youth and Sports of the CR
MUNI/A/0765/2013, MU internal code. Name: Zapojení studentů Fakulty informatiky do mezinárodní vědecké komunity (Acronym: SKOMU)
Investor: Masaryk University, Category A