Information System of Masaryk University 

Exploiting Semantic Annotations in Math Information Retrieval


SOJKA, Petr. Exploiting Semantic Annotations in Math Information Retrieval. In Jaap Kamps, Jussi Karlgren, Peter Mika, Vanessa Murdock. Proceedings of ESAIR 2012. Maui, USA: ACM, 2012. p. 15-16, 2 pp. ISBN 978-1-4503-1717-7. doi:10.1145/2390148.2390157.
Basic information
Original name Exploiting Semantic Annotations in Math Information Retrieval
Name in Czech Využití sémantického značkování pro vyhledávání matematiky
Authors SOJKA, Petr (203 Czech Republic, guarantor, belonging to the institution).
Edition Maui, USA, Proceedings of ESAIR 2012, p. 15-16, 2 pp. 2012.
Publisher ACM
Other information
Original language English
Type of outcome article in proceedings
Field of Study Informatics
Country of publisher United States of America
Confidentiality degree is not subject to a state or trade secret
Publication form storage medium (CD, DVD, flash disk)
WWW poster workshop website DOI (ACM DL) preprint PDF
RIV identification code RIV/00216224:14330/12:00067468
Organization unit Faculty of Informatics
ISBN 978-1-4503-1717-7
UT WoS 000312604400008
Keywords (in Czech) MIaS;MathML;indexování;vyhledávání;kanonické MathML;EuDML;digitální knihovny;informační systémy;indexování hledání matematického obsahu včetně formulí;hodnocení relevance a podobnosti matematických článků;dolování v textech;DML-CZ;digitální matematická knihovna;sémantika
Keywords in English MIaS;MathML;indexing;search;canonical MathML;EuDML;digital libraries;information systems;information retrieval;mathematical content search;math indexing and retrieval;document ranking of math papers;text mining;DML-CZ;DML projects;semantics
Tags International impact, Reviewed
Changed by: doc. RNDr. Petr Sojka, Ph.D., učo 2378. Changed: 30. 5. 2013 01:11.
This paper describes the exploitation of semantic annotations in the design and architecture of the MIaS (Math Indexer and Searcher) system for mathematics retrieval. Based on the claim that navigational and research search are 'killer' applications for a digital library such as the European Digital Mathematics Library (EuDML), we argue for an approach based on Natural Language Processing techniques as used in corpus management systems such as the Sketch Engine, an approach that will reach web scalability and avoid inference problems. The main ideas are 1) to augment surface texts (including math formulae) with additional linked representations (maps) bearing semantic information (expanded formulae as text, canonicalized text and subformulae) for indexing, including support for indexing structural information (expressed as Content MathML or other tree structures), and 2) to use semantic user preferences to order the documents found. The semantic enhancements of the MIaS system are being implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene, with support for MathML tree indexing. Scalability issues have been checked against more than 400,000 arXiv documents.
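The idea of indexing a formula together with its subformulae can be illustrated with a minimal sketch. It is not the MIaS implementation: the function names `canonical` and `subformula_tokens`, the string form of the tokens, and the depth-based `decay` weight are all assumptions made for illustration. The sketch walks a small MathML tree and emits one index token per subtree, giving deeper (more specific) fragments a lower weight.

```python
import xml.etree.ElementTree as ET


def canonical(node):
    """Serialize a MathML subtree as a compact, whitespace-free string.

    Hypothetical token format, chosen only for this sketch.
    """
    tag = node.tag.split('}')[-1]  # drop an XML namespace prefix, if any
    children = list(node)
    if not children:
        return f"{tag}({(node.text or '').strip()})"
    return f"{tag}[" + ",".join(canonical(c) for c in children) + "]"


def subformula_tokens(node, depth=0, decay=0.5):
    """Yield (token, weight) for this subtree and every nested subtree.

    The weight decay ** depth is an assumed heuristic, not taken
    from the paper.
    """
    yield canonical(node), decay ** depth
    for child in node:
        yield from subformula_tokens(child, depth + 1, decay)


# Presentation MathML for x^2 + y
MATHML = (
    "<math><mrow><msup><mi>x</mi><mn>2</mn></msup>"
    "<mo>+</mo><mi>y</mi></mrow></math>"
)

root = ET.fromstring(MATHML)
tokens = list(subformula_tokens(root))
for token, weight in tokens:
    print(f"{weight:.3f}  {token}")
```

Each emitted token could then be stored as a term in a full-text index alongside the surrounding text, so that a query for a fragment such as x^2 also matches documents containing the larger formula.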
LA09016, research and development project
Name: Participation of the Czech Republic in the European Research Consortium for Informatics and Mathematics (ERCIM) (Acronym: ERCIM)
Investor: Ministry of Education, Youth and Sports of the CR, INGO
250503, internal MU code
Name: The European Digital Mathematics Library (Acronym: EuDML)
Investor: European Union, Competitiveness and Innovation Framework Programme
Type Name Uploaded/Created by Uploaded/Created Rights
991762 /1 Sojka, P. 14. 11. 2012


Wed 14. 11. 2012 09:14, doc. RNDr. Petr Sojka, Ph.D.


Right to read:
  • anyone on the Internet
Right to upload:
Right to administer:
  • a concrete person doc. RNDr. Petr Sojka, Ph.D., učo 2378
p15-sojka.pdf Licence Creative Commons  File version Sojka, P. 14. 11. 2012


Right to read:
  • anyone logged in the IS
Right to upload:
Right to administer:
  • a concrete person doc. RNDr. Petr Sojka, Ph.D., učo 2378
