Detailed Information on Publication Record
2011
The Art of Mathematics Retrieval (invited talk at Informatics Colloquium FI MU, 8.11.2011)
SOJKA, Petr
Basic information
Original name
The Art of Mathematics Retrieval (invited talk at Informatics Colloquium FI MU, 8.11.2011)
Name in Czech
Umění vyhledávání matematiky (zvaná přednáška na Informatickém kolokviu FI MU, 8.11.2011)
Authors
SOJKA, Petr (203 Czech Republic, guarantor, belonging to the institution)
Edition
Informatics Colloquium, 2011
Other information
Language
English
Type of outcome
Invited lectures
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
References:
RIV identification code
RIV/00216224:14330/11:00053852
Organization unit
Faculty of Informatics
Keywords (in Czech)
digitální matematická knihovna; vyhledávání; indexace; metadata s matematikou; DML-CZ; EuDML; MathML; TeX
Keywords in English
digital library; math search; math retrieval; indexing of mathematics; metadata handling; EuDML; semantics of mathematical documents; knowledge management; digitization; MathML; portal-systems; repositories of knowledge; DML-CZ
Tags
International impact
Changed: 9/11/2011 15:37, doc. RNDr. Petr Sojka, Ph.D.
Abstract
In the original
The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, are presented, and the design decisions are discussed. We argue for an approach based on Presentation MathML that uses the similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene and is used in the European Digital Mathematics Library (EuDML). Scalability was tested on more than 400,000 arXiv documents containing 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene.
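The idea of indexing subformulae of Presentation MathML can be illustrated with a minimal Python sketch. It is not the MIaS implementation: it assumes, purely for illustration, that every element subtree of a formula counts as one subformula and that its nesting depth is recorded alongside it. The sample formula and the helper name `subformulae` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: Presentation MathML for a^2 + b (no namespace, for brevity).
MATHML = (
    "<math>"
    "<mrow><msup><mi>a</mi><mn>2</mn></msup><mo>+</mo><mi>b</mi></mrow>"
    "</math>"
)


def subformulae(node, depth=0):
    """Yield (depth, serialized subtree) for node and every descendant element."""
    yield depth, ET.tostring(node, encoding="unicode")
    for child in node:
        yield from subformulae(child, depth + 1)


root = ET.fromstring(MATHML)
formula = root[0]  # the top-level <mrow> inside <math>
index_terms = list(subformulae(formula))

for depth, term in index_terms:
    print(depth, term)
```

In this sketch a single short formula already yields several index terms, which is consistent with the abstract's subformula count far exceeding its formula count. A real system would presumably normalize and weight these terms before handing them to the search engine; that step is left out here.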
Links
250503, internal MU code