SOJKA, Petr and Martin LÍŠKA. Indexing and Searching Mathematics in Digital Libraries -- Architecture, Design and Scalability Issues. In James H. Davenport, William M. Farmer, Josef Urban, Florian Rabe. Intelligent Computer Mathematics Lecture Notes in Computer Science, 2011, Volume 6824/2011. Berlin / Heidelberg: Springer, 2011, p. 228-243. ISBN 978-3-642-22672-4. Available from: https://dx.doi.org/10.1007/978-3-642-22673-1_16.
Other formats:
BibTeX
LaTeX
RIS
@inproceedings{945754,
  author = {Sojka, Petr and Líška, Martin},
  address = {Berlin / Heidelberg},
  booktitle = {Intelligent Computer Mathematics Lecture Notes in Computer Science, 2011, Volume 6824/2011},
  doi = {10.1007/978-3-642-22673-1_16},
  editor = {James H. Davenport and William M. Farmer and Josef Urban and Florian Rabe},
  keywords = {math indexing and retrieval; mathematical digital libraries; information systems; information retrieval; mathematical content search; document ranking of mathematical papers; math text mining; MIaS; WebMIaS},
  howpublished = {printed version "print"},
  language = {eng},
  location = {Berlin / Heidelberg},
  isbn = {978-3-642-22672-4},
  pages = {228--243},
  publisher = {Springer},
  title = {Indexing and Searching Mathematics in Digital Libraries -- Architecture, Design and Scalability Issues},
  url = {http://dx.doi.org/10.1007/978-3-642-22673-1_16},
  year = {2011}
}
TY  - CONF
ID  - 945754
AU  - Sojka, Petr
AU  - Líška, Martin
PY  - 2011
TI  - Indexing and Searching Mathematics in Digital Libraries -- Architecture, Design and Scalability Issues
PB  - Springer
CY  - Berlin / Heidelberg
SN  - 9783642226724
KW  - math indexing and retrieval
KW  - mathematical digital libraries
KW  - information systems
KW  - information retrieval
KW  - mathematical content search
KW  - document ranking of mathematical papers
KW  - math text mining
KW  - MIaS
KW  - WebMIaS
UR  - http://dx.doi.org/10.1007/978-3-642-22673-1_16
N2  - This paper surveys approaches and systems for searching mathematical formulae in mathematical corpora and on the web. The design and architecture of our MIaS (Math Indexer and Searcher) system is presented, and our design decisions are discussed in detail. An approach based on Presentation MathML using a similarity of math subformulae is suggested and verified by implementing it as a math-aware search engine based on the state-of-the-art system, Apache Lucene. Scalability issues were checked based on 324,000 real scientific documents from the arXiv archive with 112 million mathematical formulae. More than two billion MathML subformulae were indexed using our Solr-compatible Lucene extension.
ER  -
SOJKA, Petr and Martin LÍŠKA. Indexing and Searching Mathematics in Digital Libraries -- Architecture, Design and Scalability Issues. In James H. Davenport, William M. Farmer, Josef Urban, Florian Rabe. \textit{Intelligent Computer Mathematics Lecture Notes in Computer Science, 2011, Volume 6824/2011}. Berlin / Heidelberg: Springer, 2011, p.~228--243. ISBN~978-3-642-22672-4. Available from: https://dx.doi.org/10.1007/978-3-642-22673-1\_{}16.