R 2014

MIaS 1.5

LÍŠKA, Martin and Petr SOJKA

Basic information

Original name

MIaS 1.5

Authors

LÍŠKA, Martin (703 Slovakia, belonging to the institution) and Petr SOJKA (203 Czech Republic, guarantor, belonging to the institution)

Edition

2014

Other information

Language

English

Type of outcome

Software

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to a state or trade secret

RIV identification code

RIV/00216224:14330/14:00073351

Organization unit

Faculty of Informatics

Keywords in English

MIaS; Math Indexer and Searcher

Technical parameters

MIaS is a command-line application for indexing and searching documents that contain mathematical notation. It uses the Lucene full-text search core together with its own tokenizer implementation, MIaSMath, which processes the mathematics. Petr Sojka, FI MU Brno, Botanická 68a, 60200 Brno, CZ, tel. +420549496966

Tags

International impact
Changed: 25/11/2014 06:08, doc. RNDr. Petr Sojka, Ph.D.

Abstract

In the original

A math-aware search engine based on full-text indexing that enables users to search for mathematical formulae inside documents. The search engine is unique because it is able to index and search structural information such as the representation of mathematical formulae. No other software or IR system is able to store three billion formulae in its index and search it with a response time below one second. MIaS processes documents containing mathematical notation in MathML format. The system is built as an extension that can be attached to any full-text indexing engine and has been verified on the state-of-the-art Lucene core. It is scalable: it was verified to index almost the whole of arxiv.org (440,000 papers), containing more than 160,000,000 formulae. The software is used in EuDML (eudml.org) and other digital libraries. For more details see the papers in peer-reviewed conferences:
[1] Sojka, Petr; Líška, Martin. In Matthew R. B. Hardy, Frank Wm. Tompa. Proceedings of the 2011 ACM Symposium on Document Engineering. Mountain View, CA, USA: ACM, 2011, pp. 57-60.
[2] Sojka, Petr; Líška, Martin. In J. H. Davenport, W. M. Farmer, J. Urban, F. Rabe. Intelligent Computer Mathematics, LNCS 6824. Springer, 2011, pp. 228-243.
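
As an illustration of what indexing the structure of a formula can mean, the following minimal Python sketch turns a MathML formula into one token per subformula, so that a full-text engine could match a query against any part of an indexed formula. It is not the MIaSMath tokenizer: the function names `canonical` and `subformula_tokens`, the compact string form, and the use of tree depth as a stand-in for a ranking weight are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace from a tag such as '{ns}msup'."""
    return tag.split("}", 1)[-1]


def canonical(node):
    """Serialize a MathML subtree into a compact string (hypothetical format)."""
    name = local_name(node.tag)
    children = list(node)
    if not children:
        # Leaf elements (mi, mn, mo, ...) carry the symbol as text.
        return f"{name}({(node.text or '').strip()})"
    return f"{name}[" + ",".join(canonical(c) for c in children) + "]"


def subformula_tokens(mathml):
    """Return (token, depth) pairs, one for every subtree of the formula.

    The depth is recorded only as an assumed stand-in for a weight:
    deeper subformulae are smaller parts of the original formula.
    """
    root = ET.fromstring(mathml)
    tokens = []

    def walk(node, depth):
        tokens.append((canonical(node), depth))
        for child in node:
            walk(child, depth + 1)

    walk(root, 0)
    return tokens


# Sample input: the formula x squared in presentation MathML.
SAMPLE = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msup><mi>x</mi><mn>2</mn></msup>"
    "</math>"
)

if __name__ == "__main__":
    for token, depth in subformula_tokens(SAMPLE):
        print(depth, token)
```

Each token could then be stored as a term in an ordinary inverted index. A real system would also need to normalize variable names and constants, which this sketch leaves out.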

Links

LG13010, research and development project
Name: Representation of the Czech Republic in the European Research Consortium for Informatics and Mathematics (Acronym: ERCIM-CZ)
Investor: Ministry of Education, Youth and Sports of the CR
1ET200190513, research and development project
Name: DML-CZ: Czech Digital Mathematical Library (Česká digitální matematická knihovna)
Investor: Academy of Sciences of the Czech Republic, DML-CZ: Czech Digital Mathematical Library