Why TeX math search is more relevant now than ever (invited talk 21.5.2012, Portsmouth University Computing Seminar, UK)

SOJKA, Petr

Basic information

Original name

Why TeX math search is more relevant now than ever (invited talk 21.5.2012, Portsmouth University Computing Seminar, UK)

Name in Czech

Proč je TeXové hledání matematiky dnes důležitější než dříve (zvaná přednáška 21.5.2012, Portsmouth University Computing Seminar, Portsmouth, UK)

Authors

SOJKA, Petr (203 Czech Republic, guarantor, belonging to the institution)

Edition

University of Portsmouth Computing Seminar, 2012

Other information

Language

English

Type of outcome

Invited lectures

Field of Study

10101 Pure mathematics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to a state or trade secret

References:

RIV identification code

RIV/00216224:14330/12:00060008

Organization unit

Faculty of Informatics

Keywords (in Czech)

vyhledávání matematických formulí; TeX; DML-CZ; workflow digitalizace; digitální knihovny; pdfJbim; jbig2enc; PDF recompression

Keywords in English

math-aware search; mathematics knowledge management; TeX; DML-CZ; digitization workflow; digital libraries; pdfJbim; jbig2enc; PDF recompression

Tags

International impact
Changed: 12/9/2012 14:37, doc. RNDr. Petr Sojka, Ph.D.

Abstract

In the original

TeX is around 30 years old; it was conceived and written before the advent of MathML, not to mention the Internet. At that time, indexing and searching mathematics was just a futuristic idea. When people jumped on the Google bandwagon, it was predicted that old technologies such as TeX mark-up for math would disappear in time (it is not properly used for tokenization and indexing). The advent of the Internet and the W3C brought mark-up and global search to the attention of the public. Somehow mark-up was acceptable again. The recent move to semantic search and MathML has brought renewed attention to the need for an unambiguous canonical representation of math in texts. As part of the project of building the European Digital Mathematics Library (http://www.eudml.eu) we have designed and implemented a math search engine, MIaS (http://nlp.fi.muni.cz/projekty/eudml/mias). It currently indexes and searches more than 160,000,000 formulae originally written by authors in TeX in their scientific papers. We will present the system and discuss ways towards a global math search engine based on the TeX math notation.

Links

LA09016, research and development project
Name: Participation of the Czech Republic in the European Research Consortium for Informatics and Mathematics (ERCIM) (Acronym: ERCIM)
Investor: Ministry of Education, Youth and Sports of the CR, Czech Republic membership in the European Research Consortium for Informatics and Mathematics
250503, internal MU code
Name: The European Digital Mathematics Library (Acronym: EuDML)
Investor: European Union