NOVOTNÝ, Vít, Petr SOJKA, Michal ŠTEFÁNIK and Dávid LUPTÁK. Three is Better than One: Ensembling Math Information Retrieval Systems. CEUR Workshop Proceedings. Thessaloniki, Greece: M. Jeusfeld c/o Redaktion Sun SITE, Informatik V, RWTH Aachen., 2020, vol. 2020, No 2696, p. 93-122. ISSN 1613-0073.
Basic information
Original name Three is Better than One: Ensembling Math Information Retrieval Systems
Authors NOVOTNÝ, Vít (203 Czech Republic, belonging to the institution), Petr SOJKA (203 Czech Republic, guarantor, belonging to the institution), Michal ŠTEFÁNIK (703 Slovakia, belonging to the institution) and Dávid LUPTÁK (703 Slovakia, belonging to the institution).
Edition CEUR Workshop Proceedings, Thessaloniki, Greece, M. Jeusfeld c/o Redaktion Sun SITE, Informatik V, RWTH Aachen. 2020, 1613-0073.
Other information
Original language English
Type of outcome Article in a journal
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Greece
Confidentiality degree is not subject to a state or trade secret
RIV identification code RIV/00216224:14330/20:00116318
Organization unit Faculty of Informatics
Keywords (in Czech) vyhledávání matematiky; odpovědi na otázky; reprezentace matematiky; slovní embedingy; ansámbl
Keywords in English math information retrieval; question answering; math representations; word embeddings; ensembling
Tags information retrieval, machine learning, math indexing and retrieval, math information retrieval, MIR, SCM, similarity search, soft cosine measure
Tags International impact, Reviewed
Changed by RNDr. Vít Starý Novotný, Ph.D., učo 409729. Changed: 3/1/2023 13:53.
Abstract
We report on the systems that the Math Information Retrieval group at Masaryk University (MIRMU) prepared for tasks 1 (find answers) and 2 (formula search) of the ARQMath lab at the CLEF conference. We prototyped three primary MIR systems, proposed several math representations to tackle the lab tasks, and evaluated the proposed systems and representations. We developed a novel algorithm for ensembling information retrieval systems that outperformed all our systems on task 1 and placed ninth out of the 23 competing submissions. Out-of-competition ensembles of all non-baseline primary submissions in the competition made available by the participants placed first on task 1 and third on task 2. Our prototypes will help to understand the challenging problems of answer and formula retrieval in the STEM domain and bring the possibility of accurate math information retrieval one step closer.
Links
MUNI/A/1076/2019, internal MU code. Name: Involvement of students of the Faculty of Informatics in the international scientific community 20 (Acronym: SKOMU)
Investor: Masaryk University, Category A
MUNI/A/1411/2019, internal MU code. Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, big data visualization, and augmented reality.
Investor: Masaryk University, Category A