J 2020

Three is Better than One: Ensembling Math Information Retrieval Systems

NOVOTNÝ, Vít, Petr SOJKA, Michal ŠTEFÁNIK and Dávid LUPTÁK

Basic information

Original name

Three is Better than One: Ensembling Math Information Retrieval Systems

Authors

NOVOTNÝ, Vít (203 Czech Republic, belonging to the institution), Petr SOJKA (203 Czech Republic, guarantor, belonging to the institution), Michal ŠTEFÁNIK (703 Slovakia, belonging to the institution) and Dávid LUPTÁK (703 Slovakia, belonging to the institution)

Edition

CEUR Workshop Proceedings, Thessaloniki, Greece, M. Jeusfeld c/o Redaktion Sun SITE, Informatik V, RWTH Aachen. 2020, 1613-0073

Other information

Language

English

Type of outcome

Article in a professional periodical

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Greece

Confidentiality degree

is not subject to a state or trade secret

RIV identification code

RIV/00216224:14330/20:00116318

Organization unit

Faculty of Informatics

Keywords (in Czech)

vyhledávání matematiky; odpovědi na otázky; reprezentace matematiky; slovní embedingy; ansámbl

Keywords in English

math information retrieval; question answering; math representations; word embeddings; ensembling

Tags

information retrieval, machine learning, math indexing and retrieval, math information retrieval, MIR, SCM, similarity search, soft cosine measure

Tags

International impact, Reviewed
Changed: 3/1/2023 13:53, RNDr. Vít Starý Novotný, Ph.D.

Abstract

In the original

We report on the systems that the Math Information Retrieval group at Masaryk University (MIRMU) prepared for tasks 1 (find answers) and 2 (formula search) of the ARQMath lab at the CLEF conference. We prototyped three primary MIR systems, proposed several math representations to tackle the lab tasks, and evaluated the proposed systems and representations. We developed a novel algorithm for ensembling information retrieval systems that outperformed all our systems on task 1 and placed ninth out of the 23 competing submissions. Out-of-competition ensembles of all non-baseline primary submissions in the competition made available by the participants placed first on task 1 and third on task 2. Our prototypes will help to understand the challenging problems of answer and formula retrieval in the STEM domain and bring the possibility of accurate math information retrieval one step closer.

Links

MUNI/A/1076/2019, internal MU code
Name: Involvement of Faculty of Informatics students in the international scientific community 20 (Acronym: SKOMU)
Investor: Masaryk University, Category A
MUNI/A/1411/2019, internal MU code
Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, big data visualization and augmented reality.
Investor: Masaryk University, Category A