Towards Useful Word Embeddings: Evaluation on Information
Retrieval, Text Classification, and Language Modeling

NOVOTNÝ, Vít, Michal ŠTEFÁNIK, Dávid LUPTÁK and Petr SOJKA. Towards Useful Word Embeddings: Evaluation on Information Retrieval, Text Classification, and Language Modeling. In Aleš Horák, Pavel Rychlý and Adam Rambousek (eds.). Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020. Brno: Tribun EU, 2020, p. 37-46. ISBN 978-80-263-1600-8.


Basic information
Original name	Towards Useful Word Embeddings: Evaluation on Information Retrieval, Text Classification, and Language Modeling
Authors	NOVOTNÝ, Vít (203 Czech Republic, guarantor, belonging to the institution), Michal ŠTEFÁNIK (703 Slovakia, belonging to the institution), Dávid LUPTÁK (703 Slovakia, belonging to the institution) and Petr SOJKA (203 Czech Republic, belonging to the institution).
Edition	Brno, Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, p. 37-46, 10 pp. 2020.
Publisher	Tribun EU

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Czech Republic
Confidentiality degree	is not subject to a state or trade secret
Publication form	printed version "print"
RIV identification code	RIV/00216224:14330/20:00117105
Organization unit	Faculty of Informatics
ISBN	978-80-263-1600-8
ISSN	2336-4289
UT WoS	000655471300004
Keywords (in Czech)	evaluace; slovní vektory; word2vec; fastText; vyhledávání informací; klasifikace textů; jazykové modelování
Keywords in English	Evaluation; word vectors; word2vec; fastText; information retrieval; text classification; language modeling
Tags	information retrieval, language modeling, machine learning, SCM, soft cosine measure, text classification, word embeddings
Tags	International impact
Changed by	Mgr. Michal Petr, učo 65024. Changed: 16/5/2022 15:08.

Abstract

Since the seminal work of Mikolov et al. (2013), word vectors of log-bilinear models have found their way into many NLP applications and have been extended with the positional model.

Although the positional model improves accuracy on the intrinsic English word analogy task, prior work has neglected its evaluation on extrinsic end tasks, which correspond to real-world NLP applications.

In this paper, we describe our first steps in evaluating positional weighting on the information retrieval, text classification, and language modeling extrinsic end tasks.

Links
MUNI/A/1076/2019, MU internal code	Name: Involvement of Faculty of Informatics students in the international scientific community 20 (Acronym: SKOMU)
MUNI/A/1076/2019, MU internal code	Investor: Masaryk University, Category A
MUNI/A/1411/2019, MU internal code	Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, big data visualization, and augmented reality.
MUNI/A/1411/2019, MU internal code	Investor: Masaryk University, Category A
