EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses

AYETIRAN, Eniafe Festus, Petr SOJKA and Vít NOVOTNÝ. EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses. Knowledge-Based Systems. Elsevier, vol. 2021, No 219, p. 106902-106915. ISSN 0950-7051. doi:10.1016/j.knosys.2021.106902. 2021.


Basic information
Original name	EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses
Authors	AYETIRAN, Eniafe Festus (566 Nigeria, guarantor, belonging to the institution), Petr SOJKA (203 Czech Republic, belonging to the institution) and Vít NOVOTNÝ (203 Czech Republic, belonging to the institution).
Edition	Knowledge-Based Systems, Elsevier, 2021, 0950-7051.

Other information
Original language	English
Type of outcome	Article in a journal
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Netherlands
Confidentiality degree	is not subject to a state or trade secret
WWW	DOI preprint
Impact factor	8.139
RIV identification code	RIV/00216224:14330/21:00120721
Organization unit	Faculty of Informatics
Doi	http://dx.doi.org/10.1016/j.knosys.2021.106902
UT WoS	000634868500007
Keywords in English	Multi-sense embeddings; Graph walk; Language generation; Distributional semantics; Distributional structures; Word sense disambiguation; Knowledge-based systems; Word similarity; Semantic applications
Tags	Knowledge-Based Systems, similarity search
Tags	International impact, Reviewed
Changed by	RNDr. Pavel Šmerk, Ph.D., university ID number (učo) 3880. Changed: 23/5/2022 14:19.

Abstract

Many language applications require word semantics as a core part of their processing pipeline, either for precise meaning inference or for semantic similarity. Multi-sense embeddings (M-SE) can be exploited for this important requirement. M-SE seeks to represent each word by its distinct senses in order to resolve the conflation of meanings of words as used in different contexts. Previous works usually approach this task by training a model on a large corpus and often ignore the effect and usefulness of the semantic relations offered by lexical resources. However, even with large training data, coverage of all possible word senses is still an issue. In addition, a considerable percentage of contextual semantic knowledge is never learned because a huge number of possible distributional semantic structures are never explored. In this paper, we leverage the rich semantic structures in WordNet using a graph-theoretic walk technique over word senses to enhance the quality of multi-sense embeddings. This algorithm composes enriched texts from the original texts. Furthermore, we derive new distributional semantic similarity measures for M-SE from prior ones. We adapt these measures to the word sense disambiguation (WSD) aspect of our experiment. We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks and show that our method for enhancing distributional semantic structures improves embedding quality over the baselines. Despite the small training data, it achieves state-of-the-art performance on some of the datasets.
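
The two ideas named in the abstract, a walk over a graph of word senses and a similarity measure over multi-sense embeddings, can be illustrated with a minimal sketch. It is not the paper's algorithm. The toy graph `SENSE_GRAPH`, the uniform random walk in `random_walk`, and the helper names are assumptions made for illustration, and `max_sim` shows only the widely used maximum-over-sense-pairs cosine similarity (a prior measure), not the new measures the paper derives.

```python
import math
import random

# Hypothetical toy sense graph. Each key is a word sense; each edge stands in
# for a WordNet-style semantic relation. The real WordNet graph is far larger.
SENSE_GRAPH = {
    "bank.n.01": ["slope.n.01", "river.n.01"],
    "bank.n.02": ["financial_institution.n.01", "money.n.01"],
    "slope.n.01": ["bank.n.01"],
    "river.n.01": ["bank.n.01"],
    "financial_institution.n.01": ["bank.n.02", "money.n.01"],
    "money.n.01": ["bank.n.02", "financial_institution.n.01"],
}


def random_walk(graph, start, length, rng):
    """Return a walk of at most `length` senses that begins at `start`.

    Assumption: the next sense is drawn uniformly from the neighbours.
    The paper's walk strategy may differ.
    """
    walk = [start]
    while len(walk) < length:
        neighbours = graph.get(walk[-1], [])
        if not neighbours:
            break  # dead end: stop early
        walk.append(rng.choice(neighbours))
    return walk


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def max_sim(senses_a, senses_b):
    """Maximum cosine similarity over all pairs of sense vectors."""
    return max(cosine(u, v) for u in senses_a for v in senses_b)


if __name__ == "__main__":
    rng = random.Random(0)
    # A walk yields a sequence of related senses that could serve as
    # additional training text alongside the original corpus.
    print(random_walk(SENSE_GRAPH, "bank.n.02", 5, rng))
    # Two senses for one word, one sense for the other (made-up vectors).
    print(max_sim([[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0]]))
```

Taking the maximum over sense pairs lets the closest pair of meanings decide the score, so an unrelated sense of a polysemous word does not drag the similarity down.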

Links
MUNI/A/1411/2019, MU internal code	Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, big data visualization and augmented reality.
MUNI/A/1411/2019, MU internal code	Investor: Masaryk University, Category A
MUNI/A/1549/2020, MU internal code	Name: Involvement of Faculty of Informatics students in the international scientific community 21 (Acronym: SKOMU)
MUNI/A/1549/2020, MU internal code	Investor: Masaryk University
