EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses

J 2021

AYETIRAN, Eniafe Festus, Petr SOJKA and Vít NOVOTNÝ

Basic information

Original name

EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses

Authors

AYETIRAN, Eniafe Festus (566 Nigeria, guarantor, belonging to the institution), Petr SOJKA (203 Czech Republic, belonging to the institution) and Vít NOVOTNÝ (203 Czech Republic, belonging to the institution)

Edition

Knowledge-Based Systems, Elsevier, 2021, 0950-7051

Other information

Language

English

Type of outcome

Article in a specialist periodical

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Netherlands

Confidentiality degree

is not subject to a state or trade secret

Impact factor

8.139

RIV identification code

RIV/00216224:14330/21:00120721

Organization unit

Faculty of Informatics

DOI

http://dx.doi.org/10.1016/j.knosys.2021.106902

UT WoS

000634868500007

Keywords in English

Multi-sense embeddings; Graph walk; Language generation; Distributional semantics; Distributional structures; Word sense disambiguation; Knowledge-based systems; Word similarity; Semantic applications

Abstract

In the original

Many language applications require word semantics as a core part of their processing pipeline, either for precise meaning inference or for semantic similarity. Multi-sense embeddings (M-SE) can be exploited for this requirement. M-SE seeks to represent each word by its distinct senses in order to resolve the conflation of word meanings used in different contexts. Previous works usually approach this task by training a model on a large corpus and often ignore the effect and usefulness of the semantic relations offered by lexical resources. However, even with large training data, coverage of all possible word senses is still an issue. In addition, a considerable percentage of contextual semantic knowledge is never learned because a huge number of possible distributional semantic structures are never explored. In this paper, we leverage the rich semantic structures in WordNet using a graph-theoretic walk technique over word senses to enhance the quality of multi-sense embeddings. This algorithm composes enriched texts from the original texts. Furthermore, we derive new distributional semantic similarity measures for M-SE from prior ones. We adapt these measures to the word sense disambiguation (WSD) aspect of our experiment. We report evaluation results on 11 benchmark datasets involving WSD and word similarity tasks and show that our method for enhancing distributional semantic structures improves embedding quality over the baselines. Despite the small training data, it achieves state-of-the-art performance on some of the datasets.
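
The graph-walk idea mentioned in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the toy sense graph, the sense identifiers, and the function names `walk_senses` and `compose_enriched_text` are all hypothetical, and a uniform random walk is assumed where the paper may use a different walk strategy over WordNet relations. The sketch only shows how walking from each sense-tagged token to related senses could produce a longer, "enriched" sequence for embedding training.

```python
import random

# Hypothetical toy sense graph: each sense ID maps to related senses.
# In a WordNet-backed setting the edges would come from semantic relations.
SENSE_GRAPH = {
    "bank.n.01": ["slope.n.01", "river.n.01"],
    "bank.n.02": ["institution.n.01", "money.n.01"],
    "slope.n.01": ["bank.n.01"],
    "river.n.01": ["bank.n.01", "water.n.01"],
    "water.n.01": ["river.n.01"],
    "institution.n.01": ["bank.n.02"],
    "money.n.01": ["bank.n.02", "deposit.v.01"],
    "deposit.v.01": ["money.n.01"],
}


def walk_senses(graph, start, steps, rng):
    """Return the senses visited by a uniform random walk of at most `steps` edges."""
    path = [start]
    current = start
    for _ in range(steps):
        neighbours = graph.get(current, [])
        if not neighbours:  # dead end: stop the walk early
            break
        current = rng.choice(neighbours)
        path.append(current)
    return path


def compose_enriched_text(sense_tagged_text, graph, steps=2, seed=0):
    """Expand each sense in the input with the senses reached by a short walk."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    enriched = []
    for sense in sense_tagged_text:
        enriched.extend(walk_senses(graph, sense, steps, rng))
    return enriched


if __name__ == "__main__":
    original = ["deposit.v.01", "money.n.01", "bank.n.02"]
    print(compose_enriched_text(original, SENSE_GRAPH))
```

Each original sense is followed by up to `steps` related senses, so the enriched sequence exposes sense co-occurrences that the original text alone would not contain.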

Links

MUNI/A/1411/2019, internal MU code

Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, visualization of large data, and augmented reality.

Investor: Masaryk University, Category A

MUNI/A/1549/2020, internal MU code

Name: Involvement of Faculty of Informatics students in the international scientific community 21 (Acronym: SKOMU)

Investor: Masaryk University

Cite

AYETIRAN, Eniafe Festus, Petr SOJKA and Vít NOVOTNÝ. EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses. Knowledge-Based Systems. Elsevier, 2021, vol. 219, article 106902. ISSN 0950-7051. Available from: https://dx.doi.org/10.1016/j.knosys.2021.106902.

@article{1681976,
   author = {Ayetiran, Eniafe Festus and Sojka, Petr and Novotný, Vít},
   article_number = {106902},
   doi = {10.1016/j.knosys.2021.106902},
   keywords = {Multi-sense embeddings; Graph walk; Language generation; Distributional semantics; Distributional structures; Word sense disambiguation; Knowledge-based systems; Word similarity; Semantic applications},
   language = {eng},
   issn = {0950-7051},
   journal = {Knowledge-Based Systems},
   title = {EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses},
   url = {https://doi.org/10.1016/j.knosys.2021.106902},
   volume = {219},
   year = {2021}
}

TY  - JOUR
ID  - 1681976
AU  - Ayetiran, Eniafe Festus
AU  - Sojka, Petr
AU  - Novotný, Vít
PY  - 2021
TI  - EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses
JF  - Knowledge-Based Systems
VL  - 219
SP  - 106902
EP  - 106902
PB  - Elsevier
SN  - 0950-7051
KW  - Multi-sense embeddings
KW  - Graph walk
KW  - Language generation
KW  - Distributional semantics
KW  - Distributional structures
KW  - Word sense disambiguation
KW  - Knowledge-based systems
KW  - Word similarity
KW  - Semantic applications
UR  - https://doi.org/10.1016/j.knosys.2021.106902
N2  - Many language applications require word semantics as a core part of their processing pipeline, either for precise meaning inference or for semantic similarity. Multi-sense embeddings (M-SE) can be exploited for this requirement. M-SE seeks to represent each word by its distinct senses in order to resolve the conflation of word meanings used in different contexts. Previous works usually approach this task by training a model on a large corpus and often ignore the effect and usefulness of the semantic relations offered by lexical resources. However, even with large training data, coverage of all possible word senses is still an issue. In addition, a considerable percentage of contextual semantic knowledge is never learned because a huge number of possible distributional semantic structures are never explored. In this paper, we leverage the rich semantic structures in WordNet using a graph-theoretic walk technique over word senses to enhance the quality of multi-sense embeddings. This algorithm composes enriched texts from the original texts. Furthermore, we derive new distributional semantic similarity measures for M-SE from prior ones. We adapt these measures to the word sense disambiguation (WSD) aspect of our experiment. We report evaluation results on 11 benchmark datasets involving WSD and word similarity tasks and show that our method for enhancing distributional semantic structures improves embedding quality over the baselines. Despite the small training data, it achieves state-of-the-art performance on some of the datasets.
ER  -

AYETIRAN, Eniafe Festus, Petr SOJKA and Vít NOVOTNÝ. EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses. \textit{Knowledge-Based Systems}. Elsevier, 2021, vol.~219, article~106902. ISSN~0950-7051. Available from: https://dx.doi.org/10.1016/j.knosys.2021.106902.
