D 2020

Cthulhu Hails from Wales: N-gram Frequency Analysis of R'lyehian

NOVOTNÝ, Vít and Marie STARÁ

Basic information

Original name

Cthulhu Hails from Wales: N-gram Frequency Analysis of R'lyehian

Authors

NOVOTNÝ, Vít (203 Czech Republic, guarantor, belonging to the institution) and Marie STARÁ (203 Czech Republic, belonging to the institution)

Edition

Brno, Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, p. 87-92, 6 pp. 2020

Publisher

Tribun EU

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

RIV identification code

RIV/00216224:14330/20:00117107

Organization unit

Faculty of Informatics

ISBN

978-80-263-1600-8

ISSN

UT WoS

000655471300009

Keywords in English

H. P. Lovecraft; language identification; N-grams; R'lyehian

Tags

International impact
Changed: 13/5/2024 17:44, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

R'lyehian is a unique fictional language penned by the prolific 20th-century horror fiction author H. P. Lovecraft. Prior work in the area of the Lovecraftian mythos has not yet studied the similarities between R'lyehian and natural languages, which are crucial for determining its true origins.

We produced a comprehensive wordlist of R'lyehian and used open-source $N$-gram-based language identification tools to find the most similar natural languages to R'lyehian. From the comprehensive wordlist, we also constructed a frequency table of all unigraphs and digraphs in R'lyehian.

We show that R'lyehian is most similar to Celtic languages, which lays grounds for our hypothesis that R'lyeh, where Cthulhu lies dreaming, might be a place in Wales.

Our frequency tables will prove a useful resource for future work in the area of the Lovecraftian mythos.
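
The frequency tables mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the choice to count n-grams only within words, the lowercasing, and the treatment of the apostrophe as an ordinary character are all assumptions made for illustration. The three sample words come from Lovecraft's well-known R'lyehian phrase and are not the paper's wordlist.

```python
from collections import Counter


def ngram_counts(words, n):
    """Count character n-grams inside each word.

    Assumption: n-grams never span word boundaries, words are lowercased,
    and the apostrophe is kept as a regular character.
    """
    counts = Counter()
    for word in words:
        w = word.lower()
        for i in range(len(w) - n + 1):
            counts[w[i:i + n]] += 1
    return counts


def relative_frequencies(counts):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Illustrative sample only (hypothetical input, not the paper's wordlist).
sample_words = ["Cthulhu", "fhtagn", "R'lyeh"]

unigraphs = ngram_counts(sample_words, 1)
digraphs = ngram_counts(sample_words, 2)

# Print the five most frequent digraphs with their relative frequencies.
digraph_freqs = relative_frequencies(digraphs)
for gram, count in digraphs.most_common(5):
    print(f"{gram}\t{count}\t{digraph_freqs[gram]:.3f}")
```

Counting within words keeps the table independent of word order in the list; a variant that pads words with boundary markers would additionally capture word-initial and word-final patterns.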


Links

MUNI/A/1076/2019, internal MU code
Name: Involvement of students of the Faculty of Informatics in the international scientific community 20 (Acronym: SKOMU)
Investor: Masaryk University, Category A
MUNI/A/1411/2019, internal MU code
Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, visualization of large data, and augmented reality.
Investor: Masaryk University, Category A