NOVOTNÝ, Vít and Marie STARÁ. Cthulhu Hails from Wales: N-gram Frequency Analysis of R'lyehian. In Aleš Horák, Pavel Rychlý, Adam Rambousek. Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020. Brno: Tribun EU, 2020, p. 87-92. ISBN 978-80-263-1600-8.
Basic information
Original name Cthulhu Hails from Wales: N-gram Frequency Analysis of R'lyehian
Authors NOVOTNÝ, Vít (203 Czech Republic, guarantor, belonging to the institution) and Marie STARÁ (203 Czech Republic, belonging to the institution).
Edition Brno, Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, p. 87-92, 6 pp. 2020.
Publisher Tribun EU
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Czech Republic
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
WWW Workshop homepage PDF
RIV identification code RIV/00216224:14330/20:00117107
Organization unit Faculty of Informatics
ISBN 978-80-263-1600-8
ISSN 2336-4289
Keywords in English H. P. Lovecraft; language identification; N-grams; R'lyehian
Tags language identification, Lovecraft, machine learning
Tags International impact
Changed by RNDr. Vít Starý Novotný, Ph.D., učo 409729. Changed: 1/11/2021 09:35.
Abstract

R'lyehian is a unique fictional language penned by the prolific 20th-century horror fiction author H. P. Lovecraft. Prior work in the area of the Lovecraftian mythos has not yet studied the similarities between R'lyehian and natural languages, which are crucial for determining its true origins.

We produced a comprehensive wordlist of R'lyehian and used open-source N-gram-based language identification tools to find the most similar natural languages to R'lyehian. From the comprehensive wordlist, we also constructed a frequency table of all unigraphs and digraphs in R'lyehian.
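A frequency table of unigraphs and digraphs like the one described above could be built roughly as follows. This is only a minimal sketch, not the authors' code: the function names `ngraph_counts` and `relative_frequencies` are hypothetical, the six sample words come from the well-known chant in Lovecraft's "The Call of Cthulhu" rather than from the paper's wordlist, and the choices to lowercase words, keep apostrophes, and count n-graphs only within word boundaries are assumptions.

```python
from collections import Counter


def ngraph_counts(words, n):
    """Count character n-graphs inside each word (none span two words)."""
    counts = Counter()
    for word in words:
        w = word.lower()  # assumption: case is not distinctive
        for i in range(len(w) - n + 1):
            counts[w[i:i + n]] += 1
    return counts


def relative_frequencies(counts):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {graph: count / total for graph, count in counts.items()}


# Illustrative sample only; NOT the wordlist compiled in the paper.
sample_words = ["ph'nglui", "mglw'nafh", "Cthulhu", "R'lyeh", "wgah'nagl", "fhtagn"]

unigraphs = ngraph_counts(sample_words, 1)
digraphs = ngraph_counts(sample_words, 2)

# Print the five most frequent digraphs with their relative frequencies.
digraph_freqs = relative_frequencies(digraphs)
for graph, _ in digraphs.most_common(5):
    print(f"{graph}\t{digraph_freqs[graph]:.3f}")
```

Counting within word boundaries keeps the table independent of word order in the list, which seems appropriate for a wordlist rather than running text.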

We show that R'lyehian is most similar to Celtic languages, which provides grounds for our hypothesis that R'lyeh, where Cthulhu lies dreaming, might be a place in Wales.

Our frequency tables will prove a useful resource for future work in the area of the Lovecraftian mythos.

Links
MUNI/A/1076/2019, internal MU code
Name: Involvement of students of the Faculty of Informatics in the international scientific community 20 (Acronym: SKOMU)
Investor: Masaryk University, Category A
MUNI/A/1411/2019, internal MU code
Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, big data visualization, and augmented reality.
Investor: Masaryk University, Category A