2024

How formulaic are inquisition records? Measuring lexical richness and text similarity in a corpus of Latin notarial documents

ZBÍRAL, David; Gideon KOTZÉ and Robert Laurence John SHAW

Basic information

Original title

How formulaic are inquisition records? Measuring lexical richness and text similarity in a corpus of Latin notarial documents

Edition

Formulaic Language in Historical Research and Data Extraction, 7-9 February 2024, Huygens Institute for the History and Culture of the Netherlands & Royal Netherlands Academy of Arts and Sciences, Amsterdam, Netherlands, 2024

Other information

Language

English

Type of outcome

Presentations at conferences

Field of study

60203 Linguistics

Country of publisher

Kingdom of the Netherlands

Confidentiality

Not subject to a state or trade secret

Marked for transfer to RIV

No

Organization unit

Faculty of Arts

Keywords (in Czech)

formulační jazyk; lingvistická analýza; středověké notářské listiny

Keywords in English

formulaic language; linguistic analysis; medieval notarial documents

Tags

International impact, Reviewed
Changed: 8 February 2025 20:23, Mgr. Ivona Vrzalová

Abstract

In the original language

It is a widely accepted axiom that medieval inquisition records, like many other types of notarial documents, are formulaic. However, the only notion of differing degrees of formulaicity is that some registers, such as the register of Jacques Fournier famously studied by Emmanuel Le Roy Ladurie, are “less formulaic” and thus “exceptional”. This exceptionality has even become, deservedly or not, an indication of reliability, which invests formulaicity with critical importance. It is thus surprising that no empirical studies exist that actually measure the formulaicity of medieval inquisition records, which would allow us to compare them systematically and inform the source criticism of this contested type of text. To bridge this gap, we apply methods of lexical richness measurement and text similarity analysis to an expertly cleaned corpus of digitized editions of Latin-language medieval inquisition records (ca. 1,300,000 tokens). This allows us to show that the formulaicity of inquisition records, rather than being a universal feature with anecdotal exceptions, is actually distributed along a scale, with some registers significantly more formulaic than others. We achieve this by investigating the distribution, diversity, and similarity of types, tokens, and larger segments of text, and by combining this with our knowledge of the texts in order to interpret the results. Besides comparing individual registers with one another, we are able to compare the degree of formulaicity between different genres of heresy trial records (such as depositions vs. sentences vs. abjurations), since part of our corpus (ca. 700,000 tokens) is segmented into specific documents provided with genre metadata.
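
The abstract does not name the specific measures used, so the following is only a minimal Python sketch of what "lexical richness" and "text similarity" measurements can look like in general. It assumes a plain type-token ratio, a moving-average variant (to reduce sensitivity to text length), and Jaccard similarity over word trigrams. The function names, the crude regex tokenizer, and the short Latin snippets are all invented for illustration; they are not taken from the authors' corpus or pipeline.

```python
import re


def tokenize(text):
    """Lowercase and extract runs of letters (a crude stand-in for real Latin tokenization)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens):
    """Share of distinct word forms (types) among all tokens; lower values suggest more repetition."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def moving_average_ttr(tokens, window=50):
    """Average the type-token ratio over sliding windows so texts of different lengths are comparable."""
    if len(tokens) <= window:
        return type_token_ratio(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def shingles(tokens, n=3):
    """Set of overlapping word n-grams, used as the unit of comparison between texts."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection divided by size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented example snippets (hypothetical, not quotations from any register).
dep_a = tokenize("item dixit quod vidit hereticos in domo sua")
dep_b = tokenize("item dixit quod vidit hereticos in domo patris")
dep_c = tokenize("interrogatus respondit se nihil scire de predictis")

print(type_token_ratio(tokenize("item dixit quod item dixit quod")))  # 0.5
print(jaccard(shingles(dep_a), shingles(dep_b)))  # 5 shared trigrams out of 7
print(jaccard(shingles(dep_a), shingles(dep_c)))  # 0.0, no shared trigrams
```

In a sketch like this, a register whose documents share many trigrams and show a low moving-average ratio would sit toward the more formulaic end of the scale; real work on Latin would additionally need lemmatization and orthographic normalization, which are omitted here.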

Links

101000442, internal MU code
Name: Networks of Dissent: Computational Modelling of Dissident and Inquisitorial Cultures in Medieval Europe (Acronym: DISSINET)
Investor: European Union, Networks of Dissent: Computational Modelling of Dissident and Inquisitorial Cultures in Medieval Europe, ERC (Excellent Science)