Data collection in historical network research : An extreme proposal

ZBÍRAL, David; Adam MERTEL; Robert Laurence John SHAW and Tomáš HAMPEJS

Basic information

Original title

Data collection in historical network research : An extreme proposal

Edition

Networks 2021, 5-10 July 2021, online, 2021

Other information

Language

English

Type of outcome

Conference presentation

Field of study

60304 Religious studies

Country of publisher

United Kingdom of Great Britain and Northern Ireland

Confidentiality

is not subject to a state or trade secret

Links

Marked for transfer to RIV

Yes

RIV identification code

RIV/00216224:14210/21:00119064

Organizational unit

Faculty of Arts

Keywords in English

historical data; data ontologies; databases; semantic text modelling; RDF

Tags

Flags

International impact, Reviewed
Changed: 5 April 2022 11:14, Mgr. Monika Kellnerová

Abstract

In the original language

The extent of data collection in historical network research (HNR) is often delimited by the specific hypotheses that drive the research in question. Such a parsimonious approach is completely logical and in many cases sufficient; moreover, there is no such thing as “total” data collection, because the data is to a degree in the eye of the beholder. At the same time, however, historical research has a tried and tested tradition of more “data-driven” research, where the close reading of sources often drives the direction of study more than the testing of hypotheses. In this paper, we present our experience of developing a thorough data model and user interface for the collection of structured data from medieval inquisitorial registers; we undertook this as part of a project that seeks to provide a networked perspective on religious dissent and its repression in the period (Dissident Networks Project / DISSINET, https://dissinet.cz). From this experience, we derive several proposals which should be of interest to historians who, on the continuous scale between hypothesis-driven and source-driven data collection, lean somewhat more towards the latter. Our point of departure is that a data model for source-driven data collection should allow as much relational complexity as natural language does. Our approach is not completely new from a conceptual or technical point of view; it is based on statements which build on the “semantic triple” and which are stored in a graph database. However, we dig quite deeply into the language of our sources to propose a way of recording its minutiae, allowing for modifiers (e.g., adjectives, adverbs), temporal and spatial relations, and modality (negative, question, possibility etc.), and giving specific meaning to the different actant positions (subject, objects) of each verb.
We thus preserve the semantic structure and detail of the source, while also producing highly structured data suited to various projections and various kinds of network analysis and visualization (as well as other computational methodologies). This approach to data collection thus amounts to modelling, in the first instance, the source itself. The talk does not focus on technical solutions (e.g., review of data collection environments) or standards. Rather, we explore conceptual issues and a practical workflow that we believe can be inspirational not only for HNR but for SNA more generally. In the terminology of the latter, our approach allows for a genuine “mixed methods” approach to research, standing at the intersection between the richness of qualitative detail and the power of quantitative analyses of structured relational data.
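
As an illustration of the kind of statement-based model the abstract describes, the following is a minimal Python sketch, not the DISSINET schema. It assumes a statement is an action (verb) with actants in named positions, optional modifiers, a modality, and optional place and time. All class names, field names, position labels, and the example persons and places are hypothetical and invented for illustration. The `project_edges` function shows one possible "projection" of such statements onto a directed network.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import product
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Actant:
    """An entity occupying a position (subject, object1, ...) of the verb."""
    entity: str
    position: str                      # hypothetical labels: "subject", "object1"
    modifiers: Tuple[str, ...] = ()    # e.g. adjectives attached to the entity


@dataclass
class Statement:
    """One source statement: an extended semantic triple (hypothetical model)."""
    action: str                                   # the verb
    actants: List[Actant]
    action_modifiers: Tuple[str, ...] = ()        # e.g. adverbs
    modality: str = "positive"                    # or "negative", "question", ...
    place: Optional[str] = None
    time: Optional[str] = None

    def in_position(self, position: str) -> List[str]:
        return [a.entity for a in self.actants if a.position == position]


def project_edges(statements, action, source_pos="subject", target_pos="object1"):
    """Project positively asserted statements with a given verb onto weighted directed edges."""
    edges = Counter()
    for s in statements:
        if s.action != action or s.modality != "positive":
            continue  # denials, questions, etc. stay in the data but not in this network
        for src, dst in product(s.in_position(source_pos), s.in_position(target_pos)):
            edges[(src, dst)] += 1
    return edges


# Invented example data, not taken from any register.
statements = [
    Statement("meets",
              [Actant("Person A", "subject"), Actant("Person B", "object1")],
              action_modifiers=("secretly",), place="Village X"),
    Statement("meets",
              [Actant("Person A", "subject"), Actant("Person C", "object1")],
              modality="negative"),
    Statement("meets",
              [Actant("Person B", "subject"),
               Actant("Person A", "object1", modifiers=("old",))]),
]

edges = project_edges(statements, "meets")
for (src, dst), weight in sorted(edges.items()):
    print(src, "->", dst, weight)
```

In this sketch the negated statement remains recorded with its full detail but is left out of the projected network. This is one way to keep the qualitative richness of the source alongside a structured relational view of it.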

Related projects

GX19-26975X, research and development project
Name: Dissident Religious Cultures in Medieval Europe from the Perspective of Social Network Analysis and Geographic Information Systems (Acronym: DISSINET)
Investor: Czech Science Foundation, Dissident Religious Cultures in Medieval Europe from the Perspective of Social Network Analysis and Geographic Information Systems