ZBÍRAL, David, Robert Laurence John SHAW, Tomáš HAMPEJS and Adam MERTEL. Maximising the Power of Semantic Textual Data : CASTEMO Data Collection and the InkVisitor Application. In DH2023: Collaboration as Opportunity, 10.-14. 07. 2023, Graz. 2023.
Basic information
Original name Maximising the Power of Semantic Textual Data : CASTEMO Data Collection and the InkVisitor Application
Name (in English) Maximising the Power of Semantic Textual Data : CASTEMO Data Collection and the InkVisitor Application
Authors ZBÍRAL, David, Robert Laurence John SHAW, Tomáš HAMPEJS and Adam MERTEL.
Edition DH2023: Collaboration as Opportunity, 10.-14. 07. 2023, Graz, 2023.
Other information
Type of outcome Presentations at conferences
Confidentiality degree is not subject to a state or trade secret
Keywords (in Czech) sémantické modelování textu (semantic text modelling); digital humanities; inkvizice (inquisition)
Keywords in English semantic text modelling; digital humanities; inquisition
Tags International impact, Reviewed
Changed by Mgr. Jolana Navrátilová, university ID (učo) 22838. Changed: 15/12/2023 10:19.
Abstract
The authors present Computer-Assisted Semantic Text Modelling (CASTEMO), a novel but now well-developed approach to transforming textual resources into rich structured data: CASTEMO knowledge graphs stored in JSON-based document databases. They also introduce the open-source InkVisitor research environment, which supports the CASTEMO data collection workflow. Both the workflow and the environment were developed within the ERC-funded Dissident Networks Project (DISSINET) but are now available for use by other researchers and projects. The CASTEMO data collection approach aims to preserve the rich qualitative texture of texts while producing structured data suitable for computational analysis. It preserves the contextual embeddedness of knowledge and the natural features of human knowledge, such as conflicting evidence and information given in a non-indicative modality, e.g. questions and conditional sentences. It thus answers a significant challenge in the digital study of texts, where researchers must often choose between extracting content and analysing discursive features, and between distant and close reading. With CASTEMO, these levels can be readily interwoven into “scalable reading”. This presentation introduces the essential data modelling principles of CASTEMO, as well as its use cases and advantages for certain types of study. It also introduces the InkVisitor research environment.
Links
101000442, internal MU code. Name: Networks of Dissent: Computational Modelling of Dissident and Inquisitorial Cultures in Medieval Europe (Acronym: DISSINET)
Investor: European Union, ERC (Excellent Science)