Caring for digital data in Archaeology


An introduction to open data practices

Petr Pajdla,
Tomáš Pavloň and
Olga Lečbychová

AIS CR

30 November 2021

“How science will be conducted another few decades ahead, we simply cannot predict. A few things are, however, very obvious (…). First, our technical ability to generate data, both for research and in society at large, far outpaces our abilities to make optimal use of those data for knowledge discovery and innovation. The statement that 90% of the total global data has been generated in the last two years will possibly stay true for many years to come.”

Mons, B. 2018. Data Stewardship For Open Science: Implementing FAIR principles. Boca Raton: CRC Press, Taylor & Francis Group.

Data

and other fancy words…

  • From Latin datum, (something) given.
  • Sets of values about individual objects.
  • Measured or observed…
  • Qualitative or quantitative…
  • In context, data are transformed into information.

Mons, B. 2018. Data Stewardship For Open Science: Implementing FAIR principles. Boca Raton: CRC Press, Taylor & Francis Group.
also Wikipedia https://en.wikipedia.org/wiki/Data


“(…) archaeological fieldwork, which creates archaeological data, also destroys the in situ archaeological evidence itself. Increasingly, the digital record may be the only source of information about archaeological research materials, as more materials are ‘born-digital’. It is essential, therefore, that the digital records that describe archaeological resources be made accessible and that their preservation be ensured. Providing access and long-term preservation are the goals of digital archiving.”

Archaeology Data Service and the Centre for Digital Antiquity 2013. Caring for Digital Data in Archaeology: A Guide to Good Practice. Oxford: Oxbow Books. https://guides.archaeologydataservice.ac.uk

FAIR data principles

The Open Science Training Handbook, https://book.fosteropenscience.eu/

Findable

  • unique and persistent identifiers (PID)
  • rich metadata descriptions
  • indexed in a searchable resource

Accessible

  • (meta)data retrievable by their PID using a standardized protocol
  • the protocol is open, free, and allows for authentication where necessary
  • metadata remains accessible even if data is not
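As a minimal sketch of what "retrievable by the PID using a standardized protocol" can mean in practice, a DOI can be turned into an HTTPS address served by the doi.org resolver. The function name `doi_to_url` is hypothetical and used only for illustration; the sketch handles DOIs only, not other kinds of PID.

```python
def doi_to_url(identifier: str) -> str:
    """Turn a DOI written in a common form into a resolvable HTTPS URL.

    Hypothetical helper for illustration; it only normalizes the text and
    does not contact the resolver.
    """
    doi = identifier.strip()
    # Strip the usual ways a DOI is prefixed in citations.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI name starts with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {identifier!r}")
    return "https://doi.org/" + doi


print(doi_to_url("doi:10.5281/zenodo.2668478"))
```

Opening the resulting address in a browser, or requesting it over HTTPS, leads to the landing page of the identified resource.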

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J. et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data, 3(1): 160018. DOI: https://doi.org/10.1038/sdata.2016.18.

Hollander, H., Morselli, F., Uiterwaal, F., Admiraal, F., Trippel, T., Giorgio, S.D., 2019. PARTHENOS Guidelines to FAIRify data management and make data reusable. DOI: https://doi.org/10.5281/zenodo.2668478

Interoperable

  • formal, shared and broadly applicable language for knowledge representation
  • vocabularies that themselves follow the FAIR principles
  • qualified references to other (meta)data

Reusable

  • richly described with a plurality of accurate and relevant attributes
  • clear and accessible license
  • detailed provenance
  • meet domain-relevant community standards
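The reusability points above can be sketched as a simple completeness check on a metadata record. The field names in `REQUIRED_FIELDS` and the example record are assumptions made for illustration; they do not come from any particular metadata standard, and a real project would follow the schema of its chosen repository.

```python
# Hypothetical minimal set of fields; not taken from a specific standard.
REQUIRED_FIELDS = ("identifier", "title", "creator", "license", "provenance")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Invented example record, for illustration only.
record = {
    "identifier": "10.1234/example",
    "title": "Finds list, trench 3",
    "creator": "Example Author",
    "license": "CC BY 4.0",
    "provenance": "",
}

print(missing_fields(record))
```

Here the record has a license but no provenance statement, so the check reports `provenance` as the gap to fill before deposit.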

Good practice

What to do while preparing your project,
during the project and at the end of the project.

Before the project


Create a data management plan (DMP)

  • A living document that details how data are handled during a research project and after its completion.
  • Preparing a DMP in an early stage of your project helps in identifying possible problems in all stages of your data life cycle…
  • Encourages good practice in handling data and helps to follow FAIR principles.

  • Data Stewardship Wizard - tool used at MUNI: https://ds-wizard.org/

During the project

Organize your work

  • Document what you are doing and what things mean (create metadata).
  • Organize files in a hierarchical directory structure.
  • Stick to file naming conventions.
  • Version data and code (use version control systems).
  • Use controlled vocabularies.
  • Work reproducibly (document processes using code).
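Two of the points above, a hierarchical directory structure and a file naming convention, can be sketched in code. Both the naming pattern and the directory names below are hypothetical examples, not a recommended standard; the point is only that a convention written down as code can be checked automatically.

```python
import re
from pathlib import PurePosixPath

# Hypothetical convention: <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
# lowercase letters, digits and hyphens only, no spaces.
NAME_PATTERN = re.compile(r"[a-z0-9]+_\d{8}_[a-z0-9-]+_v\d{2}\.[a-z0-9]+")

# Hypothetical directory skeleton for a project.
PROJECT_DIRS = ["data/raw", "data/processed", "code", "docs", "outputs"]


def follows_convention(filename: str) -> bool:
    """Check a file name against the (assumed) naming convention."""
    return NAME_PATTERN.fullmatch(filename) is not None


def project_paths(root: str) -> list[str]:
    """List the directories the skeleton would contain under root."""
    return [str(PurePosixPath(root) / d) for d in PROJECT_DIRS]


print(follows_convention("trench3_20211130_finds-list_v01.csv"))
print(follows_convention("Final version (2).xlsx"))
print(project_paths("survey2021"))
```

Keeping raw and processed data in separate directories makes it easier to document which files were produced by which step.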

reproducibility

The Turing Way project illustration by Scriberia. https://doi.org/10.5281/zenodo.3332807

At the end of the project

Preserve and give access

As open as possible, as closed as necessary.

  • Data publishing

  • Long term preservation (LTP)

    • use suitable file formats
    • format updates, migration, backups, etc. are ensured

Repositories:

Where to go next...

ADS & tDAR Guides to Good Practice

https://guides.archaeologydataservice.ac.uk
(updated online version)


Data Stewardship For Open Science

Mons, B. 2018. Data Stewardship For Open Science: Implementing FAIR principles. Boca Raton: CRC Press, Taylor & Francis Group.
