SEHNAL, David, Sebastian BITTRICH, Sameer VELANKAR, Jaroslav KOČA, Radka SVOBODOVÁ, Stephen K. BURLEY and Alexander S. ROSE. BinaryCIF and CIFTools-Lightweight, efficient and extensible macromolecular data management. PLoS Computational Biology. San Francisco: Public Library of Science, 2020, vol. 16, No 10, p. 1-13. ISSN 1553-734X. Available from: https://dx.doi.org/10.1371/journal.pcbi.1008247.
Basic information
Original name BinaryCIF and CIFTools-Lightweight, efficient and extensible macromolecular data management
Authors SEHNAL, David (203 Czech Republic, belonging to the institution), Sebastian BITTRICH, Sameer VELANKAR, Jaroslav KOČA (203 Czech Republic, guarantor, belonging to the institution), Radka SVOBODOVÁ (203 Czech Republic, belonging to the institution), Stephen K. BURLEY and Alexander S. ROSE.
Edition PLoS Computational Biology, San Francisco, Public Library of Science, 2020, 1553-734X.
Other information
Original language English
Type of outcome Article in a journal
Field of Study 10608 Biochemistry and molecular biology
Country of publisher United States of America
Confidentiality degree is not subject to a state or trade secret
Impact factor 4.475
RIV identification code RIV/00216224:14740/20:00117701
Organization unit Central European Institute of Technology
Doi http://dx.doi.org/10.1371/journal.pcbi.1008247
UT WoS 000585163600006
Keywords in English Structural Biology; Molecular Graphics; Data Curation
Tags rivok
Tags International impact, Reviewed
Changed by Mgr. Pavla Foltynová, Ph.D., učo 106624. Changed: 22/2/2021 13:09.
Abstract
3D macromolecular structural data is growing ever more complex and plentiful in the wake of substantive advances in experimental and computational structure determination methods including macromolecular crystallography, cryo-electron microscopy, and integrative methods. Efficient means of working with 3D macromolecular structural data for archiving, analyses, and visualization are central to facilitating interoperability and reusability in compliance with the FAIR Principles. We address two challenges posed by growth in data size and complexity. First, data size is reduced by bespoke compression techniques. Second, complexity is managed through improved software tooling and fully leveraging available data dictionary schemas. To this end, we introduce BinaryCIF, a serialization of Crystallographic Information File (CIF) format files that maintains full compatibility with related data schemas, such as PDBx/mmCIF, while reducing file sizes by more than a factor of two versus gzip-compressed CIF files. Moreover, for the largest structures, BinaryCIF provides even better compression: factors of ten and four versus CIF files and gzipped CIF files, respectively. Herein, we describe CIFTools, a set of libraries in Java and TypeScript for generic and typed handling of CIF and BinaryCIF files. Together, BinaryCIF and CIFTools enable lightweight, efficient, and extensible handling of 3D macromolecular structural data.
Links
EF16_013/0001777, research and development project. Name: ELIXIR-CZ: Capacity Building
LM2018131, research and development project. Name: Czech National Infrastructure for Biological Data (Acronym: ELIXIR-CZ)
Investor: Ministry of Education, Youth and Sports of the CR, Czech National Infrastructure for Biological Data