A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications

D 2015

ŠTEFANIČ, Stanislav and Matej LEXA

Basic information

Original title

A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications

Authors

ŠTEFANIČ, Stanislav (703 Slovakia, belonging to the institution) and Matej LEXA (703 Slovakia, guarantor, belonging to the institution)

Edition

Cham, Lecture Notes in Computer Science 9043, Bioinformatics and Biomedical Engineering, Third International Conference, IWBBIO 2015, Granada, Spain, April 15-17 2015, Proceedings, Part I, pp. 120-133, 14 pp. 2015

Publisher

Springer International Publishing

Other information

Language

English

Type of outcome

Proceedings paper

Field of study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Switzerland

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

References

URL

Impact factor

Impact factor: 0.402 in 2005

RIV identification code

RIV/00216224:14330/15:00082481

Organization unit

Faculty of Informatics

ISBN

978-3-319-16482-3

ISSN

DOI

http://dx.doi.org/10.1007/978-3-319-16483-0_12

Keywords in English

relational database; PostgreSQL; NoSQL; data flattening; automatic data denormalization

Tags

International impact, Reviewed

Changed: 3. 9. 2015 13:37, doc. Ing. Matej Lexa, Ph.D.

Abstract

In the original language

Relational databases are sometimes used to store biomedical and patient data in large clinical or international projects. This data is inherently deeply structured: records for individual patients contain a varying number of variables. When ad-hoc access to data subsets is needed, standard database access tools do not allow for rapid command prototyping and variable selection to create flat data tables. In the context of Thalamoss, an international research project on beta-thalassemia, we developed and experimented with an interactive variable selection method addressing these needs. Our newly developed Python library sqlAutoDenorm.py automatically generates SQL commands to denormalize a subset of database tables and their relevant records, effectively generating a flat table from arbitrarily structured data. The denormalization process can be controlled by a small number of user-tunable parameters. Python and R/Bioconductor are used for any subsequent data processing steps, including visualization, and Weka is used for machine learning on the generated data.
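
To make the idea of generating denormalizing SQL concrete, the following is a minimal sketch and not the sqlAutoDenorm.py interface, which this record does not describe. It assumes a hypothetical three-table schema (`patient`, `sample`, `measurement`), a hand-written foreign-key map, and made-up demo values, and it uses SQLite instead of PostgreSQL so that it runs without a server. The helper `build_flat_query` is an illustrative name only.

```python
import sqlite3

# Hypothetical schema description (an assumption for this sketch):
# child table -> (foreign-key column in the child, parent table).
FOREIGN_KEYS = {
    "sample": ("patient_id", "patient"),
    "measurement": ("sample_id", "sample"),
}


def build_flat_query(root, tables, columns):
    """Generate one SELECT that LEFT JOINs `tables` onto `root` along FOREIGN_KEYS."""
    joined = {root}
    sql = [f"SELECT {', '.join(columns)}", f"FROM {root}"]
    pending = list(tables)
    while pending:
        progressed = False
        for table in list(pending):
            fk_column, parent = FOREIGN_KEYS[table]
            if parent in joined:
                # The parent is already part of the query, so the child can be attached.
                sql.append(f"LEFT JOIN {table} ON {table}.{fk_column} = {parent}.id")
                joined.add(table)
                pending.remove(table)
                progressed = True
        if not progressed:
            raise ValueError(f"cannot reach tables from {root}: {pending}")
    return "\n".join(sql)


# Made-up demo data, only to show the shape of the flat result.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (id INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE sample (id INTEGER PRIMARY KEY, patient_id INTEGER, tissue TEXT);
CREATE TABLE measurement (id INTEGER PRIMARY KEY, sample_id INTEGER,
                          variable TEXT, value REAL);
INSERT INTO patient VALUES (1, 'P1'), (2, 'P2');
INSERT INTO sample VALUES (10, 1, 'blood');
INSERT INTO measurement VALUES (100, 10, 'HbF', 4.2), (101, 10, 'HbA2', 3.1);
""")

query = build_flat_query(
    "patient",
    ["measurement", "sample"],  # order does not matter; joins are resolved by reachability
    ["patient.code", "sample.tissue", "measurement.variable", "measurement.value"],
)
query += "\nORDER BY patient.code, measurement.variable"
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The LEFT JOINs keep patients that have no samples or measurements, so records with a varying number of variables still appear in the flat table, with NULLs in the missing columns.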

Links

7E13011, research and development project

Name: THALAssaemia MOdular Stratification System for personalized therapy of beta-thalassemia (Acronym: THALAMOSS)

Investor: Ministry of Education, Youth and Sports of the Czech Republic, THALAssaemia MOdular Stratification System for personalized therapy of beta-thalassemia

Cite

ŠTEFANIČ, Stanislav and Matej LEXA. A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications. In Ortuño, Francisco and Rojas, Ignacio. Lecture Notes in Computer Science 9043, Bioinformatics and Biomedical Engineering, Third International Conference, IWBBIO 2015, Granada, Spain, April 15-17 2015, Proceedings, Part I. Cham: Springer International Publishing, 2015, pp. 120-133. ISBN 978-3-319-16482-3. Available from: https://dx.doi.org/10.1007/978-3-319-16483-0_12.

@inproceedings{1229305,
   author = {Štefanič, Stanislav and Lexa, Matej},
   address = {Cham},
   booktitle = {Lecture Notes in Computer Science 9043, Bioinformatics and Biomedical Engineering, Third International Conference, IWBBIO 2015, Granada, Spain, April 15-17 2015, Proceedings, Part I},
   doi = {10.1007/978-3-319-16483-0_12},
   editor = {Ortuño, Francisco and Rojas, Ignacio},
   keywords = {relational database; PostgreSQL; NoSQL; data flattening; automatic data denormalization},
   howpublished = {printed version "print"},
   language = {eng},
   location = {Cham},
   isbn = {978-3-319-16482-3},
   pages = {120-133},
   publisher = {Springer International Publishing},
   title = {A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications},
   url = {http://link.springer.com/chapter/10.1007%2F978-3-319-16483-0_12},
   year = {2015}
}

TY  - CONF
ID  - 1229305
AU  - Štefanič, Stanislav
AU  - Lexa, Matej
PY  - 2015
TI  - A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications
PB  - Springer International Publishing
CY  - Cham
SN  - 9783319164823
KW  - relational database
KW  - PostgreSQL
KW  - NoSQL
KW  - data flattening
KW  - automatic data denormalization
UR  - http://link.springer.com/chapter/10.1007%2F978-3-319-16483-0_12
L2  - http://link.springer.com/chapter/10.1007%2F978-3-319-16483-0_12
N2  - Relational databases are sometimes used to store biomedical and patient data in large clinical or international projects. This data is inherently deeply structured: records for individual patients contain a varying number of variables. When ad-hoc access to data subsets is needed, standard database access tools do not allow for rapid command prototyping and variable selection to create flat data tables. In the context of Thalamoss, an international research project on beta-thalassemia, we developed and experimented with an interactive variable selection method addressing these needs. Our newly developed Python library sqlAutoDenorm.py automatically generates SQL commands to denormalize a subset of database tables and their relevant records, effectively generating a flat table from arbitrarily structured data. The denormalization process can be controlled by a small number of user-tunable parameters. Python and R/Bioconductor are used for any subsequent data processing steps, including visualization, and Weka is used for machine learning on the generated data.
ER  -

ŠTEFANIČ, Stanislav and Matej LEXA. A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications. In Ortuño, Francisco and Rojas, Ignacio. \textit{Lecture Notes in Computer Science 9043, Bioinformatics and Biomedical Engineering, Third International Conference, IWBBIO 2015, Granada, Spain, April 15-17 2015, Proceedings, Part I}. Cham: Springer International Publishing, 2015, pp.~120--133. ISBN~978-3-319-16482-3. Available from: https://dx.doi.org/10.1007/978-3-319-16483-0\_{}12.
