Bi7528 Analysis of genomic and proteomic data

Faculty of Science
Autumn 2014
Extent and Intensity
2/0/0. 2 credit(s) (fasci plus compl plus > 4). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), graded credit, z (credit).
Mgr. Eva Budinská, Ph.D. (lecturer)
Guaranteed by
prof. RNDr. Ladislav Dušek, Ph.D.
RECETOX - Faculty of Science
Contact Person: Mgr. Eva Budinská, Ph.D.
Supplier department: RECETOX - Faculty of Science
Thu 8:00–9:50 A1/609 - IBA (A1,6.p, Kamenice 3)
Bi5040 Biostatistics - basic course || Bi5045 Biostatistics for Comp. Biol.
Bi5040 Biostatistics - basic course or Bi5045 Biostatistics for Computational Biology. A solid foundation in biostatistics, molecular biology and genetics is necessary. Having attended the following courses is an advantage: Bi7527 Data analysis in R, Bi8600 Multivariate Methods, DSMBz01 Molecular Biology and Genetics, Bi3060 Basic genetics, B7250 Human genetics.
Course Enrolment Limitations
The course is offered to students of any study field.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 0/30, only registered: 0/30, only registered with preference (fields directly associated with the programme): 0/30
Course objectives
After completing the course, the student:
Knows the basic types of biological and medical questions addressed in genomic and proteomic experiments.
Knows selected technologies that produce high-density genomic and proteomic data (types of DNA microarrays, arrayCGH, 2D gel electrophoresis and mass spectrometry).
Knows the basic data types produced by genomic and proteomic technologies and their drawbacks from a biostatistician's point of view.
Can list basic steps of genomic and proteomic data analysis.
Is aware of the technological details of microarrays, mass spectrometry and 2D electrophoresis that can influence data structure, quality and subsequent analyses.
Understands basic methods of quantification leading to raw data matrix and the necessity of further data quality control and normalization.
Knows specific, technology dependent sources of noise in the data.
Can identify this noise in the data using graphical and statistical tools.
Applies statistical methods to remove the noise from the data.
Is able to perform necessary and specific data transformations (normalization).
Can standardize measurements between experiments.
Creates the final data matrix of samples and proteins/genes for downstream analyses from multiple raw datasets.
Can identify and remove batch effects in the data.
Describes general principles of analysis of genomic and proteomic data.
Selects the correct method for hypothesis testing based on the hypothesis and data type.
Understands and applies SAM and limma.
Applies hypothesis testing for detection of differentially expressed genes and proteins.
Knows basic statistical methods for class prediction and applies them to genomic and proteomic data.
Knows basic non-parametric data-mining techniques and applies them to genomic and proteomic data.
Knows the advantages and drawbacks of different prediction methods.
Applies MAQC II standards for creating classifiers from microarray data.
Selects and applies multivariate regression strategies which combine gene expression and clinical data.
Applies the Cox proportional hazards model to assess the prognostic role of genes/proteins, and Kaplan-Meier estimates to compare groups defined by gene/protein expression.
Knows principles and methods for gene sets analysis.
Knows the basic principles and methods of gene network analysis.
Applies gene set analysis on a model example.
Knows public databases of genomic and proteomic data.
Knows Fisher Z-transformation and other basic meta-analytical concepts in genomic data.
Applies meta-analytical methods for ordering;
Performs genomic and proteomic analyses in R and Bioconductor.
Knows selected specific R and Bioconductor data structures and packages and applies them for data analysis.
Syllabus
  • 1. Challenges of genomic and proteomic technologies
  • 2. DNA microarrays: principles, types and design of probes, image analysis and data quantification
  • 3. Quality control and normalization of cDNA microarray data
  • 4. Quality control and normalization of oligonucleotide microarray data
  • 5. Mass spectrometry: principles, data quantification, data quality control and normalization
  • 6. 2D electrophoresis: principles, data quantification, data quality control and normalization
  • 7. Basic principles of downstream analysis of genomic and proteomic data
  • 8. Class comparison
  • 9. Class prediction
  • 10. Class discovery
  • 11. Survival analysis and other regression techniques
  • 12. Gene set and gene network analysis
  • 13. arrayCGH analysis
  • 14. Meta-analysis
    recommended literature
  • Meta-analysis and combining information in genetics and genomics. Edited by Rudy Guerra - Darlene Renee Goldstein. Boca Raton: CRC Press, 2010. xxiii, 335. ISBN 9781584885221.
  • GENTLEMAN, Robert. R programming for bioinformatics. Boca Raton: CRC Press, 2009. xii, 314. ISBN 9781420063677.
  • Bioinformatics and computational biology solutions using R and bioconductor. Edited by Robert Gentleman. New York: Springer, 2005. xix, 473. ISBN 0387251464.
  • Data analysis and visualization in genomics and proteomics. Edited by Francisco Azuaje - Joaquín Dopazo. Hoboken, NJ: John Wiley, 2005. xv, 267. ISBN 0470094397.
  • DRĄGHICI, Sorin. Data analysis tools for DNA microarrays. Boca Raton: Chapman & Hall/CRC, 2003. 477 s. +. ISBN 1-58488-315-4.
Teaching methods
Lectures are combined with practicals in R and its extension Bioconductor. The theory, concepts and methods are explained first; students then apply them to the analysis of real data examples.
Assessment methods
During the course, students can earn bonus points through homework; these points count toward the final score. The final exam is a written test with 10 questions. The maximum number of points is 30. Students may use all study materials, as the questions mainly test knowledge of important general principles and the ability to quickly apply what was learned in the course to a real analysis. A total of 16 points (homework and final exam combined) is required to complete the course successfully.
Language of instruction
Further comments (probably available only in Czech)
The course is taught annually.
Information on course enrolment limitations: Completion of Bi8600, DSMBz01 and Bi3060 is recommended.
The course is also listed under the following terms: Autumn 2010 - only for the accreditation, Autumn 2009, Autumn 2010, Autumn 2011, Autumn 2011 - accreditation, Autumn 2012, Autumn 2013, Spring 2016, Spring 2017, Spring 2018, Spring 2019, Spring 2020.