PřF:Bi7528 Anal genom proteom data - Course Information
Bi7528 Analysis of genomic and proteomic data
Faculty of Science
- Extent and Intensity
- 2/0/0. 2 credit(s) (fasci plus compl plus > 4). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), graded credit, z (credit).
- Teacher(s)
- Mgr. Eva Budinská, Ph.D. (lecturer)
RNDr. Ivana Ihnatová, Ph.D. (lecturer)
Mgr. Barbora Zwinsová (lecturer)
- Guaranteed by
- prof. RNDr. Ladislav Dušek, Ph.D.
RECETOX - Faculty of Science
Contact Person: Mgr. Eva Budinská, Ph.D.
Supplier department: RECETOX - Faculty of Science
- Timetable
- Mon 20. 2. to Mon 22. 5. Wed 12:00–13:50 A1/609 - IBA (A1,6.p, Kamenice 3)
- Prerequisites
- Bi5040 Biostatistics - basic course || Bi5045 Biostatistics for Comp. Biol.
Bi5040 Biostatistics - basic course or Bi5045 Biostatistics for Computational Biology. A solid foundation in biostatistics, molecular biology and genetics is necessary. Having attended the following courses is an advantage: Bi7527 Data analysis in R, Bi8600 Multivariate Methods, DSMBz01 Molecular Biology and Genetics, Bi3060 Basic genetics, B7250 Human genetics.
- Course Enrolment Limitations
- The course is offered to students of any study field.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 1/30, only registered: 0/30, only registered with preference (fields directly associated with the programme): 0/30
- Course objectives
- After completing the course, the student:
Knows the basic types of biological and medical questions in genomic and proteomic experiments;
Knows selected technologies that are sources of high-density genomic and proteomic data (types of DNA microarrays, arrayCGH, and mass spectrometry);
Knows the basic data types produced by genomic and proteomic technologies and their drawbacks from a biostatistician's point of view.
Can list basic steps of genomic and proteomic data analysis.
Is aware of technological details of microarrays and mass spectrometry that can influence data structure, quality and subsequent analyses.
Understands basic methods of quantification leading to raw data matrix and the necessity of further data quality control and normalization.
Knows specific, technology dependent sources of noise in the data.
Can identify this noise in the data using graphical and statistical tools.
Applies statistical methods to remove the noise from the data.
Is able to perform necessary and specific data transformations (normalization).
Can standardize measurements between experiments.
Creates a final data matrix of samples and proteins/genes from multiple raw datasets for downstream analyses.
Can identify and remove batch effects in the data.
Describes general principles of analysis of genomic and proteomic data.
Selects the correct method for hypothesis testing based on the hypothesis and data type.
Understands and applies SAM and limma.
Applies hypothesis testing for detection of differentially expressed genes and proteins.
Knows basic statistical methods for class prediction and applies them to genomic and proteomic data.
Knows basic non-parametric data-mining techniques and applies them to genomic and proteomic data.
Knows the advantages and drawbacks of different prediction methods.
Applies MAQC II standards for creating classifiers from microarray data.
Selects and applies multivariate regression strategies which combine gene expression and clinical data.
Applies the Cox proportional hazards model to assess the prognostic role of genes/proteins, and Kaplan-Meier estimates to compare groups defined by gene/protein expression.
Knows principles and methods for gene set analysis.
Knows basic principles and methods for gene network analysis.
Applies gene set analysis on a model example.
Knows public databases of genomic and proteomic data.
Knows Fisher Z-transformation and other basic meta-analytical concepts in genomic data.
Applies meta-analytical methods for ordering ;
Performs genomic and proteomic analyses in R and Bioconductor;
Knows selected specific R and Bioconductor data structures and packages and applies them for data analysis.
- Syllabus
- 1. Challenges of genomic and proteomic technologies
- 2. DNA microarrays: principles, types and design of probes, image analysis and data quantification
- 3. Quality control and normalization of cDNA microarray data
- 4. Quality control and normalization of oligonucleotide microarray data
- 5. Mass spectrometry: principles, data quantification, data quality control and normalization
- 6. Basic principles of downstream analysis of genomic and proteomic data
- 7. Class comparison
- 8. Class prediction
- 9. Class discovery
- 10. Survival analysis and other regression techniques
- 11. Gene set and gene network analysis
- 12. aCGH analysis
- Literature
- recommended literature
- Meta-analysis and combining information in genetics and genomics. Edited by Rudy Guerra - Darlene Renee Goldstein. Boca Raton: CRC Press, 2010. xxiii, 335. ISBN 9781584885221.
- GENTLEMAN, Robert. R programming for bioinformatics. Boca Raton: CRC Press, 2009. xii, 314. ISBN 9781420063677.
- Bioinformatics and computational biology solutions using R and Bioconductor. Edited by Robert Gentleman. New York: Springer, 2005. xix, 473. ISBN 0387251464.
- Data analysis and visualization in genomics and proteomics. Edited by Francisco Azuaje - Joaquín Dopazo. Hoboken, NJ: John Wiley, 2005. xv, 267. ISBN 0470094397.
- DRĄGHICI, Sorin. Data analysis tools for DNA microarrays. Boca Raton: Chapman & Hall/CRC, 2003. 477 pp. +. ISBN 1-58488-315-4.
- Teaching methods
- The lectures are combined with practicals in R and its extension Bioconductor. First, the theory, concepts and methods are explained; students then apply these concepts to the analysis of real data. Several biological problems/projects will be presented at the beginning of the semester. Students will be divided into groups, and each group chooses a project that it will analyze during the semester and use as a model dataset for applying the presented methods. In the second half of the semester, students will present their interim results in lectures.
- Assessment methods
- The final written test will consist of approximately 10 questions, worth 20 points in total. This test will count for 50% of the final evaluation. The remaining 20 points will be awarded for activity during the course (5 points) and for the quality of the project (15 points). To complete the course successfully, a student must achieve at least 21 points, including at least 10 points from the project. Students may use all study materials during the test, as the questions are designed mainly to test knowledge of important general principles and the ability to quickly apply the knowledge acquired during the course when performing a real analysis.
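The completion rule can be read as a simple check. The following is a minimal sketch only: the function name `passes_course` and its parameter names are hypothetical, and it assumes that the 21-point threshold refers to the combined total out of 40 points (20 test + 5 activity + 15 project).

```python
def passes_course(test_points: float, activity_points: float, project_points: float) -> bool:
    """Return True if the stated completion thresholds are met.

    Assumption: the 21-point minimum applies to the sum of all three
    components (maximum 40 points).
    """
    # Maximum points per component, taken from the assessment description.
    limits = {"test": 20, "activity": 5, "project": 15}
    scores = {"test": test_points, "activity": activity_points, "project": project_points}

    # Reject scores outside the stated ranges.
    for name, value in scores.items():
        if not 0 <= value <= limits[name]:
            raise ValueError(f"{name} points must be between 0 and {limits[name]}")

    total = sum(scores.values())
    # Both conditions must hold: overall minimum and project minimum.
    return total >= 21 and project_points >= 10


# A student with a strong test result can still fail on the project minimum.
print(passes_course(test_points=20, activity_points=5, project_points=9))
print(passes_course(test_points=8, activity_points=3, project_points=10))
```

Under these assumptions the first call reports a fail (34 points, but only 9 from the project) and the second a pass (exactly 21 points, with 10 from the project).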
- Language of instruction
- Further comments (probably available only in Czech)
- Study Materials
The course is taught annually.
Information on course enrolment limitations: It is recommended to complete Bi8600, DSMBz01 and Bi3060 beforehand.