Bi7528 Analysis of genomic and proteomic data

Faculty of Science
spring 2018
Extent and Intensity
2/0/0. 2 credit(s) (fasci plus compl plus > 4). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).
Teacher(s)
Mgr. Eva Budinská, Ph.D. (lecturer)
RNDr. Ivana Ihnatová, Ph.D. (lecturer)
Mgr. Barbora Zwinsová (lecturer)
Guaranteed by
prof. RNDr. Ladislav Dušek, Ph.D.
RECETOX – Faculty of Science
Contact Person: Mgr. Eva Budinská, Ph.D.
Supplier department: RECETOX – Faculty of Science
Timetable
Wed 12:00–13:50 F01B1/709
Prerequisites
Bi5040 Biostatistics - basic course || Bi5045 Biostatistics for Comp. Biol.
Bi5040 Biostatistics - basic course or Bi5045 Biostatistics for Computational Biology. A solid foundation in biostatistics, molecular biology and genetics is necessary. Having attended the following courses is an advantage: Bi7527 Data analysis in R, Bi8600 Multivariate Methods, DSMBz01 Molecular Biology and Genetics, Bi3060 Basic genetics, B7250 Human genetics.
Course Enrolment Limitations
The course is offered to students of any study field.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 0/30, only registered: 0/30, only registered with preference (fields directly associated with the programme): 0/30
Course objectives
After completing the course, the student:
Knows the basic types of biological and medical questions addressed in genomic and proteomic experiments.
Knows selected technologies that are sources of high-density genomic and proteomic data (types of DNA microarrays, arrayCGH, and mass spectrometry).
Knows the basic data types produced by genomic and proteomic technologies and their drawbacks from a biostatistician's point of view.
Can list basic steps of genomic and proteomic data analysis.
Is aware of the technological details of microarrays and mass spectrometry that can influence data structure, quality and subsequent analyses.
Understands basic methods of quantification leading to raw data matrix and the necessity of further data quality control and normalization.
Knows specific, technology-dependent sources of noise in the data.
Can identify this noise in the data using graphical and statistical tools.
Applies statistical methods to remove the noise from the data.
Is able to perform necessary and specific data transformations (normalization).
Can standardize measurements between experiments.
Creates a final data matrix of samples and proteins/genes from multiple raw datasets for downstream analyses.
Can identify and remove batch effects in the data.
Describes general principles of analysis of genomic and proteomic data.
Selects the correct method for hypothesis testing based on the hypothesis and data type.
Understands and applies SAM and limma.
Applies hypothesis testing for detection of differentially expressed genes and proteins.
Knows basic statistical methods for class prediction and applies them to genomic and proteomic data.
Knows basic non-parametric data-mining techniques and applies them to genomic and proteomic data.
Knows the advantages and drawbacks of different prediction methods.
Applies MAQC II standards for creating classifiers from microarray data.
Selects and applies multivariate regression strategies which combine gene expression and clinical data.
Applies the Cox proportional hazards model to assess the prognostic role of genes/proteins and compares Kaplan-Meier estimates between groups defined by gene/protein expression.
Knows principles and methods for gene sets analysis.
Knows the basic principles and methods of gene network analysis.
Applies gene set analysis on a model example.
Knows public databases of genomic and proteomic data.
Performs genomic and proteomic analyses in R and Bioconductor.
Knows selected specific R and Bioconductor data structures and packages and applies them for data analysis.
Syllabus
  • 1. Challenges of genomic and proteomic technologies
  • 2. DNA microarrays: principles, types and design of probes, image analysis and data quantification
  • 3. Quality control and normalization of cDNA microarray data
  • 4. Quality control and normalization of oligonucleotide microarray data
  • 5. Mass spectrometry: principles, data quantification, data quality control and normalization
  • 6. Basic principles of downstream analysis of genomic and proteomic data
  • 7. Class comparison
  • 8. Class prediction
  • 9. Class discovery
  • 10. Survival analysis and other regression techniques
  • 11. Gene set and gene network analysis
  • 12. Project presentations
Literature
    recommended literature
  • Meta-analysis and combining information in genetics and genomics. Edited by Rudy Guerra - Darlene Renee Goldstein. Boca Raton: CRC Press, 2010, xxiii, 335. ISBN 9781584885221.
  • GENTLEMAN, Robert. R programming for bioinformatics. Boca Raton: CRC Press, 2009, xii, 314. ISBN 9781420063677.
  • Bioinformatics and computational biology solutions using R and Bioconductor. Edited by Robert Gentleman. New York: Springer, 2005, xix, 473. ISBN 0387251464.
  • Data analysis and visualization in genomics and proteomics. Edited by Francisco Azuaje - Joaquín Dopazo. Hoboken, NJ: John Wiley, 2005, xv, 267. ISBN 0470094397.
  • DRĄGHICI, Sorin. Data analysis tools for DNA microarrays. Boca Raton: Chapman & Hall/CRC, 2003, 477 pp. +. ISBN 1-58488-315-4.
Teaching methods
The lectures will be combined with practicals in R and its extension Bioconductor. First, the theory, concepts and methods are explained; then students apply these concepts to the analysis of real data. At the beginning of the semester, each student will select a project to analyze during the semester and use it as model data for the methods presented. A project can be an evaluation of either the student's own data or data from public repositories. After selection, the project has to be approved by the teacher. At the last lecture, the students will present their projects.
Assessment methods
The final written test will consist of approximately 10 questions, worth 16 points in total. This test will count for 40% of the final evaluation. Up to 4 points will be awarded in proportion to activity during the course and up to 20 points for the quality of the project. To complete the course by examination, it is necessary to achieve at least 21 points, of which at least 10 points must come from the project and at least 8 from the test. To complete the project with a pass, 14 points overall are necessary. Students can use all the study materials during the test, as the questions are designed to test mainly the knowledge of important general principles and the ability to quickly apply the knowledge acquired during the course when performing a real analysis.
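The point arithmetic above can be sketched in a few lines of Python. This is only an illustrative sketch, not an official grading tool: the names `CourseScore` and `passes_by_exam` are hypothetical, and only the rule for completion by examination is modelled (the separate 14-point rule is left out).

```python
from dataclasses import dataclass

# Maximum points per component, as stated in the assessment description.
TEST_MAX = 16
ACTIVITY_MAX = 4
PROJECT_MAX = 20
TOTAL_MAX = TEST_MAX + ACTIVITY_MAX + PROJECT_MAX  # 40 points in total

# Thresholds for completion by examination.
EXAM_MIN_TOTAL = 21
EXAM_MIN_PROJECT = 10
EXAM_MIN_TEST = 8


@dataclass(frozen=True)
class CourseScore:
    """Hypothetical container for one student's points."""
    test: float
    activity: float
    project: float

    def __post_init__(self) -> None:
        # Reject point values outside the range each component allows.
        limits = {"test": TEST_MAX, "activity": ACTIVITY_MAX, "project": PROJECT_MAX}
        for name, maximum in limits.items():
            value = getattr(self, name)
            if not 0 <= value <= maximum:
                raise ValueError(f"{name} must be between 0 and {maximum}, got {value}")

    @property
    def total(self) -> float:
        return self.test + self.activity + self.project


def passes_by_exam(score: CourseScore) -> bool:
    """Check all three conditions for completion by examination."""
    return (
        score.total >= EXAM_MIN_TOTAL
        and score.project >= EXAM_MIN_PROJECT
        and score.test >= EXAM_MIN_TEST
    )


if __name__ == "__main__":
    # The written test is 16 of 40 points, i.e. 40% of the evaluation.
    print(TEST_MAX / TOTAL_MAX)
    print(passes_by_exam(CourseScore(test=8, activity=3, project=10)))
```

Note that a high total alone is not sufficient: a student with 20 project points and 4 activity points but only 7 test points reaches 31 points and still does not meet the conditions for completion by examination.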
Language of instruction
Czech
Further comments (probably available only in Czech)
The course is taught annually.
Information on course enrolment limitations: Completing Bi8600, DSMBz01 and Bi3060 is recommended.
Teacher's information
http://portal.matematickabiologie.cz/index.php?pg=analyza-genomickych-a-proteomickych-dat--analyza-genomickych-a-proteomickych-dat
The course is also listed under the following terms: Autumn 2010 - only for the accreditation, Autumn 2009, Autumn 2010, Autumn 2011, Autumn 2011 - accreditation, Autumn 2012, Autumn 2013, Autumn 2014, Spring 2016, Spring 2017, Spring 2019, Spring 2020, Spring 2021, Spring 2022.
  • Permalink: https://is.muni.cz/course/sci/spring2018/Bi7528