E7527 Data Analysis in R

Faculty of Science
Autumn 2025
Extent and Intensity
2/1/0. 2 credit(s) (plus extra credits for completion). Recommended Type of Completion: k (colloquium). Other types of completion: zk (examination).
In-person direct teaching
Teacher(s)
Mgr. Soňa Smetanová, Ph.D. (lecturer)
Mgr. Jan Böhm (lecturer)
Mgr. Eva Budinská, Ph.D. (lecturer)
Guaranteed by
Mgr. Eva Budinská, Ph.D.
RECETOX – Faculty of Science
Contact Person: Mgr. Soňa Smetanová, Ph.D.
Supplier department: RECETOX – Faculty of Science
Prerequisites
E5540 Biostatistics - basic course || E5046 Biostatistics for Comp. Biol.
E5540 Biostatistics - Basic Course or E5046 Biostatistics for Mathematical Biology and Biomedicine. The course requires basic knowledge of R, knowledge of basic statistical methods at least to the extent of Bi5040 Biostatistics - Basic Course, and ideally knowledge of multivariate statistical methods to the extent of E8600 Multivariate Methods.
Course Enrolment Limitations
The course is also offered to students of fields other than those the course is directly associated with.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 0/30, only registered: 27/30, only registered with preference (fields directly associated with the programme): 17/30
Course objectives
The aim of the course is to teach researchers advanced use of R, a statistical software environment, for data analysis. We explain the syntax of the R language in detail and introduce a number of functions for data pre-processing, statistical data analysis, and graph plotting. This is a basic course that assumes no previous experience of working in R.
Learning outcomes
After completing this course, the student:
Understands the syntax of the R language
Knows data structures in R
Knows the difference between a script and a function
Can create functions
Creates scripts for R batch commands and uses them
Knows the syntax of basic loops and conditions (for, repeat, if...)
Can install packages of R functions
Automatically creates objects with names defined by a variable
Writes automated scripts
Reduces the computational burden of algorithms by using less time-consuming functions (e.g. apply instead of a for loop)
Knows the options of connecting R with other programming languages (C, Python, Perl)
Loads and saves data files
Transforms matrices and other data tables
Can merge tables of different types
Effectively recodes variables
Performs hypothesis testing
Applies different functions for data clustering
Knows the available options for saving graphs
Knows and works with the basic graphical interface in R
Creates graphs in lattice and grid
Can create and save graphs in an automated script
Creates complex colour graphs
Knows how to set graph resolution and creates publication-quality graphs
Saves graphs in different formats
Can create an analysis plan and find and select the most suitable functions
Can create an easy-to-follow script and additional functions for complex data analysis of example data
Can optimize this script with respect to computational burden
Syllabus
  • Lecture 1 - Introduction to R (history of R; what R is; advantages and disadvantages of R; downloading and installing R; basic work with R: setting the working directory, basic commands, operators, libraries; help; what an object is and its basic characteristics)
  • Lectures 2-5 - Objects in R (vectors and basic work with vectors; matrices and basic work with matrices; data frames; lists; other objects)
  • Lectures 6-7 - Programming in R (for loop, if condition, while, repeat, commands from the apply family; functions; how to write a script effectively)
  • Lectures 8-9 - Loading and saving files, basic data editing
  • Lectures 9-10 - Graphs in R (traditional graphics; Lattice (Trellis); Grid; ggplot2; saving graphs)
  • Lecture 11 - Multivariate analysis, analysis of a real example
  • Lecture 12 - Introduction to popular packages (tidyr, plyr, dplyr, ComplexHeatmap)
  • Lecture 13 - Evaluation of projects
Literature
    recommended literature
  • TORGO, Luís. Data mining with R : learning with case studies. Boca Raton: Chapman and Hall/CRC, 2011, xv, 289. ISBN 9781439810187.
  • MATLOFF, Norman S. The art of R programming : a tour of statistical software design. Eleventh printing. San Francisco: No Starch Press, 2011, xxiii, 373. ISBN 1593273843.
  • GENTLEMAN, Robert. R programming for bioinformatics. Boca Raton: CRC Press, 2009, xii, 314. ISBN 9781420063677.
  • MURRELL, Paul. R graphics. Boca Raton: Chapman & Hall/CRC, 2006, xix, 301. ISBN 158488486X.
  • Bioinformatics and computational biology solutions using R and bioconductor. Edited by Robert Gentleman. New York: Springer, 2005, xix, 473. ISBN 0387251464.
Teaching methods
Teaching is conducted in the form of a lecture followed by a one-hour practical session. The basics and theoretical concepts are explained to students through a presentation, and these are then immediately applied after each complete section using the R user interface on computers in a computer lab. The number of students is set so that each has access to their own computer. Students are encouraged to be proactive and to propose their own algorithmic solutions to the given problems.
Assessment methods
Colloquium:
During the semester, students participate in mandatory practical sessions where the theory covered in lectures is practiced. Additionally, students complete a project during the semester, which is evaluated with a maximum of 10 points. The assessment will focus on the functionality and clarity of the script in relation to the defined objectives of the project. To pass the course, students must obtain at least 6 out of 10 points.
Exam:
The final practical test in R consists of a set of problems; students submit their solutions together with the code. The maximum score for the test is 20 points. Students may use the study materials and the R help, but not the internet. The final evaluation is based on the total number of points (project: max. 10 points; final test: max. 20 points). A total of at least 17.5 points, including at least 5 points for the project, is required for successful completion.
Evaluation: <17.5 F, ≤20 E, ≤22.5 D, ≤25 C, ≤27.5 B, ≤30 A
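As an illustration of how these thresholds combine, the following minimal R sketch maps project and test points to a grade. The function name exam_grade is hypothetical and not part of the course materials. The sketch assumes that fewer than 5 project points results in F regardless of the total, and that each upper bound is inclusive, so exactly 17.5 points earns E.

```r
# Hypothetical helper (illustration only): maps project and final-test points
# to the exam grade, following the thresholds listed above.
exam_grade <- function(project, test) {
  # Points must lie within the stated maxima (project: 10, test: 20).
  stopifnot(project >= 0, project <= 10, test >= 0, test <= 20)
  total <- project + test

  # Assumption: a total below 17.5 or fewer than 5 project points means F.
  if (total < 17.5 || project < 5) {
    return("F")
  }

  # right = TRUE makes each upper bound inclusive, i.e. intervals (a, b];
  # include.lowest = TRUE closes the first interval so that 17.5 maps to E.
  as.character(cut(total,
                   breaks = c(17.5, 20, 22.5, 25, 27.5, 30),
                   labels = c("E", "D", "C", "B", "A"),
                   right = TRUE, include.lowest = TRUE))
}
```

Under these assumptions, exam_grade(8, 12.5) gives a total of 20.5 points and returns "D", while exam_grade(4, 20) returns "F" because the project score is below 5 points.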
Language of instruction
Czech
Further comments (probably available only in Czech)
The course is taught annually.
The course is taught every week.
Information on course enrolment limitations: Completion of Bi8600, DSMBz01, and Bi3060 is recommended.
The course is also listed under the following terms: Autumn 2022, Autumn 2023, Autumn 2024.
  • Permalink: https://is.muni.cz/course/sci/autumn2025/E7527