Bi8190 Data manipulation and visualisation in R

Faculty of Science
Autumn 2025
Extent and Intensity
0/2/0. 2 credit(s) (plus extra credits for completion). Type of Completion: k (colloquium).
In-person direct teaching
Teacher(s)
Mgr. Irena Axmanová, Ph.D. (seminar tutor)
Mgr. Bc. Klára Klinkovská (seminar tutor)
Guaranteed by
Mgr. Irena Axmanová, Ph.D.
Department of Botany and Zoology – Biology Section – Faculty of Science
Contact Person: Mgr. Irena Axmanová, Ph.D.
Supplier department: Department of Botany and Zoology – Biology Section – Faculty of Science
Prerequisites
Bi7560 Introduction to R or the teacher's consent
A basic knowledge of R is recommended, ideally gained by completing Bi7560 Introduction to R before the course.
Course Enrolment Limitations
The course is also offered to students of fields other than those it is directly associated with.
fields of study / plans the course is directly associated with
There are 7 fields of study the course is directly associated with.
Course objectives
During the course, students will become familiar with advanced methods of data manipulation and visualization in R, particularly using packages from the *tidyverse* collection (*tidyr*, *dplyr*, *tibble*, *purrr*, *stringr*, *ggplot2*, *readr*). They will learn to work with data routinely: importing, cleaning, filtering, appending information from external sources, creating new variables (e.g., based on calculations), grouping samples according to specified criteria, and calculating summary statistics for these groups. The course will also cover basic and advanced visualization techniques using *ggplot2* and the creation of simple maps in R. Emphasis will be placed on the principles of *open data science*, focusing on writing scripts prepared for publication and sharing on GitHub.
Learning outcomes
Upon successful completion of the course, the student will be able to:
- demonstrate proficiency in importing, cleaning, and filtering data in R;
- integrate datasets with external sources and augment them with additional information;
- construct new variables, including those derived from calculations;
- organize data into groups based on specified criteria and calculate summary statistics for each group;
- design and produce basic and advanced data visualizations using ggplot2;
- generate simple maps in R;
- develop and document reproducible scripts in line with open data science principles for publication and dissemination on GitHub.
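The first outcomes above (import, clean, derive, group, summarise) can be sketched in a few lines. This is only an illustrative sketch, not course material: the dataset and its column names (`plot_id`, `habitat`, `species`, `cover`) are hypothetical, and the native pipe `|>` assumes R 4.1 or newer.

```r
library(readr)
library(dplyr)

# Hypothetical vegetation-plot records, inlined so the example is self-contained.
# I() tells read_csv() to treat the string as literal data rather than a file path.
raw <- read_csv(I(
"plot_id,habitat,species,cover
1,forest,Fagus sylvatica,60
1,forest,Oxalis acetosella,5
2,meadow,Poa pratensis,30
2,meadow,Trifolium repens,NA
3,meadow,Poa pratensis,20
3,meadow,Festuca rubra,40
"), show_col_types = FALSE)

plot_summary <- raw |>
  filter(!is.na(cover)) |>               # cleaning: drop records without a cover value
  mutate(cover_prop = cover / 100) |>    # new variable derived from a calculation
  group_by(plot_id, habitat) |>          # grouping by specified criteria
  summarise(
    n_species       = n_distinct(species),  # species richness per plot
    total_cover     = sum(cover),
    mean_cover_prop = mean(cover_prop),
    .groups = "drop"
  ) |>
  arrange(plot_id)

print(plot_summary)
```

Each step is one verb on its own line, so the script reads top to bottom in the same order as the outcomes listed above.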
Syllabus
  • 1 Introduction
  • R as a programming language
  • Tidyverse packages, pipes (%>%, |>)
  • Projects in RStudio, cheatsheets, keyboard shortcuts
  • Principles of tidy scripting (structure, headings, bookmarks, comments)
  • Sources of information and where to seek help, AI tools
  • Importing data using readr, readxl – common pitfalls (encoding)
  • Data structure (names, table, glimpse)
  • Tidy data (principles, preparation, check), renaming variables (rename)
  • 2 Basic Data Manipulation
  • Basic data manipulation functions (select, filter, mutate, arrange, slice)
  • Data export (write_csv)
  • 3 Data Visualization with ggplot2
  • Logic of ggplot
  • Basic geom functions (point, line, boxplot, histogram, bar)
  • Adding trend lines
  • Symbols, colors
  • Legend, axis labels
  • Theme
  • Saving plots (ggsave)
  • 4 Wide vs. Long Data Format
  • Format conversions (pivot_longer, pivot_wider)
  • Creating new variables (mutate, group_by, summarise)
  • Species richness, counts/proportions of different values within a sample (count)
  • 5 Join Functions
  • Join functions (left_join, full_join) to add information from other datasets
  • Filtering joins: semi_join, anti_join
  • Proportions of specific groups by properties, indicator values, CWM
  • Nomenclature adjustments (advanced mutate, summarise), merging duplicates
  • Mutate with multiple conditions (ifelse, case_when)
  • 6 Advanced Data Visualization
  • ggplot advanced – faceting, patchwork, ggpubr, ggeffects
  • Shiny trailer (demonstration)
  • 7 Script Automation
  • Using loops (for loops)
  • Writing custom functions
  • Working with nested dataframes (purrr)
  • 8 Mapping in R
  • Maps using terra
  • Displaying samples in space (overview map, scale, legend) with OpenStreetMap basemaps
  • Cartograms, grid mapping
  • 9 Advanced Mapping in R
  • Extracting data from rasters, digital elevation models
  • Selecting data using masks
  • Scaling mapped points by value (color, symbol)
  • 10 From Database to Plot
  • Importing data from a database, linking datasets, restructuring data, filtering subsets
  • Merging duplicates caused by nomenclature conversion
  • Adding external attributes, calculating weighted means
  • Preparing publication-ready figures
  • Combining the entire process into a single pipeline
  • 11 GitHub
  • How it works, downloading data from public repositories
  • Personal account
  • Creating a repository and linking it to an R project on your computer
  • Collaborating on a project (branch, commit, push, pull)
  • Publishing scripts, making them public (DOI, README guidelines)
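To give a flavour of syllabus topics 4 and 5, the following minimal sketch converts a wide species-by-plot table to long format, joins a trait table, and computes species richness and a community-weighted mean (CWM). All data, column names, and the richness thresholds are hypothetical and chosen only for illustration; the native pipe assumes R 4.1 or newer.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per plot, one column per species (cover in %)
wide <- tibble(
  plot_id            = c(1, 2, 3),
  `Poa pratensis`    = c(30, 0, 20),
  `Festuca rubra`    = c(0, 10, 40),
  `Trifolium repens` = c(0, 15, 40)
)

# Hypothetical external trait table (plant height in cm)
traits <- tibble(
  species   = c("Poa pratensis", "Festuca rubra", "Trifolium repens"),
  height_cm = c(50, 40, 20)
)

# Wide -> long: one row per plot-species record, keeping only species that occur
long <- wide |>
  pivot_longer(cols = -plot_id, names_to = "species", values_to = "cover") |>
  filter(cover > 0)

cwm <- long |>
  left_join(traits, by = "species") |>   # add information from another dataset
  group_by(plot_id) |>
  summarise(
    richness   = n(),                                   # number of species per plot
    cwm_height = weighted.mean(height_cm, w = cover),   # cover-weighted mean height
    .groups = "drop"
  ) |>
  mutate(richness_class = case_when(     # mutate with multiple conditions
    richness >= 3 ~ "high",
    richness == 2 ~ "medium",
    TRUE          ~ "low"
  ))

print(cwm)
```

`left_join()` keeps every record from the species table even when a trait is missing, which makes gaps in the external data visible as `NA` instead of silently dropping rows.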
Literature
    recommended literature
  • WICKHAM, Hadley; Mine ÇETINKAYA-RUNDEL and Garrett GROLEMUND. R for data science : import, tidy, transform, visualize, and model data. 2nd edition. Tokyo: O'Reilly, 2023, xxiii, 548. ISBN 9781492097402.
  • WICKHAM, Hadley and Carson SIEVERT. ggplot2 : elegant graphics for data analysis. Second edition. Switzerland: Springer, 2016, xvi, 260. ISBN 9783319242750.
Teaching methods
lecture + practicals, homework, projects, testing AI tools, presentations
Assessment methods
During the course, students are required to complete homework assignments. To complete the course with a colloquium, they must demonstrate their newly acquired skills in a final project and present the results.
Alternative completion
In the case of an international trip or a long-term illness, it is possible to complete the course in an alternative format following an agreement with the teacher.
Language of instruction
Czech
Follow-Up Courses
Further comments (probably available only in Czech)
The course is taught annually.
The course is taught every week.
Teacher's information
https://botzooldataanalysis.github.io/
This course is designed for practising and becoming proficient in basic data operations in R using RStudio. We therefore recommend bringing your own computer so that any issues can be addressed together and students can be confident that everything will work not only at school but also when they work with other datasets.
The course is also listed under the following terms: Spring 2008 - for the purpose of the accreditation, Spring 2007, Spring 2008, Spring 2010, Spring 2012, Spring 2012 - accreditation, Spring 2014, Autumn 2016, Autumn 2018, Autumn 2024.
  • Permalink: https://is.muni.cz/course/sci/autumn2025/Bi8190