M7DataSP Advanced Data Science Practicum

Faculty of Science
Autumn 2020
Extent and Intensity
0/2/1. 3 credit(s) (příf plus uk k 1 zk 2 plus 1 > 4). Type of Completion: z (credit).
Teacher(s)
Mgr. Petr Šimeček, MSc., Ph.D. (lecturer)
Guaranteed by
doc. PaedDr. RNDr. Stanislav Katina, Ph.D.
Department of Mathematics and Statistics – Departments – Faculty of Science
Supplier department: Department of Mathematics and Statistics – Departments – Faculty of Science
Timetable of Seminar Groups
M7DataSP/01: Mon 12:00–13:50 MP2,01014a, P. Šimeček
Prerequisites
Students are expected to have some experience with a programming language suitable for data analysis, ideally R or Python.
Course Enrolment Limitations
The course is also offered to students of fields other than those it is directly associated with.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 5/30, only registered: 0/30, only registered with preference (fields directly associated with the programme): 0/30
Course objectives
The main goal is for students to gain hands-on experience with data analysis and machine learning methods. A secondary goal is to deepen their programming skills.
Learning outcomes
This course will enable students to
- predict a dependent variable with linear or logistic regression
- examine unknown data using Principal Component Analysis and/or clustering
- split data into training and testing sets and understand the bias-variance trade-off
- use classification and regression trees, forests, bagging and boosting (XGBoost, LightGBM, CatBoost)
- learn the basics of TensorFlow 2.0 and Keras, applying neural networks and fine-tuning to image and NLP data
- apply recommendation algorithms (collaborative filtering)
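
As a rough illustration of the first and third outcomes above, the following minimal sketch splits a synthetic dataset into training and testing sets and fits a one-variable linear regression by ordinary least squares. It is not taken from the course materials: the data, the 25% test fraction, and all function names are assumptions made for this example, and it uses only the Python standard library rather than any particular data science package.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


def fit_simple_linear_regression(rows):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(rows, intercept, slope):
    """Average squared difference between observed and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in rows) / len(rows)


# Synthetic data (an assumption for illustration): y = 2 + 3x plus Gaussian noise.
noise = random.Random(42)
data = [(i / 10, 2.0 + 3.0 * (i / 10) + noise.gauss(0, 0.5)) for i in range(100)]

train, test = train_test_split(data, test_fraction=0.25, seed=1)
intercept, slope = fit_simple_linear_regression(train)  # fit on training data only
train_mse = mean_squared_error(train, intercept, slope)
test_mse = mean_squared_error(test, intercept, slope)   # error on unseen data

print(f"intercept={intercept:.2f} slope={slope:.2f}")
print(f"train MSE={train_mse:.3f} test MSE={test_mse:.3f}")
```

The model is fitted on the training rows only, so the test error estimates how well it predicts data it has not seen. Comparing the two errors is the usual starting point for discussing the bias-variance trade-off.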

As a by-product, students taking this course will also practice
- data cleaning
- visualizations
- data transformation (group by, summary)
- working with Git and GitHub
- reproducible analysis and documents (RMarkdown, Jupyter Notebook)
- social skills, working in groups
Syllabus
  • The details can be found on GitHub: https://github.com/simecek/dspracticum2020
Teaching methods
Each lecture focuses on one dataset and one problem, which we use to demonstrate a new data science skill. Students are expected to submit homework before each lecture.
Assessment methods
50% homework (in groups of 2-4 students), 50% final project (individual). To pass, you must achieve at least 60% of the points.
Language of instruction
Czech
Further Comments
Study Materials
Teacher's information
https://github.com/simecek/dspracticum2020
The course is also listed under the following terms: Autumn 2021, Autumn 2023, Autumn 2024.
  • Permalink: https://is.muni.cz/course/sci/autumn2020/M7DataSP