FI:PA230 Reinforcement Learning - Course Information

PA230 Reinforcement Learning

Faculty of Informatics
Autumn 2026

Extent and intensity

2/0/1. 3 credits (plus credits for completion). Type of completion: exam (zk).
Taught in person

Teachers

doc. RNDr. Petr Novotný, Ph.D. (lecturer)
Mgr. Martin Kurečka (assistant)

Guaranteed by

doc. RNDr. Petr Novotný, Ph.D.
Department of Computer Science – Faculty of Informatics
Supplier department: Department of Computer Science – Faculty of Informatics

Prerequisites

PV021 Neural Networks
Knowledge of basic types of neural networks and of their training. Elementary knowledge of probability and statistics.

Course enrolment limitations

The course is also offered to students of fields other than those it is directly associated with.

Fields of study the course is directly associated with

Machine learning and artificial intelligence (program FI, N-UIZD_A)
Strojové učení a umělá inteligence (program FI, N-UIZD)

Annotation

The main aim of the course is to introduce participants to the field of reinforcement learning and to acquaint them with the major approaches to training agent policies. This knowledge will be reinforced by a hands-on project in which participants train their own agents on selected benchmarks.

Learning outcomes

After completing the course the student:
+ will have a formal understanding of the problems solved in the field of reinforcement learning (RL).
+ will be able to formulate core principles of RL algorithms.
+ will be able to describe the most prominent RL algorithms and reason about their performance characteristics and tradeoffs.
+ will have practical experience with training an RL agent using state-of-the-art deep learning frameworks.
+ will be able to read scientific literature from the RL domain.

Key topics

- Aims of reinforcement learning (RL), neuropsychological connection, brief history.
- Problem formalization: Markov decision processes, policies, payoffs.
- Exact policy synthesis methods: value iteration, policy iteration, their relevance for RL.
- Basic methods: Monte Carlo, SARSA, Q-learning. General principles: temporal difference learning, value bootstrapping.
- Deep reinforcement learning: function approximators and issues pertaining to their use, gradient-based optimization.
- DQN, Rainbow heuristics.
- Policy gradient methods: policy gradient theorem, REINFORCE, Actor-Critic methods, SAC, trust region policy optimization (TRPO), proximal policy optimization (PPO).
- Monte Carlo tree search (MCTS) methods: conceptual foundations (exploration vs. exploitation, multi-armed bandits, upper confidence bound), UCT-based MCTS, MCTS and deep RL (AlphaZero).
- Case study: RL with human feedback in fine-tuning of large language models.
- Model-based RL (Dreamer).

Study resources and literature

https://spinningup.openai.com/en/latest/
LAPAN, Maxim. Deep reinforcement learning hands-on : apply modern RL methods to practical problems of chatbots, robotics, discrete optimization, web automation, and more. Second edition. Birmingham: Packt, 2020, xix, 798. ISBN 9781838826994.
SUTTON, Richard S. and Andrew G. BARTO. Reinforcement learning : an introduction. Second edition. Cambridge, Massachusetts: The MIT Press, 2018, xxii, 526. ISBN 9780262039246.
WIERING, Marco. Reinforcement learning : state of the art. Edited by Martijn van Otterlo. Berlin: Springer-Verlag, 2012, xxxiv, 638. ISBN 9783642446856.

Teaching methods

Lectures, semester project, individual literature study

Assessment methods and completion requirements

Semester project, oral exam

Language of instruction

English

Link and information from the teachers

An exception to the requirement of passing PV021 can be granted in some circumstances (e.g., if you enroll in PV021 in the same semester and there is enough space in PA230).

Further comments

The course is taught annually.
The course is taught every week.

The course is also listed under the following terms: Autumn 2024, Autumn 2025.

Permalink: https://is.muni.cz/predmet/fi/podzim2026/PA230