PV026 Large Language Models

Faculty of Informatics
Spring 2026
Extent and Intensity
2/1/0. 3 credit(s) (plus extra credits for completion). Type of Completion: zk (examination).
Taught in person.
Teacher(s)
doc. RNDr. Aleš Horák, Ph.D. (lecturer)
doc. Mgr. Pavel Rychlý, Ph.D. (seminar tutor)
RNDr. Ondřej Sotolář, Ph.D. (seminar tutor)
Mgr. Radoslav Sabol (assistant)
Guaranteed by
doc. RNDr. Aleš Horák, Ph.D.
Department of Machine Learning and Data Processing – Faculty of Informatics
Supplier department: Department of Machine Learning and Data Processing – Faculty of Informatics
Prerequisites
Prerequisite: standard knowledge of Python programming.
Course Enrolment Limitations
The course is open to students of any study field.
Course Objectives
This course aims to equip students with a comprehensive, hands-on understanding of Large Language Models (LLMs), covering both theoretical foundations and practical applications. Students will explore key components such as transformer architectures, pretraining strategies, data preprocessing techniques, fine-tuning methods, and prompt engineering. The course also introduces advanced topics such as Retrieval-Augmented Generation (RAG), autonomous agents, the Model Context Protocol (MCP), and LLMOps for deployment and maintenance. Drawing on leading academic and industry resources, including materials from Stanford, Coursera, and Hugging Face, the course combines lectures with practical exercises and project-based work to prepare students for real-world development and use of LLMs.
Learning Outcomes
Upon successful completion of the course, students will be able to:
- Explain the architecture and functioning of transformer-based language models, including key components such as multi-head attention, positional encoding, and training objectives.
- Preprocess and curate textual data for pretraining and fine-tuning, applying best practices in tokenization, data cleaning, and dataset construction.
- Fine-tune and deploy LLMs for specific tasks using modern tools and frameworks such as Hugging Face Transformers, and evaluate their performance effectively.
- Design and optimize prompts, chains, and workflows for applications involving prompt engineering, Retrieval-Augmented Generation (RAG), the Model Context Protocol (MCP), and LLM-based agents.
- Apply LLMOps practices to manage the lifecycle of LLM applications, including model versioning, monitoring, scalability, and responsible AI considerations.
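The RAG-related outcome above centers on combining retrieval with generation. As a purely illustrative, dependency-free sketch (not course material): the retrieval step can be reduced to scoring documents against a query by cosine similarity, here over crude bag-of-words counts, whereas real systems use learned dense embeddings.

```python
# Toy sketch of the retrieval step in Retrieval-Augmented Generation (RAG).
# Documents and the query become bag-of-words count vectors; the most
# similar document is retrieved by cosine similarity. (Real RAG systems
# use learned dense embeddings; word counts merely stand in here.)
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Very crude 'embedding': lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document most similar to the query."""
    return max(docs, key=lambda d: cosine(embed(query), embed(d)))

docs = [
    "Transformers use self-attention to mix token representations.",
    "LSTMs process sequences step by step with gated memory cells.",
    "Beam search keeps the k most probable partial decodings.",
]
best = retrieve("how does self-attention work in transformers", docs)
# In a full RAG pipeline, `best` would be prepended to the LLM prompt.
```

In practice the seminars would replace the toy `embed` with a neural embedding model and a vector index; the overall query–score–retrieve–augment flow stays the same.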
Syllabus
  • Background on NLP, tasks, tokenization, embeddings, word2vec, RNN, LSTM, ChatGPT.
  • Model benchmarks and evaluation, limits of LLM usage, LLM attacks, responsible models.
  • Attention, self-attention mechanism, Transformer architecture, detailed examples.
  • Transformer-based models: BERT, GPT, T5.
  • Attention approximation, hardware optimization, quantization, Mixture of Experts.
  • Sampling strategies, beam search, prompting, in-context learning.
  • Pretraining, supervised finetuning, preference tuning, math and code.
  • Multiple modalities, sound, image and video.
  • Chain-of-thought prompting, reasoning models, scaling.
  • Retrieval-augmented generation, function calling, agents and MCP.
  • LLM deployment, hardware requirements, cloud and local services, model distillation.
  • Current and future trends.
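The attention topics above revolve around scaled dot-product self-attention. As a hypothetical minimal sketch (plain Python, no learned projections W_Q, W_K, W_V, which a real Transformer layer would apply first): each token's query is scored against all keys, the scores are softmax-normalized, and the values are mixed accordingly.

```python
# Minimal sketch of scaled dot-product self-attention, dependency-free.
# Q, K, and V are taken equal to the input vectors here; a real Transformer
# layer would first apply learned projections W_Q, W_K, W_V.
import math

def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(x):
    """x: list of n token vectors (each a list of d floats)."""
    d = len(x[0])
    scale = math.sqrt(d)              # the 1/sqrt(d) scaling factor
    out = []
    for q in x:                       # one query per token
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        weights = softmax(scores)     # attention distribution over tokens
        out.append([sum(w * v[j] for w, v in zip(weights, x))
                    for j in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = self_attention(tokens)        # each output mixes all token vectors
```

Because the attention weights for each token sum to one, every output vector is a convex combination of the input vectors; multi-head attention repeats this computation in several projected subspaces.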
Literature
  • AMIDI, Afshine and Shervine AMIDI. Super Study Guide: Transformers & Large Language Models. First edition. [United States]: [Independently published], 2024, iii, 233 pp. ISBN 9798836693312.
  • RASCHKA, Sebastian. Build a Large Language Model (From Scratch). First edition. Shelter Island, NY: Manning Publications, 2025, xx, 343 pp. ISBN 9781633437166.
  • PLAAT, Aske, et al. Agentic Large Language Models, a Survey. arXiv preprint arXiv:2503.23037, 2025, 80 pp.
  • Hugging Face LLM Course, https://huggingface.co/learn/llm-course
Teaching Methods
Lectures, hands-on seminars, collaborative projects, and related ongoing discussions.
Assessment Methods
Written exam and assessment of the projects and their presentations.
Language of Instruction
English
Teacher's Information
The course is taught every spring semester, with weekly lectures and two-hour seminars every two weeks.
Further Comments
Study Materials
The course is taught annually.
The course is taught each week.

  • Enrolment statistics (recent)
  • Permalink: https://is.muni.cz/predmet/fi/jaro2026/PV026