👷 Introduction to Information Retrieval
doc. RNDr. Petr Sojka, Ph.D.

Dear students,

Welcome to the 2024 run of the FI:PV211 Introduction to Information Retrieval course. As the main teacher will take a month of health recovery in Spring 2024, this year's lectures will be [partly] substituted by the previous year's recordings and invited lectures. Enrollment is thus limited (APPROVAL needed) with preference given to UMI students.

The course is based on the Introduction to Information Retrieval textbook by Manning, Raghavan, and Schütze (hard copies available in MU libraries), which is taught at Stanford, Munich, and other places. In the course you will, among other things, learn how it is possible to fulfill seekers' information needs at a pace of 10,000+ questions per second, at global web scale, within milliseconds. Since 2023, the use of transformers and deep learning approaches has been part of the syllabus.

Students will be motivated to try active/flipped learning approaches wherever possible.

The course moved from its earlier home to IS MU in 2011, and the older materials offer a sneak peek at the topics we will discuss in the course. However, this interactive syllabus is the course's primary source of information.

Course trailer (in Czech)
A trailer for the PV211 Introduction to Information Retrieval course by Tomáš Effenberger
Second project assignment
CQADupStack Collection and the ARQMath Collection
Second project assignment (CQADupStack Collection)
Google Colaboratory code for the second project
Second project leaderboard (CQADupStack Collection)
Google Spreadsheet leaderboard for the second project
Alternative second project assignment (ARQMath Collection)
Google Colaboratory code for the alternative second project
Alternative second project leaderboard (ARQMath Collection)
Google Spreadsheet leaderboard for the alternative second project
Projects' Jupyter Hub
Dedicated computational resources for your projects

This chapter contains: 2 discussion forums, 4 PDFs, 1 folder, 1 study text, 4 web links.
The teacher recommends studying from 2024-02-19 to 2024-02-25.
This chapter contains: 5 PDFs, 1 folder, 1 study text, 5 web links.
The teacher recommends studying from 2024-02-26 to 2024-03-03.
This chapter contains: 3 PDFs, 1 folder, 1 study text, 1 web link.
The teacher recommends studying from 2024-03-04 to 2024-03-10.
This chapter contains: 1 homework vault, 7 PDFs, 1 folder, 1 study text, 6 web links.
The teacher recommends studying from 2024-03-11 to 2024-03-17.

2024-03-19: Submissions due for the first project

This chapter contains: 2 peer reviews, 5 PDFs, 1 folder, 1 study text, 1 web link.
The teacher recommends studying from 2024-03-18 to 2024-03-24.

2024-03-26: Peer reviews due for the first project

This week summarizes the first part of the course, which covered building an inverted index and querying at local and global scales, and introduces the basics of the new generation of indexing based on embeddings.
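
As a reminder of what the first part of the course is about, here is a minimal sketch of an inverted index with conjunctive (AND) querying. It is only an illustration under simplifying assumptions: whitespace tokenization, lowercasing, and a four-document toy collection. The function names (build_inverted_index, intersect, and_query) are hypothetical and do not come from the course materials.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to a sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Assumption: naive whitespace tokenization, no stemming or stop words.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping document IDs present in both."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def and_query(index, *terms):
    """Return IDs of documents that contain every query term."""
    if not terms:
        return []
    # Process the shortest postings lists first to keep intermediate results small.
    lists = sorted((index.get(t.lower(), []) for t in terms), key=len)
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
    return result


# Toy collection used only for illustration.
docs = [
    "new home sales top forecasts",
    "home sales rise in july",
    "increase in home sales in july",
    "july new home sales rise",
]
index = build_inverted_index(docs)
print(and_query(index, "home", "july"))
```

Keeping each postings list sorted is what allows the linear-time merge in intersect; an embedding-based index replaces these exact term matches with nearest-neighbor search over vectors.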

This chapter contains: 1 image, 6 PDFs, 1 study text, 13 web links.
The teacher recommends studying from 2024-03-20 to 2024-03-31.
This chapter contains: 11 PDFs, 1 folder, 1 study text, 7 web links.
The teacher recommends studying from 2024-03-28 to 2024-04-07.

Question Answering, Extractive Question Answering, Abstractive Question Answering, Maximum Marginal Likelihood, LLMs vs QA

This chapter contains: 1 PDF, 1 study materials item, 1 study text, 3 web links.
The teacher recommends studying from 2024-04-08 to 2024-04-14.
This chapter contains: 2 study materials items, 1 study text.
The teacher recommends studying from 2024-04-15 to 2024-04-21.
This chapter contains: 13 PDFs, 1 study text, 3 web links.
The teacher recommends studying from 2024-04-22 to 2024-04-28.
This chapter contains: 2 homework vaults, 6 PDFs, 1 folder, 1 study text, 4 web links.
The teacher recommends studying from 2024-04-25 to 2024-05-05.

2024-05-12: Submissions due for the second project

This chapter contains: 1 image, 2 homework vaults, 4 PDFs, 1 folder, 1 video, 1 study text, 1 web link.
The teacher recommends studying from 2024-05-06 to 2024-05-12.

2024-05-19: Peer reviews due for the second project

This chapter contains: 5 PDFs, 1 video, 1 study text, 2 web links.
The teacher recommends studying from 2024-05-13 to 2024-05-19.

    Here are materials from the previous runs of the course: spring 2019, spring 2020, spring 2021, spring 2022, and spring 2023.

    I will be glad if the course topics engage you and you decide to gain insight into them by solving [mini]projects. Activities in this direction will be rewarded with premium points toward successful grading. The number of stars below is an estimate of project difficulty, from a mini project [(*), 10 points] to a big project [(*****), 30+ points]. I am also open to assigning or extending a project as a Bachelor/Master/Dissertation thesis.

    • (*)+ Pointing out any (factual, typographical) errors in the course materials.
    • (**)+ Preparation of Deepnote instructions, documentation, and support for the solution of course projects.
    • (**)+ Preparation of hot topic slides, production or preparation of a motivating Khan Academy-style video, or other course materials in LaTeX.
    • (**)+ Presentation or teaching video on topics relevant to the course. Possible topics: Sketch Engine, search with linguistic attributes, random walks in texts, topic search and corpora, time-constrained search, topic modeling with gensim, LDA, Wolfram Alpha, specifics of search of structured data (chemical and mathematical formulae, linguistic trees - syntactic or dependency), etc.
    • (***) Participation in an IR competition at Kaggle.com.
    • (***)+ Participation in IR research in our Math Information Retrieval group on its research agendas, the ARQMath task, the EuDML project, or the DML project.
    • (***)+ Evaluation of Math Information Retrieval in the MIaS system, possible as a Dean project or a Bachelor/Master/Dissertation thesis.

    To a pupil who was in danger, Master said, “Those who do not make mistakes are the most mistaken of all – they do not try anything new.” Anthony de Mello
