KLAŠKA, David, Antonín KUČERA, Vojtěch KŮR, Vít MUSIL and Vojtěch ŘEHÁK. Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes. In Wooldridge M., Dy J., Natarajan S. Proceedings of 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024). Washington, DC: AAAI Press, 2024, p. 20143-20150. ISBN 978-1-57735-887-9. Available from: https://dx.doi.org/10.1609/aaai.v38i18.29993.
Basic information
Original name Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes
Authors KLAŠKA, David (203 Czech Republic, belonging to the institution), Antonín KUČERA (203 Czech Republic, guarantor, belonging to the institution), Vojtěch KŮR (203 Czech Republic, belonging to the institution), Vít MUSIL (203 Czech Republic, belonging to the institution) and Vojtěch ŘEHÁK (203 Czech Republic, belonging to the institution).
Edition Washington, DC, Proceedings of 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024), p. 20143-20150, 8 pp. 2024.
Publisher AAAI Press
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
Organization unit Faculty of Informatics
ISBN 978-1-57735-887-9
ISSN 2159-5399
Doi http://dx.doi.org/10.1609/aaai.v38i18.29993
Keywords in English Markov decision processes; invariant distribution
Tags International impact, Reviewed
Changed by prof. RNDr. Antonín Kučera, Ph.D., učo 2508. Changed: 25/4/2024 10:06.
Abstract
Long-run average optimization problems for Markov decision processes (MDPs) require constructing policies with optimal steady-state behavior, i.e., optimal limit frequency of visits to the states. However, such policies may suffer from local instability in the sense that the frequency of states visited in a bounded time horizon along a run differs significantly from the limit frequency. In this work, we propose an efficient algorithmic solution to this problem.
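The gap between limit frequency and bounded-horizon frequency described in the abstract can be illustrated with a small simulation. The sketch below is not the algorithm from the paper; it is a toy example under stated assumptions: a fixed policy is modeled as a two-state Markov chain, and all function names (`stationary_two_state`, `simulate`, `worst_window_gap`), the self-loop probabilities, and the window length of 10 are hypothetical choices made for illustration. Two chains with the same invariant distribution (0.5, 0.5) are compared by how far the frequency of state 0 in any window of 10 steps drifts from 0.5.

```python
import random


def stationary_two_state(p_stay_a, p_stay_b):
    """Invariant distribution of a 2-state chain given its self-loop probabilities.

    Solves pi_a * leave_a = pi_b * leave_b with pi_a + pi_b = 1.
    Assumes at least one state can be left (leave_a + leave_b > 0).
    """
    leave_a = 1.0 - p_stay_a
    leave_b = 1.0 - p_stay_b
    total = leave_a + leave_b
    return (leave_b / total, leave_a / total)


def simulate(p_stay_a, p_stay_b, steps, rng):
    """Sample a run of the chain, starting in state 0."""
    state = 0
    run = []
    for _ in range(steps):
        run.append(state)
        p_stay = p_stay_a if state == 0 else p_stay_b
        if rng.random() >= p_stay:
            state = 1 - state
    return run


def worst_window_gap(run, window, target):
    """Largest |frequency of state 0 in a window - target| over all windows."""
    count = sum(1 for s in run[:window] if s == 0)
    worst = abs(count / window - target)
    for i in range(window, len(run)):
        # Slide the window by one step and update the count incrementally.
        count += (run[i] == 0) - (run[i - window] == 0)
        worst = max(worst, abs(count / window - target))
    return worst


rng = random.Random(0)

# Hypothetical "sticky" chain: stays in the current state with probability 0.95.
pi_sticky = stationary_two_state(0.95, 0.95)
# Hypothetical alternating chain: always switches state.
pi_alt = stationary_two_state(0.0, 0.0)

sticky_run = simulate(0.95, 0.95, 10_000, rng)
alt_run = simulate(0.0, 0.0, 10_000, rng)

gap_sticky = worst_window_gap(sticky_run, 10, pi_sticky[0])
gap_alt = worst_window_gap(alt_run, 10, pi_alt[0])

print("invariant distributions:", pi_sticky, pi_alt)
print("worst window gap (sticky):", gap_sticky)
print("worst window gap (alternating):", gap_alt)
```

Both chains have the same limit frequency of visits, but the alternating chain matches it in every window of even length, while the sticky chain has long stretches spent in a single state. This is the kind of local instability the abstract refers to.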
Links
0011629866, internal MU code
Name: Models, Algorithms, and Tools for Solving Adversarial Security Problems
Investor: Other - foreign