Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes

KLAŠKA, David, Antonín KUČERA, Vojtěch KŮR, Vít MUSIL and Vojtěch ŘEHÁK. Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes. In Wooldridge M., Dy J., Natarajan S. Proceedings of 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024). Washington, DC: AAAI Press, 2024, p. 20143-20150. ISBN 978-1-57735-887-9. Available from: https://dx.doi.org/10.1609/aaai.v38i18.29993.


Basic information
Original name	Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes
Authors	KLAŠKA, David (203 Czech Republic, belonging to the institution), Antonín KUČERA (203 Czech Republic, guarantor, belonging to the institution), Vojtěch KŮR (203 Czech Republic, belonging to the institution), Vít MUSIL (203 Czech Republic, belonging to the institution) and Vojtěch ŘEHÁK (203 Czech Republic, belonging to the institution).
Edition	Washington, DC, Proceedings of 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024), p. 20143-20150, 8 pp. 2024.
Publisher	AAAI Press

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Confidentiality degree	is not subject to a state or trade secret
Publication form	printed version "print"
Organization unit	Faculty of Informatics
ISBN	978-1-57735-887-9
ISSN	2159-5399
Doi	http://dx.doi.org/10.1609/aaai.v38i18.29993
Keywords in English	Markov decision processes; invariant distribution
Tags	International impact, Reviewed
Changed by	prof. RNDr. Antonín Kučera, Ph.D., university ID (učo) 2508. Changed: 25/4/2024 10:06.

Abstract

Long-run average optimization problems for Markov decision processes (MDPs) require constructing policies with optimal steady-state behavior, i.e., optimal limit frequency of visits to the states. However, such policies may suffer from local instability in the sense that the frequency of states visited in a bounded time horizon along a run differs significantly from the limit frequency. In this work, we propose an efficient algorithmic solution to this problem.
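The gap between limit frequency and bounded-horizon frequency can be illustrated with a small sketch. It is not the algorithm proposed in the paper; it only measures, for a hypothetical two-state Markov chain, how far the frequency of state 0 within windows of a fixed length drifts from its invariant (limit) frequency. The function names, the two-state model, and the choice of non-overlapping windows are all assumptions made for illustration.

```python
import random


def stationary_two_state(p, q):
    """Invariant distribution of a 2-state chain with P(0->1)=p and P(1->0)=q."""
    return (q / (p + q), p / (p + q))


def simulate(p, q, steps, rng):
    """Sample a run of the chain of the given length, started in state 0."""
    state, run = 0, []
    for _ in range(steps):
        run.append(state)
        switch_prob = p if state == 0 else q
        if rng.random() < switch_prob:
            state = 1 - state
    return run


def window_frequencies(run, horizon):
    """Frequency of state 0 in each consecutive, non-overlapping window."""
    return [
        run[i:i + horizon].count(0) / horizon
        for i in range(0, len(run) - horizon + 1, horizon)
    ]


def mean_abs_deviation(freqs, target):
    """Average distance between the window frequencies and the target frequency."""
    return sum(abs(f - target) for f in freqs) / len(freqs)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    horizon = 20            # hypothetical bounded time horizon
    for p in (0.05, 0.5):   # "sticky" chain vs. fast-mixing chain
        limit_freq = stationary_two_state(p, p)[0]
        run = simulate(p, p, 20_000, rng)
        deviation = mean_abs_deviation(window_frequencies(run, horizon), limit_freq)
        print(f"p={p}: limit frequency {limit_freq}, mean window deviation {deviation:.3f}")
```

Both chains in the usage example have the same invariant distribution, so any difference in the reported deviation comes only from how the visits are spread along the run, which is the kind of local behavior the abstract refers to.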

Links
0011629866, MU internal code	Name: Models, Algorithms, and Tools for Solving Adversarial Security Problems
0011629866, MU internal code	Investor: Other - foreign


