Anytime Guarantees for Reachability in Uncountable Markov Decision Processes

D 2022

GROVER, Kush; Jan KŘETÍNSKÝ; Tobias MEGGENDORFER and Maximilian WEININGER

Basic information

Original title

Anytime Guarantees for Reachability in Uncountable Markov Decision Processes

Authors

GROVER, Kush; Jan KŘETÍNSKÝ; Tobias MEGGENDORFER and Maximilian WEININGER

Edition

33rd International Conference on Concurrency Theory, CONCUR 2022, September 12-16, 2022, Warsaw, Poland, pp. 1-20, 20 pp., 2022

Publisher

Dagstuhl

Other information

Result type

Paper in proceedings

Flagged for transfer to RIV

Organizational unit

Faculty of Informatics

ISBN

9783959772464

ISSN

DOI

https://doi.org/10.4230/LIPICS.CONCUR.2022.11

Changed: 17 March 2025, 14:43, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original language

We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the “boundary” of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions than typical ones from previous works concerned with guarantees. The second one then utilizes stronger assumptions to additionally provide converging upper bounds. Altogether, we obtain an anytime algorithm, i.e. yielding a sequence of approximants with known and iteratively improving precision, converging to the true value in the limit. Besides, due to the generality of our assumptions, our algorithms are very general templates, readily allowing for various heuristics from literature in contrast to, e.g., a specific discretization algorithm. Our theoretical contribution thus paves the way for future practical improvements without sacrificing correctness guarantees.

Cite

GROVER, Kush; Jan KŘETÍNSKÝ; Tobias MEGGENDORFER and Maximilian WEININGER. Anytime Guarantees for Reachability in Uncountable Markov Decision Processes. In 33rd International Conference on Concurrency Theory, CONCUR 2022, September 12-16, 2022, Warsaw, Poland. Dagstuhl, 2022, pp. 1-20. ISBN 9783959772464. Available from: https://doi.org/10.4230/LIPICS.CONCUR.2022.11.

@inproceedings{2484776,
   author = {Grover, Kush and Křetínský, Jan and Meggendorfer, Tobias and Weininger, Maximilian},
   booktitle = {33rd International Conference on Concurrency Theory, CONCUR 2022, September 12-16, 2022, Warsaw, Poland.},
   doi = {10.4230/LIPICS.CONCUR.2022.11},
   isbn = {9783959772464},
   pages = {1--20},
   publisher = {Dagstuhl},
   title = {Anytime Guarantees for Reachability in Uncountable Markov Decision Processes},
   year = {2022}
}

TY  - CONF
ID  - 2484776
AU  - Grover, Kush
AU  - Křetínský, Jan
AU  - Meggendorfer, Tobias
AU  - Weininger, Maximilian
PY  - 2022
TI  - Anytime Guarantees for Reachability in Uncountable Markov Decision Processes
PB  - Dagstuhl
SN  - 9783959772464
N2  - We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the “boundary” of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions than typical ones from previous works concerned with guarantees. The second one then utilizes stronger assumptions to additionally provide converging upper bounds. Altogether, we obtain an anytime algorithm, i.e. yielding a sequence of approximants with known and iteratively improving precision, converging to the true value in the limit. Besides, due to the generality of our assumptions, our algorithms are very general templates, readily allowing for various heuristics from literature in contrast to, e.g., a specific discretization algorithm. Our theoretical contribution thus paves the way for future practical improvements without sacrificing correctness guarantees.
ER  -

GROVER, Kush; Jan KŘETÍNSKÝ; Tobias MEGGENDORFER and Maximilian WEININGER. Anytime Guarantees for Reachability in Uncountable Markov Decision Processes. In \textit{33rd International Conference on Concurrency Theory, CONCUR 2022, September 12-16, 2022, Warsaw, Poland}. Dagstuhl, 2022, pp.~1--20. ISBN~9783959772464. Available from: https://doi.org/10.4230/LIPICS.CONCUR.2022.11.
