2025
1-2-3-Go! Policy Synthesis for Parameterized Markov Decision Processes via Decision-Tree Learning and Generalization
AZEEM, Muqsit; Debraj CHAKRABORTY; Sudeep KANAV; Jan KŘETÍNSKÝ; Mohammadsadegh MOHAGHEGHI et al.Základní údaje
Originální název
1-2-3-Go! Policy Synthesis for Parameterized Markov Decision Processes via Decision-Tree Learning and Generalization
Autoři
AZEEM, Muqsit; Debraj CHAKRABORTY; Sudeep KANAV; Jan KŘETÍNSKÝ; Mohammadsadegh MOHAGHEGHI; Stefanie MOHR a Maximilian WEININGER
Vydání
Denver, USA, VMCAI 2025, 26th International Conference on Verification, Model Checking, and Abstract Interpretation, od s. 97-120, 24 s. 2025
Nakladatel
Springer
Další údaje
Jazyk
angličtina
Typ výsledku
Stať ve sborníku
Obor
10201 Computer sciences, information science, bioinformatics
Stát vydavatele
Švýcarsko
Utajení
není předmětem státního či obchodního tajemství
Forma vydání
tištěná verze "print"
Impakt faktor
Impact factor: 0.402 v roce 2005
Označené pro přenos do RIV
Ano
Organizační jednotka
Fakulta informatiky
ISBN
978-3-031-82702-0
ISSN
EID Scopus
Klíčová slova anglicky
model checking; probabilistic verification; Markov decision process; policy synthesis
Příznaky
Mezinárodní význam, Recenzováno
Změněno: 26. 3. 2025 17:04, RNDr. Pavel Šmerk, Ph.D.
Anotace
V originále
Despite the advances in probabilistic model checking, the scalability of the verification methods remains limited. In particular, the state space often becomes extremely large when instantiating parameterized Markov decision processes (MDPs) even with moderate values. Synthesizing policies for such huge MDPs is beyond the reach of available tools. We propose a learning-based approach to obtain a reasonable policy for such huge MDPs. The idea is to generalize optimal policies obtained by model-checking small instances to larger ones using decision-tree learning. Consequently, our method bypasses the need for explicit state-space exploration of large models, providing a practical solution to the state-space explosion problem. We demonstrate the efficacy of our approach by performing extensive experimentation on the relevant models from the quantitative verification benchmark set. The experimental results indicate that our policies perform well, even when the size of the model is orders of magnitude beyond the reach of state-of-the-art analysis tools.
Návaznosti
| MUNI/I/1757/2021, interní kód MU |
|