Reachability in Recursive Markov Decision Processes

BRÁZDIL, Tomáš, Václav BROŽEK, Vojtěch FOREJT and Antonín KUČERA. Reachability in Recursive Markov Decision Processes. C. Baier, H. Hermanns (Eds.). In 17th International Conference on Concurrency Theory. Berlin Heidelberg New York: Springer, 2006, p. 358-374. ISBN 3-540-37376-4.

Other formats: BibTeX LaTeX RIS

Basic information
Original name	Reachability in Recursive Markov Decision Processes
Name in Czech	Dosažitelnost pro Markovovy rozhodovací procesy
Authors	BRÁZDIL, Tomáš (203 Czech Republic), Václav BROŽEK (203 Czech Republic), Vojtěch FOREJT (203 Czech Republic) and Antonín KUČERA (203 Czech Republic, guarantor). C. Baier, H. Hermanns (Eds.).
Edition	Berlin Heidelberg New York, 17th International Conference on Concurrency Theory, p. 358-374, 17 pp. 2006.
Publisher	Springer

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Germany
Confidentiality degree	is not subject to a state or trade secret
Impact factor	Impact factor: 0.402 in 2005
RIV identification code	RIV/00216224:14330/06:00017081
Organization unit	Faculty of Informatics
ISBN	3-540-37376-4
ISSN	0302-9743
UT WoS	000240256100024
Keywords in English	Markov decision processes; reachability
Tags	Markov decision processes, reachability
Tags	International impact, Reviewed
Changed by	Changed by: prof. RNDr. Antonín Kučera, Ph.D., učo 2508. Changed: 30/3/2010 20:30.

Abstract

We consider a class of infinite-state Markov decision processes generated by stateless pushdown automata. This class corresponds to 1.5-player games over graphs generated by BPA systems or (equivalently) 1-exit recursive state machines. An extended reachability objective is specified by two sets S and T of safe and terminal stack configurations, where the membership to S and T depends just on the top-of-the-stack symbol. The question is whether there is a suitable strategy such that the probability of hitting a terminal configuration by a path leading only through safe configurations is equal to (or different from) a given x in {0,1}. We show that the qualitative extended reachability problem is decidable in polynomial time, and that the set of all configurations for which there is a winning strategy is effectively regular. More precisely, this set can be represented by a deterministic finite-state automaton with a fixed number of control states. This result is a generalization of a recent theorem by Etessami & Yannakakis which says that the qualitative termination for 1-exit RMDPs (which exactly correspond to our 1.5-player BPA games) is decidable in polynomial time. Interestingly, the properties of winning strategies for the extended reachability objectives are quite different from the ones for termination, and new observations are needed to obtain the result. As an application, we derive the EXPTIME-completeness of the model-checking problem for 1.5-player BPA games and qualitative PCTL formulae.

Abstract (in Czech)

V článku se zkoumají vlastnosti nekonečně-stavových Markovových rozhodovacích procesů, které jsou generované zásobníkovými procesy bez kontrolních stavů. Tato třída procesů odpovídá hrám s 1.5 hráči nad grafy, které jsou generované BPA systémy. Dokážeme, že kvalitativní problém dosažitelnosti je pro tuto třídu procesů rozhodnutelný v polynomiálním čase. Aplikaci tohoto výsledku prokážeme EXPTIME-úplnost problému platnosti dané kvalitativní PCTL formule v daném stavu daného procesu.

Links
MSM0021622419, plan (intention)	Name: Vysoce paralelní a distribuované výpočetní systémy
MSM0021622419, plan (intention)	Investor: Ministry of Education, Youth and Sports of the CR, Highly Parallel and Distributed Computing Systems
1M0545, research and development project	Name: Institut Teoretické Informatiky
1M0545, research and development project	Investor: Ministry of Education, Youth and Sports of the CR, Institute for Theoretical Computer Science

PrintDisplayed: 21/9/2024 03:36

Reachability in Recursive Markov Decision Processes

Other applications