Approximating the termination value of one-counter MDPs and
stochastic games

BRÁZDIL, Tomáš, Václav BROŽEK, Kousha ETESSAMI a Antonín KUČERA. Approximating the termination value of one-counter MDPs and stochastic games. Information and Computation. Netherlands: Elsevier Science, 2013, roč. 222, January, s. 121-138. ISSN 0890-5401. Dostupné z: https://dx.doi.org/10.1016/j.ic.2012.01.008.

Další formáty: BibTeX LaTeX RIS

TY  - JOUR
ID  - 1076357
AU  - Brázdil, Tomáš - Brožek, Václav - Etessami, Kousha - Kučera, Antonín
PY  - 2013
TI  - Approximating the termination value of one-counter MDPs and stochastic games
JF  - Information and Computation
VL  - 222
IS  - January
SP  - 121-138
EP  - 121-138
PB  - Elsevier Science
SN  - 08905401
KW  - Markov decision processes
KW  - one-counter automata
N2  - One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are 1-player, and 2-player turn-based zero-sum, stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where the players aim to maximize (minimize, respectively) the probability of hitting counter value 0, starting at a given control state and given counter value. Recently, we studied qualitative decision problems ("is the optimal termination value equal to 1?") for OC-MDPs (and OC-SSGs) and showed them to be decidable in polynomial time (in NP intersection coNP, respectively). However, quantitative decision and approximation problems ("is the optimal termination value at least p", or "approximate the termination value within epsilon") are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure. It thus remained open even whether any of these quantitative termination problems are computable. In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given an OC-SSG, and given epsilon>0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error epsilon, and furthermore we can compute epsilon-optimal strategies for both players in the game. A key ingredient in our proofs is a subtle martingale, derived from solving certain linear programs that we can associate with a maximizing OC-MDP. An application of Azuma's inequality on these martingales yields a computable bound for the "wealth" at which a "rich person's strategy" becomes epsilon-optimal for OC-MDPs.
ER  -

Základní údaje
Originální název	Approximating the termination value of one-counter MDPs and stochastic games
Název česky	Aproximace hodnoty terminování pro jednočítačové rozhodovací procesy a stochastické hry
Autoři	BRÁZDIL, Tomáš (203 Česká republika, garant, domácí), Václav BROŽEK (203 Česká republika, domácí), Kousha ETESSAMI (826 Velká Británie a Severní Irsko) a Antonín KUČERA (203 Česká republika, domácí).
Vydání	Information and Computation, Netherlands, Elsevier Science, 2013, 0890-5401.

Další údaje
Originální jazyk	angličtina
Typ výsledku	Článek v odborném periodiku
Obor	10201 Computer sciences, information science, bioinformatics
Stát vydavatele	Nizozemské království
Utajení	není předmětem státního či obchodního tajemství
Impakt faktor	Impact factor: 0.604
Kód RIV	RIV/00216224:14330/13:00065955
Organizační jednotka	Fakulta informatiky
Doi	http://dx.doi.org/10.1016/j.ic.2012.01.008
UT WoS	000313861100010
Klíčová slova anglicky	Markov decision processes; one-counter automata
Štítky	formela-journal
Příznaky	Mezinárodní význam, Recenzováno
Změnil	Změnil: RNDr. Pavel Šmerk, Ph.D., učo 3880. Změněno: 16. 4. 2014 16:21.

Anotace

One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are 1-player, and 2-player turn-based zero-sum, stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where the players aim to maximize (minimize, respectively) the probability of hitting counter value 0, starting at a given control state and given counter value. Recently, we studied qualitative decision problems ("is the optimal termination value equal to 1?") for OC-MDPs (and OC-SSGs) and showed them to be decidable in polynomial time (in NP intersection coNP, respectively). However, quantitative decision and approximation problems ("is the optimal termination value at least p", or "approximate the termination value within epsilon") are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure. It thus remained open even whether any of these quantitative termination problems are computable. In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given an OC-SSG, and given epsilon>0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error epsilon, and furthermore we can compute epsilon-optimal strategies for both players in the game. A key ingredient in our proofs is a subtle martingale, derived from solving certain linear programs that we can associate with a maximizing OC-MDP. An application of Azuma's inequality on these martingales yields a computable bound for the "wealth" at which a "rich person's strategy" becomes epsilon-optimal for OC-MDPs.

Návaznosti
GAP202/10/1469, projekt VaV	Název: Formální metody pro analýzu a verifikaci komplexních systémů
GAP202/10/1469, projekt VaV	Investor: Grantová agentura ČR, Formální metody pro analýzu a verifikaci komplexních systémů
1M0545, projekt VaV	Název: Institut Teoretické Informatiky
1M0545, projekt VaV	Investor: Ministerstvo školství, mládeže a tělovýchovy ČR, Institut Teoretické Informatiky

Typ	Název	Vložil/a	Vloženo
	Preprint.pdf	Kučera, A.	20. 10. 2013
Vlastnosti Název Preprint.pdf Adresa v ISu https://is.muni.cz/auth/publication/1076357/Preprint.pdf Adresa ze světa https://is.muni.cz/publication/1076357/Preprint.pdf Adresa do Správce https://is.muni.cz/auth/publication/1076357/Preprint.pdf?info Ze světa do Správce https://is.muni.cz/publication/1076357/Preprint.pdf?info Vloženo Ne 20. 10. 2013 13:13, prof. RNDr. Antonín Kučera, Ph.D. Práva Právo číst kdokoliv v Internetu osoba prof. RNDr. Antonín Kučera, Ph.D., učo 2508 osoba doc. RNDr. Tomáš Brázdil, Ph.D., učo 4074 osoba RNDr. Václav Brožek, Ph.D., učo 99081 Právo vkládat Právo spravovat osoba prof. RNDr. Antonín Kučera, Ph.D., učo 2508 osoba doc. RNDr. Tomáš Brázdil, Ph.D., učo 4074 osoba RNDr. Václav Brožek, Ph.D., učo 99081 Atributy Preprint.pdf Aplikace Otevřít soubor. Stáhnout soubor. Adresa v ISu https://is.muni.cz/auth/publication/1076357/Preprint.pdf Adresa ze světa https://is.muni.cz/publication/1076357/Preprint.pdf Typ souboru PDF (application/pdf) Velikost 214,4 KB Hash md5 ee6e19dca0dc6292b0c7687781a0f3fd Vloženo Ne 20. 10. 2013 13:13 Preprint.txt Aplikace Otevřít soubor. Stáhnout soubor. Adresa v ISu https://is.muni.cz/auth/publication/1076357/Preprint.txt Adresa ze světa https://is.muni.cz/publication/1076357/Preprint.txt Typ souboru holý text (text/plain) Velikost 77,9 KB Hash md5 3ac449337e478347429736d43300c311 Vloženo Ne 20. 10. 2013 13:15

Vytisknout
Nahlásit neoprávněně vložený soubor Zobrazeno: 29. 9. 2024 17:21

Approximating the termination value of one-counter MDPs and stochastic games

Vlastnosti

Práva

Preprint.pdf

Preprint.txt

Další aplikace