Trading Performance for Stability in Markov Decision Processes

D 2013

BRÁZDIL, Tomáš; Krishnendu CHATTERJEE; Vojtěch FOREJT and Antonín KUČERA

Basic information

Original title

Trading Performance for Stability in Markov Decision Processes

Authors

BRÁZDIL, Tomáš (203 Czech Republic, belonging to the institution); Krishnendu CHATTERJEE (356 India); Vojtěch FOREJT (203 Czech Republic, belonging to the institution) and Antonín KUČERA (203 Czech Republic, guarantor, belonging to the institution)

Edition

London, Proceedings of 28th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2013), pp. 331-340, 10 pp. 2013

Publisher

IEEE Computer Society

Other information

Language

English

Type of outcome

Proceedings paper

Field of study

10201 Computer sciences, information science, bioinformatics

Country of publisher

United States

Confidentiality

not subject to a state or trade secret

Publication form

storage medium (CD, DVD, flash disk)

RIV identification code

RIV/00216224:14330/13:00066541

Organization unit

Faculty of Informatics

ISBN

978-1-4799-0413-6

ISSN

DOI

https://doi.org/10.1109/LICS.2013.39

UT WoS

000326815000038

Keywords (in Czech)

Markovovy rozhodovací procesy; optimalizace

Keywords (in English)

Markov decision processes; optimization

Tags

core_A, firank_1, formela-conference

Flags

International impact, Reviewed

Changed: 24 April 2014 18:43, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original language

We study the complexity of central controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize both the expected mean-payoff performance of the system and its stability. We argue that the basic theoretical notion of expressing the stability in terms of the variance of the mean-payoff (called global variance in our paper) is not always sufficient, since it ignores possible instabilities on respective runs. For this reason we propose alternative definitions of stability, which we call local and hybrid variance, and which express how rewards on each run deviate from the run's own mean-payoff and from the expected mean-payoff, respectively.
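
The distinction drawn in the abstract can be illustrated with a small finite-sample sketch. This is not the paper's formal definition, which concerns infinite runs and limits; it assumes finite reward sequences with given probabilities, and the names `mean_payoff` and `variances` are hypothetical helpers introduced only for this illustration.

```python
from statistics import fmean


def mean_payoff(run):
    """Average reward along one finite run (a finite stand-in for the limit)."""
    return fmean(run)


def variances(runs, probs):
    """Return (global, local, hybrid) variance for finite runs.

    Assumption: `runs` is a list of finite reward sequences and `probs`
    gives the probability of each run; both are illustrative inputs.
    """
    mps = [mean_payoff(r) for r in runs]
    expected_mp = sum(p * m for p, m in zip(probs, mps))
    # Global: variance of the per-run mean-payoff around its expectation.
    global_var = sum(p * (m - expected_mp) ** 2 for p, m in zip(probs, mps))
    # Local: deviation of rewards from the run's own mean-payoff.
    local_var = sum(
        p * fmean([(x - m) ** 2 for x in r]) for p, r, m in zip(probs, runs, mps)
    )
    # Hybrid: deviation of rewards from the expected mean-payoff.
    hybrid_var = sum(
        p * fmean([(x - expected_mp) ** 2 for x in r]) for p, r in zip(probs, runs)
    )
    return global_var, local_var, hybrid_var


# Both runs have mean-payoff 1, yet the first one oscillates between 0 and 2.
oscillating = variances([[0, 2, 0, 2], [1, 1, 1, 1]], [0.5, 0.5])
# Each run is constant, but the two runs have different mean-payoffs.
split = variances([[0, 0, 0, 0], [2, 2, 2, 2]], [0.5, 0.5])
```

In the first example the global variance is zero although one run is unstable, which the local and hybrid values pick up; in the second the runs are individually stable, so only the global and hybrid values are non-zero.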

Links

GPP202/12/P612, research and development project

Name: Formal verification of stochastic real-time systems (Acronym: Formální verifikace stochastických systémů s reáln)

Investor: Czech Science Foundation, Formal verification of stochastic real-time systems

Cite

BRÁZDIL, Tomáš; Krishnendu CHATTERJEE; Vojtěch FOREJT and Antonín KUČERA. Trading Performance for Stability in Markov Decision Processes. In Proceedings of 28th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2013). London: IEEE Computer Society, 2013, pp. 331-340. ISBN 978-1-4799-0413-6. Available from: https://doi.org/10.1109/LICS.2013.39.

@inproceedings{1130508,
   author = {Brázdil, Tomáš and Chatterjee, Krishnendu and Forejt, Vojtěch and Kučera, Antonín},
   address = {London},
   booktitle = {Proceedings of 28th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2013)},
   doi = {10.1109/LICS.2013.39},
   keywords = {Markov decision processes; optimization},
   howpublished = {storage medium},
   language = {eng},
   location = {London},
   isbn = {978-1-4799-0413-6},
   pages = {331-340},
   publisher = {IEEE Computer Society},
   title = {Trading Performance for Stability in Markov Decision Processes},
   year = {2013}
}

TY  - CONF
ID  - 1130508
AU  - Brázdil, Tomáš
AU  - Chatterjee, Krishnendu
AU  - Forejt, Vojtěch
AU  - Kučera, Antonín
PY  - 2013
TI  - Trading Performance for Stability in Markov Decision Processes
PB  - IEEE Computer Society
CY  - London
SN  - 9781479904136
KW  - Markov decision processes
KW  - optimization
N2  - We study the complexity of central controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize both the expected mean-payoff performance of the system and its stability. We argue that the basic theoretical notion of expressing the stability in terms of the variance of the mean-payoff (called global variance in our paper) is not always sufficient, since it ignores possible instabilities on respective runs. For this reason we propose alternative definitions of stability, which we call local and hybrid variance, and which express how rewards on each run deviate from the run's own mean-payoff and from the expected mean-payoff, respectively.
ER  -

BRÁZDIL, Tomáš; Krishnendu CHATTERJEE; Vojtěch FOREJT and Antonín KUČERA. Trading Performance for Stability in Markov Decision Processes. In \textit{Proceedings of 28th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2013)}. London: IEEE Computer Society, 2013, pp.~331--340. ISBN~978-1-4799-0413-6. Available from: https://doi.org/10.1109/LICS.2013.39.
