Verification of Markov Decision Processes using Learning Algorithms

D 2014

BRÁZDIL, Tomáš, Krishnendu CHATTERJEE, Martin CHMELÍK, Vojtěch FOREJT, Jan KŘETÍNSKÝ et al.

Basic information

Original title

Verification of Markov Decision Processes using Learning Algorithms

Authors

BRÁZDIL, Tomáš (203 Czech Republic, affiliated with the institution), Krishnendu CHATTERJEE (356 India), Martin CHMELÍK (203 Czech Republic), Vojtěch FOREJT (203 Czech Republic), Jan KŘETÍNSKÝ (203 Czech Republic, guarantor, affiliated with the institution), Marta KWIATKOWSKA (616 Poland), David PARKER (826 United Kingdom of Great Britain and Northern Ireland) and Mateusz UJMA (616 Poland)

Edition

Heidelberg Dordrecht London New York, Automated Technology for Verification and Analysis - 12th International Symposium, ATVA 2014, pp. 98-114, 17 pp. 2014

Publisher

Springer

Other information

Language

English

Type of outcome

Proceedings paper

Field of study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Germany

Confidentiality

Not subject to a state or trade secret

Publication form

Printed version "print"

Impact factor

Impact factor: 0.402 in 2005

RIV identification code

RIV/00216224:14330/14:00075875

Organizational unit

Faculty of Informatics

ISBN

978-3-319-11935-9

ISSN

DOI

http://dx.doi.org/10.1007/978-3-319-11936-6_8

Keywords in English

stochastic systems; verification; machine learning; statistical model checking; reinforcement learning

Labels

core_A, firank_A, formela-conference

Flags

International impact, Reviewed

Changed: 27 April 2015 05:45, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original language

We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space. Our framework focuses on probabilistic reachability, which is a core property for verification, and is illustrated through two distinct instantiations. The first assumes that full knowledge of the MDP is available, and performs a heuristic-driven partial exploration of the model, yielding precise lower and upper bounds on the required probability. The second tackles the case where we may only sample the MDP, and yields probabilistic guarantees, again in terms of both the lower and upper bounds, which provides efficient stopping criteria for the approximation. The latter is the first extension of statistical model checking for unbounded properties in MDPs. In contrast with other related approaches, we do not restrict our attention to time-bounded (finite-horizon) or discounted properties, nor assume any particular properties of the MDP. We also show how our techniques extend to LTL objectives. We present experimental results showing the performance of our framework on several examples.
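The first instantiation described in the abstract (heuristic-driven partial exploration with lower and upper bounds) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy MDP, the names `MDP`, `action_value` and `bounded_reachability`, and the simplifying assumption that the only end components are a known absorbing target and a known absorbing sink are all hypothetical choices made for illustration. Each episode samples a path that follows the action with the best upper bound, then updates both bounds only on the states visited.

```python
import random

# Hypothetical toy MDP: state -> action -> list of (successor, probability).
# States 2 (target) and 3 (sink) are absorbing and have no entry here.
MDP = {
    0: {"a": [(1, 0.5), (3, 0.5)], "b": [(2, 0.3), (3, 0.7)]},
    1: {"a": [(2, 0.8), (3, 0.2)]},
}
TARGET, SINK = 2, 3


def action_value(bound, state, action):
    """Expected value of `bound` after taking `action` in `state`."""
    return sum(p * bound[succ] for succ, p in MDP[state][action])


def bounded_reachability(init=0, eps=1e-6, seed=0, max_episodes=10_000):
    """Return (lower, upper) bounds on the maximal probability of reaching TARGET.

    Assumption: the MDP has no end components other than TARGET and SINK,
    so every sampled path terminates and the upper bound can converge.
    """
    rng = random.Random(seed)
    lower = {s: 0.0 for s in (0, 1, 2, 3)}
    upper = {s: 1.0 for s in (0, 1, 2, 3)}
    lower[TARGET] = 1.0  # the target is reached with certainty from itself
    upper[SINK] = 0.0    # the target is unreachable from the sink

    for _ in range(max_episodes):
        if upper[init] - lower[init] < eps:
            break
        # Explore: follow the action that looks best under the upper bound.
        path, state = [], init
        while state in MDP:
            path.append(state)
            best = max(MDP[state], key=lambda a: action_value(upper, state, a))
            succs, probs = zip(*MDP[state][best])
            state = rng.choices(succs, weights=probs)[0]
        # Update: back up both bounds along the visited path only.
        for s in reversed(path):
            lower[s] = max(action_value(lower, s, a) for a in MDP[s])
            upper[s] = max(action_value(upper, s, a) for a in MDP[s])
    return lower[init], upper[init]


if __name__ == "__main__":
    lo, hi = bounded_reachability()
    print(f"maximal reachability probability lies in [{lo}, {hi}]")
```

In this toy model the true value from state 0 is 0.5 * 0.8 = 0.4 (action `a`, then `a`), and the loop stops as soon as the gap between the two bounds falls below `eps`. Handling general end components, which the sketch deliberately assumes away, requires additional machinery that is not shown here.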

Links

MUNI/A/0765/2013, internal MU code

Name: Involvement of students of the Faculty of Informatics in the international scientific community (Acronym: SKOMU)

Investor: Masaryk University, Involvement of students of the Faculty of Informatics in the international scientific community, until 2020_Category A - Specific research - Student research projects

MUNI/A/0855/2013, internal MU code

Name: Large-scale computing systems: models, applications and verification III. (Acronym: FI MAV III.)

Investor: Masaryk University, Large-scale computing systems: models, applications and verification III., until 2020_Category A - Specific research - Student research projects

Cite

BRÁZDIL, Tomáš, Krishnendu CHATTERJEE, Martin CHMELÍK, Vojtěch FOREJT, Jan KŘETÍNSKÝ, Marta KWIATKOWSKA, David PARKER and Mateusz UJMA. Verification of Markov Decision Processes using Learning Algorithms. In Automated Technology for Verification and Analysis - 12th International Symposium, ATVA 2014. Heidelberg Dordrecht London New York: Springer, 2014, pp. 98-114. ISBN 978-3-319-11935-9. Available from: https://dx.doi.org/10.1007/978-3-319-11936-6_8.

@inproceedings{1187947,
   author = {Brázdil, Tomáš and Chatterjee, Krishnendu and Chmelík, Martin and Forejt, Vojtěch and Křetínský, Jan and Kwiatkowska, Marta and Parker, David and Ujma, Mateusz},
   address = {Heidelberg Dordrecht London New York},
   booktitle = {Automated Technology for Verification and Analysis - 12th International Symposium, ATVA 2014},
   doi = {10.1007/978-3-319-11936-6_8},
   keywords = {stochastic systems; verification; machine learning; statistical model checking; reinforcement learning},
   howpublished = {printed version "print"},
   language = {eng},
   location = {Heidelberg Dordrecht London New York},
   isbn = {978-3-319-11935-9},
   pages = {98--114},
   publisher = {Springer},
   title = {Verification of Markov Decision Processes using Learning Algorithms},
   year = {2014}
}

TY  - CONF
ID  - 1187947
AU  - Brázdil, Tomáš
AU  - Chatterjee, Krishnendu
AU  - Chmelík, Martin
AU  - Forejt, Vojtěch
AU  - Křetínský, Jan
AU  - Kwiatkowska, Marta
AU  - Parker, David
AU  - Ujma, Mateusz
PY  - 2014
TI  - Verification of Markov Decision Processes using Learning Algorithms
PB  - Springer
CY  - Heidelberg Dordrecht London New York
SN  - 9783319119359
KW  - stochastic systems
KW  - verification
KW  - machine learning
KW  - statistical model checking
KW  - reinforcement learning
N2  - We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space. Our framework focuses on probabilistic reachability, which is a core property for verification, and is illustrated through two distinct instantiations. The first assumes that full knowledge of the MDP is available, and performs a heuristic-driven partial exploration of the model, yielding precise lower and upper bounds on the required probability. The second tackles the case where we may only sample the MDP, and yields probabilistic guarantees, again in terms of both the lower and upper bounds, which provides efficient stopping criteria for the approximation. The latter is the first extension of statistical model checking for unbounded properties in MDPs. In contrast with other related approaches, we do not restrict our attention to time-bounded (finite-horizon) or discounted properties, nor assume any particular properties of the MDP. We also show how our techniques extend to LTL objectives. We present experimental results showing the performance of our framework on several examples.
ER  -

BRÁZDIL, Tomáš, Krishnendu CHATTERJEE, Martin CHMELÍK, Vojtěch FOREJT, Jan KŘETÍNSKÝ, Marta KWIATKOWSKA, David PARKER and Mateusz UJMA. Verification of Markov Decision Processes using Learning Algorithms. In \textit{Automated Technology for Verification and Analysis - 12th International Symposium, ATVA 2014}. Heidelberg Dordrecht London New York: Springer, 2014, pp.~98-114. ISBN~978-3-319-11935-9. Available from: https://dx.doi.org/10.1007/978-3-319-11936-6\_{}8.
