Value iteration for simple stochastic games: Stopping criterion and learning algorithm

J 2022


EISENTRAUT, Julia; Edon KELMENDI; Jan KŘETÍNSKÝ and Maximilian WEININGER

Basic information

Original title

Value iteration for simple stochastic games: Stopping criterion and learning algorithm

Authors

EISENTRAUT, Julia; Edon KELMENDI; Jan KŘETÍNSKÝ and Maximilian WEININGER

Edition

Information and Computation, Amsterdam, Elsevier, 2022, 0890-5401

Other information

Type of outcome

Article in a scholarly journal

Impact factor

1.000

Marked for transfer to RIV

Organizational unit

Faculty of Informatics

DOI

https://doi.org/10.1016/J.IC.2022.104886

Changed: 17 March 2025, 14:43, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

The classical problem of reachability in simple stochastic games is typically solved by value iteration (VI), which produces a sequence of under-approximations of the value of the game, but is only guaranteed to converge in the limit. We provide an additional converging sequence of over-approximations, based on an analysis of the game graph. Together, these two sequences entail the first error bound and hence the first stopping criterion for VI on simple stochastic games, indicating when the algorithm can be stopped for a given precision. Consequently, VI becomes an anytime algorithm returning the approximation of the value and the current error bound. We further use this error bound to provide a learning-based asynchronous VI algorithm; it uses simulations and thus often avoids exploring the whole game graph, but still yields the same guarantees. Finally, we experimentally show that the overhead for computing the additional sequence of over-approximations often is negligible.
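
The idea of pairing under- and over-approximations can be illustrated on a toy example. The following Python sketch is not the authors' algorithm: it assumes a hypothetical game (`GAME`, with invented states `s0`, `s1`, `target`, `sink`) that contains no end components besides the absorbing target and sink, so that naively iterating the Bellman operator from above also converges. The graph analysis that the abstract refers to, which is what handles general games, is not reproduced here. All function names are illustrative.

```python
from typing import Dict, List, Tuple

# One action = list of (probability, successor) pairs.
Action = List[Tuple[float, str]]

# Hypothetical toy game (an assumption for illustration only).
# Each state maps to (owner, actions); "target" and "sink" are absorbing.
GAME: Dict[str, Tuple[str, List[Action]]] = {
    "s0": ("max", [[(0.5, "s1"), (0.5, "sink")],
                   [(0.3, "target"), (0.7, "sink")]]),
    "s1": ("min", [[(1.0, "target")],
                   [(0.5, "s0"), (0.5, "target")]]),
}


def bellman(game, values):
    """Apply one synchronous Bellman update to every non-absorbing state."""
    new = dict(values)  # keeps the fixed values of "target" and "sink"
    for state, (owner, actions) in game.items():
        expectations = [sum(p * values[succ] for p, succ in action)
                        for action in actions]
        new[state] = max(expectations) if owner == "max" else min(expectations)
    return new


def bounded_value_iteration(game, eps=1e-6, max_iters=100_000):
    """Iterate a lower and an upper bound until they are eps-close.

    Assumes the game has no end components apart from target and sink;
    otherwise the upper bound need not converge to the value.
    """
    lower = {s: 0.0 for s in game}
    upper = {s: 1.0 for s in game}
    for bounds in (lower, upper):
        bounds["target"] = 1.0
        bounds["sink"] = 0.0
    for _ in range(max_iters):
        # Stopping criterion: the gap is the current error bound.
        if max(upper[s] - lower[s] for s in game) < eps:
            break
        lower = bellman(game, lower)
        upper = bellman(game, upper)
    return lower, upper


if __name__ == "__main__":
    lo, hi = bounded_value_iteration(GAME)
    for s in GAME:
        print(s, lo[s], hi[s])
```

Under these assumptions the two sequences bracket the value (1/3 for `s0` in this toy game), and the gap between `upper` and `lower` serves as the current error bound, which is what makes the procedure usable as an anytime algorithm.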

Cite

EISENTRAUT, Julia; Edon KELMENDI; Jan KŘETÍNSKÝ and Maximilian WEININGER. Value iteration for simple stochastic games: Stopping criterion and learning algorithm. Information and Computation. Amsterdam: Elsevier, 2022, vol. 285, no. 104886, pp. 1-32. ISSN 0890-5401. Available from: https://doi.org/10.1016/J.IC.2022.104886.

@article{2484770,
   author = {Eisentraut, Julia and Kelmendi, Edon and Křetínský, Jan and Weininger, Maximilian},
   article_location = {Amsterdam},
   article_number = {104886},
   doi = {https://doi.org/10.1016/J.IC.2022.104886},
   issn = {0890-5401},
   journal = {Information and Computation},
   title = {Value iteration for simple stochastic games: Stopping criterion and learning algorithm},
   volume = {285},
   year = {2022}
}

TY  - JOUR
ID  - 2484770
AU  - Eisentraut, Julia
AU  - Kelmendi, Edon
AU  - Křetínský, Jan
AU  - Weininger, Maximilian
PY  - 2022
TI  - Value iteration for simple stochastic games: Stopping criterion and learning algorithm
JF  - Information and Computation
VL  - 285
IS  - 104886
SP  - 1
EP  - 32
PB  - Elsevier
SN  - 08905401
N2  - The classical problem of reachability in simple stochastic games is typically solved by value iteration (VI), which produces a sequence of under-approximations of the value of the game, but is only guaranteed to converge in the limit. We provide an additional converging sequence of over-approximations, based on an analysis of the game graph. Together, these two sequences entail the first error bound and hence the first stopping criterion for VI on simple stochastic games, indicating when the algorithm can be stopped for a given precision. Consequently, VI becomes an anytime algorithm returning the approximation of the value and the current error bound. We further use this error bound to provide a learning-based asynchronous VI algorithm; it uses simulations and thus often avoids exploring the whole game graph, but still yields the same guarantees. Finally, we experimentally show that the overhead for computing the additional sequence of over-approximations often is negligible.
ER  -

EISENTRAUT, Julia; Edon KELMENDI; Jan KŘETÍNSKÝ and Maximilian WEININGER. Value iteration for simple stochastic games: Stopping criterion and learning algorithm. \textit{Information and Computation}. Amsterdam: Elsevier, 2022, vol.~285, no.~104886, pp.~1-32. ISSN~0890-5401. Available from: https://doi.org/10.1016/J.IC.2022.104886.
