J 2025

PAC statistical model checking of mean payoff in discrete- and continuous-time MDP

KŘETÍNSKÝ, Jan; Chaitanya AGARWAL; Shibashis GUHA a M. PAZHAMALAI

Základní údaje

Originální název

PAC statistical model checking of mean payoff in discrete- and continuous-time MDP

Autoři

KŘETÍNSKÝ, Jan; Chaitanya AGARWAL; Shibashis GUHA a M. PAZHAMALAI

Vydání

Springer, 2025, 0925-9856

Další údaje

Jazyk

angličtina

Typ výsledku

Článek v odborném periodiku

Obor

10200 1.2 Computer and information sciences

Stát vydavatele

Nizozemské království

Utajení

není předmětem státního či obchodního tajemství

Odkazy

Impakt faktor

Impact factor: 0.800 v roce 2024

Označené pro přenos do RIV

Ano

Kód RIV

RIV/00216224:14330/25:00142466

Organizační jednotka

Fakulta informatiky

EID Scopus

Klíčová slova anglicky

Markov decision processes; Statistical model checking; Mean payoff; Reinforcement learning
Změněno: 1. 4. 2026 11:12, RNDr. Pavel Šmerk, Ph.D.

Anotace

V originále

Markov decision processes (MDPs) and continuous-time MDP (CTMDPs) are the fundamental models for non-deterministic systems with probabilistic uncertainty. Mean payoff (a.k.a. long-run average reward) is one of the most classic objectives considered in their context. We provide the first practical algorithm to compute mean payoff probably approximately correctly in unknown MDPs. Our algorithm is anytime in the sense that if terminated prematurely, it returns an approximate value with the required confidence. Further, we extend it to unknown CTMDPs. We do not require any knowledge of the state or number of successors of a state, but only a lower bound on the minimum transition probability, which has been advocated in literature. Our algorithm learns the unknown MDP/CTMDP through repeated, directed sampling; thus spending less time on learning components with smaller impact on the mean payoff. In addition to providing probably approximately correct (PAC) bounds for our algorithm, we also demonstrate its practical nature by running experiments on standard benchmarks.

Návaznosti

MUNI/I/1757/2021, interní kód MU
Název: MUNI Award in Science and Humanities (Akronym: Křetínský)
Investor: Masarykova univerzita, MUNI Award in Science and Humanities, MASH - MUNI Award in Science and Humanities