D 2023

Shielding in Resource-Constrained Goal POMDPs

AJDARÓW, Michal, Šimon BRLEJ and Petr NOVOTNÝ

Basic information

Original name

Shielding in Resource-Constrained Goal POMDPs

Authors

AJDARÓW, Michal (203 Czech Republic, belonging to the institution), Šimon BRLEJ (703 Slovakia, belonging to the institution) and Petr NOVOTNÝ (203 Czech Republic, belonging to the institution)

Edition

Washington, DC, USA, Proceedings of the 37th AAAI Conference on Artificial Intelligence, p. 14674-14682, 9 pp. 2023

Publisher

AAAI Press

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10200 1.2 Computer and information sciences

Country of publisher

United States of America

Confidentiality degree

is not subject to a state or trade secret

Publication form

electronic version available online

References:

RIV identification code

RIV/00216224:14330/23:00131270

Organization unit

Faculty of Informatics

ISBN

978-1-57735-880-0

ISSN

Keywords in English

decision making; Markov decision processes; controller synthesis; resource constraints; shielding
Changed: 7/4/2024 23:07, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

We consider partially observable Markov decision processes (POMDPs) modeling an agent that needs a supply of a certain resource (e.g., electricity stored in batteries) to operate correctly. The resource is consumed by the agent's actions and can be replenished only in certain states. The agent aims to minimize the expected cost of reaching some goal while preventing resource exhaustion, a problem we call resource-constrained goal optimization (RSGO). We take a two-step approach to the RSGO problem. First, using formal methods techniques, we design an algorithm computing a shield for a given scenario: a procedure that observes the agent and prevents it from using actions that might eventually lead to resource exhaustion. Second, we augment the POMCP heuristic search algorithm for POMDP planning with our shields to obtain an algorithm solving the RSGO problem. We implement our algorithm and present experiments showing its applicability to benchmarks from the literature.
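The shielding idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a fully observable model with integer action costs, a fixed resource capacity, and reload states that refill the resource on entry, whereas the paper works with POMDPs. All names (`min_safe_levels`, `allowed_actions`, the toy model `T`) are hypothetical. The sketch first computes, for each state, the least resource level from which exhaustion can be avoided forever, and then uses those levels to filter actions.

```python
import math

INF = math.inf

def _entry(need, reloads, capacity, t):
    # Level required on *arrival* at t: a reload state refills the resource,
    # so nothing is required as long as t is survivable with a full tank.
    if t in reloads and need[t] <= capacity:
        return 0
    return need[t]

def min_safe_levels(transitions, reloads, capacity):
    """Least level needed in each state to avoid exhaustion forever.

    transitions[s][a] = (cost, successors); costs are assumed to be
    non-negative integers and successor sets non-empty. A state with
    no actions is treated as unsafe (assumption of this sketch).
    """
    need = {s: 0 for s in transitions}
    changed = True
    while changed:  # monotone fixed-point iteration from below
        changed = False
        for s, actions in transitions.items():
            best = min(
                (cost + max(_entry(need, reloads, capacity, t) for t in succs)
                 for cost, succs in actions.values()),
                default=INF,
            )
            if best > capacity:
                best = INF  # cannot be satisfied even with a full tank
            if best != need[s]:
                need[s] = best
                changed = True
    return need

def allowed_actions(transitions, reloads, capacity, need, state, level):
    """Shield: actions that stay safe against every possible successor."""
    return {
        a for a, (cost, succs) in transitions[state].items()
        if cost + max(_entry(need, reloads, capacity, t) for t in succs) <= level
    }

# Toy model (hypothetical): "base" is the only reload state.
T = {
    "base": {"go": (1, {"a", "b"})},
    "a": {"back": (1, {"base"}), "far": (2, {"b"})},
    "b": {"back": (3, {"base"}), "deeper": (1, {"trap"})},
    "trap": {"loop": (1, {"trap"})},
}
RELOADS = {"base"}
CAPACITY = 6
NEED = min_safe_levels(T, RELOADS, CAPACITY)
```

In this toy model the shield never permits `deeper` in state `b`, because `trap` has no path back to a reload state. A planner such as POMCP would then, under the same assumptions, sample only from `allowed_actions(...)` at each step.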

Links

GA21-24711S, research and development project
Name: Efficient analysis and optimization of probabilistic systems and games (Acronym: Efektivní analýza a optimalizace pravděpodobnostní)
Investor: Czech Science Foundation
MUNI/A/1433/2022, internal MU code
Name: Involvement of students of the Faculty of Informatics in the international scientific community 23
Investor: Masaryk University