Detailed Information on Publication Record
2023
Shielding in Resource-Constrained Goal POMDPs
AJDARÓW, Michal, Šimon BRLEJ and Petr NOVOTNÝ
Basic information
Original name
Shielding in Resource-Constrained Goal POMDPs
Authors
AJDARÓW, Michal (203 Czech Republic, belonging to the institution), Šimon BRLEJ (703 Slovakia, belonging to the institution) and Petr NOVOTNÝ (203 Czech Republic, belonging to the institution)
Edition
Washington, DC, USA, Proceedings of the 37th AAAI Conference on Artificial Intelligence, p. 14674-14682, 9 pp. 2023
Publisher
AAAI Press
Other information
Language
English
Type of outcome
Article in proceedings
Field of Study
10200 1.2 Computer and information sciences
Country of publisher
United States of America
Confidentiality degree
is not subject to a state or trade secret
Publication form
electronic version available online
RIV identification code
RIV/00216224:14330/23:00131270
Organization unit
Faculty of Informatics
ISBN
978-1-57735-880-0
ISSN
Keywords in English
decision making; Markov decision processes; controller synthesis; resource constraints; shielding
Changed: 7/4/2024 23:07, RNDr. Pavel Šmerk, Ph.D.
Abstract
In the original
We consider partially observable Markov decision processes (POMDPs) modeling an agent that needs a supply of a certain resource (e.g., electricity stored in batteries) to operate correctly. The resource is consumed by the agent's actions and can be replenished only in certain states. The agent aims to minimize the expected cost of reaching some goal while preventing resource exhaustion, a problem we call resource-constrained goal optimization (RSGO). We take a two-step approach to the RSGO problem. First, using formal methods techniques, we design an algorithm computing a shield for a given scenario: a procedure that observes the agent and prevents it from using actions that might eventually lead to resource exhaustion. Second, we augment the POMCP heuristic search algorithm for POMDP planning with our shields to obtain an algorithm solving the RSGO problem. We implement our algorithm and present experiments showing its applicability to benchmarks from the literature.
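The shielding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a fully observable model with a finite integer resource level, whereas the paper works with POMDPs. All names (`compute_shield`, the toy states and actions) are hypothetical. The sketch computes the largest set of (state, level) pairs from which exhaustion can be avoided forever and allows only actions whose every possible successor stays inside that set.

```python
from itertools import product


def compute_shield(states, actions, capacity, reload_states, goal_states):
    """Return a function (state, level) -> list of allowed action names.

    actions[s] maps an action name to (cost, possible_successors).
    Assumption: entering a reload state resets the level to `capacity`,
    and goal states are treated as safe at any level.
    """
    # Start from all (state, level) pairs and prune unsafe ones.
    safe = set(product(states, range(capacity + 1)))

    def allowed(state, level):
        result = []
        for name, (cost, successors) in actions.get(state, {}).items():
            remaining = level - cost
            if remaining < 0:
                continue  # the action itself would exhaust the resource
            ok = True
            for nxt in successors:
                nxt_level = capacity if nxt in reload_states else remaining
                if nxt not in goal_states and (nxt, nxt_level) not in safe:
                    ok = False  # some outcome leaves the safe set
                    break
            if ok:
                result.append(name)
        return result

    # Greatest fixed point: drop pairs that have no allowed action.
    changed = True
    while changed:
        changed = False
        for state, level in list(safe):
            if state in goal_states:
                continue
            if not allowed(state, level):
                safe.discard((state, level))
                changed = True

    return allowed


# Toy scenario (hypothetical): capacity 3, one charging station.
STATES = ["start", "station", "far", "goal"]
ACTIONS = {
    "start": {"go_far": (2, ["far"]), "go_station": (1, ["station"])},
    "station": {"go_goal": (3, ["goal"])},
    "far": {"go_goal": (2, ["goal"]), "back": (1, ["start"])},
}
shield = compute_shield(STATES, ACTIONS, 3, {"station"}, {"goal"})
```

In this toy model, `shield("start", 3)` blocks `go_far`, because arriving at `far` with level 1 leaves no way to reach the goal or a reload state. A planner such as POMCP would then sample only from the actions the shield returns.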
Links
GA21-24711S, research and development project
MUNI/A/1433/2022, internal MU code