Scenario Generation for Risk-Aware Reinforcement Learning with Probably Approximately Safe Guarantees

arXiv cs.AI | 2026-06-04 | Authors: Mohit Prashant, Arvind Easwaran

Abstract

arXiv:2606.04812v1 (announce type: cross)

Guaranteeing safety is critical to deploying reinforcement learning (RL) agents in the real world, especially since policies learned with deep RL may be susceptible to transition perturbations that result in unknown or unsafe behaviour. One method of policy verification is to construct probabilistic barrier certificates by sampling policy trajectories with respect to safety constraints, thereby demarcating known safe behaviour from unknown behaviour. Obtaining tight upper and lower bounds on the probability of violating these constraints may be difficult if the policy is susceptible to transition uncertainty or perturbation that places the agent in insufficiently explored states.
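To make the idea of bounding a violation probability from sampled trajectories concrete, here is a minimal sketch. It is not the authors' barrier-certificate construction: it swaps in a plain Monte Carlo estimate with a two-sided Hoeffding interval, and the 1-D dynamics, the policy, the noise level, and the safety threshold are all hypothetical choices made only for illustration.

```python
import math
import random


def hoeffding_bounds(violations, n, delta):
    """Two-sided Hoeffding interval on the violation probability.

    With probability at least 1 - delta over the n i.i.d. rollouts,
    the true violation probability lies in [lower, upper].
    """
    p_hat = violations / n
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return max(0.0, p_hat - eps), p_hat, min(1.0, p_hat + eps)


def rollout_violates(policy, step, is_unsafe, s0, horizon, rng):
    """Run one trajectory; return True if any visited state is unsafe."""
    s = s0
    for _ in range(horizon):
        s = step(s, policy(s), rng)
        if is_unsafe(s):
            return True
    return False


# Hypothetical toy system (assumption, not from the paper):
# a 1-D state pushed toward the origin, with Gaussian transition noise.
def policy(s):
    return -0.5 * s


def step(s, a, rng):
    return s + a + rng.gauss(0.0, 0.5)


def is_unsafe(s):
    return abs(s) > 2.0


rng = random.Random(0)
n = 2000
violations = sum(
    rollout_violates(policy, step, is_unsafe, 0.0, 20, rng) for _ in range(n)
)
lower, p_hat, upper = hoeffding_bounds(violations, n, delta=0.05)
print(f"estimate={p_hat:.4f}, interval=[{lower:.4f}, {upper:.4f}]")
```

The interval width depends only on the number of rollouts and on delta, not on which states the rollouts visit. That is one way to see the difficulty the abstract points to: if perturbations push the agent into states that the sampled trajectories rarely reach, a bound of this kind stays loose unless many more scenarios are drawn.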
