Safety Generalization Under Distribution Shift in Safe Reinforcement Learning: A Diabetes Testbed

arXiv cs.AI | 2026-05-26 | Authors: Minjae Kwon, Josephine Lamp, Lu Feng

Abstract

arXiv:2601.21094v2

Safe Reinforcement Learning (RL) algorithms are typically evaluated under fixed training conditions. We investigate whether training-time safety guarantees transfer to deployment under distribution shift, using diabetes management as a safety-critical testbed. We benchmark safe RL algorithms on a unified clinical simulator and reveal a safety generalization gap: policies that satisfy constraints during training frequently violate safety requirements on unseen patients. We demonstrate that test-time shielding, which filters unsafe actions using learned dynamics models, effectively restores safety across algorithms and patient populations. Across eight safe RL algorithms, three diabetes types, and three age groups, shielding achieves Time-in-Range gains of 13-14% for strong baselines such as PPO-Lag and CPO while reducing the clinical risk index and glucose variability.
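The shielding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-step lookahead, the fallback rule, the 70-180 mg/dL band (a common clinical convention, assumed here), and the names `shield`, `time_in_range`, and `toy_model` are all hypothetical, and `toy_model` is a linear stand-in for a learned dynamics model.

```python
from typing import Callable, Sequence

# Assumed target band in mg/dL (common 70-180 convention); the abstract
# does not state the thresholds used in the paper.
LOW, HIGH = 70.0, 180.0


def shield(
    glucose: float,
    proposed: float,
    fallbacks: Sequence[float],
    predict: Callable[[float, float], float],
) -> float:
    """Keep the proposed action if the model predicts a next glucose value
    inside the band; otherwise return the fallback with the smallest
    predicted violation."""

    def violation(action: float) -> float:
        g_next = predict(glucose, action)
        # Distance outside the band; 0.0 when the prediction is in range.
        return max(LOW - g_next, 0.0, g_next - HIGH)

    if violation(proposed) == 0.0:
        return proposed
    return min(fallbacks, key=violation)


def time_in_range(trace: Sequence[float]) -> float:
    """Percentage of glucose readings that fall inside [LOW, HIGH]."""
    if not trace:
        return 0.0
    inside = sum(1 for g in trace if LOW <= g <= HIGH)
    return 100.0 * inside / len(trace)


def toy_model(glucose: float, insulin: float) -> float:
    """Hypothetical dynamics: each insulin unit lowers glucose by 30 mg/dL."""
    return glucose - 30.0 * insulin
```

For example, `shield(100.0, 2.0, [0.0, 0.5, 1.0], toy_model)` rejects the proposed dose, because the toy model predicts 40 mg/dL, and falls back to the first candidate whose prediction stays in range. A real shield would query a learned model, possibly over a longer horizon.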
