The Alignment Floor: How Persona Customization Breaks Safety in Weakly-Aligned LLMs

ArXiv CS.CL · 2026-05-29 · Authors: Xing Zhang, Guanghui Wang, Yanwei Cui, Wei Qiu, Ziyuan Li, Bing Zhu, Peiyang He
