The Alignment Floor: When Persona Customization Is Safe
Event: PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM
arXiv:2605.27382v1 | Announce Type: cross
Abstract: A key promise of pluralistic AI is behavioral adaptation: persona prompts like "be creative" or "be thorough" let systems respect diverse user values and communication styles. But how much customization can a model absorb before its alignment breaks? We present the first controlled study of the alignment-customization tradeoff, testing seven persona conditions across five tasks on two model
Related coverage (1)
The Alignment Floor: How Persona Customization Breaks Safety in Weakly-Aligned LLMs
ArXiv CS.CL | 2026-05-29