Jailbreak susceptibility prediction and mitigation via the behavioral geometry of models

ArXiv CS.AI · 2026-05-27 · Authors: Hayden Helm, Xiaodong Liu, Weiwei Yang
