PsychoPass: Geometric Profiling of Multi-Turn Adversarial LLM Conversations (Event)
PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM
arXiv:2606.03136v1 | Announce Type: cross
Abstract: Multi-turn jailbreak attacks on large language models (LLMs) reveal a mismatch in current guardrails: they operate on individual turns, while attacks unfold as trajectories across conversations. We propose a shift from content to dynamics, modeling conversations as paths in representation space and asking whether adversarial intent is encoded early in their geometry. We ...
Related coverage (1)
PsychoPass: Geometric Profiling of Multi-Turn Adversarial LLM Conversations
ArXiv CS.CL | 2026-06-03