Event: Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs

Name: Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs
Start: 2026-06-02

Type: BREAKTHROUGH
Date: 2026-06-02
Impact: HIGH

arXiv:2603.24511v2
Announce Type: replace-cross

Abstract: We show that AI agents are capable of discovering novel algorithms for adversarial attacks against LLMs, advancing the state of the art on white-box jailbreaking and prompt injection evaluations. We deploy frontier agents, such as Claude Code and Codex, in an autoresearch loop with access to a library of 30+ prior methods and an evaluation script wit

Category: Artificial Intelligence
