Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety

Event

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2606.00801v1 Announce Type: cross

Abstract: Current approaches to LLM adversarial testing suffer from coverage gaps: manual red-teaming does not scale, LLM-as-attacker methods exhibit mode collapse, and gradient-based approaches produce uninterpretable gibberish. We introduce a quality-diversity evolutionary framework that operates at the semantic level, evolving interpretable attack strategies rather than t
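For readers unfamiliar with quality-diversity search, the general idea can be illustrated with a MAP-Elites-style loop, a well-known quality-diversity algorithm. The abstract does not say which algorithm the paper uses, so this is only a generic sketch on a toy numeric problem, not the authors' method. All names here (`map_elites`, `toy_fitness`, `toy_descriptor`, `toy_mutate`) are hypothetical and chosen for illustration.

```python
import random


def map_elites(fitness, descriptor, mutate, seed_solutions, n_bins, iterations, rng):
    """Generic MAP-Elites loop: keep the best solution found in each descriptor bin."""
    archive = {}  # bin index -> (fitness, solution)

    def try_insert(solution):
        score = fitness(solution)
        # Map a descriptor in [0, 1] to one of n_bins cells.
        cell = min(int(descriptor(solution) * n_bins), n_bins - 1)
        # A solution survives only if its cell is empty or it beats the current elite.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, solution)

    for solution in seed_solutions:
        try_insert(solution)
    for _ in range(iterations):
        # Pick a random elite as the parent, mutate it, and try to place the child.
        _, parent = archive[rng.choice(sorted(archive))]
        try_insert(mutate(parent, rng))
    return archive


# Toy problem (an assumption for illustration): solutions are points (x, y) in the unit square.
def toy_fitness(sol):
    x, y = sol
    return 1.0 - abs(y - x)  # "quality": best when y equals x


def toy_descriptor(sol):
    return sol[0]  # "diversity" axis: the x coordinate


def toy_mutate(sol, rng):
    # Gaussian perturbation, clipped back into [0, 1].
    return tuple(min(1.0, max(0.0, v + rng.gauss(0, 0.1))) for v in sol)


rng = random.Random(0)
archive = map_elites(toy_fitness, toy_descriptor, toy_mutate,
                     seed_solutions=[(0.5, 0.5)], n_bins=10,
                     iterations=2000, rng=rng)
for cell in sorted(archive):
    score, solution = archive[cell]
    print(cell, round(score, 3), solution)
```

Unlike a plain optimizer, which would return one best point, the archive keeps one elite per descriptor cell. That is the property that counters the mode collapse the abstract mentions: coverage of the descriptor space is preserved by construction.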