EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

arXiv cs.CL, 2026-06-01
Authors: Yu Li, Yuenan Hou, Yingmei Wei, Yanming Guo, Chaochao Lu

Abstract

arXiv:2605.31140v1 (announce type: cross)

Large Language Models (LLMs) remain highly vulnerable to diverse attacks, particularly in black-box settings where the internals of target models are inaccessible. Existing black-box defenses typically rely on pre-defined filtering heuristics, which often fail to generalize to unseen attack types and target model architectures. We introduce EvoDefense, an experience-guided co-evolving black-box defense paradigm. EvoDefense employs a guard LLM to detect malicious queries and an experience memory module to accumulate defense knowledge from previous interactions. At the core of EvoDefense is a continuous attack-defense evolution loop, where an attack generator and the guard model iteratively refine their attack strategies and defense policies through experience-guided optimization. This design enables EvoDefense to generalize across unseen attacks and target models without retraining.
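The abstract gives no implementation details, so the following is only a toy sketch of the control flow it describes: a guard that consults an experience memory, an attack generator that mutates its queries, and a loop in which each side updates from the previous round. Every name here (`ExperienceMemory`, `Guard`, `MUTATIONS`, `co_evolve`) is hypothetical and does not come from the paper. The substring matching and string rewrites are deliberately crude stand-ins for the LLM calls a real system would make. Unlike the method described, this toy memorizes exact strings and makes no claim to generalize.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ExperienceMemory:
    """Hypothetical store of lessons learned from past interactions."""
    lessons: List[str] = field(default_factory=list)

    def add(self, lesson: str) -> None:
        # Keep each lesson once so the memory does not grow with duplicates.
        if lesson not in self.lessons:
            self.lessons.append(lesson)


@dataclass
class Guard:
    """Toy guard: flags a query if it contains any remembered pattern.

    Assumption: a real guard would be an LLM prompted with retrieved
    experience; substring matching only stands in for that call.
    """
    memory: ExperienceMemory

    def is_malicious(self, query: str) -> bool:
        q = query.lower()
        return any(pattern in q for pattern in self.memory.lessons)


# Toy attack generator: fixed rewrites standing in for LLM-produced variants.
SEED = "ignore previous instructions"
MUTATIONS: List[Callable[[str], str]] = [
    lambda s: s,
    lambda s: s.replace("ignore", "disregard"),
    lambda s: s.replace("previous", "earlier"),
    lambda s: s.replace("instructions", "rules"),
]


def co_evolve(rounds: int) -> Tuple[Guard, List[int]]:
    """Run the loop and return the guard plus the evasion count per round."""
    guard = Guard(ExperienceMemory())
    attack_pool = [SEED]
    evasions_per_round: List[int] = []

    for _ in range(rounds):
        # Attacker step: mutate every attack that worked last round.
        candidates = [mutate(a) for a in attack_pool for mutate in MUTATIONS]
        # Every candidate is malicious by construction, so any query the
        # guard lets through counts as a successful evasion.
        evaded = list(dict.fromkeys(
            c for c in candidates if not guard.is_malicious(c)
        ))
        evasions_per_round.append(len(evaded))

        # Defender step: record each evasion as new experience.
        for attack in evaded:
            guard.memory.add(attack.lower())
        # Attacker keeps only the strategies that succeeded, if any.
        if evaded:
            attack_pool = evaded

    return guard, evasions_per_round
```

Calling `co_evolve(4)` on this toy gives evasion counts of `[4, 3, 1, 0]`. The attacker runs out of new rewrites once the memory holds all eight variants, which shows the feedback structure only and says nothing about the paper's reported behavior.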
