Cordyceps: Covert Control Attacks on LLMs via Data Poisoning
Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM
arXiv:2605.26595v1 | Announce Type: cross
Abstract: Large language models (LLMs) are often fine-tuned on uncurated text datasets that adversaries can poison. Existing poisoning attacks primarily rely on fixed trigger phrases that defenses such as outlier detection, clean-data regularization, or online monitoring can neutralize. In this paper, we propose a data poisoning method that teaches an LLM an information hiding scheme reliably an