Dive into Ambiguity: A*-Inspired Multi-Agents Commonsense Obfuscation Attack on LLM Prompts


Details

Source: ArXiv CS.AI
Authors: Boxuan Wang, Zhuoyun Li, Xiaowei Huang, Yi Dong
Article type: NEWS
Language: en
Published: 2026-06-02

Abstract

arXiv:2606.01441v1. Announce type: new.

Large language models (LLMs) excel in reasoning and knowledge-intensive tasks but remain vulnerable to prompt-level adversarial attacks that preserve intent while triggering commonsense hallucinations. This vulnerability is urgent, as LLMs are rapidly being integrated into safety-critical domains where factual reliability is non-negotiable. Existing attack methods either lack efficiency or fail to capture the adaptive strategies of real-world adversaries. We propose an A*-inspired Factual Error Induction Framework that generates semantically aligned yet obfuscated prompts. At its core is a Hierarchical Rewrite Strategy guided by a dynamic semantic dispersion coefficient $\gamma$ that balances conservative edits early with aggressive obfuscations later, following a reverse simulated annealing schedule.
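The abstract does not give the formula behind the schedule for $\gamma$, so the following is only a minimal sketch of what a "reverse simulated annealing" schedule could look like: a coefficient that starts low (conservative edits) and rises toward a maximum (aggressive obfuscation) as the search proceeds. The function name `reverse_annealing_gamma`, the bounds `gamma_min` and `gamma_max`, the exponential shape, and the `rate` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import math


def reverse_annealing_gamma(step, total_steps, gamma_min=0.1, gamma_max=1.0, rate=5.0):
    """Hypothetical schedule: gamma grows from gamma_min to gamma_max over the run.

    Ordinary simulated annealing cools a temperature over time; here the
    coefficient is "heated" instead, so early steps stay conservative and
    later steps become more aggressive. The exponential form is an assumption.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if total_steps <= 1:
        return gamma_max
    # Fraction of the run completed, clamped to [0, 1].
    progress = min(max(step / (total_steps - 1), 0.0), 1.0)
    # Normalised exponential ramp: 0 at the start, 1 at the end,
    # staying low for most of the early steps.
    ramp = (math.exp(rate * progress) - 1.0) / (math.exp(rate) - 1.0)
    return gamma_min + (gamma_max - gamma_min) * ramp


if __name__ == "__main__":
    for step in range(10):
        print(step, round(reverse_annealing_gamma(step, 10), 3))
```

A larger `rate` keeps the coefficient near `gamma_min` for longer before the ramp steepens; a linear ramp would be an equally plausible reading of the abstract.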
