BAIT: Boundary-Guided Disclosure Escalation via Self-Conditioned Reasoning

Event: PRODUCT_LAUNCH
2026-05-27
Impact: MEDIUM

arXiv:2605.27110v1
Announce Type: cross

Abstract: In this work, we propose BAIT (Boundary-Aware Iterative Trap), a three-step jailbreak framework that approaches malicious goals through internal disclosure. BAIT first asks the model to identify the protection boundary, then requires it to refine that boundary, and finally requests a detailed example. By expanding each step upon the model's previous responses, BAIT turns