From Leaky Thoughts to Private Reasoning: Controlling What LRMs Say to Themselves

Event: PRODUCT_LAUNCH | Date: 2026-06-01 | Impact: MEDIUM

arXiv:2602.24210v2 (Announce Type: replace)

Abstract: Large reasoning models (LRMs) produce reasoning traces (RTs) that often contain sensitive information. These leaky thoughts are difficult to control and frequently violate explicit privacy directives. Because RTs can be exposed through prompt injection attacks, this becomes a direct privacy risk to the user. We approach this as a controllability problem: since pr