Steering at the Source: Style Modulation Heads for Robust Persona Control
Event: PRODUCT_LAUNCH | Date: 2026-05-29 | Impact: MEDIUM
arXiv:2603.13249v2. Announce Type: replace.
Abstract: Activation steering offers a computationally efficient mechanism for controlling Large Language Models (LLMs) without fine-tuning. Although it controls target traits (e.g., persona) effectively, coherency degradation remains a major obstacle to safety and practical deployment. We hypothesize that this degradation stems from intervening on the residual stream, which indiscri