Steering Language Models Before They Speak: Logit-Level Interventions
Event: PRODUCT_LAUNCH | Date: 2026-05-29 | Impact: MEDIUM
arXiv:2601.10960v2 Announce Type: replace

Abstract: Controllable generation requires language models to realize output characteristics such as reading level, politeness, and toxicity. Existing steering methods are often indirect, require access to internal activations, or depend on auxiliary trained models. We propose SWAI, a training-free inference-time method that addresses these limitations by steering directly in logit sp
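
To make the idea of a logit-level intervention concrete, here is a minimal, generic sketch in Python. It is not SWAI's actual procedure, which the truncated abstract does not describe. It only illustrates the broad family of techniques: adding a per-token bias to the next-token logits before the softmax, with no training and no access to internal activations. The toy vocabulary, the `politeness_bias` vector, and the `strength` parameter are all hypothetical and chosen purely for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def steer_logits(logits, bias, strength=1.0):
    """Add a scaled per-token bias to the logits (hypothetical helper).

    This is a generic logit-bias intervention, assumed for illustration;
    it is not taken from the paper.
    """
    if len(logits) != len(bias):
        raise ValueError("logits and bias must have the same length")
    return [l + strength * b for l, b in zip(logits, bias)]

# Toy next-token distribution over a four-word vocabulary (made-up numbers).
vocab = ["please", "now", "kindly", "hurry"]
logits = [1.0, 2.0, 0.5, 1.5]

# Hypothetical bias: positive for polite tokens, negative for brusque ones.
politeness_bias = [1.0, -1.0, 1.0, -1.0]

base_probs = softmax(logits)
steered_probs = softmax(steer_logits(logits, politeness_bias, strength=2.0))

base_top = vocab[base_probs.index(max(base_probs))]
steered_top = vocab[steered_probs.index(max(steered_probs))]
print(base_top, "->", steered_top)
```

In this toy setting the intervention shifts probability mass toward the tokens favoured by the bias vector, and `strength` controls how strongly. How a real method would derive such a bias for attributes like reading level or toxicity is exactly what the sketch leaves out.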
Source: ArXiv CS.CL, 2026-05-29