Steering Language Models Before They Speak: Logit-Level Interventions

arXiv cs.CL · 2026-05-29 · Authors: Hyeseon An, Shinwoo Park, Hyundong Jin, Yo-Sub Han

Related Technologies