CRaFT: Circuit-Guided Refusal Feature Selection via Cross-Layer Transcoders

ArXiv CS.AI · 2026-05-28 · Authors: Su-Hyeon Kim, Hyundong Jin, Yejin Lee, Yo-Sub Han
