Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (Article)
ArXiv CS.AI · 2026-05-29 · NEWS · en
Authors: Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey, Trenton Bricken, Brian Chen, Adam Pearce, Craig Citro, Emmanuel Ameisen, Andy Jones, Hoagy Cunningham, Nicholas L Turner, Callum McDougall, Monte MacDiarmid, Alex Tamkin, Esin Durmus, Tristan Hume, Francesco Mosconi, C. Daniel Freeman, Theodore R. Sumers, Edward Rees, Joshua Batson, Adam Jermyn, Shan Carter, Chris Olah, Tom Henighan