Continuous-Depth Field Theory for Transformer Patching and Mechanistic Interpretability
Event: PRODUCT_LAUNCH | Date: 2026-05-26 | Impact: MEDIUM
arXiv:2605.25225v1 Announce Type: cross
Abstract: Mechanistic interpretability often uses activation patching, causal tracing, path patching, and steering directions to reveal behaviorally meaningful directions in Transformer activation space. This paper develops a field-theoretic framework for organizing and predicting such interventions. Treating the residual stream as a depth-token field, we formulate pat