Semantic Optimal Transport for Sparse Autoencoder Feature Matching and Circuit Compression

Event

PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM

arXiv:2605.28567v1 | Announce Type: cross

Abstract: Sparse autoencoders (SAEs) have become a central tool for interpreting language models. However, two key SAE analyses remain difficult to scale: (1) matching semantically similar features across multiple layers and (2) compressing large feature circuits into interpretable supernodes. Although these have been treated as separate problems, we show that