Semantic Optimal Transport for Sparse Autoencoder Feature Matching and Circuit Compression (Event)

PRODUCT_LAUNCH · 2026-05-28 · Impact: MEDIUM

arXiv:2605.28567v1. Announce Type: cross. Abstract: Sparse autoencoders (SAEs) have become a central tool for interpreting language models. However, two key SAE analyses remain difficult to scale: (1) matching semantically similar features across multiple layers and (2) compressing large feature circuits into interpretable supernodes. Although these have been treated as separate problems, we show that
