Signed Compression Progress on a Sealed Audit is Goodhart-Resistant 事件

PRODUCT_LAUNCH2026-06-11影响: MEDIUM

Signed Compression Progress on a Sealed Audit is Goodhart-Resistant arXiv:2606.11417v1 Announce Type: cross Abstract: Compression progress is a long-standing proposal for intrinsic motivation: reward an agent when its world model becomes better at predicting or compressing experience. The folk claim is that this reward is "credible" because it is paid only for learning. We make this precise and prove it. If intrinsic reward is the signed decrease of a fixed sealed-audit loss, r_t = E(theta_{t-1