Converted, Not Equivalent: Benchmarking Codebase Conversion via Observational Equivalence
Event: PRODUCT_LAUNCH | 2026-05-29 | Impact: MEDIUM
arXiv:2605.29054v1 Announce Type: cross
Abstract: Coding agents increasingly act as codebase-scale collaborators that can assist with codebase conversion, but this progress has exposed a critical weakness: agents often over-trust their own local validation routines and declare success on artifacts that satisfy surface checks while violating the semantic contracts users actually care about. This problem is