Latent Reasoning in TRMs is Secretly a Policy Improvement Operator (Event)
PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM
arXiv:2511.16886v5 (Announce Type: replace)
Abstract: Recently, small models with latent recursion have obtained promising results on complex reasoning tasks. These results are typically explained by the theory that such recursion increases a network's depth, allowing it to compactly emulate the capacity of larger models. However, the performance of recursively added layers remains behind the capabilities of one-pass models with th
ArXiv CS.CL | 2026-06-02