Improving mathematical reasoning with process supervision (Event)

Name: Improving mathematical reasoning with process supervision
Start: 2023-05-31

BREAKTHROUGH | 2023-05-31 | Impact: HIGH

We've trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”). In addition to boosting performance relative to outcome supervision, process supervision also has an important alignment benefit: it directly trains the model to produce a chain-of-thought that is endorsed by

Artificial Intelligence
