Reasoning, Code, or Both? How Large Language Models Handle Variations in Math Questions (Event)

Name: Reasoning, Code, or Both? How Large Language Models Handle Variations in Math Questions
Start: 2026-05-27

Type: PRODUCT_LAUNCH
Date: 2026-05-27
Impact: MEDIUM

arXiv:2605.26414v1. Announce Type: cross.

Abstract: Large Language Models (LLMs) achieve impressive accuracy on mathematical reasoning benchmarks, yet their performance drops when problems are modified with simple changes like different names or numbers. Code execution methods, which let models generate and run Python code instead of reasoning in natural language, have been proposed as a solution, but their ef

Category: Artificial Intelligence
