LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks 文章

ArXiv CS.AI2026-06-03NEWSen作者: Po-Nien Kung, Linfeng Song, Dawsen Hwang, Jinsung Yoon, Chun-Liang Li, Simone Severini, Mirek Ol\v{s}\'ak, Edward Lockhart, Quoc V Le, Burak Gokturk, Thang Luong, Tomas Pfister, Nanyun Peng

查看原文 →

关系图谱

摘要

arXiv:2606.03303v1 Announce Type: new Abstract: Large Language Models (LLMs) exhibit strong informal mathematical reasoning but struggle to generate mechanically verifiable proofs in formal languages like Lean. We present LEAP, an agentic framework that enables general-purpose foundation models to achieve state-of-the-art performance on automated formal theorem proving. LEAP leverages foundation model capabilities, such as informal reasoning, instruction following, and iterative self-refinement. By decomposing complex problems into smaller units, the system bridges formal proof construction with informal blueprints through continuous interaction with the Lean compiler. To provide a rigorous evaluation beyond increasingly saturated benchmarks, we introduce Lean-IMO-Bench, a benchmark of IMO-style problems formalized in Lean, with short statements yet highly non-routine and multi-step proofs across a wide range of difficulty levels.

LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks 文章

摘要

相关事件查看全部 (2)

相关公司

相关人物

相关产品查看全部 (4)

相关技术查看全部 (2)