A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners

arXiv cs.AI | 2026-06-03 | Authors: Patrick Emami, Nan Qiang, Peter Graf

Abstract

arXiv:2606.03685v1 (announce type: cross)

Supervised fine-tuning (SFT) improves end-to-end classical planning in large language models (LLMs), but do these models also learn to represent and reason about the planning problems they are solving? Due to the relative complexity of classical planning problems and the challenge that end-to-end plan generation poses for LLMs, it has been difficult to explore this question. In our work, we devise and perform a series of interpretability experiments that holistically interrogate world model recovery by examining both internal representations and generative capabilities of fine-tuned LLMs. We find that: (a) supervised fine-tuning on valid action sequences enables LLMs to linearly encode action validity and some state predicates; (b) models that struggle to use output probabilities for classifying action validity may still learn internal representations that separate valid from invalid actions.
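To make the phrase "linearly encode action validity" concrete, here is a minimal sketch of a linear probe. It is not the authors' code or experimental setup: the activations are synthetic stand-ins for LLM hidden states, and every name in it (`make_synthetic_activations`, `train_linear_probe`, `probe_accuracy`) is hypothetical. The idea it illustrates is that a property counts as linearly encoded if a logistic-regression classifier trained on frozen representations predicts it well on held-out examples.

```python
import math
import random


def make_synthetic_activations(n=600, dim=8, seed=0):
    """Hypothetical stand-in for hidden states: the label (1 = valid action,
    0 = invalid) shifts each vector along one fixed unit direction."""
    rng = random.Random(seed)
    direction = [rng.gauss(0, 1) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in direction))
    direction = [d / norm for d in direction]
    acts, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        sign = 2 * y - 1  # map {0, 1} to {-1, +1}
        acts.append([rng.gauss(0, 1) + 3.0 * sign * d for d in direction])
        labels.append(y)
    return acts, labels


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(x, y, lr=0.2, steps=200):
    """Fit logistic regression with full-batch gradient descent."""
    dim, n = len(x[0]), len(x)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(x, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(x, y, w, b):
    """Fraction of examples whose thresholded probe output matches the label."""
    hits = 0
    for xi, yi in zip(x, y):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
        hits += int(pred == yi)
    return hits / len(y)


acts, labels = make_synthetic_activations()
split = 450  # first 450 examples train the probe, the rest are held out
w, b = train_linear_probe(acts[:split], labels[:split])
test_acc = probe_accuracy(acts[split:], labels[split:], w, b)
print(f"held-out probe accuracy: {test_acc:.3f}")
```

In a real study the synthetic vectors would be replaced by activations extracted from a chosen layer of the fine-tuned model, and the probe's held-out accuracy would be compared against a baseline such as a probe trained on shuffled labels.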
