A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners

arXiv cs.AI | 2026-06-03 | Authors: Patrick Emami, Nan Qiang, Peter Graf

Abstract

arXiv:2606.03685v1 (cross-listed)

Supervised fine-tuning (SFT) improves end-to-end classical planning in large language models (LLMs), but do these models also learn to represent and reason about the planning problems they are solving? Due to the relative complexity of classical planning problems and the challenge that end-to-end plan generation poses for LLMs, it has been difficult to explore this question. In our work, we devise and perform a series of interpretability experiments that holistically interrogate world model recovery by examining both internal representations and generative capabilities of fine-tuned LLMs. We find that: (a) supervised fine-tuning on valid action sequences enables LLMs to linearly encode action validity and some state predicates; (b) models that struggle to use output probabilities for classifying action validity may still learn internal representations that separate valid from invalid actions.
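To make "linearly encode action validity" concrete, here is a minimal sketch of a linear probe: a logistic-regression classifier trained to read a binary label off fixed activation vectors. It is not the authors' experimental setup. The activations are synthetic stand-ins for hidden states (drawn so that valid and invalid actions differ along one direction), and every name below (`make_synthetic_activations`, `train_linear_probe`, `probe_accuracy`) is hypothetical and chosen for illustration only.

```python
import math
import random


def make_synthetic_activations(n, dim, seed=0):
    """Assumption: stand-in for LLM hidden states. Valid actions (label 1)
    are shifted along a fixed direction, invalid ones (label 0) against it."""
    rng = random.Random(seed)
    direction = [1.0 if i % 2 == 0 else -1.0 for i in range(dim)]
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        sign = 1.0 if label == 1 else -1.0
        x = [rng.gauss(0.0, 1.0) + sign * d for d in direction]
        data.append((x, label))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(data, dim, epochs=50, lr=0.1):
    """Fit weights w and bias b by plain SGD on the logistic loss."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the loss w.r.t. the logit
            for i in range(dim):
                w[i] -= lr * g * x[i]
            b -= lr * g
    return w, b


def probe_accuracy(w, b, data):
    """Fraction of examples whose label the linear readout predicts."""
    correct = 0
    for x, y in data:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(data)


train = make_synthetic_activations(200, 8, seed=0)
held_out = make_synthetic_activations(200, 8, seed=1)
w, b = train_linear_probe(train, 8)
acc = probe_accuracy(w, b, held_out)
print(f"held-out probe accuracy: {acc:.2f}")
```

High held-out accuracy from such a probe indicates that the label is linearly decodable from the representation. In an actual study the inputs would be activations extracted from the fine-tuned model rather than synthetic vectors, and the probe would be evaluated separately from the model's output probabilities, which is the distinction finding (b) draws.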
