Decomposing how prompting steers behavior

arXiv cs.AI, 2026-06-03. Authors: Fan L. Cheng, Nikolaus Kriegeskorte

Abstract

arXiv:2606.03093v1 (announce type: new)

Prompting steers large language models (LLMs) and vision-language models (VLMs) without weight updates, but it remains unclear how instruction changes reshape internal representations to produce behavior. We introduce a nested geometric decomposition framework that treats prompting as a transformation of the representational geometry of the content following the prompt. For each prompt pair, we align representations of the same stimuli under two prompts using increasingly expressive stimulus-invariant maps: translation, rigid transformation with uniform scaling, sequential axis scaling, affine transformation, and nonlinear transformation. We then causally test each map by replacing a single layer's prompt-A hidden state for held-out stimuli with its mapped counterpart and measuring recovery of prompt-B representational geometry and behavior.
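To make the idea of nested, stimulus-invariant maps concrete, the following is a minimal sketch, not the authors' implementation. It fits three of the five map families named in the abstract (translation, rotation/reflection with uniform scaling, and affine) on synthetic "hidden states" and scores each on held-out stimuli. The function names, the synthetic data, the use of orthogonal Procrustes for the rigid step, and mean squared error as a stand-in for "recovery of prompt-B geometry" are all assumptions made for illustration. Sequential axis scaling and the nonlinear map are omitted because the abstract does not specify them.

```python
import numpy as np

def fit_translation(Xa, Xb):
    """Map A -> B with a single shared offset (hypothetical helper)."""
    t = Xb.mean(axis=0) - Xa.mean(axis=0)
    return lambda X: X + t

def fit_similarity(Xa, Xb):
    """Orthogonal Procrustes plus uniform scale and translation.

    Assumption: the orthogonal factor may include a reflection; a strictly
    rigid map would additionally constrain det(R) = +1.
    """
    mu_a, mu_b = Xa.mean(axis=0), Xb.mean(axis=0)
    A, B = Xa - mu_a, Xb - mu_b
    U, S, Vt = np.linalg.svd(A.T @ B)
    R = U @ Vt                      # best orthogonal alignment of A onto B
    s = S.sum() / (A ** 2).sum()    # least-squares optimal uniform scale
    return lambda X: s * (X - mu_a) @ R + mu_b

def fit_affine(Xa, Xb):
    """Unconstrained linear map plus bias, fit by ordinary least squares."""
    A1 = np.hstack([Xa, np.ones((len(Xa), 1))])
    W, *_ = np.linalg.lstsq(A1, Xb, rcond=None)
    return lambda X: np.hstack([X, np.ones((len(X), 1))]) @ W

# Synthetic stand-in for one layer's hidden states: prompt B is an exact
# rotated, scaled, and shifted copy of prompt A (an illustrative assumption).
rng = np.random.default_rng(0)
n, d = 200, 8
Xa = rng.normal(size=(n, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random orthogonal matrix
Xb = 1.7 * Xa @ Q + rng.normal(size=d)

# Fit on training stimuli, evaluate on held-out stimuli.
Xa_tr, Xa_te, Xb_tr, Xb_te = Xa[:150], Xa[150:], Xb[:150], Xb[150:]
fitters = {
    "translation": fit_translation,
    "similarity": fit_similarity,
    "affine": fit_affine,
}
errors = {}
for name, fit in fitters.items():
    f = fit(Xa_tr, Xb_tr)
    errors[name] = float(np.mean((f(Xa_te) - Xb_te) ** 2))

for name, err in errors.items():
    print(f"{name:12s} held-out MSE = {err:.3e}")
```

In this toy setup the translation map cannot account for the rotation and scaling, while the similarity and affine maps both can. Comparing held-out error across the nested families is what indicates the simplest map that suffices. In the setting the abstract describes, the mapped states would instead be patched into a single layer of the model and scored on recovered geometry and behavior.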
