Failure of contextual invariance in large language models (Event)

PRODUCT_LAUNCH · 2026-06-02 · Impact: MEDIUM

arXiv:2603.23485v2 · Announce Type: replace

Abstract: Standard evaluation practices assume that large language model (LLM) outputs are stable when prompts are embedded in contextually equivalent discourses. Here, we test this assumption in the setting of gender inference. Using a controlled pronoun selection task, we introduce minimal, theoretically uninformative discourse context and find that this induces large, systematic shifts in mode
