On the Limits of LLM Adaptability: Impact of Model-Internalized Priors on Annotation Task Performance

Details

Source: ArXiv CS.CL
Authors: Etienne Casanova, Rafal Kocielnik, R. Michael Alvarez
Article type: NEWS
Language: en
Published: 2026-06-02

Abstract

arXiv:2606.00467v1. Announce type: new.

Large Language Models (LLMs) are increasingly used for zero-shot annotation and LLM-as-a-judge tasks, yet their reliability hinges on how model-internalized priors interact with user-provided instructions. We investigate three dimensions of this interaction: (1) how an LLM's familiarity with data and task definitions affects performance, (2) the extent to which additional information in prompts can correct zero-shot errors ("decision stickiness"), and (3) model susceptibility to misaligned task definitions. Through experiments on toxicity detection across diverse datasets (spanning social media, gaming, news, and forums) using both dense and mixture-of-experts models, we find that nearly two-thirds of zero-shot errors are resistant to correction, with an overall rescue rate (fraction of initial errors corrected by prompting) of only 34.8%. High-confidence errors prove especially resistant to correction.
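The rescue rate defined in the abstract (the fraction of initial zero-shot errors that a richer prompt corrects) can be illustrated with a minimal sketch. This is not the authors' code: the function name `rescue_rate` and the toy label lists are assumptions made up for illustration, and the abstract does not specify how the authors handle edge cases such as a run with no zero-shot errors.

```python
import math


def rescue_rate(gold, zero_shot, prompted):
    """Fraction of zero-shot errors that the augmented prompt corrects.

    gold, zero_shot, prompted: equal-length sequences of labels.
    Returns NaN when there are no zero-shot errors to rescue
    (an assumption; the abstract does not cover this case).
    """
    if not (len(gold) == len(zero_shot) == len(prompted)):
        raise ValueError("label sequences must have equal length")

    # Indices where the zero-shot prediction disagrees with the gold label.
    error_idx = [i for i, (g, z) in enumerate(zip(gold, zero_shot)) if g != z]
    if not error_idx:
        return math.nan

    # An error counts as rescued if the prompted prediction matches gold.
    rescued = sum(1 for i in error_idx if prompted[i] == gold[i])
    return rescued / len(error_idx)


# Hypothetical toy labels (1 = toxic, 0 = non-toxic), invented for illustration.
gold = [1, 0, 1, 1, 0, 0]
zero_shot = [0, 0, 0, 1, 1, 0]  # three zero-shot errors: indices 0, 2, 4
prompted = [1, 0, 0, 1, 1, 0]   # only index 0 is corrected

print(f"rescue rate: {rescue_rate(gold, zero_shot, prompted):.3f}")
```

Under this definition, the reported overall rescue rate of 34.8% leaves 65.2% of zero-shot errors uncorrected, which matches the abstract's "nearly two-thirds".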
