Improving Labeling Consistency with Detailed Constitutional Definitions and AI-Driven Evaluation

arXiv cs.CL, 2026-05-26. Authors: Konstantin Berlin, Adam Swanda

Abstract

arXiv:2605.24247v1. Many automated labeling pipelines classify inputs into categories defined by a written specification, content moderation being a prominent use case. Simple category definitions are not detailed enough for labelers to produce the accurate, consistent golden labels these pipelines require. One solution is to write a prescriptive definition that settles enough real boundary cases that labelers cannot disagree with the written interpretation. In practice, definitions at that level of detail exceed what a human annotator can hold in working memory, so annotators fall back on intuition and the labels drift from the written rules, regressing on accuracy and consistency.