When and How Human Curation Backfires: Preference Alignment under Multi-Model Self-Consuming Loop

ArXiv CS.AI | 2026-05-29 | NEWS | en | Authors: Yang Zhang, Xiukun Wei, Xueru Zhang

Details

Source site: ArXiv CS.AI
Authors: Yang Zhang, Xiukun Wei, Xueru Zhang
Article type: NEWS
Language: en
Publication date: 2026-05-29

Abstract

arXiv:2605.29267v1 (announce type: new)

Foundation models are increasingly trained on synthetic data generated by prior model iterations rather than exclusively on real data. This self-consuming training paradigm can lead to model collapse, divergence, or bias amplification. Recent work (Ferbach et al., 2024) shows that incorporating human curation into the loop can steer a self-consuming model toward human-aligned behavior, but these analyses focus on a single, isolated model that consumes only its own outputs. In practice, however, models often interact and train on input-output pairs produced by other models. This paper studies self-consuming training in the multi-model regime. We first formalize a framework for interacting self-consuming models and characterize when the resulting dynamical system converges to a stable point.
