Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics

Event: PRODUCT_LAUNCH | 2026-06-05 | Impact: MEDIUM

arXiv:2606.05168v1 Announce Type: new

Abstract: Training on synthetic data causes model collapse, but existing analyses treat this as single-chain degradation. In reality, the AI ecosystem involves cross-contamination: models ingest synthetic data from other models, produce new synthetic text, and contaminate shared corpora. We propose a bilayer coupled SIR/SIRS framework -- a phenomenological mean-fi