Event: Combating Data Laundering in LLM Training
PRODUCT_LAUNCH | 2026-05-29 | Impact: MEDIUM
arXiv:2604.01904v2. Announce Type: replace-cross. Abstract: Data rights owners can detect unauthorized data use in large language model (LLM) training by querying the model with proprietary samples. Often, superior performance (e.g., higher confidence or lower loss) on a sample relative to data the model was not trained on implies that the sample was part of the training corpus, as LLMs tend to perform better on data they have seen during training. However, this detection becomes fragile unde
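The loss-comparison idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the function names, the `margin` parameter, and the mean-loss threshold are all hypothetical choices made for illustration, and the per-token log-probabilities are assumed to have been obtained already by querying the model.

```python
from statistics import mean


def sequence_loss(token_logprobs):
    """Average negative log-likelihood of one sample (lower = model fits it better)."""
    return -mean(token_logprobs)


def flag_suspected_members(candidate_logprobs, reference_logprobs, margin=0.0):
    """Flag candidates whose loss is below the mean loss on reference data.

    candidate_logprobs: per-token log-probs for each proprietary sample.
    reference_logprobs: per-token log-probs for samples assumed NOT to be
        in the training corpus (hypothetical baseline set).
    margin: hypothetical slack; a larger value makes the test stricter.
    """
    reference_losses = [sequence_loss(lp) for lp in reference_logprobs]
    threshold = mean(reference_losses) - margin
    return [sequence_loss(lp) < threshold for lp in candidate_logprobs]


# Toy numbers, invented purely to exercise the function.
candidates = [[-0.1, -0.2], [-2.0, -3.0]]
references = [[-1.5, -2.5], [-2.0, -2.0]]
flags = flag_suspected_members(candidates, references)
```

A plain threshold like this is the kind of signal the abstract describes as fragile, so it should be read as a baseline illustration only.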
ArXiv CS.AI, 2026-05-29