NumLeak: Public Numeric Benchmarks as Latent Labels in Foundation Models (Event)
SHUTDOWN | 2026-06-01 | Impact: LOW
arXiv:2605.30393v1 Announce Type: cross
Abstract: Public numeric benchmarks appear in pretraining, so an evaluation that conditions on a date may be measuring memorized recall rather than out-of-sample skill. We introduce NumLeak, a measurement framework that combines API-boundary probes on production models with a white-box controlled validation on an open causal LM. Top-tier frontier LLMs recall the Fama-French market ex
Source: ArXiv CS.AI, 2026-06-01