HARP: Efficient Data Selection for Finetuning Large Language Models [Event]

PRODUCT_LAUNCH 2026-06-09 Impact: MEDIUM

arXiv:2606.07690v1 Announce Type: cross

Abstract: Finetuning data selection requires balancing two competing goals: selecting examples that improve the downstream objective, and doing so without repeatedly finetuning models. Train-free selectors are scalable but rely on proxies such as embedding similarity or clustering, which may not match the target objective. Train-based selectors better reflect downstream utility through gr