Beware of the Batch Size: Hyperparameter Bias in Evaluating LoRA

ArXiv CS.AI | 2026-06-02 | Authors: Sangyoon Lee, Jaeho Lee

Abstract

arXiv:2602.09492v2

Low-rank adaptation (LoRA) is a standard approach for fine-tuning large language models, yet its many variants report conflicting empirical gains, often on the same benchmarks. We show that these contradictions arise from a single overlooked factor: the batch size. When properly tuned, vanilla LoRA often matches the performance of more complex variants. We further propose a proxy-based, cost-efficient strategy for batch size tuning, revealing the impact of rank, dataset size, and model capacity on the optimal batch size. Our findings elevate batch size from a minor implementation detail to a first-order design parameter, reconciling prior inconsistencies and enabling more reliable evaluations of LoRA variants.
