Multiple Choice Learning of Low-Rank Adapters for Language Modeling 文章

ArXiv CS.CL2026-06-03NEWSen作者: Victor Letzelter, Hugo Malard, Mathieu Fontaine, Ga\"el Richard, Slim Essid, Andrei Bursuc, Patrick P\'erez

查看原文 →

关系图谱

摘要

arXiv:2507.10419v3 Announce Type: replace-cross Abstract: We propose LoRA-MCL, a training scheme that extends next-token prediction in language models with a method designed to decode diverse, plausible sentence continuations at inference time. Traditional language modeling is an intrinsically ill-posed problem: given a context, multiple futures may be equally plausible. Our approach leverages Multiple Choice Learning (MCL) and the winner-takes-all loss to efficiently handle ambiguity through Low-Rank Adaptation. We provide a theoretical interpretation of applying MCL to language modeling, assuming the data is generated from a mixture of distributions. We illustrate the proposed approach using mixtures of Markov chains. We then demonstrate with experiments on audio and visual captioning, as well as machine translation, that our method achieves high diversity and relevance in generated outputs. We release the code for applying LoRA-MCL to a wide range of language models.

Multiple Choice Learning of Low-Rank Adapters for Language Modeling 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品

相关技术查看全部 (4)