Regression Language Models for Code 文章

ArXiv CS.CL2026-05-28NEWSen作者: Yash Akhauri, Xingyou Song, Arissa Wongpanich, Bryan Lewandowski, Mohamed S. Abdelfattah

摘要

arXiv:2509.26476v2 Announce Type: replace Abstract: We study code-to-metric regression: predicting numeric outcomes of code executions, a challenging task due to the open-ended nature of programming languages. While prior methods have resorted to heavy and domain-specific feature engineering, we show that a single unified Regression Language Model (RLM) using a frozen LLM encoder can simultaneously predict directly from text, (i) the memory footprint of code across multiple high-level languages such as Python and C++, (ii) the latency of Triton GPU kernels, and (iii) the accuracy and speed of trained neural networks represented in ONNX. In particular, a relatively small 300M parameter RLM based on T5Gemma, obtains $>$0.9 Spearman-rank on competitive programming submissions from APPS, and a single unified model achieves $>$0.5 average Spearman-rank across 17 separate languages from CodeNet. Furthermore, the RLM can obtain the highest average Kendall-Tau of 0.

Regression Language Models for Code 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品查看全部 (9)

相关技术查看全部 (8)