BlueFin: Benchmarking LLM Agents on Financial Spreadsheets
Event: PRODUCT_LAUNCH, 2026-06-01
Impact: MEDIUM
arXiv:2605.30907v1
Announce Type: cross
Abstract: We present BlueFin, a benchmark that tasks large language model (LLM) agents with synthesis, manipulation, and comprehension tasks over spreadsheet workbooks in the professional finance domain. Though estimates of the global population of paying users of spreadsheet software range in the hundreds of millions -- an order of magnitude more than the estimated global population of profession
ArXiv CS.CL, 2026-06-01