TS-Skill: A Benchmark for Evaluating Analytical Skills in Time-Series Question Answering

Event: SHUTDOWN | Date: 2026-05-26 | Impact: LOW

arXiv:2605.24703v1 Announce Type: new

Abstract: Large language models (LLMs) and time-series language models (TSLMs) are increasingly applied to time-series question answering (TSQA). Unlike text-only QA, TSQA requires models to ground answers in temporal signals whose patterns may occur at different scales, specific time locations, or across separated intervals. However, existing benchmarks are typically o