Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems
Event: PRODUCT_LAUNCH · 2026-06-09 · Impact: MEDIUM
arXiv:2606.08034v1 · Announce Type: new
Abstract: Symbolic benchmarks have emerged as a key approach to assess model robustness under minor modifications to STEM-related questions. However, existing symbolic benchmarks mostly remain limited to mathematical reasoning, lack visual grounding, and are predominantly in English. In this work, we introduce Sci-Rho (Science Rhobustness), a dynamic benchmark for visually-ground
Related coverage
Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems
ArXiv CS.CV · 2026-06-09