Reading or Guessing? Visual Grounding Failures of Vision-Language Models for OCR in Ancient Greek Editions
Event: PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM
arXiv:2605.27750v1 | Announce Type: cross
Abstract: Recent work has shown that Vision-Language Models (VLMs) used for optical character recognition (OCR) can generate plausible but visually unsupported text, suggesting reliance on language priors. Comparing open-weight VLMs with traditional OCR baselines on low-resource Ancient Greek critical editions, we show that VLM errors often remain fl