PubTables-v2: A new large-scale dataset for full-page and multi-page table extraction · Event

PRODUCT_LAUNCH · 2026-06-03 · Impact: MEDIUM

arXiv:2512.10888v3 · Announce Type: replace

Abstract: Table extraction (TE) is a key challenge in document understanding. Traditional approaches detect tables first, then recognize their structure. Recently, interest has surged in developing methods, such as vision-language models (VLMs), to extract tables directly in their full page or document context. However, a lack of annotated data has made progress diffic

PubTables-v2: A new large-scale dataset for full-page and multi-page table extraction · Related technologies