Building a test collection for complex document information processing (Paper)

2006 · Citations: 284

Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Image Processing and 3D Reconstruction

Abstract

Research and development of information access technology for scanned paper documents has been hampered by the lack of public test collections of realistic scope and complexity. As part of a project to create a prototype system for search and mining of masses of document images, we are assembling a 1.5 terabyte dataset to support evaluation of both end-to-end complex document information processing (CDIP) tasks (e.g., text retrieval and data mining) and component technologies such as optical character recognition (OCR), document structure analysis, signature matching, and authorship attribution.

Authors (4 of 6 listed)

Jefferson Heard

D. Grossman

Ophir Frieder

Shlomo Argamon
