Query-based sampling of text databases (paper)

2001, ACM Transactions on Information Systems, cited by 370

Enterprise Software; Web Data Mining and Analysis; Advanced Database Systems and Queries; Data Management and Algorithms

Abstract

The proliferation of searchable text databases on corporate networks and the Internet causes a database selection problem for many people. Algorithms such as gGLOSS and CORI can automatically select which text databases to search for a given information need, but only if given a set of resource descriptions that accurately represent the contents of each database. The existing techniques for acquiring resource descriptions have significant limitations when used in wide-area networks controlled by many parties. This paper presents query-based sampling, a new technique for acquiring accurate resource descriptions. Query-based sampling does not require the cooperation of resource providers, nor does it require that resource providers use a particular search engine or representation technique. An extensive set of experimental results demonstrates that accurate resource descriptions are created, that computation and communication costs are reasonable, and that the resource descriptions do in fact enable accurate automatic database selection.
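
The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of how a query-based sampler might build a resource description using nothing but a database's ordinary search interface. Everything named here is an illustrative assumption rather than the paper's specification: the `search` callable, the `query_based_sample` and `make_toy_search` helpers, the parameter names and their default values, the whitespace tokenizer, and the rule of drawing the next query term at random from the terms learned so far.

```python
import random
from collections import Counter


def query_based_sample(search, seed_term, docs_per_query=4, max_docs=300,
                       max_queries=1000, rng=None):
    """Sketch: learn a term -> document-frequency description by querying.

    `search(term, n)` is a hypothetical callable that returns up to `n`
    (doc_id, text) pairs for a one-term query. All defaults are arbitrary.
    """
    rng = rng or random.Random(0)
    description = Counter()  # term -> number of sampled documents containing it
    seen = set()             # ids of documents already sampled
    used = set()             # query terms already issued
    term = seed_term
    for _ in range(max_queries):
        if len(seen) >= max_docs:
            break
        used.add(term)
        for doc_id, text in search(term, docs_per_query):
            if doc_id in seen:
                continue
            seen.add(doc_id)
            # Count each term once per document (document frequency).
            description.update(set(text.lower().split()))
        # Assumed policy: pick the next query term from the learned vocabulary.
        candidates = sorted(t for t in description if t not in used)
        if not candidates:
            break
        term = rng.choice(candidates)
    return description, seen


def make_toy_search(docs):
    """Hypothetical in-memory stand-in for a remote search engine."""
    def search(term, n):
        hits = [(i, d) for i, d in enumerate(docs) if term in d.lower().split()]
        return hits[:n]
    return search
```

The sampler touches the database only through `search`, which mirrors the abstract's point that no cooperation from the resource provider and no particular search engine is required. A real deployment would replace `make_toy_search` with a client for the remote database's query interface.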

Authors

Margaret E. Connell

Jamie Callan
