Quickly generating billion-record synthetic databases (paper)

Year: 1994. Citations: 351

Topics: Advanced Database Systems and Queries; Data Management and Algorithms; Algorithms and Data Compression

Category: Enterprise software

Abstract

Evaluating database system performance often requires generating synthetic databases—ones having certain statistical properties but filled with dummy information. When evaluating different database designs, it is often necessary to generate several databases and evaluate each design. As database sizes grow to terabytes, generation often takes longer than evaluation. This paper presents several database generation techniques. In particular it discusses: (1) Parallelism to get generation speedup and scaleup. (2) Congruential generators to get dense unique uniform distributions. (3) Special-case discrete logarithms to generate indices concurrent to the base table generation. (4) Modification of (2) to get exponential, normal, and self-similar distributions.
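Points (1) and (2) of the abstract can be illustrated with a small sketch. It is not the paper's code: the function names, the tiny prime 11, and the generator 2 are assumptions chosen so the output is easy to check by hand. The idea shown is a multiplicative congruential generator modulo a prime larger than N, which visits every value in 1..N exactly once when values above N are discarded, and which lets each parallel worker jump straight to its own slice of the sequence by modular exponentiation.

```python
def dense_unique(n, prime, gen):
    """Yield each integer in 1..n exactly once, in scrambled order.

    Assumption: `prime` is a prime greater than n and `gen` is a
    primitive root modulo `prime`, so gen^k mod prime cycles through
    every value in 1..prime-1 before repeating.
    """
    x = 1
    for _ in range(prime - 1):
        x = (x * gen) % prime
        if x <= n:  # discard the few values that fall outside 1..n
            yield x


def dense_unique_chunk(n, prime, gen, start, count):
    """Yield the slice of the same sequence at cycle positions
    start+1 .. start+count, without computing the earlier positions.

    A worker jumps ahead with three-argument pow(), which performs
    modular exponentiation, so workers need no coordination.
    """
    x = pow(gen, start, prime)
    for _ in range(count):
        x = (x * gen) % prime
        if x <= n:
            yield x


if __name__ == "__main__":
    # Illustrative parameters only: 2 is a primitive root modulo 11.
    print(list(dense_unique(8, 11, 2)))
    # Two hypothetical workers, each covering half of the 10-step cycle.
    print(list(dense_unique_chunk(8, 11, 2, 0, 5)))
    print(list(dense_unique_chunk(8, 11, 2, 5, 5)))
```

Concatenating the two worker slices reproduces the single-process sequence, which is what makes the partitioned generation deterministic. For real table sizes the prime would be chosen just above the record count so that few values are discarded.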

Authors (5 in total)

P. Weinberger

Ken Baclawski

Susanne Englert

Prakash Sundaresan
