A compression algorithm for DNA sequences and its applications in genome comparison (paper)
2000 · 218 citations
Algorithms and Data Compression · Genomics and Phylogenetic Studies · Genome Rearrangement Algorithms
Abstract
We present a lossless compression algorithm, Gen-Compress, for DNA sequences, based on searching for approximate repeats. Our algorithm achieves the best compression ratios on benchmark DNA sequences compared with other DNA compression programs [3, 7]. The significantly better compression results show that approximate repeats are one of the main hidden regularities in DNA sequences.
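To make the notion of an approximate repeat concrete, here is a minimal Python sketch. It is not the search procedure used by the algorithm described in the abstract, which the abstract does not specify. It assumes that "approximate" means substitutions only (a bounded number of mismatches) and uses a brute-force scan. The function name `best_approximate_repeat` and the parameters `max_mismatches` and `min_len` are hypothetical, chosen for illustration.

```python
def best_approximate_repeat(seq, pos, max_mismatches=1, min_len=4):
    """Find the longest earlier substring that matches seq[pos:] approximately.

    Illustrative assumption: only substitutions are allowed, and the source
    region must lie entirely before `pos`. Returns a tuple
    (start, length, mismatch_offsets), or None if no repeat reaches min_len.
    """
    best = None
    for start in range(pos):
        length = 0
        mismatches = []
        # Extend the match while both the source and the target stay in range.
        while pos + length < len(seq) and start + length < pos:
            if seq[start + length] != seq[pos + length]:
                if len(mismatches) == max_mismatches:
                    break  # mismatch budget exhausted
                mismatches.append(length)
            length += 1
        # A repeat that ends on a mismatch gains nothing, so trim it.
        while mismatches and mismatches[-1] == length - 1:
            mismatches.pop()
            length -= 1
        if length >= min_len and (best is None or length > best[1]):
            best = (start, length, mismatches)
    return best


# The second half repeats the first half with one substitution at offset 5.
print(best_approximate_repeat("AACCGGTTAACCGATT", 8))  # (0, 8, [5])
```

An encoder built on this idea could store the tuple (source position, length, mismatch offsets with their replacement bases) instead of the literal bases whenever that representation is shorter. With exact matching only (`max_mismatches=0`), the same example yields a repeat of length 5 rather than 8, which shows why tolerating a few mismatches can expose longer regularities.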