High-performance code generation for stencil computations on GPU architectures (Paper)

2012 · Cited by 241
Parallel Computing and Optimization Techniques; Advanced Data Storage Technologies; Distributed and Parallel Computing Systems

Abstract

Stencil computations arise in many scientific computing domains, and often represent time-critical portions of applications. There is significant interest in offloading these computations to high-performance devices such as GPU accelerators, but these architectures offer challenges for developers and compilers alike. Stencil computations in particular require careful attention to off-chip memory access and the balancing of work among compute units in GPU devices.