High-performance code generation for stencil computations on GPU architectures (paper)
2012, cited by 241
Parallel Computing and Optimization Techniques; Advanced Data Storage Technologies; Distributed and Parallel Computing Systems
Abstract
Stencil computations arise in many scientific computing domains, and often represent time-critical portions of applications. There is significant interest in offloading these computations to high-performance devices such as GPU accelerators, but these architectures offer challenges for developers and compilers alike. Stencil computations in particular require careful attention to off-chip memory access and the balancing of work among compute units in GPU devices.
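For readers unfamiliar with the term, the following is a minimal sketch of a stencil computation, assuming a 1D three-point averaging stencil with fixed boundary values. It is an illustration only, not the code generator the paper describes, and the function names `jacobi_1d_step` and `jacobi_1d` are hypothetical. Each interior input element is read for three neighboring outputs, which hints at why reuse of loaded data and off-chip memory traffic matter when such loops are mapped to a GPU.

```python
def jacobi_1d_step(u):
    """Apply one sweep of a 3-point averaging stencil.

    Boundary elements (first and last) are held fixed; this boundary
    treatment is an assumption made for the illustration.
    """
    n = len(u)
    out = list(u)  # copy so the boundary values carry over unchanged
    for i in range(1, n - 1):
        # Each output depends on a fixed neighborhood of the input.
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return out


def jacobi_1d(u, steps):
    """Apply the stencil repeatedly for the given number of time steps."""
    for _ in range(steps):
        u = jacobi_1d_step(u)
    return u
```

Calling `jacobi_1d(grid, steps)` on a list of floats returns a new list and leaves the input unchanged; every sweep reads the entire grid again, so a naive implementation repeats the same memory traffic at each time step.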