CoCheck: checkpointing and process migration for MPI (Paper)
Year: 2002. Citations: 339
Topics: Distributed systems and fault tolerance; Parallel Computing and Optimization Techniques; Advanced Data Storage Technologies
Abstract
Checkpointing of parallel applications can serve as the core technology for process migration. Both checkpointing and migration are important issues for parallel applications on networks of workstations. The CoCheck environment presented in this paper introduces a new approach to providing checkpointing and migration for parallel applications. CoCheck sits on top of the message passing library and achieves consistency at a level above the message passing system. It uses an existing single-process checkpointer that is available for a wide range of systems. Hence, CoCheck can be easily adapted both to different message passing systems and to new machines.
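The abstract does not spell out how consistency is reached above the message passing layer. One way to picture the idea is marker-based channel flushing: before the single-process checkpointer runs, each process sends a marker on every outgoing channel and drains every incoming channel up to the marker, saving in-transit messages with its checkpoint. The toy sketch below illustrates that idea under stated assumptions and is not taken from the abstract; `Network`, `READY`, and `coordinated_checkpoint` are hypothetical names, and the in-memory FIFO queues only stand in for a real message passing library.

```python
from collections import deque

READY = object()  # hypothetical control marker: "no more messages before my checkpoint"


class Network:
    """Toy FIFO channels between ranks, standing in for a message passing library."""

    def __init__(self, size):
        self.size = size
        self.channels = {
            (s, d): deque() for s in range(size) for d in range(size) if s != d
        }

    def send(self, src, dst, payload):
        self.channels[(src, dst)].append(payload)

    def recv(self, src, dst):
        return self.channels[(src, dst)].popleft()


def coordinated_checkpoint(net, local_states):
    """Flush all channels, then pair each local state with its in-transit messages."""
    # Phase 1: every rank sends the marker on each outgoing channel.
    for src in range(net.size):
        for dst in range(net.size):
            if src != dst:
                net.send(src, dst, READY)

    # Phase 2: every rank drains each incoming channel up to the marker and
    # keeps the application messages it found, so none are lost on restart.
    checkpoints = {}
    for dst in range(net.size):
        in_transit = []
        for src in range(net.size):
            if src == dst:
                continue
            while True:
                msg = net.recv(src, dst)
                if msg is READY:
                    break
                in_transit.append((src, msg))
        # A real system would invoke the single-process checkpointer here.
        checkpoints[dst] = {"state": local_states[dst], "buffered": in_transit}
    return checkpoints


# Three ranks with a few application messages still in flight.
net = Network(3)
net.send(0, 1, "a")
net.send(2, 1, "b")
net.send(1, 0, "c")
ckpt = coordinated_checkpoint(net, {0: "s0", 1: "s1", 2: "s2"})
```

Because every channel is empty once the markers have been consumed, each per-process image can be taken independently, which is the property that would let an off-the-shelf single-process checkpointer be reused.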