ReStore
ReStore enables scalable in-memory recovery of data in MPI programs after process failures by distributing and replicating the data appropriately across processes. It supports both shrinking and replacing recovery schemes and is substantially faster than approaches based on parallel file systems.
- Distributed Memory Algorithms
- Fault Tolerance
- High-Performance Computing
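
The idea of recovering data from in-memory replicas can be illustrated with a minimal sketch. It is not ReStore's actual placement scheme or API: the function names `replica_ranks` and `recovery_source`, the round-robin home rank, and the evenly spaced replica offsets are all assumptions chosen for illustration. The sketch only shows how a replicated data distribution lets a surviving process serve a block whose other holders have failed.

```python
def replica_ranks(block_id, num_ranks, num_replicas):
    """Return the ranks that hold a copy of the given block.

    Hypothetical placement: the home rank is block_id modulo num_ranks,
    and further copies sit at evenly spaced offsets from it, so all
    copies of a block land on distinct ranks.
    """
    if not 1 <= num_replicas <= num_ranks:
        raise ValueError("need 1 <= num_replicas <= num_ranks")
    home = block_id % num_ranks
    stride = num_ranks // num_replicas
    return [(home + k * stride) % num_ranks for k in range(num_replicas)]


def recovery_source(block_id, num_ranks, num_replicas, failed_ranks):
    """Return a surviving rank that still holds the block, or None if
    every copy was on a failed rank."""
    for rank in replica_ranks(block_id, num_ranks, num_replicas):
        if rank not in failed_ranks:
            return rank
    return None


if __name__ == "__main__":
    # Block 5 on 8 ranks with 2 copies; rank 5 has failed.
    print(replica_ranks(5, 8, 2))
    print(recovery_source(5, 8, 2, failed_ranks={5}))
```

Under these assumptions, a shrinking scheme would have the surviving ranks fetch the lost blocks from the ranks returned by `recovery_source` and continue with fewer processes, while a replacing scheme would have replacement processes fetch them instead.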