2022 06 13
Aurelien Bouteiller edited this page Jun 13, 2022 · 1 revision
ULFM was read at the last MPI Forum meeting. Some changes were requested (mostly minor). Implementation of the changes is in progress; we will report back with the actual text at the next meeting.
We had a quick impromptu discussion about FT for GPUs. We discussed the common failure scenarios for GPUs, and Pratesh expressed interest in having AMD GPU support for checkpoint/restart and for reacting to fault events.