Opened on Wednesday 17th November 2021, last updated
Resolved —
All V100 nodes are back online. Crashed jobs will have to be resubmitted, unfortunately.
Posted by Bob
Identified —
Due to a power outage in one of the data centers, several V100 GPU nodes went down, and jobs running on these nodes have crashed. We hope to bring these nodes back online today.