Affected services:

  • Merlin Hypervisors

GPU outage on Merlin

Opened on Tuesday 25th October 2022, last updated

Resolved — The Liqid switch has been replaced. All GPUs should now be restored to their respective VMs. If you are still experiencing problems, please contact us at hpc@rug.nl.

Posted

Identified — We have had a debugging session with our hardware supplier. The current conclusion is that the PCI Express switch that connects the GPUs to Merlin is broken and needs to be replaced. We are still waiting for an ETA on the replacement. Again, our apologies for the inconvenience this is causing. We will keep you informed.

Posted

Identified — An attempt to resolve the issue in the data center has been unsuccessful. We are now in contact with our supplier for further assistance.

Posted

Identified — All GPUs in the Merlin cloud are experiencing a malfunction. We are currently working to resolve this.

Posted