GPUs unusable
Incident Report for renku-limited
Resolved
This incident has been resolved.
Posted Feb 03, 2023 - 14:04 CET
Monitoring
The previous version of CUDA has been redeployed on the GPU nodes, and we are monitoring the fix.
Posted Feb 03, 2023 - 13:47 CET
Identified
The latest CUDA release, 12.0.1, does not work properly. We are rolling back the installed version to the previous release, 12.0.0.
Posted Feb 03, 2023 - 13:37 CET
Investigating
It is currently not possible to access GPU resources in Limited. An unattended upgrade, which should have been disabled, has broken the CUDA toolkit/driver interface.
We are working to deploy a fix ASAP.
Posted Feb 03, 2023 - 10:17 CET
This incident affected: Renkulab sessions and Loud.
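
The root cause above, an unattended upgrade moving CUDA from 12.0.0 to 12.0.1, is the kind of version drift a simple check can flag. The following is a minimal sketch only, not the tooling used on the affected nodes: the names PINNED_CUDA, parse_version, and cuda_drifted are hypothetical, and the pinned value is assumed from the versions named in this report.

```python
# Hypothetical guard: flag a CUDA version that differs from the pinned one.
# Assumption: 12.0.0 is the known-good version, as stated in the report.
PINNED_CUDA = "12.0.0"


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '12.0.1' into (12, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def cuda_drifted(installed: str, pinned: str = PINNED_CUDA) -> bool:
    """Return True when the installed version differs from the pinned one."""
    return parse_version(installed) != parse_version(pinned)


if __name__ == "__main__":
    # Both versions mentioned in the incident updates.
    for version in ("12.0.0", "12.0.1"):
        status = "drifted" if cuda_drifted(version) else "ok"
        print(f"CUDA {version}: {status}")
```

In practice the installed version string would come from whatever the node reports; how it is obtained is left out here because the report does not describe the nodes' setup.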