How Slurm meets Kubernetes: introducing Soperator
Managing distributed multi-node ML training on Slurm can be challenging. Soperator, our open-source Kubernetes operator for Slurm, offers a streamlined solution for ML and HPC engineers, making it easier to manage and scale workloads.
Join our live webinar, where we’ll demonstrate how Soperator can manage a multi-node GPU cluster to simplify operations and boost productivity.
Mikhail Mokrushin
Managed Schedulers Team Leader