How Slurm meets Kubernetes: introducing Soperator

Managing distributed multi-node ML training on Slurm can be challenging. Soperator, our open-source Kubernetes operator for Slurm, offers a streamlined solution for ML and HPC engineers, making it easier to manage and scale workloads.

Join our live webinar, where we’ll demonstrate how Soperator manages a multi-node GPU cluster to simplify operations and boost productivity.

Mikhail Mokrushin

Managed Schedulers Team Leader

Alexander Kim

Cloud Solutions Architect at Nebius
