Pre-installed drivers are now available for Kubernetes nodes
Images with pre-installed drivers in our Managed Service for Kubernetes improve NVIDIA GPU cluster scalability and make compute nodes start up 2-3x faster.
Preparing a GPU cluster takes effort and careful attention from your engineering team, and deploying and configuring the compute environment can become a significant challenge.
Today, we’re happy to announce a small yet meaningful enhancement that makes the cluster preparation process much easier and faster: Managed Service for Kubernetes can now start a node with an AI/ML-ready image. This image includes drivers for NVIDIA GPUs, drivers for NVIDIA Quantum InfiniBand-based networking and other components required for the proper operation of a GPU-accelerated environment.
Faster compute provisioning
This new type of image significantly reduces the time needed to deploy nodes in Kubernetes clusters. We expect node start-up to be up to 3x faster than with the driverless K8s image. This doesn’t just shorten the time to value for our customers; it also makes compute provisioning for GPU clusters more scalable and efficient.
In particular, it will make a big difference for model inference installations. Quick node initialization allows for rapid cluster scaling when demand increases and additional capacity needs to be provisioned.
Enhanced UX and availability
Pre-installed drivers also make setup and maintenance more user-friendly and hassle-free for all Nebius users. You won’t need to figure out the list of required software, search for installation packages, install the components, or ensure that the cluster is ready to use. Doing all of this manually, both the initial installation and later driver updates, requires additional expertise: GPU and network operators have many dependencies, which makes troubleshooting and maintenance complex and challenging for administrators.
With this new approach, all these routines become streamlined and standardized, leaving no room for outdated software or misconfigured components.
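If you still want to confirm that freshly started nodes are usable for GPU workloads, a check along the following lines is one option. This is a minimal sketch, not part of the service: it assumes the nodes advertise GPUs under the `nvidia.com/gpu` resource name (the name used by the NVIDIA device plugin), and the function name `gpu_ready_nodes` and the sample node names are hypothetical.

```python
from typing import Any

# Assumption: GPUs are exposed under the resource name used by the NVIDIA device plugin.
GPU_RESOURCE = "nvidia.com/gpu"


def gpu_ready_nodes(node_list: dict[str, Any]) -> list[str]:
    """Return names of nodes that are Ready and report at least one allocatable GPU.

    `node_list` is the parsed JSON of a Kubernetes node list,
    e.g. the output of `kubectl get nodes -o json`.
    """
    ready = []
    for node in node_list.get("items", []):
        status = node.get("status", {})
        # A node is usable only if its "Ready" condition has status "True".
        is_ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in status.get("conditions", [])
        )
        # Allocatable quantities are strings in the Kubernetes API, e.g. "8".
        gpus = int(status.get("allocatable", {}).get(GPU_RESOURCE, "0"))
        if is_ready and gpus > 0:
            ready.append(node["metadata"]["name"])
    return ready


# Hypothetical sample data shaped like a Kubernetes node list.
sample = {
    "items": [
        {
            "metadata": {"name": "gpu-node-a"},
            "status": {
                "conditions": [{"type": "Ready", "status": "True"}],
                "allocatable": {"cpu": "128", GPU_RESOURCE: "8"},
            },
        },
        {
            "metadata": {"name": "gpu-node-b"},
            "status": {
                "conditions": [{"type": "Ready", "status": "False"}],
                "allocatable": {"cpu": "128", GPU_RESOURCE: "8"},
            },
        },
        {
            "metadata": {"name": "cpu-node-c"},
            "status": {
                "conditions": [{"type": "Ready", "status": "True"}],
                "allocatable": {"cpu": "16"},
            },
        },
    ]
}

print(gpu_ready_nodes(sample))
```

In practice you would pass in the parsed output of `kubectl get nodes -o json` instead of the sample dictionary. Keeping the function free of any cluster access makes it easy to test and to reuse in a scaling or monitoring script.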
This feature is now available for all Managed Kubernetes clusters using NVIDIA GPU-enabled nodes.