Nebius opens pre-orders for NVIDIA Blackwell GPU-powered clusters

We are now accepting pre-orders for NVIDIA GB200 NVL72 and NVIDIA HGX B200 clusters to be deployed in our data centers in the United States and Finland from early 2025. Based on NVIDIA Blackwell, the architecture to power a new industrial revolution of generative AI, these new clusters deliver a massive leap forward over existing solutions.

New hardware on Nebius AI cloud

In this case, the new hardware we’re going to provide is a complete game changer: over 22,000 NVIDIA Blackwell GPUs will be deployed on the Nebius AI-native cloud. For the NVIDIA GB200 Grace Blackwell Superchip, the entire rack-scale design, including the cooling system and even the CPU architecture, has been reimagined to accommodate the latest and upcoming colossal models. The NVIDIA HGX B200 system has a more familiar form factor, but it still requires adaptation if you’ve previously worked with NVIDIA HGX H200 or HGX H100 systems.

The in-house hardware expertise of your GPU cloud provider is critical, both for extracting maximum value from your GPU investment and for navigating the technicalities of any planned migration. With years of practice designing and maintaining high-load systems, our hardware R&D team knows how to properly set up and operate sophisticated server hardware. We deliver maximum performance from every GPU hour.

NVIDIA GB200 NVL72 rack-scale system

Figure 1. NVIDIA GB200 NVL72 rack-scale system

Similarly, the Arm architecture that powers the NVIDIA GB200 Grace Blackwell Superchip has not been widely used in our domain in recent years. Nebius’ dedicated team of Linux kernel developers will lend a hand here: they are currently building a custom software layer for the smooth and stable operation of this new hardware. Integrating something this sophisticated is easier in newly written systems, and just weeks ago we finished rewriting our entire cloud from the ground up, so there’s no legacy to hold us back when developing on top of Arm.

Speaking of rewriting the cloud, one of the user benefits we achieved through this process was building much faster storage, as highlighted in our October announcement. We expect our AI-tailored shared filesystem to deliver up to 180 GB/s of read throughput per NVIDIA GB200 NVL72 rack, which is highly relevant for running multi-node training and for recovering checkpoints as quickly as possible. Combined with powerful GPU compute, these resource-demanding processes become more predictable and less stressful for your team.
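To give a feel for what that read throughput means in practice, here is a back-of-the-envelope sketch of checkpoint-restore time. The checkpoint size and the assumption of sustained peak throughput are hypothetical, purely for illustration:

```python
def restore_seconds(checkpoint_gb: float, read_gbps: float = 180.0) -> float:
    """Idealized time to read a checkpoint of `checkpoint_gb` gigabytes
    at a sustained aggregate read throughput of `read_gbps` GB/s
    (180 GB/s is the quoted per-rack peak; real runs will vary)."""
    return checkpoint_gb / read_gbps

# Hypothetical example: a ~2 TB sharded checkpoint per rack.
print(f"{restore_seconds(2000):.1f} s")  # → 11.1 s
```

Even under less ideal conditions, turning checkpoint recovery into a matter of seconds rather than minutes is what makes long multi-node training runs easier to restart.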

NVIDIA HGX B200 system

Figure 2. NVIDIA HGX B200 system

Multi-node operations also require orchestration when scaling up and down. We deliver NVIDIA GB200 and HGX B200-powered clusters as fully integrated cloud solutions with managed Kubernetes and Slurm-based workload orchestration. If any complexities arise, our solution architects will provide you with all the necessary DevOps expertise to save you time and keep you focused on machine learning.

Availability in data centers

The NVIDIA GB200 NVL72 densely packs and interconnects GPUs using a copper cable cartridge for operational simplicity. It delivers up to 25x lower cost and energy consumption than the NVIDIA HGX H100, a leap supported by the NVIDIA-designed liquid-cooling system that is currently being installed in our own data center in Finland and in a colocation facility in Kansas City. The project includes components of our own design that help the hardware operate seamlessly under the intensive loads of training large models across hundreds or thousands of nodes. Liquid cooling also suits the NVIDIA HGX B200. Additionally, we conduct extensive testing of each component before deployment to maximize efficiency.

By offering NVIDIA Blackwell-powered clusters in both Europe and the United States, we eliminate the need for customers to worry about intercontinental latency. These new systems can sit physically close to your operations, addressing even fine-grained concerns such as the placement of availability zones.

You can pre-order your NVIDIA GB200 NVL72 or NVIDIA HGX B200 cluster here and be fully prepared for the new architecture, which will enable you to train and serve models with unprecedented efficiency.

Pre-order Blackwell GPU-powered clusters

Explore Nebius

Author: Nebius team
