Nebius AI Cloud “Aether 3.5”: Frictionless compute for real-world AI
This release introduces new serverless capabilities, the NVIDIA RTX PRO™ 6000 Blackwell Server Edition GPU for applied AI use cases, improved cluster configuration tools, streamlined data operations, and platform-level enhancements that reduce routine complexity while preserving full control.
LK losses: Training speculative decoding draft models to directly maximize acceptance rate
We introduce LK losses — training objectives that directly optimize the acceptance rate for speculative decoding draft models. They are a drop-in replacement for KL divergence, add no computational overhead, and work with any draft architecture and any target model size, delivering consistent improvements in inference throughput across models ranging from 8B to 685B parameters. We are open-sourcing our trained draft models (LK-Speculators) and training datasets (Infinity-Instruct-Completions). An implementation of LK losses is also available as a pull request to SpecForge.
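The exact LK loss formulation lives in the full post, but the quantity being optimized is standard: in speculative sampling, a token drafted from the draft distribution q is accepted with probability min(1, p(x)/q(x)) under the target distribution p, which in expectation equals the overlap between the two distributions. A minimal sketch of that acceptance rate, and of a loss that penalizes it directly rather than via KL divergence (the names and toy numbers here are illustrative, not taken from the LK-Speculators code):

```python
import numpy as np

def acceptance_rate(p, q):
    """Expected per-token acceptance probability of speculative sampling.

    A token x drawn from the draft distribution q is accepted with
    probability min(1, p(x) / q(x)); taking the expectation over x ~ q
    gives sum_x min(p(x), q(x)).
    """
    return np.minimum(p, q).sum()

# Target and draft distributions over a toy 4-token vocabulary.
p = np.array([0.5, 0.3, 0.1, 0.1])      # target model
q = np.array([0.25, 0.25, 0.25, 0.25])  # draft model

# The acceptance rate equals 1 minus the total variation distance
# between p and q, so a loss of the form 1 - sum(min(p, q)) penalizes
# rejections directly instead of minimizing KL divergence as a proxy.
alpha = acceptance_rate(p, q)
loss = 1.0 - alpha
print(alpha)  # 0.7
```

When the draft matches the target exactly, the acceptance rate is 1 and the loss is 0; the LK losses described in the post make this kind of objective trainable at scale.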
MLPerf® Inference v6.0: Top-tier AI performance on NVIDIA Blackwell and Blackwell Ultra
The results of our MLPerf® Inference v6.0 submission demonstrate Nebius’ ability to maximize efficiency for modern AI inference workloads on the latest NVIDIA Blackwell and Blackwell Ultra platforms.
Nebius partners with Positronic on Physical AI Leaderboard (PhAIL)
Physical AI is moving from controlled demos to real-world deployment — and that shift demands benchmarks grounded in actual operations, not lab conditions. Today, Nebius joins Positronic as a founding consortium partner of the Physical AI Leaderboard (PhAIL), a new platform that evaluates robot AI models on real hardware using commercially relevant tasks and production-grade metrics.
Nebius VPN Gateway CLI: Easily manage site-to-site VPNs in AI Cloud
The Nebius VPN Gateway CLI provides a simple way to configure and operate site-to-site VPN connectivity in Nebius AI Cloud. In this post, we walk through how it enables infrastructure-as-code workflows for IPsec gateways, helping teams manage secure and reliable connectivity across cloud and on-prem environments.
Introducing NVIDIA RTX PRO 6000 Blackwell Server Edition on Nebius
NVIDIA RTX PRO 6000 Blackwell opens new opportunities for cost-efficient inference and boosts performance for visual computing and scientific simulations.
Introducing DevPods, Jobs and Endpoints: Easy compute access with serverless AI
Serverless services at Nebius are a natural extension of how an AI infrastructure cloud evolves, built on a mature and well-established underlying platform. As the platform develops, it becomes possible to expose compute in more flexible and elastic forms that better match how AI workloads are actually consumed.
Nebius and PyTorch partner to accelerate frontier MoE training on NVIDIA Blackwell
In collaboration with PyTorch, Nebius helped demonstrate up to 41% faster pre-training of DeepSeek-V3 models on NVIDIA Blackwell GPUs.
Incident post-mortem analysis: us-central1 service disruption on March 10, 2026
A detailed analysis of the incident on March 10, 2026 that led to service outages in the us-central1 region.
Delivering a validated AI Factory stack for agent workloads on Nebius AI Cloud with DataRobot
At NVIDIA GTC 2026, Nebius and DataRobot, with NVIDIA, introduced a validated AI Factory stack for production-grade agent workloads. In this post, we outline how the DataRobot Agent Workforce Platform runs on Nebius AI Cloud to support sustained inference, governance and cost control for AI agents deployed in live business workflows.
Nebius and Eigen AI partner to accelerate frontier open-source AI inference
Nebius and Eigen AI are partnering to bring optimized frontier open-source models to Nebius Token Factory. As part of the collaboration, optimized implementations of models such as DeepSeek, GLM, GPT-OSS, Kimi, Llama, MiniMax and Qwen will be published on the platform, giving developers direct access to high-performance inference through production-ready endpoints and APIs.
From fragmented data to production-grade agents: Nebius, Nexla and Tripadvisor at NVIDIA GTC
Nexla and Nebius are partnering to deliver a production-ready data and agent stack that connects governed enterprise data with infrastructure built for sustained inference. In this post, we outline how this architecture enables multi-agent systems to move from fragmented data pipelines to reliable production workflows, and show it in action through a live “Inspiration to Trip” demo presented with Tripadvisor at NVIDIA GTC.
Incident post-mortem analysis: eu-north-1 service disruption on February 26, 2026
A detailed analysis of the incident on February 26, 2026 that led to service outages in the eu-north-1 region.