We kicked off Q2 with a single mission: turn raw compute horsepower into concrete business outcomes. With numerous launches, including groundbreaking models, streamlined fine-tuning, scalable throughput and seamless integrations, Nebius AI Studio made significant strides to simplify how you build, optimize and scale your AI workloads. Here’s how these Q2 updates directly empower AI builders, enterprises and researchers to accelerate their projects and achieve tangible results.
What it means for builders: Eliminate complexity from scaling, enjoy predictable costs and meet demanding SLAs whether you’re prototyping or scaling to millions of queries per day.
Adaptive burst rate limits: Automatically absorbs your traffic spikes by tapping unused capacity, eliminating “429” errors and manual rate-limit adjustments.
Batch and Async API: Process massive inference workloads (10GB+ datasets) at up to 50% lower cost compared to real-time. Ideal for large-scale content pipelines, data processing and background operations.
Expanded GPU regions: New NVIDIA Blackwell Ultra capacity based in the UK (operational by Q4 2025) and Europe-first NVIDIA GB200 NVL72 ensure compliance, low latency and reliable performance across regions.
Bottom line: You focus on innovation, we handle seamless scaling.
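To make the Batch API workflow concrete, here is a minimal sketch of preparing a batch job file. It assumes an OpenAI-compatible batch format (one JSON request per line, matched back to inputs via `custom_id`); the model identifier and endpoint path are illustrative, so check the Studio docs for the exact schema.

```python
import json

def build_batch_file(prompts, model="meta-llama/Llama-3.3-70B-Instruct"):
    """Assemble an OpenAI-style batch payload: one JSON request per line.
    The model name and endpoint path here are illustrative assumptions."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"req-{i}",  # lets you match results to inputs later
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request))
    return "\n".join(lines)

# Two prompts become two independent requests in one batch file.
batch_jsonl = build_batch_file(["Summarize this report", "Draft a product tweet"])
print(batch_jsonl.count("\n") + 1)  # 2 requests, one per line
```

The resulting JSONL would then be uploaded as the input file for a batch job, which runs asynchronously at the discounted rate instead of hitting the real-time endpoint.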
What it means for builders: Access precisely the right model for your use case, without juggling multiple providers.
Llama-3.1 Nemotron-Ultra-253B: GPT-4-level reasoning, 97% accuracy on the MATH500 benchmark, 128K token context — ideal for deep, complex use cases.
DeepSeek R1-0528 and R1-Distill-70B: Choose full accuracy or a streamlined, 50× smaller version for faster inference and edge deployments.
Qwen3 Family (0.6B–235B): A single, flexible family that covers conversational AI, coding assistants and multimodal tasks, enabling effortless scaling as your requirements evolve.
Bottom line: Select exactly the right model for your use case, to optimize performance and budget.
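One practical pattern for “the right model per use case” is a simple routing table over the lineup above. The sketch below is illustrative: the model identifiers are assumptions based on the public names mentioned in this post, so verify the exact IDs against the Studio model catalog before use.

```python
# Illustrative router mapping workload tiers to models from the Q2 lineup.
# Model identifiers are assumptions; confirm exact IDs in the model catalog.
MODEL_TIERS = {
    "edge": "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",  # smaller, faster inference
    "general": "Qwen/Qwen3-32B",                          # balanced cost/quality
    "deep-reasoning": "nvidia/Llama-3.1-Nemotron-Ultra-253B",  # maximum accuracy
}

def pick_model(tier: str) -> str:
    """Return the model ID for a workload tier, defaulting to 'general'."""
    return MODEL_TIERS.get(tier, MODEL_TIERS["general"])
```

Because every model sits behind the same API, swapping tiers is a one-string change rather than a new provider integration.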
What it means for builders: Easily customize models to embed domain-specific knowledge, minimize inaccuracies and launch faster, even with limited data.
Fine-tuning (LoRA and full): Embed your data, terminology and business logic into top models like DeepSeek and Qwen in minutes.
Reinforcement fine-tuning (RFT) early access: Train high-performing, specialized models using 10–100× less labeled data, ideal for regulated sectors like finance, healthcare and legal.
One-click LoRA hosting: Deploy your specialized adapters in 60 seconds, without managing GPU infrastructure.
Bottom line: Customize your AI effortlessly, without heavy infrastructure overhead or extensive labeled datasets.
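As a rough sketch of what kicking off a LoRA fine-tune looks like, the snippet below assembles a job specification. The field names and hyperparameters are assumptions modeled on common fine-tuning APIs, not the Studio schema itself; consult the fine-tuning docs for the authoritative parameter names.

```python
def lora_finetune_config(base_model: str, dataset_file: str,
                         rank: int = 16, alpha: int = 32) -> dict:
    """Assemble a LoRA fine-tuning job spec.
    Field names are illustrative assumptions, not the exact Studio schema."""
    return {
        "model": base_model,          # e.g. a DeepSeek or Qwen base model
        "training_file": dataset_file,
        "hyperparameters": {
            "lora": True,
            "lora_r": rank,           # adapter rank: capacity vs. size trade-off
            "lora_alpha": alpha,      # scaling applied to adapter updates
            "n_epochs": 3,
        },
    }

job = lora_finetune_config("Qwen/Qwen3-32B", "training_data.jsonl")
```

Once the job completes, the resulting adapter is what the one-click LoRA hosting step deploys, so there is no GPU infrastructure to manage on your side.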
What it means for builders: Access inspiration, expert resources and generous credits to accelerate your projects.
Open learning and cookbooks: Access comprehensive cookbooks, end-to-end notebooks, blogs and curated sample repositories, including our popular awesome-ai-apps repository, which showcases practical examples using Google ADK, OpenAI Agents SDK, LangChain, LlamaIndex, Agno, CrewAI, plus integrations with tools like Tavily, Firecrawl, YFinance and more.
Bottom line: You’re supported by a vibrant ecosystem of experts, resources and credits to fast-track your AI journey.
What it means for builders: Rely on secure, compliant, high-performance infrastructure to confidently scale your AI globally.
NVIDIA Blackwell Ultra capacity based in the UK (operational in Q4 2025): High-capacity local GPU infrastructure for research, startups, and regulated workloads.
Europe-first NVIDIA GB200 NVL72 and NVIDIA AI Enterprise: Supercomputer-grade resources under strict EU compliance, blending hyperscale flexibility and enterprise reliability.
Bottom line: Nebius AI Studio is backed by globally distributed, enterprise-grade infrastructure, ensuring reliability, security and compliance at any scale.