Nebius monthly digest: May 2025

Even larger workloads are welcome! We raised quotas last month, so you can now access up to 32 NVIDIA H200 GPUs on demand in both the US and Europe. Speaking of Europe, we’ll be revealing many of our plans for the region at NVIDIA GTC Paris next week. Our research and other initiatives have continued as well — today’s digest covers it all.

NVIDIA GTC Paris & VivaTech 2025: Let’s meet in Paris on June 11–13

We’re heading to Paris for two great events taking place at the same venue! Nebius’ ambition is to be the default-choice AI cloud in Europe and beyond. Why do we believe we can achieve this? Attend our sessions, meet us at our booths and talk to our customers to find out.

Booths

  • NVIDIA GTC: D03, Hall 7
  • VivaTech: J57, Hall 1

Sessions

Access up to 32 NVIDIA H200 GPUs on demand in Kansas City and Finland

We’ve raised quotas for our self-service once again. Now you can deploy even larger workloads entirely on your own, without contacting our team, via the Nebius console, API, CLI, Terraform or any of our integrations.

New research done by our AI R&D team

  • Our AI R&D team’s research paper, “Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents,” has been accepted to ICML 2025 — here’s the preprint on arXiv. The acceptance rate for this year’s conference after peer review was just 27%, not counting desk rejections.

  • SWE-rebench is our new benchmark for evaluating agentic LLMs on a continuously updated and decontaminated set of real-world software engineering tasks mined from real GitHub repositories.

Nebius demonstrates industry-leading AI training performance in latest MLPerf® results

As a peer-reviewed industry benchmark suite, MLPerf® Training by MLCommons® is one of the most trustworthy sources of data about AI cloud performance in the industry. We achieved training times of 124.5 and 244.6 minutes for Llama 3.1 405B on 128-node and 64-node clusters, respectively. These numbers demonstrate near-linear scaling of Nebius infrastructure.
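The “near-linear” claim can be sanity-checked with a quick calculation using the numbers above: doubling the node count from 64 to 128 should ideally halve the training time.

```python
# Training times for Llama 3.1 405B from the MLPerf results above (minutes).
time_64_nodes = 244.6
time_128_nodes = 124.5

# Ideal linear scaling: doubling the nodes halves the time.
ideal_time_128 = time_64_nodes / 2

# Scaling efficiency: how close the measured time is to the ideal.
efficiency = ideal_time_128 / time_128_nodes
print(f"Scaling efficiency at 128 nodes: {efficiency:.1%}")
```

The result is roughly 98% of ideal linear scaling, which is what makes the jump from 64 to 128 nodes worthwhile for a model of this size.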

Supporting projects that push industries forward

  • SGLang, an LLM inference framework, teamed up with Nebius to supercharge DeepSeek R1’s performance for real-world use. The team achieved a 2× throughput boost and significantly lower latency.

  • Converge Bio is redefining precision medicine by combining single-cell RNA sequencing with large language models to unlock patient-level therapeutic insights. With Nebius’ AI-native infrastructure, they’ve trained a full-transcriptome foundation model (Converge-SC) capable of processing 20,000+ genes per cell — delivering state-of-the-art accuracy, explainability and speed for drug discovery and clinical development.

  • Recent advancements in LLMs have opened new possibilities for generative molecular drug design. Researchers from YerevaNN and Yerevan State University presented three Nebius-based models, continuously pre-trained on a novel corpus of 110M molecules with computed properties, totaling 40B tokens. A genetic algorithm integrates the models to optimize molecules with promising properties.

AI Discovery Award: The nominees are in!

In our annual award recognizing startups that are leveraging AI in healthcare and life sciences, the judges reviewed all 103 semifinalists and selected three nominees in each of the four categories — along with seven remarkable companies that we’ve included as honorable mentions.

Technical articles and docs: more technical than ever

  • Our Cloud Solutions Architect Alex Kim walked through how to get Llama 4 and Qwen3 running on Nebius (recently integrated with SkyPilot) by using SGLang as the serving framework.
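Once a model is served with SGLang, it exposes an OpenAI-compatible HTTP API. A minimal sketch of calling such an endpoint from Python, using only the standard library — the host, port and model ID below are placeholders for your own deployment, not values from the article:

```python
import json
import urllib.request

# Placeholder endpoint for a locally running SGLang server.
ENDPOINT = "http://localhost:30000/v1/chat/completions"

payload = {
    "model": "Qwen/Qwen3-32B",  # placeholder model ID
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = urllib.request.urlopen(request)  # uncomment against a live server
```

Because the API is OpenAI-compatible, the same request shape works with any OpenAI-style client library by pointing its base URL at the server.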

  • Over the past few weeks, we’ve made several significant observability improvements to Nebius AI Cloud. Among them: advanced monitoring metrics in our web console and API, out-of-the-box Grafana dashboards, and monitoring and logging features that let customers upload custom metrics and logs to our cloud.

  • Observability services are now documented in one place — everything on Monitoring and Logging is now available in a single section to help you track resource metrics, set up alerts and analyze logs for better infrastructure visibility and reliability.

  • More useful additions for Slurm and Soperator users: a new guide walks you through downloading data to your cluster; you can now connect to login nodes via VS Code, which is great for interactive development; and new docs cover running jobs in containers with Docker and managing jobs and queues in your cluster.

  • Nebius AI Studio keeps evolving. Billing now has its own dedicated section — learn how to manage payment methods and view usage. You can integrate Helicone to track costs and metrics of your model runs, or connect Managed MLflow to gain more control and visibility across the ML workflow.

  • Compute docs have been expanded with a maintenance overview and a how-to on stopping and starting VMs during maintenance periods.

  • Managed Kubernetes now includes a guide on deploying and managing applications — useful for getting your clusters production-ready.

  • Finally, learn how to create IAM access tokens for authenticating API calls and accessing services, and how they differ from access keys.
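In practice, an IAM access token is passed as a bearer token in the `Authorization` header of each API call. A minimal sketch using only the Python standard library — the token string and URL below are placeholders, not real Nebius values:

```python
import urllib.request

# Placeholders: substitute your own token and the API endpoint you are calling.
IAM_TOKEN = "<your-iam-access-token>"
API_URL = "https://api.example.nebius.cloud/resource"  # hypothetical URL

request = urllib.request.Request(
    API_URL,
    headers={"Authorization": f"Bearer {IAM_TOKEN}"},
)
# urllib.request.urlopen(request)  # sends the authenticated call
```

Unlike long-lived access keys, access tokens expire, so they are typically generated fresh and injected into requests rather than stored.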

Explore Nebius AI Cloud

Explore Nebius AI Studio

Author
Nebius team