Nebius partners with Positronic on Physical AI Leaderboard (PhAIL)

March 31, 2026

Physical AI is moving fast, but the field has lacked a rigorous, real-world benchmark to measure progress. Demo videos and lab success rates tell only part of the story. The operators who decide whether to deploy robotics at scale need harder numbers: throughput, reliability and reproducibility on genuine commercial tasks.

Today, Nebius is excited to announce our role as a founding consortium partner of the Physical AI Leaderboard (PhAIL) by Positronic, a platform for training and deploying any robot AI model on any robot. PhAIL uses real hardware to evaluate vision-language-action (VLA) models on bin-to-bin order picking—a high-volume, commercially representative task.

Unlike existing benchmarks that report abstract success rates, PhAIL measures the metrics that matter on an actual shop floor: Units Per Hour (UPH) and Mean Time Between Failures or Assists (MTBF/A). Every run is recorded and published with synchronized video, robot telemetry and scoring logs, so any result can be independently audited. Positronic developed the evaluation methodology and operates the benchmark rigs. The inaugural results, including comparisons to human and teleoperated baselines, are live now at phail.ai.
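To make the two metrics concrete, here is a minimal sketch of how they could be computed from a single run. The `RunSummary` type, its field names and the sample numbers are hypothetical and used only for illustration. The exact definitions PhAIL applies, such as what counts as an assist or how idle time is handled, are set out in Positronic's methodology and may differ.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical summary of one evaluation run (field names are illustrative)."""
    units_picked: int         # items successfully moved from bin to bin
    duration_s: float         # run time in seconds
    failures_or_assists: int  # events where the robot failed or needed human help


def units_per_hour(run: RunSummary) -> float:
    """Throughput: units picked, normalised to one hour of operation."""
    if run.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return run.units_picked * 3600.0 / run.duration_s


def mtbf_a_seconds(run: RunSummary) -> float:
    """Mean run time per failure-or-assist event, in seconds.

    Returns infinity when no event occurred (an assumed convention,
    since the mean is undefined in that case).
    """
    if run.failures_or_assists == 0:
        return float("inf")
    return run.duration_s / run.failures_or_assists


# Made-up numbers: 90 units in 30 minutes with 4 failures or assists.
run = RunSummary(units_picked=90, duration_s=1800.0, failures_or_assists=4)
print(units_per_hour(run))  # 180.0
print(mtbf_a_seconds(run))  # 450.0
```

Reporting both numbers together matters: a model can post a high UPH while needing frequent human help, and MTBF/A is what exposes that.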

As part of the consortium, Nebius will provide its vertically integrated AI infrastructure for fine-tuning and evaluation of robot models. Nebius AI Cloud is well suited for physical AI workloads and includes a managed service for data and compute workflows in robotics. Nebius has integrated NVIDIA OSMO, an open workflow orchestration framework, to deliver an easy-to-consume managed service that provides unified, agentic orchestration across the entire physical AI development pipeline. Nebius also offers scalable, high-performance storage; NVIDIA Blackwell and NVIDIA Hopper clusters for AI training and inference; and simulation instances with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. Teams that submit their models for evaluation can apply for Nebius compute credits to support their fine-tuning work.

PhAIL is designed to be open and reproducible from end to end. Positronic publishes a free fine-tuning dataset collected through teleoperated demonstrations, along with open-source training scripts that any team can use to prepare their model for evaluation. The benchmark hardware is a Franka Research 3 arm with a Robotiq 2F-85 gripper in the DROID configuration, which is widely available and reproducible. Evaluation is ‘blind’: model checkpoints are rotated randomly so the operator does not know which model is running. Full methodology is documented in the PhAIL white paper.
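The blind rotation described above can be pictured with a short sketch. This is an illustration of the general idea only, not Positronic's actual procedure, which is documented in the white paper. The function name, the trial-ID format and the checkpoint names are all hypothetical.

```python
import random


def blind_schedule(checkpoints, trials_per_model, seed=None):
    """Build a randomized trial order that hides model identity.

    Returns (operator_view, key):
      operator_view: ordered list of opaque trial IDs shown to the operator
      key: mapping from trial ID to checkpoint, withheld from the operator
    """
    rng = random.Random(seed)
    # One entry per trial, then shuffle so the order reveals nothing.
    trials = [ckpt for ckpt in checkpoints for _ in range(trials_per_model)]
    rng.shuffle(trials)
    key = {f"trial-{i:03d}": ckpt for i, ckpt in enumerate(trials, start=1)}
    operator_view = list(key)  # IDs only, in execution order
    return operator_view, key


# Hypothetical checkpoint names, two trials each.
operator_view, key = blind_schedule(["model_a", "model_b", "model_c"], 2, seed=7)
print(operator_view)  # opaque trial IDs in execution order
```

Because the operator only ever sees the opaque IDs, any judgment calls made during a run, such as when to step in with an assist, cannot be biased by knowing which model is on the rig.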

If you’re building physical AI models, the path to participation is open: download the dataset, fine-tune and submit your checkpoint for evaluation on Positronic’s rigs. The consortium launching PhAIL already includes Toloka, the human data infrastructure for frontier AI. If you represent a hardware vendor, simulation platform, academic lab or industry operator and want to help shape what PhAIL measures next, the consortium is actively welcoming new members.

Read Positronic’s full blog post for more details, explore the live leaderboard, and email hi@phail.ai if you’d like to get involved.

Evan Helda

Head of Physical AI

Akshai Parthasarathy

Solutions Marketing Director
