
Ray Cluster

by Ray
Orchestration, Infrastructure

During installation, you can select the number and parameters of worker groups, configure autoscaling, and set up the head node. Alternatively, you can provide your own configuration in YAML format.

Two components can be installed along with the application. The KubeRay Operator is required for Ray Cluster to function, so install it unless it is already running in your Kubernetes cluster. The Kubernetes Prometheus Stack is optional and integrates with the KubeRay Operator and Ray Cluster for monitoring and observability.

By default, autoscaling is enabled and the number of worker replicas is set to 0. As soon as a task is submitted, Ray Cluster automatically creates a worker. Once the task finishes, the worker is deleted after a short grace period if it is no longer in use. Ray Cluster autoscaling works together with the Managed Service for Kubernetes cluster autoscaler: if no suitable node is available for a new worker, the Kubernetes cluster autoscaler can provision one.
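As a rough illustration of the default behavior (autoscaling on, zero worker replicas), the Python sketch below builds a manifest shaped like a KubeRay RayCluster custom resource and prints it as JSON, which is also valid YAML. The installer's own parameter schema is not shown on this page, so the cluster name, group name, image tag, replica limit, and idle timeout are all assumptions made for the example.

```python
import json


def build_ray_cluster(name="example-raycluster", max_workers=4, idle_timeout_s=60):
    """Return a dict shaped like a KubeRay RayCluster manifest.

    All names and values are illustrative assumptions, not the
    application's actual defaults.
    """
    # Hypothetical image tag; use the Ray version you actually run.
    container = {"name": "ray", "image": "rayproject/ray:2.9.0"}
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayCluster",
        "metadata": {"name": name},
        "spec": {
            # Let Ray's autoscaler add and remove worker pods.
            "enableInTreeAutoscaling": True,
            # Assumed grace period before an idle worker is removed.
            "autoscalerOptions": {"idleTimeoutSeconds": idle_timeout_s},
            "headGroupSpec": {
                "rayStartParams": {},
                "template": {"spec": {"containers": [dict(container, name="ray-head")]}},
            },
            "workerGroupSpecs": [
                {
                    "groupName": "workers",
                    # Start with no workers; scale up when tasks arrive.
                    "replicas": 0,
                    "minReplicas": 0,
                    "maxReplicas": max_workers,
                    "rayStartParams": {},
                    "template": {"spec": {"containers": [dict(container, name="ray-worker")]}},
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_ray_cluster(), indent=2))
```

Setting both replicas and minReplicas to 0 is what allows the cluster to sit idle with only the head node running. maxReplicas caps how far a burst of tasks can scale the group.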

Key features

Flexible cluster configuration

Configure head and worker groups or provide custom YAML.

Autoscaling support

Scales Ray workers with workload demand and integrates with Kubernetes autoscaling.

Node group targeting

Supports node selectors and custom resources for precise scheduling.
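One way node group targeting could look in practice is a standard Kubernetes nodeSelector on the worker pod template. The sketch below is a minimal helper under that assumption. The worker group dict and the label key and value are hypothetical, and custom resources are not covered here.

```python
import copy


def target_node_group(worker_group, node_labels):
    """Return a copy of a worker group pinned to nodes carrying node_labels."""
    pinned = copy.deepcopy(worker_group)
    # Create template.spec if missing, then merge labels into nodeSelector.
    pod_spec = pinned.setdefault("template", {}).setdefault("spec", {})
    pod_spec.setdefault("nodeSelector", {}).update(node_labels)
    return pinned


# Hypothetical worker group and node label, for illustration only.
gpu_workers = {"groupName": "gpu-workers", "template": {"spec": {"containers": []}}}
pinned = target_node_group(gpu_workers, {"example.com/node-group": "gpu"})
print(pinned["template"]["spec"]["nodeSelector"])
```

The helper copies the input rather than mutating it, so the same base worker group can be reused for several node groups with different labels.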

Optional monitoring stack

Can integrate kube-prometheus-stack with Grafana dashboards.


Pricing

Additional Nebius infrastructure costs may apply. Use the Nebius Pricing Page to estimate your infrastructure costs.

Self-managed

Ray Cluster on Kubernetes

Deploy Ray clusters on Kubernetes with configurable worker groups and autoscaling.

Free
Charged for resources
Setup time: 20+ minutes
Scaling: Auto
Maintenance: Self-managed (cluster)
White-glove

Deploy with a solutions architect

Some applications are easier with a hand on the wheel. Talk to an architect who has deployed this in production.

  • Architecture review & sizing
  • Hands-on deploy session
  • 30 days of follow-up support
Talk to an expert

Security & compliance

Run Ray Cluster on infrastructure built for AI workloads

Reliable AI infrastructure backed by top-tier NVIDIA GPUs, purpose-built for demanding inference workloads. Multiple deployment methods are available: virtual machines for full hardware control, Kubernetes for scalable cluster deployments, and managed serverless applications for teams that want inference running without infrastructure overhead.

Learn about Nebius AI Cloud

Security & compliance, out of the box

Nebius meets a broad set of security and compliance standards. Fine-grained IAM controls, audit logs, and encrypted storage are available out of the box — so teams can meet security requirements without additional tooling.

Explore the Trust center

Support

Application support

Provided by Ray. See the documentation and project links above.

Infrastructure support

Provided by Nebius for the underlying cloud infrastructure.