Kubernetes Operator for Apache Spark™

by Apache
Orchestration

Apache Spark™ unifies batch and real-time streaming data processing in your preferred language: Python, SQL, Scala, Java or R. It executes fast, distributed ANSI SQL queries for dashboards and ad hoc reporting, and runs them faster than most data warehouses. Users can perform exploratory data analysis (EDA) on petabyte-scale data without downsampling. They can also train machine learning algorithms on a laptop, then use the same code to scale to fault-tolerant clusters of thousands of machines. The Kubernetes Operator for Apache Spark, developed by Google Cloud, uses the Kubernetes operator pattern and custom resources to manage Apache Spark applications the same way as other Kubernetes workloads.

Key features

K8S deployment

Deploy the Kubernetes Operator for Apache Spark with Helm.

SparkApplication CRDs

Manage Spark jobs declaratively with Kubernetes resources.

Lifecycle management

Handle submission, status monitoring, and cleanup through operator logic.

Scalable data processing

Run distributed Spark workloads with Kubernetes scheduling.
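
To make the declarative SparkApplication feature concrete, here is a minimal sketch that builds a manifest as a Python dictionary and prints it as JSON. It assumes the `sparkoperator.k8s.io/v1beta2` API group used by the operator originally developed by Google Cloud; check the CRD version installed in your cluster before relying on it. The application name, image, file path, Spark version, namespace, and service account are hypothetical placeholders, not values from this page.

```python
import json


def spark_application(name, image, main_file, executors=2):
    """Build a SparkApplication manifest as a plain dict.

    Assumption: the cluster serves the sparkoperator.k8s.io/v1beta2 CRD.
    All concrete values passed in below are illustrative placeholders.
    """
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": "default"},
        "spec": {
            "type": "Python",             # language of the Spark job
            "mode": "cluster",            # driver runs inside the cluster
            "image": image,               # hypothetical container image
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",      # assumed; match your image
            "restartPolicy": {"type": "Never"},
            "driver": {
                "cores": 1,
                "memory": "512m",
                "serviceAccount": "spark",  # hypothetical service account
            },
            "executor": {
                "cores": 1,
                "instances": executors,
                "memory": "512m",
            },
        },
    }


manifest = spark_application(
    name="example-pyspark-job",
    image="registry.example.com/spark-py:latest",
    main_file="local:///opt/spark/app/job.py",
    executors=3,
)

# kubectl accepts JSON as well as YAML, so this output can be applied directly.
print(json.dumps(manifest, indent=2))
```

Saved as a script (for example, a hypothetical `make_manifest.py`), the output can be piped to `kubectl apply -f -`. The operator then takes over submission, status monitoring, and cleanup for the resulting resource.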


Pricing

Additional Nebius infrastructure costs may apply. Use the Nebius Pricing Page to estimate your infrastructure costs.

Self-managed

Kubernetes Operator for Apache Spark™ on Kubernetes

Deploy Kubernetes Operator for Apache Spark™ on Kubernetes.

Free
Charged for resources
Setup time: 20+ minutes
Scaling: Auto
Maintenance: Self-managed (cluster)
White-glove

Deploy with a solutions architect

Some applications are easier with a hand on the wheel. Talk to an architect who has deployed this in production.

  • Architecture review & sizing
  • Hands-on deploy session
  • 30 days of follow-up support

Security & compliance

Run Kubernetes Operator for Apache Spark™ on infrastructure built for AI workloads

Reliable AI infrastructure backed by top-tier NVIDIA GPUs, purpose-built for demanding inference workloads. Multiple deployment methods are available: virtual machines for full hardware control, Kubernetes for scalable cluster deployments, and managed serverless applications for teams that want inference running without infrastructure overhead.

Security & compliance, out of the box

Nebius meets a broad set of security and compliance standards. Fine-grained IAM controls, audit logs, and encrypted storage are available out of the box — so teams can meet security requirements without additional tooling.

Support

Application support

Provided by Apache. See the documentation and project links above.

Infrastructure support

Provided by Nebius for the underlying cloud infrastructure.