Kubernetes Provider
Scalable v2.0.0 supports Kubernetes-based execution through the
scalable[kubernetes] extra, using the Dask Kubernetes Operator.
Installation
pip install scalable[kubernetes]
This installs dask-kubernetes and the kubernetes Python client.
Prerequisites
A Kubernetes cluster with the Dask Kubernetes Operator installed.
A valid KUBECONFIG pointing to the cluster.
Appropriate RBAC permissions for creating DaskCluster resources.
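The three prerequisites can be checked before the first run. The sketch below is not part of Scalable: `kubeconfig_paths` and `preflight_commands` are hypothetical helper names, and the CRD name `daskclusters.kubernetes.dask.org` is assumed to be the one your operator version registers. It resolves the kubeconfig files kubectl would read and builds two kubectl commands, one for the operator CRD and one for the RBAC permission.

```python
import os
from pathlib import Path

# CRD registered by the Dask Kubernetes Operator (assumed; verify for your operator version).
DASK_CLUSTER_CRD = "daskclusters.kubernetes.dask.org"


def kubeconfig_paths(env=None):
    """Return the kubeconfig files kubectl would consult, in order."""
    env = os.environ if env is None else env
    raw = env.get("KUBECONFIG", "")
    # KUBECONFIG may hold several paths joined by the OS path separator.
    paths = [Path(p).expanduser() for p in raw.split(os.pathsep) if p]
    # Without KUBECONFIG, kubectl falls back to ~/.kube/config.
    return paths or [Path.home() / ".kube" / "config"]


def preflight_commands(namespace="default"):
    """Build kubectl commands that check the operator CRD and RBAC."""
    return [
        # Fails if the DaskCluster CRD is not installed on the cluster.
        ["kubectl", "get", "crd", DASK_CLUSTER_CRD],
        # Reports whether the current identity may create DaskCluster resources.
        ["kubectl", "auth", "can-i", "create", DASK_CLUSTER_CRD,
         "--namespace", namespace],
    ]


if __name__ == "__main__":
    for path in kubeconfig_paths():
        print(f"{path}: {'found' if path.exists() else 'missing'}")
    for argv in preflight_commands("demeter-prod"):
        print(" ".join(argv))
```

The commands are only printed here, so the sketch can be run without cluster access; paste them into a shell or pass them to `subprocess.run` once kubectl is configured.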
Configuration
The KubernetesProvider maps manifest components to Kubernetes worker groups.
Target options:
namespace: Kubernetes namespace (default: "default")
image: Default container image for scheduler/workers
n_workers: Initial worker count per group
worker_service_account: Service account for worker pods
adaptive: Dict with minimum and maximum for adaptive scaling
resources: Default resource requests (cpu, memory)
env: Extra environment variables for pods
tolerations: Kubernetes tolerations list
node_selector: Node selector dict
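To make the shape of these options concrete, here is a minimal sketch of a target written as a Python dict, together with a small check of the adaptive bounds. `normalize_target` is a hypothetical helper, not a Scalable function. The service account, toleration, and node pool values are made up for illustration, and the exact layout of `resources` (a dict with `cpu` and `memory` keys) is an assumption based on the option description.

```python
def normalize_target(options):
    """Apply the documented namespace default and sanity-check adaptive bounds."""
    target = dict(options)
    # The option list documents "default" as the namespace default.
    target.setdefault("namespace", "default")

    adaptive = target.get("adaptive")
    if adaptive is not None:
        missing = {"minimum", "maximum"} - set(adaptive)
        if missing:
            raise ValueError(f"adaptive is missing keys: {sorted(missing)}")
        if adaptive["minimum"] > adaptive["maximum"]:
            raise ValueError("adaptive.minimum must not exceed adaptive.maximum")
    return target


gke = normalize_target({
    "image": "gcr.io/my-project/demeter:2.0.1",
    "n_workers": 2,
    "worker_service_account": "demeter-worker",  # hypothetical account name
    "adaptive": {"minimum": 2, "maximum": 20},
    "resources": {"cpu": "8", "memory": "32G"},  # assumed layout
    "env": {"DEMETER_DATA": "/data"},
    "tolerations": [  # hypothetical taint
        {"key": "dedicated", "operator": "Equal", "value": "dask", "effect": "NoSchedule"}
    ],
    "node_selector": {"cloud.google.com/gke-nodepool": "dask-pool"},  # hypothetical pool
})
print(gke["namespace"])
```

Because no namespace is given, the helper fills in "default"; a target with `minimum` above `maximum` raises a `ValueError` instead.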
Example manifest:
# Scalable manifest targeting GKE (Google Kubernetes Engine)
# Requires: pip install scalable[kubernetes]
#
# Demeter LULCC running example — see docs/tutorials/demeter_setup.rst
# for the full setup instructions and docs/examples/scalable.demeter.yaml
# for the canonical multi-target version.
version: 1
project:
  name: demeter-lulcc-gke
  default_storage: gs://my-bucket/scalable-runs/
targets:
  gke:
    provider: kubernetes
    namespace: demeter-prod
    image: gcr.io/my-project/demeter:2.0.1
    adaptive:
      minimum: 2
      maximum: 20
    overlay: gke-prod
components:
  demeter:
    image: gcr.io/my-project/demeter:2.0.1
    cpus: 8
    memory: 32G
    tags: [lulcc, downscaling, gcam]
    env:
      DEMETER_DATA: /data
  postprocess:
    image: gcr.io/my-project/postprocess:latest
    cpus: 4
    memory: 16G
    tags: [lulcc, aggregation]
tasks:
  run_demeter_scenario:
    component: demeter
    cache: true
    outputs:
      output_dir: dir
  aggregate_demeter_outputs:
    component: postprocess
    cache: true
overlays:
  gke-prod:
    components:
      demeter:
        memory: 64G
        cpus: 16
      postprocess:
        memory: 32G
  gke-dev:
    targets:
      gke:
        namespace: demeter-dev
        adaptive:
          minimum: 1
          maximum: 5
    components:
      demeter:
        memory: 16G
        cpus: 4
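The manifest above selects the gke-prod overlay for the gke target. This page does not spell out how an overlay is combined with the base manifest, so the following sketch assumes a recursive merge in which overlay values win and keys the overlay does not mention are kept. `deep_merge` is a hypothetical helper; the Manifest Overlays page is the authoritative description of the actual rules.

```python
import copy


def deep_merge(base, override):
    """Return a copy of base with override applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the overlay replace the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Subset of the example manifest, written as plain dicts.
manifest = {
    "components": {
        "demeter": {"cpus": 8, "memory": "32G", "env": {"DEMETER_DATA": "/data"}},
        "postprocess": {"cpus": 4, "memory": "16G"},
    },
    "overlays": {
        "gke-prod": {
            "components": {
                "demeter": {"memory": "64G", "cpus": 16},
                "postprocess": {"memory": "32G"},
            }
        },
    },
}

base = {"components": manifest["components"]}
effective = deep_merge(base, manifest["overlays"]["gke-prod"])
print(effective["components"]["demeter"])
print(effective["components"]["postprocess"])
```

Under this assumption, demeter ends up with 16 CPUs and 64G of memory while keeping its `env`, and postprocess keeps its 4 CPUs but moves to 32G.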
How It Works
The provider creates a KubeCluster via the Dask Kubernetes Operator.
Each manifest component becomes a separate worker group with its own resource requests and container image.
If adaptive is configured, the cluster auto-scales within the specified bounds.
Worker groups are labeled with component names for observability.
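The steps above can be sketched directly against dask-kubernetes. This is an illustration of one plausible mapping, not Scalable's actual implementation: `worker_group_kwargs` and `launch` are hypothetical names, the translation of `cpus` and `memory` into Kubernetes resource requests is an assumption, and the labeling step is omitted. The calls to `KubeCluster`, `add_worker_group`, and `adapt` come from `dask_kubernetes.operator`.

```python
def worker_group_kwargs(name, component, default_image):
    """Translate one manifest component into keyword arguments for a worker group."""
    kwargs = {
        "name": name,
        # Fall back to the target-level image when the component sets none.
        "image": component.get("image", default_image),
        # Kubernetes quantities are strings; "8" and "32G" are both valid.
        "resources": {
            "requests": {
                "cpu": str(component["cpus"]),
                "memory": component["memory"],
            }
        },
    }
    if "env" in component:
        kwargs["env"] = dict(component["env"])
    return kwargs


def launch(target, components):
    """Create the cluster, add one worker group per component, then enable adaptive scaling."""
    # Imported lazily so the pure helper above works without dask-kubernetes installed.
    from dask_kubernetes.operator import KubeCluster

    cluster = KubeCluster(
        name="demeter-lulcc-gke",
        namespace=target.get("namespace", "default"),
        image=target["image"],
    )
    for name, component in components.items():
        cluster.add_worker_group(
            **worker_group_kwargs(name, component, target["image"])
        )
    if "adaptive" in target:
        cluster.adapt(**target["adaptive"])
    return cluster


print(worker_group_kwargs(
    "demeter",
    {"cpus": 8, "memory": "32G", "env": {"DEMETER_DATA": "/data"}},
    "gcr.io/my-project/demeter:2.0.1",
))
```

Only `worker_group_kwargs` runs in this example; `launch` needs a reachable cluster with the operator installed.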
Validation
Run scalable validate to check your Kubernetes manifest:
scalable validate scalable.yaml --target gke
Add --dry-run to plan a run without executing it:
scalable run scalable.yaml --target gke --dry-run
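Both commands are easy to wire into a CI step or pre-commit hook. The sketch below wraps them with `subprocess`; `scalable_argv` and `check_manifest` are hypothetical helper names, and the sketch assumes the CLI signals failure with a non-zero exit status, which this page does not confirm.

```python
import subprocess


def scalable_argv(action, manifest="scalable.yaml", target="gke", dry_run=False):
    """Build the argv for the two CLI invocations shown above."""
    if action not in ("validate", "run"):
        raise ValueError(f"unsupported action: {action}")
    argv = ["scalable", action, manifest, "--target", target]
    if dry_run:
        if action != "run":
            raise ValueError("--dry-run is only shown for 'scalable run'")
        argv.append("--dry-run")
    return argv


def check_manifest(manifest="scalable.yaml", target="gke"):
    """Validate, then dry-run; stop at the first failing command."""
    for argv in (
        scalable_argv("validate", manifest, target),
        scalable_argv("run", manifest, target, dry_run=True),
    ):
        # Assumes a non-zero exit status on failure; check=True then raises.
        subprocess.run(argv, check=True)


print(" ".join(scalable_argv("run", dry_run=True)))
```

Calling `check_manifest()` raises `subprocess.CalledProcessError` as soon as one of the two commands fails, which is usually enough to fail a CI job.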
See Also
Provider Abstraction — Full provider abstraction documentation
Cloud Providers — AWS and GCP cloud providers
Manifest Overlays — Environment-specific configuration overrides