Beginner Tutorial 8: Container Orchestration with Kubernetes

The Big Picture

In Tutorial 5, you learned about containers — packaged software environments that run anywhere. But what happens when you need to run 50 containers across multiple machines? Who decides which machine runs which container? What happens when a container crashes? What about scaling up and down?

Kubernetes (often abbreviated “K8s”) is the answer. It’s a platform for managing containers at scale — automatically placing them on machines, restarting them when they fail, and scaling them up or down based on demand.

This tutorial explains Kubernetes from first principles and shows how Scalable uses it to run distributed workflows on container infrastructure.

What You Will Learn

By the end of this tutorial you will:

  • Understand what Kubernetes is and what problem it solves.

  • Know the key K8s concepts: pods, nodes, namespaces, operators.

  • Understand how the Dask Kubernetes Operator works.

  • Configure Scalable’s Kubernetes provider.

  • Understand resource requests, limits, and quotas.

  • Know when Kubernetes is appropriate vs. overkill.

Prerequisites

Key Concepts Explained

💡 Key Concept: What is Container Orchestration?

You know what a container is (packaged software). Container orchestration is the automation of deploying, managing, and scaling containers across multiple machines.

Without orchestration:

  • Manually decide which server runs which container

  • Manually restart containers that crash

  • Manually add/remove containers when load changes

  • Manually route traffic to healthy containers

With orchestration (Kubernetes):

  • You declare: “I want 10 copies of this container”

  • K8s decides where to put them (across available machines)

  • K8s automatically restarts crashed containers

  • K8s auto-scales based on demand

  • K8s routes traffic to healthy instances

Analogy: Container orchestration is like an air traffic controller for containers. You don’t tell each plane exactly which runway and gate to use — the controller optimally assigns resources based on the current situation.

💡 Key Concept: What is Kubernetes?

Kubernetes (from Greek: κυβερνήτης, “helmsman”) is an open-source container orchestration platform originally developed by Google.

It manages:

  • Where containers run (scheduling across machines)

  • How many containers run (scaling)

  • Healthy containers stay running (self-healing)

  • Network connectivity between containers (service discovery)

  • Storage for containers (persistent volumes)

Kubernetes is the industry standard — it runs in AWS (EKS), Google Cloud (GKE), Azure (AKS), and on-premise. Scalable uses it as one of its deployment providers.

💡 Key Concept: Pods

A pod is the smallest deployable unit in Kubernetes — a group of one or more containers that share network and storage.

Most commonly, a pod = one container. But sometimes related containers are grouped (e.g., your app container + a logging sidecar).

┌─── Pod ─────────────────────┐
│                             │
│  ┌───────────────────────┐  │
│  │  Container            │  │
│  │  (your Dask worker)   │  │
│  └───────────────────────┘  │
│                             │
│  Shared: IP address,        │
│  storage volumes            │
└─────────────────────────────┘

In Scalable’s context: each Dask worker runs in its own pod.

💡 Key Concept: Nodes

A node is a physical or virtual machine in the Kubernetes cluster. Pods are scheduled onto nodes.

Kubernetes Cluster
├── Node 1 (machine with 16 CPUs, 64GB RAM)
│   ├── Pod A (your worker, 4 CPU, 16GB)
│   ├── Pod B (your worker, 4 CPU, 16GB)
│   └── Pod C (system pod)
├── Node 2 (machine with 16 CPUs, 64GB RAM)
│   ├── Pod D (your worker, 4 CPU, 16GB)
│   └── Pod E (your worker, 4 CPU, 16GB)
└── Node 3 (machine with 8 CPUs, 32GB RAM)
    └── Pod F (scheduler pod)

The Kubernetes scheduler decides which node each pod runs on, based on available resources and constraints.

💡 Key Concept: Namespaces

A namespace is an isolation boundary within a Kubernetes cluster. Different teams or projects use different namespaces to avoid conflicts.

Think of namespaces like departments in a building:

  • team-energy namespace — your team’s pods

  • team-water namespace — another team’s pods

  • system namespace — cluster infrastructure

Resources in one namespace can’t accidentally interfere with another. Resource quotas can limit how much CPU/memory each namespace uses.

💡 Key Concept: Operators (Kubernetes Extension)

A Kubernetes Operator is a program that extends Kubernetes to manage complex applications automatically. It encodes domain-specific knowledge about how to deploy, scale, and maintain an application.

The Dask Kubernetes Operator:

  • Knows how to create Dask clusters (scheduler + workers)

  • Manages worker scaling automatically

  • Handles upgrades and restarts

  • Integrates with Kubernetes native features (quotas, monitoring)

Without an operator, you’d need to manually create pods for the scheduler, pods for each worker, configure networking between them, and handle failures. The operator does all this for you.

💡 Key Concept: kubectl

kubectl (pronounced “cube-control” or “cube-C-T-L”) is the command-line tool for interacting with Kubernetes clusters.

# List running pods in your namespace
kubectl get pods -n team-energy

# See details about a specific pod
kubectl describe pod worker-abc123

# View pod logs (stdout/stderr)
kubectl logs worker-abc123

# Delete a pod (Kubernetes will restart it if managed)
kubectl delete pod worker-abc123

Think of kubectl as the Kubernetes equivalent of docker commands, but for a whole cluster instead of a single machine.

💡 Key Concept: Resource Requests vs. Limits

In Kubernetes, each pod declares:

Requests — minimum guaranteed resources:

“I need at least 2 CPUs and 4GB RAM to function”

Limits — maximum allowed resources:

“Never let me use more than 4 CPUs or 8GB RAM”

resources:
  requests:
    cpu: "2"
    memory: "4Gi"
  limits:
    cpu: "4"
    memory: "8Gi"

Why both? Requests are used for scheduling (Kubernetes finds a node with enough free capacity). Limits prevent runaway containers from consuming all resources on a node and affecting other pods.

💡 Key Concept: Helm Charts

Helm is a package manager for Kubernetes (like pip for Python or apt for Linux). A Helm chart is a package of Kubernetes configuration files.

Instead of writing dozens of YAML files to deploy an application, you install a chart:

helm install dask-operator dask/dask-kubernetes-operator

Charts can be versioned, shared, and configured with values files.

Step 1: Kubernetes Architecture for Scalable

When you use Scalable’s Kubernetes provider, this is what gets created:

Kubernetes Cluster
└── Your Namespace (team-energy)
    ├── Dask Scheduler Pod (1x)
    │   └── Container: dask-scheduler
    │       Port 8786 (client connections)
    │       Port 8787 (dashboard)
    ├── Dask Worker Pods (N×)
    │   └── Container: your-image
    │       Runs your Python code
    │       Connected to scheduler
    └── Client (your script, outside cluster)
        └── Connects to scheduler via port-forward or ingress

The Dask Kubernetes Operator manages all of this based on a DaskCluster custom resource that Scalable creates from your manifest.

Step 2: Configuring the Kubernetes Provider

# scalable.yaml
targets:
  k8s:
    provider: kubernetes
    namespace: team-energy
    image: ghcr.io/my-org/demeter:2.0.1
    adaptive:
      minimum: 2
      maximum: 20
    resources:
      requests:
        cpu: "4"
        memory: "16Gi"
      limits:
        cpu: "4"
        memory: "16Gi"

What each setting does:

namespace: team-energy

Deploy into this Kubernetes namespace. Must exist and you must have permissions to create pods there.

image: ghcr.io/my-org/demeter:2.0.1

Container image for worker pods. Must contain your code, Python, and all dependencies (including Scalable itself).

adaptive: {minimum: 2, maximum: 20}

Start with 2 worker pods, scale up to 20 based on queue depth.

resources

CPU and memory for each worker pod. Maps directly to Kubernetes resource specifications.

Step 3: The Deployment Lifecycle

1. You run: scalable run ./scalable.yaml --target k8s
2. Scalable creates a DaskCluster custom resource in your namespace
3. The Dask Operator sees the resource and creates:
   - 1 scheduler pod
   - N worker pods (starting at adaptive.minimum)
4. Your client connects to the scheduler
5. Tasks are submitted and executed on worker pods
6. Adaptive scaling adds/removes worker pods based on load
7. When complete, the DaskCluster is deleted
8. All pods are cleaned up

Under the Hood: Custom Resources

Kubernetes has built-in resource types (Pod, Service, Deployment). But you can also define Custom Resource Definitions (CRDs) — new types that Kubernetes doesn’t know about natively.

The Dask Operator defines a DaskCluster CRD. When you create a DaskCluster resource, the operator watches for it and creates the necessary pods, services, and configurations automatically.

This is the declarative pattern again: you declare “I want a DaskCluster with these specs” and the operator makes it happen.

Step 4: When to Use Kubernetes

✅ Good fit for Kubernetes

❌ Overkill / Wrong tool

Team sharing a cluster for multiple projects

Single user on a laptop

Need for resource isolation between teams

Simple batch job on one machine

Auto-scaling based on demand

Fixed workload with known size

Long-running services + batch jobs

One-off analysis

Already have K8s infrastructure

Don’t have K8s (use cloud Fargate instead)

Need reproducible deployment

Rapid development iteration

🤔 Think About It

Kubernetes adds complexity. You need to:

  • Maintain a cluster (or pay for a managed one)

  • Build and push container images

  • Configure namespaces, quotas, and RBAC

  • Learn kubectl and K8s concepts

For many scientific workflows, the local provider (development) + cloud Fargate (production) is simpler than Kubernetes. K8s shines when you have a shared cluster already or need fine-grained resource management.

Step 5: Working with Container Images

Your code runs inside containers in K8s. The image must contain everything:

# Dockerfile
FROM python:3.12-slim

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc && rm -rf /var/lib/apt/lists/*

# Install Python packages
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy your workflow code
COPY . /app
WORKDIR /app

# Default command (overridden by Dask worker command)
CMD ["python", "-m", "distributed.cli.dask_worker"]

Build and push:

# Build the image
docker build -t ghcr.io/my-org/demeter:2.0.1 .

# Push to registry (GitHub Container Registry in this example)
docker push ghcr.io/my-org/demeter:2.0.1

💡 Key Concept: Image Pull Policy

When Kubernetes creates a pod, it needs to download (pull) the container image from the registry. Pull policies control when:

  • Always — always pull the latest (good for development)

  • IfNotPresent — use cached version if available (faster)

  • Never — never pull (image must be pre-loaded)

For production, use specific image tags (v1.2.3) rather than latest to ensure reproducibility.

Step 6: Monitoring Kubernetes Deployments

# Watch pods come up
kubectl get pods -n team-energy -w

# Output:
# NAME                          READY   STATUS    RESTARTS   AGE
# dask-scheduler-abc123         1/1     Running   0          30s
# dask-worker-def456            1/1     Running   0          25s
# dask-worker-ghi789            1/1     Running   0          25s

# Check resource usage
kubectl top pods -n team-energy

# View worker logs
kubectl logs dask-worker-def456 -n team-energy

# Access Dask dashboard (port-forward to localhost)
kubectl port-forward svc/dask-scheduler 8787:8787 -n team-energy
# Then open http://localhost:8787 in your browser

Common Questions

Q: Do I need to be a Kubernetes expert to use Scalable with K8s?

No. Scalable abstracts most K8s complexity. You need to know:

  • Your namespace name

  • Your container image URI

  • Basic kubectl commands for debugging

The Dask Operator handles pod creation, scaling, and cleanup.

Q: What’s the difference between Kubernetes and Docker?

  • Docker = creates and runs individual containers on one machine

  • Kubernetes = manages many containers across many machines

Docker builds the containers; Kubernetes orchestrates them.

Q: How does auto-scaling work in Kubernetes?

The Dask Operator watches queue depth (pending tasks). When tasks queue up, it creates more worker pods. When workers are idle, it removes them. This maps to the adaptive configuration in your manifest.

Q: What happens if a node (machine) fails?

Kubernetes detects the failure and reschedules pods from the failed node onto healthy nodes. Combined with Scalable’s retry logic, tasks on the failed node are re-executed on new workers.

Q: Is Kubernetes free?

Kubernetes itself is open-source (free). But you pay for:

  • The machines (nodes) that form the cluster

  • Managed K8s services (EKS, GKE, AKS charge a management fee)

  • Networking and storage

On-premise clusters have hardware and maintenance costs instead.

What You Learned

Term

Definition

Container Orchestration

Automating deployment, management, and scaling of containers

Kubernetes (K8s)

Industry-standard container orchestration platform

Pod

Smallest deployable unit in K8s (usually = one container)

Node

Physical or virtual machine in the K8s cluster

Namespace

Isolation boundary for resources within a cluster

Operator

K8s extension that manages complex applications automatically

kubectl

Command-line tool for interacting with Kubernetes

Helm

Package manager for Kubernetes applications

Resource Requests

Minimum guaranteed CPU/memory for a pod

Resource Limits

Maximum allowed CPU/memory for a pod

Custom Resource (CRD)

User-defined extension to Kubernetes resource types

Image Pull

Downloading a container image from a registry

Next Steps

You now understand Kubernetes fundamentals and how Scalable uses it for container-based distributed workflows.