Tutorials

Hands-on, step-by-step guides that walk you through Scalable’s features from first installation to advanced production workflows. Each tutorial builds on a realistic scenario, includes full code examples with expected output, and ends with suggested next steps.

Beginner Tutorials

Tip

New to Scalable or distributed computing? Start with the beginner tutorials above. They cover the same 10 topics as the advanced tutorials below but explain every concept from first principles, so no prior experience with distributed systems, the cloud, or containers is required. Once you’re comfortable with the concepts, move on to the advanced tutorials for production patterns.

Advanced Tutorials

Prerequisites by Tutorial

Tutorial | Install Extra                    | External Requirements
-------- | -------------------------------- | ----------------------------
1–4      | pip install scalable             | None (local only)
5        | pip install scalable[cloud]      | AWS/GCP credentials
6–7      | pip install scalable             | None
8        | pip install scalable[kubernetes] | Kubernetes cluster + kubectl
9        | pip install scalable[ml]         | 5+ telemetry runs
10       | pip install scalable[ai]         | None (optional: LLM API key)

Conventions Used

Throughout these tutorials:

  • All code examples use Python 3.11+ syntax.

  • Shell commands assume a Unix-like environment (macOS/Linux). Windows equivalents are noted where they differ.

  • Running example. Tutorials 2–10 use the Demeter land-use / land-cover disaggregation model (cloned into capabilities/demeter) as the running example. The project name is demeter-lulcc and the components are preprocess, demeter, and postprocess. The canonical pipeline is:

    prepare_demeter_config -> run_demeter_scenario (×N) -> aggregate_demeter_outputs
    

    Tutorial 1 keeps a deliberately trivial hello-scalable project so installation can be verified before the larger story begins. See Tutorial Setup: Run the Demeter Example End-to-End for the one-time setup that makes the examples actually executable (scalable.demeter.yaml + docs/examples/workflow_demeter.py).

  • Environment variables use the ${VAR:-default} pattern for portability.

  • Expected output blocks show representative output — exact values (timestamps, hashes, run IDs) will differ on your machine.
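Two of the conventions above, the fan-out/fan-in shape of the canonical pipeline and the ${VAR:-default} pattern, can be sketched together in plain Python. This is an illustration under stated assumptions, not Scalable's API: the three functions only borrow the names of the pipeline steps and contain placeholder logic, `env_default` is a hypothetical helper, and `DEMETER_SCENARIOS` is a made-up variable name. A thread pool stands in for whatever executor actually runs the N scenarios.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def env_default(name: str, default: str) -> str:
    """Mimic shell ${NAME:-default}: fall back when NAME is unset OR empty."""
    value = os.environ.get(name, "")
    return value if value else default


# Stand-ins: these share names with the pipeline steps but hold placeholder
# logic only. They are NOT the real Demeter or Scalable implementations.
def prepare_demeter_config(n_scenarios: int) -> list[dict]:
    """Produce one config per scenario (the single upstream step)."""
    return [{"scenario": i} for i in range(n_scenarios)]


def run_demeter_scenario(config: dict) -> dict:
    """Run one scenario (the step that is repeated N times)."""
    return {"scenario": config["scenario"], "status": "done"}


def aggregate_demeter_outputs(results: list[dict]) -> dict:
    """Collect all scenario results into one summary (the fan-in step)."""
    return {
        "runs": len(results),
        "all_done": all(r["status"] == "done" for r in results),
    }


def run_pipeline() -> dict:
    # DEMETER_SCENARIOS is a hypothetical variable name used for illustration.
    n = int(env_default("DEMETER_SCENARIOS", "3"))
    configs = prepare_demeter_config(n)
    # Fan out: one task per scenario; pool.map preserves input order.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run_demeter_scenario, configs))
    # Fan in: a single aggregation over all scenario results.
    return aggregate_demeter_outputs(results)


if __name__ == "__main__":
    print(run_pipeline())
```

With `DEMETER_SCENARIOS` unset or set to an empty string, the sketch falls back to three scenarios and prints `{'runs': 3, 'all_done': True}`. Treating an empty value like an unset one is what distinguishes `${VAR:-default}` from `${VAR-default}` in the shell, which is why the helper tests the value for emptiness rather than relying on the default argument of `os.environ.get` alone.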