Tutorial 1: Getting Started with Scalable¶
What You Will Learn¶
By the end of this tutorial you will:
Install Scalable and verify its dependencies are satisfied.
Understand the project layout Scalable expects.
Create a minimal
scalable.yamlmanifest.Validate, plan, and execute a local workflow end-to-end.
Inspect the telemetry output of a successful run.
This tutorial establishes the foundation for every subsequent tutorial in the series. If you are new to Scalable, start here.
Prerequisites¶
Python 3.11 or later (3.12 and 3.13 are also supported).
A working
pip(or equivalent package manager such asuv).Familiarity with the command line.
Basic Python fluency (functions, imports, virtual environments).
No HPC cluster, Docker installation, or cloud credentials are required for this tutorial — we run everything locally.
Where this is going
This tutorial uses a deliberately trivial hello-scalable project so
you can verify the install in a few minutes. Tutorials 2–10 then graduate
to a realistic running example: downscaling
Demeter land-use / land-cover
projections across many GCAM scenarios in parallel
(project demeter-lulcc, components preprocess, demeter,
postprocess). When you’re ready to actually execute that pipeline,
Tutorial Setup: Run the Demeter Example End-to-End walks through the one-time setup
(pip install -e capabilities/demeter +
demeter.get_package_data("./demeter_data")) and the canonical
scalable.demeter.yaml.
Step 1: Install Scalable¶
Create a fresh virtual environment and install Scalable from PyPI:
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install scalable
Verify the installation:
scalable --help
Expected output (abbreviated):
usage: scalable [-h] {validate,plan,run,report,advise,...} ...
Scalable CLI — orchestrate distributed workflows.
positional arguments:
{validate,plan,run,report,advise,...}
Why this matters: The scalable CLI entry point is the primary interface
for validating manifests, planning executions, and generating reports. You can
also drive everything from Python — the CLI wraps the same public API.
Note
If your shell cannot find the scalable command after installation, ensure
that the scripts directory for your virtual environment is on PATH.
Step 2: Create a Project Directory¶
Scalable workflows live in a dedicated project directory. The minimal layout looks like this:
my-project/
├── scalable.yaml # Manifest (single source of truth)
└── workflow.py # Your Python workflow script
Create it:
mkdir my-project && cd my-project
Step 3: Write a Minimal Manifest¶
The manifest (scalable.yaml) is a declarative document describing your
project, execution targets, compute components, and task bindings. Create the
file:
# scalable.yaml
version: 1
project:
name: hello-scalable
targets:
local:
provider: local
max_workers: 2
threads_per_worker: 1
processes: false
containers: none
components:
analysis:
cpus: 1
memory: 1G
tasks:
run_analysis:
component: analysis
Let’s unpack this:
versionSchema version. Currently
1is the only supported version.project.nameA human-readable project identifier. It is embedded in telemetry run IDs and artifact paths.
targetsNamed execution environments. Here we define a single target called
localusing the built-inLocalProvider. The provider spawns a DaskLocalClusterunder the hood with the specified worker configuration.componentsResource profiles for your workloads. Each component declares CPU and memory requirements. Components map to Dask worker resource annotations.
tasksNamed work units that bind to a component. Tasks are the scheduling atoms — when you
submita function you associate it with a task definition.
Trade-off note: Setting processes: false runs Dask workers as threads
within a single process. This is fast to start and avoids serialization overhead
but provides no memory isolation between tasks. For CPU-bound workloads or tasks
that hold the GIL, set processes: true.
Step 4: Validate the Manifest¶
Before running anything, validate the manifest for structural and semantic errors:
scalable validate ./scalable.yaml
Expected output:
✓ Manifest is valid (0 errors, 0 warnings)
If you introduce a typo — say providr: local — validation will report:
ERROR targets.local: unknown provider 'providr'
The validator checks:
Required top-level keys (
version,project).Component key spelling (
cpus,memory,image, etc.).Task-component references resolve.
Provider-specific option constraints (e.g.,
max_workersmust be a positive integer for the local provider).
Step 5: Plan the Execution¶
Planning produces a dry-run execution plan without allocating real resources:
scalable plan ./scalable.yaml --target local --dry-run --output plan.json
Plan created for target 'local' (provider: local)
Workers: 2 × analysis (1 cpu, 1G memory)
Manifest lock: sha256:a3b8f1...
Inspect the generated plan.json:
{
"target_name": "local",
"provider": "local",
"manifest_lock": "sha256:a3b8f1...",
"scale_plan": {
"analysis": {
"count": 2,
"resources": {"cpus": 1, "memory": "1G"}
}
}
}
Architectural note: The manifest_lock is a content-addressable hash of
the expanded manifest. It guarantees reproducibility — if two plans share the
same lock fingerprint they were derived from byte-identical configurations
(modulo environment variable expansion).
Step 6: Write a Workflow Script¶
Create workflow.py:
"""A minimal Scalable workflow."""
from scalable import ScalableSession
def analyze(scenario_id: int) -> dict:
"""Simulate an expensive computation."""
import time
time.sleep(1)
return {"scenario": scenario_id, "result": scenario_id * 42}
def main():
# Initialize a session from the manifest
session = ScalableSession.from_yaml("./scalable.yaml", target="local")
# Plan (validates + computes resource allocation)
plan = session.plan(dry_run=True)
print(f"Manifest lock: {plan.manifest_lock}")
# Start the cluster and get a client
client = session.start(plan)
# Submit tasks tagged to the 'analysis' component
futures = []
for i in range(5):
fut = client.submit(analyze, i, tag="analysis")
futures.append(fut)
# Gather results
results = client.gather(futures)
for r in results:
print(r)
# Tear down
session.close()
if __name__ == "__main__":
main()
Step 7: Run the Workflow¶
Execute the workflow using the CLI:
scalable run ./scalable.yaml --target local --workflow workflow.py
Or run it directly with Python:
python workflow.py
Expected output:
Manifest lock: sha256:a3b8f1...
{'scenario': 0, 'result': 0}
{'scenario': 1, 'result': 42}
{'scenario': 2, 'result': 84}
{'scenario': 3, 'result': 126}
{'scenario': 4, 'result': 168}
What happened under the hood:
ScalableSession.from_yamlparsed the manifest, resolved environment variables, and built aDeploymentSpec.session.plan()validated the spec and computed aDryRunPlanincluding worker counts and resource annotations.session.start()instantiated aLocalProvider, which created a DaskLocalClusterwith 2 workers each annotated with 1 CPU / 1 GB.Each
client.submit(..., tag="analysis")routed the function to workers matching theanalysiscomponent’s resource profile.session.close()shut down workers and finalized telemetry.
Step 8: Inspect Telemetry¶
Every manifest-driven run records structured telemetry. Check what was persisted:
scalable report --latest
Expected output:
Run: run-20260520T035200Z-hello-scalable-a1b2c3d4
Status: completed
Target: local (provider: local)
Duration: 6.2s
Tasks: 5 submitted, 5 succeeded, 0 failed
The telemetry lives under .scalable/runs/<run-id>/:
.scalable/runs/run-20260520T035200Z-hello-scalable-a1b2c3d4/
├── run.json # Run metadata
├── tasks.jsonl # Per-task lifecycle events
├── resources.jsonl # Resource utilization snapshots
└── workers.jsonl # Worker lifecycle events
These structured records power the resource advising and ML optimization features covered in later tutorials.
Step 9: Environment Variables¶
Scalable is configured through environment variables for deployment flexibility. The most relevant ones for getting started:
Variable |
Default |
Description |
|---|---|---|
|
|
Default manifest path (avoids passing |
|
(unset) |
Default target override |
|
|
Disk cache directory for |
|
|
Set to |
|
(unset) |
Set to |
Example — run with debug logging and a custom cache directory:
export SCALABLE_LOG_LEVEL=DEBUG
export SCALABLE_CACHE_DIR=/tmp/scalable-cache
python workflow.py
Troubleshooting¶
- “scalable: command not found”
Ensure your virtual environment is activated and the scripts directory is on
PATH. On some systems you may needpython -m scalable.cli.mainas a fallback.- “ModuleNotFoundError: No module named ‘dask’”
Scalable’s core dependencies (
dask,distributed) should be installed automatically. If missing, runpip install scalableagain in your environment.- Manifest validation reports “unknown provider”
Double-check the
provider:value matches a built-in name (local,slurm) or that you have installed the relevant extra (scalable[cloud],scalable[kubernetes]).- Tasks complete but results are None
Ensure your function returns a value and that all data passed as arguments is serializable by
dill(Scalable’s default serializer). Lambda functions and module-level functions are fine; nested closures over non-picklable objects will fail silently.
Next Steps¶
Now that you have a working local workflow:
Tutorial 2: Mastering the Manifest System — Deep-dive into the manifest schema, environment variable expansion, and multi-target configurations.
Tutorial 4: Performance Optimization and Caching — Add the
@cacheabledecorator to skip redundant computation across retries.Tutorial 6: Monitoring and Observability with Telemetry — Understand the telemetry data model and generate custom reports.