Tutorial 10: AI-Assisted Workflow Composition

What You Will Learn

By the end of this tutorial you will:

  • Use the AI assistant suite to accelerate workflow development.

  • Onboard new model components with scalable init-component, using the real Demeter repository in capabilities/demeter as the input.

  • Diagnose run failures with scalable diagnose.

  • Generate human-readable explanations of execution plans.

  • Compose new workflows from natural language descriptions.

  • Migrate manifests between providers with scalable migrate.

  • Understand heuristic mode vs. LLM-enhanced mode.

Prerequisites

Scenario

Your team has just cloned the Demeter land-use / land-cover disaggregation model into capabilities/demeter and wants to wrap it as a Scalable component so a single GCAM run can be downscaled across many scenarios in parallel. You also need to (a) understand a failed Demeter run, (b) explain the execution plan to a stakeholder, (c) compose a multi-scenario workflow from a natural-language description, and (d) move the manifest from Slurm to Kubernetes for production. The AI assistants automate the tedious parts of all four tasks.

The pipeline you’ll build is the canonical one used everywhere else in this tutorial series:

prepare_demeter_config
     │  fan-out, one .ini per GCAM scenario
     ▼
run_demeter_scenario  ×  N
     │
     ▼
aggregate_demeter_outputs

Step 1: Heuristic vs. LLM Modes

All AI assistants work in two modes:

Heuristic mode (``--no-ai``, default when ``SCALABLE_AI_BACKEND=none``):

  • Uses deterministic rules, templates, and pattern matching.

  • No external API calls. Works offline.

  • Fast, reproducible, and auditable.

  • Best for CI/CD and automated pipelines.

LLM-enhanced mode (``SCALABLE_AI_BACKEND=openai`` or ``ollama``):

  • Augments heuristics with a language model for richer explanations and more creative workflow composition.

  • Requires API credentials and network access.

  • May produce varied output across invocations.

  • Best for interactive development and exploration.

Configure the backend:

# Heuristic only (default)
export SCALABLE_AI_BACKEND=none

# OpenAI
export SCALABLE_AI_BACKEND=openai
export SCALABLE_AI_MODEL=gpt-4
export OPENAI_API_KEY=sk-...

# Ollama (local)
export SCALABLE_AI_BACKEND=ollama
export SCALABLE_AI_MODEL=llama3
export SCALABLE_AI_ENDPOINT=http://localhost:11434
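
One way to picture how these variables select a mode is a small resolver over the environment. This is only an illustrative sketch: ``resolve_backend`` and ``BackendConfig`` are hypothetical names, not part of Scalable's API, and treating an unset ``SCALABLE_AI_BACKEND`` as ``none`` is an assumption based on the "default" note above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical container for the resolved AI backend settings."""
    backend: str                      # "none", "openai", or "ollama"
    model: Optional[str] = None
    endpoint: Optional[str] = None

    @property
    def heuristic_only(self) -> bool:
        return self.backend == "none"


def resolve_backend(env: Mapping[str, str], no_ai: bool = False) -> BackendConfig:
    """Pick heuristic or LLM-enhanced mode from environment variables."""
    # Assumption: an unset variable behaves like SCALABLE_AI_BACKEND=none.
    backend = env.get("SCALABLE_AI_BACKEND", "none").strip().lower()
    if no_ai or backend == "none":
        return BackendConfig(backend="none")
    if backend not in {"openai", "ollama"}:
        raise ValueError(f"unsupported SCALABLE_AI_BACKEND: {backend!r}")
    # Credentials are checked when the backend is resolved (call time).
    if backend == "openai" and not env.get("OPENAI_API_KEY"):
        raise RuntimeError("SCALABLE_AI_BACKEND=openai but no OPENAI_API_KEY")
    return BackendConfig(
        backend=backend,
        model=env.get("SCALABLE_AI_MODEL"),
        endpoint=env.get("SCALABLE_AI_ENDPOINT"),
    )
```

Calling ``resolve_backend(os.environ, no_ai=True)`` returns the heuristic configuration whatever the environment says, which mirrors how ``--no-ai`` is used throughout this tutorial.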

Step 2: Onboarding the Demeter Component

The init-component command analyzes a model directory and generates a component configuration. Point it at the cloned Demeter repo:

scalable init-component ./capabilities/demeter --name demeter --no-ai
Analyzing ./capabilities/demeter...
Detected:
  Language: Python (3.9+)
  Dependencies: configobj, numpy, pandas, scipy, requests,
                gcamreader, xarray, netcdf4, matplotlib
  Entry point: demeter.run_model(config_file=...)
  Container: Dockerfile.scalable (python:3.11-slim)
  Estimated resources: 4 CPUs, 16G memory

Generated component configuration:

components:
  demeter:
    image: ghcr.io/jgcri/demeter:2.0.1
    runtime: apptainer
    cpus: 4
    memory: 16G
    tags: [lulcc, downscaling, gcam]
    mounts:
      ./demeter_data: /data
    env:
      DEMETER_DATA: /data

Suggested task binding:

tasks:
  run_demeter_scenario:
    component: demeter
    cache: true
    outputs:
      output_dir: dir

Written to: ./capabilities/demeter/scalable-component.yaml

What the analyzer checks (heuristic mode):

  • Language detection — setup.py and capabilities/demeter/demeter/_version.py identify this as a Python 3.9+ package.

  • Dependency scanning — requirements.txt is parsed verbatim.

  • Resource estimation — known Demeter profiles plus the input-data size (demeter_data/ weighs in around several hundred MB) determine the CPU/memory defaults.

  • Container image inference — when a Dockerfile or Dockerfile.scalable is present, the analyzer suggests an image tag matching the package version (2.0.1).

The same workflow applies to non-Python models: a directory containing an R script with a DESCRIPTION file would produce Language: R (via rpy2), and a compiled binary like GridLAB-D would be flagged Language: compiled with the system’s Makefile parsed for dependencies. The Demeter repo just happens to be the convenient real example shipped with Scalable.
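
To make the marker-file idea concrete, here is a minimal sketch of how top-level files could drive language detection and dependency scanning. It is not Scalable's implementation: ``detect_language_sketch``, ``read_requirements``, and the marker table (including ``pyproject.toml``) are assumptions chosen to match the examples in this step.

```python
from pathlib import Path

# Hypothetical marker table, checked in order; the first file found wins.
LANGUAGE_MARKERS = [
    ("setup.py", "python"),
    ("pyproject.toml", "python"),
    ("DESCRIPTION", "r"),
    ("Makefile", "compiled"),
]


def detect_language_sketch(root: str) -> str:
    """Return a language label based on marker files at the top level of root."""
    base = Path(root)
    for marker, language in LANGUAGE_MARKERS:
        if (base / marker).is_file():
            return language
    return "unknown"


def read_requirements(root: str) -> list[str]:
    """Return requirement lines as written, skipping blanks and comments."""
    req_file = Path(root) / "requirements.txt"
    if not req_file.is_file():
        return []
    lines = req_file.read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.strip().startswith("#")]
```

Because only the top level is inspected, pointing such a detector at a nested package directory that lacks ``setup.py`` would yield ``unknown``.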

Python API:

from scalable.ai import onboard_component

result = onboard_component(
    "./capabilities/demeter",
    name="demeter",
    no_ai=True,
)

print(result.component_yaml)
print(result.task_yaml)
print(result.recommendations)

Step 3: Diagnosing Run Failures

After a failed run, use the diagnostic assistant to identify root causes. The example below comes from a 50-scenario Demeter ensemble in which 13 of the 50 run_demeter_scenario tasks failed:

scalable diagnose --latest --no-ai
═══════════════════════════════════════════════════════════
Diagnosis: run-20260520T041500Z-demeter-lulcc-f8e2a1b3
═══════════════════════════════════════════════════════════

Status: failed (13 task failures across 50 run_demeter_scenario tasks)

Root Cause Analysis:
─────────────────────
PRIMARY: Memory exhaustion (8 of 13 failures)
  Pattern: Scenarios with spatial_resolution <= 0.1° exhaust the 16G
  memory limit during the kernel-density convolution step inside
  demeter.process.ProcessStep.
  Evidence: All OOM failures occur on tasks where the projected-LU
  CSV expands to > 500k grid cells.

SECONDARY: Missing input file (3 of 13 failures)
  Pattern: ``IOError: constraints/soil_quality.csv not found``.
  Evidence: All three failures share scenario-prefixed config paths
  where the constraints directory was never copied into the run dir.

TERTIARY: Serialization error (2 of 13 failures)
  Pattern: Demeter's ProcessStep returns a logger handle alongside the
  output dataframe; the handle is unpicklable.
  Evidence: ``TypeError`` in dill serialization at the
  ``run_demeter_scenario`` return boundary.

Recommendations:
─────────────────
1. Apply the ``k8s-fine-resolution`` overlay (``demeter.memory: 64G``)
   for any scenario with ``spatial_resolution <= 0.1°``.
2. Add ``constraints/`` to the ``mounts:`` block for the demeter
   component, or copy the file into ``demeter_data/`` before fan-out.
3. In ``run_demeter_scenario``, return only the output paths (already
   done in ``docs/examples/workflow_demeter.py``) and let Demeter's own
   logger close at the end of ``run_model``.

Programmatic access:

from scalable.ai import diagnose_run

result = diagnose_run(
    run_dir=".scalable/runs/run-20260520T041500Z.../",
    no_ai=True,
)

print(f"Root cause: {result.summary}")
for finding in result.findings:
    print(f"  [{finding.severity}] {finding.category}")
    print(f"    Pattern: {finding.pattern}")
    print(f"    Suggestion: {finding.suggestion}")
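
The pattern matching behind a heuristic diagnosis can be sketched as a list of regular-expression rules applied to each failure message, followed by a count per category. The rules, category names, and function names below are hypothetical and only loosely modelled on the three failure classes in the report above; Scalable's actual rule set is not shown in this tutorial.

```python
import re
from collections import Counter

# Hypothetical rules as (category, pattern); the first matching rule wins.
RULES = [
    ("memory_exhaustion", re.compile(r"MemoryError|\bOOM\b|out of memory", re.IGNORECASE)),
    ("missing_input", re.compile(r"(IOError|FileNotFoundError).*not found", re.IGNORECASE)),
    ("serialization", re.compile(r"TypeError.*(pickle|serializ)", re.IGNORECASE)),
]


def classify(message: str) -> str:
    """Map one failure message to a category, or 'unclassified'."""
    for category, pattern in RULES:
        if pattern.search(message):
            return category
    return "unclassified"


def rank_failures(messages):
    """Count failures per category, most frequent first."""
    return Counter(classify(m) for m in messages).most_common()
```

Ranking by frequency is what separates a primary cause from secondary ones; the same messages always produce the same ranking, which is the reproducibility property heuristic mode is meant to provide.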

Step 4: Explaining Execution Plans

Make execution plans understandable for non-technical stakeholders:

# Generate a plan
scalable plan ./docs/examples/scalable.demeter.yaml \
    --target aws --dry-run --output plan.json

# Explain it in plain language
scalable explain plan.json
Plan Explanation
═════════════════

This plan will execute the "demeter-lulcc" project on AWS (Fargate)
in the us-east-1 region.

What will happen:
1. A Dask cluster will be created with 1 preprocess worker (1 vCPU,
   2 GiB), up to 10 demeter workers (4 vCPU, 16 GiB each), and
   1 postprocess worker (2 vCPU, 8 GiB).
2. Workers auto-scale between 1 (minimum) and 10 (maximum) demeter
   pods based on the scenario backlog.
3. ``run_demeter_scenario`` results are cached per scenario so reruns
   skip already-completed scenarios.
4. Outputs land in s3://${ARTIFACT_STORAGE}/demeter-lulcc/.

Estimated cost: $4.82 (≈ 2.5 hours of Fargate compute + S3 storage
for 50 scenarios at 0.25° resolution)

Risks:
• Demeter's NetCDF writes are not atomic; partial writes on worker
  eviction will trigger a retry rather than corrupt downstream data.
  The cache key includes the input checksum, so retries are safe.
• Fargate cold-start adds 30-90s to the first task on each worker.

Python API:

from scalable.ai import explain_plan

result = explain_plan("plan.json")
print(result.explanation)
print(result.risks)
print(result.cost_summary)
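
The auto-scaling line in the explanation ("between 1 (minimum) and 10 (maximum) ... based on the scenario backlog") can be read as a clamp on the number of pending tasks. The function below is a deliberately simplified sketch under that reading; ``desired_workers`` and ``tasks_per_worker`` are hypothetical, and real adaptive schedulers typically weigh more than the raw backlog.

```python
import math


def desired_workers(backlog: int, minimum: int = 1, maximum: int = 10,
                    tasks_per_worker: int = 1) -> int:
    """Clamp the worker count implied by the backlog to [minimum, maximum]."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if tasks_per_worker < 1:
        raise ValueError("tasks_per_worker must be at least 1")
    wanted = math.ceil(backlog / tasks_per_worker)
    return max(minimum, min(maximum, wanted))
```

With the defaults, an empty backlog keeps one worker alive, a backlog of 4 asks for four workers, and a backlog of 50 is capped at ten.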

Step 5: Composing Workflows from Natural Language

The compose assistant is the most powerful of the suite: it generates complete workflow configurations from natural-language descriptions. Here we ask it to compose the canonical Demeter pipeline:

scalable compose "Run Demeter to downscale GCAM land allocations for \
  a list of scenarios in parallel, then aggregate per-scenario NetCDF \
  outputs into a summary table"
Generated workflow:
═══════════════════

# workflow.py
from scalable import ScalableSession, cacheable

@cacheable(return_type=str, scenario=str, base_config=str, output_dir=str)
def prepare_demeter_config(scenario: str, base_config: str, output_dir: str) -> str:
    """Materialize a Demeter .ini for one GCAM scenario."""
    # TODO: clone the base config and override [PARAMS] scenario + output_dir
    return f"{output_dir}/{scenario}/demeter_{scenario}.ini"

@cacheable(return_type=dict, config_file=str)
def run_demeter_scenario(config_file: str) -> dict:
    """Invoke demeter.run_model for a single scenario."""
    import demeter
    demeter.run_model(config_file=config_file, write_outputs=True)
    return {"scenario": config_file, "output_dir": str(...)}

@cacheable(return_type=dict, scenario_results=list)
def aggregate_demeter_outputs(scenario_results: list) -> dict:
    """Collect per-scenario output paths into a single summary."""
    # TODO: write a summary JSON / CSV
    return {"summary_path": "./outputs/demeter/_summary/scenarios.json"}

def main():
    session = ScalableSession.from_yaml("./scalable.yaml", target="local")
    client = session.start()

    scenarios = ["reference", "ssp1", "ssp2", "ssp5"]

    config_futures = [
        client.submit(prepare_demeter_config, s,
                      "./demeter_data/example/config_gcam_reference/config.ini",
                      "./outputs/demeter",
                      tag="preprocess")
        for s in scenarios
    ]
    config_files = client.gather(config_futures)

    scenario_futures = [
        client.submit(run_demeter_scenario, cfg, tag="demeter")
        for cfg in config_files
    ]
    scenario_results = client.gather(scenario_futures)

    summary = client.submit(
        aggregate_demeter_outputs,
        scenario_results,
        tag="postprocess",
    ).result()

    print(f"Pipeline complete: {summary['summary_path']}")
    session.close()

if __name__ == "__main__":
    main()

═══════════════════
Suggested manifest additions:

components:
  preprocess:
    cpus: 1
    memory: 2G
    tags: [lulcc, config]
  demeter:
    image: ghcr.io/jgcri/demeter:2.0.1
    cpus: 4
    memory: 16G
    tags: [lulcc, downscaling, gcam]
  postprocess:
    cpus: 2
    memory: 8G
    tags: [lulcc, aggregation]

tasks:
  prepare_demeter_config:
    component: preprocess
    cache: true
  run_demeter_scenario:
    component: demeter
    cache: true
  aggregate_demeter_outputs:
    component: postprocess
    cache: true

The generated workflow matches the hand-written reference in docs/examples/workflow_demeter.py. Heuristic mode arrives at this shape because Demeter ships with a clear run_model entry point and a predictable per-scenario fan-out.
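
The shape itself (fan out per scenario, run, then fan in) can be tried without a cluster. The sketch below uses the standard library's ``ThreadPoolExecutor`` as a stand-in for the session client; ``prepare``, ``run``, and ``aggregate`` are placeholder functions invented for illustration and do not call Demeter.

```python
from concurrent.futures import ThreadPoolExecutor


def prepare(scenario: str) -> str:
    # Stand-in for prepare_demeter_config: one config path per scenario.
    return f"./outputs/demeter/{scenario}/demeter_{scenario}.ini"


def run(config_file: str) -> dict:
    # Stand-in for run_demeter_scenario: return only plain, picklable values.
    return {"config": config_file, "status": "ok"}


def aggregate(results: list) -> dict:
    # Stand-in for aggregate_demeter_outputs: fan-in over all scenario results.
    return {"count": len(results), "configs": [r["config"] for r in results]}


def run_pipeline(scenarios: list, max_workers: int = 4) -> dict:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with scenarios.
        config_files = list(pool.map(prepare, scenarios))
        results = list(pool.map(run, config_files))
    return aggregate(results)


if __name__ == "__main__":
    print(run_pipeline(["reference", "ssp1", "ssp2", "ssp5"]))
```

Returning plain dictionaries from the per-scenario step is the same design choice the diagnosis in Step 3 recommends: nothing unpicklable crosses the task boundary.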

Python API for programmatic composition:

from scalable.ai import compose_workflow

result = compose_workflow(
    "Run Demeter for a list of GCAM scenarios in parallel and "
    "aggregate the NetCDF outputs"
)

print(result.workflow_code)
print(result.manifest_additions)
print(result.dependencies)

Step 6: Migrating Between Providers

Move your Demeter pipeline from Slurm to Kubernetes:

scalable migrate docs/examples/scalable.demeter.yaml \
    --to-provider kubernetes
Migration: slurm → kubernetes
══════════════════════════════

Changes required:
1. Target 'hpc' → new target 'k8s'
   - Remove: queue, account, walltime, interface
   - Add: namespace, image, adaptive

2. The demeter component already has an image
   (ghcr.io/jgcri/demeter:2.0.1); no change needed.
   preprocess and postprocess inherit the demeter image since they're
   lightweight Python; no per-component image required.

3. Environment / mount changes:
   - Apptainer bind mount ``./demeter_data:/data`` → PVC mount
     ``demeter-data-pvc:/data`` (or GCS bucket).
   - DEMETER_DATA env var stays the same.
   - Slurm walltime is replaced by pod ``activeDeadlineSeconds``.

Generated manifest:

targets:
  k8s:
    provider: kubernetes
    namespace: demeter-prod
    image: ghcr.io/jgcri/demeter:2.0.1
    adaptive:
      minimum: 2
      maximum: 20

components:
  demeter:
    image: ghcr.io/jgcri/demeter:2.0.1
    cpus: 4
    memory: 16G
    tags: [lulcc, downscaling, gcam]
    env:
      DEMETER_DATA: /data
    volume_mounts:
      - name: demeter-data
        mountPath: /data
    volumes:
      - name: demeter-data
        persistentVolumeClaim:
          claimName: demeter-data-pvc

  postprocess:
    cpus: 2
    memory: 8G
    tags: [lulcc, aggregation]

Migration notes:
• Mount ``./demeter_data`` lives on the laptop / NFS scratch in the
  Slurm version; on Kubernetes you must create ``demeter-data-pvc``
  ahead of time (or switch to a GCS bucket via ``fsspec``).
• The HPC overlay (``demeter.memory: 64G``) is preserved so a single
  k8s overlay (``k8s-fine-resolution``) can re-apply it for fine-
  resolution scenarios.

Python API:

from scalable.ai import migrate_manifest

result = migrate_manifest(
    "docs/examples/scalable.demeter.yaml",
    to_provider="kubernetes",
)

print(result.migrated_yaml)
print(result.changes_summary)
print(result.migration_notes)
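
The target rewrite listed under "Changes required" amounts to dropping the Slurm-only keys and adding the Kubernetes ones. The sketch below shows that transformation on plain dictionaries. It is an assumption-laden illustration: ``migrate_target`` is a hypothetical helper, ``active_deadline_seconds`` is a made-up key name standing in for the pod ``activeDeadlineSeconds`` setting, and only the ``HH:MM:SS`` walltime form is handled.

```python
import copy

# Keys the migration report lists for removal from the Slurm target.
SLURM_ONLY_KEYS = ("queue", "account", "walltime", "interface")


def walltime_to_seconds(walltime: str) -> int:
    """Convert an 'HH:MM:SS' walltime string to seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def migrate_target(slurm_target: dict, namespace: str, image: str,
                   minimum: int, maximum: int) -> dict:
    """Return a Kubernetes-style target derived from a Slurm-style one."""
    target = copy.deepcopy(slurm_target)   # leave the caller's dict untouched
    walltime = target.get("walltime")
    for key in SLURM_ONLY_KEYS:
        target.pop(key, None)
    target.update(
        provider="kubernetes",
        namespace=namespace,
        image=image,
        adaptive={"minimum": minimum, "maximum": maximum},
    )
    if walltime is not None:
        # Hypothetical key name; stands in for the pod deadline setting.
        target["active_deadline_seconds"] = walltime_to_seconds(walltime)
    return target
```

For example, a walltime of ``02:30:00`` becomes a deadline of 9000 seconds.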

Step 7: Integration into Development Workflow

Combine AI assistants into a smooth development loop using the Demeter example:

# 1. Onboard the model from a fresh clone
scalable init-component ./capabilities/demeter --name demeter

# 2. Compose the workflow that consumes the new component
scalable compose "Run Demeter for a list of GCAM scenarios then \
  aggregate the NetCDF outputs"

# 3. Validate the generated configuration
scalable validate ./scalable.yaml

# 4. Plan and review (explain for team review)
scalable plan ./scalable.yaml --target local --dry-run --output plan.json
scalable explain plan.json

# 5. Run locally
scalable run ./scalable.yaml --target local --workflow workflow.py

# 6. If it fails, diagnose
scalable diagnose --latest

# 7. When ready for production, migrate
scalable migrate ./scalable.yaml --to-provider kubernetes
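
If you want to script the middle of this loop (validate, plan, run, and diagnose on failure), a thin Python driver is enough. The sketch below reuses only the commands shown above; ``run_loop`` and the injectable ``runner`` argument are hypothetical conveniences so the control flow can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable, Sequence

# Commands copied from steps 3, 4, and 5 of the loop above.
STEPS = [
    ["scalable", "validate", "./scalable.yaml"],
    ["scalable", "plan", "./scalable.yaml", "--target", "local",
     "--dry-run", "--output", "plan.json"],
    ["scalable", "run", "./scalable.yaml", "--target", "local",
     "--workflow", "workflow.py"],
]
DIAGNOSE = ["scalable", "diagnose", "--latest"]


def default_runner(cmd: Sequence[str]) -> int:
    # Raises FileNotFoundError if the scalable CLI is not on PATH.
    return subprocess.run(list(cmd), check=False).returncode


def run_loop(runner: Callable[[Sequence[str]], int] = default_runner) -> bool:
    """Run the steps in order; stop at the first failure."""
    for cmd in STEPS:
        if runner(cmd) != 0:
            if cmd[1] == "run":
                runner(DIAGNOSE)   # only a failed run has something to diagnose
            return False
    return True
```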

Step 8: Customizing AI Heuristics

The heuristic mode uses rule-based templates that you can inspect and influence:

from scalable.ai.heuristics import (
    detect_language,
    estimate_resources,
    suggest_component_config,
)

# Language detection
lang = detect_language("./capabilities/demeter")
print(f"Detected: {lang}")  # "python"

# Resource estimation from known model profiles
resources = estimate_resources(
    model_name="demeter",
    input_size_mb=512,           # demeter_data/ from get_package_data
    num_scenarios=50,
)
print(f"Estimated: {resources}")
# {'cpus': 4, 'memory': '16G', 'walltime': '02:30:00'}

The heuristics are deterministic: the same input always produces the same output. This makes them suitable for automated CI/CD pipelines where reproducibility matters.
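
A CI job can check this property by calling a heuristic several times and comparing fingerprints of the results. The sketch below is self-contained: ``estimate`` is a hypothetical stand-in for a profile lookup (its table simply mirrors the CPU and memory values printed above, and the fallback values are invented), and ``is_deterministic`` is an illustrative helper rather than a Scalable function.

```python
import hashlib
import json

# Hypothetical profile table; the demeter entry mirrors the example output above.
PROFILES = {"demeter": {"cpus": 4, "memory": "16G"}}
FALLBACK = {"cpus": 1, "memory": "2G"}   # invented default for unknown models


def estimate(model_name: str) -> dict:
    """Look up resource defaults for a known model, else use the fallback."""
    return dict(PROFILES.get(model_name, FALLBACK))


def fingerprint(value) -> str:
    """Hash a JSON-serializable value in a key-order-independent way."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(fn, *args, runs: int = 3, **kwargs) -> bool:
    """Call fn repeatedly and report whether every result hashes the same."""
    digests = {fingerprint(fn(*args, **kwargs)) for _ in range(runs)}
    return len(digests) == 1
```

In a test suite this reduces to ``assert is_deterministic(estimate, "demeter")``; to check the real heuristics, pass the function you imported from ``scalable.ai.heuristics`` instead of the stand-in.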

Step 9: LLM-Enhanced Mode

For richer, context-aware responses, enable an LLM backend:

export SCALABLE_AI_BACKEND=openai
export SCALABLE_AI_MODEL=gpt-4
export OPENAI_API_KEY=sk-...

# Now compose generates more detailed, context-aware workflows
scalable compose "Run Demeter for the SSP1-5 scenarios in parallel, \
  then run the LandCoverPlotter post-processor against each output \
  directory, then build a single multi-page PDF comparing the SSPs"

LLM-enhanced mode adds:

  • More detailed code comments and documentation.

  • Context-aware parameter suggestions based on Demeter’s docstrings (the LLM reads the README + module docstrings via the --include-docs flag).

  • Richer error explanations, including links to the Demeter issue tracker for known failure modes.

  • More creative workflow architectures for complex descriptions like the multi-SSP example above.

Important: LLM output is non-deterministic. For reproducible pipelines, always use --no-ai (heuristic mode) in CI/CD.

Step 10: Validating AI-Generated Output

Always validate AI-generated configurations before running:

from scalable.ai import compose_workflow
from scalable import ScalableSession

# Generate workflow
result = compose_workflow(
    "Run Demeter for the SSP1-5 scenarios then aggregate"
)

# Write generated manifest additions
# (merge with your existing scalable.yaml)

# Validate the result
session = ScalableSession.from_yaml("./scalable.yaml", target="local")
report = session.validate()

if not report.ok:
    print("Generated config has issues:")
    for issue in report.errors:
        print(f"  [{issue.code}] {issue.path}: {issue.message}")
    # Fix issues and re-validate
else:
    print("Generated config is valid — ready to run")
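
The merge step that the comment above leaves to you can be done with a recursive dictionary merge once both documents have been parsed (for example with a YAML library). The helper below is a sketch under that assumption; ``deep_merge`` is a hypothetical name, and the tutorial does not say in what form ``result.manifest_additions`` is returned.

```python
def deep_merge(base: dict, additions: dict) -> dict:
    """Merge additions into base; nested dicts merge, other values are replaced.

    Returns a new top-level dict. Nested values that additions does not touch
    are shared with base rather than copied.
    """
    merged = dict(base)
    for key, value in additions.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


existing = {"components": {"demeter": {"cpus": 4, "memory": "16G"}}}
additions = {
    "components": {"postprocess": {"cpus": 2, "memory": "8G"}},
    "tasks": {"aggregate_demeter_outputs": {"component": "postprocess"}},
}
manifest = deep_merge(existing, additions)
```

Here ``manifest`` keeps the existing ``demeter`` component, gains ``postprocess``, and gains the new ``tasks`` section. Validate the merged file as shown above before running it.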

Troubleshooting

“ImportError: jinja2 not installed”

Install the AI extra: pip install "scalable[ai]" (the quotes keep shells such as zsh from interpreting the square brackets).

``init-component`` reports “Language: unknown” for Demeter

Make sure you point it at the package root (./capabilities/demeter), not at ./capabilities/demeter/demeter. The analyzer relies on setup.py and requirements.txt at the top level.

``compose`` generates ``run_demeter_scenario`` but doesn’t import demeter

Heuristic mode emits a ``# TODO: import demeter`` comment in the function body. Replace it with ``import demeter; demeter.run_model(config_file=...)``; see docs/examples/workflow_demeter.py for the canonical body.

LLM mode is slow

LLM API calls typically take 5–30 seconds. For quick iteration, use --no-ai for heuristic mode and only switch to LLM mode for complex composition tasks.

“SCALABLE_AI_BACKEND=openai but no OPENAI_API_KEY”

Set your API key: export OPENAI_API_KEY=sk-.... The error is raised at call time, not import time.

Migration suggests incompatible changes

Migration is advisory — it shows what needs to change but cannot verify that cloud infrastructure exists. Always validate the migrated manifest and test with --dry-run before production deployment.

Next Steps