.. _tutorial_ai_composition:

======================================================
Tutorial 10: AI-Assisted Workflow Composition
======================================================

What You Will Learn
-------------------

By the end of this tutorial you will:

* Use the AI assistant suite to accelerate workflow development.
* Onboard new model components with ``scalable init-component``, using the
  real :doc:`Demeter <demeter_setup>` repository in
  ``capabilities/demeter`` as the input.
* Diagnose run failures with ``scalable diagnose``.
* Generate human-readable explanations of execution plans.
* Compose new workflows from natural language descriptions.
* Migrate manifests between providers with ``scalable migrate``.
* Understand heuristic mode vs. LLM-enhanced mode.

Prerequisites
-------------

* Completed :ref:`tutorial_getting_started` and :ref:`tutorial_manifest_system`.
* ``pip install "scalable[ai]"`` (installs ``jinja2`` and ``rich``; the
  quotes keep shells such as zsh from expanding the brackets).
* :ref:`tutorial_demeter_setup`: the running example throughout this
  tutorial assumes ``capabilities/demeter`` is on disk, and the
  ``compose`` examples also assume that
  ``demeter.get_package_data("./demeter_data")`` has already been run.
* For LLM-enhanced mode (optional): an API key for OpenAI or a running
  Ollama instance.
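
A quick way to confirm that the optional dependencies are importable
before starting is sketched below. It uses only the standard library, and
the helper name ``missing_modules`` is hypothetical, not part of Scalable.

.. code-block:: python

   import importlib.util


   def missing_modules(names):
       """Return the module names that cannot be located for import."""
       return [name for name in names if importlib.util.find_spec(name) is None]


   # jinja2 and rich are the packages this tutorial says the [ai] extra installs.
   missing = missing_modules(["jinja2", "rich"])
   print("missing:", missing if missing else "none")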

Scenario
--------

Your team has just cloned the
`Demeter <https://github.com/JGCRI/demeter>`_ land-use / land-cover
disaggregation model into ``capabilities/demeter`` and wants to wrap it as
a Scalable component so a single GCAM run can be downscaled across many
scenarios in parallel. You also need to (a) understand a failed Demeter
run, (b) explain the execution plan to a stakeholder, (c) compose a
multi-scenario workflow from a natural-language description, and (d) move
the manifest from Slurm to Kubernetes for production. The AI assistants
automate the tedious parts of all four tasks.

The pipeline you'll build is the canonical one used everywhere else in
this tutorial series:

.. code-block:: text

   prepare_demeter_config
        │  fan-out, one .ini per GCAM scenario
        ▼
   run_demeter_scenario  ×  N
        │
        ▼
   aggregate_demeter_outputs

Step 1: Heuristic vs. LLM Modes
---------------------------------

All AI assistants work in two modes:

**Heuristic mode** (``--no-ai``; default when ``SCALABLE_AI_BACKEND=none``):

* Uses deterministic rules, templates, and pattern matching.
* No external API calls. Works offline.
* Fast, reproducible, and auditable.
* Best for CI/CD and automated pipelines.

**LLM-enhanced mode** (``SCALABLE_AI_BACKEND=openai`` or ``ollama``):

* Augments heuristics with a language model for richer explanations and
  more creative workflow composition.
* Requires API credentials and network access.
* May produce varied output across invocations.
* Best for interactive development and exploration.

Configure the backend:

.. code-block:: bash

   # Heuristic only (default)
   export SCALABLE_AI_BACKEND=none

   # OpenAI
   export SCALABLE_AI_BACKEND=openai
   export SCALABLE_AI_MODEL=gpt-4
   export OPENAI_API_KEY=sk-...

   # Ollama (local)
   export SCALABLE_AI_BACKEND=ollama
   export SCALABLE_AI_MODEL=llama3
   export SCALABLE_AI_ENDPOINT=http://localhost:11434
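
To make the precedence of these variables concrete, here is a minimal
sketch of how a backend could be resolved from the environment. It is an
illustration only, not Scalable's implementation: the function name
``resolve_backend`` is hypothetical, and falling back to ``none`` when the
variable is unset is an assumption based on the "(default)" comment above.

.. code-block:: python

   import os

   VALID_BACKENDS = {"none", "openai", "ollama"}


   def resolve_backend(env=None):
       """Pick the AI backend from environment-style settings (illustrative)."""
       env = os.environ if env is None else env
       # Assumption: an unset variable means heuristic mode.
       backend = env.get("SCALABLE_AI_BACKEND", "none").strip().lower()
       if backend not in VALID_BACKENDS:
           raise ValueError(f"unknown backend: {backend!r}")
       if backend == "openai" and not env.get("OPENAI_API_KEY"):
           raise RuntimeError("SCALABLE_AI_BACKEND=openai but no OPENAI_API_KEY")
       return {
           "backend": backend,
           "model": env.get("SCALABLE_AI_MODEL"),
           "endpoint": env.get("SCALABLE_AI_ENDPOINT"),
       }


   settings = resolve_backend({
       "SCALABLE_AI_BACKEND": "ollama",
       "SCALABLE_AI_MODEL": "llama3",
       "SCALABLE_AI_ENDPOINT": "http://localhost:11434",
   })
   print(settings["backend"], settings["model"])

Passing a plain dictionary instead of ``os.environ`` keeps the check easy
to exercise in tests without touching the real environment.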

Step 2: Onboarding the Demeter Component
------------------------------------------

The ``init-component`` command analyzes a model directory and generates a
component configuration. Point it at the cloned Demeter repo:

.. code-block:: bash

   scalable init-component ./capabilities/demeter --name demeter --no-ai

.. code-block:: text

   Analyzing ./capabilities/demeter...
   Detected:
     Language: Python (3.9+)
     Dependencies: configobj, numpy, pandas, scipy, requests,
                   gcamreader, xarray, netcdf4, matplotlib
     Entry point: demeter.run_model(config_file=...)
     Container: Dockerfile.scalable (python:3.11-slim)
     Estimated resources: 4 CPUs, 16G memory

   Generated component configuration:

   components:
     demeter:
       image: ghcr.io/jgcri/demeter:2.0.1
       runtime: apptainer
       cpus: 4
       memory: 16G
       tags: [lulcc, downscaling, gcam]
       mounts:
         ./demeter_data: /data
       env:
         DEMETER_DATA: /data

   Suggested task binding:

   tasks:
     run_demeter_scenario:
       component: demeter
       cache: true
       outputs:
         output_dir: dir

   Written to: ./capabilities/demeter/scalable-component.yaml

**What the analyzer checks (heuristic mode):**

* Language detection — ``setup.py`` and ``capabilities/demeter/demeter/_version.py``
  identify this as a Python 3.9+ package.
* Dependency scanning — ``requirements.txt`` is parsed verbatim.
* Resource estimation — known Demeter profiles plus the input-data size
  (``demeter_data/`` is several hundred MB) determine the CPU/memory
  defaults.
* Container image inference — when a ``Dockerfile`` or
  ``Dockerfile.scalable`` is present, the analyzer suggests an image tag
  matching the package version (``2.0.1``).
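
The checks above can be pictured as simple file-system rules. The sketch
below is a simplified stand-in, not the analyzer itself: ``guess_language``,
``parse_requirements``, and the marker-file table are hypothetical, and
version pins are stripped here purely to keep the output short.

.. code-block:: python

   import re
   import tempfile
   from pathlib import Path

   # Assumed marker files per language, for illustration only.
   LANGUAGE_MARKERS = {
       "python": ["setup.py", "pyproject.toml", "requirements.txt"],
       "r": ["DESCRIPTION"],
       "compiled": ["Makefile", "CMakeLists.txt"],
   }


   def guess_language(root):
       """Return the first language whose marker file exists directly in root."""
       root = Path(root)
       for language, markers in LANGUAGE_MARKERS.items():
           if any((root / marker).is_file() for marker in markers):
               return language
       return "unknown"


   def parse_requirements(path):
       """Return bare package names, skipping comments, blanks, and pip options."""
       names = []
       for line in Path(path).read_text().splitlines():
           line = line.split("#", 1)[0].strip()
           if not line or line.startswith("-"):
               continue
           names.append(re.split(r"[<>=!~;\[ ]", line, maxsplit=1)[0])
       return names


   with tempfile.TemporaryDirectory() as tmp:
       (Path(tmp) / "setup.py").write_text("")
       (Path(tmp) / "requirements.txt").write_text("numpy>=1.20\npandas\nscipy==1.10\n")
       print(guess_language(tmp))
       print(parse_requirements(Path(tmp) / "requirements.txt"))

Because the rules only look at the directory they are given, pointing such
a check at a subpackage without these files yields ``unknown``.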

The same workflow applies to non-Python models: a directory containing an
R script with a ``DESCRIPTION`` file would produce ``Language: R (via
rpy2)``, and a compiled binary like GridLAB-D would be flagged
``Language: compiled`` with the system's ``Makefile`` parsed for
dependencies. The Demeter repo just happens to be the convenient real
example shipped with Scalable.

Python API:

.. code-block:: python

   from scalable.ai import onboard_component

   result = onboard_component(
       "./capabilities/demeter",
       name="demeter",
       no_ai=True,
   )

   print(result.component_yaml)
   print(result.task_yaml)
   print(result.recommendations)

Step 3: Diagnosing Run Failures
---------------------------------

After a failed run, use the diagnostic assistant to identify root causes.
The example below comes from a 50-scenario Demeter ensemble in which 13
scenario tasks failed:

.. code-block:: bash

   scalable diagnose --latest --no-ai

.. code-block:: text

   ═══════════════════════════════════════════════════════════
   Diagnosis: run-20260520T041500Z-demeter-lulcc-f8e2a1b3
   ═══════════════════════════════════════════════════════════

   Status: failed (13 task failures across 50 run_demeter_scenario tasks)

   Root Cause Analysis:
   ─────────────────────
   PRIMARY: Memory exhaustion (8 of 13 failures)
     Pattern: Scenarios with spatial_resolution <= 0.1° exhaust the 16G
     memory limit during the kernel-density convolution step inside
     demeter.process.ProcessStep.
     Evidence: All OOM failures occur on tasks where the projected-LU
     CSV expands to > 500k grid cells.

   SECONDARY: Missing input file (3 of 13 failures)
     Pattern: ``IOError: constraints/soil_quality.csv not found``.
     Evidence: All three failures share scenario-prefixed config paths
     where the constraints directory was never copied into the run dir.

   TERTIARY: Serialization error (2 of 13 failures)
     Pattern: Demeter's ProcessStep returns a logger handle alongside the
     output dataframe; the handle is unpicklable.
     Evidence: ``TypeError`` in dill serialization at the
     ``run_demeter_scenario`` return boundary.

   Recommendations:
   ─────────────────
   1. Apply the ``k8s-fine-resolution`` overlay (``demeter.memory: 64G``)
      for any scenario with ``spatial_resolution <= 0.1°``.
   2. Add ``constraints/`` to the ``mounts:`` block for the demeter
      component, or copy the file into ``demeter_data/`` before fan-out.
   3. In ``run_demeter_scenario``, return only the output paths (already
      done in ``docs/examples/workflow_demeter.py``) and let Demeter's own
      logger close at the end of ``run_model``.

Programmatic access:

.. code-block:: python

   from scalable.ai import diagnose_run

   result = diagnose_run(
       run_dir=".scalable/runs/run-20260520T041500Z.../",
       no_ai=True,
   )

   print(f"Root cause: {result.summary}")
   for finding in result.findings:
       print(f"  [{finding.severity}] {finding.category}")
       print(f"    Pattern: {finding.pattern}")
       print(f"    Suggestion: {finding.suggestion}")
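
Heuristic diagnosis of this kind amounts to matching error text against
known patterns and counting the matches. The sketch below shows that idea
in isolation. The rule table and function names are hypothetical, and the
sample messages are made-up stand-ins loosely modeled on the report above,
not real Demeter logs.

.. code-block:: python

   import re
   from collections import Counter

   # Hypothetical rules as (category, pattern); the first match wins.
   RULES = [
       ("memory_exhaustion", re.compile(r"MemoryError|out of memory", re.I)),
       ("missing_input", re.compile(r"not found|No such file", re.I)),
       ("serialization", re.compile(r"pickle|dill|serializ", re.I)),
   ]


   def classify(message):
       """Map one error message to a failure category."""
       for category, pattern in RULES:
           if pattern.search(message):
               return category
       return "unclassified"


   def summarize(messages):
       """Count failures per category, most frequent first."""
       return Counter(classify(m) for m in messages).most_common()


   # Illustrative messages only.
   errors = (
       ["Worker killed: out of memory"] * 8
       + ["IOError: constraints/soil_quality.csv not found"] * 3
       + ["TypeError: cannot pickle logger handle"] * 2
   )
   for category, count in summarize(errors):
       print(f"{category}: {count} of {len(errors)}")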

Step 4: Explaining Execution Plans
------------------------------------

Use ``scalable explain`` to turn a saved execution plan into a
plain-language summary for non-technical stakeholders:

.. code-block:: bash

   # Generate a plan
   scalable plan ./docs/examples/scalable.demeter.yaml \
       --target aws --dry-run --output plan.json

   # Explain it in plain language
   scalable explain plan.json

.. code-block:: text

   Plan Explanation
   ═════════════════

   This plan will execute the "demeter-lulcc" project on AWS (Fargate)
   in the us-east-1 region.

   What will happen:
   1. A Dask cluster will be created with 1 preprocess worker (1 vCPU,
      2 GiB), up to 10 demeter workers (4 vCPU, 16 GiB each), and
      1 postprocess worker (2 vCPU, 8 GiB).
   2. Workers auto-scale between 1 (minimum) and 10 (maximum) demeter
      pods based on the scenario backlog.
   3. ``run_demeter_scenario`` results are cached per scenario so reruns
      skip already-completed scenarios.
   4. Outputs land in s3://${ARTIFACT_STORAGE}/demeter-lulcc/.

   Estimated cost: $4.82 (≈ 2.5 hours of Fargate compute + S3 storage
   for 50 scenarios at 0.25° resolution)

   Risks:
   • Demeter's NetCDF writes are not atomic; partial writes on worker
     eviction will trigger a retry rather than corrupt downstream data.
     The cache key includes the input checksum, so retries are safe.
   • Fargate cold-start adds 30-90s to the first task on each worker.

Python API:

.. code-block:: python

   from scalable.ai import explain_plan

   result = explain_plan("plan.json")
   print(result.explanation)
   print(result.risks)
   print(result.cost_summary)

Step 5: Composing Workflows from Natural Language
---------------------------------------------------

The ``compose`` assistant generates a complete workflow script, plus
suggested manifest additions, from a plain-language description. Here we
ask it to compose the canonical Demeter pipeline:

.. code-block:: bash

   scalable compose "Run Demeter to downscale GCAM land allocations for \
     a list of scenarios in parallel, then aggregate per-scenario NetCDF \
     outputs into a summary table"

.. code-block:: text

   Generated workflow:
   ═══════════════════

   # workflow.py
   from scalable import ScalableSession, cacheable

   @cacheable(return_type=str, scenario=str, base_config=str, output_dir=str)
   def prepare_demeter_config(scenario: str, base_config: str, output_dir: str) -> str:
       """Materialize a Demeter .ini for one GCAM scenario."""
       # TODO: clone the base config and override [PARAMS] scenario + output_dir
       return f"{output_dir}/{scenario}/demeter_{scenario}.ini"

   @cacheable(return_type=dict, config_file=str)
   def run_demeter_scenario(config_file: str) -> dict:
       """Invoke demeter.run_model for a single scenario."""
       import demeter
       demeter.run_model(config_file=config_file, write_outputs=True)
       return {"scenario": config_file, "output_dir": str(...)}

   @cacheable(return_type=dict, scenario_results=list)
   def aggregate_demeter_outputs(scenario_results: list) -> dict:
       """Collect per-scenario output paths into a single summary."""
       # TODO: write a summary JSON / CSV
       return {"summary_path": "./outputs/demeter/_summary/scenarios.json"}

   def main():
       session = ScalableSession.from_yaml("./scalable.yaml", target="local")
       client = session.start()

       scenarios = ["reference", "ssp1", "ssp2", "ssp5"]

       config_futures = [
           client.submit(prepare_demeter_config, s,
                         "./demeter_data/example/config_gcam_reference/config.ini",
                         "./outputs/demeter",
                         tag="preprocess")
           for s in scenarios
       ]
       config_files = client.gather(config_futures)

       scenario_futures = [
           client.submit(run_demeter_scenario, cfg, tag="demeter")
           for cfg in config_files
       ]
       scenario_results = client.gather(scenario_futures)

       summary = client.submit(
           aggregate_demeter_outputs,
           scenario_results,
           tag="postprocess",
       ).result()

       print(f"Pipeline complete: {summary['summary_path']}")
       session.close()

   if __name__ == "__main__":
       main()

   ═══════════════════
   Suggested manifest additions:

   components:
     preprocess:
       cpus: 1
       memory: 2G
       tags: [lulcc, config]
     demeter:
       image: ghcr.io/jgcri/demeter:2.0.1
       cpus: 4
       memory: 16G
       tags: [lulcc, downscaling, gcam]
     postprocess:
       cpus: 2
       memory: 8G
       tags: [lulcc, aggregation]

   tasks:
     prepare_demeter_config:
       component: preprocess
       cache: true
     run_demeter_scenario:
       component: demeter
       cache: true
     aggregate_demeter_outputs:
       component: postprocess
       cache: true

The generated workflow follows the same three-stage shape as the
hand-written reference at
:download:`workflow_demeter.py </examples/workflow_demeter.py>` (see
``docs/examples/workflow_demeter.py``). Heuristic mode arrives at this
shape because Demeter ships with a clear ``run_model`` entry point and a
predictable per-scenario fan-out. The ``# TODO`` placeholders still have
to be filled in by hand before the workflow produces real outputs.
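
The fan-out / fan-in shape itself does not depend on Scalable. As a
minimal sketch, the same three stages can be expressed with the standard
library's ``concurrent.futures``. The function bodies here are stand-ins
that only build strings and dictionaries; no model is run, and the names
are hypothetical.

.. code-block:: python

   from concurrent.futures import ThreadPoolExecutor


   def prepare_config(scenario, output_dir):
       # Stand-in for prepare_demeter_config: only builds a path string.
       return f"{output_dir}/{scenario}/demeter_{scenario}.ini"


   def run_scenario(config_file):
       # Stand-in for run_demeter_scenario: nothing is executed here.
       return {"config_file": config_file, "status": "ok"}


   def aggregate(results):
       # Fan-in: reduce the per-scenario results to one summary.
       return {
           "n_scenarios": len(results),
           "all_ok": all(r["status"] == "ok" for r in results),
       }


   def run_pipeline(scenarios, output_dir="./outputs/demeter", max_workers=4):
       with ThreadPoolExecutor(max_workers=max_workers) as pool:
           # Fan-out: one config and one run per scenario, order preserved.
           configs = list(pool.map(lambda s: prepare_config(s, output_dir), scenarios))
           results = list(pool.map(run_scenario, configs))
       return aggregate(results)


   print(run_pipeline(["reference", "ssp1", "ssp2", "ssp5"]))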

Python API for programmatic composition:

.. code-block:: python

   from scalable.ai import compose_workflow

   result = compose_workflow(
       "Run Demeter for a list of GCAM scenarios in parallel and "
       "aggregate the NetCDF outputs"
   )

   print(result.workflow_code)
   print(result.manifest_additions)
   print(result.dependencies)

Step 6: Migrating Between Providers
-------------------------------------

Move your Demeter pipeline from Slurm to Kubernetes:

.. code-block:: bash

   scalable migrate docs/examples/scalable.demeter.yaml \
       --to-provider kubernetes

.. code-block:: text

   Migration: slurm → kubernetes
   ══════════════════════════════

   Changes required:
   1. Target 'hpc' → new target 'k8s'
      - Remove: queue, account, walltime, interface
      - Add: namespace, image, adaptive

   2. The demeter component already has an image
      (ghcr.io/jgcri/demeter:2.0.1); no change needed.
      preprocess and postprocess inherit the demeter image since they're
      lightweight Python; no per-component image required.

   3. Environment / mount changes:
      - Apptainer bind mount ``./demeter_data:/data`` → PVC mount
        ``demeter-data-pvc:/data`` (or GCS bucket).
      - DEMETER_DATA env var stays the same.
      - Slurm walltime is replaced by pod ``activeDeadlineSeconds``.

   Generated manifest:

   targets:
     k8s:
       provider: kubernetes
       namespace: demeter-prod
       image: ghcr.io/jgcri/demeter:2.0.1
       adaptive:
         minimum: 2
         maximum: 20

   components:
     demeter:
       image: ghcr.io/jgcri/demeter:2.0.1
       cpus: 4
       memory: 16G
       tags: [lulcc, downscaling, gcam]
       env:
         DEMETER_DATA: /data
       volume_mounts:
         - name: demeter-data
           mountPath: /data
       volumes:
         - name: demeter-data
           persistentVolumeClaim:
             claimName: demeter-data-pvc

     postprocess:
       cpus: 2
       memory: 8G
       tags: [lulcc, aggregation]

   Migration notes:
   • Mount ``./demeter_data`` lives on the laptop / NFS scratch in the
     Slurm version; on Kubernetes you must create ``demeter-data-pvc``
     ahead of time (or switch to a GCS bucket via ``fsspec``).
   • The HPC overlay (``demeter.memory: 64G``) is preserved so a single
     k8s overlay (``k8s-fine-resolution``) can re-apply it for
     fine-resolution scenarios.

Python API:

.. code-block:: python

   from scalable.ai import migrate_manifest

   result = migrate_manifest(
       "docs/examples/scalable.demeter.yaml",
       to_provider="kubernetes",
   )

   print(result.migrated_yaml)
   print(result.changes_summary)
   print(result.migration_notes)
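
The target rewrite listed under "Changes required" can be pictured as a
dictionary transformation. The sketch below is an assumption-laden
illustration, not the ``migrate`` implementation: the function name is
hypothetical and the values in the sample ``hpc`` target are invented for
the example.

.. code-block:: python

   import copy

   # Keys the sample report lists under "Remove".
   SLURM_ONLY_KEYS = ("queue", "account", "walltime", "interface")


   def slurm_target_to_k8s(target, namespace, image, minimum, maximum):
       """Return (migrated_target, removed_keys) without mutating the input."""
       migrated = copy.deepcopy(target)
       removed = [key for key in SLURM_ONLY_KEYS if key in migrated]
       for key in removed:
           del migrated[key]
       # Keys the sample report lists under "Add".
       migrated.update({
           "provider": "kubernetes",
           "namespace": namespace,
           "image": image,
           "adaptive": {"minimum": minimum, "maximum": maximum},
       })
       return migrated, removed


   # Invented Slurm target values, for illustration only.
   hpc = {"provider": "slurm", "queue": "short", "account": "proj123",
          "walltime": "02:30:00", "interface": "ib0"}
   k8s, removed = slurm_target_to_k8s(
       hpc, "demeter-prod", "ghcr.io/jgcri/demeter:2.0.1", 2, 20)
   print(removed)
   print(k8s)

Returning the removed keys alongside the new target makes it easy to
review what was dropped, which matters because the migration is advisory.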

Step 7: Integration into Development Workflow
----------------------------------------------

Combine AI assistants into a smooth development loop using the Demeter
example:

.. code-block:: bash

   # 1. Onboard the model from a fresh clone
   scalable init-component ./capabilities/demeter --name demeter

   # 2. Compose the workflow that consumes the new component
   scalable compose "Run Demeter for a list of GCAM scenarios then \
     aggregate the NetCDF outputs"

   # 3. Validate the generated configuration
   scalable validate ./scalable.yaml

   # 4. Plan and review (explain for team review)
   scalable plan ./scalable.yaml --target local --dry-run --output plan.json
   scalable explain plan.json

   # 5. Run locally
   scalable run ./scalable.yaml --target local --workflow workflow.py

   # 6. If it fails, diagnose
   scalable diagnose --latest

   # 7. When ready for production, migrate
   scalable migrate ./scalable.yaml --to-provider kubernetes

Step 8: Customizing AI Heuristics
----------------------------------

The heuristic mode uses rule-based templates that you can inspect and
influence:

.. code-block:: python

   from scalable.ai.heuristics import (
       detect_language,
       estimate_resources,
       suggest_component_config,
   )

   # Language detection
   lang = detect_language("./capabilities/demeter")
   print(f"Detected: {lang}")  # "python"

   # Resource estimation from known model profiles
   resources = estimate_resources(
       model_name="demeter",
       input_size_mb=512,           # demeter_data/ from get_package_data
       num_scenarios=50,
   )
   print(f"Estimated: {resources}")
   # {'cpus': 4, 'memory': '16G', 'walltime': '02:30:00'}

The heuristics are deterministic: the same input always produces the same
output. This makes them suitable for automated CI/CD pipelines where
reproducibility matters.
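
One way to guard that property in CI is to fingerprint a function's output
over repeated calls and fail if the fingerprints differ. The sketch below
assumes the output is JSON-serializable; ``toy_estimate`` is a hypothetical
rule invented for the example and is not Scalable's estimator.

.. code-block:: python

   import hashlib
   import json


   def fingerprint(obj):
       """Hash a JSON-serializable object in a key-order-independent way."""
       payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
       return hashlib.sha256(payload.encode("utf-8")).hexdigest()


   def is_deterministic(func, *args, runs=3, **kwargs):
       """Call func several times and report whether every output matches."""
       digests = {fingerprint(func(*args, **kwargs)) for _ in range(runs)}
       return len(digests) == 1


   def toy_estimate(input_size_mb, num_scenarios):
       # Hypothetical rule: more memory once the input exceeds 1 GiB.
       memory_gb = 16 if input_size_mb <= 1024 else 64
       return {"cpus": 4, "memory": f"{memory_gb}G", "num_scenarios": num_scenarios}


   print(is_deterministic(toy_estimate, 512, 50))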

Step 9: LLM-Enhanced Mode
---------------------------

For richer, context-aware responses, enable an LLM backend:

.. code-block:: bash

   export SCALABLE_AI_BACKEND=openai
   export SCALABLE_AI_MODEL=gpt-4
   export OPENAI_API_KEY=sk-...

   # Now compose generates more detailed, context-aware workflows
   scalable compose "Run Demeter for the SSP1-5 scenarios in parallel, \
     then run the LandCoverPlotter post-processor against each output \
     directory, then build a single multi-page PDF comparing the SSPs"

LLM-enhanced mode adds:

* More detailed code comments and documentation.
* Context-aware parameter suggestions based on Demeter's docstrings (the
  LLM reads the README + module docstrings via the ``--include-docs``
  flag).
* Richer error explanations, including links to the Demeter issue tracker
  for known failure modes.
* More creative workflow architectures for complex descriptions like the
  multi-SSP example above.

**Important:** LLM output is non-deterministic. For reproducible pipelines,
always use ``--no-ai`` (heuristic mode) in CI/CD.

Step 10: Validating AI-Generated Output
-----------------------------------------

Always validate AI-generated configurations before running:

.. code-block:: python

   from scalable.ai import compose_workflow
   from scalable import ScalableSession

   # Generate workflow
   result = compose_workflow(
       "Run Demeter for the SSP1-5 scenarios then aggregate"
   )

   # Write generated manifest additions
   # (merge with your existing scalable.yaml)

   # Validate the result
   session = ScalableSession.from_yaml("./scalable.yaml", target="local")
   report = session.validate()

   if not report.ok:
       print("Generated config has issues:")
       for issue in report.errors:
           print(f"  [{issue.code}] {issue.path}: {issue.message}")
       # Fix issues and re-validate
   else:
       print("Generated config is valid — ready to run")
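
Manifest validation does not cover the generated Python itself. A cheap
additional check, sketched here with the standard ``ast`` module, is to
confirm that the code parses and defines the functions you expect before
running it. The helper name ``check_generated_code`` is hypothetical, and
the sample source is a shortened stand-in for real ``compose`` output.

.. code-block:: python

   import ast


   def check_generated_code(source, required_functions):
       """Return a list of problems; an empty list means the checks passed."""
       try:
           tree = ast.parse(source)
       except SyntaxError as exc:
           return [f"syntax error on line {exc.lineno}: {exc.msg}"]
       # Only top-level function definitions are considered.
       defined = {node.name for node in tree.body
                  if isinstance(node, ast.FunctionDef)}
       return [f"missing function: {name}"
               for name in required_functions if name not in defined]


   sample = (
       "def prepare_demeter_config(scenario):\n"
       "    return scenario\n"
       "\n"
       "def run_demeter_scenario(config_file):\n"
       "    return {}\n"
   )
   problems = check_generated_code(
       sample,
       ["prepare_demeter_config", "run_demeter_scenario",
        "aggregate_demeter_outputs"],
   )
   print(problems)

This only parses the source and never executes it, so it is safe to run
on untrusted output.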

Troubleshooting
---------------

**"ImportError: jinja2 not installed"**
  Install the AI extra: ``pip install "scalable[ai]"``.

**init-component reports "Language: unknown" for Demeter**
  Make sure you point it at the package root
  (``./capabilities/demeter``), not at ``./capabilities/demeter/demeter``.
  The analyzer relies on ``setup.py`` and ``requirements.txt`` at the
  top level.

**compose generates run_demeter_scenario but doesn't import demeter**
  Heuristic mode emits ``# TODO: import demeter`` in the function body.
  Replace with ``import demeter; demeter.run_model(config_file=...)`` —
  see :download:`workflow_demeter.py </examples/workflow_demeter.py>` for the canonical
  body.

**LLM mode is slow**
  LLM API calls typically take 5–30 seconds. For quick iteration, use
  ``--no-ai`` for heuristic mode and only switch to LLM mode for complex
  composition tasks.

**"SCALABLE_AI_BACKEND=openai but no OPENAI_API_KEY"**
  Set your API key: ``export OPENAI_API_KEY=sk-...``. The error is raised
  at call time, not import time.

**Migration suggests incompatible changes**
  Migration is advisory — it shows what needs to change but cannot verify
  that cloud infrastructure exists. Always validate the migrated manifest
  and test with ``--dry-run`` before production deployment.

Next Steps
----------

* :ref:`tutorial_demeter_setup` — One-time setup (clone, install,
  ``demeter.get_package_data``, optional Docker image build) for the
  examples in this tutorial.
* :ref:`tutorial_getting_started` — If you're new, start from the
  beginning for full context.
* :ref:`tutorial_manifest_system` — Deep-dive into the manifest schema
  that AI assistants generate.
* :ref:`tutorial_kubernetes` — Deploy AI-generated Kubernetes
  configurations.
* :ref:`tutorial_ml_advanced` — Combine AI composition with ML-driven
  resource optimization.