.. _tutorial_demeter_setup:

==========================================================
Tutorial Setup: Run the Demeter Example End-to-End
==========================================================

What You Will Do
-----------------

Tutorials 2–10 use the `Demeter `_ land-use / land-cover disaggregation
model as their running example. This page is the **one-time setup** that
makes the code blocks in those tutorials executable on your machine. After
completing it you will have:

* The Scalable repo and the Demeter source tree on disk.
* The Demeter Zenodo example dataset under ``./demeter_data/``.
* (Optional) A locally built ``demeter:local`` container image suitable for
  use as a Dask worker image on Kubernetes / Fargate.
* A successful smoke-run of one Demeter scenario through Scalable.

The Demeter source already lives in the Scalable monorepo at
``capabilities/demeter``; you cloned it when you cloned this repository.

Prerequisites
-------------

* Python 3.11 or later.
* Git (to verify the Demeter clone).
* (Optional) Docker for the container build / smoke-test.
* About 1 GB of free disk space for ``demeter_data/``.

Step 1: Install Scalable and Demeter
--------------------------------------

From the Scalable repository root:

.. code-block:: bash

   python -m venv .venv
   source .venv/bin/activate

   # Scalable itself
   pip install -e .

   # Demeter as an editable install so you can see the source it ships
   pip install -e capabilities/demeter

Both installs are lightweight: Demeter's heavy native dependencies
(``netcdf4``, ``scipy``, ``matplotlib``) install from prebuilt PyPI wheels.

Step 2: Download the Demeter example data
-------------------------------------------

Demeter ships a Zenodo bundle of GCAM reference scenarios and constraint
files. Download it once:

.. code-block:: python

   import demeter

   demeter.get_package_data("./demeter_data")

When the download completes, the directory layout looks like this (Zenodo
bundle for Demeter 2.0.x):

.. code-block:: text

   demeter_data/
   ├── config_gcam_reference.ini
   └── inputs/
       ├── allocation/
       ├── constraints/
       ├── mapping/
       ├── observed/
       └── projected/

The :download:`canonical workflow ` resolves the base ``.ini``
automatically. If your bundle expands to a different path, set the
``DEMETER_DATA`` environment variable or pass ``--data-dir`` to
``scalable_example.run``.

Step 3 (optional): Build the Demeter container image
------------------------------------------------------

Tutorials 5 (cloud), 8 (Kubernetes), and 10 (AI composition) reference a
Demeter container image. The Scalable-friendly Dockerfile lives at
``capabilities/demeter/Dockerfile.scalable`` and uses Python 3.11 +
``slim-bookworm``:

.. code-block:: bash

   docker build \
       -t demeter:local \
       -f capabilities/demeter/Dockerfile.scalable \
       capabilities/demeter

Skip this step if you only plan to run the local target; the
``containers: none`` mode in :download:`scalable.demeter.yaml ` does not
need an image.

Step 4: Smoke-test one scenario
---------------------------------

The ``capabilities/demeter/scalable_example/`` subpackage ships a runnable
driver that exercises the full pipeline. Run a single scenario locally:

.. code-block:: bash

   python -m scalable_example.run --scenarios reference

Expected output (abbreviated):

.. code-block:: text

   2026-05-20 14:03:11 INFO scalable_example.demeter Using shared manifest: /…/scalable/docs/examples/scalable.demeter.yaml
   2026-05-20 14:03:13 INFO scalable_example.demeter Prepared 1 scenario config(s)
   2026-05-20 14:04:42 INFO scalable_example.demeter Completed Demeter runs for: reference
   {
     "summary_path": "/…/outputs/demeter/_summary/scenarios.json",
     "scenario_count": 1
   }

What just happened?

1. Stage 1 (``prepare_demeter_config``) cloned the Zenodo ``config.ini``
   into ``./outputs/demeter/reference/`` and rewrote the
   ``[PARAMS] scenario`` and ``[STRUCTURE] output_dir`` fields.
2. Stage 2 (``run_demeter_scenario``) invoked :func:`demeter.run_model`
   against that ``.ini`` on a Scalable worker tagged ``demeter``.
3. Stage 3 (``aggregate_demeter_outputs``) collected the per-scenario
   summary into ``_summary/scenarios.json``.

Add scenarios to fan out:

.. code-block:: bash

   python -m scalable_example.run --scenarios reference ssp1 ssp2

Each scenario runs in its own Dask process. On a 4-core laptop the local
target runs them in parallel, up to the ``max_workers: 4`` limit.

Step 5: Run the same workflow via the ``scalable`` CLI
--------------------------------------------------------

The :download:`reference workflow ` lives at
``docs/examples/workflow_demeter.py``. It reads the same manifest and ships
the same three task functions, but the Scalable CLI loads it as a script:

.. code-block:: bash

   scalable run docs/examples/scalable.demeter.yaml \
       --target local \
       --workflow docs/examples/workflow_demeter.py

This is the form the rest of the tutorials use when they say "run the
Demeter pipeline". The ``scalable_example`` driver above is a convenience
wrapper that bypasses the CLI in favor of direct Python; both ultimately
call :class:`scalable.ScalableSession` and produce identical output.

Troubleshooting
---------------

**``ModuleNotFoundError: No module named 'demeter'``**
    The Demeter editable install in step 1 did not take effect. Re-run
    ``pip install -e capabilities/demeter`` and make sure the ``.venv`` is
    activated.

**``FileNotFoundError`` for ``config_gcam_reference/config.ini``**
    Step 2 (data download) was skipped or wrote to a different directory.
    Confirm with ``ls demeter_data/example/config_gcam_reference/`` and
    re-run ``demeter.get_package_data("./demeter_data")`` if needed.

**``MemoryError`` during ``run_demeter_scenario``**
    The default local target allocates 4 GB per worker. The reference
    scenario fits, but fine-resolution variants can exceed it. Edit the
    ``local`` target in the manifest to raise per-process memory, or apply
    the ``k8s-fine-resolution`` overlay (which sets ``demeter.memory: 64G``)
    when running on a larger target.

**Docker build fails on ``apt-get install libhdf5-dev``**
    The Scalable-friendly Dockerfile assumes the host can reach the Debian
    package mirrors. Behind a proxy you may need to pass
    ``--build-arg HTTP_PROXY=...``.

**Tests fail with ``ImportError: cannot import name 'configobj'``**
    The Demeter editable install did not pull in its dependencies. Run
    ``pip install -r capabilities/demeter/requirements.txt``.

**``ModuleNotFoundError: No module named 'pkg_resources'`` on Python 3.13+**
    Setuptools 81+ dropped ``pkg_resources``; Demeter still imports it. Pin
    setuptools with ``pip install 'setuptools<81'`` until the upstream
    Demeter release migrates to ``importlib.metadata``.

**``ValueError: assignment destination is read-only`` inside demeter.demeter_io.reader.read_base**
    Triggered by a Demeter-internal compatibility issue with numpy 2.x, where
    a returned array carries ``writeable=False``. Pin numpy with
    ``pip install 'numpy<2'`` or apply the upstream fix from the Demeter
    issue tracker. The Scalable side of the pipeline is unaffected: the
    per-scenario ``.ini`` was generated correctly and the task was scheduled
    on a Scalable worker, so the failure is wholly inside
    ``demeter.run_model``.

What's Next
-----------

You now have a runnable Demeter pipeline. Continue with:

* :ref:`tutorial_getting_started`: Tutorial 1 verifies the Scalable install
  with the trivial ``hello-scalable`` project.
* :ref:`tutorial_manifest_system`: Tutorial 2 dissects
  ``scalable.demeter.yaml``.
* :ref:`tutorial_scaling_strategies`: Tutorial 3 fans out to N Demeter
  scenarios.
* :ref:`tutorial_ai_composition`: Tutorial 10 onboards the same Demeter repo
  via ``scalable init-component``.
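
Before moving on, the import and data-download checks from the Troubleshooting section can be run together. The following is a minimal sketch, not part of Scalable or Demeter: the helper names (``module_available``, ``find_ini_files``, ``preflight``) are hypothetical, and it assumes that ``DEMETER_DATA``, when set, points at the directory that was passed to ``demeter.get_package_data``. It uses only the Python standard library.

.. code-block:: python

   """Pre-flight check for the Demeter tutorial setup (illustrative sketch)."""
   import importlib.util
   import os
   from pathlib import Path


   def module_available(name: str) -> bool:
       """Return True if the top-level module ``name`` can be located."""
       return importlib.util.find_spec(name) is not None


   def find_ini_files(data_dir: Path) -> list[Path]:
       """Return every .ini file under ``data_dir``, sorted for stable output."""
       if not data_dir.is_dir():
           return []
       return sorted(data_dir.rglob("*.ini"))


   def preflight(data_dir: Path) -> dict[str, bool]:
       """Map each check to True (passed) or False (failed)."""
       return {
           "demeter importable": module_available("demeter"),
           "pkg_resources importable": module_available("pkg_resources"),
           "config .ini present": bool(find_ini_files(data_dir)),
       }


   if __name__ == "__main__":
       # Assumption: DEMETER_DATA, if set, names the downloaded data directory.
       data_dir = Path(os.environ.get("DEMETER_DATA", "./demeter_data"))
       for check, ok in preflight(data_dir).items():
           print(f"{'ok' if ok else 'FAIL':<4}  {check}")

A ``FAIL`` line corresponds to the Troubleshooting entry for that symptom. The search for ``.ini`` files is recursive on purpose, so the check does not depend on how the Zenodo bundle happens to unpack.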