Cloud Providers
Scalable v2.0.0 supports cloud-based execution through the scalable[cloud]
extra, which provides access to AWS and GCP deployment providers with
integrated cost estimation.
Installation
pip install scalable[cloud]
This installs dask-cloudprovider, s3fs, gcsfs, and fsspec.
AWS Provider
The AWSBatchProvider wraps dask-cloudprovider’s FargateCluster or
EC2Cluster, depending on the target's cluster_type option.
Target options:
region: AWS region (default: us-east-1)
cluster_type: "fargate" (default) or "ec2"
instance_type: EC2 instance type (for cost estimation)
image: Docker image for workers
n_workers: Initial worker count
worker_cpu: CPU units per worker (Fargate: 256-4096)
worker_mem: Memory in MiB per worker
vpc: VPC identifier
subnets: List of subnet IDs
security_groups: List of security group IDs
execution_role_arn: ECS execution role ARN
task_role_arn: ECS task role ARN
adaptive: Dict with minimum and maximum for adaptive scaling
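The constraints in the option list above can be checked before a target ever reaches the provider. The following is a minimal sketch, not Scalable's own validation code: the function name validate_aws_target and the error messages are hypothetical, and only the rules stated in the list (allowed cluster_type values, the Fargate worker_cpu range, and the minimum/maximum keys of adaptive) are encoded.

```python
# Hypothetical helper for illustration; not part of the Scalable API.
VALID_CLUSTER_TYPES = {"fargate", "ec2"}
FARGATE_CPU_RANGE = (256, 4096)  # CPU units per worker, as listed in the options


def validate_aws_target(target: dict) -> list[str]:
    """Return a list of problems found; an empty list means no rule was violated."""
    errors = []

    # cluster_type defaults to "fargate" when omitted.
    cluster_type = target.get("cluster_type", "fargate")
    if cluster_type not in VALID_CLUSTER_TYPES:
        errors.append(f"cluster_type must be 'fargate' or 'ec2', got {cluster_type!r}")

    # The documented CPU range applies to Fargate workers only.
    cpu = target.get("worker_cpu")
    if cluster_type == "fargate" and cpu is not None:
        low, high = FARGATE_CPU_RANGE
        if not low <= cpu <= high:
            errors.append(f"worker_cpu must be within {low}-{high} on Fargate, got {cpu}")

    # adaptive is a dict carrying both minimum and maximum.
    adaptive = target.get("adaptive")
    if adaptive is not None:
        minimum = adaptive.get("minimum")
        maximum = adaptive.get("maximum")
        if minimum is None or maximum is None:
            errors.append("adaptive requires both minimum and maximum")
        elif minimum > maximum:
            errors.append("adaptive.minimum must not exceed adaptive.maximum")

    return errors


if __name__ == "__main__":
    target = {
        "cluster_type": "fargate",
        "worker_cpu": 4096,
        "adaptive": {"minimum": 1, "maximum": 10},
    }
    print(validate_aws_target(target))
```

Returning a list of messages rather than raising on the first problem lets a dry run report every issue in one pass.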
Example manifest:
# Scalable manifest targeting AWS Fargate
# Requires: pip install scalable[cloud]
#
# Demeter LULCC running example — see docs/tutorials/demeter_setup.rst
# for the full setup instructions and docs/examples/scalable.demeter.yaml
# for the canonical multi-target version.
version: 1
project:
  name: demeter-lulcc-aws
  default_storage: s3://my-bucket/scalable-runs/
targets:
  aws:
    provider: aws
    region: us-east-1
    cluster_type: fargate
    instance_type: m5.xlarge
    worker_cpu: 4096
    worker_mem: 16384
    image: 123456789.dkr.ecr.us-east-1.amazonaws.com/demeter:2.0.1
    execution_role_arn: arn:aws:iam::123456789:role/ecsTaskExecutionRole
    task_role_arn: arn:aws:iam::123456789:role/ecsTaskRole
    subnets:
      - subnet-abc123
      - subnet-def456
    security_groups:
      - sg-xyz789
    adaptive:
      minimum: 1
      maximum: 10
components:
  demeter:
    image: 123456789.dkr.ecr.us-east-1.amazonaws.com/demeter:2.0.1
    cpus: 4
    memory: 16G
    tags: [lulcc, downscaling, gcam]
  postprocess:
    cpus: 2
    memory: 8G
    tags: [lulcc, aggregation]
tasks:
  run_demeter_scenario:
    component: demeter
    cache: true
    outputs:
      output_dir: dir
  aggregate_demeter_outputs:
    component: postprocess
    cache: true
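In the example manifest, each task names a component that is defined under components. A quick consistency check for that relationship might look like the sketch below. It assumes the manifest has already been parsed into a plain dict (the parsing step is omitted), and the helper name undefined_components is hypothetical rather than something Scalable provides.

```python
# Excerpt of the example manifest as a parsed dict (only the parts the check needs).
MANIFEST = {
    "components": {
        "demeter": {"cpus": 4, "memory": "16G"},
        "postprocess": {"cpus": 2, "memory": "8G"},
    },
    "tasks": {
        "run_demeter_scenario": {"component": "demeter", "cache": True},
        "aggregate_demeter_outputs": {"component": "postprocess", "cache": True},
    },
}


def undefined_components(manifest: dict) -> dict:
    """Map each task name to the component it references, for references
    that do not appear under the manifest's components section."""
    components = manifest.get("components", {})
    missing = {}
    for task_name, spec in manifest.get("tasks", {}).items():
        component = spec.get("component")
        if component not in components:
            missing[task_name] = component
    return missing


if __name__ == "__main__":
    # An empty result means every task points at a defined component.
    print(undefined_components(MANIFEST))
```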
GCP Provider (Scaffold)
The GCPProvider is a validation-only
scaffold. It validates manifest options but raises NotImplementedError
on build_cluster().
Target options:
region: GCP region
project_id: GCP project identifier
instance_type: GCE machine type (for cost estimation)
image: Container image
n_workers: Worker count
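To illustrate what "validation-only" means in practice, here is a small sketch of a provider that accepts the options listed above but refuses to build a cluster. The class name ScaffoldProvider, the validate method, and the specific checks are assumptions made for illustration; only the build_cluster() name and its NotImplementedError behavior come from the description above.

```python
# Hypothetical stand-in for a validation-only provider; not GCPProvider itself.
KNOWN_OPTIONS = {"provider", "region", "project_id", "instance_type", "image", "n_workers"}


class ScaffoldProvider:
    def __init__(self, options: dict):
        self.options = dict(options)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the options passed."""
        errors = []
        # Reject option names that are not in the documented list.
        for key in sorted(set(self.options) - KNOWN_OPTIONS):
            errors.append(f"unknown option: {key}")
        # Assumed rule: a worker count, when given, is a positive integer.
        n_workers = self.options.get("n_workers")
        if n_workers is not None and (not isinstance(n_workers, int) or n_workers < 1):
            errors.append("n_workers must be a positive integer")
        return errors

    def build_cluster(self):
        # Validation works, but no cluster can be created yet.
        raise NotImplementedError("this provider only validates manifest options")


if __name__ == "__main__":
    provider = ScaffoldProvider({"region": "us-central1", "project_id": "my-project", "n_workers": 4})
    print(provider.validate())
    try:
        provider.build_cluster()
    except NotImplementedError as exc:
        print(f"build_cluster() is not available: {exc}")
```

A scaffold of this shape lets manifests that target GCP be checked early, while any attempt to actually run them fails with a clear error.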
Cost Estimation
Cloud providers include static cost tables for common instance types.
Run scalable run --dry-run to see estimated costs:
scalable run scalable.yaml --target aws --dry-run
The cost estimate is also recorded in telemetry (cost.jsonl).
See Cost Estimation for the full documentation.
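A static cost table amounts to a lookup from instance type to an hourly rate, multiplied by worker count and run time. The sketch below shows that arithmetic and how an estimate could be serialized as one JSON line. The prices are illustrative placeholders rather than Scalable's table or current provider pricing, and the function names and record fields are assumptions; the actual layout of cost.jsonl is not specified here.

```python
import json

# Illustrative hourly rates in USD; placeholders only, not real pricing data.
HOURLY_USD = {
    "m5.xlarge": 0.192,
    "m5.2xlarge": 0.384,
}


def estimate_cost(instance_type: str, n_workers: int, hours: float) -> float:
    """Estimate cost as hourly rate x worker count x hours, from a static table."""
    try:
        rate = HOURLY_USD[instance_type]
    except KeyError:
        raise ValueError(f"no price entry for instance type {instance_type!r}") from None
    return round(rate * n_workers * hours, 4)


def cost_record(target: str, instance_type: str, n_workers: int, hours: float) -> str:
    """Serialize one estimate as a single JSON line (field names are hypothetical)."""
    record = {
        "target": target,
        "instance_type": instance_type,
        "n_workers": n_workers,
        "hours": hours,
        "estimated_usd": estimate_cost(instance_type, n_workers, hours),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # One line per estimate is what makes a .jsonl file easy to append to.
    print(cost_record("aws", "m5.xlarge", n_workers=10, hours=2.0))
```

Raising on an unknown instance type, instead of silently returning zero, keeps a missing table entry from looking like a free run.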
See Also
Provider Abstraction — Full provider abstraction documentation
Cost Estimation — Cost estimation primitives and tables
Artifact Store — Remote artifact storage with S3/GCS backends