Python API
stitches offers a programmatic API in Python.
Note
For questions or requests for support, please reach out to the development team. Your feedback is much appreciated as this API evolves!
Core functionality
stitches.match_neighborhood
- stitches.match_neighborhood(target_data, archive_data, tol=0, drop_hist_duplicates=True)
Calculate the Euclidean distance between target and archive data.
This function takes data frames of target and archive data and calculates the Euclidean distance between the target values (fx and dx) and the archive values.
- Parameters:
target_data – Data frame of the target fx and dx values.
archive_data – Data frame of the archive fx and dx values.
tol (float) – Tolerance for the neighborhood of matching. Defaults to 0 degC, meaning only the nearest neighbor is returned. Must be a float.
drop_hist_duplicates (bool) – Determines whether to consider historical values across SSP scenarios as duplicates (True) and drop all but one from matching, or to consider them as distinct points for matching (False). Defaults to True.
- Returns:
Data frame with the target data and the corresponding matched archive data.
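The matching idea described above can be sketched in plain Python. This is not the stitches implementation: the helper name match_neighborhood_sketch, the dictionary layout, and the reading of tol as "everything within tol of the nearest distance" are assumptions made for illustration, and the real function works on data frames.

```python
import math


def match_neighborhood_sketch(target, archive, tol=0.0):
    """For each target point, return every archive point whose Euclidean
    distance in (fx, dx) space is within `tol` of the smallest distance."""
    matches = []
    for t in target:
        # Distance from this target point to every archive point.
        dists = [
            (math.hypot(t["fx"] - a["fx"], t["dx"] - a["dx"]), a)
            for a in archive
        ]
        nearest = min(d for d, _ in dists)
        # tol=0 keeps only the nearest neighbor(s); a larger tol widens the neighborhood.
        for d, a in dists:
            if d <= nearest + tol:
                matches.append({"target": t["id"], "archive": a["id"], "dist": d})
    return matches


# Hypothetical points, chosen only to show the effect of tol.
target = [{"id": "t1", "fx": 1.0, "dx": 0.0}]
archive = [
    {"id": "a1", "fx": 1.0, "dx": 0.1},
    {"id": "a2", "fx": 1.2, "dx": 0.0},
    {"id": "a3", "fx": 3.0, "dx": 0.5},
]

nearest_only = match_neighborhood_sketch(target, archive, tol=0.0)
neighborhood = match_neighborhood_sketch(target, archive, tol=0.15)
```

With tol=0.0 only the closest archive point is kept; with tol=0.15 the second-closest point also falls inside the neighborhood, while the distant third point is still excluded.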
stitches.permute_stitching_recipes
- stitches.permute_stitching_recipes(N_matches, matched_data, archive, optional=None, testing=False)
Sample from matched_data to produce permutations of stitching recipes.
This function samples from matched_data (the results of match_neighborhood(target, archive, tol)) to produce permutations of possible stitching recipes that will match the target data.
- Parameters:
N_matches (int) – The maximum number of matches per target data.
matched_data – Data output from match_neighborhood.
archive – The archive data to use for re-matching duplicate points.
optional – A previous output of this function containing recipes that have already been created, so that they are not re-made. This argument is not implemented.
testing (bool) – When True, the behavior can be reliably replicated without setting global seeds. Defaults to False.
- Returns:
A data frame with the same structure as the raw matched data, with duplicate matches replaced.
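The reproducibility that the testing flag provides can be illustrated with a local, seeded random generator. The sketch below shows only that general idea and is deliberately simplified: sample_recipe, its seed argument, and the window labels are hypothetical and do not reflect how stitches samples recipes internally.

```python
import random


def sample_recipe(candidates, n_matches, seed=None):
    """Pick up to n_matches archive candidates for each target window.

    candidates: dict mapping a target window label to a list of archive labels.
    seed: when given, a private generator makes the draw repeatable.
    """
    rng = random.Random(seed)  # local generator; the global random state is untouched
    recipe = {}
    for window, pool in candidates.items():
        k = min(n_matches, len(pool))
        recipe[window] = rng.sample(pool, k)
    return recipe


# Hypothetical matched candidates per target window.
candidates = {
    "1850-1858": ["a1", "a2", "a3"],
    "1859-1867": ["a4", "a5"],
}

first = sample_recipe(candidates, n_matches=2, seed=42)
second = sample_recipe(candidates, n_matches=2, seed=42)
```

Because the generator is created inside the function, two calls with the same seed return the same selection regardless of any global seeding elsewhere in the program.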
stitches.generate_gridded_recipe
- stitches.generate_gridded_recipe(messy_recipe, res='mon')
Create a recipe for the stitching process using a messy recipe.
- Parameters:
messy_recipe – A data frame generated by the permute_stitching_recipes function.
res (str) – The resolution of the recipe, either ‘mon’ for monthly or ‘day’ for daily.
- Returns:
A data frame formatted as a recipe for stitching.
stitches.make_recipe
- stitches.make_recipe(target_data, archive_data, N_matches, res='mon', tol=0.1, non_tas_variables=None, reproducible=False)
Generate a stitching recipe from target and archive data.
- Parameters:
target_data – A pandas DataFrame of climate information to emulate.
archive_data – A pandas DataFrame of temperature data to use as the archive to match on.
N_matches (int) – The maximum number of matches per target data.
res (str) – Resolution of the stitched data, either ‘mon’ or ‘day’.
tol (float) – Tolerance used in the matching process, default is 0.1.
non_tas_variables (list[str]) – List of variables other than tas to stitch together; defaults to None, which stitches tas only.
reproducible (bool) – If True, ensures reproducible behavior by using the testing=True argument in permute_stitching_recipes(); defaults to False.
- Returns:
A pandas DataFrame of a formatted recipe.
stitches.gridded_stitching
- stitches.gridded_stitching(out_dir, rp)
Stitch the gridded NetCDFs for variables contained in the recipe file and save them.
- Parameters:
out_dir (str) – Directory in which to write the NetCDF files.
rp – DataFrame of the recipe including variables to stitch.
- Returns:
List of the NetCDF file paths.
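Since make_recipe returns a recipe DataFrame and gridded_stitching takes one, the two calls can be chained. The wrapper below is a minimal sketch using only the signatures documented on this page; the name stitch_from_recipe and its n_matches argument are hypothetical, and it assumes the caller has already prepared target_data and archive_data in the form make_recipe expects.

```python
def stitch_from_recipe(target_data, archive_data, out_dir, n_matches=1):
    """Build a recipe and write the stitched NetCDF files it describes."""
    # Imported lazily so this sketch can be defined without the package installed.
    import stitches

    recipe = stitches.make_recipe(
        target_data,
        archive_data,
        N_matches=n_matches,
        res="mon",               # monthly resolution, the documented default
        tol=0.1,                 # documented default tolerance
        non_tas_variables=None,  # None stitches tas only
        reproducible=True,       # request repeatable recipe permutations
    )
    # gridded_stitching returns the list of NetCDF file paths it wrote.
    return stitches.gridded_stitching(out_dir, recipe)
```

Passing reproducible=True is a choice made for this example so that repeated runs yield the same recipe; leave it at the default of False when varied permutations are wanted.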
stitches.gmat_stitching
stitches.fetch_pangeo_table
- stitches.fetch_pangeo_table()
Fetch the Pangeo CMIP6 archive table of contents as a pandas DataFrame.
Retrieve a copy of the Pangeo CMIP6 archive contents, which includes information about the available models, sources, experiments, ensembles, and more.
- Returns:
A pandas DataFrame with details on the datasets available for download from Pangeo.
stitches.fetch_nc
stitches.make_tas_archive
stitches.make_matching_archive
- stitches.make_matching_archive(smoothing_window=9, chunk_window=9, add_staggered=False)
Create an archive of rate of change (dx) and mean (fx) values.
This function processes the CMIP6 archive to produce values used in the matching portion of the stitching pipeline.
- Parameters:
- Returns:
The file location of the matching archive.
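To give a feel for what mean (fx) and rate of change (dx) values look like, the sketch below smooths a series and summarizes it in fixed-length chunks. It is one plausible reading of the description, not the stitches implementation: the helpers rolling_mean and chunk_fx_dx, the centered smoothing, the use of a least-squares slope for dx, and the synthetic series are all assumptions.

```python
def rolling_mean(values, window):
    """Centered rolling mean for an odd window; edges without a full window are dropped."""
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


def chunk_fx_dx(values, chunk_window):
    """Split values into consecutive full chunks and return (fx, dx) for each."""
    out = []
    for start in range(0, len(values) - chunk_window + 1, chunk_window):
        chunk = values[start:start + chunk_window]
        fx = sum(chunk) / chunk_window  # chunk mean
        # Least-squares slope of the chunk against its index (units per step).
        x_mean = (chunk_window - 1) / 2
        num = sum((x - x_mean) * (y - fx) for x, y in enumerate(chunk))
        den = sum((x - x_mean) ** 2 for x in range(chunk_window))
        out.append((fx, num / den))
    return out


# Synthetic series rising by 0.02 per step (illustrative only, not CMIP6 data).
series = [0.02 * step for step in range(27)]
smoothed = rolling_mean(series, window=9)
archive_rows = chunk_fx_dx(smoothed, chunk_window=9)
```

Each entry of archive_rows pairs a chunk's mean with its trend, which is the kind of (fx, dx) point that the matching step compares between target and archive.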
stitches.make_pangeo_table
stitches.make_pangeo_comparison
stitches.install_package_data
- stitches.install_package_data(data_dir=None)
Download and unpack Zenodo-minted stitches package data.
This function downloads the data that matches the currently installed stitches distribution and unpacks it into the specified directory or, by default, the data directory of the package.
- Parameters:
data_dir (str) – Optional. Full path to the directory to store the data. Default is the data directory of the package.
- Returns:
None