Climate Departure Analysis from ERA5-Land Reanalysis
Overview
cd fetches ERA5-Land hourly reanalysis (1950–present, ~9 km native grid) from the DestinE Earth Data Hub, subsets to British Columbia, aggregates to monthly, seasonal, and annual periods, derives additional climate variables (vapour pressure deficit and relative humidity from temperature + dewpoint via the Tetens equation; soil moisture as a 4-depth mean) and snow-pack variables (snow water equivalent, snowfall, snowmelt, snow cover, plus annual derived scalars: peak SWE, snowfall fraction, snowmelt 50% day-of-year, peak weekly melt rate), and writes Cloud-Optimized GeoTIFFs alongside a static SpatioTemporal Asset Catalog (STAC) in a public S3 bucket. A monthly GitHub Action keeps the catalog current. On the consumer side, R functions (cd_catalog, cd_extract, cd_baseline, cd_anomaly, cd_trend, cd_compare, cd_summary, cd_plot_timeseries, cd_plot_comparison) read the COGs directly via GDAL’s /vsicurl/ — no credentials, no tile server — crop to a user-supplied area of interest, compute baselines and anomalies for arbitrary reference periods, and run Mann-Kendall and Theil-Sen trend statistics. All baseline and comparison logic stays on the consumer side, so reference periods are not baked into the served data.
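As a rough illustration of the VPD and RH derivation mentioned above, here is a minimal base-R sketch of the Tetens equation. The function names svp_tetens() and vpd_rh() are hypothetical, and the coefficients (0.6108, 17.27, 237.3, the FAO-56 form over water) are an assumption; the constants cd actually uses may differ. Inputs are in degrees Celsius, so ERA5-Land temperatures in kelvin would need 273.15 subtracted first.

```r
# Saturation vapour pressure (kPa) from temperature (deg C), Tetens form.
# Coefficients are the FAO-56 values and are an assumption, not read from cd.
svp_tetens <- function(temp_c) {
  0.6108 * exp(17.27 * temp_c / (temp_c + 237.3))
}

# VPD (kPa) and RH (%) from air temperature and dewpoint, both in deg C.
vpd_rh <- function(temp_c, dewpoint_c) {
  es <- svp_tetens(temp_c)      # saturation vapour pressure at air temperature
  ea <- svp_tetens(dewpoint_c)  # actual vapour pressure = saturation at dewpoint
  data.frame(vpd_kpa = es - ea, rh_pct = 100 * ea / es)
}

# A warm, fairly dry case and a saturated case (dewpoint equals temperature).
vpd_rh(temp_c = c(25, 10), dewpoint_c = c(15, 10))
```

When dewpoint equals air temperature the air is saturated, so VPD is zero and RH is 100%, which is a handy sanity check on any implementation.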
Installation
pak::pak("NewGraphEnvironment/cd")
Quick start
library(cd)
# Load the live STAC catalog and an example area of interest
catalog <- cd_catalog()
aoi <- sf::st_read(
  system.file("extdata", "example_aoi_kotl.gpkg", package = "cd"),
  quiet = TRUE
)
# Extract zonal mean time series
ts <- cd_extract(catalog, aoi)
# Compute baseline and anomalies
bl <- cd_baseline(ts, baseline_years = 1951:1955)
ano <- cd_anomaly(ts, bl)
# Trend analysis
trn <- cd_trend(ano, trend_start = 1951)
# Reporting table
cd_summary(trn)
# Compare time windows directly
cd_compare(ts, window_a = 1956:1960, window_b = 1951:1955)
Caching
cd_extract() and cd_crop() cache each COG on first read, so repeated extractions, report renders, and vignette rebuilds pull each file from S3 once and read locally thereafter — turning recurring S3 egress into a one-time cost. Caching is on by default (cache = TRUE).
# First call downloads; later calls read the local cache.
ts <- cd_extract(catalog, aoi) # cache = TRUE by default
cd_cache_info() # where the cache lives + size
cd_cache_clear() # wipe it
cd_extract(catalog, aoi, cache = FALSE) # bypass the cache for one call
Freshness is checked with a cheap HTTP HEAD (S3 ETag), so the monthly catalog republish is picked up automatically; cd_cache_fetch(href, refresh = TRUE) forces a re-download. For a fully-offline session set options(cd.cache_revalidate = FALSE) to serve cached copies without any network call.
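The ETag comparison behind that freshness check can be sketched in base R. This is not cd's implementation: etag_from_headers() and cache_is_stale() are hypothetical helper names, and the sketch assumes the cache stores the ETag it saw at download time. curlGetHeaders() is a base R function that retrieves response headers only.

```r
# Pull the ETag value out of a vector of raw HTTP header lines.
etag_from_headers <- function(headers) {
  hit <- grep("^etag:", headers, ignore.case = TRUE, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  # Use the last match (the final response after any redirects), then
  # drop the field name, surrounding whitespace/CRLF, and the quotes.
  value <- sub("^etag:\\s*", "", hit[length(hit)], ignore.case = TRUE)
  gsub('^"|"$', "", trimws(value))
}

# TRUE when either ETag is unknown or the two differ.
cache_is_stale <- function(cached_etag, remote_etag) {
  is.na(cached_etag) || is.na(remote_etag) || !identical(cached_etag, remote_etag)
}

# Usage (needs network access, so left commented out):
# url <- "https://stac-era5-land.s3.us-west-2.amazonaws.com/catalog.json"
# cache_is_stale("etag-stored-at-download", etag_from_headers(curlGetHeaders(url)))
```

Treating a missing ETag as stale errs on the side of re-downloading, which costs egress but never serves an outdated file.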
Stopgap without the cache. If you read COGs through GDAL directly (e.g. raw terra::rast("/vsicurl/...") outside cd_crop()), you can cut repeat egress within a session by enabling GDAL’s /vsicurl/ cache:
Sys.setenv(VSI_CACHE = "TRUE", VSI_CACHE_SIZE = "100000000") # 100 MB
Sys.setenv(GDAL_HTTP_MAX_RETRY = "3", GDAL_HTTP_RETRY_DELAY = "1")
This only persists within one R session; the cd_* cache above persists across sessions, which is what kills recurring report-dev egress.
Data
The producer pipeline fetches ERA5-Land hourly reanalysis from the DestinE Earth Data Hub, derives additional variables (VPD, RH, soil moisture), aggregates to monthly, seasonal, and annual periods on a single EPSG:4326 BC grid, and writes Cloud-Optimized GeoTIFFs alongside a static SpatioTemporal Asset Catalog (STAC) in a public S3 bucket.
- Catalog (JSON): https://stac-era5-land.s3.us-west-2.amazonaws.com/catalog.json
- Region: BC (~48–60° N, 114–140° W), 1950–2025, ~9 km native grid
- Variables (15):
  - Core climate (7): tmean, tmax, tmin, prcp, vpd, rh, soil_moisture
  - Snow monthly natives (4): swe, snowfall, snowmelt, snow_cover
  - Snow annual derived (4): swe_max, snowfall_fraction, snowmelt_doy_50, snowmelt_rate_peak
- Periods: seasonal (DJF/MAM/JJA/SON) and annual for monthly-native vars; annual only for swe_max / snowfall_fraction / snowmelt_doy_50 / snowmelt_rate_peak
The catalog is consumable directly outside R — for example, in QGIS via the STAC plugin, in gdalcubes, or with any STAC-aware client. The cd_catalog() consumer function reads the catalog URL by default.
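To make the four annual snow scalars listed above concrete, here is a minimal base-R sketch that computes them from one year of daily values for a single cell. The definitions are assumptions inferred from the variable names (maximum SWE, snowfall over total precipitation, first day on which cumulative melt reaches half the annual total, maximum centred 7-day mean melt); the producer pipeline may define them differently, for example on a water-year basis. snow_scalars() is a hypothetical name.

```r
# One year of daily values for one cell; element 1 is assumed to be day 1.
snow_scalars <- function(swe, snowfall, snowmelt, precip) {
  stopifnot(length(snowmelt) >= 7, sum(snowmelt) > 0, sum(precip) > 0)
  cum_melt <- cumsum(snowmelt)
  # Centred 7-day running mean; the first and last 3 days come back as NA.
  weekly_mean <- stats::filter(snowmelt, rep(1 / 7, 7), sides = 2)
  list(
    swe_max            = max(swe),
    snowfall_fraction  = sum(snowfall) / sum(precip),
    snowmelt_doy_50    = which(cum_melt >= 0.5 * sum(snowmelt))[1],
    snowmelt_rate_peak = max(weekly_mean, na.rm = TRUE)
  )
}

# Toy 10-day example with made-up numbers.
snow_scalars(
  swe      = c(10, 10, 10, 8, 6, 2, 0, 0, 0, 0),
  snowfall = c(5, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  snowmelt = c(0, 0, 0, 2, 2, 4, 2, 0, 0, 0),
  precip   = c(10, 0, 0, 0, 0, 0, 0, 0, 0, 0)
)
```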
Architecture
cd has a strict producer / consumer split, kept separate so reference periods, baselines, and statistical choices are not baked into the published data:
| Side | Functions | Where it runs |
|---|---|---|
| Producer | cd_fetch(), cd_derive(), cd_aggregate(), cd_cog_write(), cd_stac_catalog(), cd_stac_item(), cd_s3_push(), plus scripts/backfill_edh_*.py and scripts/pipeline_*_edh.R | A monthly GitHub Action pulls fresh ERA5-Land data from DestinE EDH, derives variables, writes COGs, rebuilds the STAC catalog, and pushes to s3://stac-era5-land. Run by maintainers; users do not need to run it. |
| Consumer | cd_catalog(), cd_extract(), cd_crop(), cd_baseline(), cd_anomaly(), cd_trend(), cd_compare(), cd_summary(), cd_plot_timeseries(), cd_plot_comparison(), cd_periods(), cd_seasons(), cd_cache_*() | Reads COGs directly via /vsicurl/, with no credentials and no tile server. All baseline selection and statistical work (Mann-Kendall, Theil-Sen, Welch’s t, Mann-Whitney U) happen client-side on the cropped time series. |
Anyone can compute climate departure against any baseline period, in any AOI, with whatever statistical test they prefer — without re-publishing the underlying data.
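To show how little machinery the client-side statistics need, here is a minimal base-R sketch of a baseline anomaly followed by a Theil-Sen slope and a Mann-Kendall test. It is an illustration under stated assumptions, not the code behind cd_anomaly() or cd_trend(): anomaly() and trend_stats() are hypothetical names, the data are synthetic, and the Mann-Kendall variance omits tie and autocorrelation corrections.

```r
# Synthetic annual series standing in for an extracted zonal mean.
set.seed(1)
ts_toy <- data.frame(year = 1951:1980)
ts_toy$value <- 2 + 0.03 * (ts_toy$year - 1951) + rnorm(30, sd = 0.2)

# Anomaly = value minus the mean over a user-chosen baseline window.
anomaly <- function(df, baseline_years) {
  ref_mean <- mean(df$value[df$year %in% baseline_years])
  df$anomaly <- df$value - ref_mean
  df
}

# Theil-Sen slope (median pairwise slope) and Mann-Kendall S, Z, p-value.
trend_stats <- function(year, value) {
  n <- length(year)
  stopifnot(n >= 3, length(value) == n, !is.unsorted(year, strictly = TRUE))
  pairs <- utils::combn(n, 2)                # all index pairs with i < j
  dy <- value[pairs[2, ]] - value[pairs[1, ]]
  dx <- year[pairs[2, ]] - year[pairs[1, ]]
  s <- sum(sign(dy))                         # Mann-Kendall S statistic
  var_s <- n * (n - 1) * (2 * n + 5) / 18    # variance of S without ties
  z <- if (s == 0) 0 else (s - sign(s)) / sqrt(var_s)  # continuity correction
  list(slope = stats::median(dy / dx), s = s, z = z,
       p_value = 2 * stats::pnorm(-abs(z)))
}

ano <- anomaly(ts_toy, baseline_years = 1951:1960)
trend_stats(ano$year, ano$anomaly)
```

Changing baseline_years shifts every anomaly by a constant, so the slope and the Mann-Kendall result are unchanged; only the departure values move. For two-window comparisons, base R's t.test() (Welch by default) and wilcox.test() (Mann-Whitney for two samples) cover the other two tests named in the table.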
Roadmap
- BEC zone shift analysis (#59): methodology lit review for biogeoclimatic ecosystem classification (BEC) shifts under climate change, alongside the existing temperature / precipitation+drying / interpretation-framing methodology stacks.
- Local-time daily aggregation (#37): tmax/tmin currently use UTC-day. Fixing this aligns the daily extremes with local solar timing for BC longitudes; ~6 h offset matters for late-summer daytime maxima.
- Vignette templates: peace-fwcp and kootenay-lake are reference implementations of the regional reporting pattern. Future regional vignettes follow the same structure (trends → recent vs pre-warming → spatial → per-ecoregion → snowpack) so cross-region findings are directly comparable.
Acknowledgements
This package is inspired by bcgov/bc_climate_anomaly, a Shiny app developed by the Province of British Columbia (Aseem Sharma and contributors) that visualizes monthly, seasonal, and annual climate anomalies for temperature, precipitation, humidity, vapor pressure, and soil moisture across BC. The variable set and the eco-region / watershed framing in cd follow directly from that work. bc_climate_anomaly is licensed under the Apache License 2.0.
