link 0.40.4
Reproduce bcfp’s per-species accessibility so dam-downstream segments emit the dam descriptor (#200). The mapping_code phase previously drove accessible from barriers_<sp>_unified (all barriers, including dams), so every segment below a dam read inaccessible and lost its ;DAM/;MODELLED/;ASSESSED second token — emitting a bare SPAWN/REAR where bcfp emits SPAWN;DAM. It now uses a new per-species barriers_<sp>_access view that reproduces bcfp’s barriers_<sp>: natural barriers only (gradient at the species threshold ∪ falls ∪ subsurface), minus the observation/habitat override, plus all user-definite barriers (override-exempt). Dams stay in barrier_sources and annotate token2 only.
All three access inputs are now persisted province-wide so the cross-WSG downstream walk is correct in every watershed group, not just the run’s own: natural barriers (already), user_barriers_definite (new USER_DEFINITE family in lnk_barriers_unify, ltree-resolved via the FWA join like falls), and the observation/habitat override (new <persist_schema>.barrier_overrides table). Validated against bcfishpass@v0.7.15: PARS BT 98.95%, LFRA BT 97.77% / CO 97.90% per-segment mapping_code match. See RUNBOOK.md §5.
link 0.40.3
Persist the per-source downstream-barrier flag columns in streams_access so lnk_pipeline_mapping_code’s second token (DAM/MODELLED/ASSESSED/REMEDIATED/NONE) populates from persisted state instead of defaulting to NONE. Three coupled fixes (#196): lnk_persist_init adds the six flag columns to the streams_access DDL; lnk_pipeline_run pre-persists barriers before the mapping_code phase for cross-WSG dam visibility (link#152); lnk_pipeline_persist projects the flag columns in the INSERT (the DDL/INSERT pair must match — the missing projection was the actual NONE-token bug).
Adds RUNBOOK.md — the durable mental model of the barrier → access → mapping_code machinery, including the authoritative bcfp access-set mechanism (read from smnorris/bcfishpass@v0.7.15). Note: the per-species accessibility set still carries dams (it should be natural-only + observation/habitat-overridden, per bcfp); dam-downstream segments therefore still emit a bare habitat token rather than SPAWN;DAM. Characterized in RUNBOOK §5 with a scoped fix (follow-up issue).
link 0.40.2
Hotfix for wide-table species-set evolution in v0.40.0/v0.40.1’s lnk_pipeline_run(mapping_code = TRUE) path. Closes #194.
v0.40.1 made the mapping_code phase use active_species (per-WSG subset of bundle species) for working schema’s streams_access columns — matching the persist DDL because lnk_persist_init was also passed active_species. But persist is province-wide. When WSG #2 has a different active subset than WSG #1, the persist table is locked to #1’s column set and #2’s INSERT projection fails:
ERROR: column "has_barriers_ch_dnstr" of relation "streams_access" does not exist
Live smoke 2026-05-19: PARS ran first (default config, active = bt/gr/ko/rb) → 4-column persist DDL. BULK next (default config, active = bt/ch/co/pk/sk/st/rb — BULK is salmon-bearing in the Skeena) → 7-column INSERT projection against 4-column table → fail.
Fix: lnk_pipeline_run passes cfg$species (full bundle, 11 species for default config) to lnk_persist_init instead of active_species. Persist DDL is bundle-sized; per-WSG INSERTs in lnk_pipeline_persist continue using active_species for projection so unused species’ columns get NULL. Per-species habitat tables (streams_habitat_<sp>) similarly created for the full bundle — extras stay empty until populated.
Verified live: PARS + BULK now coexist in fresh_default.streams_access / fresh_default.streams_mapping_code with their respective active subsets, NULL for non-active columns.
Migration: existing persist tables created with narrower DDL do NOT auto-grow. Drop <persist_schema>.streams_access, <persist_schema>.streams_mapping_code, and <persist_schema>.streams_habitat_long_vw, then re-run lnk_pipeline_run(mapping_code = TRUE) to recreate them with the bundle-wide DDL.
No regression in bcfishpass bundle: cfg$species = bcfp 8 = active for most bcfp-bundle WSGs → identical INSERT projection.
link 0.40.1
Hotfix for v0.40.0’s lnk_pipeline_run(mapping_code = TRUE) path on non-bcfp bundles. Closes #192.
v0.40.0’s mapping_code phase hardcoded sp_set <- c("bt","ch","cm","co","pk","sk","st","wct") (the bcfp 8 species) and called lnk_barriers_views without a species arg (uses the same bcfp 8 default). Working schema’s streams_access got bcfp-8-species columns, but persist streams_access was created by lnk_persist_init(species = active_species) with the bundle’s species — for the default config that’s bt/gr/ko/rb. The persist INSERT ... SELECT projects persist’s column list against working → fails with column a.has_barriers_ko_dnstr does not exist.
Effect: lnk_pipeline_run(mapping_code = TRUE) worked only for the bcfishpass bundle (full 8-species). Every other bundle (including the default operator-facing one) errored on persist.
Fix: pipeline_run’s mapping_code phase now uses active_species (bundle-driven) for both lnk_barriers_views(species = ...) and lnk_pipeline_access(barriers_per_sp = ...). Passes species_<role> to lnk_mapping_code filtered against active_species — species in active_species that don’t appear in any bcfp residence category (GR/KO/RB) fall through to species_resident (placeholder; the data-driven categorization lands via #189).
Caught by live smoke 2026-05-19 on PARS with default bundle. Pre-merge unit tests didn’t cover this path; #191 tracks the test catch-up.
link 0.40.0
Mapping_code tunnel decouple + portable lnk_mapping_code() build + <type>_<role> rename sweep. Closes #187. Major architectural shift in how access semantics flow through the parity diff. BC: parameter and CLI-flag renames (deprecation shims for one release; removal v0.41.0).
- Persist streams_access + streams_mapping_code + long-form habitat view. lnk_persist_init() now creates two new per-WSG per-species persist tables (streams_access, streams_mapping_code) and one VIEW (streams_habitat_long_vw = UNION ALL across streams_habitat_<sp> tables, presents the per-species split as long-form for any consumer that prefers it). Per-species column generators (.lnk_cols_streams_access_per_sp(), .lnk_cols_streams_mapping_code_per_sp()) are species-driven: pass a different species set, get matching columns. lnk_pipeline_persist() extended with streams_access + streams_mapping_code write blocks, gated by presence of the working-side tables (skip cleanly when the mapping_code path didn’t run).
- lnk_mapping_code(): new exported portable build entry point. Schema-aware wrapper around lnk_pipeline_mapping_code() (the pure data transform). Takes explicit table_<role> args (table_access, table_habitat, table_streams), so the function works against working-schema tables (mid-pipeline) or persist-schema tables (ad-hoc rebuild). Caller can invoke it directly against persist data with the tunnel down to rebuild streams_mapping_code without re-running the full pipeline. This is the headline use case, unblocking QGIS bcfp-shape symbology via data-raw/build_species_views.R --bcfp.
- lnk_pipeline_run(..., mapping_code = TRUE): tunnel-free mapping_code phase. New optional phase that runs lnk_barriers_views (over working <schema>.barriers, tunnel-free, link-canonical) + lnk_pipeline_access + lnk_mapping_code between lnk_barriers_unify and lnk_pipeline_persist. Persist phase copies both new tables to <persist_schema>. Methodology shift: ACCESS now uses link’s own per-species barriers (derived from <schema>.barriers’s blocks_species predicate per link#152) instead of bcfp’s barriers tables staged via the tunnel. Pre-#187 the only path that built streams_mapping_code was lnk_compare_wsg, and access there used bcfp-staged barriers, so link’s streams_mapping_code reflected link’s habitat + bcfp’s access. Post-#187 it reflects link’s habitat + link’s access. The parity diff vs bcfishpass.streams_mapping_code becomes more meaningful (surfaces real link-vs-bcfp divergence that was artificially suppressed before). Expect non-trivial parity-number deltas on the next provincial run vs pre-#187 baselines.
- lnk_compare_wsg() refactored. Build delegated to lnk_pipeline_run(mapping_code = TRUE); only the diff stays in compare. .lnk_compare_wsg_mapping_code_diff() rewritten to read from <persist_schema>.streams_mapping_code instead of working schema. The orphan helpers (.lnk_compare_wsg_mapping_code, .lnk_compare_wsg_stage_reference_barriers) deleted, ~200 lines simpler.
- BC: parameter rename with_mapping_code → mapping_code in lnk_compare_wsg() and lnk_pipeline_run(). Old name accepted with .Deprecated() warning for one release; removal in v0.41.0.
- BC: parameter rename <role>_species → species_<role> in lnk_pipeline_mapping_code() (three params: resident_species → species_resident, anadromous_species → species_anadromous, spawn_only_species → species_spawn_only). Matches the documented <type>_<role> convention (col_<role>, table_<role>, exp_<role>, now species_<role>). Old names accepted with deprecation warning until v0.41.0.
- BC: CLI flag rename --with-mapping-code → --mapping-code in wsgs_run_pipeline.sh, wsgs_dispatch.sh, wsgs_run_m4_offline.sh, trifecta_smoke.sh, wsgs_run_host.R. Old flag accepted with stderr deprecation warning until v0.41.0.
- lnk_barriers_views() gains barriers_table arg. Default NULL preserves the existing <persist_schema>.barriers source. Pass a working-schema table to build views over a per-WSG working barriers table (used by the new mapping_code phase). Backward-compatible.
- Follow-up filed: #189, data-drive species residence categorization (species_resident / species_anadromous / species_spawn_only) from dimensions.csv. Today the defaults are hardcoded to bcfp’s species residence model; #189 moves them to bundle data so custom species (sea-run cutthroat, Dolly Varden, future mixes) work without monkey-patching function defaults.
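The CLI flag rename above keeps the old spelling alive for one release. A minimal sketch of how such a shim can look in bash follows; the function name parse_mapping_code_flag and the warning text are illustrative assumptions, not taken from the actual scripts.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical parser: accepts --mapping-code and, for one release, the
# deprecated --with-mapping-code alias. The warning goes to stderr so that
# stdout stays clean for callers that capture it.
parse_mapping_code_flag() {
  local mapping_code=0
  local arg
  for arg in "$@"; do
    case "$arg" in
      --mapping-code)
        mapping_code=1
        ;;
      --with-mapping-code)
        echo "WARN: --with-mapping-code is deprecated; use --mapping-code" >&2
        mapping_code=1
        ;;
    esac
  done
  # Prints 1 when either spelling was passed, 0 otherwise.
  echo "$mapping_code"
}
```

Both spellings set the same variable, so removing the alias in a later release is a one-branch deletion.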
link 0.39.1
Fail loud on transient cypher prep failures. Closes #182. Trip-mode hardening before M1 takes over cypher dispatch while the user is in Europe.
- data-raw/cypher_prep.sh: replace set -e with set -euo pipefail; wrap three | tail -N pipelines with tempfile + exit-check pattern (bash snapshot_bcfp.sh, Rscript pak::local_install, Rscript lnk_persist_init). Before: tail’s exit 0 masked upstream failures, the script printed === READY while the cypher was half-prepped, and the umbrella’s downstream marker-grep caught it but the failure was opaque on the cypher itself. After: each failure mode dumps its full log to stderr and exits 1, ssh-back to the umbrella surfaces the non-zero exit, and marker-grep continues to work as belt-and-suspenders. Hit twice on 2026-05-15 (Peace Tier 2 retry + post-#185 re-spin; transient bcdata openmaps WFS timeout in both cases). Sibling fix shipped in rtj#163 for the cypher orchestration scripts; this is the link-side complement covering the per-cypher prep script.
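The tempfile + exit-check pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the code in cypher_prep.sh: the helper name run_tail is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# run_tail N CMD...: run CMD with stdout+stderr captured to a temp file.
# On success, print only the last N lines of the log.
# On failure, dump the full log to stderr and return 1.
# (run_tail is a hypothetical helper name used for illustration.)
run_tail() {
  local n="$1"; shift
  local log
  log="$(mktemp)"
  if "$@" >"$log" 2>&1; then
    tail -n "$n" "$log"
    rm -f "$log"
  else
    local status=$?
    echo "FAILED (exit $status): $*" >&2
    cat "$log" >&2
    rm -f "$log"
    return 1
  fi
}
```

A call such as `run_tail 20 bash snapshot_bcfp.sh` keeps the short log tail on the happy path, while the exit status tested is that of the command itself rather than that of tail.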
link 0.39.0
Additive multi-host runs + two coupled fixes to schema_consolidate.R. Closes #180 and #185. Validated end-to-end via Peace Tier 2 retry (2026-05-15): 16 Peace WSGs additively dispatched into an existing 13-WSG fresh_default, all 16 land with complete per-species habitat tables, M4 final state = 29 WSGs.
- Additive Step 0 (BC). wsgs_run_pipeline.sh’s Step 0 (state_clean.sh --schemas=$SCHEMA wipe) now requires --reset-schema to fire. Default is additive: pipeline writes rely on lnk_pipeline_persist’s per-WSG DELETE-WHERE-WSG idempotency to replace cleanly without losing other WSGs in the schema. Enables adding a new WSG set (e.g. Peace 16) to an in-flight schema without rebuilding everything.
- Bucket-filtered COPY-streaming (BC). schema_consolidate.R replaces pg_dump + scp + pg_restore with per-table ssh <host> 'docker exec psql -c "COPY (SELECT * FROM <t> WHERE wsg IN (bucket)) TO STDOUT"' > /tmp/<f> + local psql -c "COPY <t> FROM STDIN" < /tmp/<f>. Source-side row filter eliminates the over-fetch class where leftover WSGs outside the bucket collided with destination data. bucket= is now REQUIRED per source. DROP SCHEMA on source replaced with bucket-scoped DELETE so out-of-bucket WSGs on source are preserved.
- Fix: dest_conn default routed to wrong DB. schema_consolidate(..., dest_conn = link::lnk_db_conn()) default routed verification queries to M4’s tunnel :63333/bcfishpass while the COPY shellouts hardcode local :5432/fwapg. wgc_tables returned 0 rows → silent skip of every source. Default now NULL; function constructs its own localhost:5432/fwapg connection internally to match the COPY hardcodes. Caught Peace Tier 2 first attempt: 12 of 16 Peace WSGs lost from consolidate (M1’s 5 recoverable post-fix; 7 from burned cyphers lost).
- Fix: per-source wgc_tables enumeration (#185). Previously enumerated tables on destination only. When source’s habitat-table set was a strict subset (cyphers’ Peace bucket = BT/GR/RB; M4 destination carried streams_habitat_ch/sk/st residue from prior runs), the loop hit streams_habitat_ch on source → relation does not exist → break → silently dropped _gr/_rb data. Now enumerates wgc_tables on BOTH source AND destination via parallel information_schema queries; iterates the intersection. Per-table failures use next over break, so one bad table no longer poisons the rest. Source-side post-COPY cleanup DELETEs only successfully-copied tables (errored tables stay intact for retry). Per-source result gains copied, errored, skipped_source_only, skipped_dest_only.
link 0.38.1
- wsgs_run_pipeline.sh: --cy-workspaces=A,B,C passthrough for #178 Tier 1/2 cypher integration tests (was hardcoded for the full job1,job2,job3 set). CY_WS_ARR threaded through Steps 3/4/5/7/9 + trap-EXIT burn. Step 9 SOURCES_R built dynamically per-cypher. Tier 1 (1 cypher) validated live: 13/13 study-area WSGs, 22m wall, exit 0, cy1 burn clean.
link 0.38.0
Provincial-run autonomy CLI + 8 operational-script renames to noun_verb convention. Closes #172. Builds on v0.37.0’s #168 decouple — with PG-state resume in place, the autonomy surface stays thin and the renames stay mechanical.
- Single-command autonomous run. wsgs_run_pipeline.sh (was province_run.sh) accepts --wsgs=A,B,C, --config=<name>, --schema=<name>, --no-cyphers, --force, forwards to wsgs_dispatch.sh (was trifecta_provincial.sh) which intersects the WSG subset in its LPT split. M4+M1-only baseline validated end-to-end: 16-WSG default-bundle dispatch lands 16/16 in fresh_default.streams on M4, ~20 min wall, no operator prompts.
- Step 0 pre-clean. When --schema= is set, umbrella fires state_clean.sh --schemas=<schema> on every host before Step 1. Drops only the target schema (skips the canonical-fresh heuristic + snapshot reload). Eliminates a class of consolidate failures where stale leftover WSGs on a source host collided with destination data during pg_restore.
- Scoped state_clean.sh (was province_clean.sh). New --schemas=A,B,C mode drops only the listed schemas. Empty --schemas= rejected loud to prevent dynamic-arg silent fall-through to the destructive default mode.
- Phantom-cy + error-surface fixes in wsgs_dispatch.sh. R’s paste0("cy", integer(0)) returns "cy" length-1 (constant recycling), which would put a non-existent cypher in the host plan under --no-cyphers. Three-branched cy_host_keys. Empty CY_WORKSPACES init via explicit CY_WS_ARR=() (was read -r -a yielding single-empty-element). SPLIT_OUT=$(Rscript ...) wrapped with explicit || block so R-side stop() messages reach the operator (e.g. --wsgs=BOGUS surfaces the R error verbatim instead of silent abort).
- 8 rename mapping (git mv preserves git log --follow). Names now describe scope honestly: these scripts work for any list of WSGs / any host count / any reference:
| Old | New |
|---|---|
| data-raw/province_run.sh | data-raw/wsgs_run_pipeline.sh |
| data-raw/province_clean.sh | data-raw/state_clean.sh |
| data-raw/province_progress.sh | data-raw/progress_check.sh |
| data-raw/trifecta_provincial.sh | data-raw/wsgs_dispatch.sh |
| data-raw/run_provincial_parity.R | data-raw/wsgs_run_host.R |
| data-raw/consolidate_schema.R | data-raw/schema_consolidate.R |
| data-raw/archive_provincial_runs.sh | data-raw/runs_archive.sh |
| data-raw/balance_provincial_buckets.R | data-raw/buckets_balance.R |
The wsg_* (singular, per-WSG functions from #168) vs wsgs_* (plural, collection-level orchestrators) distinction is now load-bearing in the naming. compare_bcfishpass_wsg.R → wsg_compare.R was renamed in #168.
Filed-but-not-closed follow-ups: cypher integration testing (issue #172 Phase 2 + 3 acceptance — defer until M4+M1 baseline lands repeatably); LPT-fallback empty-bucket edge case when N_WSGs ≤ N_hosts without timing CSVs (pre-existing, not a #172 regression).
link 0.37.0
Decouple bcfp comparison from the modelling pipeline. Closes #168. The link package’s deliverable — the per-WSG model in <persist_schema>.streams + per-species habitat + barriers — now runs and is observable independently of any comparison framework. Comparison vs bcfishpass (or any future reference) is a diagnostic overlay that reads the persisted state and never gates whether the model itself ran.
- New exported lnk_pipeline_run(conn, aoi, cfg, loaded, schema, dams, cleanup_working): modelling-only umbrella over the 7 lnk_pipeline_* phases plus persist_init + barriers_unify + persist. Writes per-WSG segment data to PG. lnk_barriers_unify is promoted from gated-behind-with_mapping_code to always-on so <persist_schema>.barriers is canonical state for any future reader.
- New exported lnk_compare_rollup(conn, aoi, cfg, reference, conn_ref, species): reads persisted state + reference DB, returns the long-format rollup tibble. Reference-agnostic via the reference arg ("bcfishpass" today). Species auto-discovered from PG via information_schema probe.
- lnk_compare_wsg() refactored as a thin wrapper over both new functions. Bundled behavior preserved for with_mapping_code = TRUE (mapping_code decoupling deferred to a follow-up). Active-species set is now PG-state-derived (post-persist) rather than cfg$species ∩ wsg_species_presence (pre-persist); equivalent on a fresh single-call run, future-proofs callers against config drift.
- data-raw/compare_bcfishpass_wsg.R split into data-raw/wsg_pipeline_run.R (modelling) + data-raw/wsg_compare.R (compare). 4 callers updated to the explicit two-call pattern (_targets.R, regress_dams_isolation.R, rule_flexibility_demo.R, run_provincial_parity.R).
- data-raw/run_provincial_parity.R resume gate uses PG state as canonical: probes <persist_schema>.streams via internal .lnk_wsg_persisted(); RDS files are diagnostic side-artifacts that no longer silently mask an empty pipeline. Four-branch logic (force / fully-cached / compare-only / pipeline+compare). New --force CLI flag bypasses all caching. New helpers .is_error_stub (re-runs WSGs whose previous attempt failed) and .rollup_has_mapping_code (invalidates bare-rollup cache when the mapping_code lens is requested). Closes the 2026-05-14 incident where 4 of 16 WSGs were silently skipped due to stale error-stub RDS files.
- Phase 7 smoke matrix validates against live DB on DEAD WSG: empty state (57s pipeline+compare) → pipeline-cached (9s compare-only, ~6× speedup) → fully cached (2s skip) → --force (56s re-fire). Confirms the resume gate value and the decoupled boundary.
Filed-but-not-closed follow-ups: lnk_compare_mapping_code as its own family member (promotes the with_mapping_code = TRUE flag to a stand-alone export), lnk_compare_wsg → lnk_compare_run family rename, persist family naming pass, the 8 data-raw/ script renames (stay in #172).
link 0.36.1
Operational hardening from the 2026-05-13 → 2026-05-14 provincial dispatch session. No R/ API changes — patches landed in data-raw/ operational tooling. Closes #171.
- data-raw/trifecta_provincial.sh: M1 reverse-forward tunnel (ssh -R 63333:127.0.0.1:63333), so M1 no longer needs its own (passphrase-protected) db_newgraph identity to reach bcfp. M4 idempotent inline-tunnel block. LPT fallback uses host_speeds-weighted alphabetical split when no _per_wsg_times.csv exists (was equal-split, ignored host_speeds). HOST_SPEEDS recalibrated to time-multiplier semantics: m4=1.0, m1=0.79, cy=1.23 (larger=slower=fewer WSGs assigned). Calibrated from per-WSG medians on the 5-host 2026-05-13 dispatch.
- New data-raw/province_run.sh: top-level 10-step wrapper (pre-flight, snapshot, spin, prep, archive, smoke, dispatch, acceptance, consolidate, burn) with trap-EXIT cypher burn that fires regardless of mid-flight failure. Drafted ready for a --smoke-only regression-test mode in a follow-up.
- New data-raw/province_clean.sh: idempotent multi-host state wipe (kills R --no-echo + Rscript + run_provincial, drops fresh + working_* + fresh_<bundle>* schemas, reloads fresh.modelled_stream_crossings via snapshot_bcfp.sh --force). <5 min wall.
- New data-raw/province_progress.sh: mtime-based per-host progress probe. Cross-host TZ-glob hell solved by using find -mmin -N and ls -t (newest by mtime, not filename). Cypher logs use UTC, M4/M1 use local; date-globbing across hosts broke at TZ rollover.
- research/post_compact_provincial_handoff.md: tunnel architecture gotcha section (how each host reaches bcfp) + LPT fallback gotcha section.
- planning/active/{task_plan,findings,progress}.md: full PWF capture of 12 distinct gotchas surfaced during the session, including pkill -f Rscript missing the R --no-echo subprocess (caused concurrent dispatches), RDS-cache-skip in run_provincial_parity.R, stale cypher snapshot fresh.* data, M1 SSH key passphrase + Keychain-only unlock, and M4 PG over-tuning. Wrapper test strategy documented.
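The mtime-based probe in province_progress.sh avoids parsing dates out of filenames. A minimal sketch of the find -mmin -N plus ls -t idea follows; the function name newest_recent_log and the *.log pattern are assumptions for illustration, and the sketch assumes log filenames contain no newlines.

```shell
#!/usr/bin/env bash
set -euo pipefail

# newest_recent_log DIR MINUTES: print the most recently modified *.log in
# DIR that changed within the last MINUTES minutes; print nothing if none
# did. Selection is by mtime only, so it does not depend on which timezone
# a host used when stamping its filenames.
# (Hypothetical helper; assumes filenames contain no newlines.)
newest_recent_log() {
  local dir="$1" minutes="$2"
  local recent sorted
  recent="$(find "$dir" -maxdepth 1 -type f -name '*.log' -mmin "-$minutes")"
  [ -n "$recent" ] || return 0
  # ls -t orders by mtime, newest first; keep only the first line.
  sorted="$(printf '%s\n' "$recent" | tr '\n' '\0' | xargs -0 ls -t)"
  printf '%s\n' "${sorted%%$'\n'*}"
}
```

Taking the first line with parameter expansion rather than piping into head keeps the function safe under pipefail.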
Follow-up issues filed (not closed here): #167 tunnel autossh, #168 decouple bcfp compare from pipeline, #169 simplify lnk_persist_init after rtj#145, #170 S3-based consolidate. Plus rtj#145 (rebuild cypher snapshot with fwa-dump tables ONLY) and fresh#199 (reopened — M4 PG over-tuning evidence).
Run result: 217-WSG BC stream network model in M4 fresh schema. Annotated parity CSV at data-raw/logs/provincial_parity/20260514_0622_*_annotated.csv — 91 UNEXPLAINED rows at |diff_pct| >= 2% (acceptance bar still not met; investigation queue for next session).
link 0.36.0
Closes #162. Lifts two scattered data-raw/ scripts (linear rollup parity + per-segment mapping_code parity) into one package-level lnk_compare_wsg(), adds an annotated CSV pipeline (lnk_parity_annotate() against a YAML divergence taxonomy), modernizes the multi-host orchestrator to 5-host (M4 + M1 + N cyphers via tofu workspaces), and hardens the spin-up + smoke flow so failures fail loud + fail fast. Full per-phase summary: planning/archive/2026-05-link162-lnk-compare-wsg-annotated-csv/README.md.
- New exported lnk_compare_wsg(conn, aoi, cfg, loaded, reference, with_mapping_code, ...). Per-WSG convenience wrapper around the existing lnk_pipeline_* phases. Returns list(rollup, mapping_code). reference = "bcfishpass" only initially; with_mapping_code = TRUE adds the per-segment lens additive on top of the same network state (no double-pipeline). Defensive empty-merge handling for the 36 WSGs bcfp doesn’t model (warning + NA-filled tibble, not error). data-raw/compare_bcfishpass_wsg.R collapses from 432 → 77 lines as a thin wrapper; data-raw/compare_bcfp_mapping_code.R deleted.
- New exported lnk_parity_annotate(rollup, taxonomy, to, tolerance). First-match-wins lookup against research/bcfp_divergence_taxonomy.yml. Tags each rollup row with taxonomy_id, class, mechanism, status, refs columns. Unmatched rows: class = UNEXPLAINED (|diff_pct| >= tolerance) | WITHIN_TOLERANCE | NOT_APPLICABLE. Accepts both ref_value and bcfishpass_value column names. Optional CSV write.
- New research/bcfp_divergence_taxonomy.yml: single source of truth for known patterns. 11 entries covering Classes A (SETN stale), B (HORS fresh#158 bypass), C (SK new-geographies fresh#190/#191), D (BBAR + small 2026-05-11 residuals), MEASUREMENT_ASYMMETRY (lake/wetland centerline-vs-polygon).
- data-raw/trifecta_provincial.sh extended for M4 + M1 + N cypher workspaces (--cy-workspaces=job1,job2,job3). Inline greedy LPT bucket allocation (reads prior _per_wsg_times.csv, uses CLI --host-speeds=m4=1.0,m1=0.83,cy=1.83 for projection + back-normalization, no feedback loop). Pre-flight version check across all hosts before dispatch. Post-pull aggregate annotation against the taxonomy. Empty-bucket guard. Cypher-side R log pull-back at run end so cross-repo log boundary doesn’t hide errors. Truth-in-headline reports OK vs error-stub RDS counts (was misleading N/N pulled).
- data-raw/trifecta_smoke.sh rewritten as a 77-line shim over the production orchestrator: one small WSG per host (m4=DEAD, m1=ELKR, cyN=ADMS/BABL/BULL), passes --fail-fast automatically, asserts every smoke RDS is a successful tibble (not error stub) before declaring pass. Exits non-zero with clear message + pointer to the cypher R log when any WSG fails. “Smoke passed” now means every smoke WSG produced a valid tibble, not just “scripts exited 0”.
- data-raw/archive_provincial_runs.sh: new helper. Moves prior-run _per_wsg_times.csv + *.rds + *_annotated.csv into archive/<TS>/ so the LPT planner uses the most recent run only.
- data-raw/balance_provincial_buckets.R: dedup (wsg, host) and cross-host before LPT so multi-run CSV accumulation no longer double-assigns WSGs to buckets. Superseded for the N-host orchestrator (which computes LPT inline) but kept for standalone planning.
- data-raw/consolidate_schema.R: bucket-aware destination cleanup (DELETE FROM <schema>.<table> WHERE watershed_group_code IN (<bucket>) on each watershed_group_code-bearing table before pg_restore, which prevents duplicate-key violations on re-consolidation). Pre/post row-count delta verification: ok = TRUE requires count(*) post-restore > pre-restore (NOT pg_stat_user_tables.n_live_tup which lags asynchronously).
- lnk_persist_init(force_recreate = FALSE): new flag + DDL drift detection via .lnk_validate_persist_table(). Errors loud when an existing target table has unexpected GENERATED ALWAYS columns (catches cypher snapshots baked when fresh::frs_col_generate() had been run on fresh.streams). force_recreate = TRUE DROPs+recreates with correct DDL. 6 new tests cover detection, force-recreate, no-op, and arg validation.
- data-raw/run_provincial_parity.R: --with-mapping-code flag passthrough; new --fail-fast flag (default FALSE preserves soft-fail for full provincial runs; smoke runner injects it automatically so WSG #1 failure on a host stops the loop instead of confirming the same failure 30 more times); post-loop annotation step writes <TS>_<host>_annotated.csv.
- Updated research/provincial_run_runbook.md for the 5-host flow + smoke-first cadence + DDL drift handling. The runbook is now the operational source of truth; data-raw/README.md#provincial-dispatch is the CLI reference.
- 2026-05-12 → 13 live provincial run results in research/provincial_parity_2026_05_12.md. Acceptance bar (zero UNEXPLAINED at |diff_pct| >= 2%) NOT YET MET (56 surviving UNEXPLAINED rows; 93 cypher WSGs lost to DDL drift now fixed by lnk_persist_init hardening, so the next provincial run should hit 217/217 OK and provide the full picture). Operational lessons documented in planning/archive/2026-05-link162-lnk-compare-wsg-annotated-csv/findings.md.
Filed follow-up: #163 — adaptive host_speeds learning from observed wall times (LPT refinement; currently uses static CLI defaults).
link 0.35.1
- data-raw/snapshot_bcfp.sh: replace grep -qi parquet with grep -i parquet > /dev/null in the Parquet prereq check (#160). Under set -euo pipefail, grep -q closes the pipe on first match, ogr2ogr gets SIGPIPE (exit 141), pipefail propagates, ! flips it, and the script FATALs even though the Parquet driver IS present. Originally chased as a non-interactive ssh / conda env issue (NewGraphEnvironment/rtj#129); that was a misdiagnosis, and PATH from rtj#66/#123 was always correct.
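The failure mode above can be reproduced without ogr2ogr. In this minimal sketch, seq stands in for a producer that writes more output than the pipe can buffer (an assumption made for illustration; the real check pipes the ogr2ogr driver listing into grep).

```shell
#!/usr/bin/env bash
set -uo pipefail   # no -e here: the point is to inspect the failing status

# seq stands in for a chatty producer (illustrative substitute for ogr2ogr).
producer() { seq 1 200000; }

# grep -q exits on the first match and closes the pipe; the producer is then
# killed by SIGPIPE and pipefail reports its non-zero status.
producer | grep -q '^1$'
quiet_status=$?    # typically 141 (128 + SIGPIPE)

# Redirecting to /dev/null makes grep read to EOF, so the producer finishes.
producer | grep '^1$' > /dev/null
drain_status=$?    # 0

echo "grep -q: $quiet_status, grep > /dev/null: $drain_status"
```

The match succeeds in both cases; only the pipeline status differs, which is why the prereq check reported a missing driver that was in fact present.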
link 0.35.0
Closes #152. New unified province-wide <persist_schema>.barriers table with a pre-computed blocks_species text[] predicate. Closes the cross-WSG dam_dnstr_ind defect — PARS BT mapping_code parity jumped from 60.64% → 98.63% (+38 pp) because dam barriers in upstream-of-PARS WSGs (Bennett in PCEA, Peace Canyon / Site C in UPCE) now resolve correctly via FWA-topology walks over the province-wide table. Other Phase A WSGs maintained ≥99% across all species (full 6-WSG matrix in research/bcfp_compare_mapping_code.md).
- New exported lnk_barriers_unify(conn, aoi, cfg, loaded, schema). Consolidates four per-WSG barrier source families into <schema>.barriers: anthropogenic (PSCIS / CABD / MODELLED_CROSSINGS with barrier_status IN ('BARRIER','POTENTIAL')), gradient (per-class, blocks_species derived from parameters_fresh$access_gradient_max), falls, and opt-in subsurface_flow. Per-source id_barrier namespacing keeps rows unique within a WSG without coordinating sequence IDs (anthropogenic = aggregated_crossings_id; others get <SOURCE>-<rownum> text prefixes).
- New exported lnk_barriers_views(conn, schema, cfg). Emits per-species (<schema>.barriers_<sp>_unified for the 8 mapping_code species) + per-source (<schema>.barriers_{anthropogenic,pscis,dams}_unified) CREATE OR REPLACE VIEWs over <persist_schema>.barriers. Each view re-exposes id_barrier AS barriers_<x>_unified_id so the existing lnk_pipeline_access feature_id_col = "<table>_id" convention works unchanged. _unified suffix avoids name collisions with the per-WSG tables lnk_pipeline_prep_minimal + lnk_barriers_emit already build (those stay; they’re useful primitives).
- lnk_persist_init() extended with cols_barriers DDL: 13 columns, PK on (id_barrier, watershed_group_code), GIN index on blocks_species, btree indexes on (watershed_group_code, barrier_source) and (blue_line_key, downstream_route_measure), GIST on geom.
- lnk_pipeline_persist() extended with a <schema>.barriers → <persist_schema>.barriers DELETE-WHERE-WSG + INSERT branch (gated on staging-table presence so older orchestrators that don’t yet call lnk_barriers_unify keep working without behaviour change).
- data-raw/compare_bcfp_mapping_code.R: barrier_sources$anthropogenic + barrier_sources$dams now point at the unified views. barriers_per_sp keeps the bcfp-tunnel staging fallback (the unified-table blocks_species predicate doesn’t encode per-species minimal-position semantics; that’s a separate scope expansion).
Closes #154. lnk_pipeline_crossings() now reproduces bcfp’s PSCIS-to-modelled auto-snap layer byte-identically via the fresh primitive composition (lnk_points_snap(num_features = 5L) + fresh::frs_candidates_pick() + bcfp-shape scoring/dedup SQL). Phase A mapping_code parity hits ≥99% on every in-WSG species across ADMS, BULK, WILL, PARS — BULK jumped ~80% → ~99.5%, WILL ~86% → ~99.7%. PARS BT 60% remains cross-WSG dam_dnstr territory (tracked under #152).
- New private helper .lnk_pipeline_pscis_build(conn, aoi, schema, loaded, …) mirrors bcfp’s 02_pscis_streams_150m.sql + 04_pscis.sql at smnorris/bcfishpass@v0.7.14-125-g6e9cf1c. Five-step composition: multi-stream snap → enrich + score (name_score, width_order_score) → b-side modelled-collision dedup → per-PSCIS pick via frs_candidates_pick + AOI filter + DBSCAN 5m cluster + UNIQUE(blue_line_key, downstream_route_measure) dedup → xref-driven INSERT (two-branch UNION ALL: modelled_crossing_id lookup vs linear_feature_id lookup, mirroring referenced_modelled_xing + referenced_streams CTEs). lnk_pipeline_crossings() now calls this helper in place of the bare lnk_points_snap(); minimum snap_tolerance clamped to 150 m to match bcfp.
- lnk_points_snap(): bug fix in the segment-offset downstream_route_measure formula. Previous form ST_LineLocatePoint * ST_Length(s.geom) computed position WITHIN the candidate segment, not the absolute drm on the blue line. Now adds + s.downstream_route_measure and uses s.length_metre with GREATEST/LEAST/FLOOR/CEIL clamping per bcfp’s pattern. New num_features = 1L arg (backwards-compatible) returns up to N candidate streams per input point for downstream scoring workflows.
- .lnk_crossings_union: modelled branch now LEFT JOINs <schema>.crossing_fixes (staged user_modelled_crossing_fixes) and filters WHERE cf.structure IS NULL OR cf.structure = 'OBS' for bcfp parity with load_crossings.sql:634. Without this filter, 275 NONE-fixed modelled crossings leaked through in BULK / 103 in WILL, breaking per-segment mapping_code parity for non-wct species. PSCIS branch now reads from <schema>.pscis (the canonical output of .lnk_pipeline_pscis_build); modelled-branch xref exclusion sources from the same table.
Closes #148. Wednesday-morning sync chain shifted earlier so a fully-fresh local fwapg lands before workday-start, and data-raw/snapshot_bcfp.sh is now schedulable per host without manual install gymnastics.
- .github/workflows/sync-bcfishpass-csvs.yml cron: Wed 6 AM PDT (13:00 UTC) → Wed 4 AM PDT (11:00 UTC). Runs 1 h after the upstream dump in NewGraphEnvironment/db_newgraph#7 (which itself shifted to Wed 3 AM PDT).
- New exported lnk_baseline_current(log, host, path) predicate. Returns TRUE when this host’s most-recent data-raw/logs/bcfp_baselines.csv row already stamps the upstream bcfp_model_version carried in log. Per-host scoped: M4 stamping a SHA must not gate M1 from snapshotting its own fwapg.
- data-raw/snapshot_bcfp.sh updates: self-anchors to repo root via cd "$(dirname "$0")/.." (so cron-default $HOME cwd doesn’t break the relative ledger path); skip-guard runs FIRST via lnk_baseline_current() before any DB-credential resolution (a host with a stale env file can skip cleanly when this week’s ledger already matches); sources ~/.config/snapshot-bcfp.env for per-host DATABASE_URL / PG* vars; xtrace removed (set -euxo pipefail → set -euo pipefail) to keep credentials out of ~/.local/state/snapshot-bcfp/*.log.
- New data-raw/scheduler/ directory with launchd plist (com.newgraph.snapshot-bcfp.plist for M4 + M1, fires Wed 5 AM local), Linux crontab line (snapshot-bcfp.cron for cypher, 0 12 * * WED UTC), and README.md documenting per-host install + uninstall + env file format.
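The per-host scoping of the skip-guard can be sketched in a few lines of R. This is a hypothetical stand-in, not lnk_baseline_current() itself: it takes the ledger as a data frame rather than a path, assumes the ledger column is named bcfp_model_version, and treats the last appended row for a host as its most recent.

```r
# Sketch only: function name, ledger-as-data-frame argument, and the
# bcfp_model_version column name are assumptions for illustration.
baseline_current_sketch <- function(ledger, log, host) {
  rows <- ledger[ledger$host == host, , drop = FALSE]
  if (nrow(rows) == 0L) return(FALSE)   # host has never stamped
  latest <- rows[nrow(rows), ]          # ledger assumed append-only
  identical(latest$bcfp_model_version, log$model_version)
}

ledger <- data.frame(
  host = c("m4", "m1", "m4"),
  bcfp_model_version = c("v1", "v1", "v2"),
  stringsAsFactors = FALSE
)
log <- list(model_version = "v2")

baseline_current_sketch(ledger, log, "m4")   # TRUE: m4 can skip
baseline_current_sketch(ledger, log, "m1")   # FALSE: m1 still snapshots
```

Because only the host's own rows are consulted, M4 stamping the new version does not stop M1 from running.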
link 0.32.1
Post-merge /code-check follow-up on #138 (v0.32.0). Three fragility fixes (no behaviour change for valid inputs) plus a stashed snapshot-script fix:
- .lnk_crossings_union: cast modelled_crossing_id to bigint before adding 1e9 so values past int4 max can’t overflow. Override path (.lnk_crossings_apply_overrides) already did this; the union branch matches now.
- .lnk_crossings_union: switch CABD + modelled FWA joins from LEFT JOIN to INNER JOIN. Missing linear_feature_id (FWA refresh drift) previously NULL’d watershed_key and the row got silently dropped much later by barriers_emit’s blue_line_key = watershed_key filter; the row is now dropped at the union step instead so the count discrepancy is observable upstream.
- lnk_points_snap: pre-flight check on input columns. pts.* would otherwise produce a CREATE TABLE AS error from a column-name collision deep in a 100-line statement; now errors out with a clear list of colliding columns before any DDL runs.
- data-raw/snapshot_bcfp.sh: bcdata bc2pg --refresh requires the target table to already exist; the script now drops then loads instead so first-time snapshots succeed.
link 0.32.0
Closes #138. New lnk_pipeline_crossings() builds <schema>.crossings + <schema>.barriers_* from public-source primitives (BCDC PSCIS via bcdata bc2pg, CABD via the public API, bchamp modelled_stream_crossings.gpkg.zip) — no tunnel, no bcfishpass.barriers_* reads. Phase B of the self-sufficiency roadmap (#117 csv-sync + #137 snapshot script were Phase A).
Four new exports — three are generic enough that they may relocate to a future pac package once that’s scaffolded:
- lnk_inputs_verify(conn, required): fail-loud existence check for <schema>.<table> preconditions. Single round-trip via information_schema.tables.
- lnk_points_snap(conn, table_in, table_out, ...): bulk lateral-KNN snap to FWA. Wraps the same CROSS JOIN LATERAL ... ORDER BY <-> ... LIMIT 1 pattern used by bcfp’s load_dams.sql and link’s existing CABD path. One SQL round-trip; scales province-wide. Handles MultiPoint inputs via ST_GeometryN(..., 1).
- lnk_barriers_emit(conn, schema): emits <schema>.crossings_lookup (slim id + statuses projection) plus four <schema>.barriers_* tables (anthropogenic, pscis, dams, remediations). Filters mirror bcfp’s model/01_access/sql/barriers_*.sql and remediations_barriers.sql.
- lnk_pipeline_crossings(conn, aoi, cfg, loaded, schema, snap_tolerance, pscis_table, modelled_table, dams_table): exported pipeline phase. Composes input verification + PSCIS snap + source-precedence union + override application + barriers emit. Source tables configurable via the *_table args.
Lean column set: only what lnk_barriers_emit() consumes — aggregated_crossings_id, crossing_source, crossing_feature_type, barrier_status, pscis_status, dam_name, network position columns, geom. Drops bcfp’s road tenure / FTEN / OGC / rail / UTM metadata that downstream non-barrier consumers need.
Live ADMS smoke against local fwapg loaded with data-raw/snapshot_bcfp.sh (link#137): 67 PSCIS + 3,584 modelled = 3,651 crossings unioned in <1s; barriers_emit produces 3,616 anthropogenic / 33 PSCIS / 5 remediations.
Tests: 94 new mocked unit-test expectations across the four exports + two internal helpers (.lnk_crossings_union, .lnk_crossings_apply_overrides). 903 PASS / 0 FAIL total.
link 0.31.1
Closes #137. New data-raw/snapshot_bcfp.sh shell script loads bcfp dependencies into a local Postgres from public sources only — no SSH tunnel, no DB pg_dump. Prepares the local fwapg for lnk_pipeline_crossings() (link#138, in flight) and parity comparisons.
- BCDC PSCIS via Python bcdata bc2pg --refresh → whse_fish.pscis_* (4 tables).
- CABD dams via ogr2ogr from CABD’s public GeoJSON API → cabd.dams.
- bchamp modelled_stream_crossings.gpkg.zip via curl + ogr2ogr → fresh.modelled_stream_crossings.
- bchamp observations.parquet via ogr2ogr /vsicurl/... → bcfishobs.observations (same artifact bcfp’s jobs/load_observations consumes).
- Optional --with-bcfp-views: pulls Simon’s bcfp output views (crossings_vw, streams_vw) from s3://newgraph/ for parity comparison.
- Stamps data-raw/logs/bcfp_baselines.csv with the bcfp build identifier from s3://fresh-bc/bcfishpass/log.json via lnk_baseline_append().
Documented in data-raw/README.md under a new ## Bootstrap section.
link 0.31.0
Closes #117. csv-sync flips from GitHub-API SHA-walking to reading from s3://fresh-bc/bcfishpass/ (populated weekly by NewGraphEnvironment/db_newgraph). Cadence drops from daily to weekly Wed afternoon. Eliminates the 1–7 day drift between bundle CSVs and the upstream tunnel rebuild SHA.
Four new exports support csv-sync + downstream parity drivers + future multi-build comparison (grayling / rainbow / ko / etc.):
- lnk_bucket_get(name, prefix, to): fetch any artifact from a versioned S3 prefix. Returns raw bytes by default (the caller decodes with read.csv(), jsonlite::fromJSON(), or arrow::read_parquet()); writes to disk when to is supplied. Default prefix is NGE’s bcfp dump. Format-agnostic.
- lnk_bucket_log(prefix): sugar for the most common read. Parses <prefix>/log.json into a list with model_version, date_completed, head_sha. Validates required keys.
- lnk_baseline_read(path): read the run-tracking ledger (data-raw/logs/bcfp_baselines.csv by default) as a tibble. Validates cols_baseline shape on read.
- lnk_baseline_append(log, run_label, ...): append a stamped row from a lnk_bucket_log() result. Used by csv-sync to record which build each sync ran against; reusable by parity-run drivers.
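The required-key validation described for lnk_bucket_log() might look roughly like the following. It is a sketch over an already-parsed list, not the exported function: the helper name and error wording are hypothetical, and only the three key names come from the notes above.

```r
# Sketch only: validates a parsed log.json list; name and message are
# hypothetical, the three required keys are taken from the release notes.
log_validate_sketch <- function(x) {
  required <- c("model_version", "date_completed", "head_sha")
  missing <- setdiff(required, names(x))
  if (length(missing) > 0L) {
    stop("log.json is missing required keys: ",
         paste(missing, collapse = ", "), call. = FALSE)
  }
  x[required]   # return just the validated keys, in a fixed order
}

# Placeholder values, not a real build
ok <- log_validate_sketch(list(
  head_sha = "abc123", model_version = "v0", date_completed = "2000-01-01"
))
names(ok)   # "model_version" "date_completed" "head_sha"
```

Failing early on a malformed log keeps a bad stamp from ever reaching the ledger.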
data-raw/sync_bcfishpass_csvs.R rewritten to consume the new exports; integrates a crate::crt_schema_validate() gate for provenance entries with canonical_schema: declared (escalates drift_kind to "shape" on validation failure).
httr + jsonlite added to Imports.
link 0.30.2
Closes #135. lnk_pipeline_access() now computes dam_dnstr_ind and (optionally) remediated_dnstr_ind from the same primitives that drive the per-species access codes, eliminating the bcfp-merge-in step needed for full BT/WCT parity in 0.30.0. Both lnk_pipeline_access() and lnk_pipeline_mapping_code() consume the new lnk_presence() helper (v0.30.1) to short-circuit absent species cleanly.
- dam_dnstr_ind is sequence-aware: TRUE iff the next-downstream anthropogenic barrier is also a dam. Mirrors bcfp’s array[barriers_anthropogenic_dnstr[1]] && barriers_dams_dnstr overlap check. Both barriers_anthropogenic and barriers_dams populate their primary key from crossings.aggregated_crossings_id, so the IDs returned by frs_network_features are in a shared space and %in% works directly. ADMS parity vs bcfishpass.streams_access.dam_dnstr_ind: 11803 FALSE / 3960 TRUE, zero off-diagonal differences.
- lnk_pipeline_access() gains an optional crossings_table = NULL arg. When supplied alongside barrier_sources$remediations, computes remediated_dnstr_ind per the bcfp-intended logic: TRUE iff the next-downstream remediation is a crossing where pscis_status = 'REMEDIATED' AND barrier_status = 'PASSABLE'.
- bcfp’s own streams_access.remediated_dnstr_ind is FALSE for every row in the build due to a 2-year-old contradictory clause in load_streams_access.sql (introduced by smnorris/bcfishpass#339 and inlined in v070 by smnorris/bcfishpass#690). link computes the bcfp-intended dual-column semantics so mapping_code_<bt|wct> may emit REMEDIATED tokens where bcfp’s current output emits DAM / MODELLED / ASSESSED. Upstream fix filed as smnorris/bcfishpass#891 + smnorris/bcfishpass#892.
- lnk_pipeline_access() and lnk_pipeline_mapping_code() accept an optional presence arg (an lnk_presence object). When supplied, absent species short-circuit cleanly: lnk_pipeline_access skips the frs_network_features query and emits access_<sp> = -9; lnk_pipeline_mapping_code emits "". Eliminates the salmon-group-absent over-emission caught in the multi-WSG sweep on ELKR + HORS.
- ADMS validation, no bcfp merge-in: mapping_code_bt 15733/15763 (30 REMEDIATED divergences, all the bcfp v070 regression), mapping_code_ch/cm/co/pk/sk 15761/15763 (2 each), mapping_code_st/wct 15763/15763. Stamped logs under data-raw/logs/<TS>_link135_parity_*.txt.
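The sequence-aware dam_dnstr_ind rule reduces to a membership test on the nearest downstream anthropogenic barrier. The R sketch below assumes the downstream IDs arrive ordered nearest-first and that a segment with no downstream anthropogenic barrier yields FALSE; the function name is hypothetical.

```r
# Sketch only: anthropogenic_dnstr is assumed ordered nearest-first, and
# both vectors hold aggregated_crossings_id values from the shared ID space.
dam_dnstr_sketch <- function(anthropogenic_dnstr, dams_dnstr) {
  if (length(anthropogenic_dnstr) == 0L) return(FALSE)  # assumed default
  anthropogenic_dnstr[1] %in% dams_dnstr
}

# Next barrier downstream is a culvert; a dam sits further down
dam_dnstr_sketch(c("culvert_7", "dam_2"), "dam_2")   # FALSE
# Next barrier downstream is the dam itself
dam_dnstr_sketch(c("dam_2", "culvert_9"), "dam_2")   # TRUE
```

A plain "any dam downstream" test would return TRUE in both cases, which is why the first element matters.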
link 0.30.0
Closes #124. Reproduces bcfishpass’s three classification surfaces (crossings.barrier_status, streams_access, streams_mapping_code) as additive layers — link’s existing severity and 5-bucket mapping_code are unchanged.
- lnk_pipeline_access(conn, segments, aoi, ...): composes fresh::frs_network_features() (fresh 0.29.0+) calls across species + observations into a streams_access-shape wide tibble. Per-segment per-species access_<sp> integer codes (-9 / 0 / 1 / 2) for absent / blocked / modelled / observed. Caches per-table dnstr queries: 5 species pointing at one grouped barriers table run the SQL once. Auto-NA propagation when a barriers source has zero rows in the AOI mirrors bcfp’s barriers_<sp>_dnstr IS NULL semantics for absent species.
- lnk_pipeline_mapping_code(access, habitat, feature_code, ...): pure R derivation over the bcfp-shape access columns. Resident-flavor (BT, WCT) vs anadromous-flavor (CH/CM/CO/PK/SK/ST) handling for mapping_code_barrier. Spawn-only species (CM, PK) emit only ACCESS / SPAWN token1 (no REAR per bcfp). feature_code = "GA24850150" flags INTERMITTENT. Optional to= arg writes <schema>.streams_mapping_code for downstream views.
- ADMS parity validation: 15762 / 15762 byte-identical to bcfishpass.streams_mapping_code for all 8 species (BT, CH, CM, CO, PK, SK, ST, WCT). Per-species access_<sp> ≥99% match (1-row totals diff + ~13-row obs/modelled drift attributable to bcfp’s life_stage / activity / point_type observation filters not yet applied in link).
- barrier_status (Phase 1): already populated correctly by lnk_pipeline_load via .lnk_pipeline_apply_fixes + .lnk_pipeline_apply_pscis. Roxygen note added distinguishing barrier_status (bcfp-parity, PSCIS-field + CSV override) from severity (link’s culvert-geometry scoring). Both can coexist on the same crossings row.
- build_species_views.R --bcfp sibling view per species: streams_<sp>_bcfp_vw carries the bcfp-shape mapping_code_<sp> string for QGIS A/B comparison against the existing streams_<sp>_vw (link’s 5-bucket categories). Both views co-exist; symbology hint covers each.
- scripts/update_hosts.sh: pak-bug-bypass updater for trifecta hosts. Uses R CMD INSTALL from a GitHub source tarball, sidesteps r-lib/pak#658 which mis-reports cypher’s permission-denied installs as “empty archive” when the user’s first .libPaths() entry isn’t writable.
- data-raw/trifecta_provincial.sh: --rds-dir= pass-through arg for recovery runs that need to bypass the resume RDS cache (e.g. running cypher’s bucket on M4 after cypher destroy).
- Caveat for full BT/WCT parity: mapping_code_<bt|wct> uses bcfp’s pre-computed dam_dnstr_ind / remediated_dnstr_ind via merge-in. Computing those from link primitives requires sequence-aware “next downstream barrier IS a dam” logic, tracked as a follow-up issue. Anadromous species + non-resident BT/WCT in non-overlap WSGs are byte-identical without the merge.
link 0.29.1
Closes #121. Auto-stamps the bcfp comparison reference (model_run_id + version SHA + completion timestamp) into data-raw/logs/bcfp_baselines.csv from inside data-raw/run_provincial_parity.R. Tuesday weekly bcfishpass.* rebuilds shift the comparison reference; un-stamped runs were ambiguous after the fact. Orchestration tooling only — no public R API changes.
- New inline stamp_bcfp_baseline() helper in data-raw/run_provincial_parity.R, called once per invocation between the per-WSG-timings setup and the WSG loop. Same wiring covers single-host and trifecta-dispatched per-host runs.
- data-raw/logs/bcfp_baselines.csv gains a host column between run_started_pdt and run_label. Three existing rows backfilled to host=m4 (single-host M4 runs). Trifecta runs now produce three rows per run, one per host, all with the same bcfp_model_run_id.
- Host alias resolves via LNK_HOST_ALIAS env var (e.g. LNK_HOST_ALIAS=m4 in ~/.Renviron); falls back to Sys.info()[["nodename"]] when unset.
- Tunnel-tolerant: connection failure or unset PG_PASS_SHARE logs a warning and the build proceeds (per-WSG comparisons further down would fail too if the tunnel were genuinely broken, so the stamp is not the actual blocker). Idempotent on (host, link_schema, bcfp_model_run_id, run_started_pdt): same-minute re-runs (resume scenarios) skip silently rather than duplicate.
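The host-alias fallback and the idempotency key can be sketched together in base R. Both helper names are hypothetical and the ledger is passed as a data frame for illustration; the environment variable, the Sys.info() fallback, and the four key columns are the ones named above.

```r
# Sketch only: helper names are hypothetical.
host_alias_sketch <- function() {
  alias <- Sys.getenv("LNK_HOST_ALIAS", unset = "")
  if (nzchar(alias)) alias else Sys.info()[["nodename"]]
}

# TRUE when a row with the same four-column key is already in the ledger
already_stamped_sketch <- function(ledger, row) {
  key <- c("host", "link_schema", "bcfp_model_run_id", "run_started_pdt")
  if (nrow(ledger) == 0L) return(FALSE)
  hits <- Reduce(`&`, lapply(key, function(k) ledger[[k]] == row[[k]]))
  any(hits)
}

# Placeholder values for illustration only
row <- list(host = "m4", link_schema = "fresh",
            bcfp_model_run_id = "1", run_started_pdt = "2000-01-01 05:00")
ledger <- as.data.frame(row, stringsAsFactors = FALSE)
already_stamped_sketch(ledger, row)                          # TRUE: skip
already_stamped_sketch(ledger, modifyList(row, list(host = "m1")))  # FALSE
```

Keying on run_started_pdt at minute resolution is what lets a resumed run skip quietly while a later run still appends.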
link 0.29.0
Closes #118. DB hygiene to prevent the disk-full incident that crashed cypher’s fresh-db container during the 2026-05-04 default_extrabreaks provincial trifecta. Two-tier orchestrator-level cleanup; no in-package API changes.
- compare_bcfishpass_wsg() gains cleanup_working = TRUE parameter: drops working_<aoi> schema after the rollup tibble is built. Default-on; pass FALSE for interactive debug. Saves ~10–15 GB per provincial run on every host (60+ working schemas accumulated otherwise).
- consolidate_schema() gains keep_source = FALSE parameter: drops source schema on each remote host after a successful pg_restore. Default-on; rc-guarded (failed restore leaves source for retry); warn-but-don’t-fail on drop rc != 0. Saves ~25–30 GB per consolidated bundle on M1 + cypher.
- data-raw/README.md documents per-worker disk capacity: rough footprints (~30 GB single-bundle persistent + 10–15 GB per-WSG scratch + 30–40 GB fwapg base), 60 GB minimum free recommendation, 2026-05-04 cypher incident as cautionary tale.
- Bit-identical bcfp parity by default. ADMS rollup tibble post-cleanup identical() to pre-cleanup baseline (RDS file metadata differs but deserialized object is identical).
- Approach: orchestrator-level cleanup, NOT in-package. lnk_pipeline_persist stays scoped to one job; the rollup query reads working schema in long-form AFTER persist returns, so the natural lifecycle owner is the orchestrator script.
link 0.28.0
Orphan-class break source — fed-vector experiments now Just Work without a separate knob. When cfg$pipeline$gradient_classes (or the caller’s classes arg) contains thresholds below every modelled species’s access_gradient_max, those positions enter gradient_barriers_minimal as a barriers_orphan table — no per-species filter, no minimal reduction (every detected position splits the network for segmentation precision). Access semantics are unaffected: fresh’s per-species access label filter at classify time rejects any gradient_NNNN label below the species’s threshold, so orphan classes never block any species.
- New experimental bundle inst/extdata/configs/default_extrabreaks/ extends default with pipeline.gradient_classes set to the union of access (0.15/0.20/0.25) + per-species spawn / rear gradient maxima from fresh’s parameters_habitat_thresholds.csv (0.0249–0.1049). Persists to fresh_default_extrabreaks schema for side-by-side compare against the fresh_default reference.
- ADMS smoke test on the bundle: BT spawning +11.2 km (+3.1 %) vs default-bundle baseline; SK spawning +13.9 km (+6.4 %); RB spawning +8 km (+2.6 %). Rear shifts much smaller (±5 km). Effect is the “ceiling sub-segment” mechanism: when a generally-flat reach is broken at a low spawn/rear gradient threshold, the steep pocket separates and the remaining majority averages to a lower local gradient that newly passes the per-segment spawn predicate.
- Provincial run: ./trifecta_provincial.sh --config=default_extrabreaks --schema=fresh_default_extrabreaks (~2.5h wall, same shape as the v0.26.0 default trifecta).
- Bit-identical to v0.27.0 on bcfp + default config (no orphans: both default vectors live at-or-above each species’s access threshold). Suite: 735 PASS / 0 FAIL.
link 0.27.0
Closes #45. Two coupled hardcodes in R/lnk_pipeline_prepare.R — the bcfishpass gradient class break vector and the per-model class filter list — are now configurable. Unblocks alternative-methodology experiments that need different break thresholds (e.g. breaking the network at the union of unique per-species rearing/spawning/access gradient values, or finer 0.05-step bins) while preserving bit-identical bcfishpass parity by default.
- lnk_pipeline_prepare() gains a classes argument: a named numeric vector of gradient class break thresholds. When NULL, falls back to cfg$pipeline$gradient_classes if set in the bundle, otherwise to the bcfishpass default c("1500" = 0.15, "2000" = 0.20, "2500" = 0.25, "3000" = 0.30). Optional pipeline.gradient_classes knob documented (commented-out) in bcfishpass/config.yaml and default/config.yaml.
- .lnk_pipeline_prep_minimal() replaces the hardcoded per-model models list with per-species derivation: for each species in cfg$species (with loaded$parameters_fresh$species_code fallback), classes whose value is >= access_gradient_max form that species’s barrier filter. Per-species barrier tables become barriers_<sp> (lowercase species code, validated). Species with NA / zero / missing access_gradient_max are skipped.
- Bit-identical bcfp parity verified on ADMS/HARR/BABL/BULK (same digests as pre-#45 baseline). Override mechanism end-to-end demonstration: dropping the 0.25 break on ADMS expands BT habitat ~30% (BT@0.25 loses its barrier filter when no class >= 0.25 exists), CH/CO/SK unchanged.
- Empty species set (no presence-flagged species + no override) yields a structurally valid empty gradient_barriers_minimal table so downstream phases find the expected name. Defensive sp_amax[1L] handles the (unlikely) case of duplicate species_code rows in parameters_fresh.csv, which would otherwise trip R 4.3+ length-1 enforcement on ||.
- 5 new + 2 updated mocked tests (prep_gradient classes threading; prep_minimal per-species derivation, skip path, custom-vector path; .lnk_resolve_classes precedence; YAML→R round-trip through lnk_config()). Full suite: 728 PASS / 0 FAIL.
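The class-resolution precedence and the per-species filter described above can be sketched as two small R functions. The function names are hypothetical stand-ins (not .lnk_resolve_classes or .lnk_pipeline_prep_minimal themselves); the default vector and the >= access_gradient_max rule are taken from the notes.

```r
# Sketch only: function names are hypothetical.
classes_default <- c("1500" = 0.15, "2000" = 0.20, "2500" = 0.25, "3000" = 0.30)

# Precedence: explicit argument, then bundle config, then bcfishpass default
resolve_classes_sketch <- function(classes = NULL, cfg = list()) {
  if (!is.null(classes)) return(classes)
  from_cfg <- cfg$pipeline$gradient_classes
  if (!is.null(from_cfg)) return(from_cfg)
  classes_default
}

# Classes at or above the species' access threshold form its barrier filter;
# NA / zero thresholds are skipped (returned as NULL here)
species_classes_sketch <- function(classes, access_gradient_max) {
  if (is.na(access_gradient_max) || access_gradient_max == 0) return(NULL)
  classes[classes >= access_gradient_max]
}

cls <- resolve_classes_sketch()
names(species_classes_sketch(cls, 0.25))   # "2500" "3000"

# A custom vector with no class >= 0.25 leaves the filter empty
custom <- resolve_classes_sketch(c("1500" = 0.15, "2000" = 0.20))
length(species_classes_sketch(custom, 0.25))   # 0
```

The second call shows the override case from the ADMS demonstration: with no class at or above 0.25, a species whose threshold is 0.25 has no gradient barrier filter at all.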
link 0.26.0
Closes #112. Per-WSG output now persists into province-wide <schema>.streams + <schema>.streams_habitat_<sp> tables, mirroring bcfp’s bcfishpass.streams + bcfishpass.habitat_linear_<sp> pattern. Queryable across WSGs for cartography, intrinsic-potential maps, per-crossing rollups, and methodology comparisons — no more re-running 232 WSGs to look at one.
- New lnk_persist_init(conn, cfg, species): idempotent CREATE SCHEMA IF NOT EXISTS + CREATE TABLE IF NOT EXISTS for the persistent tables. DDL driven by cols_streams (21 columns mirroring bcfp.streams + link’s id_segment) and cols_habitat (7 columns: id_segment + watershed_group_code + 5 booleans accessible / spawning / rearing / lake_rearing / wetland_rearing). geom geometry(MultiLineStringZM, 3005): FWA streams are XYZM (X, Y, elevation, measure).
- New lnk_pipeline_persist(conn, aoi, cfg, species, schema): DELETE-WHERE-WSG + INSERT for streams + per-species streams_habitat_<sp>. Long→wide pivot: per-species INSERT filters working_<aoi>.streams_habitat WHERE species_code = '<sp>' and projects cols_habitat (drops species_code from SELECT). Idempotent re-runs replace cleanly.
- Pipeline rewire: per-WSG segment-level data (streams, streams_habitat, streams_breaks) now lives in working_<aoi> (the per-WSG schema where every other staging table already lived) instead of the previously-shared fresh schema. ~12 hardcoded literals updated across lnk_pipeline_prepare / break / classify / connect + compare_bcfishpass_wsg.R.
- New pipeline.schema config knob (REQUIRED, default fresh): enables side-by-side bundle compare (schema: fresh_bcfp vs schema: fresh_default), within-host parallelism (schema: fresh_w1 / fresh_w2), branch isolation, centralized vs distributed write target.
- compare_bcfishpass_wsg.R orchestrator now calls lnk_persist_init + lnk_pipeline_persist after lnk_pipeline_connect.
- Trifecta provincial run end-to-end (M4 + M1 + cypher, ~2h wall, pg_dump consolidation onto M4): 217 WSGs / 5.3M segments persistently in fresh.streams. 5/5 test WSG rollups byte-identical to pre-#112 baseline (LRDO/SETN/ADMS/BULK/HARR on SK spawn+rear+lake).
- New tests: test-lnk_persist_init.R (28), test-lnk_pipeline_persist.R (4). Updated 3 stale literal-string assertions in test-lnk_pipeline_prepare.R + test-lnk_pipeline_classify.R. Full suite: 710 PASS / 0 FAIL.
- Removed data-raw/run_nge.R, superseded by compare_bcfishpass_wsg(wsg, lnk_config("default")).
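The DELETE-WHERE-WSG + INSERT pattern for one species can be sketched as a pair of SQL strings built in R. This is an illustration, not the body of lnk_pipeline_persist(): the helper name is hypothetical, and the lowercase working schema name, the uppercase WSG literal, and the presence of watershed_group_code in the working table are assumptions. Only the seven cols_habitat columns and the species_code filter come from the notes.

```r
# Sketch only: plain sprintf() interpolation for readability; real code
# should quote identifiers and literals through the DB driver.
cols_habitat <- c("id_segment", "watershed_group_code", "accessible",
                  "spawning", "rearing", "lake_rearing", "wetland_rearing")

persist_sql_sketch <- function(schema, aoi, sp) {
  target <- sprintf("%s.streams_habitat_%s", schema, tolower(sp))
  source <- sprintf("working_%s.streams_habitat", tolower(aoi))
  cols <- paste(cols_habitat, collapse = ", ")
  c(
    # Clear this WSG's rows first so a re-run replaces rather than duplicates
    delete = sprintf("DELETE FROM %s WHERE watershed_group_code = '%s'",
                     target, toupper(aoi)),
    # Long -> wide: filter the long table to one species, drop species_code
    insert = sprintf("INSERT INTO %s (%s) SELECT %s FROM %s WHERE species_code = '%s'",
                     target, cols, cols, source, toupper(sp))
  )
}

sql <- persist_sql_sketch("fresh", "ADMS", "BT")
cat(sql, sep = ";\n")
```

Running the delete and insert in one transaction would keep readers from seeing a WSG with no rows mid-refresh; whether the package does so is not stated in these notes.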
link 0.25.1
Pre-trifecta config homework — catches staleness in the config layer before the 3-host distributed run, so we’re not chasing ghosts later.
- Both bundles’ rules.yaml regenerated via lnk_rules_build() (date-stamp diff only; semantically identical to what was committed).
- provenance: checksums recomputed in both config.yaml for the 4 files modified across v0.21–v0.25 (rules.yaml, dimensions.csv, parameters_fresh.csv, overrides/wsg_species_presence.csv). lnk_config_verify now reports drifted = 0 / 12 for both bundles.
- Closes #108: compare_bcfishpass_wsg returns bcfishpass_value = NA (not 0) when bcfp doesn’t model a species. Distinguishes “real measured zero” from “not modelled by bcfp”; diff_pct cleanly resolves to NA. PARS run proves GR / KO / RB classify end-to-end on the default bundle (KO 377 ha lake-rearing, RB 1,839 ha lake + 7,796 ha wetland, GR 1,566 ha lake).
- compare_bcfishpass_wsg adds a species filter parameter: pass c("BT","CH",…) to drop GR/KO/RB from the rollup entirely.
- 4 stale tests in test-lnk_rules_build.R updated for the stream_order → stream_order_min / stream_order_max rename (fresh#198) and the per-species in_waterbody semantics. Full suite: 668 PASS / 0 FAIL.
- New data-raw/audit_configs.R reports drift across all layers; re-runnable before any trifecta or provincial run.
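Why NA rather than 0 matters for diff_pct can be seen in a two-line R sketch. The percent-difference formula here is an assumption (the notes do not spell it out), and the function name is hypothetical.

```r
# Sketch only: assumed formula, relative difference against the bcfp value.
diff_pct_sketch <- function(link_value, bcfishpass_value) {
  (link_value - bcfishpass_value) / bcfishpass_value * 100
}

diff_pct_sketch(377, NA_real_)   # NA:  bcfp does not model the species
diff_pct_sketch(377, 0)          # Inf: reads as an unbounded over-credit
```

With NA the comparison row is visibly "not applicable", whereas a 0 reference would produce a misleading infinite difference.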
link 0.25.0
Closes #106. Drops the hardcoded species-presence column list in lnk_pipeline_species + lnk_pipeline_break — both now derive the column list from the wsg_species_presence.csv header via the new .lnk_wsg_species_present() helper. Adding a new species column propagates to every callsite without a code edit.
- Adds ko (Kokanee) column to both bundles’ wsg_species_presence.csv with sentinel t for PARS, KOTL, NATR, CARP; interim until upstream bcfishpass.wsg_species_presence ships authoritative coverage (NewGraphEnvironment/bcfishpass#12).
- Adds GR + KO species rows to default/parameters_fresh.csv (already in default/dimensions.csv and rules.yaml).
- New tests assert column-propagation for newly-added species and notes-column ignoring.
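Deriving the species list from the CSV header, rather than from a hardcoded vector, can be sketched as follows. This is a hypothetical stand-in for .lnk_wsg_species_present(): the key column name watershed_group_code and the notes column name are assumptions, and the inline CSV is made-up illustration data.

```r
# Sketch only: every header column except the key and notes columns is
# treated as a species; a value of "t" marks the species present.
wsg_species_present_sketch <- function(presence, wsg) {
  sp_cols <- setdiff(names(presence), c("watershed_group_code", "notes"))
  row <- presence[presence$watershed_group_code == wsg, sp_cols, drop = FALSE]
  if (nrow(row) == 0L) return(character(0))
  present <- vapply(row[1, ], function(v) identical(as.character(v), "t"),
                    logical(1))
  sp_cols[present]
}

# Illustration data only, not the shipped CSV
csv <- "watershed_group_code,bt,ko,notes\nPARS,t,t,interim\nADMS,t,f,\n"
presence <- read.csv(text = csv, colClasses = "character")

wsg_species_present_sketch(presence, "PARS")   # "bt" "ko"
wsg_species_present_sketch(presence, "ADMS")   # "bt"
```

Adding a new species column to the CSV changes the result without touching the function, which is the propagation behaviour the new tests assert.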
link 0.24.0
Closes #103. Ingests CABD dams as a parallel reporting dimension. .lnk_pipeline_prep_dams() replicates bcfishpass’s model/01_access/sql/load_dams.sql against the cabd.dams source over the db_newgraph tunnel and writes <schema>.dams mirroring bcfishpass.dams column-for-column. Both bcfishpass and default bundles ingest — the data is methodology-agnostic at the data layer.
- New optional conn_tunnel = NULL arg to lnk_pipeline_prepare(). When NULL, prep_dams short-circuits to DROP TABLE IF EXISTS <schema>.dams: zero-cost opt-out for CI / non-reporting workflows.
- Four CABD edit CSVs (cabd_exclusions, cabd_blkey_xref, cabd_passability_status_updates, cabd_additions) ship in both bundles’ overrides/ and are loaded through lnk_load_overrides() like any other override.
- Habitat output is unchanged. <schema>.dams and <schema>.cabd_* are not consumed by any break / classify / connect phase. HARR dams-ON / dams-OFF rollup is byte-identical to fp precision; confirms the parallel-data invariant.
- LFRA verification: 65 dams / 59 barriers / 15 named, with Stave Falls (26 m), Alouette (22.5 m), Ruskin (59.4 m), Coquitlam (30.5 m), Northwest Stave + Upper Stave variants all present at the same (blue_line_key, downstream_route_measure) as bcfishpass.dams within fp precision.
- Per-species methodology (“should some dam classes block which species in the default bundle?”) is intentionally out of scope; tracked at #83.
link 0.23.0
Closes #96. falls added as a segmentation break source — the FWA stream network is now broken at every fall position. Previously the <schema>.falls table was loaded and used for access gating + obs/habitat lift but not for segmentation, so close-paired falls (no other break source between them) produced segments that spanned the second fall and incorrectly classified its upper portion as accessible.
- New entry in R/lnk_pipeline_break.R’s source_tables and the default break_order. Both bundle configs (bcfishpass, default) opt in via pipeline.break_order.
- Falls are NOT minimal-reduced: each fall is its own barrier (unlike gradient barriers which go through frs_barriers_minimal).
- Closes the implementation drift from the docstring at R/lnk_pipeline_break.R:10-13 which already documented the bcfp order as observations → gradient_minimal → falls → barriers_definite → habitat_endpoints → crossings.
- 4-WSG regression vs pre-fix baseline (HARR/HORS/LFRA/BABL): all four show small expected reductions (BT ~0.6–1.5 km on HARR/HORS; 7 species × ~0.43 km each on LFRA; 4 species × 0.94–1.59 km each on BABL). All deltas negative: segments above falls correctly become inaccessible. See research/bcfishpass_comparison.md § “falls in break_order (#96)”.
- HORS BLK 356357296 evidence case: pre-fix segment 12671 (1447 m straddling the fall at DRM 67565) split into 12677 (17 m below) + 12678 (1429 m above, accessible=FALSE).
- Map cache helper data-raw/maps/_lnk_map_compare.R hardened: stale 0-row caches (left when the pipeline runs for one WSG and the map is rendered for another) now refetch instead of erroring on missing CRS.
link 0.22.0
Wires fresh::frs_order_child into the pipeline as link methodology — small streams plugging directly into large rivers can be credited as rearing despite low/missing FWA channel-width estimates. Closes fresh#158 on the link side.
- Four new per-species columns in dimensions.csv (both bundles), all opt-in via rear_stream_order_bypass: yes/no:
  - rear_stream_order_parent_min: min order at the trib BLK’s mouth confluence (default 5, matches bcfp)
  - rear_stream_order_child_min: lower bound on segment’s own stream_order (default 1)
  - rear_stream_order_child_max: upper bound on segment’s own stream_order (default 1)
  - rear_stream_order_distance_max: cap (m) on distance from trib mouth (empty = no cap)
- lnk_rules_build emits the values into a channel_width_min_bypass: block on the rear stream-edge rule. lnk_pipeline_classify reads the block and calls frs_order_child per species post-classification, gated on rear_stream_order_bypass.
- Both bundles ship bypass: no for all species; infrastructure is parametric and tested but disabled by default. Re-enable per species via dimensions.csv. The 4-WSG regression (HARR / HORS / LFRA / BABL) is byte-identical to the pre-#96 baseline with bypass=off, confirming the wiring is purely additive when disabled.
- Updates inst/extdata/configs/dimensions_columns.csv xref doc with all four new columns and refreshes the rear_stream_order_bypass entry (was stale: said “currently inert”).
- Bumps fresh dep to >= 0.27.5 for the renamed bypass YAML schema (stream_order → stream_order_min + stream_order_max).
Related: link#23 (CH spawning misread, closed not-a-bug). PWF for the wire-up at planning/active/.
link 0.21.0
Closes #87. Default-bundle SK upstream-spawn now credits any spawn-eligible segment upstream of and accessible from a qualifying rearing waterbody, dropping bcfishpass’s restrictive cluster + lake-adjacency gate. bcfishpass-bundle SK keeps the gate (parity preserved).
- New spawn_connected_lake_adjacent column on both dimensions.csv schemas. SK row: yes (bcfishpass) / no (default). Empty for non-SK species, which inherit fresh’s TRUE default.
- lnk_rules_build emits <sp>.spawn_connected.lake_adjacent when the dimension is non-empty. Older rules.yaml files without the key remain valid.
- Bumps fresh dep to >= 0.26.0 (knob lives there).
link 0.20.1
Closes #92. Per-AOI observations filter mirrors bcfp’s wsg_species_presence + observation_key exclusions.
- New .lnk_pipeline_prep_observations() builds <schema>.observations per AOI, mirroring bcfp’s model/01_access/sql/load_observations.sql. Filters bcfishobs.observations by the WSG’s species set (only species marked present count) and applies QA exclusions (data_error / release_exclude rows removed, keyed on observation_key). The previous key, fish_observation_point_id, was never present in the CSV, so the empty intersect silently dropped all 1,182 exclusions.
- Downstream consumers updated: prep_overrides reads <schema>.observations (no longer takes observations param); lnk_pipeline_break_obs simplified to a thin reader; lnk_barrier_overrides uses observation_key.
- TWAC pre-flight: BT spawning / rearing / rearing_stream collapsed from +21–30% over-credit to 0.0% across the board. 15-WSG tar_make: HARR + LFRA BT tightened toward parity (LFRA BT rearing_stream -3.75% → -0.93%; HARR BT rearing_stream -4.19% → -1.29%); other 13 WSGs unchanged. HORS BT stays -7.68% (fresh#158 stream-order bypass, a distinct mechanism).
- Default bundle also tightens (6 rows on HARR/LFRA BT): a methodology correctness improvement, not a regression.
link 0.20.0
Closes #88. Subsurfaceflow folded into the natural-barrier set so per-species observation/habitat upstream lift fires on it.
- .lnk_pipeline_prep_natural() now builds the full bcfishpass natural-barrier union (gradient + falls + opt-in subsurfaceflow). Subsurfaceflow positions land in <schema>.natural_barriers, which lnk_barrier_overrides() consumes, so per-species observation/habitat upstream lift applies to subsurfaceflow exactly as it does to falls and gradient.
- .lnk_pipeline_prep_subsurfaceflow() deleted; its body absorbed into prep_natural. Six prep helpers → five.
- Default-bundle off-switch unchanged: omit subsurfaceflow from cfg$pipeline$break_order and the entire code path skips. Verified bit-identical default rollup (0 of 581 rows changed).
- bcfishpass-bundle parity: HARR CH/CO/ST rearing_stream gaps closed from -14.8/-13.3/-11.6% to within ±0.32%. LFRA CH/CO/ST closed to within ±0.6%. HARR blkey 356286055 BT credits 6.509 km (was 0).
- Reproducibility: two consecutive 15-WSG tar_make runs produced byte-identical rollup (digest::digest(link_value) matches across runs).
- HORS rearing_stream gap (~7% on BT/CH/CO) is unchanged by this fix; it is a separate mechanism, left for a follow-up.
link 0.19.0
Closes #82. Subsurface-flow access barriers + parity claim retraction.
Subsurface-flow as opt-in access barrier. Closes the largest single gap surfaced when expanding the bcfishpass-config rollup from 5 to 10 watershed groups: NATR BT spawning +15.2% → +1.5%, NATR BT rearing +13.0% → -0.6% (10-WSG tar_make log: data-raw/logs/20260429_02_tar_make_subsurf.txt).
- New .lnk_pipeline_prep_subsurfaceflow() materialises <schema>.barriers_subsurfaceflow from whse_basemapping.fwa_stream_networks_sp filtered to edge_type IN (1410, 1425). Honours user_barriers_definite_control. Mirrors bcfishpass model/01_access/sql/barriers_subsurfaceflow.sql exactly.
- New subsurfaceflow entry in lnk_pipeline_break.R source_tables map; conditional UNION ALL in lnk_pipeline_classify_build_breaks so the new break source emits 'blocked' into fresh.streams_breaks when the config opts in.
- Inclusion is gated on cfg$pipeline$break_order containing 'subsurfaceflow' at every site (prepare, break, classify). Configs control the toggle, not code.
- inst/extdata/configs/bcfishpass/config.yaml opts in (parity with bcfishpass). inst/extdata/configs/default/config.yaml does not opt in (NewGraph methodology decision pending).
- ?lnk_pipeline_break gains a ## Break sources table covering every valid break_order entry: source table, role, classify-phase label. Both bundled config.yaml files carry an inline comment listing the available entries with one-line semantics so future readers see the toggle without leaving the config file.
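The config-driven gate described above amounts to a membership test on break_order. The following is a minimal sketch, not the package's code: the helper name uses_break_source() and the "falls" entry are hypothetical placeholders, and only cfg$pipeline$break_order and the 'subsurfaceflow' entry come from these notes.

```r
# Hypothetical helper: TRUE when a break source is listed in the config.
uses_break_source <- function(cfg, source) {
  source %in% cfg$pipeline$break_order
}

# Two illustrative configs; "falls" is a placeholder entry, not a documented one.
cfg_opted_in  <- list(pipeline = list(break_order = c("falls", "subsurfaceflow")))
cfg_opted_out <- list(pipeline = list(break_order = c("falls")))

uses_break_source(cfg_opted_in, "subsurfaceflow")   # TRUE
uses_break_source(cfg_opted_out, "subsurfaceflow")  # FALSE
```

Applying the same test at every site (prepare, break, classify) is what keeps the toggle in the config rather than in code.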
Parity claim retraction. Earlier framing (“all species within 5%”, “exact reproduction”) held only on a small set of pre-selected WSGs. The 10-WSG rollup surfaced systematic gaps. Vignette pulled, README and DESCRIPTION reframed as experimental.
- vignettes/habitat-bcfishpass.Rmd removed; bundled vignette data in inst/extdata/vignette-data/ removed.
- README.md rewritten as one-liner (“Experimental package — breaking all the time and loving the learning curve”) plus install + license.
- DESCRIPTION Title and Description reframed; bookdown, knitr, mapgl, rmarkdown dropped from Suggests; VignetteBuilder removed.
- data-raw/_targets.R extended to 10 WSGs (PARS, MORR, KISP, KOTL, NATR added).
- research/bcfishpass_comparison.md retraction at top with the diagnosis tables and the natural-vs-anthropogenic two-tier classification reference; historical content preserved below.
- CLAUDE.md Status block flags remaining gaps.
Remaining departures (per research/bcfishpass_comparison.md): 7 of 210 spawning/rearing/rearing_stream rows >5%, six of seven link < bcfishpass. Concentrated on MORR ST (cluster connectivity), MORR SK and KISP SK (new geographies for the existing fresh#147 SK lake-proximity logic). Tracked separately; not in this release.
link 0.18.1
Closes #78. Adds attribution for redistributed upstream data and refreshes the package Title + Description to reflect the package’s current scope.
- LICENSE-bcfishpass at root: verbatim copy of upstream smnorris/bcfishpass LICENSE governing the redistributed override CSVs
- NOTICE.md at root: source/license table, names redistributed files
- inst/extdata/configs/{bcfishpass,default}/overrides/README.md: pointer files reachable via system.file()
- README.md “Acknowledgements” section above License
- Authors@R: Simon Norris added as [ctb]
- Title: Habitat and Connectivity Interpretation for Stream Networks (was the v0.6-era Crossing Connectivity Interpretation)
- Description: refactored to mirror the README’s “fresh answers what the habitat is, link answers what the features mean for the network” framing; names the three habitat axes (intrinsic potential, accessibility under connectivity, per-feature rollups)
CITATION file and mirror to NewGraphEnvironment/crate (which also ships bcfishpass fixtures via crt_ingest examples) deferred — to be filed as their own work.
link 0.18.0
Closes #65. Decompose the config bundle into a manifest layer and a data-ingest layer, and route registered files through crate for source-agnostic canonicalization.
lnk_config() is now manifest-only. It reads config.yaml and returns paths, file declarations, pipeline knobs, and provenance — no parsed CSVs. Cheap to call. lnk_config_verify() and lnk_stamp() no longer pay for CSV parsing they don’t need.
New: lnk_load_overrides(cfg) materializes the data files declared in cfg$files and returns a named list of canonical-shape tibbles. Entries with source + canonical_schema declarations dispatch through crate::crt_ingest() (currently bcfp/user_habitat_classification); others fall through to local reads dispatched on path extension. New source families plug in by config edit alone — no link R code change.
New config.yaml schema. Top-level rules: and dimensions: paths replace files.rules_yaml / files.dimensions_csv (format follows from the path’s extension, not the key name). The previous files: and overrides: maps merge into one flat files: map keyed by filename stem (e.g. user_barriers_definite, pscis_modelledcrossings_streams_xref). Each entry carries path: and optionally source: and canonical_schema:. Configs may declare extends: to inherit from another config; child entries override same-key parent entries.
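The extends: inheritance rule can be illustrated with plain lists. This is a sketch under stated assumptions: resolve_files() is a hypothetical helper, the paths are made up, the source and canonical_schema values are guesses at the declaration shape, and it assumes a child entry replaces the parent entry whole rather than field by field.

```r
# Parent and child files: maps as they might look after YAML parsing.
# Paths and field values are illustrative only.
parent_files <- list(
  user_barriers_definite = list(path = "overrides/user_barriers_definite.csv"),
  user_habitat_classification = list(
    path = "overrides/user_habitat_classification.csv",
    source = "bcfp",
    canonical_schema = "user_habitat_classification"
  )
)
child_files <- list(
  user_barriers_definite = list(path = "local/user_barriers_definite.csv")
)

# Hypothetical resolver: child entries override same-key parent entries.
resolve_files <- function(parent, child) {
  merged <- parent
  merged[names(child)] <- child
  merged
}

files <- resolve_files(parent_files, child_files)
files$user_barriers_definite$path  # "local/user_barriers_definite.csv"
names(files)                       # both keys survive the merge
```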
Pipeline phase signatures gain loaded. Every lnk_pipeline_* phase that reads a data table now takes cfg and loaded together. Callers (the bundled targets pipeline, project scripts) call lnk_load_overrides(cfg) once and thread the result through phases. cfg$overrides$X and cfg$habitat_classification access points become loaded$X. See data-raw/_targets.R and data-raw/compare_bcfishpass_wsg.R for the pattern.
Verification. tar_make() on 5 WSGs × 2 configs reproduces the v0.17.0 baseline rollup bit-identically (sha256 a82de9928809b9751213e08916c476b4ee3f99286bc9ea2dc53f9659eeb92097). Refactor introduces no behaviour change.
Migration
| Old | New |
|---|---|
| cfg$rules_yaml | cfg$rules |
| cfg$dimensions_csv | cfg$dimensions |
| cfg$parameters_fresh (data frame) | loaded$parameters_fresh |
| cfg$habitat_classification | loaded$user_habitat_classification |
| cfg$observation_exclusions | loaded$observation_exclusions |
| cfg$wsg_species | loaded$wsg_species_presence |
| cfg$overrides$X | loaded$X (e.g. loaded$user_barriers_definite) |
Out of scope (follow-up issues):
- crate schemas for the other 9 bcfp-sourced files (one issue per file as canonical-shape decisions concretize). Today they fall through to plain CSV read.
- nge / local source families (when project-experimental configs need them).
- Type-aware variant matching in crate (planned crate v0.1.x roadmap).
link 0.17.0
Ship the Modelling spawning and rearing habitat using bcfishpass defaults vignette (vignettes/habitat-bcfishpass.Rmd) on top of the post-phase-3 codebase. Regenerated bundled artifacts (inst/extdata/vignette-data/{rollup, sub_ch, sub_ch_bcfp}.rds) reflect the corrected emit semantics and tighter parity.
bcfishpass-bundle parity (5 WSGs × 5 species, spawn + rear):
- 42 of 42 non-NA rows within ±5%
- 35 of 42 within ±2%
- median 1.1%; max 5.0%
Tighter than v0.13.1’s 100% within ±5% / median 1.5% claim because phase 1’s emit-semantics fix landed in main, and the regenerated rollup reflects it. Spawning rows that previously sat at +3-5% (BT/CH/CO/ST across multiple WSGs) are now at +0-2%.
The vignette text claim updated to match the new numbers. Cuts the v0.13.1 vignette’s residual-deltas paragraph that mentioned overlay-range-containment and stream-order-bypass — those were pre-phase-3 artifacts; with rule emission corrected, residual deltas are mostly segmentation-boundary rounding plus the documented stream-order bypass.
link 0.16.0
Phase 3 of #69 — proof artifact + emit-semantics fix.
Proof artifact: new research/rule_flexibility.md runs BABL × CO under three configs (use case 1, use case 2, bcfishpass) by swapping only dimensions.csv cells, with rules.yaml diffs side-by-side. Reproducible via data-raw/rule_flexibility_demo.R + data-raw/rule_flexibility_render.R. Demonstrates that every methodology dial is a CSV cell, no buried emission rules. The numbers prove the matrix:
- Use case 1 (default bundle): rearing 1388.90 km, lake_rearing 54507.85 ha, wetland_rearing 5786.74 ha. Counts polygon-mainlines as linear AND rolls up polygon area.
- Use case 2: rearing 1271.02 km, same area rollups. Excludes polygon-mainlines from linear via in_waterbody: false + area_only: true on L/W; areas still bucket via the polygon rules.
- bcfishpass bundle: rearing 1271.02 km, no area rollup (no L/W polygon rules at all). Functionally identical rear predicate to use case 2 because area_only: true makes the L/W rules contribute to bucket flags only.
Emit-semantics fix in lnk_rules_build() (under #69 phase 1 banner — corrects a bug introduced in 0.14.0):
Previous behaviour: rear_stream_in_waterbody: yes emitted in_waterbody: true on the stream rule. fresh interprets that as “match segments inside polygons ONLY,” the opposite of the column’s intent (“include polygon-mainlines too”). The default bundle’s permissive rear was effectively only matching in-polygon segments — broken since 0.14.0 but never visible because the bcfishpass bundle (which set no for all species) was the only side tested for parity.
Corrected emit:
- yes (or absent): omit the in_waterbody field. Rule matches segments inside AND outside polygons (today’s permissive default: polygon-mainlines count too).
- no: emit in_waterbody: false. Rule matches outside polygons only (strict partition).
The third grammar state (in_waterbody: true = inside polygons only) has no biological use case for stream rules and is no longer emitted by lnk_rules_build().
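The corrected emit can be sketched as a small mapping from the dimensions.csv cell to the rule fields. This is an illustration only: emit_stream_rule() is a hypothetical helper, not the body of lnk_rules_build(), and it assumes an absent cell arrives as NA or NULL.

```r
# Hypothetical helper mirroring the corrected emit for a stream rule.
# value is the dimensions.csv cell: "yes", "no", or NA/NULL when absent.
emit_stream_rule <- function(value, rule = list()) {
  if (!is.null(value) && !is.na(value) && identical(tolower(value), "no")) {
    rule$in_waterbody <- FALSE  # strict partition: outside polygons only
  }
  # "yes" or absent: leave the field out, so the rule matches inside and outside
  rule
}

emit_stream_rule("no")   # list with in_waterbody = FALSE
emit_stream_rule("yes")  # empty list, no in_waterbody field
emit_stream_rule(NA)     # empty list, no in_waterbody field
```

Note that no branch ever sets in_waterbody to TRUE, which matches the statement that the third grammar state is no longer emitted.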
bcfishpass bundle output unchanged: the bundle ships rear_stream_in_waterbody: no for all species, so the fixed emit produces byte-identical rules.yaml to 0.15.0. Default bundle output changes (now actually permissive — pass-through stream rule).
Tests updated (3 cases): yes (or absent) omits the field; no emits in_waterbody: false; default bundle smoke tests assert the rear stream rule has no in_waterbody field.
link 0.15.0
Phase 2 of #69. Adds dimensions-driven area_only emission + polygon-rule mainlines edge filter. Default bundle now ships use case 1 (linear includes mainlines through L/W polygons; area rolls up via bucket flags) with the new edge filter restricting polygon-rule contributions to mainlines only (1000/1100). bcfishpass bundle output unchanged.
New per-species columns in dimensions.csv:
- rear_lake_area_only (yes/no): emit area_only: true on the L polygon rule. When yes, fresh derives the lake_rearing bucket flag from the rule but excludes it from the main rear predicate (linear). When no or absent, the rule contributes to both (today’s behaviour). Both bundles ship no for all species: default ships use case 1; bcfishpass ships parity-with-bcfp.
- rear_wetland_area_only (yes/no): same shape on the W polygon rule. Both bundles ship no for all species.
Polygon-rule edge filter (edge_types_explicit: [1000, 1100] on L/W rules in the additive rear branch):
- Restricts the L/W polygon rule’s match to mainlines (single-line main flow + secondary flow) when emitted under rear_lake: yes or rear_wetland: yes + rear_wetland_polygon: yes. Without the filter, polygon rules matched every segment in the polygon (shorelines 1700, banks 1800, island edges, construction lines), all crediting linear rearing. The bucket pred (lake_rearing / wetland_rearing) is unaffected: area still rolls up the polygon’s full area as long as any tagged segment exists in it.
- The rear_lake_only branch (SK / KO) is intentionally not filtered: the L rule there IS the rear classification, so it must continue matching the whole lake polygon.
Default bundle methodology shift — use case 1: linear km includes mainlines through wetlands and lakes, with area rollups (lake_rearing_ha, wetland_rearing_ha) populating from the polygon footprint. rear_wetland_polygon flipped from no (v0.14.0) back to yes for rear_wetland=yes species. The 2026-04-27 cut to no was the right call given the v0.14.0 grammar (no edge filter; W rule would over-emit), but with the mainlines edge filter shipped here, polygon-mainlines are the right thing to count for linear AND area.
Required: fresh ≥ 0.24.0 (#182, fresh#184) — area_only predicate decouples bucket-flag derivation from the main rear predicate.
Tests — test-lnk_rules_build.R 130 PASS (was 124 in 0.14.0): 6 new tests covering area_only emission per the columns + polygon-edge-types filter present on L/W rules (additive branch only) + rear_lake_only branch left untouched. Full suite 554 PASS / 0 FAIL.
BABL parity (bcfishpass bundle): unchanged from 0.14.0 — 8 of 10 rows within ±2%, 10 of 10 within ±5%. The new knobs are inert when set to today’s defaults, so bcfp bundle output is byte-identical to v0.14.0.
Coordinates with #69 phase 3 — research/rule_flexibility.md proof artifact runs BABL × CO under three configs (use case 1, use case 2, bcfishpass) by swapping only dimensions.csv cells, with rules.yaml diffs side-by-side.
link 0.14.0
Dimensions-driven in_waterbody + bcfishpass-bundle methodology fixes that bring 5-species BABL parity to ±5% (8 of 10 rows within ±2%) on the bcfishpass bundle. The methodology dials are now visible in dimensions.csv cells per species — no buried emission rules.
New per-species columns (#69 phase 1):
- spawn_stream_in_waterbody (yes/no): emit in_waterbody: <bool> on the stream-spawn rule. no excludes polygon-mainlines from spawn classification (the partition that pairs with waterbody_type: R/L/W polygon rules); yes is permissive and matches polygon-mainlines too. Both bundles ship with no for all species (biology: spawning happens in stream channels).
- rear_stream_in_waterbody (yes/no): same shape on the stream-rear rule. bcfishpass bundle ships no (strict partition matches bcfishpass’s per-species access SQL); default bundle ships yes (NewGraph permissive: counts polygon-mainlines as rearing for species with rear_lake: yes etc., orthogonal to area rollups).
- rear_wetland_polygon (yes/no): gate emission of the waterbody_type: W polygon rule. When no, only the 1050/1150 wetland-flow carve-out emits; when yes (or absent), the W polygon rule emits too (sets the wetland_rearing flag for area rollups). Both bundles ship no for all species: segments inside an FWA wetland polygon are wider than the fish-bearing channel and shouldn’t count as rearing habitat.
Methodology fixes carried in from earlier branch work (previously held in vignette-ship):
- apply_habitat_overlay: false flag in pipeline: block of bcfishpass config.yaml. Comparison-scope choice, not a behavioural claim about bcfishpass. bcfishpass ships both layers: habitat_linear_<sp> (per-species rule output) and streams_habitat_linear (rule + known-habitat overlay blended). The bcfishpass bundle disables frs_habitat_overlay() so its output is rule-only and compares apples-to-apples against bcfishpass’s own rule layer (habitat_linear_<sp>). Comparing the rule slices in isolation keeps rule-emission drift from hiding behind known-habitat overlay drift; overlay parity is a separate question to revisit once rule parity is locked. Default bundle keeps overlay enabled (NewGraph methodology produces the blended output by default).
- lnk_barrier_overrides() habitat-confirmation SQL updated for bcfishpass’s authoritative CSV shape (post-2026-04-26: species_code + spawning + rearing integer columns instead of the dropped habitat_ind column).
- lnk_pipeline_prepare() empty-table fallback CREATE TABLE matches the new CSV shape.
Required: fresh ≥ 0.23.1 (#180, fresh#181, fresh#183) — adds the in_waterbody predicate to the rule grammar plus the validator hotfix.
Tests — test-lnk_rules_build.R 124 PASS (was 86): 6 new tests for in_waterbody emission across permutations + bundle-level smoke tests; 4 new tests for rear_wetland_polygon (yes/no/absent backward-compat). Full suite 516 PASS / 0 FAIL.
BABL parity (bcfishpass bundle): 8 of 10 spawning+rearing rows within ±2%; max 5.0%; max spawning drift 1.5% (was 4.8%). The remaining ±2-5% drift is a follow-up — phase 2 will add the area_only predicate (fresh#182) and edge_types_explicit: [1000, 1100] filter on polygon rules to support the use case 2 pattern (mainlines excluded from linear, area still rolls up).
Coordinates with #69 phase 2 — adds rear_lake_area_only / rear_wetland_area_only columns once fresh#182 lands. Phase 3 ships the proof artifact (research/rule_flexibility.md) running BABL × CO under three configs (use case 1, use case 2, bcfishpass) by swapping only dimensions.csv cells.
link 0.13.0
Shape fingerprint + halt auto-merge on shape drift (#64).
data-raw/sync_bcfishpass_csvs.R and the daily sync-bcfishpass-csvs.yml cron previously compared each bcfishpass-sourced CSV against a recorded sha256 byte checksum and auto-merged any drift. That worked for value drift (rows added/edited) but was blind to shape drift — bcfishpass’s 2026-04-26 long→wide reshape (with column type change) passed straight through and broke link’s pipeline downstream. This release adds a separate shape fingerprint alongside the byte checksum; the workflow auto-merges byte-only drift as before but halts shape drift for coordinated review.
- New shape_checksum field in the provenance: block of each bundle’s config.yaml. Computed as sha256 of the file’s first line (whitespace-normalized). Catches column rename / add / remove / reshape, the dominant failure mode. Type changes within stable columns are out of scope (rarer; can extend later if needed).
- data-raw/sync_bcfishpass_csvs.R computes shape fingerprint at sync time, classifies each file’s drift as byte or shape, writes the overall drift kind to /tmp/sync_drift_kind for the workflow to consume.
- .github/workflows/sync-bcfishpass-csvs.yml reads the drift kind. Byte-only drift → auto-PR + auto-merge as today. Shape drift → auto-PR opens with schema-drift label, NOT auto-merged, workflow exits non-zero (red on Actions tab) so the change is visible. Coordinated review across link / fresh / crate is required before merging.
- lnk_config_verify() extended with shape_drift column. Breaking (pre-1.0): old single drift column renamed to byte_drift; existing tibble shape now (file, byte_expected, byte_observed, byte_drift, shape_expected, shape_observed, shape_drift, missing).
- lnk_stamp() markdown rendering surfaces both byte and shape drift counts in the provenance summary.
- 15 new tests (468 total, was 453): .lnk_shape_fingerprint() helper + shape-drift detection + missing-file handling + backward-compat path for bundles without shape_checksum: field.
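The byte-versus-shape distinction can be sketched as follows. This is not the code in sync_bcfishpass_csvs.R: normalize_header() and classify_drift() are hypothetical names, the column names are made up, "whitespace-normalized" is assumed to mean trimming and collapsing whitespace runs, and the sketch compares text directly where the release compares sha256 digests, purely to stay dependency-free.

```r
# Normalize a CSV header line: trim, then collapse runs of whitespace.
# (Assumed meaning of "whitespace-normalized".)
normalize_header <- function(line) {
  gsub("\\s+", " ", trimws(line))
}

# Hypothetical classifier over two versions of a file, each a character
# vector of lines. Direct comparison stands in for the sha256 digests.
classify_drift <- function(old_lines, new_lines) {
  if (identical(old_lines, new_lines)) return("none")
  if (!identical(normalize_header(old_lines[1]), normalize_header(new_lines[1]))) {
    return("shape")
  }
  "byte"
}

old      <- c("blue_line_key,species_code,habitat_ind", "1,CO,1")
new_rows <- c(old, "2,BT,1")
reshaped <- c("blue_line_key,species_code,spawning,rearing", "1,CO,1,0")

classify_drift(old, old)       # "none"
classify_drift(old, new_rows)  # "byte"
classify_drift(old, reshaped)  # "shape"
```

Because only the first line feeds the shape comparison, added or edited rows stay in the auto-merge path while a column rename, add, or remove is held for review.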
Coordinates with crate’s adapter pattern (link#65, crate#2) — when shape drift fires, crate’s normalize handler is the right place to absorb the upstream change before link’s pipeline sees it.
link 0.12.0
Pick up fresh 0.22.0 overlay simplification — caller-side update for the canonical-shape contract.
- lnk_pipeline_classify() now calls frs_habitat_overlay() with species_col = "species_code" + habitat_types = c("spawning", "rearing") instead of format = "long" + long_value_col = "habitat_ind". Matches the shape bcfishpass’s user_habitat_classification.csv adopted on 2026-04-26 (row-per-(segment × species), per-habitat indicator columns). Three-line caller-side diff; no link API change.
- Suggests: fresh (>= 0.22.0). Coordinates with fresh#177.
- Pipeline runs again. The vignette stays in dev/ until link#64 (sync workflow shape fingerprint) and link#65 (lnk_load_overrides() via crate::crt_ingest()) land.
link 0.11.2
bcfishpass vignette pulled out of pkgdown until tighter.
- vignettes/reproducing-bcfishpass.Rmd → dev/habitat-bcfishpass.Rmd.draft. Same pattern as scoring-crossings: out of build path, preserved for resumption when content lands clean.
- Content updates applied before move: title now “Modelling spawning and rearing habitat using bcfishpass defaults”; new scope paragraph describing what bcfishpass covers beyond linear classification; entrypoint replaced with explicit lnk_pipeline_* calls (was tar_make()); map section clarifies linear classification covers spawning/rearing/lake_rearing/wetland_rearing per species.
- README.md: “Full pipeline (reproducing bcfishpass)” → “Full pipeline (linear habitat classification)”; broken pkgdown vignette link removed.
- Open follow-ups: rollup-query retarget to streams_habitat_linear for apples-to-apples post-overlay comparison; range-containment relaxation in fresh::frs_habitat_overlay.
link 0.11.1
Vignette cleanup.
- vignettes/scoring-crossings.Rmd moved to dev/scoring-crossings.Rmd.draft: out of build path until the scoring methodology lands.
- vignettes/reproducing-bcfishpass.Rmd updated for the v0.9.0 overlay: added overlay step to the pipeline DAG, new “Known-habitat overlay” subsection, clarified rollup vs. map comparison.
- data-raw/vignette_reproducing_bcfishpass.R: bcfishpass-side map query reads streams_habitat_linear (model + known) instead of habitat_linear_ch (model-only) for apples-to-apples comparison with link’s post-overlay output.
- Regenerated bundled snapshots (inst/extdata/vignette-data/{rollup,sub_ch,sub_ch_bcfp}.rds) from v0.10.0 + overlay state.
link 0.11.0
Config-bundle provenance + run stamps — closes the drift attribution loop. Pipeline outputs that shift between runs on the same DB state can now be traced back to which input changed. Closes #40; supersedes the narrower scope of #24.
- inst/extdata/configs/{bcfishpass,default}/config.yaml carry provenance: blocks with sha256 checksums for every tracked file. Externally sourced files (bcfishpass overrides) record source URL + upstream_sha (ea3c5d8, synced 2026-04-13) + path within source repo. Generated files (rules.yaml) record generated_from + generated_by + generator_sha. Hand-authored files record link’s git sha at edit time.
- lnk_config() exposes parsed provenance as cfg$provenance (named list, one entry per tracked file). print(cfg) shows the count of tracked files.
- New lnk_config_verify(cfg, strict) recomputes sha256 for every provenanced file and returns a tibble (file, expected, observed, drift, missing). Default warns on drift; strict = TRUE errors. digest added to Suggests.
- New lnk_stamp(cfg, conn, aoi, db_snapshot) returns an lnk_stamp S3 list capturing the full set of inputs at run time: cfg provenance with current observed checksums, software versions and git SHAs (link, fresh, R), DB snapshot row counts (bcfishobs.observations, whse_basemapping.fwa_stream_networks_sp) when conn is provided, AOI + start_time. lnk_stamp_finish(stamp, result, end_time) finalizes; format(stamp, "markdown") renders for report appendix or run-log dump.
- data-raw/compare_bcfishpass_wsg.R now emits a stamp markdown at the head of every WSG run, captured into data-raw/logs/*.txt via the standard stderr redirect.
- Tests: 93 new (provenance parsing, drift detection (clean / mutated / missing / strict), bundled-config drift = 0 invariants, stamp shape + markdown rendering + finalization + db-snapshot opt-out).
link 0.10.0
Default config bundle now uses explicit FWA edge_type codes for spawn and rear-stream predicates, matching bcfishpass’s 20-year-validated convention.
- data-raw/build_rules.R: switched both default rule-builder calls (inst/extdata/parameters_habitat_rules.yaml and inst/extdata/configs/default/rules.yaml) from edge_types = "categories" to edge_types = "explicit". Predicates now emit edge_types_explicit: [1000, 1100, 2000, 2300] in place of edge_types: [stream, canal] (which expanded to 1000/1050/1100/1150 + 2000/2100/2300).
- Drops 1050/1150 (stream-thru-wetland) and 2100 (rare double-line canal) from spawn AND rear-stream rules. The dedicated wetland-rearing rule (edge_types_explicit: [1050, 1150] with thresholds: false) is unchanged: the wetland_rearing flag still captures stream-thru-wetland segments for species with rear_wetland = yes. Net rearing flag (= rear_stream OR wetland_rearing OR rear_lake) is preserved for those species; species with rear_wetland = no (GR, KO) lose 1050/1150 from both spawn AND rearing.
- ADMS preflight (M1, fresh 0.21.0): default-bundle spawning km drops 4-7% across all spawning species (BT 397→368, CH 296→279, CO 340→318, SK 98→94, RB 331→311). Rearing km essentially unchanged for rear_wetland = yes species. Full per-WSG numbers in research/default_vs_bcfishpass.md.
- Default and bcfishpass bundles now emit structurally aligned spawn predicates; this confirms bcfishpass’s edge-type convention is what link ships by default.
- tests/testthat/test-lnk_rules_build.R: regression tests added. Default rules.yaml has no 1050/1150/2100 in spawn or rear-stream predicates; the dedicated wetland-rear rule still carries [1050, 1150].
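The edge-type change above is a set difference, which can be checked directly. The code lists are taken from the notes above; the variable names are illustrative only.

```r
# Edge-type codes as listed in the release note.
categories_expanded <- c(1000, 1050, 1100, 1150, 2000, 2100, 2300)
explicit            <- c(1000, 1100, 2000, 2300)

# Codes that no longer appear in spawn and rear-stream predicates.
dropped <- setdiff(categories_expanded, explicit)
dropped                         # 1050 1150 2100

# The dedicated wetland-rearing rule keeps the stream-thru-wetland codes.
wetland_rule <- c(1050, 1150)
all(wetland_rule %in% dropped)  # TRUE
```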
link 0.9.0
lnk_pipeline_classify() now overlays known habitat from user_habitat_classification.csv onto fresh.streams_habitat after rule-based classification. Closes #55.
- After frs_habitat_classify() finishes, calls frs_habitat_overlay() (fresh ≥ 0.21.0) when the manifest declares habitat_classification. Loaded long-format table is overlaid via a 3-way bridge join through fresh.streams (range containment on [drm, urm]).
- Closes the gap surfaced in research doc §5/§7: bcfishpass’s published streams_habitat_linear.spawning_sk > 0 blends model + observation-curated knowns; link’s pipeline previously only emitted the model side.
- 5-WSG rerun (digest 0f00c713) shows BABL SK spawning under bcfishpass bundle rises from 57.6 → 85.2 km (+27.6 km from overlay). ADMS SK +5.14 km, BULK SK +0.8 km. Default bundle similar magnitudes.
- Requires fresh ≥ 0.21.0 (overlay rename + bridge support; see fresh#175).
link 0.8.0
Default NewGraph habitat-classification config bundle ships alongside the bcfishpass reproduction bundle (#51).
- New inst/extdata/configs/default/ bundle with intentional methodological departures from bcfishpass: intermittent streams included in rearing, wetland rearing added for resident species, lake rearing extended to species beyond SK/KO with per-species rear_lake_ha_min thresholds, river_skip_cw_min = yes. Loadable via link::lnk_config("default").
- Per-species rear_lake_ha_min via a new column in configs/default/dimensions.csv. lnk_rules_build() prefers that value over the shared fresh::parameters_habitat_thresholds default when present, keeping bcfishpass bundle at its 200 ha threshold for SK/KO while letting default express species-specific biology (CO 2 ha, BT/WCT/RB/CT/DV 10 ha, GR 40 ha, ST 60 ha, CH 100 ha, SK/KO 200 ha). Non-numeric entries in the dimensions CSV fall through to the fresh fallback rather than silently disabling it.
- Per-species rear_wetland_ha_min via a new column in configs/default/dimensions.csv. lnk_rules_build() now emits both edge_types: wetland (for rearing km) AND waterbody_type: W (drives wetland_rearing_ha rollup) rules when rear_wetland = yes. Thresholds: CO 0.5 ha (beaver complexes), BT/CH/CT/DV/RB/ST/WCT 1 ha.
- SK + KO spawn_connected block: added five columns to configs/default/dimensions.csv (rear_stream_order_bypass, spawn_connected_direction, spawn_connected_gradient_max, spawn_connected_cw_min, spawn_connected_edge_types) so lnk_rules_build() emits the spawn_connected: block with direction: downstream for lake-obligate species. spawn_lake = no for SK/KO to prevent lake-centerline inflation (Babine Lake alone is 177 km).
- data-raw/compare_bcfishpass_wsg() emits a compound rollup with 7 rows per species × WSG × config: spawning / rearing km, lake_rearing / wetland_rearing ha, plus three edge-type slice rows (rearing_stream, rearing_lake_centerline, rearing_wetland_centerline) for decomposing the rearing total. Reference side uses the same habitat_linear_<sp> + fwa_{lakes,wetlands}_poly methodology as link, so both sides are apples-to-apples.
- data-raw/_targets.R runs both bundles side-by-side across all 5 validation WSGs (ADMS, BULK, BABL, ELKR, DEAD): 10 comparison targets, unified rollup with a config identity column. Rollup digest e3eaf5f62df44d6713bfed32cd08fc5d (357 rows) on M1 with fresh 0.17.1.
- New research doc research/default_vs_bcfishpass.md: methodology comparison, per-WSG per-species results, 9 observations covering the debugging journey (SK spawning over-inflation root causes, bcfishpass known-habitat overlay via streams_habitat_known, gradient-floor calibration, segment-averaging risk).
- Three companion maps (data-raw/maps/sk_spawning_BABL*.R): mapgl overlays of SK spawning BABL comparing bundle-vs-bundle and default-vs-bcfishpass-published (model + known); per-layer toggle, popups with id_segment / segmented_stream_id / plain-language edge_type / gradient / length.
- Requires fresh >= 0.17.1 for waterbody_type: L/W rear-rule honouring + lake_ha_min / wetland_ha_min thresholds.
- tests/testthat/test-lnk_rules_build.R: new suite with 56 tests covering lake + wetland rule emission (per-config ha_min, fresh fallback, rear_lake=no / rear_wetland=no), spawn rules (stream+canal vs explicit codes, spawn_lake, spawn_requires_connected, spawn_connected block), rear precedence (no_fw, lake_only, all_edges), river polygon + river_skip_cw_min, species skipping, rear_stream_order_bypass, non-numeric ha_min fallthrough.
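The per-species rear_lake_ha_min precedence described in this release (prefer the dimensions.csv cell, fall through to the shared default on missing or non-numeric input) can be sketched like this. resolve_ha_min() is a hypothetical helper, not part of lnk_rules_build(); the 200 ha fallback and the CO value of 2 ha are the figures quoted above.

```r
# Hypothetical resolver: prefer the per-species dimensions.csv cell,
# fall back to the shared default when the cell is missing or non-numeric.
resolve_ha_min <- function(cell, fallback) {
  value <- suppressWarnings(as.numeric(cell))
  if (length(value) != 1 || is.na(value)) fallback else value
}

resolve_ha_min("2", fallback = 200)    # 2   (CO in the default bundle)
resolve_ha_min("", fallback = 200)     # 200 (empty cell)
resolve_ha_min("tbd", fallback = 200)  # 200 (non-numeric cell)
```

Falling through on a bad cell keeps a threshold in force, which is the behaviour the note contrasts with silently disabling it.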
link 0.7.0
user_barriers_definite no longer eligible for observation-based override (#48).
- .lnk_pipeline_prep_natural() previously unioned barriers_definite into natural_barriers, which lnk_barrier_overrides() iterates over. Net effect: the 227 reviewer-added user-definite positions (EXCLUSION zones, MISC detections the model misses) could be re-opened by observations clearing the species threshold. Confirmed active on ELKR pre-fix: 4 override rows at Erickson Creek exclusion and Spillway MISC positions that bcfishpass keeps as permanent barriers.
- bcfishpass’s model_access_*.sql builds the barriers CTE from gradient + falls + subsurfaceflow only and appends barriers_user_definite post-filter via UNION ALL. Observations and habitat filters never see user-definite rows, so they’re never overridable. link now matches this shape: natural_barriers is gradient + falls only; barriers_definite stays consumed separately as a break source in lnk_pipeline_break() and as a direct UNION ALL entry into fresh.streams_breaks via lnk_pipeline_classify().
- ELKR rollup shifts toward bcfishpass: BT spawning +3.4% → +2.8%, WCT spawning +4.0% → +2.6%, WCT rearing +1.6% → +0.3%. Other four WSGs unchanged (ADMS/BABL/DEAD have empty barriers_definite; BULK has 87 rows but no observation-threshold matches to any of them).
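The ordering that makes user-definite barriers non-overridable can be shown with plain vectors. This is a toy model of the set logic only, not the pipeline's SQL: the position ids and the lift_candidates vector are made up for illustration.

```r
# Illustrative barrier positions, keyed by made-up ids.
natural       <- c("gradient_1", "falls_1", "falls_2")
user_definite <- c("exclusion_1", "misc_1")

# Positions with enough upstream observations or confirmed habitat to lift.
lift_candidates <- c("falls_1", "exclusion_1")

# Overrides are evaluated against the natural set only ...
overridden <- intersect(natural, lift_candidates)
# ... and user-definite barriers are appended afterwards, so none can be lifted.
access_barriers <- c(setdiff(natural, overridden), user_definite)

access_barriers  # "gradient_1" "falls_2" "exclusion_1" "misc_1"
```

Before this release the equivalent of user_definite was unioned into natural ahead of the override step, so exclusion_1 would have been lifted along with falls_1.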
link 0.6.0
Honour user_barriers_definite_control.csv at the observation-override step.
- lnk_barrier_overrides() now excludes observations upstream of control-flagged positions from counting toward the override threshold, matching bcfishpass’s access SQL. Previously controlled positions (concrete dams, long impassable falls, diversions) could be re-opened by upstream historical observations (#44).
- Gated per-species by a new observation_control_apply column in parameters_fresh.csv: TRUE for CH/CM/CO/PK/SK/ST; FALSE for BT/WCT; NA for CT/DV/RB. Residents routinely inhabit reaches upstream of anadromous-blocking falls (post-glacial headwater connectivity, no ocean-return requirement), so their observations still override. Matches bcfishpass’s per-model application.
- Habitat-confirmation override path intentionally bypasses the control table: expert-confirmed habitat is higher-trust than observations, and bcfishpass’s hab_upstr CTE has no control join either.
- .lnk_pipeline_prep_overrides now passes the control table to lnk_barrier_overrides() when the config manifest declares barriers_definite_control. Manifest key is the contract; no DB probe.
- .lnk_pipeline_prep_load_aux now always creates a schema-valid (possibly empty) barriers_definite_control table when the manifest declares the key. This fixes an asymmetric gating bug that would have raised “relation does not exist” on AOIs with zero control rows.
- End-to-end validation WSG: DEAD (Deadman River) added to data-raw/_targets.R. It has a single barrier_ind = TRUE control row at FALLS (356361749, 45743) with six anadromous observations upstream and zero habitat coverage: the unique combination that actively exercises the filter. All four prior WSGs (ADMS/BULK/BABL/ELKR) were rescued by either the observation threshold or habitat path, making them parity checks rather than filter tests.
link 0.5.0
Documentation and narrative for the targets pipeline.
- New vignette, “Reproducing bcfishpass with link + fresh”: three-line entrypoint, rollup interpretation, BULK chinook habitat map (mapgl), reproducibility framing. Data-prep script at data-raw/vignette_reproducing_bcfishpass.R generates inst/extdata/vignette-data/{rollup,bulk_ch}.rds from a real run; vignette loads the .rds so pkgdown builds don’t need fwapg access. Follows the CLAUDE.md convention for vignettes that need external resources (#38)
- Research doc (research/bcfishpass_comparison.md) updated with bit-identical rollup numbers from 2026-04-22 and a new “Targets orchestration” section showing how _targets.R composes the per-WSG runs.
- mapgl, sf added to DESCRIPTION Suggests.
- Retired data-raw/compare_bcfishpass.R: data-raw/_targets.R + data-raw/compare_bcfishpass_wsg.R supersede it. Git history preserves the prior form.
link 0.4.0
Targets-driven comparison pipeline for all four validated watershed groups.
- Add data-raw/_targets.R: tar_map(wsg = c("ADMS", "BULK", "BABL", "ELKR")) over a per-AOI target function, synchronous execution, dplyr::bind_rows rollup. fresh.streams is a shared schema, so single-host parallelism would collide; the pipeline runs serially today. Distributed runs (M4 + M1) are a follow-up alongside a fresh upstream change for per-AOI output paths (#38)
- Add data-raw/compare_bcfishpass_wsg(wsg, config): per-AOI target function. Wraps the six lnk_pipeline_* phases, diffs the output against the bcfishpass.habitat_linear_* reference on the tunnel DB, returns a ~10-row tibble (wsg × species × habitat_type × link_km × bcfishpass_km × diff_pct). KB-scale, so safe to ship over SSH.
- Promote .lnk_pipeline_classify_species to an exported lnk_pipeline_species(cfg, aoi): canonical public API for “species this config classifies in this AOI.” Used by lnk_pipeline_classify and lnk_pipeline_connect internally and by the targets per-AOI function externally. Removes the duplicate private helper that was briefly inlined in data-raw/.
- End-to-end verification (data-raw/logs/20260422_11_tar_make_final.txt): 4 WSGs / 34 rows produced over 8.5 minutes wall clock (serial). Reproducibility: consecutive tar_make() invocations on the same DB state produce bit-identical rollup tibbles. Parity to bcfishpass (informational): all 34 diff_pct values within 5% of reference; research-doc drift (BT rearing: -0.7 → -1.1 pp) traces to env state between 2026-04-15 and today, not to pipeline non-determinism.
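The rollup and the two checks reported above (bit-identical reruns, all diff_pct values within 5%) can be sketched in base R without a database. Everything here is illustrative: run_wsg is a hypothetical stand-in for the per-AOI target function, its kilometre values are made up, and the diff_pct formula (signed percent difference relative to the bcfishpass reference) is an assumption, since the changelog does not define it. base rbind is used in place of dplyr::bind_rows to keep the sketch dependency-free.

```r
# Hypothetical stand-in for the per-AOI target function.
# The kilometre values are made up for illustration only.
run_wsg <- function(wsg) {
  data.frame(
    wsg = wsg,
    species = c("BT", "CO"),
    habitat_type = "spawning",
    link_km = c(101.0, 49.5),
    bcfishpass_km = c(100.0, 50.0),
    stringsAsFactors = FALSE
  )
}

# ASSUMPTION: diff_pct is the signed percent difference vs. the reference.
add_diff_pct <- function(df) {
  df$diff_pct <- round(
    100 * (df$link_km - df$bcfishpass_km) / df$bcfishpass_km, 1
  )
  df
}

# Run each watershed group serially and stack the per-WSG rows.
rollup <- function(wsgs) {
  parts <- lapply(wsgs, function(w) add_diff_pct(run_wsg(w)))
  do.call(rbind, parts)
}

wsgs <- c("ADMS", "BULK", "BABL", "ELKR")
run_1 <- rollup(wsgs)
run_2 <- rollup(wsgs)

# Reproducibility check: two runs on the same inputs are bit-identical.
reproducible <- identical(run_1, run_2)

# Parity check: every row is within 5% of the reference.
within_tolerance <- all(abs(run_1$diff_pct) <= 5)
```

identical() is a stricter test than all.equal(): it allows no numeric tolerance, which is what "bit-identical" implies.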
link 0.3.0
Pipeline phase helpers extract the bcfishpass comparison orchestration into composable building blocks. The 635-line data-raw/compare_bcfishpass.R is now 136 lines of sequenced helper calls.
- Add lnk_pipeline_setup(): create the per-run working schema (#38)
- Add lnk_pipeline_load(): load crossings and apply modelled-fix and PSCIS overrides
- Add lnk_pipeline_prepare(): load falls / definite / control / habitat CSVs, detect gradient barriers, compute per-species barrier skip list, reduce to minimal set via fresh::frs_barriers_minimal(), load base segments
- Add lnk_pipeline_break(): sequential frs_break_apply over observations / gradient / definite / habitat / crossings in config-defined order
- Add lnk_pipeline_classify(): assemble access-gating breaks table and run fresh::frs_habitat_classify()
- Add lnk_pipeline_connect(): per-species rearing-spawning clustering and connected-waterbody rules
- Canonical signature (conn, aoi, cfg, schema): aoi follows fresh convention (WSG code today; extends to ltree / sf polygons / mapsheets later), schema is the caller’s per-run namespace (working_<aoi> by convention) so parallel runs do not collide
- cfg$species parsed from the rules YAML at lnk_config() load; intersects with cfg$wsg_species presence to pick per-AOI classify targets
- Requires fresh 0.14.0 (for frs_barriers_minimal)
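The per-AOI species selection and schema naming described above can be sketched as follows. This is a hedged illustration only: the helper names sketch_species_for_aoi and sketch_schema_name are hypothetical, the long-table shape of cfg$wsg_species is an assumption, the presence rows are made up, and lower-casing the schema name is a guess at the working_<aoi> convention.

```r
# Hypothetical config fragment; real bundles are built by lnk_config().
cfg <- list(
  # Species the rules YAML classifies (illustrative values).
  species = c("BT", "CH", "CO"),
  # ASSUMPTION: presence is a long table with one row per wsg/species pair.
  # The rows are made up and do not describe real species presence.
  wsg_species = data.frame(
    wsg = c("BULK", "BULK", "BULK", "ELKR", "ELKR"),
    species = c("BT", "CH", "SK", "BT", "WCT"),
    stringsAsFactors = FALSE
  )
)

# Species to classify in an AOI: configured species that are also present.
sketch_species_for_aoi <- function(cfg, aoi) {
  present <- cfg$wsg_species$species[cfg$wsg_species$wsg == aoi]
  intersect(cfg$species, present)
}

# Per-run namespace, so runs for different AOIs do not share tables.
# ASSUMPTION: the AOI code is lower-cased in the schema name.
sketch_schema_name <- function(aoi) {
  paste0("working_", tolower(aoi))
}

sketch_species_for_aoi(cfg, "BULK")
sketch_species_for_aoi(cfg, "ELKR")
sketch_schema_name("BULK")
```

With this made-up presence table, SK is dropped for BULK because the config does not classify it, and CO is dropped because it is not present.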
link 0.2.0
Config bundles for pipeline variants.
- Add lnk_config(name_or_path): load a config bundle (rules YAML, dimensions CSV, parameters_fresh, overrides, pipeline knobs) as one list object. Bundles live at inst/extdata/configs/<name>/ with a config.yaml manifest, or any directory containing config.yaml for custom variants (#37)
- Relocate bcfishpass config files into inst/extdata/configs/bcfishpass/ (rules.yaml, dimensions.csv, parameters_fresh.csv, overrides/). All R scripts and data-raw/ references updated.
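The name-or-path lookup described above could work roughly like the sketch below. It is not lnk_config() itself: sketch_resolve_bundle and its bundles_root argument are hypothetical, and checking the path form before the bundled-name form is an assumed precedence. The demo builds a throwaway directory tree rather than reading the installed package.

```r
# Hypothetical resolver for a bundle directory.
#   name_or_path  bundle name, or a directory containing config.yaml
#   bundles_root  directory holding the named bundles
sketch_resolve_bundle <- function(name_or_path, bundles_root) {
  # A directory that already contains config.yaml is a custom variant.
  if (dir.exists(name_or_path) &&
      file.exists(file.path(name_or_path, "config.yaml"))) {
    return(normalizePath(name_or_path))
  }
  # Otherwise treat the argument as the name of a shipped bundle.
  candidate <- file.path(bundles_root, name_or_path)
  if (file.exists(file.path(candidate, "config.yaml"))) {
    return(normalizePath(candidate))
  }
  stop("no config.yaml found for bundle: ", name_or_path)
}

# Demo against a temporary tree that mimics configs/<name>/config.yaml.
root <- file.path(tempdir(), "configs")
dir.create(file.path(root, "bcfishpass"),
           recursive = TRUE, showWarnings = FALSE)
writeLines("name: bcfishpass", file.path(root, "bcfishpass", "config.yaml"))

by_name <- sketch_resolve_bundle("bcfishpass", root)
by_path <- sketch_resolve_bundle(file.path(root, "bcfishpass"), root)
```

Both calls resolve to the same directory, so callers can pass either form.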
