6 Step 4: Record Processing
6.1 Objective
Transform selected records into analysis-ready values and leave a clear QC trail showing how you got there.
6.3 Required outputs from Step 4
- processed_records.csv
- processing_log.md
- exception-register.csv
- qc-artifact-review.md
- species-specific tracking or decomposition files written by the prep repo
6.4 Processing actions (in order)
- Apply approved transformations only.
- Preserve raw values and adjusted values, or leave a clear link between them.
- Add method tags and adjustment flags.
- Write intermediate QC artifacts.
- Summarize exceptions that require reviewer sign-off.
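The first three actions can be sketched as a small function that refuses unapproved methods and keeps the raw value next to the adjusted one. This is a minimal illustration, not the prep repo's code: the names `ProcessedValue`, `APPROVED_METHODS`, `process_value`, and the `expand_x1_5` method are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ProcessedValue:
    record_id: str
    raw_value: Optional[float]       # never overwritten
    adjusted_value: Optional[float]  # result of the approved method
    method_tag: str                  # which method produced adjusted_value
    adjusted_flag: bool              # True when adjusted differs from raw


# Hypothetical allow-list; real entries would come from the approved method spec.
APPROVED_METHODS: Dict[str, Callable[[float], float]] = {
    "none": lambda value: value,
    "expand_x1_5": lambda value: value * 1.5,
}


def process_value(
    record_id: str, raw_value: Optional[float], method_tag: str
) -> ProcessedValue:
    """Apply one approved method and keep raw and adjusted values side by side."""
    if method_tag not in APPROVED_METHODS:
        raise ValueError(f"method {method_tag!r} is not on the approved list")
    adjusted = None if raw_value is None else APPROVED_METHODS[method_tag](raw_value)
    return ProcessedValue(
        record_id, raw_value, adjusted, method_tag, adjusted != raw_value
    )
```

Because the raw value travels with the adjusted value in the same record, a reviewer can trace any adjustment without rerunning the method.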
6.5 Treat intermediate artifacts as first-class outputs
Do not review only the final flat files.
Intermediate artifacts often contain the evidence that the method behaved as intended, for example:
- matching checks,
- decomposition tables,
- CU-specific prep tables,
- unmatched-site reports,
- historical-layer comparison tables,
- pop-versus-CU comparison checks.
If a reviewer cannot see those files, they cannot really review the run.
6.6 Minimum columns for exception-register.csv
- species
- output_layer
- object
- years_affected
- rule_type
- implemented_in
- rationale
- review_required
- reviewer_notes
Typical rule_type values include rename, suppress, merge, timing_override,
gap_fill, decomposition, and manual_value.
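A register with these columns can be checked mechanically before review. The sketch below uses only the Python standard library; the function name `check_exception_register` is hypothetical, and the empty-rationale check is an added assumption rather than a stated rule. Because the rule_type values above are described as typical, not exhaustive, an unfamiliar value is reported for review rather than treated as an error.

```python
import csv
import io

REQUIRED_COLUMNS = [
    "species", "output_layer", "object", "years_affected", "rule_type",
    "implemented_in", "rationale", "review_required", "reviewer_notes",
]

# Typical values named in this section; the list is not claimed to be complete.
TYPICAL_RULE_TYPES = {
    "rename", "suppress", "merge", "timing_override",
    "gap_fill", "decomposition", "manual_value",
}


def check_exception_register(csv_text: str) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if problems:
        return problems  # row checks are meaningless without the full header
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["rule_type"] not in TYPICAL_RULE_TYPES:
            problems.append(
                f"line {line_no}: unfamiliar rule_type {row['rule_type']!r}, review"
            )
        if not (row["rationale"] or "").strip():
            problems.append(f"line {line_no}: empty rationale")
    return problems
```

To check a file on disk, read it as text and pass the contents to `check_exception_register`.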
6.7 Keep output-layer semantics explicit
| Output layer | Typical fields | What they mean | Common trap |
|---|---|---|---|
| cu_timeseries | SpnForTrend_*, SpnForAbd_* | canonical CU series for status and benchmarks | assuming trend and abundance fields are always identical |
| pop_representation | pop IDs, pop names, spawner fields | representation or context layer | assuming pop sums must equal CU totals |
| historical_context | historical stream or aggregate rows | continuity/context only | treating context rows as authoritative CU estimates |
| status bundle | CU series + metric specs | downstream compute contract | mixing authoring notes into machine tables |
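Since pop sums are not required to equal CU totals, a pop-versus-CU comparison is more useful as a report of differences than as an equality assertion. The following is a rough sketch under assumed inputs: rows are plain `(cu_id, year, value)` tuples and `pop_vs_cu_differences` is a hypothetical helper, not a function from the production repos.

```python
from collections import defaultdict


def pop_vs_cu_differences(cu_rows, pop_rows):
    """Compare each CU value with the sum of its pop values for the same year.

    Both inputs are iterables of (cu_id, year, value); value may be None.
    Differences are reported, never asserted to be zero.
    """
    pop_sums = defaultdict(float)
    for cu_id, year, value in pop_rows:
        if value is not None:
            pop_sums[(cu_id, year)] += value

    report = []
    for cu_id, year, cu_value in cu_rows:
        pop_sum = pop_sums.get((cu_id, year))  # None when no pop rows exist
        if cu_value is None or pop_sum is None:
            difference = None  # cannot compare; leave for the reviewer
        else:
            difference = cu_value - pop_sum
        report.append(
            {"cu_id": cu_id, "year": year, "cu_value": cu_value,
             "pop_sum": pop_sum, "difference": difference}
        )
    return report
```

A nonzero `difference` is not a failure in itself; it is a row that needs either a documented explanation or a closer look.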
6.8 Species-pattern notes from current production repos
- Sockeye: multiple CU-specific gap-fill families exist. Treat them as declared methods with reviewable outputs, not as mysterious script magic.
- Coho: natural/hatchery decomposition and brood-year derivation are central processing steps. Review the decomposition tables, not just the final CU file.
- Chum: CU totals are assembled from major systems plus Harrison logic. Review the composition assumptions and document expected CU/pop non-equality.
- Pink: keep the official CU series distinct from the historical NuSEDS representation layer. The layers answer different questions.
6.9 Minimum QC expectations
- every material adjustment has a flag or method tag,
- raw-versus-adjusted traceability is preserved,
- intermediate QC artifacts are reviewed and logged,
- intentional NA patterns are explained,
- expected non-equality patterns are explained before release.
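Two of these expectations lend themselves to an automated pre-check: adjustments without a method tag, and NA values without an explanation. The sketch below assumes rows are dictionaries with hypothetical keys `raw_value`, `adjusted_value`, `method_tag`, and `na_reason`. It cannot judge whether an adjustment is material, so it conservatively treats every raw-versus-adjusted difference as needing a tag.

```python
def qc_failures(rows):
    """Return (row_index, message) pairs for rows that miss a QC expectation."""
    failures = []
    for index, row in enumerate(rows):
        raw = row.get("raw_value")
        adjusted = row.get("adjusted_value")
        tag = (row.get("method_tag") or "").strip()
        na_reason = (row.get("na_reason") or "").strip()

        if adjusted is None:
            # An NA in the output should be intentional and explained.
            if not na_reason:
                failures.append((index, "NA without explanation"))
        elif raw != adjusted and not tag:
            # Any change from the raw value needs a method tag.
            failures.append((index, "adjustment without method tag"))
    return failures
```

A check like this supplements reviewer judgment; it does not cover review of intermediate artifacts or explanation of non-equality patterns.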
6.10 Escalate when
- adjustments materially change interpretation but the rationale is weak,
- manual fixes are introduced without a reproducible rule expression,
- the only evidence for a method choice lives inside code comments, or
- you cannot explain a major output change using the intermediate artifacts.