
5-Minute Quickstart
Installation
install.packages("remotes")
remotes::install_github("dfo-pacific-science/metasalmon")
One-shot Workflow
Load the built-in NuSEDS Fraser Coho sample and create a review-ready Salmon Data Package in one call.
library(metasalmon)
sample_path <- system.file("extdata", "nuseds-fraser-coho-sample.csv", package = "metasalmon")
fraser_coho <- readr::read_csv(sample_path, show_col_types = FALSE)
pkg_path <- create_sdp(
fraser_coho,
dataset_id = "fraser-coho-2024",
table_id = "escapement",
overwrite = TRUE
)
pkg_path
list.files(pkg_path, recursive = TRUE)
If path is omitted, create_sdp() writes to
your working directory using a default folder name like
fraser-coho-2024-sdp. In interactive use it can also
mention when a newer metasalmon release is available; set
check_updates = FALSE to skip that check.
Review In Excel
Open the output folder and review these files:
- README-review.txt
- semantic_suggestions.csv (when suggestions were found)
- metadata/dataset.csv
- metadata/tables.csv
- metadata/column_dictionary.csv
- metadata/codes.csv (when present)
- data/*.csv resource files
create_sdp() seeds semantic suggestions by default and
auto-fills the top-ranked column-level suggestions into
missing dictionary fields (term_iri,
property_iri, entity_iri,
unit_iri, etc.). It does not overwrite existing non-empty
IRI values. Table-level suggestions are still available when table
metadata needs them, while code-level suggestions default to
factor/categorical source columns only. Use
semantic_code_scope = "all" if you want broader code-level
seeding.
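The fill-only-when-blank rule described above can be pictured with a small base R sketch. This is not metasalmon's implementation: the helper fill_missing_iris(), the column_name column, and the example.org IRIs are hypothetical, and only the term_iri field name comes from the text.

```r
# Hypothetical sketch of "fill blanks, never overwrite" (not metasalmon internals).
fill_missing_iris <- function(dictionary, suggestions, field = "term_iri") {
  # Treat NA and empty strings as missing.
  is_blank <- is.na(dictionary[[field]]) | trimws(dictionary[[field]]) == ""
  # Look up the suggestion for each dictionary row by column name (assumed key).
  idx <- match(dictionary$column_name, suggestions$column_name)
  candidate <- suggestions[[field]][idx]
  # Fill only rows that are blank and actually have a suggestion.
  fill <- is_blank & !is.na(candidate)
  dictionary[[field]][fill] <- candidate[fill]
  dictionary
}

# Made-up dictionary: one reviewed IRI, one empty string, one NA.
dictionary <- data.frame(
  column_name = c("year", "escapement", "species"),
  term_iri = c("https://example.org/term/year", "", NA),
  stringsAsFactors = FALSE
)
# Made-up top-ranked suggestions; "species" has none.
suggestions <- data.frame(
  column_name = c("year", "escapement"),
  term_iri = c("https://example.org/term/other-year",
               "https://example.org/term/escapement"),
  stringsAsFactors = FALSE
)

filled <- fill_missing_iris(dictionary, suggestions)
filled
```

In this sketch the existing value for year is kept, the blank escapement entry is filled, and species stays missing because no suggestion exists for it.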
The inferred metadata includes REVIEW REQUIRED:
placeholders for required fields so the package is immediately
reviewable in Excel. Replace those placeholders before publishing. The
metadata/*.csv files are the canonical package metadata;
datapackage.json is a derived export for
interoperability.
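Before publishing, it can help to list which metadata cells still hold a placeholder. The following is a minimal base R sketch, assuming the placeholders contain the literal text REVIEW REQUIRED and that the metadata CSV files sit in a metadata/ subfolder as listed above. The helper find_review_placeholders() and the demo columns dataset_id and title are hypothetical, not part of metasalmon.

```r
# Hypothetical helper: list metadata cells that still contain a review placeholder.
find_review_placeholders <- function(pkg_path, marker = "REVIEW REQUIRED") {
  files <- list.files(file.path(pkg_path, "metadata"), pattern = "\\.csv$",
                      full.names = TRUE)
  hits <- lapply(files, function(f) {
    # Read everything as text so placeholders are never coerced.
    tbl <- utils::read.csv(f, colClasses = "character", check.names = FALSE)
    found <- lapply(names(tbl), function(col) {
      rows <- which(grepl(marker, tbl[[col]], fixed = TRUE))
      if (length(rows) == 0) return(NULL)
      data.frame(file = basename(f), column = col, row = rows,
                 stringsAsFactors = FALSE)
    })
    do.call(rbind, found)
  })
  result <- do.call(rbind, hits)
  if (is.null(result)) {
    # No placeholders left: return an empty table with the same columns.
    result <- data.frame(file = character(), column = character(),
                         row = integer(), stringsAsFactors = FALSE)
  }
  result
}

# Self-contained demo using a throwaway folder instead of a real package.
demo_path <- file.path(tempdir(), "demo-sdp")
dir.create(file.path(demo_path, "metadata"), recursive = TRUE,
           showWarnings = FALSE)
write.csv(
  data.frame(dataset_id = "fraser-coho-2024",
             title = "REVIEW REQUIRED: add a title"),
  file.path(demo_path, "metadata", "dataset.csv"),
  row.names = FALSE
)

remaining <- find_review_placeholders(demo_path)
remaining
```

With a real package, pass pkg_path instead of demo_path. An empty result suggests no placeholder text remains, though it does not replace running the package's own validation.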
How To Decide If term_iri Is Correct
Use plain-language checks for each measurement column:
- Does the suggested label describe exactly what the column measures?
- Does the definition match your intent (not just a similar word)?
- Is the scope right (for example species-level vs population-level)?
- Is the unit consistent with your values and
unit_iri?
Keep the IRI only when all checks pass.
Replace it when the term is close but not exact.
Remove it (leave blank) when no candidate is reliable yet.
When the top auto-applied suggestion is wrong, use
semantic_suggestions.csv to pick a better alternative and
copy that IRI into metadata/column_dictionary.csv.
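If you prefer to make the keep, replace, or remove decision in R rather than Excel, the edit amounts to setting one cell in the dictionary. The sketch below uses base R only. The helper set_term_iri(), the column_name column, and the example.org IRIs are hypothetical, and the real column_dictionary.csv layout may differ.

```r
# Hypothetical helper: set or clear term_iri for one dictionary row.
set_term_iri <- function(dictionary, column, iri = "") {
  row <- which(dictionary$column_name == column)
  if (length(row) != 1) {
    stop("Expected exactly one dictionary row for column: ", column)
  }
  dictionary$term_iri[row] <- iri
  dictionary
}

# Made-up dictionary with two auto-applied suggestions under review.
dictionary <- data.frame(
  column_name = c("escapement", "run_type"),
  term_iri = c("https://example.org/term/abundance",
               "https://example.org/term/run-timing"),
  stringsAsFactors = FALSE
)

# Replace: a closer candidate was picked from the suggestions during review.
dictionary <- set_term_iri(dictionary, "escapement",
                           "https://example.org/term/spawner-escapement")
# Remove: no candidate is reliable yet, so leave the field blank.
dictionary <- set_term_iri(dictionary, "run_type")
dictionary
```

After such an edit, the dictionary would still need to be written back to metadata/column_dictionary.csv, for example with write.csv(..., row.names = FALSE, na = ""), so that the CSV stays the canonical copy.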
Finalize
After editing in Excel, save the metadata back to CSV. When you hand the package to someone else, share the whole folder (or a zip of the whole folder). Before publishing, run validation again:
pkg <- read_salmon_datapackage(pkg_path)
validate_dictionary(pkg$dictionary)
validate_semantics(pkg$dictionary)
For a staged, fully explicit workflow (manual artifact inference and controlled semantic merges), see the publication guide: