
Primary one-shot wrapper: infers the dictionary, table metadata, codes, and dataset metadata from raw data tables and immediately writes a review-ready Salmon Data Package.

Usage

create_sdp(
  resources,
  path = NULL,
  dataset_id = "dataset-1",
  table_id = "table_1",
  guess_types = TRUE,
  seed_semantics = TRUE,
  semantic_sources = c("smn", "gcdfo", "ols", "nvs"),
  semantic_max_per_role = 1,
  seed_verbose = TRUE,
  seed_codes = NULL,
  seed_table_meta = NULL,
  seed_dataset_meta = NULL,
  semantic_code_scope = c("factor", "all", "none"),
  check_updates = interactive(),
  format = "csv",
  overwrite = FALSE,
  include_edh_xml = FALSE,
  edh_profile = c("dfo_edh_hnap", "iso19139"),
  edh_xml_path = NULL
)

Arguments

resources

Either a named list of data frames (one per resource table) or a single data frame (converted internally to a one-table list).
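The two accepted shapes for resources can be sketched in base R as follows. This is only an illustration of the documented behaviour: normalize_resources() is a hypothetical helper, not a function exported by the package, and the real internal conversion may differ.

```r
# Hypothetical helper: mimics the documented "single data frame becomes a
# one-table list" behaviour. Not part of the package API.
normalize_resources <- function(resources, table_id = "table_1") {
  if (is.data.frame(resources)) {
    # A single data frame becomes a one-element list named by table_id.
    resources <- stats::setNames(list(resources), table_id)
  }
  stopifnot(
    is.list(resources),
    !is.null(names(resources)),
    all(vapply(resources, is.data.frame, logical(1)))
  )
  resources
}

# Shape 1: a single data frame, named through table_id.
single <- normalize_resources(
  data.frame(year = 2020:2021, count = c(10, 12)),
  table_id = "escapement"
)

# Shape 2: a named list with one data frame per resource table.
multi <- normalize_resources(list(
  escapement = data.frame(year = 2020L, count = 10),
  sites = data.frame(site_id = "A1")
))

names(single)  # "escapement"
names(multi)   # "escapement" "sites"
```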

path

Character; directory path where the package will be written. If omitted, defaults to file.path(getwd(), paste0(<dataset_id>, "-sdp")), using a filesystem-safe slug of the dataset id.
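As a rough sketch of how such a default path could be derived: the slug rule below (lower-case, runs of non-alphanumeric characters collapsed to a hyphen) is an assumption made for illustration, and slugify_id() and default_sdp_path() are hypothetical names. The package's actual slug logic is not documented here.

```r
# Hypothetical slug rule: lower-case the id, collapse anything that is not
# a-z or 0-9 into a single "-", then trim leading/trailing hyphens.
slugify_id <- function(x) {
  slug <- gsub("[^a-z0-9]+", "-", tolower(x))
  gsub("^-+|-+$", "", slug)
}

# Hypothetical helper mirroring the documented default:
# <working directory>/<dataset id slug>-sdp
default_sdp_path <- function(dataset_id) {
  file.path(getwd(), paste0(slugify_id(dataset_id), "-sdp"))
}

basename(default_sdp_path("Fraser Coho 2024"))  # "fraser-coho-2024-sdp"
```

Passing path explicitly avoids any dependence on the current working directory.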

dataset_id

Dataset identifier applied to all inferred metadata rows.

table_id

Fallback table identifier when resources is a single data frame.

guess_types

Logical; if TRUE (default), infer value_type for each dictionary column.

seed_semantics

Logical; if TRUE (default), seed semantic suggestions during inference.

semantic_sources

Vector of vocabulary sources passed to suggest_semantics().

semantic_max_per_role

Maximum number of suggestions retained per I-ADOPT role.

seed_verbose

Logical; if TRUE, emit progress messages while seeding semantic suggestions.

seed_codes

Optional codes.csv-style seed metadata.

seed_table_meta

Optional tables.csv-style seed metadata.

seed_dataset_meta

Optional dataset.csv-style seed metadata.

semantic_code_scope

Character string controlling which codes.csv rows are sent through suggest_semantics() during one-shot seeding. "factor" (default) only analyzes codes sourced from factor/categorical columns in the original data frame(s); "all" analyzes all inferred or supplied code rows; "none" skips code-level semantic suggestions.
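The effect of the three scopes can be pictured with a small base R sketch. It assumes that "factor/categorical" means columns for which is.factor() is TRUE and that code rows carry a column_name field; both are assumptions for illustration, and filter_code_rows() is a hypothetical helper rather than package code.

```r
# Hypothetical helper: selects which code rows would be analysed per scope.
filter_code_rows <- function(codes, df, scope = c("factor", "all", "none")) {
  scope <- match.arg(scope)  # the first choice, "factor", is the default
  # Assumption: only true factor columns count as categorical sources.
  factor_cols <- names(df)[vapply(df, is.factor, logical(1))]
  switch(scope,
    factor = codes[codes$column_name %in% factor_cols, , drop = FALSE],
    all = codes,
    none = codes[0, , drop = FALSE]
  )
}

df <- data.frame(
  species = factor(c("coho", "chinook")),
  run = c("early", "late"),
  stringsAsFactors = FALSE
)
# Illustrative codes table; the column names are assumptions.
codes <- data.frame(
  column_name = c("species", "species", "run"),
  code_value = c("coho", "chinook", "early"),
  stringsAsFactors = FALSE
)

nrow(filter_code_rows(codes, df))          # 2 (species codes only)
nrow(filter_code_rows(codes, df, "all"))   # 3
nrow(filter_code_rows(codes, df, "none"))  # 0
```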

check_updates

Logical; if TRUE, run a short, non-fatal check_for_updates() call after writing the package and mention newer releases only when one is available. Defaults to interactive().

format

Character; resource format. "csv" (the default) is the only format currently supported.

overwrite

Logical; if FALSE (default), an error is raised if path already exists.

include_edh_xml

Logical; when TRUE, writes an EDH XML metadata file into metadata/ using edh_build_iso19139_xml().

edh_profile

One of "dfo_edh_hnap" (default) or "iso19139". Determines whether the richer HNAP-aware profile or compact fallback profile is written when include_edh_xml = TRUE.

edh_xml_path

Optional file path for the EDH output when include_edh_xml = TRUE. If NULL, defaults to metadata/metadata-edh-hnap.xml for edh_profile = "dfo_edh_hnap" and metadata/metadata-iso19139.xml for edh_profile = "iso19139".
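The default resolution described above amounts to a small lookup keyed on the profile. A minimal sketch follows, where resolve_edh_xml_path() is a hypothetical helper and the returned paths are taken to be relative to the package root (an assumption):

```r
# Hypothetical helper: reproduces the documented defaults for edh_xml_path.
resolve_edh_xml_path <- function(edh_profile = c("dfo_edh_hnap", "iso19139"),
                                 edh_xml_path = NULL) {
  edh_profile <- match.arg(edh_profile)
  # An explicit path always wins over the profile-based default.
  if (!is.null(edh_xml_path)) {
    return(edh_xml_path)
  }
  switch(edh_profile,
    dfo_edh_hnap = "metadata/metadata-edh-hnap.xml",
    iso19139 = "metadata/metadata-iso19139.xml"
  )
}

resolve_edh_xml_path()            # "metadata/metadata-edh-hnap.xml"
resolve_edh_xml_path("iso19139")  # "metadata/metadata-iso19139.xml"
```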

Value

Invisibly returns the package path.

Details

This one-shot helper creates a review-ready package by default: semantic suggestions are seeded, and the top-ranked column-level suggestions are auto-applied only to dictionary IRI fields that are still missing. Table-level suggestions remain available when table metadata is present. To keep review noise low, code-level suggestions default to factor/categorical source columns only; set semantic_code_scope = "all" to broaden that or "none" to disable it.

The package root contains README-review.txt, semantic_suggestions.csv (when available), datapackage.json, metadata/, and data/. To keep review files usable, semantic_suggestions.csv omits code-level suggestions that lack enough human-readable context to review safely. Required-field review placeholders are also inserted into the inferred metadata files.

In interactive use, create_sdp() can also mention an available package update; set check_updates = FALSE to skip that network check.
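One way to sanity-check a written package against the root layout listed above is a small base R helper. check_sdp_layout() is a hypothetical name, and the demo below builds a mock directory rather than calling create_sdp(), so it only illustrates the expected top-level entries.

```r
# Hypothetical helper: reports which expected top-level entries exist.
# semantic_suggestions.csv is left out because it is only written when
# suggestions are available.
check_sdp_layout <- function(pkg_path) {
  required <- c("README-review.txt", "datapackage.json", "metadata", "data")
  entries <- list.files(pkg_path)  # non-recursive listings include directories
  stats::setNames(required %in% entries, required)
}

# Mock package directory, standing in for the path returned by create_sdp().
demo <- file.path(tempdir(), "demo-sdp")
dir.create(file.path(demo, "metadata"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo, "data"), showWarnings = FALSE)
file.create(file.path(demo, c("README-review.txt", "datapackage.json")))

layout_ok <- check_sdp_layout(demo)
all(layout_ok)  # TRUE
```

Because create_sdp() returns the package path invisibly, its result could be passed straight to a check like this one.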

Examples

if (FALSE) { # \dontrun{
data_path <- system.file("extdata", "nuseds-fraser-coho-sample.csv", package = "metasalmon")
fraser_coho <- readr::read_csv(data_path, show_col_types = FALSE)

pkg <- create_sdp(
  fraser_coho,
  dataset_id = "fraser-coho-2024",
  table_id = "escapement",
  overwrite = TRUE
)
} # }