
Copies selected IRIs from a semantic_suggestions tibble into the matching dictionary fields. Suggestions remain separate by default; this helper gives you an explicit merge step when you decide the top candidates are good enough.

Usage

apply_semantic_suggestions(
  dict,
  suggestions = attr(dict, "semantic_suggestions"),
  strategy = "top",
  columns = NULL,
  roles = NULL,
  min_score = NULL,
  overwrite = FALSE,
  verbose = TRUE
)

Arguments

dict

A dictionary tibble, typically returned by infer_dictionary() or suggest_semantics().

suggestions

A suggestions tibble. Defaults to attr(dict, "semantic_suggestions"), so it can be omitted when dict carries that attribute.

strategy

Selection strategy per column-role pair. Currently only "top" is supported, which uses the first suggestion in each matched group.

columns

Optional character vector limiting application to specific column_name values.

roles

Optional character vector limiting application to specific suggestion roles: "variable", "property", "entity", "unit", "constraint", "method".

min_score

Optional numeric score threshold. Requires a score column in suggestions; if min_score is supplied and that column is absent, the function errors.

overwrite

Logical; if FALSE (default), only missing fields are filled. Set TRUE to intentionally replace existing IRIs.

verbose

Logical; if TRUE (default), print a short summary.
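
The selection arguments above can be pictured with a small base R sketch. This is not the package's implementation: the helper names (filter_by_score, pick_top, fill_field), the iri column, the ">=" comparison for min_score, and treating empty strings as missing are all assumptions made for illustration.

```r
# Toy suggestions table. The `iri` column name is an assumption for illustration.
suggestions <- data.frame(
  column_name     = c("SPAWNER_COUNT", "SPAWNER_COUNT", "SPAWNER_COUNT", "RUN_YEAR"),
  dictionary_role = c("unit", "unit", "property", "variable"),
  iri             = c("http://example.org/unit/a", "http://example.org/unit/b",
                      "http://example.org/property/a", "http://example.org/variable/a"),
  score           = c(3, 1, 2, 1),
  stringsAsFactors = FALSE
)

# min_score: filter on `score`, erroring when the column is absent.
# (Assumes suggestions at or above the threshold are kept.)
filter_by_score <- function(suggestions, min_score = NULL) {
  if (is.null(min_score)) return(suggestions)
  if (!"score" %in% names(suggestions)) {
    stop("`min_score` requires a `score` column in `suggestions`.", call. = FALSE)
  }
  keep <- !is.na(suggestions$score) & suggestions$score >= min_score
  suggestions[keep, , drop = FALSE]
}

# strategy = "top": keep the first row of each column-role group, in input order.
pick_top <- function(suggestions) {
  groups <- suggestions[c("column_name", "dictionary_role")]
  suggestions[!duplicated(groups), , drop = FALSE]
}

# overwrite: fill only missing values unless overwrite = TRUE.
# (Treating "" as missing is an assumption.)
fill_field <- function(existing, candidate, overwrite = FALSE) {
  is_missing <- is.na(existing) | existing == ""
  replace_at <- !is.na(candidate) & (overwrite | is_missing)
  existing[replace_at] <- candidate[replace_at]
  existing
}

top <- pick_top(filter_by_score(suggestions, min_score = 2))
top$iri

fill_field(c("http://example.org/kept", NA),
           c("http://example.org/new", "http://example.org/new"))
```

In this toy data the second unit suggestion never reaches the dictionary: it falls below the threshold, and it would also lose to the first row of its group. The existing IRI in the last call is kept because overwrite is FALSE.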

Value

The dictionary tibble with selected semantic IRI fields filled in.

Details

Matching is done by both column_name and dictionary_role. When the suggestions tibble also includes dataset_id and table_id, those keys are honored too. Suggestions that target non-column destinations (for example codes.csv, tables.csv, or dataset.csv) are ignored by this helper and remain review-only.
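
As a rough illustration of why the optional dataset_id and table_id keys matter, the base R sketch below joins a toy dictionary to a toy suggestions table with and without them. The match_keys helper and the iri column are hypothetical, and the sketch only covers key selection; how dictionary_role maps onto a destination field in the dictionary is not shown.

```r
# Toy inputs; only the key columns come from the text above, `iri` is an assumption.
dict <- data.frame(
  dataset_id  = "example",
  table_id    = c("main", "main", "other"),
  column_name = c("SPAWNER_COUNT", "RUN_YEAR", "SPAWNER_COUNT"),
  stringsAsFactors = FALSE
)
suggestions <- data.frame(
  dataset_id      = "example",
  table_id        = "main",
  column_name     = "SPAWNER_COUNT",
  dictionary_role = "unit",
  iri             = "http://example.org/unit/a",
  stringsAsFactors = FALSE
)

# Hypothetical helper: always join on column_name, and add dataset_id/table_id
# only when both tables carry them.
match_keys <- function(dict, suggestions) {
  shared <- intersect(names(dict), names(suggestions))
  c(intersect(c("dataset_id", "table_id"), shared), "column_name")
}

# With the extra keys, the suggestion reaches only the "main" table.
scoped <- merge(dict, suggestions, by = match_keys(dict, suggestions))

# Without them, the same column name in another table matches as well.
unscoped_suggestions <- suggestions[c("column_name", "dictionary_role", "iri")]
unscoped <- merge(dict, unscoped_suggestions,
                  by = match_keys(dict, unscoped_suggestions))

nrow(scoped)
nrow(unscoped)
```

When a dictionary spans several tables that reuse column names, keeping dataset_id and table_id in the suggestions tibble avoids applying one table's IRI to another table's column.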

Examples

if (FALSE) { # \dontrun{
dict <- infer_dictionary(my_data, dataset_id = "example", table_id = "main")
dict <- suggest_semantics(my_data, dict)

# Fill only the missing semantic fields for one measurement column
dict <- apply_semantic_suggestions(dict, columns = "SPAWNER_COUNT")

# Require stronger lexical matches when score is available
dict <- apply_semantic_suggestions(dict, min_score = 2)
} # }