
Detect missing semantic terms that are not covered by SMN
Given semantic suggestions (typically attached to a dictionary as the
semantic_suggestions attribute), this function summarizes candidate fields that appear to
need ontology support but do not have a direct SMN match.
Usage
detect_semantic_term_gaps(
  dict = NULL,
  suggestions = NULL,
  include_target_scopes = c("column", "code", "table", "dataset"),
  include_dictionary_roles = NULL,
  min_score = NA_real_
)
Arguments
- dict
A dictionary tibble. Used only when suggestions is NULL.
- suggestions
Optional semantic suggestion table. If omitted, this function uses attr(dict, "semantic_suggestions").
- include_target_scopes
Target scopes to inspect. Defaults to all supported scopes.
- include_dictionary_roles
Optional vector of dictionary roles to restrict the gap scan (for example, c("variable", "property", "entity")).
- min_score
Optional minimum score filter. When a score is available, rows with a score below this value are ignored.
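The three filter arguments can be pictured as successive row filters on the suggestion table. The following base R sketch is only an illustration of that reading, not the package's implementation: filter_suggestions() and the toy table are hypothetical, and keeping rows with a missing score is an assumption drawn from the phrase "when score is available".

```r
# Hypothetical helper that mimics the documented filter arguments.
# It is NOT part of the package; it only illustrates the described behavior.
filter_suggestions <- function(suggestions,
                               include_target_scopes = c("column", "code", "table", "dataset"),
                               include_dictionary_roles = NULL,
                               min_score = NA_real_) {
  # Keep only rows whose target scope is requested.
  keep <- suggestions$target_scope %in% include_target_scopes

  # Restrict by dictionary role only when roles are supplied.
  if (!is.null(include_dictionary_roles)) {
    keep <- keep & suggestions$dictionary_role %in% include_dictionary_roles
  }

  # Apply the score threshold only when one is given and a score column exists.
  if (!is.na(min_score) && "score" %in% names(suggestions)) {
    # Assumption: rows with a missing score are kept.
    low <- !is.na(suggestions$score) & suggestions$score < min_score
    keep <- keep & !low
  }

  suggestions[keep, , drop = FALSE]
}

# Toy suggestion table (made-up values for illustration).
toy <- data.frame(
  column_name = c("run_id", "site", "depth", "notes"),
  target_scope = c("column", "column", "table", "column"),
  dictionary_role = c("variable", "entity", "variable", "variable"),
  score = c(0.9, 0.8, 0.7, NA),
  stringsAsFactors = FALSE
)

filtered <- filter_suggestions(
  toy,
  include_target_scopes = "column",
  include_dictionary_roles = "variable",
  min_score = 0.85
)
filtered$column_name
```

In this toy run, "site" is dropped by the role filter and "depth" by the scope filter, while "notes" survives because its score is missing.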
Value
A tibble with one row per target that has no SMN match. Key columns:
- dataset_id, table_id, column_name, target_scope, target_sdp_file, target_sdp_field, target_row_key, dictionary_role;
- search_query, the text used for lookup;
- top_non_smn_source, top_non_smn_label, top_non_smn_iri, top_non_smn_score;
- non_smn_sources, candidate_count, placement_recommendation, placement_confidence, placement_rationale.
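To make the "one row per target that has no SMN match" shape concrete, here is a minimal base R sketch of how such a summary could be derived. It is an illustration under stated assumptions, not the package's logic: summarise_gaps() is a hypothetical helper, SMN candidates are assumed to be flagged by source == "smn", targets are assumed to be identified by target_row_key alone, and scores are assumed to be non-missing. Only a few of the documented columns are reproduced.

```r
# Hypothetical sketch: summarise targets that have no SMN candidate.
# Assumes SMN rows are marked by source == "smn" and scores are not NA.
summarise_gaps <- function(suggestions) {
  groups <- split(suggestions, suggestions$target_row_key)

  rows <- lapply(groups, function(g) {
    # A target with at least one SMN candidate is not a gap.
    if (any(tolower(g$source) == "smn")) {
      return(NULL)
    }
    # Pick the highest-scoring non-SMN candidate for this target.
    best <- g[which.max(g$score), , drop = FALSE]
    data.frame(
      target_row_key = g$target_row_key[1],
      top_non_smn_source = best$source,
      top_non_smn_score = best$score,
      non_smn_sources = paste(sort(unique(g$source)), collapse = ", "),
      candidate_count = nrow(g),
      stringsAsFactors = FALSE
    )
  })

  # Drop targets that were covered by SMN.
  rows <- Filter(Negate(is.null), unname(rows))
  if (length(rows) == 0) {
    return(data.frame())
  }
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# Toy input (made-up values): "site" has an SMN candidate, "run_id" does not.
toy_suggestions <- data.frame(
  target_row_key = c("run_id", "run_id", "site", "site"),
  source = c("gbif", "worms", "smn", "gbif"),
  score = c(0.9, 0.85, 0.95, 0.6),
  stringsAsFactors = FALSE
)

gaps_sketch <- summarise_gaps(toy_suggestions)
gaps_sketch
```

Only "run_id" is reported in this toy case, with "gbif" as its top non-SMN source and two candidates counted.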
Details
It is designed to support a practical workflow:
- generate semantic suggestions with suggest_semantics();
- detect unresolved gaps with detect_semantic_term_gaps();
- render request payloads with render_ontology_term_request();
- optionally submit issues with submit_term_request_issues().
Examples
suggestions <- tibble::tibble(
  dataset_id = c("d1", "d1"),
  table_id = c("t1", "t1"),
  column_name = c("run_id", "run_id"),
  code_value = NA_character_,
  column_label = c("Run ID", "Run ID"),
  column_description = "Run identifier from local monitoring pipeline",
  dictionary_role = c("variable", "variable"),
  target_scope = c("column", "column"),
  target_sdp_file = c("column_dictionary.csv", "column_dictionary.csv"),
  target_sdp_field = c("term_iri", "term_iri"),
  target_row_key = c("run_id", "run_id"),
  search_query = c("run_id", "run_id"),
  label = c("Run ID", "Run ID"),
  iri = c(NA_character_, NA_character_),
  source = c("gbif", "worms"),
  ontology = c("gbif", "worms"),
  match_type = c("label", "label"),
  definition = NA_character_,
  score = c(0.9, 0.85)
)
gaps <- detect_semantic_term_gaps(
  suggestions = suggestions,
  include_dictionary_roles = "variable"
)
gaps