
Glossary of Terms
glossary.RmdThis glossary defines technical terms used throughout the metasalmon documentation in plain English. You don’t need to memorize these - refer back here whenever you encounter an unfamiliar term.
Core Concepts
Data Package
A folder containing your data files plus a “recipe file” (called
datapackage.json) that explains what’s in them. Think of it
like shipping a box with a detailed packing list - the recipient knows
exactly what they’re getting.
Example: A folder containing: -
escapement.csv (your data) - datapackage.json
(the recipe/packing list) - column_dictionary.csv (column
descriptions)
Data Dictionary
A table that describes each column in your data - what it means, what type of values it contains, and what units it uses.
Example:
| Column | Description | Type |
|---|---|---|
| POP_ID | Unique population identifier | text |
| SPAWNERS | Estimated spawner count | number |
| YEAR | Survey year | integer |
Column Role
A category that describes what kind of information a column contains. metasalmon uses these roles:
| Role | What it means | Examples |
|---|---|---|
| identifier | Uniquely identifies a record | POP_ID, SITE_CODE |
| attribute | Describes a characteristic | STREAM_NAME, SPECIES |
| measurement | A numeric observation | SPAWNER_COUNT, TEMPERATURE |
| temporal | A date or time | SURVEY_DATE, YEAR |
| categorical | A code from a fixed list | RUN_TYPE, QUALITY_FLAG |
Semantic Web Terms
These terms relate to linking your data to standard scientific definitions. You don’t need to understand these to use metasalmon - they’re handled automatically. This section is for those who want to understand what’s happening behind the scenes.
IRI (Internationalized Resource Identifier)
A web address (like a URL) that points to an official definition of a term. When we say “spawner count” means the same thing across all DFO datasets, we prove it by pointing to a shared IRI.
Example:
https://w3id.org/gcdfo/salmon#NaturalSpawnerCount
Think of it like a library catalog number - it uniquely identifies a specific concept so there’s no confusion.
Ontology
A shared vocabulary where scientists agree on what terms mean - like a specialized dictionary for salmon science. An ontology defines not just terms, but also how they relate to each other.
Example: The DFO Salmon Ontology defines terms like “Conservation Unit”, “Escapement”, and “Run Timing”, and explains how they relate (e.g., a Conservation Unit contains multiple Populations).
SKOS (Simple Knowledge Organization System)
A system for organizing vocabulary terms into lists. Used for: - Code lists (all valid species codes) - Measurement definitions (what “escapement” means) - Vocabulary schemes (all run timing categories)
Plain English: Think of SKOS as a way to create organized lists with definitions.
OWL (Web Ontology Language)
A system for defining categories of things and their relationships. Used for: - Defining what a “Conservation Unit” is - Saying that “Coho” is a type of “Pacific Salmon” - Describing that a “Population” belongs to a “Conservation Unit”
Plain English: Think of OWL as a classification system, like how biology classifies species into genus, family, order, etc.
Semantic
“Meaning-aware” - data that carries its definitions with it so computers and humans can understand it the same way.
Non-semantic data: A column called
SPAWN_EST with no explanation Semantic
data: The same column linked to a definition that says
“Estimated count of naturally spawning adults”
I-ADOPT Framework
I-ADOPT (InteroperAble Descriptions of Observable Property Terminology) is a framework for precisely describing what a measurement represents. It answers the question: “What exactly did you measure?”
The Core Components
I-ADOPT defines observable properties using four components:
| Component | Question it answers | Example |
|---|---|---|
| Variable | What compound concept? | “Sea surface temperature” (the full term) |
| Property | What characteristic? | Temperature |
| Entity | Of what thing? | Sea surface (water at ocean surface) |
| Constraint | Any qualifiers or limits? | Maximum, daily average, etc. |
Why this matters: Two researchers might both measure “temperature” but mean different things (air vs. water, surface vs. depth). I-ADOPT removes ambiguity by requiring you to specify exactly what property of what entity you measured.
Understanding the Components
-
Variable: The complete, compound term that
describes what you measured (e.g., “Natural spawner count”). In
metasalmon, this maps to
term_iri. -
Property: The measurable characteristic (e.g.,
“count”, “temperature”, “length”). Maps to
property_iri. -
Entity: The thing being measured (e.g., “spawning
salmon”, “stream water”). Maps to
entity_iri. -
Constraint: Optional qualifiers that narrow the
scope (e.g., “maximum”, “annual”, “wild-origin only”). Maps to
constraint_iri.
Units (Not Part of I-ADOPT Core)
While I-ADOPT focuses on describing what you measured,
you’ll also need to record the units (how the
measurement is expressed). metasalmon includes unit_iri for
this purpose, typically linking to vocabularies like QUDT.
Note: Method (how the measurement was made) is important metadata but is not part of the I-ADOPT framework itself. You can document methods in your column descriptions or other metadata fields.
Frictionless Data
A standard format for data packages that works across different software and programming languages. metasalmon creates Frictionless-compatible packages, which means:
- Your packages work with Python, R, JavaScript, and other tools
- Other researchers can open your packages without installing metasalmon
- The format follows an international standard (not a DFO-specific format)
Learn more: frictionlessdata.io
Quick Reference
| Term | One-line definition |
|---|---|
| Data Package | Folder with data + documentation |
| Data Dictionary | Table describing your columns |
| Column Role | What type of information a column contains |
| Code List | Definitions for categorical codes |
| IRI | Web address pointing to a definition |
| Ontology | Shared vocabulary with relationships |
| SKOS | System for organizing term lists |
| OWL | System for classification hierarchies |
| Semantic | Data that carries its meaning with it |
| I-ADOPT | Framework for describing observable properties (variable, property, entity, constraint) |
| Frictionless | International data package standard |
| DwC-DP | Darwin Core Data Package profile (Frictionless-based) |
| Assertion (DwC-DP) | MeasurementOrFact pattern: assertionType/value/unit |
| ZOOMA | EBI text-to-ontology annotator with confidence tiers |
| NVS SPARQL | NERC Vocabulary Server SPARQL endpoint (P01/P06) |
| QUDT | Quantities, Units, Dimensions and Types ontology (preferred for units) |
| GBIF | Global Biodiversity Information Facility (taxon backbone) |
| WoRMS | World Register of Marine Species (marine taxa resolver) |
| Query expansion | Automatic expansion of search terms based on role context |
| Cross-source agreement | Boost for terms appearing in multiple vocabulary sources |
| Diagnostics attribute | Per-source search status/timing attached to
find_terms() results |
alignment_only |
Flag for Wikidata terms (useful for crosswalks, not canonical IRIs) |
include_dwc |
Parameter to include optional DwC-DP mappings in suggestions |
Still Confused?
That’s okay! The most important terms for getting started are:
- Data Package - what you’re creating
- Data Dictionary - what describes your columns
- Code List - what explains your categorical codes
Everything else is handled automatically by metasalmon. You can create excellent, shareable data packages without understanding IRIs, SKOS, OWL, or I-ADOPT.
Ready to get started? See the 5-Minute Quickstart.