Skip to contents

This glossary defines technical terms used throughout the metasalmon documentation in plain English. You don’t need to memorize these - refer back here whenever you encounter an unfamiliar term.

Core Concepts

Data Package

A folder containing your data files plus a “recipe file” (called datapackage.json) that explains what’s in them. Think of it like shipping a box with a detailed packing list - the recipient knows exactly what they’re getting.

Example: A folder containing: - escapement.csv (your data) - datapackage.json (the recipe/packing list) - column_dictionary.csv (column descriptions)

Data Dictionary

A table that describes each column in your data - what it means, what type of values it contains, and what units it uses.

Example:

Column Description Type
POP_ID Unique population identifier text
SPAWNERS Estimated spawner count number
YEAR Survey year integer

Column Role

A category that describes what kind of information a column contains. metasalmon uses these roles:

Role What it means Examples
identifier Uniquely identifies a record POP_ID, SITE_CODE
attribute Describes a characteristic STREAM_NAME, SPECIES
measurement A numeric observation SPAWNER_COUNT, TEMPERATURE
temporal A date or time SURVEY_DATE, YEAR
categorical A code from a fixed list RUN_TYPE, QUALITY_FLAG

Code List

A table that explains what each code value means for categorical columns.

Example for a SPECIES column:

Code Label
CO Coho Salmon
CH Chinook Salmon
PK Pink Salmon
SO Sockeye Salmon

Semantic Web Terms

These terms relate to linking your data to standard scientific definitions. You don’t need to understand these to use metasalmon - they’re handled automatically. This section is for those who want to understand what’s happening behind the scenes.

IRI (Internationalized Resource Identifier)

A web address (like a URL) that points to an official definition of a term. When we say “spawner count” means the same thing across all DFO datasets, we prove it by pointing to a shared IRI.

Example: https://w3id.org/gcdfo/salmon#NaturalSpawnerCount

Think of it like a library catalog number - it uniquely identifies a specific concept so there’s no confusion.

Ontology

A shared vocabulary where scientists agree on what terms mean - like a specialized dictionary for salmon science. An ontology defines not just terms, but also how they relate to each other.

Example: The DFO Salmon Ontology defines terms like “Conservation Unit”, “Escapement”, and “Run Timing”, and explains how they relate (e.g., a Conservation Unit contains multiple Populations).

SKOS (Simple Knowledge Organization System)

A system for organizing vocabulary terms into lists. Used for: - Code lists (all valid species codes) - Measurement definitions (what “escapement” means) - Vocabulary schemes (all run timing categories)

Plain English: Think of SKOS as a way to create organized lists with definitions.

OWL (Web Ontology Language)

A system for defining categories of things and their relationships. Used for: - Defining what a “Conservation Unit” is - Saying that “Coho” is a type of “Pacific Salmon” - Describing that a “Population” belongs to a “Conservation Unit”

Plain English: Think of OWL as a classification system, like how biology classifies species into genus, family, order, etc.

Semantic

“Meaning-aware” - data that carries its definitions with it so computers and humans can understand it the same way.

Non-semantic data: A column called SPAWN_EST with no explanation Semantic data: The same column linked to a definition that says “Estimated count of naturally spawning adults”

I-ADOPT Framework

I-ADOPT (InteroperAble Descriptions of Observable Property Terminology) is a framework for precisely describing what a measurement represents. It answers the question: “What exactly did you measure?”

The Core Components

I-ADOPT defines observable properties using four components:

Component Question it answers Example
Variable What compound concept? “Sea surface temperature” (the full term)
Property What characteristic? Temperature
Entity Of what thing? Sea surface (water at ocean surface)
Constraint Any qualifiers or limits? Maximum, daily average, etc.

Why this matters: Two researchers might both measure “temperature” but mean different things (air vs. water, surface vs. depth). I-ADOPT removes ambiguity by requiring you to specify exactly what property of what entity you measured.

Understanding the Components

  • Variable: The complete, compound term that describes what you measured (e.g., “Natural spawner count”). In metasalmon, this maps to term_iri.
  • Property: The measurable characteristic (e.g., “count”, “temperature”, “length”). Maps to property_iri.
  • Entity: The thing being measured (e.g., “spawning salmon”, “stream water”). Maps to entity_iri.
  • Constraint: Optional qualifiers that narrow the scope (e.g., “maximum”, “annual”, “wild-origin only”). Maps to constraint_iri.

Units (Not Part of I-ADOPT Core)

While I-ADOPT focuses on describing what you measured, you’ll also need to record the units (how the measurement is expressed). metasalmon includes unit_iri for this purpose, typically linking to vocabularies like QUDT.

Note: Method (how the measurement was made) is important metadata but is not part of the I-ADOPT framework itself. You can document methods in your column descriptions or other metadata fields.

Frictionless Data

A standard format for data packages that works across different software and programming languages. metasalmon creates Frictionless-compatible packages, which means:

  • Your packages work with Python, R, JavaScript, and other tools
  • Other researchers can open your packages without installing metasalmon
  • The format follows an international standard (not a DFO-specific format)

Learn more: frictionlessdata.io

Quick Reference

Term One-line definition
Data Package Folder with data + documentation
Data Dictionary Table describing your columns
Column Role What type of information a column contains
Code List Definitions for categorical codes
IRI Web address pointing to a definition
Ontology Shared vocabulary with relationships
SKOS System for organizing term lists
OWL System for classification hierarchies
Semantic Data that carries its meaning with it
I-ADOPT Framework for describing observable properties (variable, property, entity, constraint)
Frictionless International data package standard
DwC-DP Darwin Core Data Package profile (Frictionless-based)
Assertion (DwC-DP) MeasurementOrFact pattern: assertionType/value/unit
ZOOMA EBI text-to-ontology annotator with confidence tiers
NVS SPARQL NERC Vocabulary Server SPARQL endpoint (P01/P06)
QUDT Quantities, Units, Dimensions and Types ontology (preferred for units)
GBIF Global Biodiversity Information Facility (taxon backbone)
WoRMS World Register of Marine Species (marine taxa resolver)
Query expansion Automatic expansion of search terms based on role context
Cross-source agreement Boost for terms appearing in multiple vocabulary sources
Diagnostics attribute Per-source search status/timing attached to find_terms() results
alignment_only Flag for Wikidata terms (useful for crosswalks, not canonical IRIs)
include_dwc Parameter to include optional DwC-DP mappings in suggestions

Still Confused?

That’s okay! The most important terms for getting started are:

  1. Data Package - what you’re creating
  2. Data Dictionary - what describes your columns
  3. Code List - what explains your categorical codes

Everything else is handled automatically by metasalmon. You can create excellent, shareable data packages without understanding IRIs, SKOS, OWL, or I-ADOPT.

Ready to get started? See the 5-Minute Quickstart.