Hosting Data and Code

This guide is intended for scientists and technical staff in the DFO Pacific Region Science Branch. It outlines what is allowed, restricted, or prohibited when using internal and external platforms (e.g., SharePoint, GitHub, Zenodo, Google Drive) to manage and share DFO Science data, code, and metadata, including during Fishery Science Advisory Report (FSAR) workflows.

For FSAR workflows, we recommend that authors save their data in the SharePoint site for the Pacific Salmon Data MS Team. All data should be accompanied by a data dictionary and a README. For resources on creating a data dictionary, please see our blurb on data standards. GitHub can be used to host the code associated with analyses.
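
As one possible starting point, a data dictionary skeleton can be drafted from the column headers of a CSV file. The sketch below is illustrative only: the function name `draft_data_dictionary` and the output columns (`column`, `example_value`, `description`, `units`) are assumptions, not a DFO standard, and the descriptions and units still have to be written by the data author.

```python
import csv


def draft_data_dictionary(data_path, dict_path):
    """Write a data dictionary skeleton with one row per column of a CSV file.

    Hypothetical helper: the output layout is an assumption, not a DFO standard.
    """
    with open(data_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)          # first row holds the column names
        first_row = next(reader, [])   # empty list if the file has no data rows

    rows = []
    for i, name in enumerate(header):
        example = first_row[i] if i < len(first_row) else ""
        rows.append({
            "column": name,
            "example_value": example,
            "description": "",  # to be filled in by the data author
            "units": "",        # to be filled in by the data author
        })

    fieldnames = ["column", "example_value", "description", "units"]
    with open(dict_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `draft_data_dictionary("counts.csv", "counts_dictionary.csv")` (both file names are placeholders) would write the skeleton next to the data so it can be completed and saved alongside the README.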


Table 1. External platform use guidelines for data and code under three classifications: Unclassified, Protected A, and Protected B. This table does not cover Protected C or Classified data, which are not permitted on external platforms.

| Platform | Unclassified | Protected A | Protected B | Notes |
|---|---|---|---|---|
| SharePoint | ✅ | ✅ | ✅ | SharePoint can also be used to share protected data with external CSAS participants or reviewers by submitting a request for email approval from the Regional Director of Science. Attach the approval email to this Assystnet ticket. |
| GitHub (public repo) | Code: ✅ yes. Data: ⚠️ only allowed after internal DFO release approval | ❌ Prohibited | ❌ Prohibited | Public GitHub repos are only for non-sensitive code or metadata. Data must go through the internal DFO public data release process, even if unclassified. |
| GitHub (private repo) | ✅ Before release approval and sharing with specific collaborators | ⚠️ Discouraged | ❌ Prohibited | Useful for sharing non-sensitive (usually derived) data with external reviewers or CSAS participants; not for Protected data (i.e., raw data). |
| GitHub Codespaces | ✅ | ❌ | ❌ | Don’t use for Protected data or model outputs. |
| GitHub Copilot | ✅ | ❌ | ❌ | Never input Protected code/data. |
| Zenodo / Figshare / Dryad | ✅ After FSAR approval | ❌ | ❌ | Good for quickly getting datasets published with DOIs. Must be followed up with publication to the Open Government Portal. |
| GC Data Portal (Open.Canada.ca) | ✅ Required for final release | ❌ | ❌ | Official platform for open datasets. Requires release review. |
| GCcollab / GCshare | ✅ Internal sharing | ✅ With care | ⚠️ Rare; PB sharing requires approval | Use for internal collaboration. |
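
For teams that want to check a planned upload in a script, Table 1 can be transcribed into a small lookup. The sketch below is one possible encoding, not an official tool: the names `TABLE_1` and `platform_status` and the three status codes are assumptions, and the mixed "Code: yes / Data: after approval" cell is simplified to a single "caution" status. The table itself remains the authoritative reference.

```python
# Status codes mirror the symbols used in Table 1 (assumed naming).
ALLOWED, CAUTION, PROHIBITED = "allowed", "caution", "prohibited"
CLASSIFICATIONS = ("Unclassified", "Protected A", "Protected B")

# One entry per platform; each list holds (status, condition) pairs in the
# order Unclassified, Protected A, Protected B.
TABLE_1 = {
    "SharePoint": [(ALLOWED, ""), (ALLOWED, ""), (ALLOWED, "")],
    "GitHub (public repo)": [
        (CAUTION, "Code: yes. Data: only after internal DFO release approval"),
        (PROHIBITED, ""),
        (PROHIBITED, ""),
    ],
    "GitHub (private repo)": [
        (ALLOWED, "Before release approval and sharing with specific collaborators"),
        (CAUTION, "Discouraged"),
        (PROHIBITED, ""),
    ],
    "GitHub Codespaces": [(ALLOWED, ""), (PROHIBITED, ""), (PROHIBITED, "")],
    "GitHub Copilot": [(ALLOWED, ""), (PROHIBITED, ""), (PROHIBITED, "")],
    "Zenodo / Figshare / Dryad": [
        (ALLOWED, "After FSAR approval"),
        (PROHIBITED, ""),
        (PROHIBITED, ""),
    ],
    "GC Data Portal (Open.Canada.ca)": [
        (ALLOWED, "Required for final release"),
        (PROHIBITED, ""),
        (PROHIBITED, ""),
    ],
    "GCcollab / GCshare": [
        (ALLOWED, "Internal sharing"),
        (ALLOWED, "With care"),
        (CAUTION, "Rare; PB sharing requires approval"),
    ],
}


def platform_status(platform, classification):
    """Return the (status, condition) pair that Table 1 gives for a platform."""
    if classification not in CLASSIFICATIONS:
        # Protected C and Classified data are outside the scope of Table 1.
        raise ValueError(f"Table 1 does not cover classification: {classification!r}")
    if platform not in TABLE_1:
        raise ValueError(f"Platform not listed in Table 1: {platform!r}")
    return TABLE_1[platform][CLASSIFICATIONS.index(classification)]
```

For example, `platform_status("GitHub (private repo)", "Protected A")` returns the "caution" status with the condition "Discouraged", and any classification outside the three columns raises an error rather than guessing.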

πŸ” Data Classification Quick Reference

  • Unclassified: No expected harm from disclosure. Includes open data, public metadata, and code that contains no sensitive logic or credentials.
    • Examples: Published R packages, fish counts aggregated by region and year, habitat model source code without raw data. Note, however, that even unclassified data must be approved for release by the Regional Director of Science.
  • Protected A: Could cause low-level injury if disclosed (e.g., minor confidentiality breach, operational disruption).
    • Examples: Unpublished but non-sensitive biological metrics tied to location or project; fishing effort by vessel class if potentially re-identifiable; unreviewed site-specific model outputs.
  • Protected B: Could cause serious injury if disclosed (e.g., violation of legal obligations, significant economic harm).
    • Examples: Commercial catch by vessel or license number; sensitive species occurrences; data provided under confidentiality agreements.

🔗 Refer to the Policy on Government Security for classification definitions and examples.
🔎 Unsure if data is protected? Consult your divisional data contact or ATIP office.


🧭 Platform-by-Platform Guidance

✅ GitHub (Public and Private)

  • Recommended for: Non-sensitive, unclassified code, metadata schemas, documentation, workflows.
  • Allowed: Unclassified data/code only. Data must go through the Approvals and Internal publishing process before being posted anywhere public.
  • Private repos: Use for drafting and collaboration before public release. Do not use for long-term data storage.
  • FSAR Guidance: Posting data in a public GitHub repo before the FSAR is formally approved by CSAS and before the release has been approved by the Regional Director of Science is likely a breach of the Access to Information Act and/or the Privacy Act.

🔗 TBS Directive on Open Government, Section 6.1 – Open information by default, subject to security/privacy.
🔗 DFO Policy for Scientific Data – “It is the responsibility of the Regional Director of Science to designate specific Data as classified for preventing its open sharing.”
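
One way to reduce the risk of data reaching a public repo by accident is to list data-like files in the working copy before pushing. The sketch below is a minimal illustration: the function name `find_data_files` and the extension list are assumptions, and the check cannot tell whether a file is classified or approved for release. It only flags files for a person to review.

```python
from pathlib import Path

# Assumed list of file extensions that usually indicate data rather than code.
DATA_EXTENSIONS = {".csv", ".xlsx", ".rds", ".rdata", ".parquet", ".sqlite"}


def find_data_files(repo_dir, extensions=DATA_EXTENSIONS):
    """Return repo-relative paths of files whose extension suggests data."""
    repo = Path(repo_dir)
    found = []
    for path in repo.rglob("*"):
        relative = path.relative_to(repo)
        if ".git" in relative.parts:
            continue  # skip Git's internal files
        if path.is_file() and path.suffix.lower() in extensions:
            found.append(relative.as_posix())
    return sorted(found)
```

An empty result from `find_data_files(".")` does not mean the repo is safe to publish; code and documentation can also contain sensitive values, so the release approval steps above still apply.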


🚫 GitHub Codespaces / Copilot

  • Codespaces: Cloud-hosted dev environments. Not GC-approved for sensitive data. Fine for non-sensitive code.
  • Copilot: Do not enter Protected data or code. Avoid it unless you are using it for generic logic.

🔗 TBS Guide on Generative AI


✅ Canadian Integrated Ocean Observing System, Zenodo, Dryad, Figshare

  • Use after FSAR or publication approval.
  • Good for creating DOIs, enabling citation, and disseminating data broadly.
  • Data must be fully unclassified, reviewed, and approved.

✅ GC Open Data Portal

  • Official repository for open DFO datasets.
  • Requires metadata, data steward approval, and Open Government Licence.

🔗 Open Government Licence
🔗 Open Data Portal Submission Guide


🧾 Internal DFO Requirements

📀 Internal release process

For the steps specific to Approvals and Publishing, see step 4 for the Release Criteria Checklist.

⏳ When to Release?

  • For FSAR-related data: Only after the advice is finalized and the report is approved.
  • For code/metadata tools: May be shared publicly once confirmed to contain no sensitive content.

🔗 CSAS Procedural Manuals – Supporting material must not be released until advice is finalized.
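
The timing rules above can be written down as a simple gate, for example as a reminder at the top of a publishing script. This is a hedged sketch only: the function `may_release_publicly`, its argument names, and the two `kind` labels are hypothetical, and the release approval flag reflects the Regional Director of Science approval described earlier in this guide. It records decisions people have already made; it does not make them.

```python
def may_release_publicly(kind, *, advice_finalized=False, report_approved=False,
                         release_approved=False, no_sensitive_content=False):
    """Return True only when the conditions in this guide are all met.

    kind is "fsar_data" or "code_or_metadata" (hypothetical labels).
    """
    if kind == "fsar_data":
        # FSAR-related data: advice finalized, report approved, and the
        # release itself approved.
        return advice_finalized and report_approved and release_approved
    if kind == "code_or_metadata":
        # Code and metadata tools: confirmed to contain no sensitive content.
        return no_sensitive_content
    raise ValueError(f"Unknown kind: {kind!r}")
```

All flags default to `False`, so a forgotten confirmation blocks release rather than allowing it.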


📬 Need Help?

  • Data Classification or Release Questions: DFO.PACIFIC.SCIENCE.DATA@dfo-mpo.gc.ca
  • CSAS Advice Process Questions: Contact your regional CSAS Coordinator
  • Open Science & GitHub Use: Data Stewardship Unit

ℹ️ This document will be updated periodically. Please check back for revised guidance as policies evolve.