
Read a CSV from a GitHub repository
Reads a CSV file directly from a GitHub repository (public or private) and
returns it as a tibble. Authentication is handled via the GitHub PAT stored
by ms_setup_github(); the token is sent via HTTP headers, not embedded in
the URL.
Arguments
- path
Path to the CSV file inside the repository (e.g.,
"data/observations.csv"), or a full GitHub URL (blob or raw format).
- ref
Git reference: branch name, tag, or commit SHA. Defaults to
"main". For reproducible analyses, prefer tags or commit SHAs. Ignored when
path is already a full URL with a ref embedded.
- repo
Repository slug in "owner/name" form. Required when path is a relative
path; optional when path is a full URL.
- token
Optional GitHub PAT override. If NULL (default), uses the token from
gh::gh_token(), which is typically set by ms_setup_github().
- ...
Additional arguments passed to readr::read_csv(), such as col_types, skip,
or n_max.
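The path, ref, and repo arguments interact: when path is a full URL, the repository and ref can be taken from the URL itself. The sketch below shows one way this could be done in base R. The helper name parse_github_url() is hypothetical and not part of the documented API, and the patterns assume the ref contains no slash (a plain branch name, tag, or SHA). The package's actual parsing may differ.

```r
# Hypothetical helper for illustration only; not the package's own code.
# Assumes the ref segment contains no "/" (e.g. "main", "v1.0.0", "a1b2c3d").
parse_github_url <- function(url) {
  blob_rx <- "^https://github\\.com/([^/]+/[^/]+)/blob/([^/]+)/(.+)$"
  raw_rx  <- "^https://raw\\.githubusercontent\\.com/([^/]+/[^/]+)/([^/]+)/(.+)$"

  for (rx in c(blob_rx, raw_rx)) {
    # regmatches() returns the full match followed by the capture groups,
    # or character(0) when the pattern does not match.
    m <- regmatches(url, regexec(rx, url))[[1]]
    if (length(m) == 4) {
      return(list(repo = m[2], ref = m[3], path = m[4]))
    }
  }

  NULL  # not a recognised full URL; the caller would treat it as a relative path
}

parts <- parse_github_url(
  "https://github.com/myorg/myrepo/blob/main/data/observations.csv"
)
parts$repo  # "myorg/myrepo"
parts$ref   # "main"
parts$path  # "data/observations.csv"
```

A relative path such as "data/observations.csv" makes this helper return NULL, which is the case where repo must be supplied explicitly.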
Details
This function supports automatic retries with exponential backoff for transient network errors.
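As a rough illustration of retrying with exponential backoff, the sketch below wraps an arbitrary function call. The helper with_retries() and its parameters (max_tries, base_delay, sleep) are hypothetical names chosen for this example, and the defaults are assumptions, not the package's settings. For simplicity it retries on any error, whereas the function described here retries only transient network errors.

```r
# Hypothetical helper for illustration only; not the package's internal code.
# Calls fn() up to max_tries times, doubling the wait after each failure:
# base_delay, 2 * base_delay, 4 * base_delay, ...
with_retries <- function(fn, max_tries = 4, base_delay = 1, sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fn(), error = function(e) e)

    if (!inherits(result, "error")) {
      return(result)  # success: stop retrying
    }
    if (attempt == max_tries) {
      stop(result)    # out of attempts: re-signal the last error
    }

    sleep(base_delay * 2^(attempt - 1))
  }
}

# Usage: a function that fails twice, then succeeds on the third call.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated timeout")
  "ok"
}
with_retries(flaky, base_delay = 0.1)
```

The sleep argument is injectable so the waiting behaviour can be replaced in tests without actually pausing.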
Before using this function, run ms_setup_github() once to configure
authentication. For private repositories, your PAT must have the repo
scope.
For reproducible analyses, pin to a specific tag or commit SHA rather than
a branch name like "main", since branch contents can change over time.
See also
ms_setup_github() for authentication setup,
github_raw_url() for getting the raw URL without fetching data.
Examples
if (FALSE) { # \dontrun{
  # First, set up authentication (run once)
  ms_setup_github(repo = "myorg/myrepo")

  # Read a CSV from the main branch
  data <- read_github_csv("data/observations.csv", repo = "myorg/myrepo")

  # Pin to a release tag for reproducibility
  data_v1 <- read_github_csv(
    "data/observations.csv",
    ref = "v1.0.0",
    repo = "myorg/myrepo"
  )

  # Pin to a specific commit
  data_exact <- read_github_csv(
    "data/observations.csv",
    ref = "a1b2c3d",
    repo = "myorg/myrepo"
  )

  # Pass arguments to read_csv
  data_typed <- read_github_csv(
    "data/observations.csv",
    repo = "myorg/myrepo",
    col_types = "ccin"
  )

  # Read from a full GitHub URL
  data_url <- read_github_csv(
    "https://github.com/myorg/myrepo/blob/main/data/observations.csv"
  )
} # }