collect_tern_data() — Vectorized Data Collection
Adam H. Sparks
Source:vignettes/collect_tern_data.Rmd
collect_tern_data.RmdCollecting TERN Data Over Time and Space
The collect_tern_data() function provides a vectorized
interface to extract values from one or more TERN datasets at given
location(s) over a date range. Unlike individual wrapper functions
(read_smips(), read_asc(), etc.) which return
raster objects, collect_tern_data() returns a data
table with extracted point values — ideal for analysis
workflows.
Basic Usage
Extract SMIPS soil moisture for a single location over multiple dates:
library(nert)
dates <- seq(as.Date("2024-01-01"), as.Date("2024-01-10"), by = "day")
dt <- collect_tern_data(
lon = 138.6,
lat = -34.9,
date_range = dates,
datasets = "SMIPS",
verbose = TRUE
)
head(dt)When verbose = TRUE, you’ll see: 1. Dataset
information table showing what will be collected (name, ID,
layers, temporal resolution, spatial resolution, description) 2.
Progress messages as data is downloaded 3.
Summary of rows and columns collected
Default Behavior
By default, collect_tern_data() extracts all
available variants/depths:
- SMIPS: All 6 soil moisture variants (totalbucket, SMindex, bucket1, bucket2, deepD, runoff)
- SLGA datasets (AWC, CLY, SND, SLT, BDW, PHC, PHW, NTO): All 6 soil depths (0–5 cm through 100–200 cm)
This ensures you get comprehensive data without specifying each option:
# Gets all 6 SMIPS variants + all 6 AWC depths in one call
dt <- collect_tern_data(
lon = 138.6,
lat = -34.9,
date_range = "2024-01-15",
datasets = c("SMIPS", "AWC"),
verbose = FALSE # Hide table for brevity
)
names(dt)
# [1] "date" "SMIPS_totalbucket" "SMIPS_SMindex"
# [4] "SMIPS_bucket1" "SMIPS_bucket2" "SMIPS_deepD"
# [7] "SMIPS_runoff" "AWC_000_005" "AWC_005_015"
# [10] "AWC_015_030" "AWC_030_060" "AWC_060_100"
# [13] "AWC_100_200"Multiple Locations
Extract data for multiple coordinates (returns one row per location per date):
# Vector notation
dt_multi <- collect_tern_data(
lon = c(138.6, 139.5, 140.1),
lat = c(-34.9, -35.2, -34.5),
date_range = c("2024-01-01", "2024-01-05"),
datasets = c("SMIPS", "CANOPY")
)
nrow(dt_multi) # 3 locations × 5 dates = 15 rows
ncol(dt_multi) # date + lon + lat + SMIPS_variants(6) + CANOPY(1) = 11 colsOr use xy notation with a data.frame:
locations <- data.frame(
lon = c(138.6, 139.5, 140.1),
lat = c(-34.9, -35.2, -34.5)
)
dt_xy <- collect_tern_data(
xy = locations,
date_range = "2024-01-15",
datasets = c("ASC", "PHENOLOGY")
)Selective Extraction
Extract specific variants or depths instead of all:
# Only specific SMIPS collection
dt_smi <- collect_tern_data(
lon = 138.6,
lat = -34.9,
date_range = "2024-01-15",
datasets = "SMIPS",
smips_collection = "SMindex" # Just soil moisture index (0-100%)
)
# Returns: date, SMIPS_SMindex
# Only specific depth for SLGA
dt_surface <- collect_tern_data(
lon = 138.6,
lat = -34.9,
date_range = "2024-01-15",
datasets = "CLY",
depth = "000_005" # Top 5 cm only
)
# Returns: date, CLYAll 14 Available Datasets
By default, collect_tern_data() collects all 14 datasets
when called without specifying datasets:
# Collect everything
dt_all <- collect_tern_data(
lon = 138.6,
lat = -34.9,
date_range = "2024-01-15"
# datasets = NULL (or omitted — defaults to all)
)This returns:
| Category | Datasets | Columns |
|---|---|---|
| Time-series | SMIPS (6 variants), AET | 7 |
| Soil | ASC, AWC, CLY, SND, SLT, BDW, PHC, PHW, NTO (6 depths each) | 1 + 54 |
| Vegetation/Canopy | CANOPY, PHENOLOGY, SOILDIV | 3 |
| Total | 14 datasets | ~65 columns |
Understanding Column Names
Output columns follow a naming pattern for clarity:
-
Time-series:
{DATASET}_{VARIANT}or{DATASET}_{LAYER}-
SMIPS_totalbucket,SMIPS_SMindex,SMIPS_bucket1, … -
AET(single layer)
-
-
SLGA (soil):
{COLLECTION}_{DEPTH}-
AWC_000_005,AWC_005_015, …,AWC_100_200 -
CLY_000_005,CLY_005_015, …, etc.
-
-
Static (single-layer): Dataset name
-
ASC(character soil order, e.g., “2 - Sodosol”) -
CANOPY(height in metres) -
PHENOLOGY(day-of-year) -
SOILDIV(NMDS ordination score)
-
Handling Missing Data
If a request fails (network error, API unavailable, etc.), that
column will contain NA for affected rows. Use
na.rm = TRUE to remove rows where all dataset columns are
NA:
dt <- collect_tern_data(
lon = 138.6,
lat = -34.9,
date_range = dates,
datasets = c("SMIPS", "AET"),
na.rm = TRUE # Remove dates with no successful extractions
)Data Types
- Numeric columns (default): SMIPS, AET, SLGA, CANOPY, PHENOLOGY, SOILDIV
- Character column: ASC returns soil order descriptions (e.g., “2 - Sodosol”, “15 - Vertosol”)
Use as.numeric() on ASC column if you need just the
numeric code.
Performance Tips
- Request only needed datasets to reduce download time
-
Use specific variants/depths (e.g.,
depth = "000_005") if all 6 depths aren’t needed - Batch multiple locations instead of looping — vectorized operations are faster
-
Set
verbose = FALSEafter testing to suppress messages
Examples
Agricultural Analysis: Soil Properties at a Field
# Extract soil properties at one location for a growing season
field <- list(lon = 138.6, lat = -34.9)
dt_soil <- collect_tern_data(
lon = field$lon,
lat = field$lat,
date_range = "2024-01-01", # Static datasets, so date doesn't matter
datasets = c("AWC", "CLY", "PHC"), # Water capacity, clay %, pH
depth = "all" # All 6 soil depths
)
# Wide format for each soil attribute across depths
head(dt_soil)Climate Monitoring: Multiple Sites
# Monitor evapotranspiration and canopy height at 5 monitoring stations
stations <- data.frame(
name = c("Adelaide", "Murray Bridge", "Naracoorte", "Coonalpyn", "Kingoonya"),
lon = c(138.6, 139.3, 140.8, 140.7, 137.2),
lat = c(-34.9, -35.3, -37.3, -35.5, -30.5)
)
et_data <- collect_tern_data(
xy = stations,
date_range = c("2023-01-01", "2023-12-31"),
datasets = "AET",
verbose = TRUE
)
# Pivot to wide format for comparison
pivot_wider(et_data,
names_from = name,
values_from = AET,
id_cols = date)