W.50 Dataset¶
Description¶
Target Soil Properties: SOC, pH, Clay
Groups of Features: DEM, ERa, VI, XRF
Sample size: 50
Number of Features: 15
Coordinates: Without coordinates because of privacy concerns
Location: Wisconsin, USA
Sampling Design: Conditioned Latin hypercube sampling based on electrical conductivity, terrain parameters, and normalized difference vegetation index
Study Area Size: 80 ha
Geological Setting: Glacial outwash and sediments of the Johnson End Moraine
Previous Data Publication: None
- Contact Information:
Jingyi Huang (jhuang426@wisc.edu), University of Wisconsin-Madison
License: CC BY-SA 4.0
Publication/Modification Date (d/m/y): 28.02.25, version 1.0
- Changelog:
Version 1.0 (28.02.25): Initial release
Details¶
Dataset¶
The dataset contains the following target soil properties and features:
Target Soil Properties:¶
- SOC - Soil Organic Carbon
Code: SOC_target
Unit: %
Protocol: Measured CO₂ release during dry combustion after removing inorganic carbon with an acid (Nelson and Sommers 1996)
Sampling Date: July 2019
Sampling Depth: 0 – 10 cm
- pH
Code: pH_target
Unit: Unitless
Protocol: Measured in water suspension with a glass electrode with a 1:1 liquid:soil gravimetric ratio (Burt 2014)
Sampling Date: July 2019
Sampling Depth: 0 – 10 cm
- Clay
Code: Clay_target
Unit: %
Protocol: Hydrometer method; separation of the fractions by sieving and sedimentation, with the separated fractions measured from the density of the suspension (Gee and Bauder 1979)
Sampling Date: July 2019
Sampling Depth: 0 – 10 cm
Groups of Features:¶
- DEM – Digital Elevation Model and Terrain Parameters
Number of Features: 2
Code(s): Altitude, Slope
Unit: Altitude in m, Slope in °
Sensing: Digital elevation model raster (3 m) based on LiDAR from the “Wisconsin Department of Natural Resources”
Processing: Calculating Slope with the terrain function of the raster package, extracting DEM values from the raster at soil sampling locations, resampled from the original 3 m resolution to 5 m resolution
Sampling Date: Unknown
- ERa – Apparent Electrical Resistivity
Number of Features: 1
Code(s): ERa
Unit: Ω m
Sensing: DUALEM-1HS instrument (DUALEM Inc., Milton, Canada) with an exploration depth of 0 – 30 cm, in situ
Processing: Ordinary kriging to interpolate the sensor readings to the soil sampling locations
Sampling Date: July 2019
- VI - Vegetation Indices
Number of Features: 2
Code(s): NDVI, GNDVI
Unit: Unitless
Sensing: Sentinel-2 image during vegetative period (Level-2A) from “Copernicus Open Access Hub”
Processing: Calculating NDVI as (B08 - B04) / (B08 + B04) and GNDVI as (B08 - B03) / (B08 + B03), extracting VI values from the raster at soil sampling locations
Sampling Date: July 2019
- XRF – X-ray Fluorescence Derived Elemental Concentrations
Number of Features: 10
Code(s): XRF_Mg, XRF_Al, XRF_Si, XRF_Ca, XRF_Ti, XRF_Mn, XRF_Fe, XRF_Zn, XRF_Sr, XRF_Zr
Unit: ppm (estimated with the XRF Geochem mode, not ground truth)
Sensing: Delta Premium PXRF spectrometer (Olympus Scientific Solutions Americas Inc., Waltham, USA), on dried and sieved samples (<2 mm) in the laboratory
Processing: Compton normalization method to transform full spectra into estimates of elemental concentrations, using the software supplied with the sensor (Geochem mode)
Sampling Date: July 2019
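The two vegetation index formulas listed for the VI group can be sketched in a few lines of NumPy. This is only an illustration of the arithmetic, not the processing code used to build the dataset: the helper name normalized_difference and the reflectance values are hypothetical, and returning NaN where both bands are zero is an assumption made here to avoid division by zero.

```python
import numpy as np


def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b), NaN where the sum is zero."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN
    np.divide(a - b, total, out=out, where=total != 0)
    return out


# Hypothetical surface reflectance values for three pixels (not from the dataset)
b08 = np.array([0.45, 0.30, 0.0])  # near infrared
b04 = np.array([0.05, 0.10, 0.0])  # red
b03 = np.array([0.08, 0.12, 0.0])  # green

ndvi = normalized_difference(b08, b04)   # (B08 - B04) / (B08 + B04)
gndvi = normalized_difference(b08, b03)  # (B08 - B03) / (B08 + B03)
print(ndvi)
print(gndvi)
```

Both indices share the same normalized-difference form, so a single helper covers them; only the visible band (red for NDVI, green for GNDVI) changes.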
Examples¶
from LimeSoDa import load_dataset, split_dataset
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error
import numpy as np
# Load and explore the dataset
data = load_dataset("W.50")
dataset = data["Dataset"]
folds = data["Folds"]
coords = data["Coordinates"] # Note: No coordinates available
# Split into train/test using fold 1
X_train, X_test, y_train, y_test = split_dataset(
    data=data,
    fold=1,
    targets=["pH_target", "SOC_target", "Clay_target"]
)
# Fit model and get predictions
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
# Calculate performance metrics
r2 = r2_score(y_test, predictions)
rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"R-squared: {r2:.7f}")
print(f"RMSE: {rmse:.7f}")
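With three targets passed at once, r2_score and mean_squared_error average over the targets by default, so the two numbers printed above mix pH, SOC, and Clay, which have different units and scales. Reporting each target separately may be easier to interpret. The following is a minimal sketch of that idea using only NumPy; the helper per_target_metrics and the observed and predicted values are hypothetical and are not part of LimeSoDa or of the dataset.

```python
import numpy as np


def per_target_metrics(y_true, y_pred, names):
    """Return {name: (r2, rmse)}, computed column by column."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    results = {}
    for j, name in enumerate(names):
        residual = y_true[:, j] - y_pred[:, j]
        ss_res = np.sum(residual ** 2)
        ss_tot = np.sum((y_true[:, j] - y_true[:, j].mean()) ** 2)
        r2 = 1.0 - ss_res / ss_tot              # coefficient of determination
        rmse = np.sqrt(np.mean(residual ** 2))  # in the unit of the target
        results[name] = (r2, rmse)
    return results


# Hypothetical observed and predicted values (not from the dataset)
targets = ["pH_target", "SOC_target", "Clay_target"]
y_obs = np.array([[6.0, 1.0, 10.0], [6.5, 2.0, 14.0], [7.0, 3.0, 18.0]])
y_hat = np.array([[6.0, 1.5, 12.0], [6.5, 2.0, 14.0], [7.0, 2.5, 16.0]])

metrics = per_target_metrics(y_obs, y_hat, targets)
for name, (r2, rmse) in metrics.items():
    print(f"{name}: R-squared = {r2:.3f}, RMSE = {rmse:.3f}")
```

Assuming y_test and predictions from the example above are two-dimensional with one column per target, in the order given to split_dataset, they could be passed to the helper in place of y_obs and y_hat.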
References¶
Burt, R. (Ed.) (2014). Kellogg soil survey laboratory methods manual. United States Department of Agriculture, Natural Resources Conservation Service, National Soil Survey Center, Kellogg Soil Survey Laboratory.
Gee, G. W., & Bauder, J. W. (1979). Particle size analysis by hydrometer: a simplified method for routine textural analysis and a sensitivity test of measurement parameters. Soil Science Society of America Journal, 43(5), 1004-1007.
Nelson, D. W., & Sommers, L. E. (1996). Total carbon, organic carbon, and organic matter. In D. L. Sparks, A. L. Page, P. A. Helmke, R. H. Loeppert, P. N. Soltanpour, M. A. Tabatabai, C. T. Johnston, & M. E. Sumner (Eds.), Methods of soil analysis. Part 3. Chemical methods (pp. 961-1010). Soil Science Society of America, Madison, WI.