SC.50 Dataset¶
Description¶
Target Soil Properties: SOC, pH, Clay
Groups of Features: DEM, ERa
Sample size: 50
Number of Features: 3
Coordinates: With coordinates (EPSG: 32722)
Location: Santa Catarina, Brazil
Sampling Design: Regular grid sampling
Study Area Size: 13 ha
Geological Setting: Heavily weathered soils originating from Mesozoic basalt rocks
Previous Data Publication: Full dataset published in Bottega & Safanelli (2024)
- Contact Information:
Eduardo Bottega (bottega.elb@gmail.com), Federal University of Santa Maria
José Lucas Safanelli (jsafanelli@woodwellclimate.org), Woodwell Climate Research Center
License: CC BY-SA 4.0
Publication/Modification Date (d/m/y): 28.02.25, version 1.0
- Changelog:
Version 1.0 (28.02.25): Initial release
Details¶
Dataset¶
The SC.50 dataset contains soil measurements and features organized in a dataframe with the following properties:
Target Soil Properties:¶
- SOC - Soil Organic Carbon
Code:
SOC_targetUnit: %
Protocol: Measured through titration after oxidization of the organic carbon (Walkley & Black 1934)
Sampling Date: November 2013
Sampling Depth: 0 – 20 cm
- pH
Code:
pH_targetUnit: Unitless
Protocol: Measured in water suspension with a glass electrode with a 5:1 liquid:soil volumetric ratio
Sampling Date: November 2013
Sampling Depth: 0 – 20 cm
- Clay
Code:
Clay_targetUnit: %
Protocol: Sieve-Pipette method, measured through fractioning the soil into the sand fractions by sieving, and the silt and clay fractions by sedimentation in water, German adaptation (DIN ISO 11277)
Sampling Date: November 2013
Sampling Depth: 0 – 20 cm
Groups of Features:¶
- DEM – Digital Elevation Model and Terrain Parameters
Number of Features: 2
Code(s):
Altitude,SlopeUnit:
Altitudein m,Slopein °Sensing: Digital elevation model raster (30 m) based on synthetic aperture radar from “Copernicus Open Access Hub”
Processing: Calculating
Slopewithterrainfunction of the raster R-package, extracting DEM values from raster at soil sampling locationsSampling Date: October 2011
- ERa – Apparent Electrical Resistivity
Number of Features: 1
Code(s):
ERaUnit: Ω m
Sensing: LandMapper ERM-02 conductivity meter (Landviser, League City, USA) with exploration depth of 0 - 20 cm, in-situ
Processing: None
Sampling Date: November 2014
Examples¶
from LimeSoDa import load_dataset, split_dataset
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error
import numpy as np
# Load and explore the dataset
data = load_dataset("SC.50")
dataset = data["Dataset"]
folds = data["Folds"]
coords = data["Coordinates"]
# Split into train/test using fold 1
X_train, X_test, y_train, y_test = split_dataset(
data=data,
fold=1,
targets=["pH_target", "SOC_target", "Clay_target"]
)
# Fit model and get predictions
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
# Calculate performance metrics
r2 = r2_score(y_test, predictions)
rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"R-squared: {r2:.7f}")
print(f"RMSE: {rmse:.7f}")
References¶
Bottega, E. L. & Safanelli J. L. (2024). Data for “Site-Specific Management Zones Delineation Based on Apparent Soil Electrical Conductivity in Two Contrasting Fields of Southern Brazil”. Zenodo repository. https://doi.org/10.5281/zenodo.13770031
Walkley, A. & Black, I. A. (1934). An examination of the Degtjareff method for determining soil organic matter, and a proposed modification of the chromic acid titration method. Soil science, 37(1), 29-38.