SC.93 Dataset¶
Description¶
Target Soil Properties: SOC, pH, Clay
Groups of Features: vis-NIR
Sample size: 93
Number of Features: 2,146
Coordinates: With coordinates (EPSG: 32722)
Location: Santa Catarina, Brazil
Sampling Design: Conditioned latin hypercube sampling based on terrain parameters
Study Area Size: 108 ha
Geological Setting: Heavily weathered soils originating from volcanic rock of the Serra Geral Formation (basalt and dacite)
Previous Data Publication: None
- Contact Information:
Taciara Zborowski Horst (taciaraz@utfpr.edu.br), Federal University of Technology – Paraná
Ricardo Simão Diniz Dalmolin (dalmolin@ufsm.br), Federal University of Santa Maria
License: CC BY-SA 4.0
Publication/Modification Date (d/m/y): 28.02.25, version 1.0
- Changelog:
Version 1.0 (28.02.25): Initial release
Details¶
Dataset¶
The SC.93 dataset contains soil measurements and features organized in a dataframe with the following properties:
Target Soil Properties:¶
- SOC - Soil Organic Carbon
Code:
SOC_targetUnit: %
Protocol: Measured through light absorption after oxidization of the organic carbon in suspension (Tedesco et al. 1995)
Sampling Date: December 2016
Sampling Depth: 0 - 20 cm
- pH
Code:
pH_targetUnit: Unitless
Protocol: Measured in water suspension with a glass electrode ratio with a 1:1 liquid:soil volumetric ratio
Sampling Date: December 2016
Sampling Depth: 0 - 20 cm
- Clay
Code:
Clay_targetUnit: %
Protocol: Sieve-Pipette method, measured through fractioning the soil into the sand fractions by sieving, and the silt and clay fractions by sedimentation in water (Gee and Bauder 1986)
Sampling Date: December 2016
Sampling Depth: 0 - 20 cm
Groups of Features:¶
- vis-NIR – Visible and Near Infrared Spectroscopy
Number of Features: 2,146
Code(s):
wl_355,wl_356,wl_357…wl_2500Unit: % (Reflectance)
Sensing: ASD FieldSpec 4 (Analytical Spectral Devices Inc., Boulder, USA), on dried and sieved samples (<2 mm) in the laboratory, spectral range was 355 - 2,500 nm at 3 - 8 nm intervals
Processing: Resampling to 1 nm intervals
Sampling Date: March 2017
- Spectral Information (After Data Processing):
Data Representation: Wavelength (in nm)
Spectral Resolution: 1 nm
Spectral Range: 355 – 2500 nm
Examples¶
from LimeSoDa import load_dataset, split_dataset
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error
import numpy as np
# Load and explore the dataset
data = load_dataset("SC.93")
dataset = data["Dataset"]
folds = data["Folds"]
coords = data["Coordinates"]
# Split into train/test using fold 1
X_train, X_test, y_train, y_test = split_dataset(
data=data,
fold=1,
targets=["pH_target", "SOC_target", "Clay_target"]
)
# Fit model and get predictions
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
# Calculate performance metrics
r2 = r2_score(y_test, predictions)
rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"R-squared: {r2:.7f}")
print(f"RMSE: {rmse:.7f}")
References¶
Gee, G.W. & Bauder, J.W. (1986) Particle-Size Analysis. In: Klute, A., Ed., Methods of Soil Analysis, Part 1. Physical and Mineralogical Methods, Agronomy Monograph No. 9, 2nd Edition, American Society of Agronomy/Soil Science Society of America, Madison, WI, 383-411.
Tedesco, M.J., Gianello, C., Bissani, C., Bohnen, H. & Volkweiss, S.J. (1995) Análise de solo, plantas e outros materiais. [Analysis of soil, plants and other materials.] 2nd Edition, Departamento de Solos da Universidade Federal do Rio Grande do Sul, Porto Alegre, 174.