SM.40 Dataset

Description

  • Target Soil Properties: SOC, pH, Clay

  • Groups of Features: DEM, ERa

  • Sample size: 40

  • Number of Features: 3

  • Coordinates: With coordinates (EPSG: 32633)

  • Location: South Moravia, Czechia

  • Sampling Design: Stratified subsample of a previous regular grid sampling; the strata were handpicked to cover contrasting areas

  • Study Area Size: 53 ha

  • Geological Setting: Weichselian sandy loess

  • Previous Data Publication: None

  • Contact Information:

  • License: CC BY-SA 4.0

  • Publication/Modification Date (d/m/y): 28.02.25, version 1.0

  • Changelog:
    • Version 1.0 (28.02.25): Initial release
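
The coordinates are given in EPSG: 32633 (WGS 84 / UTM zone 33N), whereas web maps usually expect latitude and longitude. The following is a minimal sketch of the inverse UTM conversion using the standard series expansion. The function name `utm33n_to_wgs84` and the sample easting/northing are hypothetical and not taken from the dataset; in practice a projection library such as pyproj would normally be used instead.

```python
import math

def utm33n_to_wgs84(easting, northing):
    """Convert UTM zone 33N (EPSG:32633) metres to (lat, lon) in degrees.

    Hypothetical helper for illustration; series expansion on the WGS 84 ellipsoid.
    """
    a = 6378137.0                 # WGS 84 semi-major axis (m)
    f = 1 / 298.257223563         # WGS 84 flattening
    k0 = 0.9996                   # UTM scale factor on the central meridian
    lon0 = math.radians(15.0)     # central meridian of zone 33
    e2 = f * (2 - f)              # first eccentricity squared
    ep2 = e2 / (1 - e2)           # second eccentricity squared

    x = easting - 500000.0        # remove false easting
    m = northing / k0             # meridian arc length (northern hemisphere)

    # Footpoint latitude from the rectifying latitude mu
    mu = m / (a * (1 - e2 / 4 - 3 * e2**2 / 64 - 5 * e2**3 / 256))
    e1 = (1 - math.sqrt(1 - e2)) / (1 + math.sqrt(1 - e2))
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))

    sin1, cos1, tan1 = math.sin(phi1), math.cos(phi1), math.tan(phi1)
    n1 = a / math.sqrt(1 - e2 * sin1**2)             # prime vertical radius
    r1 = a * (1 - e2) / (1 - e2 * sin1**2) ** 1.5    # meridional radius
    t1 = tan1**2
    c1 = ep2 * cos1**2
    d = x / (n1 * k0)

    lat = phi1 - (n1 * tan1 / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * ep2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * ep2 - 3 * c1**2) * d**6 / 720)
    lon = lon0 + (
        d
        - (1 + 2 * t1 + c1) * d**3 / 6
        + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * ep2 + 24 * t1**2) * d**5 / 120) / cos1
    return math.degrees(lat), math.degrees(lon)

# Hypothetical point, NOT a real sampling location from SM.40
lat, lon = utm33n_to_wgs84(600000.0, 5427455.0)
print(f"lat={lat:.4f}, lon={lon:.4f}")
```

A point with an easting of exactly 500000 m lies on the central meridian (15° E), which gives a quick sanity check for any implementation.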

Details

Dataset

The dataset contains the following target soil properties and features:

Target Soil Properties:

SOC - Soil Organic Carbon
  • Code: SOC_target

  • Unit: %

  • Protocol: Measured by titration after oxidation of the organic carbon, following a slightly adjusted version of the Walkley & Black (1934) method (Zbíral et al. 2004)

  • Sampling Date: May 2004

  • Sampling Depth: 0 - 30 cm

pH
  • Code: pH_target

  • Unit: Unitless

  • Protocol: Measured with a glass electrode in a KCl suspension; the liquid:soil ratio is unspecified (Zbíral 2002)

  • Sampling Date: May 2004

  • Sampling Depth: 0 - 30 cm

Clay
  • Code: Clay_target

  • Unit: %

  • Protocol: Hydrometer method; the sand fractions were separated by sieving, and the silt and clay fractions were determined by measuring suspension density with a hydrometer, following a slightly adjusted version of the Bouyoucos (1927) method

  • Sampling Date: April 2006

  • Sampling Depth: 0 - 30 cm

Groups of Features:

DEM – Digital Elevation Model and Terrain Parameters
  • Number of Features: 2

  • Code(s): Altitude, Slope

  • Unit: Altitude in m, Slope in °

  • Sensing: Digital elevation model raster (~2 m resolution) based on LiDAR from the “Geoportal of the Czech Office for Surveying, Mapping and Cadastre”

  • Processing: Slope calculated with the terrain function of the raster R package; DEM values extracted from the raster at the soil sampling locations

  • Sampling Date: Unknown
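
To illustrate how a slope in degrees can be derived from a DEM raster, here is a minimal sketch of the 8-neighbour (Horn) finite-difference method on a single 3x3 window. This is the algorithm the terrain function of the raster package applies when 8 neighbours are used; whether the dataset authors kept those settings is not stated, so treat it as an assumption. The function name `horn_slope_deg` and the elevation values are hypothetical.

```python
import math

def horn_slope_deg(window, cell_size):
    """Slope in degrees at the centre of a 3x3 elevation window (Horn's method).

    window: three rows of three elevations (m), top row first.
    cell_size: raster resolution (m), assumed equal in x and y.
    """
    (a, b, c), (d, _centre, f), (g, h, i) = window
    # Weighted differences between the east/west and south/north columns/rows
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

# Hypothetical window on a 2 m grid: terrain rises 2 m per cell towards the east
window = [
    [100.0, 102.0, 104.0],
    [100.0, 102.0, 104.0],
    [100.0, 102.0, 104.0],
]
print(f"Slope: {horn_slope_deg(window, cell_size=2.0):.1f} degrees")
```

A rise of 2 m over a 2 m cell corresponds to a gradient of 1, so the window above yields a slope of about 45°; a completely flat window yields 0°.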

ERa – Apparent Electrical Resistivity
  • Number of Features: 1

  • Code(s): ERa

  • Unit: Ω m

  • Sensing: EM38 sensor (Geonics Ltd., Mississauga, Canada) towed by a vehicle, with an exploration depth of 0 - 75 cm, in situ

  • Processing: Ordinary Kriging to interpolate the sensor readings to the soil sampling locations

  • Sampling Date: May 2004
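
The kriging step above predicts a sensor value at each soil sampling location from nearby sensor readings. The following is a minimal ordinary kriging sketch for a single target point. The exponential variogram and its parameters (nugget, sill, range) are illustrative assumptions, as the variogram model actually used is not stated, and all names and values are hypothetical.

```python
import math

def exp_variogram(h, nugget=0.0, sill=1.0, rng=50.0):
    """Exponential semivariogram; the parameter values are assumptions."""
    if h == 0.0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ordinary_kriging(points, values, target, variogram=exp_variogram):
    """Return (prediction, weights) at `target` from observed points and values."""
    n = len(points)
    # Kriging system: semivariances between observations plus the unbiasedness constraint
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values)), weights

# Hypothetical sensor readings (x, y in m; values in arbitrary units)
sensor_xy = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
sensor_values = [20.0, 30.0, 25.0, 35.0]
prediction, weights = ordinary_kriging(sensor_xy, sensor_values, (5.0, 5.0))
print(f"Prediction: {prediction:.2f}, weights sum to {sum(weights):.3f}")
```

The weights always sum to 1, and with a zero nugget the prediction at an observed location reproduces the observed value, which makes ordinary kriging an exact interpolator in that case.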

Examples

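# Assumption: load_dataset, split_dataset, calculate_performance and
# plot_soil_map are already imported from the package that ships this
# dataset (the import path is not stated here).
from sklearn.linear_model import LinearRegression  # placeholder model, an assumption

# Load and explore the dataset
data = load_dataset("SM.40")
dataset = data["Dataset"]
folds = data["Folds"]
coords = data["Coordinates"]

# Split into train/test using fold 1
X_train, X_test, y_train, y_test = split_dataset(
    data=data,
    fold=1,
    targets=["pH_target", "SOC_target", "Clay_target"]
)

# Fit a model; any estimator with fit/predict can be substituted
model = LinearRegression()
model.fit(X_train, y_train)

# Calculate model performance
predictions = model.predict(X_test)
metrics = calculate_performance(y_test, predictions)
print(f"R2: {metrics['r2']:.3f}, RMSE: {metrics['rmse']:.3f}")

# Visualize soil properties
soil_map = plot_soil_map(data, "pH_target", zoom_start=14)
soil_map.save("SM40_pH_map.html")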
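
For reference, the two metrics printed in the example can be computed by hand as follows. This is a sketch using the common definitions of RMSE and the coefficient of determination; whether calculate_performance uses exactly these formulas is an assumption, and the values below are made up for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted values, not taken from SM.40
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R2: {r2(observed, predicted):.3f}, RMSE: {rmse(observed, predicted):.3f}")
```

With only a handful of test samples per fold, R2 can vary strongly between folds, so averaging the metrics over all folds usually gives a more stable picture than a single split.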

References

Bouyoucos, G. J. (1927). The hydrometer as a new method for the mechanical analysis of soils. Soil Science, 23(5), 343-354.

Walkley, A. & Black, I. A. (1934). An examination of the Degtjareff method for determining soil organic matter, and a proposed modification of the chromic acid titration method. Soil Science, 37(1), 29-38.

Zbíral, J. (2002). Analýza půd I: jednotné pracovní postupy [Soil Analysis I: Unified working procedures]. Brno: UKZUZ, 197.

Zbíral, J., Honsa, I., Malý, S. & Čižmář, D. (2004). Analýza půd III: jednotné pracovní postupy [Soil Analysis III: Unified working procedures]. Brno: UKZUZ, 199.