SL.125 Dataset ============= Description ----------- * Target Soil Properties: SOM, pH, Clay * Groups of Features: ERa, vis-NIR * Sample size: 125 * Number of Features: 2,082 * Coordinates: Without coordinates because of privacy concerns instead with dummy coordinates (EPSG: 4326) * Location: Skåne Län, Sweden * Sampling Design: Three sampling designs over multiple adjacent fields: (1) regular grid sampling, targeted sampling through surface tortoise sampling (Persson et al. 2023) based on (2) ERa and (3) reflectance from remote sensing * Study Area Size: 78 ha * Geological Setting: High spatial variability of sandy till, clay till with chalk and glacial clay * Previous Data Publication: None * Contact Information: * Johanna Wetterlind (Johanna.Wetterlind@slu.se), Swedish University of Agricultural Sciences * Bo Stenberg (Bo.Stenberg@slu.se), Swedish University of Agricultural Sciences * License: CC BY-SA 4.0 * Publication/Modification Date (d/m/y): 28.02.25, version 1.0 * Changelog: * Version 1.0 (28.02.25): Initial release Details ------- Dataset ^^^^^^^ The SL.125 dataset contains soil measurements and features organized in a dataframe with the following properties: Target Soil Properties: """""""""""""""""""""" SOM - Soil Organic Matter * Code: ``SOM_target`` * Unit: % * Protocol: Measured through the weight difference before and after ignition of the soil with additional correction for structural water from clay by using the formula: SOM = LI − 0.46 − 0.047 × clay content (%) (Ekström 1927) * Sampling Date: September 2006 * Sampling Depth: 0 - 20 cm pH * Code: ``pH_target`` * Unit: Unitless * Protocol: Measured in a water suspension with a glass electrode with a 5:1 liquid:soil volumetric ratio * Sampling Date: September 2006 * Sampling Depth: 0 - 20 cm Clay * Code: ``Clay_target`` * Unit: % * Protocol: Sieve-Pipette method, measured through fractioning the soil into the sand fractions by sieving, and the silt and clay fractions by sedimentation in water (Gee and Bauder 1986) * Sampling Date: September 2006 * Sampling Depth: 0 - 20 cm Groups of Features: """"""""""""""""" ERa – Apparent Electrical Resistivity * Number of Features: 1 * Code(s): ``ERa`` * Unit: Ω m * Sensing: EM38 sensor (Geonics Ltd., Mississauga, Canada) with exploration depth of 0 - 150 cm, in-situ * Processing: Ordinary Kriging to align sensing- with soil sampling locations * Sampling Date: April 2006 vis-NIR – Visible and Near Infrared Spectroscopy * Number of Features: 2,081 * Code(s): ``wl_420``, ``wl_421``, ``wl_422`` ... ``wl_2500`` * Unit: % (Reflectance) * Sensing: FieldSpec Pro FR scanning instrument (Analytical Spectral Devices Inc., Boulder, USA), on dried and sieved samples (<2 mm) in the laboratory, spectral range was 350 – 2,500 nm at 1.4 – 2.0 nm intervals * Processing: Discarding noisy edges of the spectrum (350 - 420 nm), resampling to 1 nm intervals * Sampling Date: September 2006 * Spectral Information (After Data Processing): * Data Representation: Wavelength (in nm) * Spectral Resolution: 1 nm * Spectral Range: 420 - 2,500 nm Examples -------- .. code-block:: python from LimeSoDa import load_dataset, split_dataset from sklearn.linear_model import LinearRegression from sklearn.metrics import r2_score, mean_squared_error import numpy as np # Load and explore the dataset data = load_dataset("SL.125") dataset = data["Dataset"] folds = data["Folds"] coords = data["Coordinates"] # Note: Contains dummy coordinates # Split into train/test using fold 1 X_train, X_test, y_train, y_test = split_dataset( data=data, fold=1, targets=["pH_target", "SOM_target", "Clay_target"] ) # Fit model and get predictions model = LinearRegression() model.fit(X_train, y_train) predictions = model.predict(X_test) # Calculate performance metrics r2 = r2_score(y_test, predictions) rmse = np.sqrt(mean_squared_error(y_test, predictions)) print(f"R-squared: {r2:.7f}") print(f"RMSE: {rmse:.7f}") References ---------- Ekström, G. (1927). Klassifikation av Svenska Åkerjordar (Classification of Swedish arable soils). Sveriges Geologiska Undersökning, Ser C. 345, 161 pp. Gee, G.W. & Bauder, J.W. (1986) Particle-Size Analysis. In: Klute, A., Ed., Methods of Soil Analysis, Part 1. Physical and Mineralogical Methods, Agronomy Monograph No. 9, 2nd Edition, American Society of Agronomy/Soil Science Society of America, Madison, WI, 383-411. Persson, K., Söderström, M. & Mutua, J. (2023). SurfaceTortoise: Find Optimal Sampling Locations Based on Spatial Covariate(s). R package version 2.0.1.