PC.45 Dataset ============= Description ----------- * Target Soil Properties: SOC, pH, Clay * Groups of Features: CSMoist, ERa * Sample size: 45 * Number of Features: 4 * Coordinates: Without coordinates as coordinates could not be found anymore * Location: Pest County, Hungary * Sampling Design: Stratified systematic sampling, where three 70 m wide transects were selected based on contrasting environmental settings and soil types ((1) agricultural land, (2) salt affected grassland, (3) forest) * Study Area Size: 4.5 ha * Geological Setting: Alluvial plain of the Danube (2 transects) and wind-blown dune region, where the calcareous sediments are originating from the Danube * Previous Data Publication: None * Contact Information: * Csilla.Farkas (Csilla.Farkas@nibio.no), Norwegian Institute of Bioeconomy Research (NIBIO) * Tibor Tóth (tibor@rissac.hu), HUN-REN Centre for Agricultural Research * License: CC BY-SA 4.0 * Publication/Modification Date (d/m/y): 28.02.25, version 1.0 * Changelog: * Version 1.0 (28.02.25): Initial release Details ------- Dataset ^^^^^^^ The dataset contains the following target soil properties and features: Target Soil Properties: """""""""""""""""""""" SOC - Soil Organic Carbon * Code: ``SOC_target`` * Unit: % * Protocol: Measured through titration after oxidization of the organic carbon (Tyurin 1935) * Sampling Date: November 2004 * Sampling Depth: 0 – 20 cm pH * Code: ``pH_target`` * Unit: Unitless * Protocol: Measured in water suspension with a glass electrode with a 2.5:1 liquid:soil volumetric ratio (MSz-08-0206/2-1978) * Sampling Date: November 2004 * Sampling Depth: 0 – 20 cm Clay * Code: ``Clay_target`` * Unit: % * Protocol: Sieve-Pipette method, measured through fractioning the soil into the sand fractions by sieving, and the silt and clay fractions by sedimentation in water, Hungarian adaptation (MSZ-08-0205-1978) * Sampling Date: November 2004 * Sampling Depth: 0 – 20 cm Groups of Features: """"""""""""""""" CSMoist – Capacitive Soil Moisture * Number of Features: 1 * Code(s): ``CSMoist`` * Unit: % (volumetric moisture content) * Sensing: Capacitive soil moisture sensor (BR-30, Research Institute of Soil Science and Agricultural Chemistry, Hungary, Budapest) with exploration depth of 10 cm, in-situ * Processing: None * Sampling Date: November 2004 ERa – Apparent Electrical Resistivity * Number of Features: 3 * Code(s): ``ERa_EM``, ``ERa_ERS``, ``ERa_P`` * Unit: Ω m * Sensing: Three different devices * ERa_EM from Electromagnetic induction sensor (EMRC-120, Geoelectro, Nagykovácsi, Hungary) with exploration depth of 100 cm, in-situ * ERa_ERS from four electrode resistivity sensors (Martek SCT, Martek Instruments Inc., USA, Raleigh) with exploration depth of 20 cm, in-situ * ERa_P from Dielectric probe (Percometer, Adek Ltd, Estonia, Tiskre) with exploration depth of 10 to 50 cm, in-situ * Processing: None * Sampling Date: November 2004 Examples -------- .. code-block:: python from LimeSoDa import load_dataset, split_dataset from sklearn.linear_model import LinearRegression from sklearn.metrics import r2_score, mean_squared_error import numpy as np # Load and explore the dataset data = load_dataset("PC.45") dataset = data["Dataset"] folds = data["Folds"] coords = data["Coordinates"] # Will be NA for PC.45 # Split into train/test using fold 1 X_train, X_test, y_train, y_test = split_dataset( data=data, fold=1, targets=["pH_target", "SOC_target", "Clay_target"] ) # Fit model and get predictions model = LinearRegression() model.fit(X_train, y_train) predictions = model.predict(X_test) # Calculate performance metrics r2 = r2_score(y_test, predictions) rmse = np.sqrt(mean_squared_error(y_test, predictions)) print(f"R-squared: {r2:.7f}") print(f"RMSE: {rmse:.7f}") References ---------- Tyurin, I. V. (1935). Comparative study of the methods for the determination of organic carbon in soils and water extracts from soils. Materials on genesis and geography of soils, ML Academy of Sci USSR, 139-158.