Emerging investigator series: predicted losses of sulfur and selenium in European soils using machine learning: a call for prudent model interrogation and selection

Gerrad D. Jones* (a), Logan Insinga (a), Boris Droz (a, b, c), Aryeh Feinberg (d), Andrea Stenke (e, g), Jo Smith (f), Pete Smith (f) and Lenny H. E. Winkel (e, g)
(a) Department of Biological & Ecological Engineering, Oregon State University, Corvallis, Oregon 97331, USA. E-mail: gerrad.jones@oregonstate.edu
(b) School of Biological, Earth and Environmental Sciences, University College Cork, Cork, Ireland
(c) Water and Environment Research Group, Environmental Research Institute, University College Cork, Lee Road, Cork, Ireland
(d) Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
(e) Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092 Zürich, Switzerland
(f) Institute of Biological and Environmental Sciences, School of Biological Sciences, University of Aberdeen, Aberdeen AB24 3UU, UK
(g) Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland

Received 8th June 2024, Accepted 28th July 2024

First published on 5th August 2024


Abstract

Sulfur (S) deficiencies in crops have been attributed to reductions in S atmospheric deposition in recent decades. Similarly, global soil selenium (Se) concentrations were predicted to drop, particularly in Europe, due to increases in leaching attributed to increases in aridity. Given Europe's international importance in agriculture, reductions of essential elements, including S and Se, in European soils could have important impacts on nutrition and human health. Our objectives were to model current soil S and Se levels in Europe and predict concentration changes for the 21st century. We interrogated four machine-learning (ML) techniques, but after critical evaluation, only outputs for linear support vector regression (Lin-SVR) models for S and Se and the multilayer perceptron (MLP) model for Se were consistent with known mechanisms reported in the literature. Other models exhibited overfitting even when differences in training and testing performance were low or non-existent. Furthermore, our results show that models that perform similarly based on RMSE or R2 can lead to drastically different predictions and conclusions, underscoring the need to interrogate machine learning models and to ensure they are consistent with known mechanisms reported in the literature. Both elements exhibited similar spatial patterns, with predicted gains in Scandinavia and losses in the central and Mediterranean regions of Europe by the end of the 21st century under an extreme climate scenario. The median change was −5.5% for S (Lin-SVR) and −3.5% (MLP) and −4.0% (Lin-SVR) for Se. For both elements, modeled losses were driven by decreases in soil organic carbon and in S and Se atmospheric deposition, whereas gains were driven by increases in evapotranspiration.



Environmental significance

Sulfur (S) and selenium (Se) are essential elements for human health, and dietary intake occurs primarily through cereal grain consumption. Changes in soil organic carbon and S and Se atmospheric deposition are predicted to decrease S and Se concentrations in European soils, which could reduce the nutritional quality of European crops. Given the global importance of European agricultural production, reductions of essential elements in European soils could increase the prevalence of malnutrition worldwide.

Introduction

Global, climate, and environmental change have affected a variety of marine, aquatic, and terrestrial environments, including soils.1,2 For some soil constituents, such as organic carbon, the impacts of environmental change have received considerable attention,3,4 but its effects on the distributions of essential elements, such as sulfur (S) and selenium (Se), have largely been ignored. Historically, changes in soil element concentrations have been hypothesized to occur over thousands to millions of years, driven by long-term processes such as weathering, pedogenesis, S and Se atmospheric deposition, and translocation.5,6 However, environmental variables influencing trace element concentrations are dynamic across time scales ranging from hours to decades. For example, fluctuations in trace element concentrations in wetland soils and streams have been reported over hours to months due to fluctuations in pH, temperature, and redox conditions.7,8 Within the last decade, soil S concentrations have decreased throughout Europe due to reduced atmospheric emissions and subsequent deposition of S.9–12 Finally, in long-term (∼150 years) agricultural experiments, increases in soil Se were measurable over decadal time scales in response to increases in soil organic carbon content following conversion of agricultural soils to grasslands and forests.13–15 These observations suggest that elements and their retention drivers respond rapidly, in some cases, to changes in environmental conditions, which raises the question: how likely are broad-scale soil element distributions, including those in this study (i.e., S and Se), to change in the future?

Soil concentrations of S and Se are controlled by a wide variety of chemical processes, e.g., soil pH and redox conditions (controlling elemental speciation, bioavailability, sorption, and solubility), biological processes (important for the distribution in inorganic and organic soil pools), and physical processes (controlling leaching and soil structure).16–20 Because these processes interact, are dynamic, and span multiple temporal scales, changes in spatial distributions of elements are expected when changing conditions alter the relative importance of element sources and retention mechanisms in soil. Changes in element cycling are expected to be linked to climate and environmental change, such as changes in soil organic C.21 Furthermore, soil pH and redox conditions are affected by several interacting variables including soil moisture, soil carbon, plant and microbial respiration, evapotranspiration (ET), and temperature, all of which are linked to climate and environmental change.22–28 In addition to climate and land use change, reductions in atmospheric emissions (e.g., SO2, NOx, Se) from fossil fuel burning in recent decades in Europe and North America have changed soil chemical properties, including pH and carbon content.29,30 Accordingly, sulfur fertilizer application has substantially increased in some areas (e.g., midwestern US) in response to the reductions in S atmospheric deposition.31 Coupled with land use change, and its effect on soil physicochemical properties (e.g., altered carbon content,32 soil pH33), changes in anthropogenic activities are also expected to influence element cycling in soils. These processes are dynamic and span multiple temporal scales, which suggests that (1) biogeochemical cycling is more dynamic than previously hypothesized, and (2) changes in broad-scale element distributions in soils during this century are possible and should even be expected.

Previously, Jones et al. (2017) made global predictions of soil Se concentrations using an ensemble of machine-learning (ML) regression techniques.14 Mechanistically, losses in soil Se were driven by increases in aridity, which promote oxidizing conditions and thus increased leaching in soils. Although sensitivity analyses were used to ensure that the models were consistent with known mechanisms reported in the literature, some shortcomings of their analyses are apparent. The models utilized low-resolution (2.5° × 2.5°) climate data, which could potentially affect prediction accuracy and miss more local processes.34 In addition, the ensemble model (i.e., the average of the ML predictions) used in the global model exhibited some overfitting based on a moderate difference (ΔR2 = 0.18) between the training and testing performance. Better development of machine learning models is essential to determine whether models are capturing trends consistent with expected mechanisms or simply fitting data.

This study seeks to fill these two research gaps, namely, evaluating the likelihood that soil concentrations of S and Se are affected by climate change and investigating soil element predictions using high-resolution European-specific data. We hypothesize that because S and Se have similar biogeochemical cycles, they will exhibit similar concentration reductions in soils. Furthermore, predicted losses in soil Se on a global scale were particularly pronounced in Europe.14 Given the international importance of Europe's agricultural industry,35,36 reductions in (micro)nutrients in Europe could have a global impact on human health. As a result, our objectives were to (1) model current soil S and Se concentrations in Europe that are consistent with literature regarding known biogeochemical mechanisms, (2) predict changes in soil S and Se concentrations in European soils over the 21st century, and (3) highlight the need to thoroughly scrutinize the results of different ML techniques to ensure the highest quality predictions. Addressing these objectives will help farmers and soil fertility experts to manage soils to meet the food quantity and quality challenges of the future.

Materials and methods

Data processing

Soil element data were obtained from the GEMAS dataset,37 which is composed of soils collected in 2008 from both grazing lands (sample depth: 0–10 cm, n = 2023 samples) and plowed lands (sample depth: 0–20 cm, n = 2108 samples). Sample sites were randomly selected by taking one sample per 50 × 50 km grid cell across 33 European countries. Detailed sampling and analysis methods can be found elsewhere,37 but briefly, all soil samples were air-dried, manually disaggregated, sieved (2 mm mesh) and digested in aqua regia. Samples were then analyzed by inductively coupled plasma-mass spectrometry with detection limits of 4 and 0.04 mg kg−1 for S and Se, respectively.

Predictive variables (n = 46; Table 1 for data descriptions and sources) that characterize soil retention mechanisms including climate, soil properties, anthropogenic impacts, geology and vegetation were obtained from several previous modelling studies. Linear support vector regression (Lin-SVR) analysis was used to pre-screen the importance of each predictive variable. Variables with low importance to the machine learning models elicit a small response in soil element concentrations. No critical threshold was used, but variables with low coefficient weights relative to other variables were removed from analyses. In addition, we included variables that are known to affect the fate and transport of S and Se in soils (e.g., pH, precipitation) regardless of their overall importance ranking. This was advantageous because these variables helped us verify that the ML techniques captured expected mechanistic patterns based on literature knowledge (detailed in ESI). Ultimately, we retained the following set of predictive variables for further modelling (Table 1): clay content, soil pH, soil organic carbon (SOC) content, average daily precipitation (Precip), average daily evapotranspiration (ET), aridity index (AI), average chemical index of alteration (CIA), and total S or Se atmospheric deposition (S and Se Dep). For each element, only the corresponding deposition data were included in each respective model. The temporal availability of all the data varied (Table 1), with some data having no temporal information and other data having up to 30 years of historical data with different temporal resolution. Therefore, we harmonized the data by averaging them over the period of coverage and used a single value to predict soil element concentrations. Recently, changes in soil S and Se concentrations were measurable over a decadal scale,14,31 so averaging decadal climate, carbon, and deposition data is appropriate based on observations. 
More detail on the pre-processing of the soil element data and predictive variables prior to ML analysis is provided in the ESI.
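The coefficient-based pre-screening described above can be illustrated with a minimal sketch. It assumes standardized inputs so that coefficient magnitudes are comparable; the function name `rank_by_linear_svr_weight`, the synthetic data, and the settings (`C`, `epsilon`, `max_iter`) are hypothetical and are not taken from the authors' script.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR


def rank_by_linear_svr_weight(X, y, names, C=1.0, seed=0):
    """Rank predictors by the absolute Lin-SVR coefficient (largest first)."""
    # Standardize predictors and response so the weights share one scale.
    Xz = StandardScaler().fit_transform(X)
    yz = (y - y.mean()) / y.std()
    model = LinearSVR(C=C, epsilon=0.0, max_iter=20000, random_state=seed)
    model.fit(Xz, yz)
    weights = np.abs(model.coef_)
    order = np.argsort(weights)[::-1]
    return [(names[i], float(weights[i])) for i in order]


# Synthetic data for illustration only: one strong driver, one weak driver,
# and one pure-noise column. The labels are placeholders, not real data.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=500)
ranking = rank_by_linear_svr_weight(X, y, ["SOC", "pH", "noise"])
```

In this toy case the noise column receives the smallest weight and would be a candidate for removal, whereas a variable such as pH could still be retained on mechanistic grounds, as the text describes.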

Table 1 Final predictive variables used in the machine-learning analysis
| Variable | Unit | Description | Scenario | Year coverage | Temporal resolution (d) | Min–max values for EU | Res. (km) | Ref. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Clay content (a) | % | Pred. using soil profiles and satellites data (MODIS & SRTM DEM) | | 1990–2016 | No | 5.0–37.0 | 0.25 | 34 |
| Soil pH (a) | Unitless | Pred. using soil profiles and satellites data (MODIS & SRTM DEM) | | 1990–2016 | No | 3.8–7.6 | 0.25 | 34 |
| Soil organic carbon (SOC) (b, c) | t C ha−1 | Mechanism based model pred. | Current | 1990–2000 | Monthly | 0.8–29.5 | 1 | 38 |
| | | | Future (A1FI) | 2080 | Monthly | 0.8–31.1 | 1 | 38 |
| Daily precipitation (Precip) (b) | mm per year | Ensemble of regional climate model | Current | 1971–2000 | Daily | 0.6–13.0 | 12.5 | 39 |
| | | | Future (RCP 8.5) | 2071–2100 | Daily | 0.4–12.3 | 12.5 | 39 |
| Daily evapotranspiration (ET) (b) | mm per year | Ensemble of regional climate model | Current | 1971–2000 | Daily | 0.42–2.26 | 12.5 | 39 |
| | | | Future (RCP 8.5) | 2071–2100 | Daily | 0.52–2.35 | 12.5 | 39 |
| Aridity index (AI) (b) | Unitless | AI = PET/precip. | Current | 1971–2000 | Daily | 0.09–5.46 | 12.5 | 39 |
| | | | Future (RCP 8.5) | 2071–2100 | Daily | 0.11–9.52 | 12.5 | 39 |
| Chemical index of alteration (CIA) (a, b) | % | CIA = (Al2O3/(Al2O3 + CaO + Na2O + K2O)) × 100 | | 2008 | No | 31.5–87.2 | Point | 37 |
| Total S atmospheric deposition (c) | mg per (m2 year) | Aerosol-chemistry-climate model | Current | 2005–2009 | Yearly | (2.4–195.6) × 10−12 | ∼310 | 9 |
| | | | Future (SSP5-8.5) | 2095–2099 | Yearly | (1.9–185.0) × 10−12 | ∼310 | 9 |
| Total Se atmospheric deposition (c) | mg per (m2 year) | Aerosol-chemistry-climate model | Current | 2005–2009 | Yearly | (7.8–565.6) × 10−16 | ∼310 | 9 |
| | | | Future (SSP5-8.5) | 2095–2099 | Yearly | (6.6–526.2) × 10−16 | ∼310 | 9 |

(a) Held constant in future predictions. (b) Variables collected/modelled specifically from Europe. (c) Indicates when original data were in degree and converted to the equivalent in km at the equator. (d) Indicates the finest available resolution from the data source, which was annually averaged for this study. Res. and Ref. stand for resolution and references, respectively. All data were obtained from freely available sources or were obtained by contacting authors of published papers. MODIS: moderate resolution imaging spectroradiometer. SRTM: shuttle radar topography mission. DEM: digital elevation model. PET: potential evapotranspiration.
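The two derived variables in Table 1 (AI and CIA) follow directly from the formulas given in the Description column. The short sketch below evaluates them; the function names and input values are hypothetical, and because the table does not state the basis of the oxide values, the CIA function simply assumes that all four oxides are supplied in one consistent unit.

```python
def aridity_index(pet, precip):
    """Aridity index as defined in Table 1: potential ET divided by precipitation."""
    if precip <= 0:
        raise ValueError("precipitation must be positive")
    return pet / precip


def chemical_index_of_alteration(al2o3, cao, na2o, k2o):
    """CIA (%) = Al2O3 / (Al2O3 + CaO + Na2O + K2O) * 100, as in Table 1."""
    total = al2o3 + cao + na2o + k2o
    if total <= 0:
        raise ValueError("oxide contents must sum to a positive value")
    return 100.0 * al2o3 / total


# Hypothetical inputs chosen only to exercise the formulas.
ai_wet = aridity_index(pet=2.0, precip=4.0)   # precipitation exceeds PET
ai_dry = aridity_index(pet=6.0, precip=2.0)   # PET exceeds precipitation
cia = chemical_index_of_alteration(15.0, 5.0, 3.0, 2.0)
```

With this definition, AI rises as conditions become drier, which is the sense in which "high AI environments" is used later in the text.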


Machine-learning (ML) analysis

Because no single ML technique is best suited for all datasets, four ML algorithms were used to model the spatial distributions of S and Se in European soils. These four ML techniques account for non-linear relationships and potential interactions between predictive variables. The script was programmed in Python using algorithms implemented in scikit-learn (version 1.1.2)40 and made available on Zenodo.41 The four models included Multi-layer Perceptron (MLP) Regressor, Support Vector Regression using the linear (Lin-SVR) and radial basis function (RBF-SVR) kernels, and Random Forest Regressor (RFR). Detailed model setup and analysis are included in the ESI, including hyperparameters and their ranges of tuning values (Fig. S1–S4). To preserve idiosyncrasies in predictions associated with each model technique and to better evaluate overfitting, we evaluated the results of each model individually and did not average results.42

We divided the dataset into training (80%) and testing (20%) subsets to help identify suitable parameters. For each tuning iteration (n = 1000), the training and testing datasets were chosen randomly with replacement. Hyperparameter values of each model were chosen randomly within a set range, and after each iteration, the parameter values were rank sorted based on the root mean squared error (RMSE) of the testing dataset (detailed in ESI). The hyperparameter values associated with the lowest RMSE were selected for the final run. During the final runs (n = 100 iterations), the data were randomly split into training and testing subsets to assess overfitting of the model. In addition, current and future predictions along with model sensitivity analyses were performed for each iteration; thus, all predictions are directly comparable. The results from the final runs from the same ML technique were averaged (detailed in ESI).
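The tuning loop described above (a random 80/20 split, randomly drawn hyperparameters, and ranking by testing RMSE) might look roughly like the following sketch for the Lin-SVR case. The search ranges, the iteration count, the synthetic data, and the reading of the resampling as a fresh random split per iteration are all assumptions; the ranges actually used are given in the ESI.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVR


def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))


def random_search(X, y, n_iter=30, seed=0):
    """Draw random hyperparameters, score each on a fresh 80/20 split,
    and return the trials sorted by testing RMSE (best first)."""
    rng = np.random.default_rng(seed)
    trials = []
    for _ in range(n_iter):
        split_seed = int(rng.integers(1_000_000))
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=split_seed
        )
        # Hypothetical search ranges, not the ranges reported in the ESI.
        params = {
            "C": float(10 ** rng.uniform(-2, 2)),
            "epsilon": float(rng.uniform(0.0, 0.5)),
        }
        model = LinearSVR(max_iter=20000, random_state=0, **params)
        model.fit(X_tr, y_tr)
        score = rmse(y_te, model.predict(X_te))
        trials.append({"params": params, "test_rmse": score})
    return sorted(trials, key=lambda t: t["test_rmse"])


# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.2, size=400)
trials = random_search(X, y)
best = trials[0]
```

Because every trial here uses a different split, differences in testing RMSE mix hyperparameter effects with split-to-split noise, which is one reason a large number of iterations (n = 1000 in the study) is helpful.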

Models were assessed for overfitting, and individual pixels were assessed for both accuracy and precision (detailed in ESI). Briefly, overfitting was assessed by comparing performance (i.e., R2) of the training and testing datasets, scrutinizing sensitivity analyses to ensure that variables within the model behaved as expected, and examining current and future predictions for extreme patterns. Finally, to be as conservative as possible, all predictions were screened for accuracy (i.e., how closely predicted concentrations matched measured concentrations) and precision (i.e., how variable the predicted concentration was). Pixel accuracy (i.e., relative residual = (measured concentration − modeled concentration)/measured concentration) and precision (i.e., prediction stdev/prediction mean) thresholds were set at ±30% and 0.3, respectively. All pixels more extreme than either of these thresholds were not interpreted; however, it is important to note that all points, regardless of their precision or accuracy, were used to model both current and future changes in soil element concentrations to avoid biasing the data. Accuracy and precision filters only influenced the data visualization and summary statistics.
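The two pixel-level screens can be written out explicitly. The sketch below applies the relative-residual and stdev/mean definitions with the ±30% and 0.3 thresholds stated above; the function names and example values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

ACCURACY_LIMIT = 0.30   # relative residual threshold (+/- 30%)
PRECISION_LIMIT = 0.30  # threshold on prediction stdev / prediction mean


def relative_residual(measured, modeled):
    """(measured - modeled) / measured, as defined in the text."""
    return (measured - modeled) / measured


def relative_spread(predictions):
    """Prediction stdev divided by prediction mean (sample stdev assumed)."""
    return stdev(predictions) / mean(predictions)


def passes_filters(measured, predictions):
    """True if a pixel meets both the accuracy and the precision threshold."""
    modeled = mean(predictions)
    accurate = abs(relative_residual(measured, modeled)) <= ACCURACY_LIMIT
    precise = relative_spread(predictions) <= PRECISION_LIMIT
    return accurate and precise


# Hypothetical pixels: measured value plus predictions from repeated runs.
good_pixel = passes_filters(100.0, [90.0, 95.0, 105.0, 110.0])
biased_pixel = passes_filters(100.0, [50.0, 55.0, 60.0])      # fails accuracy
unstable_pixel = passes_filters(100.0, [40.0, 100.0, 160.0])  # fails precision
```

As in the study, such a filter would only decide which pixels are displayed and summarized; it would not remove any points from model fitting.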

Mechanistic evaluation

Each variable within the model has one or more mechanisms that influence element concentrations in the environment. Model sensitivity analyses and partial dependence plots were used to identify the dominant predictive variable influencing spatial patterns of each element (detailed in ESI).43,44 With the sensitivity analysis, we generalized the independent effect of each predictive variable on element concentrations. Based on the direction of the relationship, we inferred the potential mechanisms driving element concentrations based on trends reported within the literature. If a model fits the data well (i.e., high R2, low RMSE) but does not match mechanistic expectations, it suggests that the model overfits the data; however, we explored the possibility that other mechanisms may exist or that our model is unable to capture local scale processes.
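A univariate sensitivity analysis of the kind described above varies one predictor across a range while holding the others fixed. The sketch below illustrates the idea with a hypothetical stand-in model on z-scored inputs; `toy_model` and its coefficients are invented for illustration, and the authors' exact procedure is the one documented in the ESI.

```python
def univariate_sensitivity(predict, baseline, variable, values):
    """Vary one predictor over `values`, holding the others at `baseline`."""
    curve = []
    for value in values:
        row = dict(baseline)   # copy so the baseline is left unchanged
        row[variable] = value
        curve.append((value, predict(row)))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted model (z-scored inputs)."""
    return 1.0 + 0.8 * row["SOC"] - 0.3 * row["AI"] + 0.1 * row["Precip"]


# A z-score of 0 corresponds to the mean of each predictor.
baseline = {"SOC": 0.0, "AI": 0.0, "Precip": 0.0}
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]

soc_curve = univariate_sensitivity(toy_model, baseline, "SOC", grid)
ai_curve = univariate_sensitivity(toy_model, baseline, "AI", grid)
```

The sign of the response along each curve (rising for SOC, falling for AI in this toy case) is what would then be compared against mechanistic expectations from the literature.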

Future predictions

For each ML technique, predicted future soil concentrations for S and Se were made based on projected values of all climate variables (i.e., AI, ET and Precip), SOC, and S and Se atmospheric deposition data. Therefore, the predictions do not account for other factors that may change, e.g., sources such as Se and S fertilisation. Future predictions of clay content, pH, and CIA were unavailable in the literature; therefore, these data were fixed at their current values for the future scenario. Extreme climate change scenarios were used (A1FI for SOC,38 RCP 8.5 for climate,39 and SSP5-8.5 for deposition9). Future data for SOC were unavailable for many countries in Eastern Europe, so these pixels were excluded from the future prediction. As described elsewhere14 and in the previous section, predicted pixels that did not meet the accuracy (±30%) and precision (0.3) thresholds were excluded. The results from the final runs (n = 100 iterations) from the same ML technique were averaged to produce the final future prediction.

Results and discussion

Machine learning (ML) technique validation

Our evaluation of ML model performance involved cross-validation, sensitivity analyses, and examination of geographic patterns in predicted element distributions. The training and testing R2 for all models were >0.5 (Table S1 & Fig. S6); however, a closer inspection revealed instances of poor fitting and overfitting in the RFR model (R2 difference between training and testing ≥0.2). Furthermore, univariate sensitivity analyses displayed unexpected trends in MLP, RBF-SVR, and RFR models for S and RBF-SVR and RFR models for Se, deviating from expected monotonic relationships (Fig. S8). At continental scales, there is no mechanistic reason for highly nonlinear or modal relationships. For models with non-monotonic sensitivity analyses, the future predictions were the most extreme and highly discontinuous with pixels of high loss adjacent to pixels of high gain (Fig. S10). Qualitatively, we expected continuous transitions in element concentrations across the continent, similar to shifts in broad scale changes in Precip, SOC, and deposition. Among the models, only the Lin-SVR models for both elements and the MLP model for Se were considered suitable for interpretation. The discarded models are presented in the ESI for comparison purposes (Table S1 & Fig. S6–S10).
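Two of the screening criteria used in this section, the training/testing R2 gap and the monotonicity of the sensitivity curves, lend themselves to a simple automated check. The sketch below is illustrative only: the function names are hypothetical, the 0.2 gap is the value quoted above, and the example sensitivity curves are invented (only the R2 pairs 0.52/0.55 and 0.64/0.62 come from the reported results).

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def is_monotonic(values):
    """True if the sequence never reverses direction."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


def screen_model(r2_train, r2_test, sensitivity_curves, max_gap=0.2):
    """Return the reasons a model would be flagged (empty list if none)."""
    reasons = []
    if r2_train - r2_test >= max_gap:
        reasons.append("train/test R2 gap")
    for name, curve in sensitivity_curves.items():
        if not is_monotonic(curve):
            reasons.append(f"non-monotonic response to {name}")
    return reasons


# Hypothetical sensitivity curves (model response along a predictor grid).
flag_gap = screen_model(0.90, 0.60, {"SOC": [0.0, 1.0, 2.0]})
flag_shape = screen_model(0.52, 0.55, {"SOC": [0.0, 1.0, 0.5]})
flag_none = screen_model(0.64, 0.62, {"SOC": [0.0, 1.0, 2.0], "AI": [2.0, 1.0, 0.0]})
```

A check like this cannot replace the qualitative inspection of predicted maps described above, but it makes the first two criteria reproducible across models.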

Current predictions

The Lin-SVR and the MLP models visually captured the broad geographic patterns in element distributions with symmetrical residuals around 0 (Fig. 1, S5, & S7). For Lin-SVR predicting current S concentrations, the average training performance for each iteration was R2 = 0.52, RMSE = 1833 mg kg−1, and the average cross-validation performance was R2 = 0.55, RMSE = 1549 mg kg−1 (Table S1). The median absolute residual for S from points that met accuracy and precision thresholds was 26.0 mg kg−1, or 8.9% of the median measured concentration (Fig. S5). The median residual is relatively small given that the resampled S data ranged between 61 and 26 491 mg kg−1. For Lin-SVR predicting current Se concentrations, the average training performance for each iteration was R2 = 0.64 and RMSE = 0.16 mg kg−1, and the average cross-validation performance was R2 = 0.62 and RMSE = 0.16 mg kg−1. For the MLP Se model, the average training performance was R2 = 0.66 and RMSE = 0.14 mg kg−1, and the average cross-validation performance was R2 = 0.64 and RMSE = 0.15 mg kg−1 (Table S1). The median absolute residual for Se from points that met accuracy and precision thresholds was 0.04 mg kg−1 for both the Lin-SVR and MLP models, which was 11.0% of the median measured concentration (Fig. S5). The median residual is relatively small given that the resampled Se data ranged between 0.1 and 2.6 mg kg−1. The model performance herein exceeds that of previous global-scale modeling results (cross-validation R2 = 0.49);14 however, all models underpredicted high concentrations and overpredicted low concentrations (Fig. 1a and 2a). For sulfur, ∼10 pixels located in Spain were underpredicted by 1 to 2 orders of magnitude (Fig. 2a). These points were poorly predicted by all ML techniques. It is likely that local source factors such as mineralogy are important here, for which data were not available at the spatial scale of our study. Visual inspection of the locations of these pixels revealed no unexpected relationship to other potential source or retention variables (e.g., distance to mines or smelters) that could be spatialized and included in this analysis. Overall, no regional patterns in the residuals were detected (Fig. 1b & S7), which suggests that systematic spatial biases in the models were low.
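The residual summary quoted above (a median absolute residual expressed as a percentage of the median measured concentration) is a short calculation. The sketch below shows it with hypothetical concentrations; the function names are illustrative and no values from the study are reproduced.

```python
from statistics import median


def median_abs_residual(measured, modeled):
    """Median of |measured - modeled| across pixels."""
    return median(abs(m - p) for m, p in zip(measured, modeled))


def residual_percent_of_median(measured, modeled):
    """Median absolute residual as a percentage of the median measured value."""
    return 100.0 * median_abs_residual(measured, modeled) / median(measured)


# Hypothetical concentrations in mg/kg, for illustration only.
measured = [200.0, 300.0, 400.0]
modeled = [180.0, 310.0, 430.0]

mar = median_abs_residual(measured, modeled)
mar_percent = residual_percent_of_median(measured, modeled)
```

Using medians keeps the summary robust to the few strongly underpredicted pixels, which would dominate a mean-based statistic such as RMSE.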
Fig. 1 Geographic distribution of observed and modeled soil sulfur (S) and selenium (Se) concentrations (a) and the modeled residuals (b). Abbreviations include linear support vector regression (Lin-SVR) and multilayer perceptron (MLP). Pixels that exceed accuracy and precision thresholds were excluded. Panels in gray relate to Se.

Fig. 2 Measured vs. predicted concentration plots for sulfur (S) and selenium (Se) (a) and their associated machine-learning (ML) sensitivity analyses after z-score transformation (b). All data are based on average values (n = 100) generated during final runs. Abbreviations include linear support vector regression (Lin-SVR), multilayer perceptron (MLP), root mean squared error (RMSE), chemical index of alteration (CIA), soil organic carbon (SOC), aridity index (AI), evapotranspiration (ET), precipitation (Precip), and S or Se deposition (Dep). Panels in gray relate to Se.

Mechanistic evaluation

Sensitivity and partial dependence analyses highlighted the positive relationship between S and Se with changes in SOC (Fig. 2b, S8, S9, S11, & S12). SOC is a proxy for soil organic matter content, which is known to incorporate large fractions of S (for most soils as much as 95% of S can be present in the organic pool)45 and Se.46 Because of the importance of SOC, we screened for interactive effects between SOC and all other variables, but we found only modest evidence of synergistic interactions with SOC (Fig. S9). While SOC was important in this analysis, in previous modeling, changes in soil Se were least sensitive to changes in SOC.14 Various explanations exist for the difference between this study and the global study. We used European-specific data (e.g., climate, SOC) whenever possible in this study, which is expected to lead to differences in predicted outputs. Additionally, in the global study, the model was influenced by processes and mechanisms captured on different continents that are not necessarily relevant in Europe.

Although the models were dominated by the influence of SOC, all variables were critical for evaluating whether the model outputs matched our mechanistic expectations from the literature. AI, i.e., the ratio between potential ET and Precip, exhibited a negative relationship with both elements, reflecting its influence on soil redox conditions.47–49 Because S and Se mobility are strongly affected by their redox speciation,50,51 AI is expected to affect S and Se retention within the soil. In low AI environments where Precip exceeds the ET potential, the soil water content is expected to be relatively high, resulting in low oxygen in the soils, thus facilitating reducing conditions. Consequently, reduced S or Se may form co-precipitates with mineral phases or sorb onto organic matter.46 Conversely, in high AI environments, low Precip coupled with high ET potential results in dry soils, thus facilitating oxidizing conditions.52 Oxidized species of S and Se (i.e., sulfate, selenate & selenite) can be mobilized.16,19 Thus, we expect a negative relationship between both elements and AI, which was observed in the models. In addition to changes in redox conditions, weathering contributes to the formation of clay minerals and metal oxides in soils,53 both of which can increase inorganic S and Se sorption in soils.54,55 While increases in clay content and CIA (a metric of soil weathering) increased soil Se, this relationship was largely neutral for soil S. Finally, atmospheric deposition is a source of S and Se to soils,9 and it exhibited the expected positive effects in sensitivity analyses for both elements. We recognize that other variables could influence redox state and thus element retention in soils. For example, wildfire could have various effects on soil biogeochemical properties, which may increase the direct release or leaching of S and Se.56,57

While ET was positively related to Se in both the Lin-SVR and MLP models, there was virtually no relationship between ET and S (Fig. 2b, S11 & S12). ET can have multiple effects on soil element concentrations. ET can reduce transport by removing water from the soil column.24 Plant transpiration can translocate elements from lower in the soil column and deposit them on the surface.51 Both of these mechanisms are predicted to increase element concentration in surface soils. Conversely, similar to AI, ET can increase soil drying, thus increasing the redox state.24 Oxidized inorganic species of S and Se (i.e., sulfate, selenate and selenite) are most mobile, so assuming conditions are suitable to promote biotic and abiotic oxidation of S and Se,16,19,58 increased ET could promote leaching in soils. Ultimately, the elemental responses to changes in ET by 2100 depend on the relative importance of these competing processes. With the data available, we are unable to determine which, if any, mechanisms may be at play, or if competing mechanisms cancel out the effect of ET on S. Further observational or experimental investigation is needed to better identify the role ET has on S and Se retention in soils.

In previous univariate sensitivity analyses, Precip exhibited a negative relationship with soil Se concentration.14 This negative relationship was attributed to leaching: as Precip increases, the leaching potential in a soil column increases on a global scale. However, a weak positive relationship was observed for both elements in both the Lin-SVR and MLP models (Fig. 2b, S11 & S12), which suggests that Precip is a contributing source of both elements at the continental scale.9 The only direct positive effect that Precip can have on soil element concentrations is wet deposition, but this variable was accounted for in the sensitivity and partial dependence analyses, suggesting that Precip has a positive effect that is independent of wet deposition (Fig. 2b, S11 & S12). Precip is a master variable that can indirectly control soil element retention through its relationships with other variables (e.g., AI, CIA, ET, pH & SOC),52,59 and with the available data, we are unable to identify the most likely mechanism driving this pattern. Upon visual inspection of residuals (Fig. 1b & S7), there are no apparent spatial trends in modeled errors, which suggests that the model fits the data well and no critical information was missing from the model. However, we acknowledge that information could be missing that better explains the points with the largest residuals for both elements.

We expected complex relationships with pH as it relates to organic and inorganic speciation of Se and S, which might vary regionally.46 Oxidized inorganic species of S and Se are oxyanions; therefore, sorption of S and Se on SOC is favored in low pH conditions where more positively charged sorption sites are present on organic matter.54,60,61 However, in our modeled sensitivity analyses and partial dependence plots (Fig. 2b, S11 & S12), element concentrations increased or remained constant with increasing pH. This suggests an important role for soil organic Se compounds, which could be affected by pH.

Future changes on soil element concentrations

Overall, the best performing models (i.e., Lin-SVR models for S and Se, and MLP model for Se) project continuous transitions in soil element concentrations across the continent, influenced by broad-scale gradients of climate, soil organic carbon (SOC), and atmospheric S and Se deposition (Fig. 3). S and Se concentrations are predicted to increase in Scandinavia and decrease in the Mediterranean by 2100 (Fig. 3). Losses in soil S were strongly driven by decreases in SOC and moderately by decreases in deposition, whereas changes in AI, ET, and Precip played a minor role in the changes in soil S concentration (Fig. 4). In contrast, gains in soil Se in Scandinavia were moderately driven by increasing ET. This is notable because broad-scale increases in soil concentrations of either element were rare in the model predictions. Overall, changes in AI and Precip resulted in minimal changes in soil Se by 2100. The precision of the future predictions was lowest in the Balkans in the Lin-SVR model for S, in the Balkans and Germany for the Lin-SVR model for Se, and in the Balkans and Spain for the MLP model for Se (Fig. 3d). The higher uncertainty in the Balkans is likely due to changes in S deposition that overlap with the predicted increases due to changes in SOC (Fig. 4). A similar pattern was also observed for Se in both models (Fig. 3d, 4, & S13). The higher uncertainty in Germany is likely a result of predicted decreases in Se concentrations due to changes in Se deposition and SOC that overlap with the predicted increases due to changes in ET. Finally, in Spain, modeled Se uncertainty may be due to overlapping losses and no change due to deposition and aridity index, respectively (Fig. 4 & S13).
Fig. 3 Percent change in soil sulfur (S) and selenium (Se) concentrations by 2100 assuming an extreme climate change scenario (A1FI/RCP 8.5). Percent changes were calculated using the future and current predictions. Panels show the mean change (a), the 5th percentile (least extreme change; (b)), and the 95th percentile (most extreme change; (c)). The similarity in the 5th and 95th percentiles indicates little variability in the calculated change across all final iterations (n = 100). Precision of the future predictions is illustrated as standard deviations (stdev.) of the predictions (d). Abbreviations include linear support vector regression (Lin-SVR), multilayer perceptron (MLP), and maximum (max). Pixels that exceed accuracy and precision thresholds or with missing data were excluded. The pixel resolution of each map is 111 km. Panels in gray relate to Se.

Fig. 4 Independent contributions of aridity index (AI), precipitation (Precip.), soil organic carbon (SOC), and sulfur (S) or selenium (Se) atmospheric deposition to the percent change in soil S and Se concentrations by 2100. Percent changes were calculated using the future and current predictions. In each panel, predictions were made using future values for the variable of interest while holding all other variables at their current values. Only results from the linear support vector regression (Lin-SVR) are illustrated. Multilayer perceptron (MLP) results for Se are presented in Fig. S13. The pixel resolution of each panel is 111 km. Panels in gray relate to Se.
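The one-variable-at-a-time procedure described in the Fig. 4 caption can be sketched as follows. This is only a minimal illustration and not the code used in this study: `toy_model`, its coefficients, and the pixel values are hypothetical placeholders standing in for a fitted model and the gridded predictor data.

```python
def toy_model(pixel):
    """Hypothetical linear stand-in for a fitted model (coefficients are invented)."""
    return 0.2 + 0.05 * pixel["SOC"] + 0.01 * pixel["deposition"] + 0.002 * pixel["ET"]


def independent_contributions(model, current, future):
    """Percent change when one variable takes its future value and all others stay current."""
    baseline = model(current)
    contributions = {}
    for name in current:
        scenario = dict(current)       # copy so the other variables keep current values
        scenario[name] = future[name]  # swap in the future value of a single variable
        contributions[name] = (model(scenario) - baseline) / baseline * 100
    return contributions


# Made-up predictor values for a single pixel (assumed, for illustration only).
current = {"SOC": 4.0, "deposition": 10.0, "ET": 50.0}
future = {"SOC": 3.0, "deposition": 6.0, "ET": 60.0}

contributions = independent_contributions(toy_model, current, future)
for name, pct in contributions.items():
    print(f"{name}: {pct:+.1f}%")
```

For a purely linear model such as this toy example, the individual contributions sum to the total change; for a non-linear model they generally would not, because interactions between variables are not captured when variables are changed one at a time.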

The percent changes in soil S and Se concentrations for 2100 were calculated based on changes in AI, ET, Precip, S or Se atmospheric deposition, and SOC. The mean and median percent change (i.e., (modeled future concentration − modeled current concentration)/modeled current concentration × 100) in soil S from the Lin-SVR model were −5.1% and −5.5%, respectively (n = 413 pixels), while the mean and median percent change in soil Se concentrations were −3.7% and −4.0%, respectively (n = 370 pixels). Only pixels that had future data and passed all accuracy and precision filters were considered. In the MLP model, the mean and median percent changes in soil Se concentrations were −2.6% and −3.5%, respectively (n = 376 pixels). In pixels where losses were predicted (80% for S, 71% for Se MLP, and 77% for Se Lin-SVR), the mean and median percent change in soil S concentrations were −7.3% and −7.1%, respectively, while those for soil Se were −6.2% and −5.6% for the Lin-SVR model and −5.5% and −5.2% for the MLP model, respectively. Mean and median gains were 3.6% and 3.3% for soil S, respectively, and 3.6% and 3.3% (Lin-SVR) and 4.4% and 4.2% (MLP) for soil Se.
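The percent change formula and the summary statistics reported above can be illustrated with a short sketch. The concentrations below are made-up values for five hypothetical pixels, and the function names are our own; the snippet only shows the arithmetic, not the study's actual pipeline.

```python
from statistics import mean, median


def percent_change(current, future):
    """(future - current) / current * 100, computed pixel by pixel."""
    return [(f - c) / c * 100 for c, f in zip(current, future)]


def summarize(changes):
    """Mean and median change overall, plus the share and mean of losses and gains."""
    losses = [x for x in changes if x < 0]
    gains = [x for x in changes if x > 0]
    return {
        "mean": mean(changes),
        "median": median(changes),
        "loss_fraction_pct": 100 * len(losses) / len(changes),
        "mean_loss": mean(losses) if losses else None,
        "mean_gain": mean(gains) if gains else None,
    }


# Made-up modeled concentrations (mg/kg) for five pixels, for illustration only.
current = [200.0, 250.0, 400.0, 100.0, 500.0]
future = [180.0, 240.0, 380.0, 105.0, 450.0]

changes = percent_change(current, future)
summary = summarize(changes)
print(summary)
```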

The 90% uncertainty interval of the future prediction is relatively narrow. The median calculated changes in future soil S concentrations for the 5th (i.e., 5th lowest predicted increase) and 95th (i.e., 5th highest predicted increase) percentiles were −6.0% and −4.3%, respectively. The magnitude and spatial pattern of both extremes of the prediction interval were similar to those of the mean percent change, highlighting little uncertainty in the future prediction. Similarly, for Se, the median predicted future changes for the 5th and 95th percentiles were −5.1% and −3.2% for the Lin-SVR model and −4.8% and −1.9% for the MLP model.
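As a rough illustration of how such an interval can be derived from repeated model runs, the sketch below takes the 5th lowest and 5th highest of 100 predicted changes per pixel and then reports the median of each bound across pixels. The data are synthetic (drawn from an assumed normal distribution), and `rank_interval` is a hypothetical helper rather than a function from the study's code.

```python
import random
from statistics import median


def rank_interval(values, tail=0.05):
    """Return the k-th lowest and k-th highest values, with k = tail * n (at least 1)."""
    ordered = sorted(values)
    k = max(1, round(tail * len(ordered)))
    return ordered[k - 1], ordered[-k]


random.seed(0)  # reproducible synthetic example
n_pixels, n_iterations = 50, 100

# Synthetic percent changes: one list of 100 iteration results per pixel.
iterations_per_pixel = [
    [random.gauss(-5.0, 0.5) for _ in range(n_iterations)] for _ in range(n_pixels)
]

bounds = [rank_interval(pixel) for pixel in iterations_per_pixel]
median_low = median([low for low, _ in bounds])
median_high = median([high for _, high in bounds])
print(f"median lower bound: {median_low:.2f}%, median upper bound: {median_high:.2f}%")
```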

It is always important to scrutinize predictive models. Although the future predictions are precise, the magnitude of change for 2100 is approximately half the magnitude of the relative residuals, the medians of which were 8.9% and 11.0% for the S and both Se models, respectively. We found no evidence of overfitting in the retained models (Table S1), providing further confidence that future predictions are not the result of overfitting noise. In the retained models, the relationships of the variables with Se and S could partly be explained by known processes (Fig. 2), with residuals that were symmetrically distributed around 0 (Fig. S5) and showed no spatial patterns (Fig. 1b & S7). In addition, whereas the residuals appear randomly distributed in space, which is desirable, the predicted future changes by 2100 are non-random and match changes in broad-scale environmental gradients (Fig. 3, 4, & S13). Together, these factors suggest that the models capture soil S and Se behavior that is consistent with the literature. Therefore, these results represent our best approximation, for an extreme climate scenario, of how soil S and Se concentrations will change in response to atmospheric deposition, AI, ET, Precip and SOC. It is important to emphasize that other variables not included here (e.g., fertilization, management practices, etc.) may change in the future and affect element concentrations in soils, which could be the focus of future work.

Spatiotemporal trend comparisons

Previous global-scale modeling has predicted changes in soil Se concentrations at the end of the 21st century.14 In this global-scale model, 8 variables (AI, pH, Precip, ET, clay content, lithology, and SOC) were used to describe soil Se. In addition, future climate (RCP 6.0) and organic carbon (ECHAM5-A1b) data were used to model a moderate climate change scenario. While both the global and European models were analyzed at a pixel resolution of 111 km, the original resolutions of the climate variables for the global and European-specific models were 280 km and 12 km, respectively. Furthermore, the global-scale model did not consider anthropogenic emission reductions as a driver of soil Se concentrations, which was a dominant driver of element losses in this study. Nevertheless, the trends in the global-scale model and the current European model corroborate each other. When examining overlapping pixels from the global modeling (moderate climate change scenario), the Pearson's correlation was moderately high between the global and the MLP (R2 = 0.63) and Lin-SVR (R2 = 0.57) models. Changes were more extreme for the global study, with a mean soil selenium change of −8.0%. For the same pixels, mean soil selenium changes for the MLP and Lin-SVR models were −2.4% and −3.5%, respectively. This is not necessarily surprising. The data used to generate the predictions are different, but theoretically, the present model uses European-specific data and should better characterize patterns across the continent compared to the global model. This is the first study modeling changes in soil S concentrations over the 21st century, so comparisons to other studies could not be made.
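A comparison of this kind, restricted to pixels present in both models, could be sketched as below. The pixel identifiers and percent changes are invented for illustration, and `pearson_r` is a plain implementation of the standard Pearson correlation coefficient; the sketch prints both r and r², since the two are easily confused when reporting.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up percent changes keyed by hypothetical pixel identifiers.
global_change = {"p1": -12.0, "p2": -8.0, "p3": -3.0, "p4": -9.5, "p6": 1.0}
regional_change = {"p1": -5.0, "p2": -2.5, "p3": -1.0, "p4": -4.0, "p5": 2.0}

shared = sorted(set(global_change) & set(regional_change))  # overlapping pixels only
g = [global_change[p] for p in shared]
r = [regional_change[p] for p in shared]

r_value = pearson_r(g, r)
print(f"n = {len(shared)}, r = {r_value:.2f}, r^2 = {r_value ** 2:.2f}")
print(f"mean change: global {mean(g):.1f}%, regional {mean(r):.1f}%")
```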

Strengths and limitations

The three selected models each possess unique strengths that render them suitable for predicting element concentrations in soils. Linear SVR stands out for its advantage in handling linear relationships between input features and output, simplifying the model and reducing susceptibility to overfitting, thereby enhancing interpretability. In contrast, RBF SVR and MLPs offer increased flexibility, particularly in capturing intricate non-linear relationships. MLPs provide the highest degree of architectural flexibility, allowing for fine-tuning of the number of layers, neurons, and activation functions to better align with the specific problem at hand. The adaptability of these models enables a more nuanced exploration of relationships and interactions between variables, which allows us to evaluate whether the anticipated mechanisms are effectively captured in the sensitivity analysis.

It is crucial to recognize that while each of the four machine learning techniques employed possesses specific strengths, the accuracy of our predictions is not solely dependent on these techniques' capabilities. Rather, the primary advantage of our analysis lies in the comprehensive probing of all models through diverse methodologies. This approach has yielded unique insights into model performance and the reliability of the predictions. For example, potential overfitting in the S RBF and both RFR models was identified by assessing the difference in training and testing performance (Table S1) and identifying unexpected trends in the sensitivity analyses (Fig. 2, S6, S8, S9, S11, & S12). Also, disjointed spatial patterns of predictions helped identify overfitting in the models and regions where competing mechanisms resulted in uncertainty in the SVR and MLP models (Fig. 1, 3, 4, S7, S10, & S13). Each approach we employed provided different insights into model performance and overfitting, and surprisingly, all methods were needed to identify poorly fitting models. Such a diverse set of techniques is rarely employed in the literature. We argue that this multifaceted approach is more robust because it incorporates quantitative data-driven measures along with mechanistic insights from the literature into the expected behavior of the model. Especially when models are used to make policy decisions, identifying the practical limits of a predictive model is necessary. There is a strong need to interrogate the different ML techniques because our predictions highlight that similarly performing models based on RMSE or R2 can lead to drastically different predictions and conclusions (Fig. S6). Therefore, we strongly recommend that the discipline invest more time and energy in going beyond simple model performance measures when using machine learning techniques.
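Two of the checks described above, the gap between training and testing performance and the direction of a one-variable sensitivity sweep, can be sketched in a model-agnostic way. Everything below is illustrative: the observed and predicted values are made up, `toy_model` and its coefficients are hypothetical, and the expectation that the response rises with SOC is used only as an example of a mechanistic check.

```python
from math import sqrt
from statistics import mean


def rmse(observed, predicted):
    """Root mean squared error."""
    return sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mo = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mo) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def performance_gap(train, test):
    """train and test are (observed, predicted) pairs; returns test-minus-train metrics."""
    return {
        "rmse_gap": rmse(*test) - rmse(*train),
        "r2_gap": r_squared(*test) - r_squared(*train),
    }


def sensitivity_sweep(model, baseline, name, values):
    """Predictions as one variable is varied while the others stay at baseline."""
    return [model({**baseline, name: v}) for v in values]


def toy_model(x):
    """Hypothetical fitted model (coefficients are invented)."""
    return 0.1 + 0.05 * x["SOC"] - 0.0001 * x["Precip"]


# Made-up observed and predicted values for a training and a testing set.
train = ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.0, 4.1])
test = ([1.5, 2.5, 3.5], [2.5, 2.0, 2.5])
gap = performance_gap(train, test)
print(gap)  # a large positive RMSE gap is one warning sign of overfitting

baseline = {"SOC": 3.0, "Precip": 700.0}
curve = sensitivity_sweep(toy_model, baseline, "SOC", [1.0, 2.0, 3.0, 4.0, 5.0])
rises_with_soc = all(b >= a for a, b in zip(curve, curve[1:]))
print("response rises with SOC:", rises_with_soc)
```

A small gap alone does not rule out overfitting, which is why the sweep is paired with it here: a model can score similarly on training and testing data and still respond to a predictor in a direction that contradicts known mechanisms.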

Although we thoroughly interrogated the models, various limitations exist in this analysis, such as local over- or under-predictions together with data limitations. The MLP and Lin-SVR models over-predicted low concentrations and under-predicted high concentrations for both elements. We are unsure of the exact reason, but poor model fitting at the lower concentrations for Se could be influenced by the lower detection limit of the instrument (0.04 mg kg−1). This issue can only be resolved by improved soil extraction and instrument sensitivity. Conversely, at high concentrations, various explanations potentially exist for poor model fitting. Because few data exist for extreme observations, there is always a risk of overfitting. For example, it is possible that soil concentrations are highest for a specific suite of conditions, or a temporary condition, and not enough data exist to capture this synergism or dynamism in the models. Furthermore, although this is a limitation of any geospatial analysis, it is worth mentioning that some data are known to be important (e.g., fertilizer is a dominant soil S source)12 but are not available at a continental scale. Similarly, Kirk et al.30 observed an increase of soil pH as high as 1 pH unit over 25 years across England and Wales in response to decreased acid deposition from atmospheric emissions. While this is an important variable, future soil pH predictions do not exist. We considered whether these observations adversely influenced future predictions, but no discernible changes were observed when these extreme points were removed from the dataset, so they were ultimately retained within this analysis.

Finally, as with all geospatial analyses, we are limited by data availability. This analysis represents a broad-scale assessment of environmental variability on soil elements; however, some mechanisms that may dominate at continental scales may not be relevant at local scales.62 Future soil geochemical surveys should be conducted over multiple spatial and temporal scales to determine which mechanisms are important at the appropriate scales. In addition to scale, error estimates were not available for all variables (e.g., SOC); therefore, a propagation of error analysis through our ML models was not possible. Consequently, while our model results largely make mechanistic sense, they represent expected outcomes based on the average value. We are unsure how errors in the predictor variables will translate into deviations in the predicted soil concentrations for both S and Se. For soil managers to best meet the challenges of changing environmental conditions and their effect on soil fertility, propagation of error analyses should be conducted. As additional data with increased temporal and spatial resolution become available, future modeling could reveal different patterns that we are currently unable to capture.

Environmental implications

Within this study, changes in soil S and Se concentration were primarily driven by changes in S and Se atmospheric deposition and SOC. These findings are consistent with current knowledge on anthropogenic atmospheric emissions and subsequent deposition for both elements. In addition, SOC is a major sink of both elements, and changes in both have been linked to dynamic cycling of S and Se.9,63 Therefore, modeled losses of soil concentrations for both elements by the end of the 21st century are in line with our mechanistic expectations. It is widely accepted that increased S atmospheric deposition from fossil fuel burning during the industrial revolution fertilized agricultural soils and that recent reductions of S emissions at the end of the 20th century have resulted in, and could continue to exacerbate, S deficiencies in crops.9–12 Therefore, it is not unexpected that future losses in soil elements will be linked to current emission reductions.

While S and Se sources are important drivers of soil element concentrations, the interactions between sources and sinks are what ultimately determine soil element concentrations.62 Despite dramatic changes in anthropogenic emissions between 1860 and 2000, soil S and Se concentrations remained near constant in agricultural plots where organic carbon content was steady.14,64 Instead, soil S and Se increased in plots only when accompanied by increases in SOC, either through additions of farmyard manure to plots or when agricultural plots were converted into maintained grassland and woodland. Therefore, soil element concentrations, and potentially the nutrient quality of crops, are linked to climate and land use change.3,15,65 Several agricultural practices have been evaluated based on their ability to increase organic carbon pools in the soil,3,66 and while better soil management in agricultural systems may slow or even reverse losses in essential elements,67 the nutritional quality of plants is based on more factors than just element concentration in soils. For instance, Se content in wheat grain collected from an unfertilized plot (Rothamsted, United Kingdom) was inversely related to SO2 emissions and atmospheric deposition.68 This illustrates the uptake competition between selenate and sulfate via a specific sulfate transporter,69 where reductions in S nutritional content in crops may be driven by losses in soil. However, it is important to emphasize that nutrient uptake from the soil is highly complex and is a function of climate and physiological interactions between plants, microbes, and fungi in soil environments. Further research is needed to quantify the mechanisms driving S and Se uptake in plants in response to changes in the magnitude of cycling drivers.

Conclusion

Our machine learning analysis, which we vetted rigorously, revealed that a surprising level of attention was needed to identify overfitting beyond comparing training and testing performance. While most models exhibited satisfactory performance, highly nuanced predictions were contrary to mechanistic expectations. The Lin-SVR and MLP models stand out, aligning well with current data and showcasing interpretable patterns for S and Se, and for Se only, respectively. Under an extreme future climate scenario, we anticipate continental-scale shifts in soil S and Se concentrations by 2100, with Scandinavia gaining and the Mediterranean losing S and Se, supporting the view that soil element changes could occur on the order of years to decades instead of centuries.14,63 Despite the precision of the soil concentration predictions, uncertainties persist, reflecting environmental gradients. Comparison with global models for Se14 emphasizes the importance of localized data for predictive accuracy. This study not only forecasts S and Se futures in European soils for an extreme climate scenario but also calls for continuous model refinement in space and time. In practical terms, agricultural producers face constraints in altering climate conditions. Although management practices can potentially alleviate climate-related declines in S and Se, the intricate nature of nutrient uptake by plants implies diverse strategies contingent upon local climate and soil characteristics. This considerable complexity underscores the necessity for substantial investments in agricultural research to effectively address future nutritional requirements.

Code availability

The open-source, annotated Python code, with a description of its usage, is available on GitHub (https://github.com/EcoChem-OSU/), and the specific version used in this paper is available on Zenodo (https://doi.org/10.5281/zenodo.12695649).41

Data availability

The predictor variables used to model trace element concentrations, and their associated citations, are presented in Table 1. The trace element data used in this analysis were obtained in digital format from the following text: C. Reimann, M. Birke, A. Demetriades, P. Filzmoser and P. O'Connor, Distribution of elements/parameters in agricultural and grazing land soil in Europe. Chemistry of Europe's agricultural soils. Part A: methodology and interpretation of the GEMAS data set, Bundesanstalt für Geowissenschaften und Rohstoffe, Hannover, Germany, 2014.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

We acknowledge that Oregon State University in Corvallis, Oregon, is located within the traditional homelands of the Marys River or the Ampinefu Band of Kalapuya. Following the Willamette Valley Treaty of 1855, Kalapuya people were forcibly removed to reservations in Western Oregon. Today, living descendants of these people are a part of the Confederated Tribes of Grand Ronde Community of Oregon and the Confederated Tribes of the Siletz Indians. We acknowledge Lara Cayo for assistance with an early version of this manuscript, and Sonia Seneviratne and Martin Hirschi for contributing climate data. This work was supported by Oregon State University-Agricultural Research Service (#9048A), Swiss National Science Foundation Grants PP00P2_133619 and PP00P2_163747, and Eawag, the Swiss Federal Institute of Aquatic Science and Technology.

References

  1. L. Gudmundsson, S. I. Seneviratne and X. Zhang, Anthropogenic climate change detected in European renewable freshwater resources, Nat. Clim. Change, 2017, 7, 813–816,  DOI:10.1038/nclimate3416 .
  2. E. A. Davidson and I. A. Janssens, Temperature sensitivity of soil carbon decomposition and feedbacks to climate change, Nature, 2006, 440, 165–173,  DOI:10.1038/nature04514 .
  3. P. Smith, Land use change and soil organic carbon dynamics, Nutr. Cycling Agroecosyst., 2008, 81, 169–178,  DOI:10.1007/s10705-007-9138-y .
  4. S. E. Trumbore, Potential responses of soil organic carbon to global environmental change, Proc. Natl. Acad. Sci. U.S.A., 1997, 94, 8284–8291,  DOI:10.1073/pnas.94.16.8284 .
  5. A. C. Oertel, Pedogenesis of some red-brown earths based on trace-element profiles, J. Soil Sci., 1961, 12, 242–258,  DOI:10.1111/j.1365-2389.1961.tb00914.x .
  6. R. W. Simonson, Outline of a generalized theory of soil genesis, Soil Sci. Soc. Am. J., 1959, 23, 152–156,  DOI:10.2136/sssaj1959.03615995002300020021x .
  7. G. Olivie-Lauquet, G. Gruau, A. Dia, C. Riou, A. Jaffrezic and O. Henin, Release of trace elements in wetlands: Role of seasonal variability, Water Res., 2001, 35, 943–952,  DOI:10.1016/S0043-1354(00)00328-6 .
  8. C. A. Jones, D. A. Nimick and R. B. McCleskey, Relative effect of temperature and pH on diel cycling of dissolved trace elements in Prickly Pear Creek, Montana, Water, Air, Soil Pollut., 2004, 153, 95–113,  DOI:10.1023/B:WATE.0000019934.64939.f0 .
  9. A. Feinberg, A. Stenke, T. Peter, E.-L. S. Hinckley, C. T. Driscoll and L. H. E. Winkel, Reductions in the deposition of sulfur and selenium to agricultural soils pose risk of future nutrient deficiencies, Commun. Earth Environ., 2021, 2, 101,  DOI:10.1038/s43247-021-00172-0 .
  10. Z. Yu, M. She, T. Zheng, D. Diepeveen, S. Islam, Y. Zhao, Y. Zhang, G. Tang, Y. Zhang, J. Zhang, C. L. Blanchard and W. Ma, Impact and mechanism of sulphur-deficiency on modern wheat farming nitrogen-related sustainability and gliadin content, Commun. Biol., 2021, 4, 945,  DOI:10.1038/s42003-021-02458-7 .
  11. V. Vestreng, G. Myhre, H. Fagerli, S. Reis and L. Tarrasón, Twenty-five years of continuous sulphur dioxide emission reduction in Europe, Atmos. Chem. Phys., 2007, 7, 3663–3681,  DOI:10.5194/acp-7-3663-2007 .
  12. E.-L. S. Hinckley, J. T. Crawford, H. Fakhraei and C. T. Driscoll, A shift in sulfur-cycle manipulation from atmospheric emissions to agricultural additions, Nat. Geosci., 2020, 13, 597–604,  DOI:10.1038/s41561-020-0620-3 .
  13. P. Poulton, J. Johnston, A. Macdonald, R. White and D. Powlson, Major limitations to achieving "4 per 1000" increases in soil organic carbon stock in temperate regions: Evidence from long-term experiments at Rothamsted Research, United Kingdom, Global Change Biol., 2018, 24, 2563–2584,  DOI:10.1111/gcb.14066 .
  14. G. D. Jones, B. Droz, P. Greve, P. Gottschalk, D. Poffet, S. P. McGrath, S. I. Seneviratne, P. Smith and L. H. E. Winkel, Selenium deficiency risk predicted to increase under future climate change, Proc. Natl. Acad. Sci. U.S.A., 2017, 114, 2848–2853,  DOI:10.1073/pnas.1611576114 .
  15. A. Don, J. Schumacher and A. Freibauer, Impact of tropical land-use change on soil organic carbon stocks – a meta-analysis, Global Change Biol., 2011, 17, 1658–1670,  DOI:10.1111/j.1365-2486.2010.02336.x .
  16. L. H. E. Winkel, B. Vriens, G. D. Jones, L. S. Schneider, E. Pilon-Smits and G. S. Banuelos, Selenium cycling across soil-plant-atmosphere interfaces: A critical review, Nutrients, 2015, 7, 4199–4239,  DOI:10.3390/nu7064199 .
  17. N. Karimian, S. G. Johnston and E. D. Burton, Iron and sulfur cycling in acid sulfate soil wetlands under dynamic redox conditions: A review, Chemosphere, 2018, 197, 803–816,  DOI:10.1016/j.chemosphere.2018.01.096 .
  18. V. Antoniadis, E. Levizou, S. M. Shaheen, Y. S. Ok, A. Sebastian, C. Baum, M. N. V. Prasad, W. W. Wenzel and J. Rinklebe, Trace elements in the soil-plant interface: Phytoavailability, translocation, and phytoremediation–A review, Earth-Sci. Rev., 2017, 171, 621–645,  DOI:10.1016/j.earscirev.2017.06.005 .
  19. P. J. Edwards, Sulfur Cycling, Retention, and Mobility in Soils: A Review, USDA Forest Service, Northeastern Research Station, Radnor, PA, US, 1998.
  20. J. Eriksen, Soil sulfur cycling in temperate agricultural systems, in Advances in Agronomy, ed. D. L. Sparks, Academic Press, 2009, vol. 102, pp. 55–89,  DOI:10.1016/s0065-2113(09)01002-5 .
  21. R. Lal, Digging deeper: A holistic perspective of factors affecting soil organic carbon sequestration in agroecosystems, Global Change Biol., 2018, 24, 3285–3301,  DOI:10.1111/gcb.14054 .
  22. C.-J. Ji, Y.-H. Yang, W.-X. Han, Y.-F. He, J. Smith and P. Smith, Climatic and edaphic controls on soil pH in alpine grasslands on the tibetan plateau, China: A quantitative analysis, Pedosphere, 2014, 24, 39–44,  DOI:10.1016/S1002-0160(13)60078-8 .
  23. S. Hong, P. Gan and A. Chen, Environmental controls on soil pH in planted forest and its response to nitrogen deposition, Environ. Res., 2019, 172, 159–165,  DOI:10.1016/j.envres.2019.02.020 .
  24. A. Pedescoll, R. Sidrach-Cardona, J. C. Sánchez and E. Bécares, Evapotranspiration affecting redox conditions in horizontal constructed wetlands under Mediterranean climate: Influence of plant species, Ecol. Eng., 2013, 58, 335–343,  DOI:10.1016/j.ecoleng.2013.07.007 .
  25. M. Nikolausz, U. Kappelmeyer, A. Székely, A. Rusznyák, K. Márialigeti and M. Kästner, Diurnal redox fluctuation and microbial activity in the rhizosphere of wetland plants, Eur. J. Soil Biol., 2008, 44, 324–333,  DOI:10.1016/j.ejsobi.2008.01.003 .
  26. A. Niedermeier and J. S. Robinson, Hydrological controls on soil redox dynamics in a peat-based, restored wetland, Geoderma., 2007, 137, 318–326,  DOI:10.1016/j.geoderma.2006.08.027 .
  27. C. Gougoulias, J. M. Clark and L. J. Shaw, The role of soil microbes in the global carbon cycle: Tracking the below-ground microbial processing of plant-derived carbon for manipulating carbon dynamics in agricultural systems, J. Sci. Food Agric., 2014, 94, 2362–2371,  DOI:10.1002/jsfa.6577 .
  28. S. J. Hall, W. H. McDowell and W. L. Silver, When wet gets wetter: Decoupling of moisture, redox biogeochemistry, and greenhouse gas fluxes in a humid tropical forest soil, Ecosystems, 2013, 16, 576–589,  DOI:10.1007/s10021-012-9631-2 .
  29. Å. Rühling and G. Tyler, Changes in atmospheric deposition rates of heavy metals in Sweden: A summary of nationwide swedish surveys in 1968/70 – 1995, Water, Air, Soil Pollut.: Focus, 2001, 1, 311–323,  DOI:10.1023/A:1017584928458 .
  30. G. J. D. Kirk, P. H. Bellamy and R. M. Lark, Changes in soil pH across England and Wales in response to decreased acid deposition, Global Change Biol., 2010, 16, 3111–3119,  DOI:10.1111/j.1365-2486.2009.02135.x .
  31. E.-L. S. Hinckley and C. T. Driscoll, Sulfur fertiliser use in the Midwestern US increases as atmospheric sulfur deposition declines with improved air quality, Commun. Earth Environ., 2022, 3, 324,  DOI:10.1038/s43247-022-00662-9 .
  32. P. Smith, Soil organic carbon dynamics and land-use change, in Land Use and Soil Resources, ed. A. K. Braimoh and P. L. G. Vlek, Springer Dordrecht, Netherlands, 2008, ch. 2, pp. 9–22,  DOI:10.1007/978-1-4020-6778-5_2 .
  33. S. Hong, S. Piao, A. Chen, Y. Liu, L. Liu, S. Peng, J. Sardans, Y. Sun, J. Peñuelas and H. Zeng, Afforestation neutralizes soil pH, Nat. Commun., 2018, 9, 520,  DOI:10.1038/s41467-018-02970-1 .
  34. T. Hengl, J. Mendes de Jesus, G. B. M. Heuvelink, M. Ruiperez Gonzalez, M. Kilibarda, A. Blagotić, W. Shangguan, M. N. Wright, X. Geng, B. Bauer-Marschallinger, M. A. Guevara, R. Vargas, R. A. MacMillan, N. H. Batjes, J. G. B. Leenaars, E. Ribeiro, I. Wheeler, S. Mantel and B. Kempen, SoilGrids250m: Global gridded soil information based on machine learning, PLoS One, 2017, 12, e0169748,  DOI:10.1371/journal.pone.0169748 .
  35. E. Cook, Agriculture, Forestry and Fishery Statistics, Eurostat (European Commission), 2020 edition, Luxembourg, 2021.
  36. Central Intelligence Agency, The World Factbook, https://www.cia.gov/the-world-factbook/.
  37. C. Reimann, M. Birke, A. Demetriades, P. Filzmoser and P. O'Connor, Distribution of elements/parameters in agricultural and grazing land soil in Europe, Chemistry of Europe's Agricultural Soils. Part A: Methodology and Interpretation of the GEMAS Data Set, Bundesanstalt für Geowissenschaften und Rohstoffe, Hannover, Germany, 2014.
  38. J. Smith, P. Smith, M. Wattenbach, S. Zaehle, R. Hiederer, R. J. A. Jones, L. Montanarella, M. D. A. Rounsevell, I. Reginster and F. Ewert, Projected changes in mineral soil carbon of European croplands and grasslands, 1990–2080, Global Change Biol., 2005, 11, 2141–2152,  DOI:10.1111/j.1365-2486.2005.001075.x .
  39. D. Jacob, J. Petersen, B. Eggert, A. Alias, O. B. Christensen, L. M. Bouwer, A. Braun, A. Colette, M. Déqué, G. Georgievski, E. Georgopoulou, A. Gobiet, L. Menut, G. Nikulin, A. Haensler, N. Hempelmann, C. Jones, K. Keuler, S. Kovats, N. Kröner, S. Kotlarski, A. Kriegsmann, E. Martin, E. van Meijgaard, C. Moseley, S. Pfeifer, S. Preuschmann, C. Radermacher, K. Radtke, D. Rechid, M. Rounsevell, P. Samuelsson, S. Somot, J.-F. Soussana, C. Teichmann, R. Valentini, R. Vautard, B. Weber and P. Yiou, EURO-CORDEX: New high-resolution climate change projections for European impact research, Reg. Environ. Change, 2013, 14, 563–578,  DOI:10.1007/s10113-013-0499-2 .
  40. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot and E. Duchesnay, Scikit-learn: Machine learning in Python, J. Mach. Learn. Res., 2011, 12, 2825–2830.
  41. L. Insinga, B. Droz, G. D. Jones, A. Feinberg, A. Stenke, J. Smith, P. Smith and L. H. E. Winkel, Predicted losses of sulfur and selenium in European soils using machine learning: Python script v1.1,  DOI:10.5281/zenodo.12695649.
  42. Y. Zhang, M. Lei, K. Li and T. Ju, Spatial prediction of soil contamination based on machine learning: a review, Front. Environ. Sci. Eng., 2023, 17, 93,  DOI:10.1007/s11783-023-1693-1 .
  43. K. M. Ransom, B. T. Nolan, J. A. Traum, C. C. Faunt, A. M. Bell, J. A. M. Gronberg, D. C. Wheeler, C. Z. Rosecrans, B. Jurgens, G. E. Schwarz, K. Belitz, S. M. Eberts, G. Kourakos and T. Harter, A hybrid machine learning model to predict and visualize nitrate concentration throughout the Central Valley aquifer, California, USA, Sci. Total Environ., 2017, 601, 1160–1172,  DOI:10.1016/j.scitotenv.2017.05.192 .
  44. B. Droz, S. Payraudeau, J. A. Rodríguez Martín, G. Tóth, P. Panagos, L. Montanarella, P. Borrelli and G. Imfeld, Copper content and export in European vineyard soils influenced by climate and soil properties, Environ. Sci. Technol., 2021, 55, 7327–7334,  DOI:10.1021/acs.est.0c02093 .
  45. J. L. Kovar and C. A. Grant, Nutrient Cycling in Soils: Sulfur in Soil Management: Building a Stable Base for Agriculture, ed. J. L. Hatfield and T. J. Sauer, American Society of Agronomy and Soil Science Society of America, 2011, pp. 103–115,  DOI:10.2136/2011.soilmanagement.c7 .
  46. J. Tolu, S. Bouchet, J. Helfenstein, O. Hausheer, S. Chékifi, E. Frossard, F. Tamburini, O. A. Chadwick and L. H. E. Winkel, Understanding soil selenium accumulation and bioavailability through size resolved and elemental characterization of soil extracts, Nat. Commun., 2022, 13, 6974,  DOI:10.1038/s41467-022-34731-6 .
  47. R. E. LaCroix, N. Walpen, M. Sander, M. M. Tfaily, J. L. Blanchard and M. Keiluweit, Long-term warming decreases redox capacity of soil organic matter, Environ. Sci. Technol. Lett., 2020, 8, 92–97,  DOI:10.1021/acs.estlett.0c00748 .
  48. A. W. Western, R. B. Grayson and G. Bloschl, Scaling of soil moisture: A hydrologic perspective, Annu. Rev. Earth Planet. Sci., 2002, 30, 149–180,  DOI:10.1146/annurev.earth.30.091201.140434 .
  49. W. L. Silver, A. E. Lugo and M. Keller, Soil oxygen availability and biogeochemistry along rainfall and topographic gradients in upland wet tropical forest soils, Biogeochemistry, 1999, 44, 301–328,  DOI:10.1007/Bf00996995 .
  50. V. K. Sharma, T. J. McDonald, M. Sohn, G. A. K. Anquandah, M. Pettine and R. Zboril, Biogeochemistry of selenium: A review, Environ. Chem. Lett., 2014, 13, 49–58,  DOI:10.1007/s10311-014-0487-x .
  51. S. P. McGrath, F. Zhao and M. M. Blake-Kalff, Sulphur in soils, processes, behaviour and measurement, in Biogeochemistry of Sulphur in Agricultural Systems, ed. C. J. C. Dawson and M. Fotyma, The International Fertiliser Society, 2003, pp. 28–54.
  52. D. Kerins and L. Li, High dissolved carbon concentration in arid rocky mountain streams, Environ. Sci. Technol., 2023, 57, 4656–4667,  DOI:10.1021/acs.est.2c06675 .
  53. D. D. Eberl, V. C. Farmer, R. M. Barrer, L. Fowden, R. M. Barrer and P. B. Tinker, Clay mineral formation and transformation in rocks and soils, Philos. Trans. R. Soc. London, Ser. A, 1984, 311, 241–257,  DOI:10.1098/rsta.1984.0026 .
  54. J. Tolu, Y. Thiry, M. Bueno, C. Jolivet, M. Potin-Gautier and I. Le Hecho, Distribution and speciation of ambient selenium in contrasted soils, from mineral to organic rich, Sci. Total Environ., 2014, 479–480, 93–101,  DOI:10.1016/j.scitotenv.2014.01.079 .
  55. T. Delfosse, P. Delmelle and B. Delvaux, Sulphate sorption at high equilibrium concentration in Andosols, Geoderma, 2006, 136, 716–722,  DOI:10.1016/j.geoderma.2006.05.009 .
  56. J. T. Abatzoglou and A. P. Williams, Impact of anthropogenic climate change on wildfire across western US forests, Proc. Natl. Acad. Sci. U. S. A., 2016, 113, 11770–11775,  DOI:10.1073/pnas.1607171113 .
  57. M. J. Paul, S. D. LeDuc, M. G. Lassiter, L. C. Moorhead, P. D. Noyes and S. G. Leibowitz, Wildfire induces changes in receiving waters: A review with considerations for water quality management, Water Resour. Res., 2022, 58, 1–28,  DOI:10.1029/2021wr030699 .
  58. I. Ben-Noah and S. P. Friedman, Review and Evaluation of Root Respiration and of Natural and Agricultural Processes of Soil Aeration, Vadose Zone J., 2018, 17, 1–47,  DOI:10.2136/vzj2017.06.0119 .
  59. I. K. Schmidt, A. Tietema, D. Williams, P. Gundersen, C. Beier, B. A. Emmett and M. Estiarte, Soil solution chemistry and element fluxes in three european heathlands and their responses to warming and drought, Ecosystems, 2004, 7, 638–649,  DOI:10.1007/s10021-004-0217-5 .
  60. T. A. Sokolova and S. A. Alekseeva, Adsorption of sulfate ions by soils: A review, Eurasian Soil Sci., 2008, 41, 140–148,  DOI:10.1134/S106422930802004X .
  61. A. Fernandez-Martinez and L. Charlet, Selenium environmental cycling and bioavailability: a structural chemist point of view, Rev. Environ. Sci. Bio/Technol., 2009, 8, 81–110,  DOI:10.1007/s11157-009-9145-3 .
  62. G. D. Jones and L. H. E. Winkel, Multi-scale factors and processes controlling Selenium distributions in soils, in Selenium in Plants: Molecular, Physiological, Ecological and Evolutionary Aspects, ed. E. A. H. Pilon-Smits, L. H. E. Winkel and Z.-Q. Lin, Springer, Cham, 2017, vol. 11, pp. 3–20,  DOI:10.1007/978-3-319-56249-0_1 .
  63. J. Lehmann, D. Solomon, F.-J. Zhao and S. P. McGrath, Atmospheric SO2 emissions since the late 1800s change organic sulfur forms in humic substance extracts of soils, Environ. Sci. Technol., 2008, 42, 3550–3555,  DOI:10.1021/es702315g .
  64. J. S. Knights, F. J. Zhao, B. Spiro and S. P. McGrath, Long-term effects of land use and fertilizer treatments on sulfur cycling, J. Environ. Qual., 2000, 29, 1867–1874,  DOI:10.2134/jeq2000.00472425002900060020x .
  65. E. A. Davidson, S. E. Trumbore and R. Amundson, Soil warming and organic carbon content, Nature, 2000, 408, 789–790,  DOI:10.1038/35048672 .
  66. P. Smith, D. Martino, Z. Cai, D. Gwary, H. Janzen, P. Kumar, B. McCarl, S. Ogle, F. O'Mara, C. Rice, B. Scholes, O. Sirotenko, M. Howden, T. McAllister, G. Pan, V. Romanenkov, U. Schneider and S. Towprayoon, Policy and technological constraints to implementation of greenhouse gas mitigation options in agriculture, Agric., Ecosyst. Environ., 2007, 118, 6–28,  DOI:10.1016/j.agee.2006.06.006 .
  67. R. Garcia Moreno, R. Burdock, M. C. Díaz Álvarez and J. W. Crawford, Managing the selenium content in soils in semiarid environments through the recycling of organic matter, Appl. Environ. Soil Sci., 2013, 2013, 283468,  DOI:10.1155/2013/283468 .
  68. M.-S. Fan, F.-J. Zhao, P. R. Poulton and S. P. McGrath, Historical changes in the concentrations of selenium in soil and wheat grain from the Broadbalk experiment over the last 160 years, Sci. Total Environ., 2008, 389, 532–538,  DOI:10.1016/j.scitotenv.2007.08.024 .
  69. J. L. Hopper and D. R. Parker, Plant availability of selenite and selenate as influenced by the competing ions phosphate and sulfate, Plant Soil, 1999, 210, 199–207,  DOI:10.1023/A:1004639906245 .

Footnote

Electronic supplementary information (ESI) available: Materials and methods including predictive variables and data processing, machine-learning parameter, and model performance calculation. Results and discussions, and references. See DOI: https://doi.org/10.1039/d4em00338a

This journal is © The Royal Society of Chemistry 2024