Improving snow water equivalent modelling: a comparative study of hybrid machine learning techniques

Publication date

2026-03-03

Authors

Pomarol Moya, Oriol
Nussbaum, Madlene (ORCID 0000-0002-6808-8956)
Mehrkanoon, Siamak (ORCID 0000-0002-0516-0391; ISNI 0000000512552651)
Kraaijenbrink, P.D.A. (ISNI 0000000468813577)
Gouttevin, Isabelle
Karssenberg, D.J. (ORCID 0000-0002-6475-363X; ISNI 0000000114829248)
Immerzeel, Walter (ORCID 0000-0002-2010-9543; ISNI 0000000108662891)

Document Type

Article
Open Access

License

CC BY

Abstract

Accurate characterization of snow water equivalent (SWE) is important for water resource management in large parts of the Northern Hemisphere, but its large spatio-temporal variability and limited observational data make it difficult to quantify. Complex physically-based models have been developed that allow long-term SWE simulation, but they still suffer from biases in their predictions, have long run times and make it challenging to integrate observational data. Recent attempts at using machine learning (ML) to improve SWE predictions from meteorological data have shown promising results, but the data scarcity issue and concerns about the ability to extrapolate in time and space remain. In this study, we evaluated two hybrid setups that integrate physically-based simulations and ML. The first setup, referred to as post-processing, follows a common approach in which the simulated outputs from a numerical snow model, Crocus, are used as predictors for the ML component in addition to the meteorological data. The second setup, named data-augmentation, involves an ML model trained not only on measured SWE but also on Crocus-simulated SWE at additional locations. These approaches were deployed using in-situ meteorological and SWE measurements available at ten stations throughout the Northern Hemisphere, and compared to Crocus and to an ML setup using measured data only. The post-processing setup outperformed all other approaches when predicting on left-out years at the training stations, but performed poorly compared to Crocus when extrapolating to other locations. Adding a large set of Crocus-simulated variables besides SWE to this setup resulted in similar performance for left-out years but exacerbated the spatial extrapolation issue.
On the other hand, the data-augmentation setup performed slightly worse on the left-out years, but showed much better transferability to new locations, improving greatly on the other ML-based setups and reducing the RMSE relative to Crocus by more than 10 %. The feature importances of the ML models were consistent with physical knowledge, despite unusual deviations at extreme values; these deviations showed some improvement in the data-augmentation setup. Lastly, adding lagged variables improved the results, but the lags were relevant only for a few variables and for up to a week. These results demonstrate the usefulness of hybrid models, and particularly of the data-augmentation setup, for SWE prediction even in data-scarce domains, suggesting their potential to improve forecasts of SWE at large spatio-temporal scales, where they remain to be tested.
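
The difference between the two hybrid setups lies mainly in how the training table is assembled, which a short Python sketch can make concrete. This is an illustration only, not the authors' implementation: the `Sample` record, its field names, the station labels, the example values and the three helper functions are all hypothetical, daily time steps are assumed, and no particular ML model is implied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    """One record at one location (all field names are illustrative)."""
    station: str
    meteo: List[float]             # meteorological predictors
    crocus_swe: float              # SWE simulated by the physically-based model
    measured_swe: Optional[float]  # None where no in-situ SWE measurement exists


Row = Tuple[List[float], float]    # (predictor vector, training target)


def post_processing_rows(samples: List[Sample]) -> List[Row]:
    """Post-processing: simulated SWE joins the predictors; the target is measured SWE."""
    return [(s.meteo + [s.crocus_swe], s.measured_swe)
            for s in samples if s.measured_swe is not None]


def data_augmentation_rows(samples: List[Sample]) -> List[Row]:
    """Data augmentation: meteorological predictors only; the target is measured SWE
    where available and simulated SWE at the additional locations."""
    rows = []
    for s in samples:
        target = s.measured_swe if s.measured_swe is not None else s.crocus_swe
        rows.append((list(s.meteo), target))
    return rows


def lagged_features(series: List[float], lags: List[int]) -> List[List[float]]:
    """For each time step with enough history, return the current value of one
    predictor followed by its values `lag` steps earlier."""
    start = max(lags)
    return [[series[t]] + [series[t - lag] for lag in lags]
            for t in range(start, len(series))]


# Made-up values purely for illustration (not data from the study).
samples = [
    Sample("station_a", [-2.0, 5.0], 120.0, 110.0),
    Sample("station_a", [-1.0, 0.0], 125.0, 118.0),
    Sample("extra_site", [-4.0, 3.0], 90.0, None),   # simulation-only location
]

post = post_processing_rows(samples)   # measured locations only, one extra predictor
aug = data_augmentation_rows(samples)  # all locations, simulated SWE used as a target
```

In this sketch the post-processing table needs a Crocus simulation wherever a prediction is made, whereas the data-augmentation table uses the simulation only as an additional training target, so a model trained on it takes meteorological inputs alone.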

Keywords

Water Science and Technology, Earth-Surface Processes

Citation

Pomarol Moya, O, Nussbaum, M, Mehrkanoon, S, Kraaijenbrink, P D A, Gouttevin, I, Karssenberg, D & Immerzeel, W W 2026, 'Improving snow water equivalent modelling: a comparative study of hybrid machine learning techniques', Cryosphere, vol. 20, no. 2, pp. 1427-1444. https://doi.org/10.5194/tc-20-1427-2026