Epistemic and aleatoric uncertainty quantification in weather and climate models
Quarterly Journal of the Royal Meteorological Society Wiley (2026) e70219
Spin-up in humidity and temperature and its consequences for convective diagnostics: a Model Uncertainty Model Intercomparison Project experiment
(2026)
Crowdsourcing the Frontier: Advancing Hybrid Physics‐ML Climate Simulation via a $50,000 Kaggle Competition
Journal of Advances in Modeling Earth Systems American Geophysical Union (AGU) 18:5 (2026)
Abstract:
Subgrid machine‐learning (ML) parameterizations have the potential to introduce a new generation of climate models that incorporate the effects of higher‐resolution physics without incurring the prohibitive computational cost associated with more explicit physics‐based simulations. However, important issues, ranging from online instability to inconsistent online performance, have limited their operational use for long‐term climate projections. To more rapidly drive progress in solving these issues, domain scientists and ML researchers opened up the offline aspect of this problem to the broader ML and data science community with the release of ClimSim, a NeurIPS Data sets and Benchmarks publication, and an associated Kaggle competition. This paper reports on the downstream results of the Kaggle competition by coupling emulators inspired by the winning teams' architectures to an interactive climate model (including full cloud microphysics, a regime historically prone to online instability) and systematically evaluating their online performance. Our results demonstrate that online stability in the low‐resolution real‐geography setting is reproducible across multiple diverse architectures, which we consider a key milestone. All tested architectures exhibit strikingly similar offline and online biases, though their responses to architecture‐agnostic design choices (e.g., expanding the list of input variables) can differ significantly. Multiple Kaggle‐inspired architectures achieve state‐of‐the‐art results on certain metrics such as zonal mean bias patterns and global Root Mean Squared Error, indicating that crowdsourcing the essence of the offline problem is one path to improving online performance in hybrid physics‐AI climate simulation.
Plain Language Summary: Future climate models may use machine learning (ML) to replace small‐scale physical processes that are otherwise too costly to simulate directly over long timescales. Such “hybrid” physics–ML models could improve predictions by reducing uncertainties from current approximations. But making them run reliably in full climate simulations has been a major challenge. To speed progress, scientists created an open data set, benchmarking framework, and global competition to drive improvement for these ML components. This paper follows up on that competition by testing ideas from the winning teams within hybrid climate models. For the first time, we show that stable hybrid simulation is now reproducible across a range of diverse ML architectures. We find that different architectures share similar patterns of errors both before and after coupling, although their responses to added training inputs can differ. Finally, some competition‐inspired designs achieve state‐of‐the‐art scores on individual performance measures, but no single approach beats the previous benchmark (Hu et al., 2025, https://doi.org/10.1029/2024ms004618) on every metric.
Key Points:
Online stability in the low‐resolution real‐geography setting is reproducibly achievable across diverse architectures.
Offline and online zonal mean biases are near‐identical across architectures; online runs underestimate tropical precipitable water.
An expanded variable list is universally beneficial offline but has diverging, architecture‐dependent effects online.
Spatial Patterns of Shallow Clouds: Challenging the Concept of Defined Regimes
Geophysical Research Letters Wiley 53:8 (2026) e2025GL119921
Abstract:
Plain Language Summary: The representation of tropical shallow cloud systems is a major source of uncertainty in climate models. Shallow clouds have previously been observed to organize in a variety of patterns. Four distinct classes (fish, flowers, sugar, and gravel) were identified, each with differing spatial scales of cloud organization. Here we analyze high‐resolution geostationary visible and infrared satellite images using a function that can objectively assess the organization of cloudy pixels across all spatial scales. We find that examples of the four “classical” patterns are clearly identifiable using this function, yet they appear not as clearly preferred regimes but as way‐markers in a smoothly evolving sea of cloud patterns. This means that representing these patterns in parameterization schemes might be challenging.
Interpretable feature incorporation machine-learning framework for flood magnitude estimation
Hydrology and Earth System Sciences Copernicus Publications 30:7 (2026) 2135-2160