Scalable Sensitivity and Uncertainty Analyses for Causal-Effect Estimates of Continuous-Valued Interventions

Advances in Neural Information Processing Systems 35 (2022)

Authors:

A Jesson, A Douglas, P Manshausen, M Solal, N Meinshausen, P Stier, Y Gal, U Shalit

Abstract:

Estimating the effects of continuous-valued interventions from observational data is a critically important task for climate science, healthcare, and economics. Recent work focuses on designing neural network architectures and regularization functions to allow for scalable estimation of average and individual-level dose-response curves from high-dimensional, large-sample data. Such methodologies assume ignorability (observation of all confounding variables) and positivity (observation of all treatment levels for every covariate value describing a set of units), assumptions problematic in the continuous treatment regime. Scalable sensitivity and uncertainty analyses to understand the ignorance induced in causal estimates when these assumptions are relaxed are less studied. Here, we develop a continuous treatment-effect marginal sensitivity model (CMSM) and derive bounds that agree with the observed data and a researcher-defined level of hidden confounding. We introduce a scalable algorithm and uncertainty-aware deep models to derive and estimate these bounds for high-dimensional, large-sample observational data. We work in concert with climate scientists interested in the climatological impacts of human emissions on cloud properties using satellite observations from the past 15 years. This problem is known to be complicated by many unobserved confounders.
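
As a rough illustration of what a marginal sensitivity model for continuous treatments can look like, the constraint below bounds how far the treatment density given the covariates alone may differ from the density that also conditions on the potential outcome. The symbols (t for the treatment level, x for the covariates, y_t for the potential outcome, and Λ for the researcher-defined level of hidden confounding) are assumptions made for this sketch, and the exact definition in the paper may differ.

```latex
\frac{1}{\Lambda} \;\le\; \frac{p(t \mid x)}{p(t \mid y_t, x)} \;\le\; \Lambda,
\qquad \Lambda \ge 1
```

Under this reading, Λ = 1 corresponds to no hidden confounding (ignorability holds), and larger values of Λ admit progressively wider bounds on the dose-response curve.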

Prediction of gene essentiality using machine learning and genome-scale metabolic models

IFAC-PapersOnLine 55:23 (2022)

Authors:

Lilli J Freischem, Mauricio Barahona, Diego A Oyarzún

Abstract:

The identification of essential genes, i.e. those that impair cell survival when deleted, requires large growth assays of knock-out strains. The complexity and cost of such experiments have triggered a growing interest in computational methods for prediction of gene essentiality. In the case of metabolic genes, Flux Balance Analysis (FBA) is widely employed to predict essentiality under the assumption that cells maximize their growth rate. However, this approach assumes that knock-out strains optimize the same objectives as the wild-type, which excludes cases in which deletions cause large physiological changes to meet other objectives for survival. Here, we resolve this limitation with a novel machine learning approach that predicts essentiality directly from wild-type flux distributions. We first project the wild-type FBA solution onto a mass flow graph, a digraph with reactions as nodes and edge weights proportional to the mass transfer between reactions, and then train binary classifiers on the connectivity of graph nodes. We demonstrate the efficacy of this approach using the most complete metabolic model of Escherichia coli, achieving near state-of-the-art prediction accuracy for essential genes. Our approach suggests that wild-type FBA solutions contain enough information to predict essentiality, without the need to assume optimality of deletion strains.
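
The projection of a flux solution onto a mass flow graph can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes all reactions are irreversible with non-negative fluxes, and the function name `mass_flow_graph`, the toy stoichiometric matrix `S`, and the flux vector `v` are hypothetical. The edge weight from reaction i to reaction j sums, over metabolites, the mass produced by i that is consumed by j.

```python
def mass_flow_graph(S, v):
    """Build a weighted reaction-to-reaction adjacency matrix.

    S: stoichiometric matrix as a list of rows (metabolites x reactions).
    v: non-negative flux per reaction (assumed irreversible reactions).
    """
    n_rxn = len(v)
    M = [[0.0] * n_rxn for _ in range(n_rxn)]
    for row in S:  # one metabolite at a time
        # Mass of this metabolite produced / consumed by each reaction.
        produced = [max(row[i], 0.0) * v[i] for i in range(n_rxn)]
        consumed = [max(-row[j], 0.0) * v[j] for j in range(n_rxn)]
        total = sum(produced)
        if total == 0:
            continue  # metabolite is not produced under this flux state
        for i in range(n_rxn):
            for j in range(n_rxn):
                # Share of reaction i's output that flows into reaction j.
                M[i][j] += produced[i] * consumed[j] / total
    return M


# Toy linear pathway: R0 makes A, R1 converts A to B, R2 consumes B.
S = [
    [1, -1, 0],   # metabolite A
    [0, 1, -1],   # metabolite B
]
v = [2.0, 2.0, 2.0]
M = mass_flow_graph(S, v)

# Simple connectivity features per reaction node, usable as classifier inputs.
out_strength = [sum(row) for row in M]
in_strength = [sum(M[i][j] for i in range(len(v))) for j in range(len(v))]
```

Features such as `in_strength` and `out_strength` are one plausible example of node connectivity that a binary classifier could be trained on; the features used in the paper may be different.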

Anthropogenic aerosols modulated 20th-century Sahel rainfall variability via their impacts on North Atlantic sea surface temperature

Geophysical Research Letters Wiley 49:1 (2021) e2021GL095629

Authors:

Shipeng Zhang, Philip Stier, Guy Dagan, Minghuai Wang

Abstract:

Sahel rainfall has a close teleconnection with North Atlantic sea surface temperature (NASST) variability, which has separately been shown to be affected by aerosols. Therefore, changes in regional aerosol emissions could potentially drive multidecadal Sahel rainfall variability. Here we combine ensembles of state-of-the-art global climate models (the CESM and CanESM large ensemble simulations and CMIP6 models) with observational data sets to demonstrate that anthropogenic aerosols have significantly impacted 20th-century detrended Sahel rainfall multidecadal variability through modifying NASST. We show that aerosol-induced multidecadal variations of downward solar radiative fluxes over the North Atlantic cause NASST variability during the 20th century, altering the ITCZ position and dynamically linking aerosol effects to Sahel rainfall variability. This process chain is caused by aerosol-induced changes in radiative surface fluxes rather than changes in ocean circulations. CMIP6 models further suggest that aerosol-cloud interactions modulate the inter-model uncertainty of simulated NASST and potentially Sahel rainfall variability.

Model calibration using ESEm v1.1.0 – an open, scalable Earth System Emulator

Geoscientific Model Development Copernicus Publications (2021)

Authors:

Duncan Watson-Parris, Andrew Williams, Lucia Deaconu, Philip Stier

Model calibration using ESEm v1.1.0 – an open, scalable Earth system emulator

Geoscientific Model Development Copernicus GmbH 14:12 (2021) 7659-7672

Authors:

Duncan Watson-Parris, Andrew Williams, Lucia Deaconu, Philip Stier

Abstract:

Large computer models are ubiquitous in the Earth sciences. These models often have tens or hundreds of tuneable parameters and can take thousands of core hours to run to completion while generating terabytes of output. It is becoming common practice to develop emulators as fast approximations, or surrogates, of these models in order to explore the relationships between their inputs and outputs, understand uncertainties, and generate large ensemble datasets. While the purpose of these surrogates may differ, their development is often very similar. Here we introduce ESEm: an open-source tool providing a general workflow for emulating and validating a wide variety of models and outputs. It includes efficient routines for sampling these emulators for the purpose of uncertainty quantification and model calibration. It is built on well-established, high-performance libraries to ensure robustness, extensibility and scalability. We demonstrate the flexibility of ESEm through three case studies using ESEm to reduce parametric uncertainty in a general circulation model and explore precipitation sensitivity in a cloud-resolving model and scenario uncertainty in the CMIP6 multi-model ensemble.
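
The general emulate-then-calibrate workflow described above can be sketched in a few lines. This is a toy illustration and does not use the ESEm API: `expensive_model`, `fit_emulator`, and `calibrate` are hypothetical names, a piecewise-linear interpolator stands in for the statistical emulator, and a simple tolerance check stands in for the sampling routines.

```python
import bisect


def expensive_model(theta):
    # Hypothetical stand-in for a costly simulation with one parameter.
    return theta ** 2


def fit_emulator(train_x, train_y):
    """Return a cheap surrogate: linear interpolation between training runs."""
    pts = sorted(zip(train_x, train_y))
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]

    def emulate(x):
        # Clamp outside the sampled parameter range.
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect.bisect_right(xs, x)
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return emulate


def calibrate(emulate, observed, tolerance, candidates):
    """Keep parameter values whose emulated output is close to the observation."""
    return [t for t in candidates if abs(emulate(t) - observed) <= tolerance]


# 1. Run the expensive model at a handful of parameter values.
train_x = [0.0, 0.5, 1.0, 1.5, 2.0]
train_y = [expensive_model(x) for x in train_x]

# 2. Fit the surrogate to those runs.
emulate = fit_emulator(train_x, train_y)

# 3. Sample the surrogate densely and reject implausible parameters.
candidates = [i / 100 for i in range(201)]
plausible = calibrate(emulate, observed=1.0, tolerance=0.1, candidates=candidates)
```

The point of the design is that step 3 evaluates the surrogate hundreds of times while the expensive model is only run five times; a real workflow would also validate the surrogate against held-out runs and account for its own uncertainty.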