Prediction of gene essentiality using machine learning and genome-scale metabolic models

IFAC-PapersOnLine 55:23 (2022)

Authors:

Lilli J Freischem, Mauricio Barahona, Diego A Oyarzún

Abstract:

The identification of essential genes, i.e. those that impair cell survival when deleted, requires large growth assays of knock-out strains. The complexity and cost of such experiments have triggered a growing interest in computational methods for prediction of gene essentiality. In the case of metabolic genes, Flux Balance Analysis (FBA) is widely employed to predict essentiality under the assumption that cells maximize their growth rate. However, this approach assumes that knock-out strains optimize the same objectives as the wild-type, which excludes cases in which deletions cause large physiological changes to meet other objectives for survival. Here, we resolve this limitation with a novel machine learning approach that predicts essentiality directly from wild-type flux distributions. We first project the wild-type FBA solution onto a mass flow graph, a digraph with reactions as nodes and edge weights proportional to the mass transfer between reactions, and then train binary classifiers on the connectivity of graph nodes. We demonstrate the efficacy of this approach using the most complete metabolic model of Escherichia coli, achieving near state-of-the-art prediction accuracy for essential genes. Our approach suggests that wild-type FBA solutions contain enough information to predict essentiality, without the need to assume optimality of deletion strains.
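
The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a stoichiometric matrix `S` (metabolites by reactions) and a flux vector `v` in which reversible reactions have already been split so that every flux is nonnegative, and it assumes one common weighting, in which the mass of each metabolite produced by reaction i is shared among consuming reactions j in proportion to their consumption. The function names `mass_flow_graph` and `node_strengths` are hypothetical.

```python
def mass_flow_graph(S, v):
    """Weighted adjacency matrix of a reaction-to-reaction flow graph.

    S: list of rows, one per metabolite, one column per reaction.
    v: list of nonnegative reaction fluxes (assumption: reversible
       reactions were split into forward and backward parts beforehand).
    """
    if any(flux < 0 for flux in v):
        raise ValueError("split reversible reactions so that all fluxes are >= 0")
    n_rxn = len(v)
    M = [[0.0] * n_rxn for _ in range(n_rxn)]
    for row in S:
        # Mass of this metabolite produced / consumed by each reaction.
        produced = [max(s, 0.0) * flux for s, flux in zip(row, v)]
        consumed = [max(-s, 0.0) * flux for s, flux in zip(row, v)]
        total = sum(produced)
        if total == 0:
            continue  # metabolite is not produced, so it carries no flow
        for i, p in enumerate(produced):
            if p == 0:
                continue
            for j, c in enumerate(consumed):
                # Share of reaction i's output that is taken up by reaction j.
                M[i][j] += p * c / total
    return M


def node_strengths(M):
    """In- and out-strength of each reaction node (simple connectivity features)."""
    out_strength = [sum(row) for row in M]
    in_strength = [sum(col) for col in zip(*M)]
    return in_strength, out_strength


# Toy linear pathway: R0 makes A, R1 converts A to B, R2 consumes B.
S_toy = [[1, -1, 0],
         [0, 1, -1]]
v_toy = [2.0, 2.0, 2.0]
M_toy = mass_flow_graph(S_toy, v_toy)
in_toy, out_toy = node_strengths(M_toy)
```

Per-node quantities such as these strengths are one example of the connectivity features that a binary classifier could be trained on; the abstract does not specify which features were actually used.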

Anthropogenic aerosols modulated 20th-century Sahel rainfall variability via their impacts on North Atlantic sea surface temperature

Geophysical Research Letters Wiley 49:1 (2021) e2021GL095629

Authors:

Shipeng Zhang, Philip Stier, Guy Dagan, Minghuai Wang

Abstract:

Sahel rainfall has a close teleconnection with North Atlantic sea surface temperature (NASST) variability, which has separately been shown to be affected by aerosols. Therefore, changes in regional aerosol emissions could potentially drive multidecadal Sahel rainfall variability. Here we combine ensembles of state-of-the-art global climate models (the CESM and CanESM large ensemble simulations and CMIP6 models) with observational data sets to demonstrate that anthropogenic aerosols have significantly impacted 20th-century detrended Sahel rainfall multidecadal variability through modifying NASST. We show that aerosol-induced multidecadal variations of downward solar radiative fluxes over the North Atlantic cause NASST variability during the 20th century, altering the ITCZ position and dynamically linking aerosol effects to Sahel rainfall variability. This process chain is caused by aerosol-induced changes in radiative surface fluxes rather than changes in ocean circulations. CMIP6 models further suggest that aerosol-cloud interactions modulate the inter-model uncertainty of simulated NASST and potentially of Sahel rainfall variability.

Model calibration using ESEm v1.1.0 – an open, scalable Earth System Emulator

Geoscientific Model Development Copernicus Publications (2021)

Authors:

Duncan Watson-Parris, Andrew Williams, Lucia Deaconu, Philip Stier

Model calibration using ESEm v1.1.0 – an open, scalable Earth system emulator

Geoscientific Model Development Copernicus GmbH 14:12 (2021) 7659-7672

Authors:

Duncan Watson-Parris, Andrew Williams, Lucia Deaconu, Philip Stier

Abstract:

Large computer models are ubiquitous in the Earth sciences. These models often have tens or hundreds of tuneable parameters and can take thousands of core hours to run to completion while generating terabytes of output. It is becoming common practice to develop emulators as fast approximations, or surrogates, of these models in order to explore the relationships between their inputs and outputs, understand uncertainties, and generate large ensemble datasets. While the purpose of these surrogates may differ, their development is often very similar. Here we introduce ESEm: an open-source tool providing a general workflow for emulating and validating a wide variety of models and outputs. It includes efficient routines for sampling these emulators for the purpose of uncertainty quantification and model calibration. It is built on well-established, high-performance libraries to ensure robustness, extensibility and scalability. We demonstrate the flexibility of ESEm through three case studies using ESEm to reduce parametric uncertainty in a general circulation model and explore precipitation sensitivity in a cloud-resolving model and scenario uncertainty in the CMIP6 multi-model ensemble.
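
The emulate-then-calibrate workflow described in the abstract can be sketched in a few lines. This is an illustration only and does not use or reproduce ESEm's API: the toy `expensive_model`, the piecewise-linear interpolant standing in for a statistical emulator, and the simple tolerance-based rejection rule are all assumptions made for the example.

```python
import bisect


def expensive_model(theta):
    """Toy stand-in for a costly simulation (hypothetical)."""
    return theta ** 2


def build_emulator(thetas, outputs):
    """Fit a cheap surrogate: a piecewise-linear interpolant through the
    training runs (assumption: a real emulator would be a statistical model)."""
    pairs = sorted(zip(thetas, outputs))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    def emulate(theta):
        if not xs[0] <= theta <= xs[-1]:
            raise ValueError("theta outside the training range")
        # Index of the right-hand training point of the enclosing interval.
        k = min(max(bisect.bisect_right(xs, theta), 1), len(xs) - 1)
        w = (theta - xs[k - 1]) / (xs[k] - xs[k - 1])
        return (1 - w) * ys[k - 1] + w * ys[k]

    return emulate


def calibrate(emulate, candidates, observation, tolerance):
    """Keep candidate parameters whose emulated output lies within
    `tolerance` of the observation (a simple rejection rule)."""
    return [t for t in candidates if abs(emulate(t) - observation) <= tolerance]


# A handful of "expensive" training runs, then many cheap emulator evaluations.
train = [0.0, 0.5, 1.0, 1.5, 2.0]
emulate = build_emulator(train, [expensive_model(t) for t in train])
candidates = [i / 100 for i in range(201)]
plausible = calibrate(emulate, candidates, observation=1.0, tolerance=0.1)
```

The point of the design is that the costly model is run only at the five training points, while the 201 candidate parameters are screened with the surrogate alone.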

Apparent temperature and heat-related illnesses during international athletic championships: A prospective cohort study.

Scandinavian Journal of Medicine & Science in Sports 31:11 (2021) 2092-2102

Authors:

Karsten Hollander, Milan Klöwer, Andy Richardson, Laurent Navarro, Sébastien Racinais, Volker Scheer, Andrew Murray, Pedro Branco, Toomas Timpka, Astrid Junge, Pascal Edouard

Abstract:

International outdoor athletics championships are typically hosted during the summer season, frequently in hot and humid climatic conditions. Therefore, we analyzed the association between apparent temperature and the occurrence of heat-related illnesses during international outdoor athletics championships and compared incidence rates between athletics disciplines. Heat-related illnesses were selected from illness data prospectively collected at seven international outdoor athletics championships between 2009 and 2018 using a standardized methodology. The Universal Thermal Climate Index (UTCI) was calculated as a measure of the apparent temperature based on weather data for each day of the championships. Heat-related illness numbers and (daily) incidence rates were calculated and analyzed in relation to the daily maximum UTCI temperature and between disciplines. During 50 championship days with UTCI temperatures between 15°C and 37°C, 132 heat-related illnesses were recorded. The average incidence rate of heat-related illnesses was 11.7 (95% CI 9.7 to 13.7) per 1000 registered athletes. The expected daily incidence rate of heat-related illnesses increased significantly with UTCI temperature (0.12 more illnesses per 1000 registered athletes/°C; 95% CI 0.08-0.16) and was found to double from 25 to 35°C UTCI. Race walkers (RR = 45.5, 95% CI 21.6-96.0) and marathon runners (RR = 47.7, 95% CI 23.0-98.8) had higher heat-related illness rates than athletes competing in short-duration disciplines. Higher UTCI temperatures were associated with more heat-related illnesses, with marathon and race walking athletes having higher risk than athletes competing in short-duration disciplines. Heat-related illness prevention strategies should predominantly focus on marathon and race walking events of outdoor athletics championships when high temperatures are forecast.