(Exhaustive) symbolic regression and model selection by minimum description length.

Philosophical transactions. Series A, Mathematical, physical, and engineering sciences 384:2317 (2026) 20240584

Abstract:

Symbolic regression (SR) is the machine learning (ML) method for learning functions from data. After a brief overview of the SR landscape, I will describe the two main challenges that traditional algorithms face: they have an unknown (and probably significant) probability of failing to find any given good function, and they suffer from ambiguity and poorly justified assumptions in their function-selection procedure. To address these, I propose an exhaustive search and model selection by the minimum description length (MDL) principle, which allows accuracy and complexity to be directly traded off by measuring each in units of information. I showcase the resulting publicly available Exhaustive Symbolic Regression (ESR) algorithm on three open problems in astrophysics: the expansion history of the universe, the effective behaviour of gravity in galaxies and the potential of the inflaton field. In each case, the algorithm identifies many functions superior to the literature standards. This general-purpose methodology should find widespread utility in science and beyond. This article is part of the discussion meeting issue 'Symbolic regression in the physical sciences'.

Symbolic regression and differentiable fits in beyond the standard model physics.

Philosophical transactions. Series A, Mathematical, physical, and engineering sciences 384:2317 (2026) 20240593

Authors:

Shehu AbdusSalam, Steven Abel, Deaglan Bartlett, Miguel Crispim Romao

Abstract:

We demonstrate the efficacy of symbolic regression (SR) to probe models of particle physics Beyond the Standard Model (BSM), by considering the so-called Constrained Minimal Supersymmetric Standard Model (CMSSM). Like many incarnations of BSM physics this model has a number (four) of arbitrary parameters, which determine the experimental signals, and cosmological observables such as the dark matter relic density. We show that analysis of the phenomenology can be greatly accelerated by using symbolic expressions derived for the observables in terms of the input parameters. Here we focus on the Higgs mass, the cold dark matter relic density and the contribution to the anomalous magnetic moment of the muon. We find that SR can produce remarkably accurate expressions. Using them we make global fits to derive the posterior probability densities of the CMSSM input parameters which are in good agreement with those performed using conventional methods. Moreover, we demonstrate a major advantage of SR, which is the ability to make fits using differentiable methods rather than sampling methods. We also compare the method with neural network (NN) regression. SR produces more globally robust results, while NNs require data that is focused on the promising regions in order to be equally performant. This article is part of the discussion meeting issue 'Symbolic regression in the physical sciences'.

Probing baryonic feedback with fast radio bursts: joint analyses with cosmic shear and galaxy clustering

Monthly Notices of the Royal Astronomical Society Oxford University Press 547:4 (2026) stag557

Authors:

Amy Wayland, David Alonso, Robert Reischke

Abstract:

Cosmological inference from weak lensing (WL) surveys is increasingly limited by uncertainties in baryonic physics, which suppress the non-linear matter power spectrum on small scales. Multiprobe analyses that incorporate complementary tracers of the gas distribution around haloes offer a pathway to calibrate these effects and recover unbiased cosmological information. In this work, we forecast the constraining power of a joint analysis combining fiducial data from a Stage-IV WL survey with measurements of the dispersion measure from fast radio bursts (FRBs). We evaluate the ability of this approach to simultaneously constrain cosmological parameters and the astrophysical processes governing baryonic feedback, and we quantify the impact of key FRB systematics, including redshift uncertainties and source clustering. We find that, even after accounting for these effects, a 3×2-point analysis of WL and FRBs significantly improves cosmological constraints, reducing the degradation factor on by compared to WL alone. We further show that FRBs alone are sensitive only to a degenerate combination of the key baryonic parameters, and , and that the inclusion of WL measurements breaks this degeneracy. Finally, we extend our framework to incorporate galaxy clustering measurements using luminous red galaxy and emission line galaxy samples, performing a unified 6×2-point analysis of WL, dispersion measures of FRBs, and galaxy clustering. While this combined approach tightens constraints on and , it does not lead to a significant improvement in constraints beyond those obtained from WL and FRBs alone.

Reconstructing spatially varying multiplicative bias for Stage IV weak lensing galaxy surveys with a quadratic estimator

Monthly Notices of the Royal Astronomical Society Oxford University Press 547:4 (2026) stag537

Authors:

Konstantinos Tanidis, David Alonso, Lance Miller, Joachim Harnois-Déraps

Abstract:

We present a quadratic estimator that detects and reconstructs spatially varying multiplicative (m-) bias in weak lensing shear measurements, by exploiting the mode coupling that it generates. The method combines E and B modes with inverse-variance weights, to yield an unbiased reconstruction of to first order. We study the ability of future Stage IV surveys to obtain an unbiased reconstruction of the m-bias in differing scenarios, considering differing bias morphologies, and characteristic scales, as well as differing metrics to quantify the signal-to-noise ratio of the reconstructed map. We consider an m pattern repeating on sky patches, as might be the case for an m field caused by focal-plane systematics. With a Euclid-like redshift distribution, we find that root mean square (rms) variations in m-bias may be detected at the 20 level, after stacking between and patches (rising to between and for 1 per cent rms variations, data volumes that are becoming available with upcoming surveys), depending on the morphology of the m pattern. We show that these results are robust against the cosmological model assumed in the reconstruction, as well as the presence of intrinsic alignments or baryonic effects, and that the method shows no spurious response to additive (c-) bias. These results demonstrate that percent-level, spatially varying m-bias can be detected at high significance, enabling diagnosis and mitigation in the Stage IV weak lensing era.

MIGHTEE-H I: Mass Models and Dark Matter properties

Monthly Notices of the Royal Astronomical Society (2026) stag531

Authors:

Anastasia A Ponomareva, PE Mancera Piña, AA Vărăşteanu, M Glowacki, H Desmond, MJ Jarvis, T Yasin, I Heywood, N Maddox, EAK Adams, M Baes, A Gebek, S Kurapati, M Maksymowicz-Maciata, KA Oman, H Pan, I Prandoni, SHA Rajohnson, I Ruffa, K Spekkens

Abstract:

Measuring galaxy rotation curves is critical for inferring the properties of dark-matter haloes in the Lambda Cold Dark Matter (ΛCDM) paradigm. We present H I rotation curves and mass models for 20 galaxies from the MIGHTEE survey. Using extended H I kinematics, we construct resolved mass models that include stellar, gaseous, and dark-matter components. Stellar masses are derived using 3.6 μm imaging under fixed mass-to-light ratio (ϒ⋆ = M/L) assumptions and are complemented, for the first time for a H I-selected sample, by spatially resolved M/L obtained from multi-wavelength SED fitting. We examine the ratio of baryonic to observed rotation velocity (Vbar/Vobs) at the characteristic radius R2.2. Adopting a fixed ϒ⋆ = 0.5 M⊙/L⊙ yields a clear dependence of V2.2/Vobs on galaxy luminosity, while adopting ϒ⋆ = 0.2 M⊙/L⊙ substantially weakens this trend. In contrast, the resolved M/L analysis preserves the luminosity dependence while modifying the stellar contribution on a galaxy-by-galaxy basis, providing a more accurate representation of the underlying relation. We model the dark-matter haloes using Navarro–Frenk–White profiles and find that the different assumptions for a fixed M/L systematically shift galaxies relative to the theoretical stellar-to-halo mass and baryonic-to-halo mass relations, while the spatially varying M/L yields the closest agreement with theoretical benchmarks within ΛCDM. We therefore demonstrate that future investigations of the dark matter properties of galaxies using rotation curves need to account for varying M/L across individual galaxy profiles and between galaxies in order to obtain accurate measurements of the dark matter, and therefore test ΛCDM.