Earth Observation embeddings at the test: A novel benchmark to evaluate (neural) compression for satellite imagery

(2025)

Authors:

Rikard Vinge, Michael L Marszalek, Jannik Schneider, Conrad M Albrecht

Abstract:

With the rapidly growing production and utilization of Earth Observation (EO) data, the past decade has seen rising interest in the efficient compression of EO data into low-dimensional embeddings. In a parallel development, EO Foundation Models (FMs), trained on large amounts of unlabeled data for use in a wide range of applications, also rely on low-dimensional embeddings to distill representations of EO data [1, 2, 3]. In one respect, EO FMs may serve as (lossy) neural compressors that improve data transfer and lower storage needs, effectively reducing the carbon footprint of EO data [4]. While the development of EO FMs is advancing rapidly, there is a need for a novel benchmark scheme to evaluate the quality of (compressed) embeddings. Claims such as “foundational” or “general purpose representation” need to be tested. As part of the Horizon Europe project “Embed2Scale” [5], co-funded by the European Union (Horizon Europe contract No. 101131841), the Swiss State Secretariat for Education, Research and Innovation (SERI), and UK Research and Innovation (UKRI), we present a novel approach to benchmarking learnt compression of multimodal Copernicus Sentinel data for various relevant application domains. In the form of a competition, contestants provide embeddings that are evaluated on a diverse set of problems based on real-life use cases relevant to the research community, governments, and corporate businesses. The problems are hidden from the contestants in order to evaluate how well the embeddings apply to unknown problems. The benchmark statistically evaluates downstream-task performance by fine-tuning neural networks that fit on commodity hardware. We emphasize a practically relevant scenario in which end users rarely have access to costly and energy-intensive acceleration hardware. The overall performance, i.e. the evaluation across all of the benchmark’s problems, is the decisive measure and ensures a diverse and fair evaluation of the embeddings. After the competition, the datasets in the benchmark are published and made available to the community.

[1] X. Sun et al., “RingMo: A remote sensing foundation model with masked image modeling,” IEEE Transactions on Geoscience and Remote Sensing, 2022.
[2] D. Wang et al., “Advancing plain vision transformer toward remote sensing foundation model,” IEEE Transactions on Geoscience and Remote Sensing, 2022.
[3] C. Bodnar et al., “Aurora: A foundation model of the atmosphere,” Tech. Rep., 2024.
[4] R. Wilkinson, M. M. Mleczko, R. J. W. Brewin, K. J. Gaston, M. Mueller, J. D. Shutler, X. Yan, and K. Anderson, “Environmental impacts of earth observation data in the constellation and cloud computing era,” Science of The Total Environment, vol. 909, 168584, 2024, ISSN 0048-9697, https://doi.org/10.1016/j.scitotenv.2023.168584
[5] https://embed2scale.eu/

Neural Embedding Compression for Earth Observation Data – an Ablation Study

(2025)

Authors:

Amelie Koch, Isabelle Wittmann, Carlos Gomez, Rikard Vinge, Michael Marszalek, Conrad Albrecht, Thomas Brunschwiler

Abstract:

The exponential growth of Earth Observation data presents challenges in storage, transfer, and processing across fields such as climate modeling, disaster response, and agricultural monitoring. Efficient compression algorithms, whether lossless or lossy, are critical to reducing storage demands while preserving data utility for specific applications. Conventional methods, such as JPEG and WebP, rely on hand-crafted basis functions and are widely used. However, Neural Compression, a data-driven approach leveraging deep neural networks, has demonstrated superior performance by generating embeddings suitable for high levels of entropy encoding, enabling more accurate reconstructions at significantly lower bit rates. In our prior work, we developed a Neural Compression pipeline utilizing a masked auto-encoder, embedding quantization, and an entropy encoder tailored for satellite imagery [1]. Instead of reconstructing original images, we evaluated the reconstructed embeddings for downstream tasks such as image classification and semantic segmentation. In this study, we conducted an ablation analysis to quantify the contributions of the individual pipeline components (encoder, quantizer, and entropy encoder) to the overall compression rate. Our findings reveal that satellite images achieve higher compression rates than ImageNet samples due to their lower entropy. Furthermore, we demonstrate the advantages of learned entropy models over hand-crafted alternatives, achieving better compression rates, particularly for datasets with seasonal or geospatial coherence. Based on these insights, we provide a list of recommendations for optimizing Neural Compression pipelines to enhance their performance and efficiency. This work was conducted under the Embed2Scale project, supported by the Swiss State Secretariat for Education, Research and Innovation (SERI contract no. 24.00116) and the European Union (Horizon Europe contract no. 101131841).

[1] C. Gomes and T. Brunschwiler, “Neural Embedding Compression for Efficient Multi-Task Earth Observation Modelling,” IGARSS 2024, Athens, Greece, 2024, pp. 8268-8273, doi: 10.1109/IGARSS53475.2024.10642535.
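
To make the roles of the quantizer and the entropy encoder more concrete, the following is a minimal sketch in Python. It is not the pipeline of the abstract: the embedding is a stand-in of Gaussian random values rather than a masked auto-encoder output, the quantizer is a plain uniform scalar quantizer, and zlib stands in for a generic hand-crafted coder. All names (`quantize`, `empirical_entropy_bits`, `step`) are hypothetical and chosen for illustration only.

```python
import math
import random
import zlib
from collections import Counter


def quantize(values, step):
    """Uniform scalar quantization: map each float to an integer bin index."""
    return [round(v / step) for v in values]


def empirical_entropy_bits(symbols):
    """Shannon entropy (bits per symbol) of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


random.seed(0)
# Stand-in for an encoder output: 4096 Gaussian values (an assumption, not real embeddings).
embedding = [random.gauss(0.0, 1.0) for _ in range(4096)]

raw_bits = 32 * len(embedding)  # float32 storage before any compression

for step in (0.1, 0.5, 1.0):
    symbols = quantize(embedding, step)
    # Ideal entropy-coded size if symbols are modelled as independent draws.
    ideal_bits = empirical_entropy_bits(symbols) * len(symbols)
    # Generic baseline: zlib over one byte per symbol (bin index shifted by 128).
    packed = bytes((s + 128) % 256 for s in symbols)
    zlib_bits = 8 * len(zlib.compress(packed, 9))
    print(f"step={step}: ideal {raw_bits / ideal_bits:.1f}x, zlib {raw_bits / zlib_bits:.1f}x")
```

A coarser `step` lowers the entropy of the quantized symbols and therefore raises the achievable compression ratio, at the cost of larger quantization error. The gap between the ideal figure and the zlib figure hints at why an entropy model matched to the data can do better than a generic coder.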

Reconstructing 3D cloud fields from multispectral satellite images using deep learning

Copernicus Publications (2025)

Authors:

Stella Girtsou, Lilli Freischem, Kyriaki-Margarita Bintsi, Guiseppe Castiglione, Emiliano Diaz Salas-Porras, Michael Eisinger, Emmanuel Johnson, William Jones, Anna Jungbluth, Joppe Massant

Regionally focused aerosol-climate modelling at kilometer scale

Copernicus Publications (2025)

Authors:

Anne Kubin, Bernd Heinold, Philipp Weiss, Philip Stier, Ina Tegen

Satellite observations reveal higher persistence of fog in polluted conditions in the Po valley, Italy

Copernicus Publications (2025)

Authors:

Eva Pauli, Jan Cermak, Jörg Bendix, Philip Stier