Archive Layout with Content

A variety of common markup elements showing how the theme styles them.

Header one

Header two

Header three

Header four

Header five

Header six

Blockquotes

Single line blockquote:

Quotes are cool.

Tables

Entry    | Item |
John Doe | 2016 | Description of the item in the list
Jane Doe | 2019 | Description of the item in the list
Doe Doe  | 2022 | Description of the item in the list

Header1 | Header2 | Header3
cell1   | cell2   | cell3
cell4   | cell5   | cell6
cell1   | cell2   | cell3
cell4   | cell5   | cell6
Foot1   | Foot2   | Foot3

Definition Lists

Definition List Title
Definition list division.
Startup
A startup company or startup is a company or temporary organization designed to search for a repeatable and scalable business model.
#dowork
Coined by Rob Dyrdek and his personal bodyguard Christopher “Big Black” Boykins, “Do Work” works as a self-motivator and a way to motivate your friends.
Do It Live
I’ll let Bill O’Reilly explain this one.

Unordered Lists (Nested)

Ordered List (Nested)

  1. List item one
    1. List item one
      1. List item one
      2. List item two
      3. List item three
      4. List item four
    2. List item two
    3. List item three
    4. List item four
  2. List item two
  3. List item three
  4. List item four

Buttons

Make any link stand out more by applying the .btn class.

Notices

Watch out! You can also add notices by appending {: .notice} to a paragraph.

HTML Tags

Address Tag

1 Infinite Loop
Cupertino, CA 95014
United States

This is an example of a link.

Abbreviation Tag

The abbreviation CSS stands for “Cascading Style Sheets”.

Cite Tag

“Code is poetry.” —Automattic

Code Tag

You will learn later on in these tests that word-wrap: break-word; will be your best friend.

Strike Tag

This tag will let you strikeout text.

Emphasize Tag

The emphasis tag should italicize text.

Insert Tag

This tag should denote inserted text.

Keyboard Tag

This scarcely known tag emulates keyboard text, which is usually styled like the <code> tag.

Preformatted Tag

This tag styles large blocks of code.

.post-title {
  margin: 0 0 5px;
  font-weight: bold;
  font-size: 38px;
  line-height: 1.2;
  and here's a line of some really, really, really, really long text, just to see how the PRE tag handles it and to find out how it overflows;
}

Quote Tag

Developers, developers, developers… –Steve Ballmer

Strong Tag

This tag shows bold text.

Subscript Tag

Getting our science styling on with H2O, which should push the “2” down.

Superscript Tag

Still sticking with science and Albert Einstein’s E = MC2, which should lift the 2 up.

Variable Tag

This allows you to denote variables.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap between research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this work, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches and achieve significant reductions in attenuation bias, thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.
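
For readers who want the intuition, the sketch below illustrates the general idea behind the two corrections in plain Python. It is an assumed illustration, not the authors' implementation: the calibration split, the sigma2 noise-variance parameter, and the KDE-based score estimate are stand-ins, and tweedie_correction implements the textbook empirical-Bayes formula for a Gaussian noise model rather than the paper's exact construction.

import numpy as np
from scipy.stats import gaussian_kde

def linear_calibration(y_cal, yhat_cal, yhat_new):
    # Fit y = a + b * yhat on a small held-out calibration set, then rescale
    # the remaining predictions; a slope b > 1 undoes attenuation toward the mean.
    b, a = np.polyfit(yhat_cal, y_cal, deg=1)  # slope, intercept
    return a + b * np.asarray(yhat_new)

def tweedie_correction(yhat, sigma2):
    # Textbook Tweedie formula for y = mu + N(0, sigma2) noise:
    # E[mu | y] = y + sigma2 * d/dy log f(y), where the marginal density f of
    # the predictions is estimated by a Gaussian KDE and differentiated numerically.
    yhat = np.asarray(yhat, dtype=float)
    kde = gaussian_kde(yhat)
    eps = 1e-3 * yhat.std()
    score = (np.log(kde(yhat + eps)) - np.log(kde(yhat - eps))) / (2 * eps)
    return yhat + sigma2 * score

# Toy usage: noisy, shrunken predictions of a latent wealth index.
rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=2000)            # true outcomes
yhat = 0.6 * mu + rng.normal(0.0, 0.3, 2000)    # attenuated predictions
calibrated = linear_calibration(mu[:500], yhat[:500], yhat[500:])
debiased = tweedie_correction(yhat, sigma2=0.3 ** 2)

Note that the calibration step only needs a modest labeled split (500 points in this toy), which is consistent with the goal of avoiding fresh ground truth at the causal-inference stage.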

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, Conservatoire National des Arts et Métiers. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach to Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models”, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparisons not only between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and examines the impact of multi-temporal data for GFMs.
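
As a rough illustration of why that decoupling matters, the following minimal sketch (placeholder functions, not the FlowEO code) shows the inference pipeline: a generative translation model maps a target-domain image back into the distribution the predictor was trained on, and the frozen downstream model is reused unchanged.

import numpy as np

def translate_to_source_domain(target_image):
    # Placeholder for a flow-matching translation model (e.g., SAR -> optical);
    # a real model would map the image into the predictor's training domain.
    return target_image

def pretrained_predictor(image):
    # Placeholder for a frozen, task-specific model (here a trivial classifier).
    return int(image.mean() > 0.5)

sar_image = np.random.rand(256, 256)               # target-domain input
optical_like = translate_to_source_domain(sar_image)
label = pretrained_predictor(optical_like)         # no retraining required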

Self-supervised pre-training for glacier calving front extraction from synthetic aperture radar imagery

Event date:

Webinar with Nora Gourmelon, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). The factors influencing melt at glacier fronts facing the ocean remain an active area of research. To better understand these factors, it is essential to monitor changes in glacier calving fronts. Because monitoring must capture weather-dependent and seasonal variations, Synthetic Aperture Radar (SAR) is the preferred imaging method for this task. In recent years, deep learning models have been developed to automate the extraction of calving front positions; however, their performance on SAR data remains suboptimal. Limited labeled data and high variability in SAR images hinder traditional supervised learning. Foundation models pre-trained on large, diverse datasets could provide robust feature representations that require minimal labeled data for fine-tuning on specific SAR tasks. However, in preliminary experiments, we found that the domain gap is too large for the task of extracting calving fronts from SAR imagery, and foundation models fine-tuned on the "CAlving Fronts and where to Find thEm" (CaFFe) benchmark dataset performed poorly. Therefore, we compiled an unlabeled dataset of Sentinel-1 SAR image sequences of Arctic glaciers, each associated with a single Sentinel-2 optical reference image. Using this dataset, we developed a novel multi-modal self-supervised pre-training strategy and applied it to pre-train a hybrid CNN-transformer model. Fine-tuning on the CaFFe benchmark showed that the pre-trained model outperforms its non-pre-trained counterpart, enabling more robust calving front segmentation and demonstrating the potential of data-efficient learning for SAR imagery.
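
One common pattern for this kind of multi-modal self-supervision is a contrastive objective that pulls embeddings of co-located SAR and optical views together. The sketch below shows a symmetric InfoNCE loss on random feature vectors purely as an assumed illustration; the webinar's actual pre-training strategy may differ.

import numpy as np

def info_nce(sar_emb, opt_emb, temperature=0.1):
    # Symmetric InfoNCE: matching SAR/optical pairs (same location) should be
    # more similar than all mismatched pairs in the batch.
    sar = sar_emb / np.linalg.norm(sar_emb, axis=1, keepdims=True)
    opt = opt_emb / np.linalg.norm(opt_emb, axis=1, keepdims=True)
    logits = sar @ opt.T / temperature                 # (batch, batch) similarities
    idx = np.arange(len(logits))
    log_sm_s2o = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_sm_o2s = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return -0.5 * (log_sm_s2o[idx, idx].mean() + log_sm_o2s[idx, idx].mean())

# Toy batch of 8 paired embeddings (stand-ins for SAR and optical encoder outputs).
rng = np.random.default_rng(0)
print(info_nce(rng.normal(size=(8, 128)), rng.normal(size=(8, 128))))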


High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.


Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

Self-supervised pre-training for glacier calving front extraction from synthetic aperture radar imagery

Event date:

Webinar with Nora Gourmelon, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). The factors influencing melt at glacier fronts facing the ocean remain an active area of research. To better understand these factors, it is essential to monitor changes in glacier calving fronts. Because calving fronts vary with weather and season and therefore require continuous, year-round observation, Synthetic Aperture Radar (SAR) is the preferred imaging method for such monitoring. In recent years, deep learning models have been developed to automate the extraction of calving front positions; however, their performance on SAR data remains suboptimal. Limited labeled data and high variability in SAR images hinder traditional supervised learning. Foundation models pre-trained on large, diverse datasets could provide robust feature representations that require minimal labeled data for fine-tuning on specific SAR tasks. However, in preliminary experiments we found that the domain gap is too large for the task of extracting calving fronts from SAR imagery, and foundation models fine-tuned on the "CAlving Fronts and where to Find thEm" (CaFFe) benchmark dataset underperformed. We therefore compiled an unlabeled dataset of Sentinel-1 SAR image sequences of Arctic glaciers, each associated with a single Sentinel-2 optical reference image. Using this dataset, we developed a novel multi-modal self-supervised pre-training strategy and applied it to pre-train a hybrid CNN-transformer model. Fine-tuning on the CaFFe benchmark showed that the pre-trained model outperforms its non-pre-trained counterpart, enabling more robust calving front segmentation and demonstrating the potential of data-efficient learning for SAR imagery.
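
The abstract does not spell out the self-supervised objective, so the sketch below shows one plausible choice under stated assumptions: a contrastive loss that aligns embeddings of co-located Sentinel-1 SAR and Sentinel-2 optical patches, in the spirit of CLIP-style cross-modal pre-training. The encoder and argument names are hypothetical, and the strategy actually used in the talk may differ.

# Hypothetical sketch of a cross-modal contrastive pre-training objective
# for paired SAR and optical patches; not the presenter's implementation.
import torch
import torch.nn.functional as F

def cross_modal_infonce(sar_encoder, optical_encoder, sar_batch, optical_batch, tau=0.07):
    # Embed each modality and L2-normalise the features.
    z_sar = F.normalize(sar_encoder(sar_batch), dim=-1)
    z_opt = F.normalize(optical_encoder(optical_batch), dim=-1)
    # Co-located SAR/optical pairs are positives; every other pairing in
    # the batch acts as a negative.
    logits = z_sar @ z_opt.T / tau
    targets = torch.arange(logits.shape[0], device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))

After such pre-training, the SAR encoder would typically be kept and fine-tuned with a segmentation head on the labeled CaFFe data, matching the fine-tuning step described above.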

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap from research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, can achieve significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap between research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Self-supervised pre-training for glacier calving front extraction from synthetic aperture radar imagery

Event date:

Webinar with Nora Gourmelon, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). The factors influencing melt at glacier fronts facing the ocean remain an active area of research. To better understand these factors, it is essential to monitor changes in glacier calving fronts. Because weather-dependent and seasonal variations are important to capture, Synthetic Aperture Radar (SAR) is the preferred imaging method for such monitoring. In recent years, deep learning models have been developed to automate the extraction of calving front positions; however, their performance on SAR data remains suboptimal. Limited labeled data and high variability in SAR images hinder traditional supervised learning. Foundation models pre-trained on large, diverse datasets could provide robust feature representations that require minimal labeled data for fine-tuning on specific SAR tasks. However, in preliminary experiments, we found that the domain gap is too large for the task of extracting calving fronts from SAR imagery: foundation models fine-tuned on the "CAlving Fronts and where to Find thEm" (CaFFe) benchmark dataset performed below expectations. Therefore, we compiled an unlabeled dataset of Sentinel-1 SAR image sequences of Arctic glaciers, each associated with a single Sentinel-2 optical reference image. Using this dataset, we developed a novel multi-modal self-supervised pre-training strategy and applied it to pre-train a hybrid CNN-transformer model. Fine-tuning on the CaFFe benchmark showed that the pre-trained model outperforms its non-pre-trained counterpart, enabling more robust calving front segmentation and demonstrating the potential of data-efficient learning for SAR imagery.
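To make the idea of multi-modal self-supervised pre-training concrete, the sketch below shows one common form such an objective can take: an InfoNCE-style contrastive loss that pulls the embedding of a SAR image toward the embedding of its co-located optical reference. This is an illustrative example of the general technique, not the specific strategy presented in the talk.

import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(sar_emb, opt_emb, temperature=0.07):
    # Each SAR embedding is paired with its co-located optical embedding;
    # all other images in the batch act as negatives (InfoNCE).
    sar = F.normalize(sar_emb, dim=1)
    opt = F.normalize(opt_emb, dim=1)
    logits = sar @ opt.t() / temperature          # (B, B) cosine similarities
    targets = torch.arange(sar.size(0), device=sar.device)
    # Symmetric loss: SAR-to-optical and optical-to-SAR retrieval.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Hypothetical usage with two encoders, e.g. a hybrid CNN-transformer for SAR
# and any image encoder for the Sentinel-2 reference:
# loss = cross_modal_contrastive_loss(sar_encoder(sar_batch), opt_encoder(opt_batch))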

Hyperparameter tuning, Bayesian optimisation and applications related to climate change

Event date:

Webinar with Sigrid Passano Hellan, NORCE Norwegian Research Centre. If you don’t want to set and tune machine learning model hyperparameters such as the learning rate by hand, automated hyperparameter tuning is a good alternative. Also called hyperparameter optimisation, it reduces the amount of machine learning experience required to set up a model by automatically finding the best hyperparameters. Reducing these hurdles makes it easier to adopt machine learning for more climate applications. Bayesian optimisation is a popular method for hyperparameter optimisation; it consists of fitting a probabilistic surrogate model and using it to guide the optimisation process. Both hyperparameter optimisation and Bayesian optimisation can be greatly enhanced by transfer learning, where we learn from previous optimisation problems to speed up the current one. In this talk I will give an introduction to methods for hyperparameter tuning and Bayesian optimisation, and to how transfer learning can be used. Then I will present how Bayesian optimisation has been applied to climate-related research, as we discussed in a recent survey. We found the main application areas to be material discovery – finding better materials for solar panels; wind farm layouts – deciding where to place turbines; optimal control of renewables – e.g. adjusting wind turbine blade angles to current conditions; and environmental monitoring – e.g. finding pollution maxima.
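As a concrete illustration of the Bayesian optimisation loop described above, here is a minimal sketch that tunes a single log-scaled learning rate with a Gaussian-process surrogate and an expected-improvement acquisition function. The objective is a stand-in: in practice it would train a model and return a validation metric.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def validation_loss(log_lr):
    # Placeholder objective: replace with a real train/validate run.
    return (log_lr + 3.0) ** 2 + 0.1 * np.random.randn()

rng = np.random.default_rng(0)
X = rng.uniform(-6, 0, size=(3, 1))                  # a few random initial log-lrs
y = np.array([validation_loss(x[0]) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
candidates = np.linspace(-6, 0, 200).reshape(-1, 1)

for _ in range(15):
    gp.fit(X, y)                                     # fit the probabilistic surrogate
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
    x_next = candidates[np.argmax(ei)]               # most promising candidate
    X = np.vstack([X, x_next])
    y = np.append(y, validation_loss(x_next[0]))

print("best log learning rate found:", X[np.argmin(y)][0])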

Self-supervised pre-training for glacier calving front extraction from synthetic aperture radar imagery

Event date:

Webinar with Nora Gourmelon, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). The factors influencing melt at glacier fronts facing the ocean remain an active area of research. To better understand these factors, it is essential to monitor changes in glacier calving fronts. Due to the importance of weather-dependent and seasonal variations, Synthetic Aperture Radar (SAR) is the preferred imaging method for such monitoring. In recent years, deep learning models have been developed to automate the extraction of calving front positions, however, their performance on SAR data remains suboptimal. Limited labeled data and high variability in SAR images hinder traditional supervised learning. Foundation models pre-trained on large, diverse datasets could provide robust feature representations that require minimal labeled data for fine-tuning on specific SAR tasks. However, in preliminary experiments, we found that the domain gap is too large for the task of extracting calving fronts from SAR imagery, and foundation models fine-tuned on the "CAlving Fronts and where to Find thEm" (CaFFe) benchmark dataset performed subpar. Therefore, we compiled an unlabeled dataset of Sentinel-1 SAR image sequences of Arctic glaciers, each associated with a single Sentinel-2 optical reference image. Using this dataset, we developed a novel multi-modal self-supervised pre-training strategy and applied it to pre-train a hybrid CNN-transformer model. Fine-tuning on the CaFFe benchmark showed that the pre-trained model outperforms its non-pre-trained counterpart, enabling more robust calving front segmentation and demonstrating the potential of data-efficient learning for SAR imagery.

Earth observation and deep learning for urban applications

Event date:

Webinar with Sebastian Hafner, RISE. Urbanization is progressing at an unprecedented rate in many places around the world, making Earth observation (EO) a vital tool for monitoring its dynamics on a global scale. Modern satellite missions provide new opportunities for urban mapping and change detection (CD) through high-resolution imagery. At the same time, EO data analysis has advanced from traditional machine learning approaches to deep learning (DL), particularly Convolutional Neural Networks (ConvNets). Yet, current DL methods for urban mapping and CD face key challenges, including effectively integrating multi-modal EO data, reducing reliance on large labeled datasets, and improving transferability across geographic regions. This talk will present methods to address these challenges in urban mapping and CD, with applications such as multi-hazard building damage detection. It will also highlight the IDEAMAPS Data Ecosystem project, showing how its community-centric approach advances our understanding of urban deprivation and enables DL-based deprivation mapping.
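
As a hedged illustration of the change-detection setting described above, and not the speaker’s architecture, the sketch below shows a minimal siamese design: a shared encoder embeds imagery from two dates, and a small head turns the absolute feature difference into a per-pixel change probability. The encoder module and the class name SiameseChangeDetector are placeholders.

# Minimal siamese change-detection sketch (illustrative assumption, not the
# method presented in the talk).
import torch
import torch.nn as nn

class SiameseChangeDetector(nn.Module):
    def __init__(self, encoder: nn.Module, feat_channels: int):
        super().__init__()
        self.encoder = encoder  # shared weights applied to both acquisition dates
        self.head = nn.Sequential(
            nn.Conv2d(feat_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=1),
        )

    def forward(self, img_t1: torch.Tensor, img_t2: torch.Tensor) -> torch.Tensor:
        f1, f2 = self.encoder(img_t1), self.encoder(img_t2)  # (B, C, H, W) feature maps
        change_logits = self.head(torch.abs(f1 - f2))         # large feature differences suggest change
        return torch.sigmoid(change_logits)                   # per-pixel change probability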

Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach of Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models“, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. This benchmark includes comparison between geospatial foundation models but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and questions the impact of multi-temporal data for GFMs.

Hyperparameter tuning, Bayesian optimisation and applications related to climate change

Event date:

Webinar with Sigrid Passano Hellan, NORCE Norwegian Research Centre. If you don’t want to be setting and tuning machine learning model hyperparameters such as the learning rate yourself, then hyperparameter tuning is a good alternative. Also called hyperparameter optimisation, it has been developed to reduce the amount of machine learning experience required to set up a model by automatically finding the best hyperparameters. By reducing these hurdles to adoption, machine learning can be adopted for more climate applications. Bayesian optimisation is a popular method for hyperparameter optimisation, and consists of fitting a probabilistic model and using that to inform the optimisation process. Both hyperparameter optimisation and Bayesian optimisation can be greatly enhanced using transfer learning, where we learn from previous optimisation problems to speed up the current one. In this talk I will give an introduction to methods for hyperparameter tuning and Bayesian optimisation, and how transfer learning can be used. Then I will present how Bayesian optimisation has been applied to climate related research, as we discussed in a recent survey. We found the main application areas to be material discovery – finding better materials for solar panels; wind farm layouts – deciding where to place turbines; optimal control of renewables – e.g. adjusting wind turbine blade angles to current conditions; and environmental monitoring – e.g. finding pollution maxima.

Hyperparameter tuning, Bayesian optimisation and applications related to climate change

Event date:

Webinar with Sigrid Passano Hellan, NORCE Norwegian Research Centre. If you don’t want to be setting and tuning machine learning model hyperparameters such as the learning rate yourself, then hyperparameter tuning is a good alternative. Also called hyperparameter optimisation, it has been developed to reduce the amount of machine learning experience required to set up a model by automatically finding the best hyperparameters. By reducing these hurdles to adoption, machine learning can be adopted for more climate applications. Bayesian optimisation is a popular method for hyperparameter optimisation, and consists of fitting a probabilistic model and using that to inform the optimisation process. Both hyperparameter optimisation and Bayesian optimisation can be greatly enhanced using transfer learning, where we learn from previous optimisation problems to speed up the current one. In this talk I will give an introduction to methods for hyperparameter tuning and Bayesian optimisation, and how transfer learning can be used. Then I will present how Bayesian optimisation has been applied to climate related research, as we discussed in a recent survey. We found the main application areas to be material discovery – finding better materials for solar panels; wind farm layouts – deciding where to place turbines; optimal control of renewables – e.g. adjusting wind turbine blade angles to current conditions; and environmental monitoring – e.g. finding pollution maxima.

Earth observation and deep learning for urban applications

Event date:

Webinar with Sebastian Hafner, RISE. Urbanization is progressing at an unprecedented rate in many places around the world, making Earth observation (EO) a vital tool for monitoring its dynamics on a global scale. Modern satellite missions provide new opportunities for urban mapping and change detection (CD) through high-resolution imagery. At the same time, EO data analysis has advanced from traditional machine learning approaches to deep learning (DL), particularly Convolutional Neural Networks (ConvNets). Yet, current DL methods for urban mapping and CD face key challenges, including effectively integrating multi-modal EO data, reducing reliance on large labeled datasets, and improving transferability across geographic regions. This talk will present methods to address these challenges in urban mapping and CD, with applications such as multi-hazard building damage detection. It will also highlight the IDEAMAPS Data Ecosystem project, showing how its community-centric approach advances our understanding of urban deprivation and enables DL-based deprivation mapping.

Earth observation and deep learning for urban applications

Event date:

Webinar with Sebastian Hafner, RISE. Urbanization is progressing at an unprecedented rate in many places around the world, making Earth observation (EO) a vital tool for monitoring its dynamics on a global scale. Modern satellite missions provide new opportunities for urban mapping and change detection (CD) through high-resolution imagery. At the same time, EO data analysis has advanced from traditional machine learning approaches to deep learning (DL), particularly Convolutional Neural Networks (ConvNets). Yet, current DL methods for urban mapping and CD face key challenges, including effectively integrating multi-modal EO data, reducing reliance on large labeled datasets, and improving transferability across geographic regions. This talk will present methods to address these challenges in urban mapping and CD, with applications such as multi-hazard building damage detection. It will also highlight the IDEAMAPS Data Ecosystem project, showing how its community-centric approach advances our understanding of urban deprivation and enables DL-based deprivation mapping.

Self-supervised pre-training for glacier calving front extraction from synthetic aperture radar imagery

Event date:

Webinar with Nora Gourmelon, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). The factors influencing melt at glacier fronts facing the ocean remain an active area of research. To better understand these factors, it is essential to monitor changes in glacier calving fronts. Due to the importance of weather-dependent and seasonal variations, Synthetic Aperture Radar (SAR) is the preferred imaging method for such monitoring. In recent years, deep learning models have been developed to automate the extraction of calving front positions, however, their performance on SAR data remains suboptimal. Limited labeled data and high variability in SAR images hinder traditional supervised learning. Foundation models pre-trained on large, diverse datasets could provide robust feature representations that require minimal labeled data for fine-tuning on specific SAR tasks. However, in preliminary experiments, we found that the domain gap is too large for the task of extracting calving fronts from SAR imagery, and foundation models fine-tuned on the "CAlving Fronts and where to Find thEm" (CaFFe) benchmark dataset performed subpar. Therefore, we compiled an unlabeled dataset of Sentinel-1 SAR image sequences of Arctic glaciers, each associated with a single Sentinel-2 optical reference image. Using this dataset, we developed a novel multi-modal self-supervised pre-training strategy and applied it to pre-train a hybrid CNN-transformer model. Fine-tuning on the CaFFe benchmark showed that the pre-trained model outperforms its non-pre-trained counterpart, enabling more robust calving front segmentation and demonstrating the potential of data-efficient learning for SAR imagery.

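The abstract does not spell out the pre-training objective, so the snippet below is only a rough Python/PyTorch sketch of one common multi-modal self-supervised setup: a cross-modal contrastive (InfoNCE) loss that pulls embeddings of co-located Sentinel-1 SAR and Sentinel-2 optical patches together. The encoder names and batch shapes are placeholders, not the architecture used in the talk.

import torch
import torch.nn.functional as F

def cross_modal_info_nce(sar_emb, opt_emb, temperature=0.07):
    # sar_emb, opt_emb: (batch, dim) embeddings of co-located SAR and optical patches
    sar = F.normalize(sar_emb, dim=1)
    opt = F.normalize(opt_emb, dim=1)
    logits = sar @ opt.t() / temperature                      # (batch, batch) similarity matrix
    targets = torch.arange(sar.size(0), device=sar.device)    # matching pairs on the diagonal
    # symmetric loss: SAR -> optical and optical -> SAR
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Hypothetical pre-training step:
# loss = cross_modal_info_nce(sar_encoder(sar_patch), optical_encoder(opt_patch))
# loss.backward()
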
Generative domain adaptation and foundation models for robust Earth observation

Event date:

Webinar with Georges Le Bellier, CNAM. Deep learning for remote sensing plays a crucial role in turning satellite and aerial imagery into dependable, real-world insights. However, Earth observation models must handle diverse environments, sensors, and conditions—such as clouds, seasonal shifts, and geographic differences—while still producing accurate results. In this talk, we explore two paths that lead to more robust and adaptable algorithms: generative domain adaptation and geospatial foundation models. First, I will introduce FlowEO, a generative approach to Unsupervised Domain Adaptation (UDA) for Earth observation, and show its high performance in UDA scenarios for several downstream tasks, including dense prediction and classification. This flow-matching-based translation method improves pretrained predictive models' accuracies in challenging scenarios such as post-disaster response and high cloud coverage cases with SAR-to-optical translation. FlowEO’s generative domain adaptation method is independent of the downstream task and does not require retraining the predictive model. Then, I will present “PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models”, a standardized evaluation protocol that covers a diverse set of datasets, dense prediction tasks, resolutions, sensor modalities, and temporalities. The benchmark compares geospatial foundation models (GFMs) not only with one another but also with supervised baselines, namely U-Net and ViT, and highlights the strengths and weaknesses of GFMs. In addition, PANGAEA evaluates models’ accuracy in cases where labels are limited and examines the impact of multi-temporal data on GFMs.

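FlowEO's exact formulation is not given in the abstract; for orientation, here is a minimal, generic Python/PyTorch sketch of the flow-matching objective that such image-to-image translation methods build on (linear interpolation paths, velocity regression). All names below are illustrative, not FlowEO's API.

import torch
import torch.nn.functional as F

def flow_matching_loss(velocity_net, x_src, x_tgt):
    # Regress the velocity of the straight-line path from a source-domain image
    # x_src to a target-domain image x_tgt (both of shape (batch, C, H, W)).
    t = torch.rand(x_src.size(0), 1, 1, 1, device=x_src.device)  # random time in [0, 1]
    x_t = (1.0 - t) * x_src + t * x_tgt        # point on the linear path at time t
    v_target = x_tgt - x_src                   # constant velocity of that path
    v_pred = velocity_net(x_t, t.flatten())    # network predicts the velocity at (x_t, t)
    return F.mse_loss(v_pred, v_target)

# At inference, integrating dx/dt = velocity_net(x, t) from t = 0 to t = 1 (e.g. with a
# simple Euler loop) translates a source image toward the target domain, without
# retraining the downstream predictive model.
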
Hyperparameter tuning, Bayesian optimisation and applications related to climate change

Event date:

Webinar with Sigrid Passano Hellan, NORCE Norwegian Research Centre. If you don’t want to be setting and tuning machine learning model hyperparameters such as the learning rate yourself, then hyperparameter tuning is a good alternative. Also called hyperparameter optimisation, it has been developed to reduce the amount of machine learning experience required to set up a model by automatically finding the best hyperparameters. By reducing these hurdles, hyperparameter tuning makes it easier to adopt machine learning for more climate applications. Bayesian optimisation is a popular method for hyperparameter optimisation, and consists of fitting a probabilistic model and using that to inform the optimisation process. Both hyperparameter optimisation and Bayesian optimisation can be greatly enhanced using transfer learning, where we learn from previous optimisation problems to speed up the current one. In this talk I will give an introduction to methods for hyperparameter tuning and Bayesian optimisation, and how transfer learning can be used. Then I will present how Bayesian optimisation has been applied to climate-related research, as we discussed in a recent survey. We found the main application areas to be material discovery – finding better materials for solar panels; wind farm layouts – deciding where to place turbines; optimal control of renewables – e.g. adjusting wind turbine blade angles to current conditions; and environmental monitoring – e.g. finding pollution maxima.

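To make the "fit a probabilistic model and use it to guide the search" loop concrete, here is a small, self-contained Python sketch of Bayesian optimisation with a Gaussian-process surrogate and expected improvement, using scikit-learn and SciPy. The objective is a stand-in for whatever validation metric a real hyperparameter search would evaluate.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(lr):
    # stand-in for e.g. validation loss as a function of the learning rate
    return (np.log10(lr) + 2.5) ** 2

def expected_improvement(x_cand, gp, y_best):
    mu, std = gp.predict(x_cand, return_std=True)
    std = np.maximum(std, 1e-9)
    z = (y_best - mu) / std                    # we are minimising the objective
    return (y_best - mu) * norm.cdf(z) + std * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(1e-5, 1e-1, size=(3, 1))       # a few random initial evaluations
y = np.array([objective(x[0]) for x in X])
for _ in range(10):
    gp = GaussianProcessRegressor(normalize_y=True).fit(np.log10(X), y)
    cand = rng.uniform(1e-5, 1e-1, size=(256, 1))
    ei = expected_improvement(np.log10(cand), gp, y.min())
    x_next = cand[np.argmax(ei)]               # most promising learning rate to try next
    X, y = np.vstack([X, x_next]), np.append(y, objective(x_next[0]))
print("best learning rate found:", X[np.argmin(y), 0])
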
Leveraging AI for Climate Resilience in Africa: Challenges, Opportunities, and the Need for Collaboration

Event date:

Webinar with Amal Nammouchi, Karlstad University and AfriClimate AI. As climate change issues become more pressing, their impact in Africa calls for urgent, innovative solutions tailored to the continent’s unique challenges. While Artificial Intelligence (AI) emerges as a critical and valuable tool for climate change adaptation and mitigation, its effectiveness and potential are contingent upon overcoming significant challenges such as data scarcity, infrastructure gaps, and limited local AI development. This talk explores the role of AI in climate change adaptation and mitigation in Africa. It advocates for a collaborative approach to build capacity, develop open-source data repositories, and create context-aware, robust AI-driven climate solutions that are culturally and contextually relevant.

Leveraging AI for Large-Scale Acoustic Biodiversity Monitoring: Insights from TABMON

Event date:

Webinar with Benjamin Cretois, Norwegian Institute for Nature Research. Advancing biodiversity monitoring is crucial for meeting the EU Biodiversity Strategy targets and addressing gaps in current ecological assessments. However, collecting data to monitor the state of biodiversity is time- and resource-consuming. Passive Acoustic Monitoring (PAM), in combination with AI tools, offers an efficient alternative to conventional data collection practices. PAM is a non-invasive method that uses sound recorders to capture wildlife vocalizations and environmental sounds over time. It is particularly valuable for monitoring elusive or nocturnal species, such as birds, amphibians, and marine mammals, that are challenging to detect visually. The "Towards a Transnational Acoustic Biodiversity Monitoring Network" (TABMON) project is an initiative to establish a transnational passive acoustic monitoring network using autonomous acoustic sensors across four European countries: Norway, the Netherlands, France, and Spain. TABMON’s objective is to demonstrate how acoustic sensing, coupled with cutting-edge AI, can complement traditional monitoring methods and support the development of methods to better monitor biodiversity. In this talk, we will also share our experiences with the deployment of acoustic recorders, data management strategies, and annotation protocols. These include managing large-scale, networked deployments across diverse landscapes, designing an efficient annotation workflow, and leveraging AI tools to process and analyze massive datasets.

Tech-informed nature conservation - are we there yet?

Event date:

Webinar with Abdulhakim Abdi, Lund University. In the face of a global biodiversity crisis, the tools to monitor ecological change efficiently across space and time have never been more accessible. Advances in satellite remote sensing and AI have opened new frontiers for biodiversity monitoring by offering powerful tools to analyze vast datasets, automate species detection, and improve disturbance tracking. While these technologies hold great promise, their widespread adoption faces key challenges, including data accessibility, environmental costs, model interpretability, and the need for stronger interdisciplinary collaboration. This presentation explores the opportunities and limitations of tech-driven nature conservation, assessing whether current developments are sufficient to bridge critical knowledge gaps. The integration of remote sensing with AI brings us closer to a more data-informed approach to managing the living world - but are we truly ready to harness its full potential?

Edge AI and IoT-Driven Water Quality Monitoring

Event date:

Webinar with Atakan Aral, University of Vienna. The integration of Edge Computing and Artificial Intelligence (i.e., Edge AI) unlocks new possibilities for remote monitoring of water quality across diverse environments. This talk will explore the dual aspects of "remote" monitoring: (1) collecting water quality data in geographically remote regions with limited energy and connectivity and (2) using connected water quality sensors to gain real-time insights remotely. By deploying low-power or energy-harvesting sensors and learning from data in close proximity to sensors, we can improve efficiency, reduce latency, and maintain learning performance in challenging environments without reliable communication and energy infrastructures. The presentation will showcase two real-world case studies in river pollution source identification and monitoring of water distribution systems.

Reliable conclusions from remote sensing maps

Event date:

Webinar with Sherrie Wang, MIT. Remote sensing maps are used to estimate regression coefficients relating environmental variables, such as the effect of conservation zones on deforestation. However, the quality of map products varies, and, because maps are outputs of complex machine learning algorithms that take in a variety of remotely sensed variables as inputs, errors are difficult to characterize. Thus, population-level estimates from such maps may be biased. We show how a small amount of randomly sampled ground truth data can correct for bias in large-scale remote sensing map products. Applying our method across multiple remote sensing use cases in regression coefficient estimation, we find that it results in estimates that (1) are more reliable than using the map product as if it were 100% accurate and (2) have lower uncertainty than using only the ground truth and ignoring the map product. Paper: https://arxiv.org/abs/2407.13659

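The exact estimator is in the linked paper; the general flavour, though, is a prediction-powered-style correction: compute the regression coefficient from the (possibly biased) map everywhere, then debias it using the gap between map and ground truth on a small random sample. A rough numpy sketch, with all variable names hypothetical:

import numpy as np

def ols_slope(x, y):
    # univariate OLS slope, e.g. effect of a conservation covariate on forest loss
    x = x - x.mean()
    return float(x @ (y - y.mean()) / (x @ x))

def corrected_slope(x_all, map_all, x_sample, map_sample, truth_sample):
    # Slope estimated from the map over the full region, debiased by the
    # map-vs-ground-truth discrepancy measured on a small random sample.
    naive = ols_slope(x_all, map_all)              # treats the map as if it were truth
    bias = ols_slope(x_sample, map_sample) - ols_slope(x_sample, truth_sample)
    return naive - bias
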
Machine learning for Earth system prediction and predictability

Event date:

Webinar with María J. Molina, University of Maryland. Machine learning can be used for Earth system prediction, or to study the uppermost limit of prediction skill theoretically achievable given the system's initial state or other factors, otherwise known as predictability. In traditional numerical weather prediction frameworks, we solve the governing partial differential equations starting from an initial state. This initialized prediction framework usually involves three stages: 1) generating the initial conditions of the Earth system, 2) running the mathematical representation of the system on a computer forward in time, and 3) analyzing the output and converting it into a format that is useful for end users. Machine learning can be used to improve each of these individual stages, or to circumvent the three-stage framework altogether, and examples of each will be given in this seminar. More time during the seminar will be dedicated to the challenges surrounding subseasonal prediction, which focuses on lead times of three to four weeks, and to how we can use machine learning both to uncover potential biases in our initialized prediction systems and to correct them.

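As one concrete illustration of the "use machine learning to bias-correct an initialized prediction system" idea (a generic post-processing sketch, not the specific method of the talk), a regression model can be trained to map raw forecast predictors to observed values at a fixed lead time. The arrays below are synthetic placeholders.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Placeholder training data: rows are (grid point, initialisation) samples, columns
# are predictors taken from the raw forecast (e.g. 2 m temperature, SST, soil moisture).
rng = np.random.default_rng(0)
X_forecast = rng.random((5000, 4))
y_observed = X_forecast[:, 0] + 0.3 * rng.standard_normal(5000)   # synthetic "truth"

# Learn the systematic forecast-vs-observation relationship at this lead time
post_processor = GradientBoostingRegressor().fit(X_forecast, y_observed)

# Bias-corrected values for new forecast output
corrected = post_processor.predict(rng.random((10, 4)))
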
Optimising green infrastructure for climate-resilient cities: AI-driven approaches to urban cooling

Event date:

Webinar with Abdul Shaamala, Queensland University of Technology. Green infrastructure (GI) is critical in enhancing urban resilience, mitigating heat stress, and improving environmental sustainability. However, optimising the placement and configuration of green elements, such as trees, parks, and vegetative corridors, requires a data-driven approach that accounts for microclimate variations, urban morphology, and long-term ecosystem benefits. This talk explores how artificial intelligence (AI), machine learning (ML), and geospatial analysis can be leveraged to optimise GI for urban cooling and climate adaptation. Specifically, it delves into tree optimisation strategies that enhance shade provision, reduce urban heat islands (UHI), and improve outdoor thermal comfort. By utilising computational models, including optimisation algorithms and thermal analysis, cities can strategically position vegetation to maximise cooling benefits while balancing urban development needs.

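To show what "optimisation algorithms plus thermal analysis" can look like in miniature (a toy sketch, not the method of the talk), a greedy loop can place trees one at a time wherever a cooling surrogate predicts the largest benefit. The heat-index grid and shade model below are purely hypothetical placeholders.

import numpy as np

def greedy_tree_placement(heat_map, n_trees, radius=2):
    # Toy greedy optimiser: repeatedly place a tree where it newly shades the
    # hottest unshaded cells of a heat-index grid (hypothetical cooling surrogate).
    rows, cols = heat_map.shape
    shaded, placed = set(), []
    def shade_cells(r, c):
        return [(i, j)
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    def benefit(r, c):
        return sum(heat_map[i, j] for i, j in shade_cells(r, c) if (i, j) not in shaded)
    for _ in range(n_trees):
        r, c = max(((i, j) for i in range(rows) for j in range(cols) if (i, j) not in placed),
                   key=lambda cell: benefit(*cell))
        placed.append((r, c))
        shaded.update(shade_cells(r, c))
    return placed

# Example: place 5 trees on a synthetic 20 x 20 urban heat-index grid
print(greedy_tree_placement(np.random.rand(20, 20), n_trees=5))
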
SatML for large-scale above-ground biomass estimation

Event date:

Webinar with Ghjulia Sialelli, ETH Zurich. The combination of remote sensing and machine learning has made it possible to map forest properties at an unprecedented scale and resolution. In this presentation, I will focus on the application of deep learning techniques to estimate above-ground biomass (AGB), a key metric for tracking forest carbon and ecosystem dynamics. I will begin by introducing our recently published, machine-learning-ready dataset. It features high-resolution (10m) multi-modal satellite imagery, paired with AGB reference values from NASA’s Global Ecosystem Dynamics Investigation (GEDI) mission. Key aspects include the carefully selected geographic coverage, thoughtful integration of diverse satellite data sources, and the establishment of performance baselines using standard deep learning models. Next, I will describe our ongoing efforts to build on these baselines through both feature and model engineering. I will also mention some promising yet unsuccessful approaches, highlighting some key challenges of the task at hand. Finally, I will discuss future directions, including incorporating uncertainty estimation and exploring the potential for generating a global above-ground biomass map.

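The dataset and baselines frame this as a standard supervised regression problem: a multi-band 10 m image patch in, one GEDI-derived AGB value out. Below is a minimal Python/PyTorch training-loop sketch; the architecture, channel count, and synthetic tensors are placeholders rather than the models benchmarked in the work.

import torch
import torch.nn as nn

# Hypothetical baseline: a tiny CNN mapping a multi-band patch to a single AGB value
model = nn.Sequential(
    nn.Conv2d(12, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Placeholder batch: 8 patches, 12 bands (e.g. stacked Sentinel-1/2 channels), 32 x 32 px
patches = torch.randn(8, 12, 32, 32)
agb_reference = torch.rand(8, 1) * 300        # stand-in for GEDI AGB labels (Mg/ha)

for step in range(100):
    loss = nn.functional.mse_loss(model(patches), agb_reference)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
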
The digital revolution of Earth system modelling

Event date:

Webinar with Peter Dueben, European Centre for Medium-Range Weather Forecasts. This talk will outline three revolutions that happened in Earth system modelling in the past decades. The quiet revolution has leveraged better observations and more compute power to allow for constant improvements in prediction quality over the last decades; the digital revolution has enabled us to perform km-scale simulations on modern supercomputers that further increase the quality of our models; and the machine learning revolution has now shown that machine-learned weather models are competitive with physics-based weather models for many forecast scores while being easier, smaller, and cheaper. This talk will summarize the past developments, explain current challenges and opportunities, and outline what the future of Earth system modelling will look like, in particular regarding machine-learned foundation models in a physical domain such as Earth system modelling.

High-stakes decisions from low-quality data: AI decision-making for conservation

Event date:

Webinar with Lily Xu, Columbia University. Like many of society's grand challenges, biodiversity conservation requires effectively allocating and managing our limited resources in the face of imperfect information. My research develops data-driven AI decision-making methods to do so, overcoming the messy data ubiquitous in these settings. Here, I’ll present technical advances in machine learning, reinforcement learning, and causal inference, addressing research questions that emerged from on-the-ground challenges in wildlife conservation. I’ll also discuss bridging the gap between research and practice, with anti-poaching field tests in Cambodia, field visits in Belize and Uganda, and large-scale deployment with SMART conservation software.

Debiasing AI predictions for causal inference without fresh ground truth data

Event date:

Webinar with Markus Pettersson, Chalmers University of Technology. Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments. In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes. Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches and achieve significant reductions in attenuation bias, thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.

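The abstract describes the two corrections only at a high level, so the following numpy sketch is a guess at how such corrections are commonly set up rather than the paper's exact implementation: linear calibration fits an affine map from predictions to outcomes on a held-out labelled set and applies it everywhere, while classic Tweedie-style corrections adjust each prediction by the score (derivative of the log-density) of the prediction distribution, here estimated with a kernel density fit. The noise-variance term is an illustrative input.

import numpy as np
from scipy.stats import gaussian_kde

def linear_calibration(y_cal, yhat_cal, yhat_all):
    # Fit y ~ a + b * yhat on the held-out calibration set, then transform all predictions.
    b, a = np.polyfit(yhat_cal, y_cal, deg=1)
    return a + b * yhat_all

def tweedie_style_correction(yhat_all, noise_var, eps=1e-3):
    # Empirical-Bayes adjustment: corrected = yhat + noise_var * d/dy log p(yhat),
    # with the score estimated from a kernel density fit to the predictions.
    kde = gaussian_kde(yhat_all)
    log_p = lambda y: np.log(kde(y) + 1e-12)
    score = (log_p(yhat_all + eps) - log_p(yhat_all - eps)) / (2 * eps)
    return yhat_all + noise_var * score
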
2025 Nordic Workshop on AI for Climate Change

The 2025 Nordic Workshop on AI for Climate Change will gather researchers from the Nordics. This one-day, in-person workshop will take place in Gothenburg, Sweden, on May 13th, 2025. The workshop will feature a mix of keynotes, oral presentations, and posters around the topics of AI for climate change, including AI for biodiversity and the green transition. The workshop will be a meeting point for a wide range of researchers, primarily from around the Nordic countries.

AI for Climate Action across the Nordics
