Geospatial foundation models enable data-efficient tree species mapping in temperate mountain forests

James GC Ball, Jana Annika Wicklein, Frank Feng, Jovana Knezevic, Sadiq Jaffer, Anil Madhavapeddy, Clement Atzberger, Michele Dalponte, and David Coomes. In bioRxiv.

Abstract

Accurate mapping of tree species from satellite data remains challenging in heterogeneous mountain forests due to environmental gradients, mixed stands, limited availability of high-purity training labels, and strong illumination-angle effects.

Recent geospatial foundation models offer a new approach by learning generic, cloud-agnostic, information-rich representations from large multi-sensor archives suitable for a range of downstream tasks, but their ecological utility for species-level mapping remains incompletely understood.

Here, we evaluate two geospatial foundation-model embeddings, AlphaEarth and Tessera, for tree species classification in the Trentino region of northern Italy, using parcel-level forest inventories as reference data (18 species and species groups). We compare their performance against conventional Sentinel-1+2 satellite composites across a series of controlled experiments examining classification accuracy, label efficiency, classifier complexity, robustness to label impurity, and temporal transferability.

Foundation-model embeddings consistently outperform composite-based multispectral satellite baselines (weighted F1 = 0.83 vs. 0.80; macro F1 = 0.55 vs. 0.50), reaching near-asymptotic accuracy with as few as 5% of available training parcels and preserving ecologically meaningful structure aligned with functional and taxonomic groupings.
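The gap between weighted and macro F1 in these results is what one would expect under class imbalance: weighted F1 is dominated by common species, while macro F1 gives every class equal weight, so rare species pull it down. The following is a minimal sketch of the two averages, assuming the standard one-vs-rest F1 definition. The labels are hypothetical toy values, not data from the study, and the function names are illustrative.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {class: F1}, computed one-vs-rest for each class present in y_true."""
    scores = {}
    for cls in set(y_true):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by class support (number of true samples)."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Hypothetical toy example: one common class, one rare class.
y_true = ["spruce"] * 8 + ["yew"] * 2
y_pred = ["spruce"] * 8 + ["spruce", "yew"]  # one rare-class parcel is missed

print(f"macro F1    = {macro_f1(y_true, y_pred):.3f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
```

In this toy case a single missed rare-class sample lowers macro F1 much more than weighted F1, which is why the two metrics are reported side by side.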

However, realising this advantage requires a nonlinear classifier: a compact neural network provides better results than classic machine learning (i.e. Random Forest) and performs as well as deeper neural networks, while a linear classifier on foundation-model embeddings underperforms a neural network on conventional composites. Ancillary environmental covariates offer no additional classification benefit when added to embedding-based models.

Classification accuracy remains robust to moderate levels of label impurity, allowing mixed parcels to be retained in the training dataset without substantial penalties. Training with parcel-level species proportions as soft labels achieves higher peak performance (macro F1 = 0.586 for Tessera, 0.589 for AlphaEarth) and lower Proportion L1 error than hard labels, without requiring purity filtering, and so makes use of the full range of input data.
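To make the soft-label idea concrete, the sketch below shows one common way to train against species proportions: cross-entropy between the predicted class distribution and the parcel's proportion vector. The abstract does not specify the loss or the exact definition of Proportion L1 error, so both are assumptions here: the loss is a standard soft-target cross-entropy, and the L1 error is taken to be the sum of absolute differences between predicted and reference proportions. All values are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, target_props):
    """Cross-entropy against a proportion vector (assumed loss, not confirmed by the source).

    With a one-hot target this reduces to ordinary hard-label cross-entropy.
    """
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_props, probs) if t > 0)


def proportion_l1(pred_props, target_props):
    """Sum of absolute differences between two proportion vectors (assumed definition)."""
    return sum(abs(p - t) for p, t in zip(pred_props, target_props))


# Hypothetical mixed parcel: 70% species A, 30% species B, 0% species C.
target = [0.7, 0.3, 0.0]
logits = [2.0, 1.0, -1.0]  # illustrative model output for this parcel

pred = softmax(logits)
print("predicted proportions:", [round(p, 3) for p in pred])
print("soft-label loss:", round(soft_cross_entropy(logits, target), 3))
print("proportion L1 error:", round(proportion_l1(pred, target), 3))
```

Under this formulation a mixed parcel contributes its full composition to the loss rather than being discarded or forced into a single dominant-species label.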

However, temporal transfer across years reveals performance degradation, with weighted F1 declining by 9% for Tessera and 15% for AlphaEarth, and disproportionate losses for rare species. Overall, our results show that geospatial foundation models shift a primary bottleneck in species mapping from feature engineering toward the availability, quality, and temporal alignment of ecological reference data, while opening new opportunities for scalable biodiversity monitoring and the analysis of ecological change.