Embodied multi-modal data fusion via geometry anchoring for continuous perception in ground robots

Jiahe Fan, Mohammud J. Bocus, Shaolong Shu*

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal) › peer-review

Abstract

Reliable environment perception is essential for the autonomous operation of embodied agents in complex outdoor environments. While semantic segmentation is an effective method for environment perception in outdoor navigation, its reliability in dynamic environments is often constrained by data distribution shifts and the lack of access to the original training data. To address these issues, source-free unsupervised domain adaptation (SFUDA) has emerged as a vital solution. However, the feature representations learned through unsupervised adaptation methods remain sensitive to natural variations, leading to performance degradation under changing environmental conditions. To overcome this limitation, this article presents geometry-anchored structural distillation (GASD), a framework that leverages stable geometric priors from a foundation model for robust outdoor continuous perception. To establish reliable self-supervision, a prototype-guided fusion strategy is introduced to synthesize high-quality pseudo labels by dynamically balancing appearance features and geometric stability. Building on this strategy, a geometry-anchored augmentation mechanism regularizes perturbed RGB features with invariant depth structures, ensuring consistency under varying lighting or weather conditions. Additionally, the framework employs Pearson correlation to align semantic predictions across modalities, leading to improved cross-modal consistency and enhanced robustness in continuous perception. Experiments on outdoor adaptation benchmarks demonstrate that GASD outperforms existing SFUDA methods, highlighting the effectiveness of leveraging geometric consistency to stabilize semantic perception across diverse and dynamic outdoor environments.
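The abstract does not give implementation details, but the cross-modal alignment term it mentions can be illustrated. The sketch below, written as a rough assumption rather than the authors' actual method, computes a per-pixel Pearson correlation between the semantic predictions of an RGB branch and a depth branch, and turns it into a consistency loss that is zero when the two modalities agree perfectly. The function name and the flattened `(N, C)` prediction layout are illustrative choices, not taken from the paper.

```python
import numpy as np

def pearson_consistency_loss(rgb_logits, depth_logits, eps=1e-8):
    """Hypothetical cross-modal consistency term via Pearson correlation.

    rgb_logits, depth_logits: (N, C) arrays of per-pixel class scores
    from an RGB branch and a depth branch. Returns 1 - mean Pearson r,
    so identical predictions give a loss near 0 and anti-correlated
    predictions give a loss near 2.
    """
    # Centre each prediction vector over the class dimension.
    r = rgb_logits - rgb_logits.mean(axis=1, keepdims=True)
    d = depth_logits - depth_logits.mean(axis=1, keepdims=True)
    num = (r * d).sum(axis=1)
    den = np.sqrt((r ** 2).sum(axis=1) * (d ** 2).sum(axis=1)) + eps
    corr = num / den                  # per-pixel Pearson r in [-1, 1]
    return float(1.0 - corr.mean())  # 0 when the branches agree
```

Minimizing such a term pushes the RGB predictions toward the (assumed more stable) depth-anchored predictions without requiring source data or labels, which is consistent with the source-free setting described above.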
Original language: English
Pages (from-to): 162-169
Number of pages: 8
Journal: Pattern Recognition Letters
Volume: 203
Early online date: 3 Mar 2026
DOIs
Publication status: E-pub ahead of print - 3 Mar 2026

Bibliographical note

Publisher Copyright:
© 2026 Elsevier B.V.
