
Forest-Chat: Adapting Vision-Language Agents for Interactive Forest Change Analysis

James A Brock, Ce Zhang*, Pui Anantrasirichai

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal) › peer-review

Abstract

The increasing availability of high-resolution satellite imagery, together with advances in deep learning, creates new opportunities for forest monitoring workflows. Two central challenges in this domain are pixel-level change detection and semantic change interpretation, particularly for complex forest dynamics. While large language models (LLMs) are increasingly adopted for data exploration, their integration with vision-language models (VLMs) for remote sensing image change interpretation (RSICI) remains underexplored, especially beyond urban environments. This paper introduces Forest-Chat, an LLM-driven agent for forest change analysis that enables natural language querying across multiple RSICI tasks, including change detection and captioning, object counting, deforestation characterisation, and change reasoning. Forest-Chat builds upon a multi-level change interpretation (MCI) vision-language backbone with LLM-based orchestration, incorporating zero-shot change detection via AnyChange and multimodal LLM-based zero-shot change captioning and refinement. To support adaptation and evaluation in forest environments, we introduce the Forest-Change dataset, comprising bi-temporal satellite imagery, pixel-level change masks, and semantic change captions generated through human annotation and rule-based methods. Forest-Chat achieves mIoU and BLEU-4 scores of 67.10% and 40.17% on Forest-Change, and 88.13% and 34.41% on LEVIR-MCI-Trees, a tree-focused subset of LEVIR-MCI. In a zero-shot capacity, it achieves 60.15% and 34.00% on Forest-Change, and 47.32% and 18.23% on LEVIR-MCI-Trees, respectively. Further experiments demonstrate the value of caption refinement for injecting geographic domain knowledge into supervised captions, as well as the system's limited label-domain transfer to JL1-CD-Trees. These findings show that interactive, LLM-driven systems can support accessible and interpretable forest change analysis. Datasets and code are publicly available at https://github.com/JamesBrockUoB/ForestChat
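
To make the orchestration pattern described above concrete, the sketch below shows how an LLM agent might dispatch a natural-language query across change detection, captioning, counting, and reasoning tools. It is a minimal, hypothetical Python illustration under stated assumptions: the function names (detect_change, caption_change, refine_caption, answer_query) and the keyword-based routing are illustrative placeholders, not the actual Forest-Chat API, which is available in the linked repository.

# Hypothetical sketch of an LLM-agent tool-routing loop for forest
# change analysis. Names and routing rules are illustrative only.

from dataclasses import dataclass

import numpy as np
from scipy import ndimage


@dataclass
class BiTemporalPair:
    """A co-registered pair of satellite images of the same scene."""
    before_path: str
    after_path: str


def detect_change(pair: BiTemporalPair) -> np.ndarray:
    """Stand-in for zero-shot change detection (the paper uses
    AnyChange); returns a binary pixel-level change mask."""
    raise NotImplementedError


def caption_change(pair: BiTemporalPair, mask: np.ndarray) -> str:
    """Stand-in for VLM-based captioning of the masked change regions."""
    raise NotImplementedError


def refine_caption(caption: str) -> str:
    """Stand-in for multimodal-LLM refinement that injects geographic
    domain knowledge into a draft caption."""
    raise NotImplementedError


def answer_query(query: str, pair: BiTemporalPair) -> str:
    """Route a natural-language query to the appropriate tools.
    A real agent would let the LLM plan the tool calls; simple
    keyword routing stands in for that planning step here."""
    mask = detect_change(pair)
    q = query.lower()
    if "how many" in q:
        # Counting: label connected change regions in the mask.
        _, n_regions = ndimage.label(mask)
        return f"Detected {n_regions} changed regions."
    caption = caption_change(pair, mask)
    if "why" in q or "explain" in q:
        # Reasoning path: refine the draft caption with domain knowledge.
        return refine_caption(caption)
    return caption

In a real deployment the routing decision itself would be delegated to the LLM (e.g. via tool/function calling), which is what allows a single conversational interface to cover all of the RSICI tasks listed in the abstract.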
Original language: English
Article number: 103741
Number of pages: 18
Journal: Ecological Informatics
Volume: 95
Early online date: 28 Mar 2026
Publication status: E-pub ahead of print - 28 Mar 2026

Bibliographical note

Publisher Copyright:
© 2026 The Authors.

Keywords

  • Vision-Language models
  • Multi-task learning
  • Change interpretation
  • Zero-shot change detection and captioning
  • LLM agents
