TY - JOUR
T1 - TDMSANet
T2 - A Tri-Dimensional Multi-Head Self-Attention Network for Improved Crop Classification from Multitemporal Fine-Resolution Remotely Sensed Images
AU - Li, Jian
AU - Tang, Xuhui
AU - Lu, Jian
AU - Fu, Hongkun
AU - Zhang, Miao
AU - Huang, Jujian
AU - Zhang, Ce
AU - Li, Huapeng
PY - 2024/12/20
Y1 - 2024/12/20
AB - Accurate and timely crop distribution data are crucial for governments when making policies to ensure food security. However, agricultural ecosystems are spatially and temporally dynamic, which poses a great challenge for accurate crop mapping using fine spatial resolution (FSR) imagery. This research proposed a novel Tri-Dimensional Multi-head Self-Attention Network (TDMSANet) for accurate crop mapping from multitemporal fine-resolution remotely sensed images. Specifically, three sub-modules were designed to extract spectral, temporal, and spatial feature representations, respectively. All three sub-modules adopted a multi-head self-attention mechanism to assign higher weights to important features. In addition, positional encoding was adopted by both the temporal and spatial sub-modules to learn the sequence relationships between the features in a feature sequence. The proposed TDMSANet was evaluated on two sites utilizing FSR SAR (UAVSAR) and optical (RapidEye) images, respectively. The experimental results showed that TDMSANet consistently achieved significantly higher crop mapping accuracy than the benchmark models across both sites, with average overall accuracy improvements of 1.40%, 3.35%, and 6.42% over CNN, Transformer, and LSTM, respectively. The ablation experiments further showed that all three sub-modules contributed to TDMSANet, and that the Spatial Feature Extraction Module had a larger impact than the other two sub-modules.
KW - deep learning
KW - crop mapping
KW - fine spatial resolution imagery
KW - image time series
KW - multi-head self-attention
KW - spatial feature extraction
U2 - 10.3390/rs16244755
DO - 10.3390/rs16244755
M3 - Article (Academic Journal)
SN - 2072-4292
VL - 16
JO - Remote Sensing
JF - Remote Sensing
IS - 24
M1 - 4755
ER -