

DOI: 10.11834/jrs.20253201

Received: 2023-06-09

Revised: 2024-01-11

Change Detection for High-Resolution Remote Sensing Images Fusing a Multi-scale Feature Transformer
李健慷1, 张桂欣2, 祝善友1, 徐永明1, 李湘雨1
1. School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology; 2. School of Geographical Sciences, Nanjing University of Information Science and Technology
Abstract:

To strengthen the semantic information extraction capability of deep learning networks for change detection, capture more high-order multi-scale feature details, and highlight image difference information, this paper proposes a high-resolution remote sensing image change detection model based on a Siamese structure and a multi-scale feature Transformer (Multi-scale Feature Transformer Siamese Network, MFTSNet). The model designs a semantic feature Transformer module (ST) to capture the semantic information of feature maps at different levels, and introduces a grounding Transformer module (GT) and a rendering Transformer module (RT) to strengthen the acquisition of low-level and high-level semantic information, mine high-order multi-scale feature details and the global contextual relationships across spatial positions and channels, and thereby further improve change detection accuracy and enhance the completeness of detected objects, their interior regions, and their edge details. MFTSNet is compared with eight change detection models on four public datasets, and the effectiveness of each module is verified through ablation experiments and parameter analysis. The comparison experiments show that on the four datasets the F1 score and intersection over union (IoU) of MFTSNet are improved by 0.465%, 0.113%, 0.369%, and 2.13%, and by 0.723%, 0.188%, 0.304%, and 2.962%, respectively. The ablation experiments show that the GT, RT, and ST modules acting together effectively improve network performance. The feature information length L and the number of encoders and decoders are two important structural parameters of MFTSNet: the model obtains its best detection results when L is set to 16 in the CDD and WHU-CD experiments, L is set to 8 in the SYSU-CD and LEVIR-CD experiments, and (EN, DN) is set to (1, 2) on all four datasets.
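The abstract describes the overall design (a weight-sharing Siamese encoder, Transformer-based fusion of bitemporal features, and a difference-driven change map) but not its implementation. The following is a minimal PyTorch-style sketch of that general idea only; the backbone, channel widths, single feature level, and the generic cross-attention block standing in for the ST/GT/RT modules are all assumptions made for illustration, not the authors' code.

import torch
import torch.nn as nn
import torch.nn.functional as F


class TransformerFusion(nn.Module):
    """Placeholder for the ST/GT/RT modules: cross-attention over flattened feature tokens."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_map, context_map):
        b, c, h, w = query_map.shape
        q = query_map.flatten(2).transpose(1, 2)       # (B, H*W, C) tokens
        kv = context_map.flatten(2).transpose(1, 2)
        out, _ = self.attn(q, kv, kv)
        out = self.norm(out + q)                       # residual + layer norm
        return out.transpose(1, 2).reshape(b, c, h, w)


class MFTSNetSketch(nn.Module):
    """Siamese encoder -> cross-temporal Transformer fusion -> feature difference -> change map."""

    def __init__(self, dim=64):
        super().__init__()
        self.encoder = nn.Sequential(                  # shared by both image dates
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.fusion = TransformerFusion(dim)
        self.head = nn.Conv2d(dim, 2, 1)               # changed / unchanged logits

    def forward(self, img_t1, img_t2):
        f1, f2 = self.encoder(img_t1), self.encoder(img_t2)
        f1c = self.fusion(f1, f2)                      # inject context from the other date
        f2c = self.fusion(f2, f1)
        diff = torch.abs(f1c - f2c)                    # emphasize bitemporal differences
        logits = self.head(diff)
        return F.interpolate(logits, scale_factor=4, mode="bilinear", align_corners=False)


logits = MFTSNetSketch()(torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256))
print(logits.shape)  # torch.Size([1, 2, 256, 256])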

Change Detection for High-resolution Remote Sensing Images with Multi-scale Feature Transformer
Abstract:

Change detection is the process of comparing and analyzing bitemporal remote sensing images of the same area to determine changes. With the continued development of aerospace and electronic technology, the spatial resolution of remote sensing images has steadily improved, and there is an increasing need to use optical imagery with high spatial resolution for large-scale land cover change detection. Although high-resolution remote sensing images present more detailed information, the loose spatial dependencies and highly cluttered spatial structure of ground objects make information extraction more difficult. In addition, differences in geometric position caused by different sensor viewing angles weaken the separability of spectral information, making change detection in high-resolution images even harder. Developing efficient and accurate change detection algorithms for high-resolution images is therefore of great scientific significance and practical value. To address insufficient semantic information extraction by deep learning networks, the loss of high-order multi-scale feature details, and the lack of emphasis on image difference information, a multi-scale feature Transformer Siamese network (MFTSNet) is proposed to detect changes in high-resolution images. In this model, the semantic feature Transformer (ST) module is designed to capture the semantic information of feature maps at different levels, while the grounding Transformer (GT) module and the rendering Transformer (RT) module are introduced to enhance the acquisition of low-level and high-level semantic information, supplement high-order multi-scale feature details and the global context relationships between different spatial locations and channels, further improve change detection accuracy, and strengthen the completeness of the detection results in both the interior and the edges of changed regions. Compared with the other seven change detection models, MFTSNet improved the F1 score and intersection over union (IoU) on the four publicly available datasets by at least 0.465%, 0.113%, 0.369%, and 2.13% and by 0.723%, 0.188%, 0.304%, and 2.962%, respectively. Moreover, the effectiveness of the proposed modules was demonstrated through an ablation study and parameter analysis. The ablation experiments indicate that the combination of the GT, RT, and ST modules improves the performance of the MFTSNet model. The parameter analysis shows that the feature information length L and the number of encoders and decoders are two important parameters of the MFTSNet model: it achieves the highest change detection accuracy when L is 16 on the CDD and WHU-CD datasets, L is 8 on the SYSU-CD and LEVIR-CD datasets, and (EN, DN) is set to (1, 2). The results indicate that the proposed MFTSNet model better preserves the integrity of changed regions and better detects regions of different scales in the experimental areas. Cross-spatial and cross-scale interactions integrate feature maps of different scales, thereby better capturing important information in the images and improving model performance.
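The reported improvements are given in F1 score and IoU. As a reference, the short sketch below shows how these two metrics are typically computed for a binary change map from pixel-level true positives, false positives, and false negatives (standard definitions; the paper's own evaluation code is not reproduced here).

import numpy as np

def f1_and_iou(pred, label):
    """F1 score and IoU for a binary change map (pred and label are 0/1 arrays)."""
    pred, label = np.asarray(pred, bool), np.asarray(label, bool)
    tp = np.logical_and(pred, label).sum()             # changed pixels correctly detected
    fp = np.logical_and(pred, ~label).sum()            # false alarms
    fn = np.logical_and(~pred, label).sum()            # missed changes
    precision = tp / (tp + fp + 1e-10)
    recall = tp / (tp + fn + 1e-10)
    f1 = 2 * precision * recall / (precision + recall + 1e-10)
    iou = tp / (tp + fp + fn + 1e-10)
    return f1, iou

# Worked example on a 2x2 map: 2 hits, 1 false alarm, 0 misses.
print(f1_and_iou([[1, 0], [1, 1]], [[1, 0], [0, 1]]))  # ~ (0.80, 0.67)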
