2022, Vol. 26, Issue (8): 1636-1649

DOI: 10.11834/jrs.20221577

Received: 2021-09-06

Revised:

An attention capsule network algorithm for building extraction from high-resolution remote sensing imagery
XU Zhengsen1, GUAN Haiyan1, PENG Daifeng1, YU Yongtao2, LEI Xiangda1, ZHAO Haohao1
1. School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; 2. School of Computer and Software Engineering, Huaiyin Institute of Technology, Huai'an 223003, China
Abstract:

Automatic building extraction from high-resolution remote sensing imagery is of great significance for disaster prevention and mitigation, disaster loss estimation, urban planning, and topographic mapping. However, commonly used conventional convolutional neural network models suffer from strong variability but weak equivariance. To address this problem, this paper proposes a general building-extraction model based on a dual-attention capsule encoder-decoder network, DA-CapsNet, with channel and spatial attention. The model enhances the high-level feature representation of buildings in high-resolution remote sensing imagery through capsule convolutions and a spatial-channel dual-attention module, enabling accurate extraction of occluded building parts and reliable discrimination between buildings and non-building impervious surfaces. The model first uses a capsule encoder-decoder structure to extract and fuse multiscale building capsule features, yielding a high-quality building feature representation. Channel- and spatial-attention modules are then designed to further enhance the contextual semantic information of buildings and improve model performance. Experiments on three high-resolution building datasets achieve an average precision, recall, and F1-score of 92.15%, 92.07%, and 92.18%, respectively. The results show that the proposed DA-CapsNet effectively overcomes the effects of spatial heterogeneity, identical objects with different spectra, different objects with identical spectra, and shadow occlusion in high-resolution remote sensing imagery, enabling highly accurate automatic building extraction in complex environments.
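As a concrete illustration of the capsule features mentioned above, the following is a minimal PyTorch sketch of dynamic routing between two capsule layers, the mechanism that capsule convolutions build on. It is not the authors' DA-CapsNet implementation: the module name RoutingCapsules, the squash helper, and the capsule counts, dimensions, and three routing iterations are illustrative assumptions.

# Minimal sketch of dynamic routing between capsule layers.
# NOT the authors' DA-CapsNet code; shapes and iteration count are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Squashing non-linearity: keeps vector orientation, maps norm into (0, 1)."""
    norm2 = (s * s).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)


class RoutingCapsules(nn.Module):
    """Routes n_in input capsules (dim d_in) to n_out output capsules (dim d_out)."""

    def __init__(self, n_in, d_in, n_out, d_out, iters=3):
        super().__init__()
        self.iters = iters
        # One transformation matrix per (input capsule, output capsule) pair.
        self.W = nn.Parameter(0.01 * torch.randn(n_in, n_out, d_out, d_in))

    def forward(self, u):                                     # u: (B, n_in, d_in)
        # Prediction vectors u_hat[b, i, j] = W[i, j] @ u[b, i]
        u_hat = torch.einsum('ijkl,bil->bijk', self.W, u)     # (B, n_in, n_out, d_out)
        b = torch.zeros(u.size(0), self.W.size(0), self.W.size(1), device=u.device)
        for _ in range(self.iters):
            c = F.softmax(b, dim=2)                           # coupling coefficients
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))  # (B, n_out, d_out)
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)      # agreement update
        return v


if __name__ == "__main__":
    caps_in = torch.randn(2, 32, 8)                           # 32 input capsules of dim 8
    print(RoutingCapsules(32, 8, 10, 16)(caps_in).shape)      # torch.Size([2, 10, 16])

A full capsule encoder-decoder would stack convolutional capsule layers of this kind and add a decoder that upsamples the routed capsule features back to the image grid.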

A dual-attention capsule network for building extraction from high-resolution remote sensing imagery
Abstract:

Automatic extraction of buildings from high-resolution remote sensing images is of great importance for disaster prevention and mitigation, disaster loss estimation, urban planning, and topographic mapping. With advances in the resolution and quality of optical remote sensing imagery, remote sensing images have become an important data source for the rapid updating of building footprint databases. Despite the large number of algorithms proposed with ever-improving performance, highly accurate and fully automated building extraction from remote sensing images remains difficult because of challenging scenarios such as color diversity, topology variations, occlusions, and shadow cover. Exploiting advanced, high-performance techniques to further improve the accuracy and automation of building extraction is therefore of great value and urgently required by a wide range of applications.

To overcome the strong variability and weak equivariance of traditional convolutional neural networks, we propose a novel dual-attention capsule encoder-decoder network, DA-CapsNet, for building extraction. In this network, a deep capsule encoder-decoder, together with channel-spatial attention blocks, is developed to enhance the extraction of high-level feature information from very high resolution remotely sensed images. The model is thus able to extract buildings covered by shadows and to discriminate buildings from non-building impervious surfaces. Specifically, we first employ the deep capsule encoder-decoder to extract and fuse multiscale building capsule features, producing a high-quality building feature representation. Spatial- and channel-attention modules are then designed to rectify and enhance the captured contextual information, yielding competitive performance on buildings in diverse challenging scenarios. The contributions are twofold: (1) a deep capsule encoder-decoder network that generates a high-quality feature representation, and (2) channel and spatial attention modules that highlight channel-wise salient features and focus on class-specific spatial features.

The proposed DA-CapsNet was evaluated on three datasets: a Google Building dataset and two publicly available benchmarks, the Wuhan (WUH) and Massachusetts (MA) datasets. The experiments achieved competitive performance, with an average precision, recall, and F1-score of 92.15%, 92.07%, and 92.18%, respectively, across buildings in varying challenging scenarios. In terms of F1-score, DA-CapsNet achieved 92.70%, 94.01%, and 89.84% on the Google, WUH, and MA datasets, respectively. Comparative studies further confirmed the robust applicability and superior performance of DA-CapsNet in building extraction tasks.
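To illustrate the channel and spatial attention blocks described above, the sketch below shows a CBAM-style dual-attention module in PyTorch. It is a hypothetical stand-in rather than the published DA-CapsNet code: the class names, the reduction ratio of 16, the 7x7 spatial kernel, and the channel-then-spatial ordering are assumptions, and the capsule encoder-decoder backbone the block would attach to is omitted.

# Hypothetical CBAM-style dual-attention block, not the published DA-CapsNet code.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Re-weights feature channels using pooled global descriptors."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))            # global average pooling -> (B, C)
        mx = self.mlp(x.amax(dim=(2, 3)))             # global max pooling -> (B, C)
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)  # channel weights in (0, 1)
        return x * w


class SpatialAttention(nn.Module):
    """Highlights class-specific locations with a single 7x7 convolution."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)             # (B, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)              # (B, 1, H, W)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w


class DualAttention(nn.Module):
    """Channel attention followed by spatial attention (ordering is an assumption)."""

    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))


if __name__ == "__main__":
    feats = torch.randn(2, 64, 128, 128)              # dummy decoder feature maps
    print(DualAttention(64)(feats).shape)             # torch.Size([2, 64, 128, 128])

In a DA-CapsNet-like pipeline, such a block would typically be applied to the fused decoder feature maps before the final segmentation head.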

