Abstract
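As artificial intelligence techniques such as machine learning and deep learning continue to be applied and developed in remote sensing, data-driven models built on massive sample sets have become a new research paradigm for information extraction from remote sensing imagery, placing higher demands on the scale, quality, and diversity of sample data. Recently, many scholars and research institutions at home and abroad have released a series of remote sensing image sample datasets, laying a research foundation for information extraction and intelligent interpretation of remote sensing imagery in the big data era. However, a comprehensive analysis of these sample datasets is still lacking. To address this gap, on the basis of literature retrieval and analysis, this paper summarizes 124 influential and widely used remote sensing image sample datasets and analyzes their metadata; traces how data sources, application fields, and keywords have evolved; analyzes differences among the datasets in spatial, temporal, and spectral resolution; divides them by application field into eight categories, including scene recognition, land cover/land use classification, thematic feature extraction, change detection, object detection, and semantic segmentation, with selected datasets examined in detail; and summarizes research progress of deep learning models on these datasets. To address model overfitting caused by sparse samples, it also discusses the application prospects of spatio-temporal sample transfer, few-shot and zero-shot learning, active sample discovery, and sample generation in remote sensing image information extraction. This paper is the first review of remote sensing image sample datasets and can serve as a data reference for researchers in related fields.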
With the rapid development of artificial intelligence technologies such as machine learning and deep learning in remote sensing, data-driven models have become a new research paradigm for automatic information extraction from remote sensing imagery, placing higher requirements on the quantity, quality, and diversity of sample datasets. Before the era of deep learning, because classical machine learning methods (e.g., support vector machine and random forest) do not require huge numbers of samples for model training, the previously published sample datasets usually had a relatively small size (i.e., less than 100). In recent years, with the rapid development of technologies such as big data, parallel computing, and deep learning, many scholars and research institutions have released a series of sample datasets, laying a solid foundation for a wide range of research and applications such as scene understanding, semantic segmentation, and object detection from remote sensing images. However, a comprehensive review of the recently published sample datasets for remote sensing image analysis in the context of big data and deep learning is still lacking. Therefore, the objective of this study is to summarize and analyze these datasets to provide a valuable data reference for relevant researchers. On the basis of literature retrieval and analysis, this paper summarized a total of 124 widely used, open-access, and influential remote sensing image sample datasets that were published between 2001 and 2020. Using metadata analysis, we reviewed the development of these recently published sample datasets in terms of data sources, application fields, keywords, and data size. Afterward, we analyzed the sample datasets from the perspective of spatial, spectral, and temporal resolutions.
We listed the deep learning models commonly used in the remote sensing field (e.g., convolutional neural networks, recurrent neural networks, and generative adversarial networks) to show how these sample datasets could be used. We also divided the remote sensing image sample datasets into eight categories based on the following application fields: scene recognition, land cover/land use classification, thematic information extraction, change detection, ground-object detection, semantic segmentation, quantitative remote sensing, and other applications. The typical datasets and related research progress were carefully reviewed for each application field. In addition, because deep learning models are data-hungry, how to train a model that generalizes well from limited labeled data has become a significant issue, especially for remote sensing applications, where obtaining sufficient labeled samples is time-consuming. To address this issue, we discussed several methods that could increase the model’s generalization capability, including sample transfer between spatio-temporal domains; few-shot and zero-shot learning; active learning and semi-supervised learning for sample discovery; and sample generation through generative adversarial networks. By means of multi-dimensional analysis, we give a comprehensive overview of remote sensing image sample datasets. To the best of our knowledge, this paper is the first review of remote sensing image sample datasets for deep learning, potentially providing a data reference for researchers in related fields.