High-Resolution Remote Sensing Imagery for the Recognition of Traditional Villages


Traditional villages
Building rooftops
High spatial resolution remote sensing
Instance segmentation



Submitted: 2023-12-30
Accepted: 2024-01-14
Published: 2024-01-29


Traditional Chinese villages, vital carriers of traditional culture, have been significantly altered by urbanization in recent years, creating an urgent need for AI-assisted data updates. This study combines high spatial resolution remote sensing imagery with deep learning to propose a method for identifying the rooftops of traditional Chinese village buildings. Using 0.54 m spatial resolution imagery of traditional village areas as the data source, the method analyzes the geometric and spectral image characteristics of village building rooftops and constructs a deep learning feature sample library tailored to the target types. A semantically enhanced, improved Mask R-CNN (Mask Region-based Convolutional Neural Network) is used for building recognition, and experiments are conducted on local imagery from different regions. The results demonstrate that the modified Mask R-CNN effectively identifies traditional village building rooftops, achieving scores of 0.7520 and 0.7400 on the two evaluation metrics used. It reduces the misidentification and missed detections caused by feature heterogeneity. The method offers a viable and effective approach for industrialized data monitoring of traditional villages, contributing to their sustainable development.
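To make the 0.54 m spatial resolution concrete, the following is a minimal sketch, not the authors' code, of how a predicted rooftop instance mask could be converted into an approximate ground area. It assumes square pixels and a binary mask per rooftop; the function name `mask_area_m2` and the example mask are hypothetical and introduced only for illustration.

```python
from typing import Sequence

# Ground sampling distance in metres per pixel, taken from the abstract.
GSD_M = 0.54


def mask_area_m2(mask: Sequence[Sequence[int]], gsd_m: float = GSD_M) -> float:
    """Approximate the ground area covered by one binary instance mask.

    Assumes square pixels, so each rooftop pixel covers gsd_m * gsd_m
    square metres on the ground.
    """
    pixel_count = sum(1 for row in mask for value in row if value)
    return pixel_count * gsd_m ** 2


# Hypothetical 4 x 5 instance mask for a single rooftop (1 = rooftop pixel).
example_mask = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

area = mask_area_m2(example_mask)
print(f"{area:.4f} m^2")
```

At this resolution a single pixel covers roughly 0.29 m², so the nine rooftop pixels in the example correspond to about 2.62 m² on the ground.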

