Panoramic Glass Image Segmentation Network

  • Guanlin Pan, School of Electronic and Information Engineering, Wuyi University, Jiangmen, Guangdong, 529000, China
  • Yan Cui, School of Electronic and Information Engineering, Wuyi University, Jiangmen, Guangdong, 529000, China
  • Qingling Chang, School of Electronic and Information Engineering, Wuyi University, Jiangmen, Guangdong, 529000, China
  • Kelin Li, School of Electronic and Information Engineering, Wuyi University, Jiangmen, Guangdong, 529000, China
  • Yangtao Ou, School of Electronic and Information Engineering, Wuyi University, Jiangmen, Guangdong, 529000, China
  • Haohui Yu, School of Electronic and Information Engineering, Wuyi University, Jiangmen, Guangdong, 529000, China
Keywords: Glass segmentation, Machine learning, Panoramic segmentation

Abstract

In panoramic images, the geometric distortion introduced by wide-angle lenses makes it difficult for traditional semantic segmentation methods to segment glass regions accurately. To address the challenges of capturing spatial features and integrating contextual information, we propose the Panoramic Glass Image Segmentation Network (PGISNet). The network combines the Matrix Decomposition Base Module (MDBM), the Transparent Perception Consistency Module (TACM), the Context and Texture Compensation Module (CTCM), and the Multi-scale Gated Context Attention Module (MGCA) into a progressive feature-processing pipeline. On the PanoGlassV2 benchmark, PGISNet achieves 90.03% IoU, 94.76% F-score, and 94.0% PA, significantly outperforming existing methods and demonstrating its effectiveness for glass segmentation in panoramic images.
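
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how IoU, F-score, and pixel accuracy (PA) are commonly computed for a binary glass mask. It is not the authors' evaluation code. The function name `glass_metrics`, the nested-list mask format, and the choice of beta squared = 1 (the F1 score) are assumptions made for illustration; the abstract does not state which F-measure weighting was used.

```python
def glass_metrics(pred, target, beta2=1.0):
    """Return (IoU, F-score, pixel accuracy) for binary glass masks.

    pred, target: equally sized 2-D lists of 0/1, where 1 marks glass.
    beta2: beta squared in the F-measure; 1.0 gives F1 (an assumption,
           the source does not specify the weighting).
    """
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # glass predicted, glass present
            elif p:
                fp += 1      # glass predicted, none present
            elif t:
                fn += 1      # glass missed
            else:
                tn += 1      # background correctly rejected

    union = tp + fp + fn
    iou = tp / union if union else 1.0  # both masks empty: perfect overlap

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta2 * precision + recall
    f_score = (1 + beta2) * precision * recall / denom if denom else 0.0

    total = tp + fp + fn + tn
    pa = (tp + tn) / total if total else 0.0
    return iou, f_score, pa


# Toy 2x3 example (hypothetical data, not from PanoGlassV2).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou, f_score, pa = glass_metrics(pred, target)
print(f"IoU={iou:.4f}  F={f_score:.4f}  PA={pa:.4f}")
```

Benchmark figures such as those quoted above are typically obtained by averaging these per-image values, or by accumulating the pixel counts, over the whole test set; which of the two the authors used is not stated here.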

Published
2025-12-16