PSCNET: EFFICIENT RGB-D SEMANTIC SEGMENTATION PARALLEL NETWORK BASED ON SPATIAL AND CHANNEL ATTENTION
- 1School of Architecture and Urban Planning, Research Institute for Smart Cities, Shenzhen University, Shenzhen, P.R. China
- 2Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen, P.R. China
- 3School of Resource and Environmental Sciences, Wuhan University, Wuhan 430072, P.R. China
Keywords: Deep Learning, Semantic Segmentation, RGB-D Fusion, Channel Attention, Spatial Attention
Abstract. RGB-D semantic segmentation is a key technology for indoor semantic map construction. Traditional RGB-D semantic segmentation networks often suffer from redundant parameters and modules. In this paper, an improved semantic segmentation network, PSCNet, is designed to reduce redundant parameters and make the model easier to implement. Based on the DeepLabv3+ framework, we improve the original model in three ways: attention module selection, backbone simplification, and Atrous Spatial Pyramid Pooling (ASPP) module simplification. Specifically, we use spatial-channel co-attention, remove the last module from the Depth Backbone, and redesign the ASPP as WW-ASPP using depthwise convolution. Compared to DeepLabv3+, the proposed PSCNet has approximately the same number of parameters but achieves a 5% improvement in MIoU. Meanwhile, PSCNet achieves inference at 47 FPS on an RTX 3090, which is much faster than state-of-the-art semantic segmentation networks.
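To illustrate the parameter saving behind the WW-ASPP redesign mentioned above, the sketch below implements a depthwise 3x3 atrous (dilated) convolution followed by a 1x1 pointwise convolution in plain NumPy. This is a minimal, hypothetical sketch of the general depthwise-separable technique, not the authors' actual WW-ASPP module: a depthwise branch uses C*9 weights per dilation rate instead of the C_in*C_out*9 weights of a standard atrous convolution.

```python
import numpy as np

def depthwise_atrous_conv(x, kernels, dilation=1):
    """Depthwise 3x3 atrous convolution with 'same' padding.

    x:       (C, H, W) input feature map.
    kernels: (C, 3, 3) -- one 3x3 kernel per channel, so each channel
             is filtered independently (C*9 parameters total, versus
             C_in*C_out*9 for a standard convolution).
    dilation: spacing between kernel taps (the atrous rate).
    """
    C, H, W = x.shape
    pad = dilation  # effective radius of a dilated 3x3 kernel
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((C, H, W), dtype=float)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                acc = 0.0
                for ki in range(3):
                    for kj in range(3):
                        acc += kernels[c, ki, kj] * \
                               xp[c, i + ki * dilation, j + kj * dilation]
                out[c, i, j] = acc
    return out

def pointwise_conv(x, weights):
    """1x1 convolution that mixes channels: weights is (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', weights, x)

# One hypothetical separable ASPP branch: depthwise atrous, then pointwise.
def separable_aspp_branch(x, dw_kernels, pw_weights, dilation):
    return pointwise_conv(depthwise_atrous_conv(x, dw_kernels, dilation),
                          pw_weights)
```

A full ASPP head would run several such branches at different dilation rates and concatenate their outputs; in a real network the loops would of course be replaced by a grouped convolution (e.g. `torch.nn.Conv2d` with `groups=C`).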