Can GCNs Go as Deep as CNNs?
Figure 1. Our GCN network architecture for point cloud semantic segmentation. Left: Our framework consists of three blocks (one GCN Backbone Block, one Fusion Block, and one MLP Prediction Block). Right: We mainly study three types of GCN Backbone Block, i.e., PlainGCN, ResGCN, and DenseGCN. There are two kinds of GCN skip connections: vertex-wise additions and vertex-wise concatenations. k is the number of nearest neighbors in GCN layers, f is the number of filters or hidden units, and d is the dilation rate.
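The two kinds of skip connection named in Figure 1 can be illustrated with a minimal NumPy sketch. It is not the architecture from the figure: `toy_graph_layer` is a hypothetical stand-in that applies only a per-vertex linear map and ReLU, whereas a real GCN layer would also aggregate features from the k nearest neighbors. The point is only to show how vertex-wise addition keeps the feature width f fixed while vertex-wise concatenation grows it.

```python
import numpy as np


def toy_graph_layer(x, weight):
    """Hypothetical stand-in for a GCN layer: per-vertex linear map + ReLU.

    A real layer would also aggregate features over each vertex's neighbors.
    """
    return np.maximum(x @ weight, 0.0)


def residual_block(x, weight):
    # Vertex-wise addition: the layer output is added to its input,
    # so the feature width stays the same.
    return toy_graph_layer(x, weight) + x


def dense_block(x, weight):
    # Vertex-wise concatenation: the input features are kept and the
    # layer output is appended, so the feature width grows.
    return np.concatenate([x, toy_graph_layer(x, weight)], axis=1)


rng = np.random.default_rng(0)
n_vertices, f = 5, 4                      # illustrative sizes only
x = rng.standard_normal((n_vertices, f))  # one feature row per vertex
w = rng.standard_normal((f, f))

res_out = residual_block(x, w)    # shape (5, 4)
dense_out = dense_block(x, w)     # shape (5, 8)
```

In this sketch the residual variant requires the layer's output width to match its input width, while the dense variant carries every earlier feature forward unchanged in the leading columns.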
Figure 2. Training Deep GCNs. Left: We present here the training loss for GCNs with 7, 14, 28, 56 layers, with and without residual connections. We note how adding more layers without residual connections translates to substantially higher loss. Right: In contrast, training GCNs with residual connections results in consistent stability for all depths.
Convolutional Neural Networks (CNNs) achieve impressive results in a wide variety of fields. Their success was greatly boosted by the ability to train very deep CNN models. Despite these positive results, CNNs fail to properly address problems with non-Euclidean data. To overcome this challenge, Graph Convolutional Networks (GCNs) build graphs to represent non-Euclidean data and borrow concepts from CNNs to train models on these graphs. GCNs show promising results, but they are limited to very shallow models due to the vanishing gradient problem. As a result, most state-of-the-art GCN algorithms are no deeper than 3 or 4 layers. In this work, we present new ways to successfully train very deep GCNs. We borrow concepts from CNNs, mainly residual/dense connections and dilated convolutions, and adapt them to GCN architectures. Through extensive experiments, we show the positive effect of these deep GCN frameworks. Finally, we use these new concepts to build a very deep 56-layer GCN and show that it significantly boosts performance (+3.7% mIoU over the state of the art) on the task of point cloud semantic segmentation.
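To make the idea of a dilated neighborhood concrete, here is a minimal NumPy sketch. The abstract does not spell out the exact rule, so the selection scheme below is an assumption: rank each point's neighbors by distance, consider the k·d nearest, and keep every d-th one. The function name `dilated_knn` and the brute-force distance computation are illustrative choices, not the authors' implementation.

```python
import numpy as np


def dilated_knn(points, k, d):
    """Return, per point, the indices of k neighbors with dilation rate d.

    Assumed rule: sort the other points by distance, take the k*d nearest,
    and keep every d-th of them. With d=1 this is ordinary k-NN.
    Requires k*d <= number of points - 1.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)      # pairwise squared distances
    np.fill_diagonal(dist, np.inf)         # never select the point itself
    order = np.argsort(dist, axis=1)       # nearest first
    return order[:, : k * d : d]


# Ten points on a line, so neighbor order is easy to check by hand.
points = np.arange(10, dtype=float).reshape(-1, 1)

plain = dilated_knn(points, k=2, d=1)      # row 0 is [1, 2]
dilated = dilated_knn(points, k=2, d=2)    # row 0 is [1, 3]
```

Under this assumed rule, a larger d widens the region each vertex draws from without increasing the number of neighbors k that are actually aggregated.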
Figure 3. Qualitative Results for S3DIS Semantic Segmentation. We show here the effect of adding residual and dense graph connections to deep GCNs. PlainGCN, ResGCN, and DenseGCN are identical except for the presence of residual graph connections in ResGCN and dense graph connections in DenseGCN. Both residual and dense graph connections have a substantial effect on hard classes like board, bookcase, and sofa, which are lost in the results of PlainGCN.
Figure 4. Qualitative Results for S3DIS Semantic Segmentation. We show the importance of stochastic dilated convolutions.
Figure 5. Qualitative Results for S3DIS Semantic Segmentation. We show the importance of the number of nearest neighbors used in the convolutions.
Figure 6. Qualitative Results for S3DIS Semantic Segmentation. We show the importance of network depth (number of layers).
Figure 7. Qualitative Results for S3DIS Semantic Segmentation. We show the importance of network width (number of filters per layer).
Figure 8. Qualitative Results for S3DIS Semantic Segmentation. We show the benefit of a wider and deeper network even with only half the number of nearest neighbors.