Accelerating CNN via Dynamic Pattern-based Pruning Network

CIKM 2022

Abstract

Recently, dynamic pruning methods have been actively researched, as they have shown remarkable effectiveness in reducing the computational complexity of deep neural networks. Nevertheless, most dynamic pruning methods fail to achieve actual acceleration due to the extra overhead of indexing and weight-copying required to realize the dynamic sparse patterns for every input sample. To address this issue, we propose the Dynamic Pattern-based Pruning Network (DPPNet), which preserves the advantages of both static and dynamic networks. First, our method statically prunes the weight kernel into various sparse patterns. Then, the dynamic convolution kernel is generated by aggregating input-dependent attention weights and the static kernels. Unlike previous dynamic pruning methods, our method dynamically fuses static kernel patterns, enhancing the kernel's representational power without additional overhead. Moreover, our dynamic sparse pattern enables efficient processing with standard BLAS libraries, achieving actual acceleration. We demonstrate the effectiveness of the proposed DPPNet on CIFAR and ImageNet, outperforming state-of-the-art methods by achieving better accuracy at lower computational cost. For example, on ImageNet classification, ResNet34 with the DPP module achieves state-of-the-art performance with a 65.6% FLOPs reduction and a 35.9% increase in inference speed without loss in accuracy.
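
To make the kernel-fusion step concrete, the following is a minimal PyTorch sketch of the idea, assuming a lightweight attention branch and a set of fixed binary pattern masks; the names (DPPConv, num_patterns) and the random stand-in masks are illustrative assumptions, not the authors' implementation. Because the input-dependent attention fuses the statically pruned kernels into one ordinary dense-storage weight tensor, the convolution can run through the standard im2col/GEMM (BLAS) path with no per-input indexing or weight-copying.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DPPConv(nn.Module):
    """Illustrative sketch: input-dependent attention fuses statically
    pruned kernel patterns into a single dense-storage kernel."""

    def __init__(self, in_ch, out_ch, k=3, num_patterns=4):
        super().__init__()
        # One weight tensor per static sparse pattern.
        self.weights = nn.Parameter(
            torch.randn(num_patterns, out_ch, in_ch, k, k) * 0.01)
        # Fixed binary pattern masks chosen once at (static) pruning time;
        # random masks stand in here for the patterns used in the paper.
        self.register_buffer(
            "masks", (torch.rand(num_patterns, 1, 1, k, k) > 0.5).float())
        # Lightweight branch producing per-input attention over patterns.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, num_patterns), nn.Softmax(dim=1))
        self.pad = k // 2

    def forward(self, x):
        a = self.attn(x)                      # (B, P) attention weights
        sparse_w = self.weights * self.masks  # statically pruned kernels
        # Fuse the pattern kernels per sample: one aggregated kernel each.
        fused = torch.einsum("bp,poikl->boikl", a, sparse_w)  # (B,O,I,k,k)
        b, c, h, w = x.shape
        # Per-sample convolution via the standard grouped-conv trick;
        # the fused kernel is stored densely, so no sparse indexing occurs.
        out = F.conv2d(x.reshape(1, b * c, h, w),
                       fused.reshape(-1, c, *fused.shape[-2:]),
                       padding=self.pad, groups=b)
        return out.reshape(b, -1, out.shape[-2], out.shape[-1])

For example, DPPConv(64, 64)(torch.randn(8, 64, 32, 32)) returns an (8, 64, 32, 32) tensor, with each sample convolved by its own fused kernel.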

Challenge

Contribution

  • Efficiently replace the redundant indexing and weight-copying process performed for every input.

  • Enhance model performance while achieving actual acceleration at the same time.

Method Overview

Experimental Results (CIFAR-10 & ImageNet)

Hyperparameters Affecting Performance


Visualization

  • Dynamic Pattern for Each Input

  • Pattern Shape for Each Layer

Conclusion

  • DPPNet is a hardware-friendly pruning method that enhances the model's representational power at low computational cost.

  • Resolves the overheads caused by indexing and weight-copying, and focuses on effective kernel shapes while reducing the model's redundancy.

  • Achieves computation reduction and real-world acceleration without accuracy degradation.


Citation

@inproceedings{lee2022accelerating,
  title     = {Accelerating CNN via Dynamic Pattern-based Pruning Network},
  author    = {Gwanghan Lee and Saebyeol Shin and Simon S. Woo},
  booktitle = {Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM '22)},
  year      = {2022}
}