Accelerating CNN via Dynamic Pattern-based Pruning Network
CIKM 2022
Gwanghan Lee (ican0016@g.skku.edu), Saebyeol Shin (toquf930@g.skku.edu), Simon S. Woo (swoo@g.skku.edu)
Sungkyunkwan University
Abstract
Recently, dynamic pruning methods have been actively researched, as they have shown remarkable effectiveness in reducing the computational complexity of deep neural networks. Nevertheless, most dynamic pruning methods fail to achieve actual acceleration due to the extra overheads of indexing and weight-copying required to realize the dynamic sparse patterns for every input sample. To address this issue, we propose the Dynamic Pattern-based Pruning Network (DPPNet), which preserves the advantages of both static and dynamic networks. First, our method statically prunes the weight kernel into various sparse patterns. Then, a dynamic convolution kernel is generated by aggregating the static kernels with input-dependent attention weights. Unlike previous dynamic pruning methods, our novel method dynamically fuses static kernel patterns, enhancing the kernel's representational power without additional overhead. Moreover, our dynamic sparse pattern enables efficient processing with BLAS libraries, accomplishing actual acceleration. We demonstrate the effectiveness of the proposed DPPNet on CIFAR and ImageNet, outperforming state-of-the-art methods by achieving better accuracy at lower computational cost. For example, on ImageNet classification, ResNet-34 with our DPP module achieves state-of-the-art performance with a 65.6% FLOPs reduction and a 35.9% increase in inference speed, without loss in accuracy.
Challenge
Indexing and weight-copying overheads, incurred to realize a dynamic sparse pattern for every input sample, prevent most dynamic pruning methods from achieving actual acceleration.
Contribution
Efficiently replace the process that is redundantly repeated for every input.
Enhance model performance while enabling actual acceleration at the same time.
Method Overview
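As described in the abstract, DPPNet statically prunes kernels into sparse patterns and fuses them at run time with input-dependent attention. Below is a minimal PyTorch sketch of that mechanism; all names (DPPConv, num_patterns, sparsity), the randomly chosen binary masks (the paper designs its patterns), and the gating network are illustrative assumptions, not the authors' implementation. The aggregation step mirrors the attention-weighted kernel mixing familiar from dynamic convolution.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DPPConv(nn.Module):
    """Sketch: K statically pruned kernel patterns, fused per sample by
    input-dependent attention into one kernel before a single convolution."""
    def __init__(self, in_ch, out_ch, k=3, num_patterns=4, sparsity=0.5):
        super().__init__()
        self.pad = k // 2
        # One weight tensor per static sparse pattern.
        self.weights = nn.Parameter(
            torch.randn(num_patterns, out_ch, in_ch, k, k) * 0.02)
        # Fixed binary masks: each pattern statically zeroes part of the k*k
        # positions (random here; the paper designs these sparse patterns).
        masks = (torch.rand(num_patterns, 1, 1, k, k) > sparsity).float()
        self.register_buffer("masks", masks)
        # Lightweight gate producing per-sample attention over patterns.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, num_patterns))

    def forward(self, x):
        b = x.size(0)
        attn = F.softmax(self.gate(x), dim=1)               # (B, K)
        sparse_w = self.weights * self.masks                # statically pruned kernels
        # Fuse: attention-weighted sum of the K patterns -> one kernel per sample.
        w = torch.einsum("bk,koihw->boihw", attn, sparse_w) # (B, O, I, k, k)
        # Grouped-conv trick: run every sample with its own fused kernel in a
        # single dense convolution call, with no per-sample weight-copying loop.
        out = F.conv2d(x.reshape(1, -1, *x.shape[2:]),
                       w.reshape(-1, *w.shape[2:]),
                       padding=self.pad, groups=b)
        return out.reshape(b, -1, *out.shape[2:])

For example, y = DPPConv(64, 64)(torch.randn(8, 64, 56, 56)) returns an (8, 64, 56, 56) tensor. The property mirrored from the paper is that attention fuses the kernels before the convolution, so no per-sample indexing or weight copying survives into the convolution itself.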
Experimental Results (CIFAR-10 & ImageNet)
Hyperparameters Affecting Performance
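In terms of the DPPConv sketch above, the natural knobs are the number of patterns and the per-pattern sparsity. A rough, hypothetical estimate of how they trade off theoretical FLOPs (a standard dense-conv count scaled by the kept fraction of kernel positions, plus the small gating and fusion cost; names carried over from the sketch, not the paper's accounting):

def dpp_conv_flops(in_ch, out_ch, k, h, w, num_patterns, sparsity):
    # Standard conv cost: 2 * MACs over all output positions.
    dense = 2 * in_ch * out_ch * k * k * h * w
    kept = dense * (1.0 - sparsity)                   # pattern pruning skips zeros
    gate = 2 * in_ch * num_patterns                   # tiny attention head
    fuse = 2 * num_patterns * out_ch * in_ch * k * k  # kernel aggregation
    return kept + gate + fuse

# E.g. a 3x3, 64->64 layer at 56x56: 50% sparsity roughly halves the conv cost,
# while the gating and fusion terms stay negligible by comparison.
print(dpp_conv_flops(64, 64, 3, 56, 56, num_patterns=4, sparsity=0.5))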
Visualization
Dynamic Pattern for Each Input
Pattern Shape for Each Layer
Conclusion
DPPNet is a hardware-friendly pruning method that enhances the model's representational power at low computational cost.
It resolves the overheads caused by indexing and weight copying, and focuses on effective kernel shapes while reducing the model's redundancy; the fused kernels map onto standard BLAS routines (see the sketch below).
It achieves computation reduction and real-world acceleration without accuracy degradation.
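To make the BLAS point concrete, here is a minimal NumPy sketch of why a fused kernel is hardware friendly: once attention has collapsed the sparse patterns into one kernel, the convolution lowers to a single im2col plus one GEMM, with no gather/scatter over sparse indices. The function name and shapes are illustrative assumptions, and np.matmul stands in for the optimized BLAS sgemm used in practice.

import numpy as np

def conv2d_gemm(x, w):
    # x: (C, H, W) input, w: (O, C, k, k) fused kernel -> (O, H-k+1, W-k+1).
    C, H, W = x.shape
    O, _, k, _ = w.shape
    Ho, Wo = H - k + 1, W - k + 1
    # im2col: unfold each k x k receptive field into one column.
    cols = np.empty((C * k * k, Ho * Wo), dtype=x.dtype)
    for i in range(Ho):
        for j in range(Wo):
            cols[:, i * Wo + j] = x[:, i:i + k, j:j + k].ravel()
    # One dense GEMM performs the whole convolution
    # (the single BLAS call; no per-sample indexing remains).
    return (w.reshape(O, -1) @ cols).reshape(O, Ho, Wo)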
Citation
@inproceedings{lee2022accelerating,
title = {Accelerating CNN via Dynamic Pattern-based Pruning Network},
author = {Gwanghan Lee and Saebyeol Shin and Simon S. Woo},
booktitle = {Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM '22)},
year = {2022}
}