Saif M. Khan and Alexander Mann, "AI Chips: What They Are and Why They Matter" (Center for Security and Emerging Technology, April 2020), cset.georgetown.edu/research/ai-chips-what-they-are-and-why-they-matter/. https://doi.org/10.51593/20190014
Zhang, Qiyang, et al. "A comprehensive benchmark of deep learning libraries on mobile devices." Proceedings of the ACM Web Conference 2022. 2022.
https://dl.acm.org/doi/abs/10.1145/3485447.3512148
Courville, Vanessa, and Vahid Partovi Nia. "Deep learning inference frameworks for ARM CPU." Journal of Computational Vision and Imaging Systems 5.1 (2019): 3-3.
https://openjournals.uwaterloo.ca/index.php/vsl/article/download/1645/2014
Ignatov, Andrey, et al. "Learned smartphone ISP on mobile NPUs with deep learning, Mobile AI 2021 challenge: Report." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
Gómez-Luna, Juan, et al. "Benchmarking a new paradigm: An experimental analysis of a real processing-in-memory architecture." arXiv preprint arXiv:2105.03814 (2021).
https://arxiv.org/abs/2105.03814
Jiang, Jiantong, et al. "Boyi: A systematic framework for automatically deciding the right execution model of OpenCL applications on FPGAs." Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 2020.
https://dl.acm.org/doi/abs/10.1145/3373087.3375313
Ignatov, Andrey, et al. "AI Benchmark: All about deep learning on smartphones in 2019." 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). IEEE, 2019.
https://ieeexplore.ieee.org/abstract/document/9022101
Buch, Michael, et al. "AI tax in mobile SoCs: End-to-end performance analysis of machine learning in smartphones." 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). IEEE, 2021.
Weights & Biases
wandb.ai/site
TensorBoard
www.tensorflow.org/tensorboard
TensorBoard tutorial
https://www.tensorflow.org/tensorboard/get_started
Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size." arXiv preprint arXiv:1602.07360 (2016).
https://arxiv.org/abs/1602.07360
Zhang, Xiangyu, et al. "ShuffleNet: An extremely efficient convolutional neural network for mobile devices." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
Howard, Andrew G., et al. "MobileNets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).
https://arxiv.org/abs/1704.04861
Huang, Gao, et al. "Densely connected convolutional networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
Gou, Jianping, et al. "Knowledge distillation: A survey." International Journal of Computer Vision 129.6 (2021): 1789-1819.
https://link.springer.com/article/10.1007/s11263-021-01453-z
Dive into Deep Learning. 14.2 - Computer Vision - Fine-Tuning (https://d2l.ai/chapter_computer-vision/fine-tuning.html)
MLCommons: https://mlcommons.org/en/
GPU architecture model: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
Pascal (2016): https://developer.nvidia.com/pascal
Volta (2018): https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf
Turing (2019): https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf
Ampere (2020): https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html
Hopper (2022): https://resources.nvidia.com/en-us-tensor-core/gtc22-whitepaper-hopper
ISSCC 2020 Tutorial: How to Evaluate Deep Neural Network Processors: https://eems.mit.edu/wp-content/uploads/2020/09/ieee_mssc_summer2020.pdf
Jacob, Benoit, et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. 2017 (https://arxiv.org/abs/1712.05877)
Xiangxiang Chu, Liang Li, Bo Zhang. Make RepVGG Greater Again: A Quantization-aware Approach. 2023 (https://arxiv.org/pdf/2212.01593)
Angshuman Parashar, Yannan Nellie Wu, Po-An Tsai, Vivienne Sze, Joel S. Emer. Timeloop/Accelergy Tutorial: Tools for Evaluating Deep Neural Network Accelerator Designs. Tutorial (NVIDIA/MIT) (2020) (https://accelergy.mit.edu/tutorial.html)
Quantization PyTorch
https://pytorch.org/docs/stable/quantization.html
https://pytorch.org/tutorials/recipes/quantization.html
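The idea behind the quantization tooling linked above can be shown without any framework. The sketch below is a minimal pure-Python illustration of the affine (asymmetric) 8-bit scheme from Jacob et al., where a real value r is approximated as S * (q - Z) for a float scale S and integer zero-point Z; it is not the PyTorch API, just the underlying arithmetic.

```python
# Affine 8-bit quantization sketch: r ≈ S * (q - Z).
# Illustrative only; the PyTorch links above cover the real torch.quantization API.

def choose_qparams(xs, qmin=0, qmax=255):
    """Pick scale S and zero-point Z so [min(xs), max(xs)] maps onto [qmin, qmax]."""
    lo, hi = min(min(xs), 0.0), max(max(xs), 0.0)  # range must include 0.0 exactly
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Map floats to clamped integers in [qmin, qmax]."""
    return [min(qmax, max(qmin, round(x / scale + zero_point))) for x in xs]

def dequantize(qs, scale, zero_point):
    """Recover approximate floats from the integers."""
    return [scale * (q - zero_point) for q in qs]

weights = [-1.0, -0.5, 0.0, 0.25, 1.5]
s, z = choose_qparams(weights)
q = quantize(weights, s, z)
recovered = dequantize(q, s, z)
```

After the round trip, every weight is recovered to within one quantization step (the scale S), which is the error budget post-training quantization trades for int8 storage and arithmetic.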
Pruning PyTorch
https://pytorch.org/tutorials/intermediate/pruning_tutorial.html
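The pruning tutorial above is based on unstructured magnitude pruning, which can be summarized in a few lines of plain Python: zero out the fraction of weights with the smallest absolute value. This is a conceptual sketch, not the `torch.nn.utils.prune` API, which additionally stores a reusable mask alongside the original weights.

```python
# Unstructured magnitude pruning sketch: remove the `amount` fraction of
# weights with the smallest |w|. Illustrative only; see the tutorial above
# for the actual torch.nn.utils.prune workflow.

def magnitude_prune(weights, amount):
    """Return a pruned copy of `weights` and the binary mask (1 = kept)."""
    k = int(len(weights) * amount)  # number of weights to zero out
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in ranked[:k]:            # the k smallest-magnitude positions
        mask[i] = 0
    pruned = [w * m for w, m in zip(weights, mask)]
    return pruned, mask

w = [0.7, -0.05, 0.4, 0.01, -0.9, 0.2]
pruned, mask = magnitude_prune(w, 0.5)  # zeros the 3 smallest-magnitude weights
```

The resulting zeros only pay off in latency when the runtime exploits sparsity, which is where libraries such as cuSPARSE (below) come in.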
NVIDIA cuSPARSE
https://docs.nvidia.com/cuda/cusparse/
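To make the cuSPARSE entry concrete, the sketch below shows the CSR (compressed sparse row) storage format and a sparse matrix-vector product, the kind of kernel cuSPARSE accelerates on GPU. It is a pure-Python illustration of the data layout, not cuSPARSE code.

```python
# CSR format + SpMV sketch. Illustrative only; cuSPARSE provides tuned
# GPU kernels over the same (values, col_idx, row_ptr) layout.

def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # nonzero entries, row by row
                col_idx.append(j)  # their column indices
        row_ptr.append(len(values))  # where each row's nonzeros end
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x touching only the stored nonzeros."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

A = [[1.0, 0.0, 2.0],
     [0.0, 0.0, 3.0],
     [4.0, 5.0, 0.0]]
vals, cols, ptr = to_csr(A)
y = csr_matvec(vals, cols, ptr, [1.0, 2.0, 3.0])  # → [7.0, 9.0, 14.0]
```

Storage and compute both scale with the number of nonzeros rather than the full matrix size, which is why pruned models only gain speed on sparse-aware runtimes.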
Quantization/Pruning TensorFlow
https://www.tensorflow.org/model_optimization/guide/pruning/comprehensive_guide
https://www.tensorflow.org/lite/performance/post_training_quantization
Dive into Deep Learning. Computational Performance
https://d2l.ai/chapter_computational-performance/index.html
TorchScript
https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html
https://pytorch.org/tutorials/recipes/torchscript_inference.html
https://pytorch.org/tutorials/advanced/cpp_export.html
Docker: https://www.docker.com/
TF: https://www.tensorflow.org/
TFLITE: https://www.tensorflow.org/lite
ONNX: https://onnx.ai/
Course repository: https://github.com/TIC-13/luxai_ai_performance
Sze, Vivienne, et al. "Efficient processing of deep neural networks: A tutorial and survey." Proceedings of the IEEE 105.12 (2017): 2295-2329. https://ieeexplore.ieee.org/abstract/document/8114708
Zhang, Aston, Zachary C. Lipton, Mu Li, and Alexander J. Smola. Dive into Deep Learning. Cambridge University Press. d2l.ai, 2023.