2021.2 MIT Block Sparsity and Weight Initialization in Neural Network Pruning

1. Motivation and contribution

1.1. Motivation

1.1.1. Block-sparse pruning offers an intermediate level of structure between unstructured weight pruning and highly structured neuron and channel pruning. It is particularly interesting because it lets us identify the granularity at which the behavior of unstructured pruning begins to diverge from that of highly structured pruning.

1.2. Contribution

1.2.1. Our main goal is to investigate how pruning affects accuracy at varying granularities of block sparsity.

1.2.1.1. Block-Sparse Accuracy and Lottery Tickets

1.2.1.1.1. Compare unstructured pruning to block-wise pruning at different block sizes.
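
A minimal sketch (not the paper's code) of the two pruning granularities being compared: unstructured magnitude pruning zeroes individual weights, while block-wise pruning zeroes whole blocks ranked by aggregate magnitude. NumPy, the L1 block score, the 4x4/8x8 block sizes, and the 50% sparsity level are illustrative assumptions.

    import numpy as np

    def unstructured_mask(w, sparsity):
        # Zero out the individual weights with the smallest magnitudes.
        k = int(sparsity * w.size)
        threshold = np.sort(np.abs(w).ravel())[k]
        return (np.abs(w) >= threshold).astype(w.dtype)

    def block_mask(w, sparsity, block=(4, 4)):
        # Zero out whole blocks, ranked by their aggregate (L1) magnitude.
        # Assumes the matrix dimensions are divisible by the block size.
        rows, cols = w.shape
        br, bc = block
        scores = np.abs(w).reshape(rows // br, br, cols // bc, bc).sum(axis=(1, 3))
        k = int(sparsity * scores.size)
        threshold = np.sort(scores.ravel())[k]
        keep = scores >= threshold
        # Expand the per-block decision back to individual weights.
        return np.kron(keep, np.ones((br, bc))).astype(w.dtype)

    w = np.random.randn(64, 64)
    for m in (unstructured_mask(w, 0.5), block_mask(w, 0.5, block=(8, 8))):
        print("kept fraction:", m.mean())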

1.2.2. We try to understand behavior in two contexts: fine-tuning and weight rewinding.

1.2.2.1. Fine-tuning, which reuses the learned weight values from the previous run, lets us understand the role of block sparsity at inference time.

1.2.2.1.1. Reveals the sparsity at which block-sparse pruning can maintain the same accuracy as unstructured pruning, which provides insight into how to balance the tradeoff between accuracy and block granularity. (Both retraining strategies are sketched after 1.2.2.2.1.)

1.2.2.2. Weight rewinding, which reuses weight values from a point early in training of the previous run, lets us understand what we can learn about block sparsity at training time.

1.2.2.2.1. The weight rewinding experiments reveal whether the network can still train to full accuracy when block structure is imposed early in training.
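
A minimal, self-contained sketch of the two retraining strategies: fine-tuning from the final weights vs. rewinding to early-training weights. PyTorch, the toy linear model, the random data, the single unstructured pruning round, and all step counts and learning rates are illustrative assumptions, not the paper's setup; a block mask from the sketch above could be substituted.

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    model = nn.Linear(32, 10)
    x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
    loss_fn = nn.CrossEntropyLoss()

    def train(model, mask, steps, lr):
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        with torch.no_grad():
            model.weight.mul_(mask)          # apply the mask up front
        for _ in range(steps):
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            with torch.no_grad():
                model.weight.mul_(mask)      # keep pruned weights at zero

    dense = torch.ones_like(model.weight)

    # Dense training, checkpointing early weights for rewinding.
    train(model, dense, steps=20, lr=0.1)
    early_state = copy.deepcopy(model.state_dict())   # weights from early in training
    train(model, dense, steps=180, lr=0.1)
    final_state = copy.deepcopy(model.state_dict())

    # Prune: keep the 50% of weights with the largest final magnitudes.
    w = final_state["weight"]
    threshold = w.abs().flatten().kthvalue(w.numel() // 2).values
    mask = (w.abs() >= threshold).float()

    # Fine-tuning: continue from the final trained weights at a low learning rate.
    model.load_state_dict(final_state)
    train(model, mask, steps=100, lr=0.01)

    # Weight rewinding: reset the surviving weights to their early-training
    # values, then retrain with the original schedule.
    model.load_state_dict(early_state)
    train(model, mask, steps=180, lr=0.1)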

1.2.2.3. Comparing Rewinding and Fine-tuning in Neural Network Pruning

1.2.2.3.1. Alex Renda, Jonathan Frankle, and Michael Carbin. Comparing rewinding and fine-tuning in neural network pruning. arXiv:1912.05671, 2020.

1.2.3. We explore the effect of random reinitialization on block-sparse pruning to gain insight into why it decreases accuracy in unstructured pruning but not in block-sparse pruning.

1.2.3.1. They run a series of experiments demonstrating that randomly reinitializing the network at the start of each training run does not reduce the accuracy of the final pruned network.
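
A minimal sketch of the random-reinitialization control: the pruning mask is kept, but the surviving weights are given a fresh random initialization before retraining. Again PyTorch, the toy linear model, random data, and a single 50% unstructured pruning round are illustrative assumptions (a block mask could be used instead).

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    model = nn.Linear(32, 10)
    x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
    loss_fn = nn.CrossEntropyLoss()

    def train(model, mask, steps, lr):
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            with torch.no_grad():
                model.weight.mul_(mask)      # keep pruned weights at zero

    # Train densely, then prune by magnitude.
    train(model, torch.ones_like(model.weight), steps=200, lr=0.1)
    threshold = model.weight.abs().flatten().kthvalue(model.weight.numel() // 2).values
    mask = (model.weight.abs() >= threshold).float()

    # Random reinitialization: keep the mask, discard the learned values,
    # and draw a fresh initialization for the surviving weights.
    with torch.no_grad():
        model.reset_parameters()             # fresh random init
        model.weight.mul_(mask)
    train(model, mask, steps=200, lr=0.1)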

2. Introduction

2.1. Block-sparse pruning is a middle ground between unstructured weight pruning and highly structured neuron and channel pruning.

3. Related work

3.1. Language models

3.1.1. François Lagunas. Block sparse matrices for smaller and faster language models. https://huggingface.co/blog/pytorch_block_sparse, September 2020.

3.1.2. Rewon Child, Scott Gray, Alec Radford, and Ilya Sutskever. Generating long sequences with sparse Transformers. arXiv:1904.10509, 2019.

3.1.3. Sharan Narang, Eric Undersander, and Gregory Diamos. Block-sparse recurrent neural networks. arXiv:1711.02782, 2017.

3.2. GPU kernels

3.2.1. Trevor Gale, Matei Zaharia, Cliff Young, and Erich Elsen. Sparse GPU kernels for deep learning. arXiv:2006.10901, 2020.

3.2.2. Scott Gray, Alec Radford, and Diederik P. Kingma. Block-sparse GPU kernels. https://openai.com/blog/block-sparse-gpu-kernels/, December 2017. OpenAI blog release.

3.2.3. François Lagunas. Sparse neural networks (2/n): Understanding GPU performance. https://medium.com/huggingface/sparse-neural-networks-2-ngpu-performance-b8bc9ce950fc, May 2020. Affiliated with Hugging Face.

3.3. General

3.3.1. Exploring the regularity of sparse structure in convolutional neural networks. arXiv:1705.08922, 2017.

3.3.2. Mengye Ren, Andrei Pokrovsky, Bin Yang, and Raquel Urtasun. SBNet: Leveraging activation block sparsity for speeding up convolutional neural networks. https://eng.uber.com/sbnet-sparse-block-networks-convolutional-neural-networks/, January 2018. Uber blog release.

3.3.3. Mengye Ren, Andrei Pokrovsky, Bin Yang, and Raquel Urtasun. SBNet: Sparse blocks network for fast inference. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.

3.3.4. Dharma Teja Vooturi, Dheevatsa Mudigere, and Sasikanth Avancha. Hierarchical block sparse neural networks. arXiv:1808.03420, 2018.

3.4. Basis

3.4.1. 2019 AMC