Literature

1. Block-wise

1.1. 2021.2 MIT Block Sparsity and Weight Initialization in Neural Network Pruning

1.2. 2020 Parallel Block-Wise Knowledge Distillation for Deep Neural Network Compression

1.3. 2020 Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference

1.4. 2018 SBNet: Sparse Blocks Network for Fast Inference

1.5. 2019 Block-wise Dynamic Sparseness

1.6. 2020 Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation

1.6.1. Code Recovery

1.7. 2019 Learning Instance-wise Sparsity for Accelerating Deep Models

1.8. Results comparison

1.9. 2018 Block-wise Intermediate Representation

1.10. 2019 Dynamic Block Sparse Reparameterization of Convolutional Neural Networks (India)

2. Ref

2.1. 2013 Estimating or propagating gradients through stochastic neurons for conditional computation

2.1.1. This paper proposes two families of methods for handling the non-differentiability of stochastic neurons during training; one of them, the straight-through estimator, is sketched below.
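2.1.1.1. A minimal sketch of the straight-through idea, assuming PyTorch; the sigmoid gate and the toy usage are illustrative, not the paper's exact setup.

```python
import torch

class BernoulliST(torch.autograd.Function):
    """Stochastic binary neuron with a straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, probs):
        # Forward pass: sample a hard 0/1 activation from the firing probabilities.
        return torch.bernoulli(probs)

    @staticmethod
    def backward(ctx, grad_output):
        # Backward pass: treat the sampling step as the identity and pass the
        # gradient straight through to the probabilities.
        return grad_output


logits = torch.randn(4, requires_grad=True)
gate = BernoulliST.apply(torch.sigmoid(logits))  # hard 0/1 gate
gate.sum().backward()                            # logits still receive gradients
```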

2.2. 2014 Spatially-sparse convolutional neural networks

2.2.1. Develops a CNN designed to process spatially-sparse input.

2.3. 2015 Conditional computation in neural networks for faster models

2.3.1. "Conditional computation refers to activating only some of the units in a network, in an input-dependent fashion. For example, if we think we’re looking at a car, we only need to compute the activations of the vehicle detecting units, not of all features that a network could possible compute. The immediate effect of activating fewer units is that propagating information through the network will be faster, both at training as well as at test time. However, one needs to be able to decide in an intelligent fashion which units to turn on and off, depending on the input data. This is typically achieved with some form of gating structure, learned in parallel with the original network."

2.4. 2016 Dynamic Capacity Networks

2.4.1. We introduce the Dynamic Capacity Network (DCN), a neural network that can adaptively assign its capacity across different portions of the input data. This is achieved by combining modules of two types: low-capacity sub-networks and high-capacity sub-networks. The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks. The selection is made using a novel gradient-based attention mechanism that efficiently identifies input regions for which the DCN’s output is most sensitive and to which we should devote more capacity.

2.4.1.1. Introduces the DCN, a network that dynamically adapts its capacity to the input; a rough sketch of the region-selection step follows.
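2.4.1.2. A rough sketch of the patch-selection step, assuming PyTorch; the toy sub-network, patch size, and the entropy-of-the-coarse-prediction saliency objective are simplifications for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

low = nn.Conv2d(1, 8, 3, padding=1)   # cheap low-capacity feature extractor (toy)
head = nn.Linear(8, 10)               # coarse classifier on pooled features (toy)

def select_salient_patches(x, k=2, patch=8):
    """Pick the k patches per image where the coarse prediction is most
    gradient-sensitive; the high-capacity sub-network would be applied there."""
    x = x.requires_grad_(True)
    feats = low(x)                                      # coarse features
    logits = head(feats.mean(dim=(2, 3)))               # coarse prediction
    p = F.softmax(logits, dim=1)
    entropy = -(p * F.log_softmax(logits, dim=1)).sum()
    grad, = torch.autograd.grad(entropy, feats)         # sensitivity of output to features
    saliency = grad.abs().sum(dim=1, keepdim=True)      # (B, 1, H, W)
    per_patch = F.avg_pool2d(saliency, patch).flatten(1)
    return per_patch.topk(k, dim=1).indices             # top-k patch indices per image


idx = select_salient_patches(torch.randn(2, 1, 32, 32))
```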

2.5. 2016 Deep Networks with Stochastic Depth

2.5.1. Networks with many layers are very hard to train; the authors propose stochastic depth, which shortens training time while still allowing a deep network to be used at inference time (sketch below).
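2.5.1.1. A minimal sketch of a residual block with stochastic depth, assuming PyTorch; the survival probability and the block body are placeholders.

```python
import torch
import torch.nn as nn

class StochasticDepthBlock(nn.Module):
    """Residual block whose residual branch is randomly dropped during training."""

    def __init__(self, dim=64, survival_prob=0.8):
        super().__init__()
        self.survival_prob = survival_prob
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )

    def forward(self, x):
        if self.training:
            # Skip the residual branch with probability 1 - survival_prob, so
            # the effective depth of the network varies from batch to batch.
            if torch.rand(1).item() > self.survival_prob:
                return x
            return x + self.body(x)
        # At test time keep every branch, scaled by its survival probability.
        return x + self.survival_prob * self.body(x)


block = StochasticDepthBlock()
y = block(torch.randn(2, 64, 16, 16))
```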

2.6. 2017 Spatially adaptive computation time for residual networks

2.6.1. https://arxiv.org/pdf/1612.02297.pdf

2.6.2. Chooses an appropriate number of layers for each image region, targeting mainly residual networks; a simplified sketch follows.
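2.6.2.1. A much-simplified sketch of per-position halting, assuming PyTorch; the halting rule and layer contents are illustrative stand-ins for the paper's adaptive computation time formulation.

```python
import torch
import torch.nn as nn

class SpatiallyAdaptiveStack(nn.Module):
    """Stack of residual units where each spatial position stops being refined
    once its accumulated halting score crosses a threshold, so easy regions
    effectively pass through fewer layers."""

    def __init__(self, dim=32, num_units=4, threshold=0.99):
        super().__init__()
        self.units = nn.ModuleList(
            nn.Conv2d(dim, dim, 3, padding=1) for _ in range(num_units))
        self.halts = nn.ModuleList(
            nn.Conv2d(dim, 1, 1) for _ in range(num_units))
        self.threshold = threshold

    def forward(self, x):
        cum_halt = torch.zeros(x.size(0), 1, *x.shape[2:], device=x.device)
        for unit, halt in zip(self.units, self.halts):
            active = (cum_halt < self.threshold).float()      # positions still running
            x = x + active * torch.relu(unit(x))              # refine active positions only
            cum_halt = cum_halt + active * torch.sigmoid(halt(x))
        return x


net = SpatiallyAdaptiveStack()
y = net(torch.randn(2, 32, 16, 16))
```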

2.7. 2017 Categorical Reparameterization with Gumbel-Softmax

2.7.1. Addresses the non-differentiability of sampling from categorical distributions (sketch below).

2.7.2. https://arxiv.org/pdf/1611.01144.pdf
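2.7.3. A minimal sketch of the Gumbel-softmax sample, assuming PyTorch; the temperature is an illustrative value. PyTorch also provides torch.nn.functional.gumbel_softmax, including a hard straight-through variant.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, temperature=0.5):
    """Differentiable, approximately one-hot sample from a categorical
    distribution, so gradients can flow through a discrete choice."""
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / temperature, dim=-1)


logits = torch.randn(2, 4, requires_grad=True)
sample = gumbel_softmax_sample(logits)      # soft one-hot per row
sample[:, 0].sum().backward()               # gradients reach the logits
```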

3. Language Model

3.1. https://arxiv.org/pdf/2108.06277.pdf

3.2. https://arxiv.org/pdf/2106.08846.pdf

4. GitHub

4.1. GitHub - happysheep224/block-wise-pruning

5. Huang Gao

5.1. 2020 CVPR Resolution Adaptive Networks for Efficient Inference

6. Distilling

6.1. 2015 Distilling the Knowledge in a Neural Network
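6.1.1. A minimal sketch of the standard distillation loss from that paper, assuming PyTorch; the temperature and mixing weight are illustrative values.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Match the teacher's softened class probabilities (KL term) while also
    fitting the ground-truth labels (cross-entropy term)."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                # rescale so the soft term's gradients match the hard loss
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard


loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10),
                         torch.randint(0, 10, (8,)))
```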