1. Method
1.1. First, describe how pixel-wise masks are learned with the Gumbel-Softmax trick
1.1.1. Trainable Mask
1.1.1.1. Block Arch
1.1.1.1.1. The operation of the residual block
1.1.1.1.2. a binary Gumbel-Softmax trick on each element of Mb
1.1.1.1.3. The residual block with sparsity mask
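The masked residual block above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a single-channel feature map stored as a nested list, a binary gate with the same spatial size, ReLU as the activation, and a hypothetical residual function `add_one`. Where the gate is 0, the residual branch contributes nothing and the output is just the activated input.

```python
def relu(v):
    # Scalar ReLU activation.
    return v if v > 0.0 else 0.0


def masked_residual_block(x, residual_fn, gate):
    """Compute relu(F(x) * gate + x) element-wise.

    x and gate are H x W nested lists; residual_fn maps an H x W list
    to an H x W list. For clarity the residual branch is evaluated
    densely here; a sparse implementation would skip gated-off pixels.
    """
    fx = residual_fn(x)
    return [
        [relu(fx[i][j] * gate[i][j] + x[i][j]) for j in range(len(x[0]))]
        for i in range(len(x))
    ]


def add_one(x):
    # Hypothetical stand-in for the residual function F.
    return [[v + 1.0 for v in row] for row in x]


x = [[1.0, 2.0], [-3.0, 4.0]]
gate = [[1, 0], [0, 1]]
out = masked_residual_block(x, add_one, gate)
```

With this input, the two gated-off positions reduce to `relu(x)`, while the other two receive the residual update.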
1.1.1.2. Binary Gumbel-Softmax
1.1.1.2.1. Discrete samples Z can be drawn using
1.1.1.2.2. Use the Gumbel-Softmax trick to define a continuous, differentiable approximation by replacing the argmax operation with a softmax
1.1.1.2.3. converted to a probability pi1 indicating the probability that a pixel should be executed, using a sigmoid function.
1.1.1.2.4. The probability that a pixel is not executed
1.1.1.2.5. substituting pi1 and pi2 in equation 5
1.1.1.2.6. use a straight-through estimator, where hard samples are used during the forward pass and gradients are obtained from soft samples during the backward pass
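The steps above can be sketched for a single mask element. With pi1 = sigmoid(m) and pi2 = 1 - sigmoid(m), we have log(pi1) - log(pi2) = m, so the two-class Gumbel-Softmax collapses to sigmoid((m + g1 - g2) / tau), where g1 and g2 are independent Gumbel noise samples. The names `m` (here `logit`), `g1`, `g2`, and the temperature `tau` are notation assumed for this sketch, and the gradient is returned explicitly because plain Python has no autograd.

```python
import math
import random


def sigmoid(v):
    # Numerically stable scalar sigmoid.
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def sample_gumbel(rng):
    # Gumbel(0, 1) sample; u is clamped so both logs stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def binary_gumbel_softmax(logit, tau=1.0, rng=None, training=True):
    """Return (hard, soft, grad) for one mask element.

    hard: value used in the forward pass (0.0 or 1.0).
    soft: relaxed sample sigmoid((logit + g1 - g2) / tau).
    grad: d(soft)/d(logit), which a straight-through estimator
          uses in the backward pass in place of the hard step.
    """
    if training:
        rng = rng or random
        noise = sample_gumbel(rng) - sample_gumbel(rng)
    else:
        noise = 0.0  # Assumed inference behaviour: threshold without noise.
    soft = sigmoid((logit + noise) / tau)
    hard = 1.0 if logit + noise > 0.0 else 0.0
    grad = soft * (1.0 - soft) / tau
    return hard, soft, grad


rng = random.Random(0)
samples = [binary_gumbel_softmax(0.5, rng=rng) for _ in range(100)]
```

In a framework with automatic differentiation, the same effect is usually obtained by returning `hard` in the forward pass while routing gradients through `soft`.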
1.2. How to implement dynamic convolutions to reduce inference time
1.2.1. Efficient inference implementation
1.2.1.1. mask dilation
1.2.1.2. masked gather operation
1.2.1.3. modified 3×3 depthwise convolution
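A CPU-side sketch of the first two steps may help make them concrete. It assumes a single-channel feature map and a binary mask as nested lists. Dilation grows the mask by one pixel so that a later 3×3 convolution has valid neighbours; gather packs the active pixels into a dense list so that pointwise work runs only on them; scatter writes results back. The function names are hypothetical, and the modified depthwise convolution itself is not shown.

```python
def dilate_mask(mask):
    # 3x3 binary dilation: a pixel becomes active if any neighbour
    # (including itself) is active in the input mask.
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if mask[i][j]:
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < h and 0 <= nj < w:
                            out[ni][nj] = 1
    return out


def gather(feature, mask):
    # Collect active positions and their values into dense lists.
    idx = [(i, j) for i, row in enumerate(mask) for j, m in enumerate(row) if m]
    values = [feature[i][j] for i, j in idx]
    return idx, values


def scatter(values, idx, shape, fill=0.0):
    # Write dense values back to their original spatial positions.
    h, w = shape
    out = [[fill] * w for _ in range(h)]
    for (i, j), v in zip(idx, values):
        out[i][j] = v
    return out


mask = [[0] * 5 for _ in range(5)]
mask[2][2] = 1
feature = [[float(i * 5 + j) for j in range(5)] for i in range(5)]

dilated = dilate_mask(mask)
idx, values = gather(feature, dilated)
processed = [2.0 * v for v in values]  # Stand-in for a 1x1 convolution.
result = scatter(processed, idx, (5, 5))
```

Only 9 of the 25 positions are processed in this example; the saving grows as the mask gets sparser.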
1.3. Finally, propose a sparsity criterion
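These notes do not spell out the criterion. One plausible form, shown here purely as an illustrative assumption and not necessarily the paper's exact loss, penalizes the squared gap between the fraction of executed pixels and a target budget. The function name `sparsity_loss`, the parameter `target`, and the choice to count pixels rather than FLOPs are all assumptions.

```python
def sparsity_loss(masks, target):
    """Squared gap between the executed fraction and a target budget.

    masks: list of binary H x W masks, one per residual block.
    target: desired fraction of executed pixels, between 0 and 1.
    """
    executed = sum(sum(row) for m in masks for row in m)
    total = sum(len(m) * len(m[0]) for m in masks)
    frac = executed / total
    return (frac - target) ** 2


masks = [
    [[1, 0], [0, 0]],  # Block 1 executes 1 of 4 pixels.
    [[1, 1], [1, 1]],  # Block 2 executes 4 of 4 pixels.
]
loss = sparsity_loss(masks, target=0.5)
```

In training, a term like this would typically be added to the task loss with a weighting factor, so the network is pushed toward the budget without being forced to hit it exactly.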
2. Abstract
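2.1. Conventional convolutional neural networks treat every pixel of an image identically, yet not all regions of an image are equally important. The authors therefore propose a dynamic convolution method that is conditioned on the input image. They also introduce an efficient CUDA implementation of dynamic convolutions based on gather-scatter operations.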
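2.2. Method 1:
2.2.1. Introduces a residual block with a small gating branch that learns which spatial positions need to be evaluated.
2.2.1.1. The discrete gating decisions are trained end-to-end using the Gumbel-Softmax trick, combined with a sparsity criterion.
2.2.1.1.1. Dataset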
2.3. Method 2:
2.3.1. Speeds up MobileNetV2 and ShuffleNetV2; experiments are run on human pose estimation.