2020 Sampling and interpolation

1. Abstract

1.1. In a CNN, feature maps typically contain a large amount of spatial redundancy, which leads to much repeated computation. To reduce this excess computation, the authors propose computing the feature map only at sparsely sampled locations, chosen according to the predicted activation responses, and then densely reconstructing the rest of the feature map by interpolation.

1.2. The benefits of this scheme: 1) it avoids spending excessive computation at locations whose features can be effectively interpolated; 2) broadly distributed sampling improves robustness to errors in the activation prediction.

1.3. Technical difficulty: the binary decision variables that express discrete sampling locations are non-differentiable, making them incompatible with backpropagation. To overcome this, the authors apply a reparameterization trick based on the Gumbel-Softmax distribution, which lets backpropagation drive these variables toward binary values over the course of training.

1.4. Code

1.4.1. https://github.com/zdaxie/SpatiallyAdaptiveInference-Detection

2. Introduction

2.1. Toward greater practicality of deep models, much attention has been focused on reducing CNN computation.

2.1.1. A common approach to this problem is weight or neuron pruning, which often cannot maintain performance well.

2.2. In this paper, the authors seek a more efficient allocation of computation over a feature map that takes advantage of its spatial redundancy.

2.2.1. Treat the predicted activation map as a probability field, stochastically sample a sparse set of locations according to their probabilities, and then interpolate the features at these samples to reconstruct the rest of the feature map.

2.2.1.1. Advantage: this strategy avoids wasting computation in regions where features can be easily interpolated; at the same time, it allows feature computation to extend into low-activation regions, reducing and compensating for errors introduced by the activation prediction.

2.3. To identify sparse points for interpolation, the network trains a content-aware stochastic sampling module that produces a binary sampling mask by training binary decision variables.

2.3.1. reparameterization trick

2.3.1.1. The non-differentiable mask values are replaced by differentiable samples from a Gumbel-Softmax distribution, as in the sketch below.
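
A minimal PyTorch sketch of the straight-through Gumbel-Softmax reparameterization for a per-location binary mask. The tensor shapes, temperature `tau`, and two-class layout are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_mask(logits, tau=1.0, hard=True):
    # logits: (N, 2, H, W) per-location scores for {not sampled, sampled}.
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1).
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-10) + 1e-10)
    y = F.softmax((logits + gumbel) / tau, dim=1)
    if hard:
        # Straight-through estimator: binary values in the forward pass,
        # gradients of the soft sample in the backward pass.
        y_hard = (y == y.max(dim=1, keepdim=True).values).float()
        y = (y_hard - y).detach() + y
    return y[:, 1:2]  # (N, 1, H, W) indicator of the "sampled" class

# Example: a 1x2x8x8 logit map yields an 8x8 binary sampling mask.
mask = gumbel_softmax_mask(torch.randn(1, 2, 8, 8), tau=0.5)
```

PyTorch also ships torch.nn.functional.gumbel_softmax, which implements this same trick.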

3. Related work

3.1. Model Pruning

3.2. Early Stopping

3.3. Activation Sparsity

3.3.1. ReLU activations are typically sparse, and this property has been exploited for network acceleration by excluding the zero values from the next layer's computation.

3.3.2. These methods estimate activation sparsity and then skip unimportant activations to avoid extra computation.

3.3.3. Rather than reconstructing the activation map, the proposed method samples a sparse set of points in a content-aware fashion and interpolates the remaining features from this sparse set. The benefit is that no computation is spent at locations that are easy to reconstruct. Moreover, the probabilistic sampling distributes computation in a way that provides robustness to activation prediction errors.

3.4. Sparse Sampling

3.4.1. PerforatedCNN computes only a sparse sample of the previous convolution layer's outputs and interpolates the remaining values. Its sampling follows a predefined pattern, and interpolation uses only the nearest neighbors. The proposed method differs from PerforatedCNN in that the network is input-adaptive: the sampling density reflects the predicted activation values, and the interpolation parameters are learned. This permits much higher sparsity; in effect, high-resolution results are reconstructed from low-resolution ones.

3.5. Gumbel-based selection

3.5.1. Stochastic selection based on the Gumbel distribution has been used to make discrete decisions for network acceleration. The Gumbel-Softmax trick has served two purposes there: 1) adaptively selecting layers to match the input image; 2) deciding whether a channel or layer should be skipped.

4. Methodology

4.1. a general introduction to the stochastic sampling-interpolation network

4.1.1. Generate a sparse sampling mask M based on the input feature map X.

4.1.1.1. Then calculate the features only at the sampling points, forming a sparse feature map Ys.

4.1.1.1.1. The features of unsampled points are interpolated by the interpolation module to form the output feature map Y* (see the pipeline sketch below).
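
A high-level sketch of the sample-then-interpolate pipeline described in 4.1. Here `sampling_module`, `sparse_conv`, and `interpolate` are hypothetical callables standing in for the paper's actual operators.

```python
import torch

def sparse_forward(x, sampling_module, sparse_conv, interpolate):
    # x: input feature map X, shape (N, C, H, W).
    mask = sampling_module(x)   # sparse sampling mask M, (N, 1, H, W)
    ys = sparse_conv(x) * mask  # features at sampled points only -> Ys
                                # (a dense conv masked afterwards stands in
                                # for a true sparse operator)
    y = interpolate(ys, mask)   # fill in unsampled points -> Y*
    return y
```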

4.2. sampling module

4.2.1. Deterministic manner

4.2.1.1. where points with confidence greater than a certain threshold are sampled

4.2.1.1.1. Due to the spatial redundancy of the feature map, adjacent points may have similar features and confidences, so deterministic sampling typically samples, or fails to sample, adjacent points together.

4.2.2. Stochastic sampling

4.2.2.1. a higher confidence only indicates a higher probability of the point being sampled, so neighboring points need not share the same decision (the sketch below contrasts the two rules)
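
To make the contrast concrete, a small illustrative sketch (not from the paper's code) applying both rules to the same confidence map:

```python
import torch

confidence = torch.rand(1, 1, 8, 8)  # per-location confidence in [0, 1]

# Deterministic rule: sample exactly where confidence exceeds a threshold;
# neighboring points with similar confidence are kept or dropped together.
det_mask = (confidence > 0.5).float()

# Stochastic rule: treat confidence as a probability, so even neighboring
# points with identical confidence can receive different decisions.
sto_mask = torch.bernoulli(confidence)
```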

4.2.3. the proposed sampling

4.2.3.1. two-class Gumbel-Softmax distribution

4.2.3.2. The sparse loss

4.2.3.3. the training objective (a hedged sketch of the sparse loss and total objective follows this list)
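
A minimal sketch of how a sparsity penalty and the total objective could fit together. The `target_ratio` knob, the `lam` weight, and the exact form of the sparse loss are assumptions, not the paper's formulation.

```python
import torch

def sparse_loss(mask, target_ratio=0.25):
    # Hypothetical sparsity penalty: push the fraction of sampled points
    # (mean of the binary mask) down toward a target ratio. The paper's
    # exact sparse-loss formulation may differ.
    return torch.relu(mask.mean() - target_ratio)

def total_loss(task_loss, mask, lam=0.1):
    # Training objective: task loss plus a weighted sparsity term;
    # lam is an assumed trade-off hyperparameter.
    return task_loss + lam * sparse_loss(mask)
```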

4.3. interpolation module

4.3.1. how the interpolation is formulated

4.3.2. The window interpolation (WI) module is formulated as a weighted combination of sampled points within a local window (a hedged sketch follows this list).

4.3.3. The best formulation of WI in irregular spatial sampling is an open problem.

4.3.4. Grid prior: M = max(M_sample, M_grid)
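
A dense-tensor sketch of window interpolation plus the grid prior M = max(M_sample, M_grid). The fixed Gaussian distance kernel, window size, and grid stride are assumptions for illustration; the paper learns its interpolation parameters.

```python
import torch
import torch.nn.functional as F

def window_interpolate(ys, mask, window=5, sigma=1.0):
    # Each point becomes a weighted average of the sampled points inside a
    # local window; weights here are a fixed Gaussian in spatial distance.
    r = window // 2
    coords = torch.arange(window, dtype=torch.float32) - r
    yy, xx = torch.meshgrid(coords, coords, indexing="ij")
    kernel = torch.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    kernel = kernel.view(1, 1, window, window)

    c = ys.shape[1]
    # Depthwise convolutions accumulate weighted features and weight mass.
    num = F.conv2d(ys * mask, kernel.expand(c, 1, -1, -1), padding=r, groups=c)
    den = F.conv2d(mask, kernel, padding=r)
    return num / den.clamp(min=1e-6)  # normalize by total window weight

def grid_prior(sample_mask, stride=4):
    # Grid prior: always sample a regular grid, M = max(M_sample, M_grid).
    grid = torch.zeros_like(sample_mask)
    grid[..., ::stride, ::stride] = 1.0
    return torch.maximum(sample_mask, grid)
```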

4.4. Integration with residual blocks