Image compression with Stochastic Winner-Take-All Auto-Encoder
Citations
Learned Compression Artifact Removal by Deep Residual Networks
An Untrained Neural Network Prior for Light Field Compression
Multi-tier block truncation coding model using genetic auto encoders for gray scale images
Masked Neural Sparse Encoder for Face Occlusion Detection
3D Tensor Auto-encoder with Application to Video Compression
References
ImageNet Classification with Deep Convolutional Neural Networks
ImageNet: A large-scale hierarchical image database
Reducing the Dimensionality of Data with Neural Networks
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Visualizing and Understanding Convolutional Networks
Frequently Asked Questions (15)
Q2. What is the coding objective of the algorithm?
Given $\Gamma$ and $p \in \mathbb{N}_+^*$, let $\varphi$ be a function that randomly partitions $\Gamma$ into $\eta_p = \eta / p$ mini-batches $\{X^{(1)}, \dots, X^{(\eta_p)}\}$ where, for $i \in [\![1, \eta_p]\!]$, $X^{(i)} \in \mathbb{R}^{m \times p}$.
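As a concrete illustration, here is a minimal numpy sketch of such a partition function, assuming $\Gamma$ is stored as an $m \times \eta$ matrix whose columns are training patches and that $\eta$ is divisible by $p$; the function name and seed are hypothetical, not from the paper:

```python
import numpy as np

def partition_minibatches(gamma_mat, p, seed=0):
    """Randomly partition the columns of the m x eta matrix gamma_mat
    into eta/p mini-batches of p columns each (eta divisible by p)."""
    _, eta = gamma_mat.shape
    perm = np.random.default_rng(seed).permutation(eta)
    # Each returned block matches X^(i) in R^{m x p}.
    return [gamma_mat[:, perm[i * p:(i + 1) * p]] for i in range(eta // p)]
```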
Q3. What is the code for the last feature map in Z?
The position along z is coded with a fixed-length code and, for each pair (x, y), the number of non-zero coefficients along z is coded with a Huffman code.
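A hedged sketch of this cost model, assuming the feature map is an h x w x d array and that the fixed-length code spends ceil(log2 d) bits per position along z; the Huffman helper only computes code lengths, not an actual bitstream, and a single-symbol alphabet gets length 0 in this toy version:

```python
import heapq
import numpy as np
from collections import Counter

def huffman_code_lengths(freqs):
    """Per-symbol code lengths (bits) of a Huffman code built from a
    symbol -> frequency dict."""
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    uid = len(heap)
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:  # every merge adds one bit to each symbol inside
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, uid, s1 + s2))
        uid += 1
    return lengths

def feature_map_bits(Z):
    """Bit cost of coding Z (h x w x d): Huffman code for the per-(x, y)
    count of non-zeros along z, fixed-length code for each position along z."""
    h, w, d = Z.shape
    counts = (Z != 0).sum(axis=2).ravel().tolist()
    table = huffman_code_lengths(Counter(counts))
    huffman_bits = sum(table[c] for c in counts)
    fixed_bits = int(np.ceil(np.log2(d))) * int(sum(counts))
    return huffman_bits + fixed_bits
```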
Q4. What is the objective of the training?
The training objective is to minimize the mean squared error between these cropped images and their reconstructions, plus an $\ell_2$-norm weight decay.
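A minimal numpy sketch of such an objective; the decay coefficient `lam` is a hypothetical placeholder, not a value from the paper:

```python
import numpy as np

def training_loss(x, x_hat, weights, lam=5e-4):
    """Mean squared reconstruction error plus l2-norm weight decay.
    `lam` is an assumed coefficient for illustration only."""
    mse = np.mean((x - x_hat) ** 2)
    decay = lam * sum(np.sum(w ** 2) for w in weights)
    return mse + decay
```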
Q5. What is the definition of a coding constraint?
Max-pooling is a core component of neural networks [11] that downsamples its input representation by applying a max filter to non-overlapping sub-regions.
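For example, a minimal numpy implementation of non-overlapping max-pooling via a reshape, assuming the spatial dimensions are divisible by the pooling size:

```python
import numpy as np

def max_pool(x, s):
    """s x s non-overlapping max-pooling of a 2-D map via a reshape;
    assumes both spatial dimensions are divisible by s."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).max(axis=(1, 3))

x = np.arange(16.0).reshape(4, 4)
print(max_pool(x, 2))  # [[ 5.  7.]
                       #  [13. 15.]]
```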
Q6. What is the structure of the layer input?
Each layer $i \in [\![1, 4]\!]$ consists of convolving the layer input with the bank of filters $W^{(i)}$, adding the biases $b^{(i)}$ and applying a mapping $g^{(i)}$, producing the layer output.
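A minimal single-channel sketch of such a layer using scipy; ReLU stands in for the unspecified mapping $g^{(i)}$, which is an assumption of this sketch:

```python
import numpy as np
from scipy.signal import correlate2d

def layer_forward(x, W, b, g=lambda v: np.maximum(v, 0.0)):
    """x: (h, w) input map; W: (n_filters, fh, fw) filter bank; b: (n_filters,).
    Correlates x with each filter, adds the bias, applies the mapping g."""
    out = np.stack([correlate2d(x, W[i], mode="valid") + b[i]
                    for i in range(W.shape[0])])
    return g(out)  # ReLU by default, standing in for the unspecified g^(i)
```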
Q7. What is the solution to the coding problem?
Problem (4), whose final constraint is $\sum_{j=1}^{\eta} \|Z_j\|_0 \le \gamma \times n \times \eta$, is solved by Algorithm 2, which alternates between sparse coding steps that involve WTA OMP and dictionary updates that use stochastic gradient descent.
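A hedged sketch of one such alternation, reusing the `wta_omp` sketch given under Q10 below; the learning rate and the atom renormalization are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def dictionary_learning_step(X, D, k, gamma, lr=1e-2):
    """One alternation: sparse coding of the mini-batch X with WTA OMP,
    then a stochastic gradient step on the dictionary D."""
    Z = wta_omp(X, D, k, gamma)              # sparse coding (see Q10 sketch)
    residual = X - D @ Z                     # batch reconstruction error
    grad_D = -2.0 * residual @ Z.T / X.shape[1]
    D = D - lr * grad_D                      # SGD update of the dictionary
    # Renormalizing the atoms is a common convention, assumed here.
    D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-12)
    return D, Z
```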
Q8. What is the code for the non-zero coefficients?
The non-zero coefficients are uniformly quantized over 8 bits and coded with a Huffman code, while their positions are coded with a fixed-length code.
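A minimal sketch of uniform 8-bit quantization of the non-zero coefficients; the (lo, step) pair would also need to be transmitted, and the entropy coding itself is omitted:

```python
import numpy as np

def quantize_8bit(values):
    """Uniform 8-bit quantization; returns indices in [0, 255] plus the
    (lo, step) pair a decoder needs to invert the mapping."""
    lo, hi = float(values.min()), float(values.max())
    step = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((values - lo) / step).astype(np.uint8)
    return q, lo, step

def dequantize_8bit(q, lo, step):
    return lo + q.astype(float) * step
```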
Q9. What is the code for the WTA OMP?
For WTA OMP only, the number of non-zero coefficients in the sparse decomposition of each patch over $D$ is coded with a Huffman code.
Q10. What is the simplest way to decompose a matrix?
For each $j \in [\![1, p]\!]$, $Y_j = \mathrm{OMP}(X_j, D, k)$ (1); $I = f_\gamma(Y)$ (2); for each $j \in [\![1, p]\!]$, $Z_j = \arg\min_{z \in \mathbb{R}^n} \|X_j - Dz\|_2^2$ s.t. $\mathrm{supp}(z) = \mathrm{supp}(I_j)$ (3). Output: $Z \in \mathbb{R}^{n \times p}$.
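A hedged numpy/scikit-learn sketch of these three steps, assuming X holds the p patches as columns and D is an m x n dictionary; scikit-learn's `orthogonal_mp` stands in for the OMP of step (1):

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def wta_omp(X, D, k, gamma):
    """X: (m, p) patches as columns; D: (m, n) dictionary.
    (1) OMP with at most k non-zeros per patch, (2) batch-wide
    winner-take-all selection f_gamma, (3) least-squares refit."""
    n, p = D.shape[1], X.shape[1]
    Y = orthogonal_mp(D, X, n_nonzero_coefs=k)      # (1), shape (n, p)
    keep = max(int(gamma * n * p), 1)               # (2) keep gamma*n*p coefficients
    flat = np.abs(Y).ravel()
    mask = np.zeros(flat.size, dtype=bool)
    mask[np.argsort(flat)[-keep:]] = True
    I = np.where(mask.reshape(n, p), Y, 0.0)
    Z = np.zeros((n, p))                            # (3) refit on supp(I_j)
    for j in range(p):
        s = np.flatnonzero(I[:, j])
        if s.size:
            Z[s, j] = np.linalg.lstsq(D[:, s], X[:, j], rcond=None)[0]
    return Z
```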
Q11. What is the effect of removing a maxpooling layer?
For instance, [20] shows that removing a max-pooling layer and increasing the stride of the previous convolution, as the authors do, does not harm neural network performance.
Q12. What is the difference between SWTA AE and WTA OMP?
The authors have shown that SWTA AE is better suited to image compression than regular auto-encoders, as it performs variable-rate image compression for images of any size after a single training and provides better rate-distortion trade-offs.
Q13. Who is the author of this article?
Gary J. Sullivan, Jill M. Boyce, Ying Chen, Jens-Rainer Ohm, C. Andrew Segall, and Anthony Vetro, “Standardized extensions of High Efficiency Video Coding (HEVC),” IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001–1016, December 2013.
Q14. What is the definition of a vector of coefficients?
It keeps the $\gamma \times n \times p$ coefficients with the largest absolute value in the $n$-length sparse representations of the $p$ patches and sets the rest to 0; see (2).
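A tiny numpy sketch of this selection; note that ties at the threshold may keep a few extra coefficients in this simplified version:

```python
import numpy as np

def f_gamma(Y, gamma):
    """Keep the gamma*n*p largest-magnitude coefficients of Y (n x p),
    set the rest to 0."""
    n, p = Y.shape
    keep = max(int(gamma * n * p), 1)
    thresh = np.partition(np.abs(Y).ravel(), -keep)[-keep]
    return np.where(np.abs(Y) >= thresh, Y, 0.0)
```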
Q15. What is the coding objective of the problem?
Given $\Gamma$, $k < m$ and $\gamma \in \left]0, 1\right[$, the dictionary learning problem is formulated as (4):

$$\min_{D, Z_1, \dots, Z_\eta} \frac{1}{\eta} \sum_{j=1}^{\eta} \left\| \Gamma_j - D Z_j \right\|_2^2 \quad \text{s.t.} \quad \forall j \in [\![1, \eta]\!],\ \|Z_j\|_0 \le k \quad \text{and} \quad \sum_{j=1}^{\eta} \|Z_j\|_0 \le \gamma \times n \times \eta \tag{4}$$