Journal ArticleDOI

MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

TLDR
In this paper, the authors present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The model can process up to 32MP photos on recent smartphones using standard mobile ML libraries, requiring less than 1 second to perform the inference.
Abstract
While neural network-based photo processing solutions can provide better image quality than traditional ISP systems, their application to mobile devices is still very limited due to their high computational complexity. In this paper, we present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The proposed solution is capable of processing up to 32MP photos on recent smartphones using the standard mobile ML libraries, requiring less than 1 second to perform the inference, while for FullHD images it achieves real-time performance. The architecture of the model is flexible, allowing its complexity to be adjusted to devices of different computational power. To evaluate the performance of the model, we collected a novel Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The experiments demonstrated that, despite its compact size, the MicroISP model provides comparable or better visual results than traditional mobile ISP systems, while outperforming previously proposed efficient deep learning-based solutions. Finally, the model is also compatible with the latest mobile AI accelerators, achieving good runtime and low power consumption on smartphone NPUs and APUs. The code, dataset and pre-trained models are available on the project website: https://people.ee.ethz.ch/~ihnatova/microisp.html



Citations
Journal ArticleDOI

Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

TL;DR: In this article, the authors propose an efficient quantized image super-resolution solution that demonstrates real-time performance on mobile NPUs, being fully compatible with these accelerators and reaching up to 60 FPS when reconstructing Full HD resolution images.
Proceedings ArticleDOI

PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

TL;DR: Gmalivenko et al. propose a novel PyNET-V2 Mobile CNN architecture designed specifically for edge devices, able to process RAW 12MP photos directly on mobile phones in under 1.5 seconds while producing high perceptual photo quality.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
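The Adam update based on adaptive estimates of lower-order moments can be sketched in a few lines. The following is a minimal numpy illustration of the moment estimates and bias correction, not the authors' reference implementation; the quadratic objective and hyperparameter values are arbitrary choices for the demo:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: adaptive estimates of the first and second moments."""
    m = beta1 * m + (1 - beta1) * grad         # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2    # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)               # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# toy example: minimize f(x) = x^2 starting from x = 5
theta = np.array([5.0])
m, v = np.zeros(1), np.zeros(1)
for t in range(1, 3001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
```

Because the update is normalized by the second-moment estimate, the effective step size stays roughly bounded by the learning rate regardless of the gradient scale.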
Journal ArticleDOI

Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

TL;DR: Zhang et al. propose feed-forward denoising convolutional neural networks (DnCNNs) to handle Gaussian denoising with an unknown noise level.
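The residual learning formulation used by DnCNN has the network regress the noise map rather than the clean image, so restoration is a subtraction. A minimal sketch of that formulation, with a hypothetical oracle predictor standing in for the trained CNN:

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(8, 8))
noise = rng.normal(0.0, 0.1, size=(8, 8))
noisy = clean + noise

def denoise_residual(noisy_img, predict_noise):
    # Residual learning: the model predicts the noise component,
    # and the clean image is recovered by subtracting it.
    return noisy_img - predict_noise(noisy_img)

# hypothetical "oracle" noise predictor standing in for a trained DnCNN
oracle = lambda img: noise
restored = denoise_residual(noisy, oracle)
```

Learning the residual is easier in practice because the noise map is closer to zero-mean than the clean image, which works well with batch normalization.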
Posted Content

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

TL;DR: This work considers image transformation problems and proposes the use of perceptual loss functions for training feed-forward networks for image transformation tasks, showing results on image style transfer, where a feed-forward network is trained to solve, in real time, the optimization problem proposed by Gatys et al.
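A perceptual loss compares images in the feature space of a fixed pretrained network rather than pixel space. Below is a minimal sketch of that idea; the random linear map `W` is a hypothetical stand-in for a pretrained feature extractor such as a VGG layer, not the loss network used in the paper:

```python
import numpy as np

def perceptual_loss(output, target, feature_fn):
    """Mean squared error computed in feature space rather than pixel space."""
    f_out, f_tgt = feature_fn(output), feature_fn(target)
    return float(np.mean((f_out - f_tgt) ** 2))

# toy stand-in for a pretrained feature extractor (e.g. one VGG layer)
rng = np.random.default_rng(42)
W = rng.normal(size=(4, 16))                    # hypothetical "features"
features = lambda x: x.reshape(-1, 4) @ W

a = rng.uniform(size=(4, 4))
b = a + 0.01 * rng.normal(size=(4, 4))          # small perturbation of a
c = rng.uniform(size=(4, 4))                    # unrelated image
loss_close = perceptual_loss(a, b, features)
loss_far = perceptual_loss(a, c, features)
```

Perceptually similar images stay close in feature space even when their per-pixel differences are large, which is why this loss favors visually plausible reconstructions.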
Posted Content

Accurate Image Super-Resolution Using Very Deep Convolutional Networks

TL;DR: This work presents a highly accurate single-image super-resolution (SR) method using a very deep convolutional network inspired by the VGG net used for ImageNet classification, and uses extremely high learning rates enabled by adjustable gradient clipping.
Book ChapterDOI

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks

TL;DR: ESRGAN improves the perceptual loss by using the features before activation, which provides stronger supervision for brightness consistency and texture recovery; it won first place in the PIRM2018-SR Challenge (region 3).
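The intuition behind comparing features before activation is that ReLU zeroes out negative responses, so any unit that is inactive for both images contributes nothing to the loss. A toy numpy illustration of that effect (the synthetic feature vectors here are assumptions for the demo, not ESRGAN features):

```python
import numpy as np

relu = lambda x: np.maximum(x, 0.0)

rng = np.random.default_rng(1)
feat_a = rng.normal(size=100)                 # "generated" image features
feat_b = feat_a + 0.1 * rng.normal(size=100)  # "ground-truth" features

# loss on pre-activation features sees every unit...
loss_pre = float(np.mean((feat_a - feat_b) ** 2))
# ...while the post-activation loss is zero wherever both
# responses are negative and get clipped by ReLU
loss_post = float(np.mean((relu(feat_a) - relu(feat_b)) ** 2))
dead_fraction = float(np.mean((feat_a < 0) & (feat_b < 0)))
```

Since ReLU is 1-Lipschitz, the post-activation differences can only shrink; using pre-activation features keeps the full, dense supervision signal.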