Author

Kui Jiang

Bio: Kui Jiang is an academic researcher from Wuhan University. His research focuses on computer science and artificial intelligence. He has an h-index of 13 and has co-authored 37 publications receiving 851 citations.


Papers
Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work explores the multi-scale collaborative representation for rain streaks from the perspective of input image scales and hierarchical deep features in a unified framework, termed multi-scale progressive fusion network (MSPFN), for single image rain streak removal.
Abstract: Rain streaks in the air appear in various blurring degrees and resolutions due to different distances from their positions to the camera. Similar rain patterns are visible in a rain image as well as in its multi-scale (or multi-resolution) versions, which makes it possible to exploit such complementary information for rain streak representation. In this work, we explore the multi-scale collaborative representation of rain streaks from the perspective of input image scales and hierarchical deep features in a unified framework, termed the multi-scale progressive fusion network (MSPFN), for single image rain streak removal. For similar rain streaks at different positions, we employ recurrent calculation to capture the global texture, allowing us to explore the complementary and redundant information along the spatial dimension to characterize target rain streaks. Besides, we construct a multi-scale pyramid structure and further introduce an attention mechanism to guide the fine fusion of this correlated information across scales. This multi-scale progressive fusion strategy not only promotes cooperative representation but also boosts end-to-end training. Our proposed method is extensively evaluated on several benchmark datasets and achieves state-of-the-art results. Moreover, we conduct experiments on joint deraining, detection, and segmentation tasks, inspiring a new research direction of vision-task-driven image deraining. The source code is available at https://github.com/kuihua/MSPFN.

361 citations
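
To make the fusion idea concrete, here is a minimal sketch of multi-scale pyramid features fused under channel attention to predict a rain-streak residual. It illustrates the general strategy only, not the authors' MSPFN; the module names, channel widths, and SE-style attention are all assumptions.

```python
# Minimal sketch of multi-scale pyramid fusion with channel attention for
# deraining, loosely inspired by the MSPFN abstract. All module names and
# sizes are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleBranch(nn.Module):
    """Shallow feature extractor applied to one pyramid level."""
    def __init__(self, ch=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.body(x)

class AttentiveFusionDerainer(nn.Module):
    def __init__(self, ch=32, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.branches = nn.ModuleList(ScaleBranch(ch) for _ in scales)
        fused = ch * len(scales)
        # SE-style channel attention guiding the fusion of scale features.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(fused, fused // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(fused // 4, fused, 1), nn.Sigmoid(),
        )
        self.tail = nn.Conv2d(fused, 3, 3, padding=1)  # predicts rain residual

    def forward(self, rainy):
        feats = []
        for s, branch in zip(self.scales, self.branches):
            x = F.avg_pool2d(rainy, s) if s > 1 else rainy  # pyramid level
            f = branch(x)
            if s > 1:  # bring every level back to full resolution
                f = F.interpolate(f, size=rainy.shape[-2:], mode='bilinear',
                                  align_corners=False)
            feats.append(f)
        fused = torch.cat(feats, dim=1)
        fused = fused * self.attn(fused)     # attention-guided fusion
        return rainy - self.tail(fused)      # remove the predicted streaks

derained = AttentiveFusionDerainer()(torch.rand(1, 3, 64, 64))
```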

Journal ArticleDOI
TL;DR: A generative adversarial network (GAN)-based edge-enhancement network (EEGAN) for robust satellite image SR reconstruction along with the adversarial learning strategy that is insensitive to noise is proposed.
Abstract: Current superresolution (SR) methods based on deep learning have shown remarkable comparative advantages but remain unsatisfactory in recovering high-frequency edge details of images under noise-contaminated imaging conditions, e.g., remote sensing satellite imaging. In this paper, we propose a generative adversarial network (GAN)-based edge-enhancement network (EEGAN) for robust satellite image SR reconstruction, along with an adversarial learning strategy that is insensitive to noise. In particular, EEGAN consists of two main subnetworks: an ultradense subnetwork (UDSN) and an edge-enhancement subnetwork (EESN). In UDSN, a group of 2-D dense blocks is assembled for feature extraction and to obtain an intermediate high-resolution result that looks sharp but is eroded with artifacts and noise, as in previous GAN-based methods. Then, EESN is constructed to extract and enhance the image contours by purifying the noise-contaminated components with mask processing. The recovered intermediate image and enhanced edges can be combined to generate a result that enjoys high credibility and clear contents. Extensive experiments on the Kaggle Open Source Dataset, Jilin-1 video satellite images, and Digitalglobe show superior reconstruction performance compared to state-of-the-art SR approaches.

305 citations
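
The EESN idea, extracting contours and purifying them with mask processing before recombination, can be sketched roughly as below. The fixed Laplacian edge extractor and the layer sizes are illustrative assumptions, not the published architecture.

```python
# Illustrative sketch of the edge-enhancement idea in EEGAN: extract contours
# from an intermediate SR image, suppress noisy components with a learned
# sigmoid mask, and add the cleaned edges back. Layer sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeEnhancer(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # Fixed Laplacian kernel as a simple stand-in for edge extraction.
        lap = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
        self.register_buffer('lap', lap.view(1, 1, 3, 3).repeat(3, 1, 1, 1))
        self.refine = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1),
        )
        # Mask branch that learns to keep credible edges and drop noise.
        self.mask = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Sigmoid())

    def forward(self, intermediate_sr):
        edges = F.conv2d(intermediate_sr, self.lap, padding=1, groups=3)
        clean_edges = self.refine(edges) * self.mask(edges)
        # Base image (edges removed) plus purified, enhanced edges.
        return (intermediate_sr - edges) + clean_edges

out = EdgeEnhancer()(torch.rand(1, 3, 48, 48))
```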

Posted Content
TL;DR: A multi-granularity masked face recognition model is developed that achieves 95% accuracy, exceeding the results reported by industry; the accompanying RMFRD is currently the world's largest real-world masked face dataset.
Abstract: In order to effectively prevent the spread of the COVID-19 virus, almost everyone wears a mask during the coronavirus epidemic. This renders conventional facial recognition technology largely ineffective in many cases, such as community access control, face access control, facial attendance, and facial security checks at train stations. Therefore, it is very urgent to improve the recognition performance of existing face recognition technology on masked faces. Most current advanced face recognition approaches are based on deep learning and depend on a large number of face samples. However, at present, there are no publicly available masked face recognition datasets. To this end, this work proposes three types of masked face datasets: the Masked Face Detection Dataset (MFDD), the Real-world Masked Face Recognition Dataset (RMFRD), and the Simulated Masked Face Recognition Dataset (SMFRD). Among them, to the best of our knowledge, RMFRD is currently the world's largest real-world masked face dataset. These datasets are freely available to industry and academia, and various applications on masked faces can be developed based on them. The multi-granularity masked face recognition model we developed achieves 95% accuracy, exceeding the results reported by the industry. Our datasets are available at: this https URL.

277 citations
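
As a toy illustration of how a simulated masked-face dataset in the spirit of SMFRD can be built from ordinary face images, the sketch below occludes the lower face region with a flat-colored patch. A real pipeline would warp a mask template to facial landmarks; the fixed region and color here are simplifying assumptions.

```python
# Toy sketch of the SMFRD idea: simulate a face mask by occluding the lower
# part of an aligned face crop. The fixed region and color are assumptions.
import numpy as np

def add_synthetic_mask(face: np.ndarray, color=(80, 110, 180)) -> np.ndarray:
    """face: HxWx3 uint8 aligned face crop; returns a masked copy."""
    masked = face.copy()
    h, w, _ = face.shape
    top = int(h * 0.55)                      # mask covers nose tip downward
    left, right = int(w * 0.10), int(w * 0.90)
    masked[top:h, left:right] = color        # flat-colored "surgical mask"
    return masked

face = (np.random.rand(112, 112, 3) * 255).astype(np.uint8)
masked_face = add_synthetic_mask(face)
```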

Proceedings ArticleDOI
01 Oct 2019
TL;DR: This study proposes a novel progressive fusion network for video SR, designed to make better use of spatio-temporal information, and shows it to be more efficient and effective than existing direct fusion, slow fusion, or 3D convolution strategies.
Abstract: Most previous fusion strategies either fail to fully utilize temporal information or cost too much time, yet how to effectively fuse temporal information from consecutive frames plays an important role in video super-resolution (SR). In this study, we propose a novel progressive fusion network for video SR, which is designed to make better use of spatio-temporal information and proves more efficient and effective than the existing direct fusion, slow fusion, or 3D convolution strategies. Under this progressive fusion framework, we further introduce an improved non-local operation to avoid the complex motion estimation and motion compensation (ME&MC) procedures used in previous video SR approaches. Extensive experiments on public datasets demonstrate that our method surpasses the state of the art by 0.96 dB on average and runs about 3 times faster, while requiring only about half the parameters.

173 citations
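
For reference, a standard non-local block of the kind this paper builds its improved operation on looks roughly as follows: it aggregates information across all spatial positions, which is what lets the network sidestep explicit ME&MC. The exact variant used in the paper differs; this is the textbook form.

```python
# Sketch of a standard non-local block (in the style of Wang et al., 2018).
# The paper proposes an improved variant; this shows only the base operation.
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.theta = nn.Conv2d(ch, ch // 2, 1)
        self.phi = nn.Conv2d(ch, ch // 2, 1)
        self.g = nn.Conv2d(ch, ch // 2, 1)
        self.out = nn.Conv2d(ch // 2, ch, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # B x HW x C/2
        k = self.phi(x).flatten(2)                     # B x C/2 x HW
        v = self.g(x).flatten(2).transpose(1, 2)       # B x HW x C/2
        attn = torch.softmax(q @ k, dim=-1)            # pairwise similarities
        y = (attn @ v).transpose(1, 2).reshape(b, c // 2, h, w)
        return x + self.out(y)                         # residual connection

y = NonLocalBlock(32)(torch.rand(1, 32, 16, 16))
```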

Journal ArticleDOI
TL;DR: This paper proposes a multi-memory CNN (MMCNN) for video SR, cascading an optical flow network and an image-reconstruction network; it shows superiority over state-of-the-art methods in terms of PSNR and visual quality, surpassing the best counterpart method by up to 1 dB.
Abstract: Video super-resolution (SR) focuses on reconstructing high-resolution frames from consecutive low-resolution (LR) frames. Most previous video SR methods based on convolutional neural networks (CNN) use a direct connection and a single-memory module within the network, and thus fail to make full use of the spatio-temporal complementary information in the observed LR frames. To fully exploit the spatio-temporal correlations between adjacent LR frames and reveal more realistic details, this paper proposes a multi-memory CNN (MMCNN) for video SR, cascading an optical flow network and an image-reconstruction network. A series of residual blocks engaged in utilizing intra-frame spatial correlations is proposed for feature extraction and reconstruction. In particular, instead of using a single-memory module, we embed convolutional long short-term memory into the residual block, forming a multi-memory residual block that progressively extracts and retains inter-frame temporal correlations between consecutive LR frames. We conduct extensive experiments on numerous testing datasets with respect to different scaling factors. Our proposed MMCNN shows superiority over state-of-the-art methods in terms of PSNR and visual quality and surpasses the best counterpart method by up to 1 dB.

135 citations
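
The multi-memory residual block can be sketched as a ConvLSTM cell embedded between the convolutions of a residual block, so that the cell state carries inter-frame memory across an unrolled frame sequence. The channel sizes and exact layer ordering below are assumptions, not the paper's implementation.

```python
# Minimal ConvLSTM cell embedded in a residual block, sketching the
# "multi-memory residual block" idea. Gate layout follows the standard
# ConvLSTM formulation; channel sizes are assumptions.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, ch):
        super().__init__()
        # One conv produces all four gates from [input, hidden].
        self.gates = nn.Conv2d(2 * ch, 4 * ch, 3, padding=1)

    def forward(self, x, h, c):
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g            # update the cell memory
        h = o * c.tanh()             # new hidden state
        return h, c

class MemoryResBlock(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.cell = ConvLSTMCell(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x, h, c):
        f = torch.relu(self.conv1(x))
        h, c = self.cell(f, h, c)    # retain inter-frame correlations
        return x + self.conv2(h), h, c   # residual output + carried memory

block, feat = MemoryResBlock(32), torch.rand(1, 32, 16, 16)
h = c = torch.zeros_like(feat)
for _ in range(3):                   # unroll over consecutive frames
    feat, h, c = block(feat, h, c)
```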


Cited by
Proceedings ArticleDOI
04 Feb 2021
TL;DR: MPRNet proposes a multi-stage architecture that progressively learns restoration functions for degraded inputs, breaking the overall recovery process into more manageable steps, and introduces a novel per-pixel adaptive design that leverages in-situ supervised attention to reweight local features.
Abstract: Image restoration tasks demand a complex balance between spatial details and high-level contextualized information while recovering images. In this paper, we propose a novel synergistic design that can optimally balance these competing goals. Our main proposal is a multi-stage architecture that progressively learns restoration functions for the degraded inputs, thereby breaking down the overall recovery process into more manageable steps. Specifically, our model first learns contextualized features using encoder-decoder architectures and later combines them with a high-resolution branch that retains local information. At each stage, we introduce a novel per-pixel adaptive design that leverages in-situ supervised attention to reweight the local features. A key ingredient in such a multi-stage architecture is the information exchange between different stages. To this end, we propose a two-faceted approach where information is not only exchanged sequentially from early to late stages, but lateral connections between feature processing blocks also exist to avoid any loss of information. The resulting tightly interlinked multi-stage architecture, named MPRNet, delivers strong performance gains on ten datasets across a range of tasks including image deraining, deblurring, and denoising. The source code and pre-trained models are available at https://github.com/swz30/MPRNet.

716 citations
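
A hedged sketch of the supervised-attention idea: each stage emits a restored image (supervised by a loss against ground truth), derives per-pixel attention maps from it, and uses them to reweight the features passed to the next stage. This follows the abstract's description; the layer choices below are assumptions, not the released implementation.

```python
# Sketch of a supervised-attention-style module: predict a restored image at
# the current stage, derive per-pixel attention from it, and reweight the
# stage features before passing them on. Details are assumptions.
import torch
import torch.nn as nn

class SupervisedAttention(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.to_img = nn.Conv2d(ch, 3, 3, padding=1)   # stage residual image
        self.to_attn = nn.Conv2d(3, ch, 3, padding=1)  # image -> attention
        self.to_feat = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, feats, degraded):
        restored = degraded + self.to_img(feats)  # supervised against GT
        attn = torch.sigmoid(self.to_attn(restored))
        feats = feats + self.to_feat(feats) * attn  # per-pixel reweighting
        return feats, restored

f, x = torch.rand(1, 32, 64, 64), torch.rand(1, 3, 64, 64)
feats_next, stage_out = SupervisedAttention()(f, x)
```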

Journal ArticleDOI
TL;DR: A hybrid model using deep and classical machine learning for face mask detection is presented; its SVM classifier achieves 99.64% testing accuracy on RMFD.

540 citations
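
The hybrid pipeline the TL;DR describes, deep features feeding a classical classifier, can be sketched as a frozen pretrained CNN followed by an SVM. The ResNet-18 backbone and RBF kernel below are illustrative assumptions, not necessarily the paper's choices.

```python
# Sketch of a hybrid deep + classical pipeline: a pretrained CNN as a frozen
# feature extractor feeding an SVM classifier. Backbone and kernel choice
# are assumptions for illustration.
import torch
import torchvision.models as models
from sklearn.svm import SVC

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()      # expose 512-d penultimate features
backbone.eval()

@torch.no_grad()
def extract(images):                   # images: N x 3 x 224 x 224, normalized
    return backbone(images).numpy()

# Dummy stand-ins for (masked / unmasked) face crops and labels.
train_x, train_y = torch.rand(20, 3, 224, 224), [0, 1] * 10
clf = SVC(kernel='rbf').fit(extract(train_x), train_y)
pred = clf.predict(extract(torch.rand(4, 3, 224, 224)))
```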

Proceedings ArticleDOI
13 May 2021
TL;DR: This paper designs a simple and powerful multi-stage network named HINet, consisting of two subnetworks, which surpasses the state of the art on various image restoration tasks.
Abstract: In this paper, we explore the role of Instance Normalization in low-level vision tasks. Specifically, we present a novel block, the Half Instance Normalization Block (HIN Block), to boost the performance of image restoration networks. Based on the HIN Block, we design a simple and powerful multi-stage network named HINet, which consists of two subnetworks. With the help of the HIN Block, HINet surpasses the state of the art (SOTA) on various image restoration tasks. For image denoising, we exceed the SOTA by 0.11 dB and 0.28 dB in PSNR on the SIDD dataset, with only 7.5% and 30% of its multiplier-accumulator operations (MACs) and 6.8× and 2.9× speedups, respectively. For image deblurring, we achieve comparable performance with 22.5% of its MACs and a 3.3× speedup on the REDS and GoPro datasets. For image deraining, we exceed the SOTA by 0.3 dB in PSNR on the average result across multiple datasets with a 1.4× speedup. With HINet, we won first place in the NTIRE 2021 Image Deblurring Challenge, Track 2: JPEG Artifacts, with a PSNR of 29.70.

210 citations
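
A sketch of the Half Instance Normalization idea: inside a residual block, apply InstanceNorm to half of the channels and leave the other half untouched before recombining, so IN's benefits are gained without normalizing away all content statistics. The exact layer ordering in HINet may differ; this follows the abstract's description.

```python
# Sketch of a Half Instance Normalization block: normalize half the channels
# with InstanceNorm, keep the other half untouched, then recombine inside a
# residual block. Layer ordering is an assumption based on the abstract.
import torch
import torch.nn as nn

class HINBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.norm = nn.InstanceNorm2d(out_ch // 2, affine=True)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.skip = nn.Conv2d(in_ch, out_ch, 1)  # identity path
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        y = self.conv1(x)
        a, b = y.chunk(2, dim=1)                 # split channels in half
        y = self.act(torch.cat([self.norm(a), b], dim=1))
        y = self.act(self.conv2(y))
        return y + self.skip(x)                  # residual connection

out = HINBlock(16, 32)(torch.rand(1, 16, 24, 24))
```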