
Showing papers by "Kui Jiang published in 2020"


Proceedings Article•DOI•
14 Jun 2020
TL;DR: This work explores multi-scale collaborative representation of rain streaks, from the perspective of both input image scales and hierarchical deep features, in a unified framework termed the multi-scale progressive fusion network (MSPFN) for single image rain streak removal.
Abstract: Rain streaks in the air appear at various blurring degrees and resolutions due to their different distances from the camera. Similar rain patterns are visible in a rain image as well as in its multi-scale (or multi-resolution) versions, which makes it possible to exploit such complementary information for rain streak representation. In this work, we explore the multi-scale collaborative representation of rain streaks from the perspective of input image scales and hierarchical deep features in a unified framework, termed the multi-scale progressive fusion network (MSPFN), for single image rain streak removal. For similar rain streaks at different positions, we employ recurrent calculation to capture the global texture, which allows us to explore the complementary and redundant information in the spatial dimension to characterize target rain streaks. Besides, we construct a multi-scale pyramid structure and further introduce an attention mechanism to guide the fine fusion of this correlated information from different scales. This multi-scale progressive fusion strategy not only promotes cooperative representation but also facilitates end-to-end training. Our proposed method is extensively evaluated on several benchmark datasets and achieves state-of-the-art results. Moreover, we conduct experiments on joint deraining, detection, and segmentation tasks, inspiring a new research direction of vision-task-driven image deraining. The source code is available at https://github.com/kuihua/MSPFN.
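
To make the fusion idea concrete, below is a minimal, illustrative PyTorch sketch of attention-guided fusion across an input image pyramid. It is not the authors' MSPFN code (which uses recurrent units and a deeper design); all module names, channel widths, and the three-scale setup are assumptions.

```python
# Illustrative sketch only -- not the authors' MSPFN code. Module names,
# channel widths, and the three-scale pyramid are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleAttentionFusion(nn.Module):
    def __init__(self, channels=32, num_scales=3):
        super().__init__()
        self.num_scales = num_scales
        # One shallow feature extractor per pyramid level.
        self.extractors = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())
            for _ in range(num_scales)])
        # Attention head predicts one fusion weight map per scale.
        self.attention = nn.Conv2d(channels * num_scales, num_scales, 1)
        self.reconstruct = nn.Conv2d(channels * num_scales, 3, 3, padding=1)

    def forward(self, x):
        feats = []
        for i, extractor in enumerate(self.extractors):
            scaled = F.interpolate(x, scale_factor=1 / 2 ** i,
                                   mode='bilinear', align_corners=False)
            # Bring every scale back to full resolution before fusion.
            feats.append(F.interpolate(extractor(scaled), size=x.shape[-2:],
                                       mode='bilinear', align_corners=False))
        stacked = torch.cat(feats, dim=1)
        weights = torch.softmax(self.attention(stacked), dim=1)  # (B, S, H, W)
        fused = torch.cat([feats[i] * weights[:, i:i + 1]
                           for i in range(self.num_scales)], dim=1)
        # Predict the rain streak layer and subtract it from the input.
        return x - self.reconstruct(fused)

derained = MultiScaleAttentionFusion()(torch.randn(1, 3, 64, 64))
print(derained.shape)  # torch.Size([1, 3, 64, 64])
```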

361 citations


Posted Content•
TL;DR: A multi-granularity masked face recognition model is developed that achieves 95% accuracy, exceeding the results reported by industry; the accompanying RMFRD is, to the authors' knowledge, currently the world's largest real-world masked face dataset.
Abstract: To effectively prevent the spread of the COVID-19 virus, almost everyone wears a mask during the coronavirus epidemic. This renders conventional face recognition technology ineffective in many cases, such as community access control, face access control, facial attendance, and facial security checks at train stations. It is therefore urgent to improve the recognition performance of existing face recognition technology on masked faces. Most current advanced face recognition approaches are based on deep learning and depend on a large number of face samples; however, at present there are no publicly available masked face recognition datasets. To this end, this work proposes three types of masked face datasets: the Masked Face Detection Dataset (MFDD), the Real-world Masked Face Recognition Dataset (RMFRD), and the Simulated Masked Face Recognition Dataset (SMFRD). Among them, to the best of our knowledge, RMFRD is currently the world's largest real-world masked face dataset. These datasets are freely available to industry and academia, and various applications on masked faces can be developed based on them. The multi-granularity masked face recognition model we developed achieves 95% accuracy, exceeding the results reported by industry. Our datasets are available at: this https URL.

277 citations


Journal Article•DOI•
TL;DR: A novel hierarchical dense connection network (HDN) is advocated for image SR that outperforms state-of-the-art methods in terms of quantitative indicators and realistic visual effects, while enjoying fast and accurate reconstruction.
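
Since only the TL;DR is shown for this paper, the following PyTorch sketch illustrates the generic dense-connection pattern that hierarchical dense connection networks build on; it is not HDN itself, and the layer counts and widths are assumptions.

```python
# Illustrative, generic dense block -- not the paper's HDN configuration.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, channels=32, growth=16, num_layers=4):
        super().__init__()
        # Each layer sees the concatenation of all earlier outputs.
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                nn.ReLU())
            for i in range(num_layers)])
        # 1x1 fusion back to the input width enables residual learning.
        self.fuse = nn.Conv2d(channels + num_layers * growth, channels, 1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return x + self.fuse(torch.cat(feats, dim=1))

print(DenseBlock()(torch.randn(1, 32, 24, 24)).shape)  # (1, 32, 24, 24)
```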

113 citations


Journal Article•DOI•
Peng Yi, Zhongyuan Wang, Kui Jiang, Zhenfeng Shao, Jiayi Ma
TL;DR: A multi-temporal ultra-dense memory (MTUDM) network for video super-resolution is proposed that outperforms state-of-the-art methods by a large margin; it adopts a multi-temporal information fusion (MTIF) strategy to merge the extracted temporal feature maps of consecutive frames, improving accuracy without requiring much extra computational cost.
Abstract: Video super-resolution (SR) aims to reconstruct high-resolution (HR) frames from consecutive low-resolution (LR) frames. It is crucial for video SR to harness both inter-frame temporal correlations and intra-frame spatial correlations. Previous CNN-based video SR methods mostly adopt a single-channel structure and a single memory module, so they cannot fully exploit the inter-frame temporal correlations specific to video. To this end, this paper proposes a multi-temporal ultra-dense memory (MTUDM) network for video super-resolution. In particular, we embed convolutional long short-term memory (ConvLSTM) into an ultra-dense residual block (UDRB) to construct an ultra-dense memory block (UDMB) for extracting and retaining spatio-temporal correlations. This design also reduces the layer depth by expanding the width, thus avoiding training difficulties such as exploding and vanishing gradients in a large model. We further adopt a multi-temporal information fusion (MTIF) strategy to merge the extracted temporal feature maps of consecutive frames, improving accuracy without requiring much extra computational cost. Experimental results on extensive public datasets demonstrate that our method outperforms state-of-the-art methods by a large margin.
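
The key structural idea, embedding a ConvLSTM so spatio-temporal state is carried across frames, can be sketched as follows. This is a generic ConvLSTM cell for illustration, not the paper's UDMB; the kernel size and channel width are assumptions.

```python
# Illustrative, generic ConvLSTM cell -- not the paper's UDMB.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # All four gates from the concatenated input and hidden state.
        self.gates = nn.Conv2d(2 * channels, 4 * channels, 3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

# Carry spatio-temporal state across a short window of frame features.
cell = ConvLSTMCell(16)
h = c = torch.zeros(1, 16, 32, 32)
for _ in range(5):  # five consecutive frames' feature maps
    out, (h, c) = cell(torch.randn(1, 16, 32, 32), (h, c))
print(out.shape)  # torch.Size([1, 16, 32, 32])
```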

101 citations


Journal Article•DOI•
TL;DR: An adaptive-threshold-based multi-model fusion network (ATMFN) for compressed face hallucination, which unifies different deep learning models to take advantage of their respective learning merits.
Abstract: Although tremendous strides have recently been made in face hallucination, existing methods based on a single deep learning framework can hardly recover satisfactory fine facial features from tiny faces under complex degradation. This article advocates an adaptive-threshold-based multi-model fusion network (ATMFN) for compressed face hallucination, which unifies different deep learning models to take advantage of their respective learning merits. First, we construct CNN-, GAN-, and RNN-based underlying super-resolvers to produce candidate SR results. Further, an attention subnetwork is proposed to learn individual fusion weight matrices that capture the most informative components of the candidate SR faces. In particular, the hyper-parameters of the fusion matrices and the underlying networks are optimized together in an end-to-end manner to drive them toward collaborative learning. Finally, a threshold-based fusion and reconstruction module exploits the candidates' complementarity and thus generates high-quality face images. Extensive experiments on benchmark face datasets and real-world samples show that our model outperforms state-of-the-art SR methods in terms of quantitative indicators and visual effects. The code and configurations are released at https://github.com/kuihua/ATMFN.
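
The attention-guided fusion of candidate SR outputs can be sketched as below. This illustrative snippet fuses several candidate images with learned per-pixel weights; the actual ATMFN additionally uses CNN/GAN/RNN super-resolvers and a threshold-based step not shown here, and all sizes are assumptions.

```python
# Illustrative fusion of candidate SR images with learned per-pixel weights,
# mirroring ATMFN's attention-guided fusion idea -- not the released code.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, num_candidates=3):
        super().__init__()
        # Predict one fusion weight map per candidate SR image.
        self.attention = nn.Sequential(
            nn.Conv2d(3 * num_candidates, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, num_candidates, 1))

    def forward(self, candidates):  # list of (B, 3, H, W) SR candidates
        stacked = torch.stack(candidates, dim=1)               # (B, N, 3, H, W)
        weights = torch.softmax(
            self.attention(torch.cat(candidates, dim=1)), dim=1)  # (B, N, H, W)
        return (stacked * weights.unsqueeze(2)).sum(dim=1)

candidates = [torch.rand(1, 3, 64, 64) for _ in range(3)]
print(AttentionFusion()(candidates).shape)  # torch.Size([1, 3, 64, 64])
```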

65 citations


Journal Article•DOI•
Zhongyuan Wang, Kui Jiang, Peng Yi, Zhen Han, Zheng He
TL;DR: An ultra-dense GAN (udGAN) is proposed for image SR, where the internal layout of the residual block is reformed into a two-dimensional matrix topology; the additional diagonal connections let the model retain enough pathways with fewer layers.
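
The "two-dimensional matrix topology with diagonal connections" can be pictured with the following sketch of a residual block whose conv units sit on a small grid. This is an illustration of the general mechanism, not the udGAN implementation; the grid size and widths are assumptions.

```python
# Illustrative grid-shaped residual block with diagonal links -- not udGAN.
import torch
import torch.nn as nn

class GridResidualBlock(nn.Module):
    def __init__(self, channels=32, rows=2, cols=2):
        super().__init__()
        self.rows, self.cols = rows, cols
        self.nodes = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(rows * cols)])

    def forward(self, x):
        out = [[None] * self.cols for _ in range(self.rows)]
        for r in range(self.rows):
            for c in range(self.cols):
                inp = x if (r == 0 and c == 0) else 0
                if c > 0:
                    inp = inp + out[r][c - 1]        # horizontal edge
                if r > 0:
                    inp = inp + out[r - 1][c]        # vertical edge
                if r > 0 and c > 0:
                    inp = inp + out[r - 1][c - 1]    # diagonal edge
                out[r][c] = self.nodes[r * self.cols + c](inp)
        return x + out[-1][-1]  # residual connection around the whole grid

print(GridResidualBlock()(torch.randn(1, 32, 16, 16)).shape)  # (1, 32, 16, 16)
```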

50 citations


Journal Article•DOI•
TL;DR: A simple yet effective dual-path deep fusion network (DPDFN) for face image super-resolution (SR) that requires no additional face prior, learning the global facial shape and local facial components through two individual branches.
Abstract: Along with the performance improvement of deep-learning-based face hallucination methods, various face priors (facial shape, facial landmark heatmaps, or parsing maps) have been used to describe holistic and partial facial features, making the generation of super-resolved face images expensive and laborious. To deal with this problem, we present a simple yet effective dual-path deep fusion network (DPDFN) for face image super-resolution (SR) that requires no additional face prior and learns the global facial shape and local facial components through two individual branches. The proposed DPDFN is composed of three components: a global memory subnetwork (GMN), a local reinforcement subnetwork (LRN), and a fusion and reconstruction module (FRM). In particular, GMN characterizes the holistic facial shape by employing recurrent dense residual learning to excavate wide-range context across spatial series. Meanwhile, LRN is committed to learning local facial components, focusing on the patch-wise mapping relations between low-resolution (LR) and high-resolution (HR) space in local regions rather than over the entire image. Furthermore, by aggregating the global and local facial information from the preceding dual-path subnetworks, FRM generates the corresponding high-quality face image. Experimental results of face hallucination on public face datasets and face recognition on real-world datasets (VGGface and SCFace) show superiority in both visual effect and objective indicators over previous state-of-the-art methods.
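
The dual-path layout can be sketched as below: a global branch with large receptive fields, a local branch with small kernels, and a fusion/reconstruction step. This is an illustrative skeleton, not DPDFN's actual GMN/LRN/FRM modules; all sizes are assumptions.

```python
# Illustrative dual-path skeleton in the spirit of DPDFN -- not its code.
import torch
import torch.nn as nn

class DualPathSR(nn.Module):
    def __init__(self, channels=32, scale=2):
        super().__init__()
        # Global branch: large receptive field for holistic facial shape.
        self.global_branch = nn.Sequential(
            nn.Conv2d(3, channels, 7, padding=3), nn.ReLU(),
            nn.Conv2d(channels, channels, 7, padding=3), nn.ReLU())
        # Local branch: small kernels to focus on local facial components.
        self.local_branch = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        # Fusion and reconstruction with sub-pixel upsampling.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, 3 * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale))

    def forward(self, lr):
        g = self.global_branch(lr)
        l = self.local_branch(lr)
        return self.fuse(torch.cat([g, l], dim=1))

print(DualPathSR()(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 3, 64, 64])
```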

31 citations


Posted Content•
TL;DR: A multi-scale progressive fusion network (MSPFN) is proposed for single image rain streak removal, achieving state-of-the-art performance.
Abstract: Rain streaks in the air appear at various blurring degrees and resolutions due to their different distances from the camera. Similar rain patterns are visible in a rain image as well as in its multi-scale (or multi-resolution) versions, which makes it possible to exploit such complementary information for rain streak representation. In this work, we explore the multi-scale collaborative representation of rain streaks from the perspective of input image scales and hierarchical deep features in a unified framework, termed the multi-scale progressive fusion network (MSPFN), for single image rain streak removal. For similar rain streaks at different positions, we employ recurrent calculation to capture the global texture, which allows us to explore the complementary and redundant information in the spatial dimension to characterize target rain streaks. Besides, we construct a multi-scale pyramid structure and further introduce an attention mechanism to guide the fine fusion of this correlated information from different scales. This multi-scale progressive fusion strategy not only promotes cooperative representation but also facilitates end-to-end training. Our proposed method is extensively evaluated on several benchmark datasets and achieves state-of-the-art results. Moreover, we conduct experiments on joint deraining, detection, and segmentation tasks, inspiring a new research direction of vision-task-driven image deraining. The source code is available at this https URL.

14 citations


Journal Article•DOI•
TL;DR: A new rain removal network based on a clique recursive feedback mechanism is proposed; it considers the interaction of features between different convolution layers and constructs a residual clique block (RCB) to infer local information, allowing the network to rectify the model parameters in the RCB alternately.
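
A minimal sketch of the recursive-feedback idea, two shared conv layers alternately refining a residual block's output, is given below; it illustrates the general mechanism only, not the paper's RCB, and the iteration count is an assumption.

```python
# Illustrative recursive-feedback residual block -- not the paper's RCB.
import torch
import torch.nn as nn

class RecursiveFeedbackBlock(nn.Module):
    def __init__(self, channels=32, iterations=3):
        super().__init__()
        self.iterations = iterations
        self.forward_conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.feedback_conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        h = x
        for _ in range(self.iterations):
            # Alternate passes: the feedback path rectifies the forward path,
            # with the block input re-injected as a residual each round.
            h = self.act(self.forward_conv(h))
            h = self.act(self.feedback_conv(h)) + x
        return h

print(RecursiveFeedbackBlock()(torch.randn(1, 32, 16, 16)).shape)
```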

5 citations


Proceedings Article•DOI•
04 May 2020
TL;DR: A novel attention-guided deraining network (ADN) is proposed for rain streak removal; it decomposes the rain streaks into multiple rain streak layers and models them individually along the stages of the network to match the increasing levels of abstraction.
Abstract: Due to diverse rain shapes, directions, and densities, as well as different distances to the camera, rain streaks in the air are interweaved and overlapped. However, most existing deraining methods are inherently oblivious to this phenomenon and tend to learn a single rain streak layer to simulate this complex distribution, consequently failing to restore high-quality rain-free images. To solve this problem, along with stage-wise learning, we propose a novel attention-guided deraining network (ADN) for rain streak removal. Specifically, we decompose the rain streaks into multiple rain streak layers and model them individually along the stages of the network to match the increasing levels of abstraction. Moreover, an attention mechanism is utilized to guide the fusion of these rain streak layers by handling the overlaps between them. Extensive experiments on several benchmark datasets and real-world scenarios show substantial improvements in both quantitative indicators and visual effects over the current top-performing methods.
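
The stage-wise decomposition into multiple rain streak layers with attention-guided fusion can be sketched as follows; this is an illustrative skeleton rather than ADN itself, and the stage count and widths are assumptions.

```python
# Illustrative stage-wise rain-layer decomposition with attention fusion,
# in the spirit of ADN -- not the authors' implementation.
import torch
import torch.nn as nn

class StageWiseDeraining(nn.Module):
    def __init__(self, channels=24, num_stages=3):
        super().__init__()
        # One estimator per stage predicts a partial rain streak layer.
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(channels, 3, 3, padding=1))
            for _ in range(num_stages)])
        # Attention weights handle overlaps between the predicted layers.
        self.attention = nn.Conv2d(3 * num_stages, num_stages, 1)

    def forward(self, rainy):
        layers, current = [], rainy
        for stage in self.stages:
            streaks = stage(current)
            layers.append(streaks)
            current = current - streaks  # pass the partially cleaned image on
        weights = torch.softmax(self.attention(torch.cat(layers, dim=1)), dim=1)
        rain = sum(layers[i] * weights[:, i:i + 1] for i in range(len(layers)))
        return rainy - rain

print(StageWiseDeraining()(torch.rand(1, 3, 48, 48)).shape)  # (1, 3, 48, 48)
```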

1 citation


Proceedings Article•DOI•
Baojin Huang, Zheng He, Zhongyuan Wang, Kui Jiang, Guangcheng Wang
01 Nov 2020
TL;DR: A lightweight progressive residual clique network (PRCN) is proposed for image super-resolution, built on a two-stage residual channel separation block (RCSB) and long-skip connections.
Abstract: Deeper and wider convolutional neural networks (CNNs) have been widely applied to the single image super-resolution (SR) task for their appealing performance. However, an enormous parameter memory footprint hinders real-time application on mobile devices, especially in energy-sensitive environments. In this work, we take both reconstruction performance and efficiency into consideration and propose a lightweight progressive residual clique network (PRCN) for image SR. PRCN is built on a two-stage residual channel separation block (RCSB) and long-skip connections. First, we divide the input into four channel groups that separately learn texture details, immediately followed by a primary fusion that establishes cross-channel correspondence in the first stage. Then we perform a further fusion on the outputs of the first stage to constitute a clique for refinement in the second stage. Meanwhile, we employ SENet to improve the outputs of the second stage using the separate features of the first stage. This design not only enforces the correlation across channels but also allows fewer densely connected blocks. Experimental results on public datasets show that PRCN outperforms state-of-the-art methods in terms of performance and complexity.
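
The channel-separation-then-fusion pattern can be sketched as below: the input is chunked into four groups, refined per group, fused in two stages, and reweighted by an SE-style gate. This is an illustrative approximation of the described RCSB, not the PRCN code; all sizes are assumptions.

```python
# Illustrative two-stage channel-separation block with an SE-style gate,
# loosely following the RCSB description -- not the PRCN implementation.
import torch
import torch.nn as nn

class ChannelSeparationBlock(nn.Module):
    def __init__(self, channels=32, groups=4):
        super().__init__()
        assert channels % groups == 0
        gc = channels // groups
        # Each channel group learns texture details separately.
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(gc, gc, 3, padding=1), nn.ReLU())
            for _ in range(groups)])
        self.primary_fuse = nn.Conv2d(channels, channels, 1)           # stage 1
        self.second_fuse = nn.Conv2d(channels, channels, 3, padding=1)  # stage 2
        # Squeeze-and-excitation gate over the fused channels.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid())

    def forward(self, x):
        groups = torch.chunk(x, len(self.branches), dim=1)
        feats = torch.cat([b(g) for b, g in zip(self.branches, groups)], dim=1)
        fused = self.second_fuse(self.primary_fuse(feats))
        return x + fused * self.se(fused)

print(ChannelSeparationBlock()(torch.randn(1, 32, 20, 20)).shape)  # (1, 32, 20, 20)
```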

Patent•
Wang Zhongyuan, Kui Jiang, Peng Yi, Zou Qin, Zhen Han 
14 Apr 2020
TL;DR: In this article, a UAV image weak and small target enhancement extraction method is proposed, consisting of three steps: small target feature enhancement based on a constant-equal-resolution feature enhancement network, foreground target visual saliency enhancement based on an attention network, and target detection based on YOLOv3.
Abstract: The invention discloses a method for enhancing and extracting weak and small targets in unmanned aerial vehicle images. A better extraction effect is achieved by enhancing the structure and texture features of a weak and small target. The method comprises three steps: small target feature enhancement based on a constant-equal-resolution feature enhancement network, foreground target visual saliency enhancement based on an attention network, and target detection based on YOLOv3. The proposed constant-equal-resolution feature enhancement network increases the number of target feature points without increasing the spatial resolution of the image, preserving detection efficiency; an attention mechanism is introduced to accurately describe potential target areas, eliminating the interference of complex backgrounds and improving the robustness of the detection algorithm.