scispace - formally typeset
Author

Yi-Qing Wang

Bio: Yi-Qing Wang is an academic researcher from École Normale Supérieure. The author has contributed to research in topics: Artificial intelligence & Computer science. The author has an h-index of 4, having co-authored 4 publications receiving 241 citations.

Papers
Journal ArticleDOI
TL;DR: A complete algorithmic description, a learning code, and a learned face detector applicable to any color image are proposed, along with a post-processing step that reduces detection redundancy using a robustness argument.
Abstract: In this article, we decipher the Viola-Jones algorithm, the first ever real-time face detection system. There are three ingredients working in concert to enable a fast and accurate detection: the integral image for feature computation, AdaBoost for feature selection and an attentional cascade for efficient computational resource allocation. Here we propose a complete algorithmic description, a learning code and a learned face detector that can be applied to any color image. Since the Viola-Jones algorithm typically gives multiple detections, a post-processing step is also proposed to reduce detection redundancy using a robustness argument. Source Code The source code and the online demo are accessible at the IPOL web page of this article.
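As a concrete illustration of the first ingredient (a minimal sketch, not the authors' published code), the integral image stores 2D prefix sums so that the sum of any rectangle, and hence any Haar-like feature, can be evaluated with four lookups:

```python
def integral_image(img):
    """Build a (h+1) x (w+1) table of 2D prefix sums for a grayscale image
    given as a list of rows, so every rectangle sum costs four lookups."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            # inclusion-exclusion on the three already-filled neighbours
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in the inclusive rectangle [top..bottom] x [left..right]."""
    return (ii[bottom + 1][right + 1] - ii[top][right + 1]
            - ii[bottom + 1][left] + ii[top][left])
```

A Haar-like feature is then just a signed combination of a few such rectangle sums, which is what makes per-window feature evaluation constant-time regardless of window size.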

259 citations

Journal ArticleDOI
TL;DR: A probabilistic view of an existing algorithm, piecewise linear estimation (PLE), for image inpainting is presented, which leads to several theoretical and numerical improvements based on an effective use of Gaussian mixtures.
Abstract: Gaussian mixture is a powerful tool for modeling the patch prior. In this work, a probabilistic view of an existing algorithm, piecewise linear estimation (PLE), for image inpainting is presented, which leads to several theoretical and numerical improvements based on an effective use of Gaussian mixtures. Source Code An ANSI C++ implementation of the algorithm has been peer reviewed and is accessible at the IPOL web page of this article.
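The core per-patch step in such Gaussian-prior methods is a Wiener-type MAP estimate. A simplified component-wise sketch (assuming a diagonalized Gaussian, not the paper's actual implementation) looks like this:

```python
def wiener_shrink(y, mean, var, sigma2):
    """MAP estimate of x from the noisy observation y = x + n, with
    n ~ N(0, sigma2) and the prior x ~ N(mean, diag(var)), applied
    component-wise: x_hat = mean + var/(var + sigma2) * (y - mean)."""
    return [m + (v / (v + sigma2)) * (yi - m)
            for yi, m, v in zip(y, mean, var)]
```

Directions with large prior variance are kept almost untouched, while low-variance directions are shrunk toward the cluster mean; in a mixture model this estimate is computed under the Gaussian component that best explains the patch.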

19 citations

Journal ArticleDOI
TL;DR: This article focuses on the implementation of S-PLE (SURE-guided Piecewise Linear Estimation) and demonstrates its performance by comparing it with several other acclaimed algorithms.
Abstract: SURE (Stein’s Unbiased Risk Estimator) guided Piecewise Linear Estimation (S-PLE) is a recently introduced patch-based state-of-the-art denoising algorithm. In this article, we focus on its implementation and show its performance by comparing it with several other acclaimed algorithms. Source Code ANSI C source code for both S-PLE and PLE is accessible on the article web page. A live demo for S-PLE can be found at the IPOL web page of this article.
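The idea behind SURE guidance is that the mean squared error of a denoiser can be estimated from the noisy data alone. A minimal sketch for the classical soft-thresholding case with unit noise variance (an illustration of the SURE principle, not the S-PLE estimator itself):

```python
def sure_soft_threshold(y, t):
    """Stein's Unbiased Risk Estimate of the MSE of soft thresholding at
    level t, applied to y = x + n with n ~ N(0, 1) i.i.d. (Donoho's
    SureShrink formula): N - 2*#{|y_i| <= t} + sum(min(|y_i|, t)^2)."""
    n_small = sum(1 for yi in y if abs(yi) <= t)
    return len(y) - 2 * n_small + sum(min(abs(yi), t) ** 2 for yi in y)

def best_threshold(y, candidates):
    """Pick the candidate threshold minimizing the SURE risk estimate,
    without access to the clean signal x."""
    return min(candidates, key=lambda t: sure_soft_threshold(y, t))
```

S-PLE uses the same unbiased-risk principle to choose among candidate Gaussian models per patch, rather than to pick a scalar threshold.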

9 citations

Journal ArticleDOI
TL;DR: SSaNN (Self-Similarity and Neural Networks) is a denoising algorithm that combines the strength of BM3D on large-scale structured patterns with that of neural networks on small-scale texture content, producing a better overall recovery than either BM3D or small neural networks alone.
Abstract: Recent years have seen a surge of interest in deep neural networks fueled by their successful applications in numerous image processing and computer vision tasks. However, such applications typically come with huge computational loads. In this article, we explore the possibility of using small neural networks to denoise images. In particular, we present SSaNN (Self-Similarity and Neural Networks), a denoising algorithm which combines the strength of BM3D on large-scale structured patterns with that of neural networks on small-scale texture content. This algorithm is able to produce a better overall recovery than both BM3D and small neural networks.

4 citations

Proceedings ArticleDOI
10 Nov 2022
TL;DR: In this paper, a new method for parsing 3D binary lesion masks is proposed, together with an approach to evaluating its performance; the method outperforms 3D connected component analysis on a large collection of annotated portal-venous phase studies.
Abstract: Liver lesion segmentation is a key module for an automated liver disease diagnosis system. Numerous methods have been developed recently to produce accurate 3D binary lesion masks for CT scans. From the clinical perspective, it is thus important to be able to correctly parse these masks into separate lesion instances in order to enable downstream applications such as lesion tracking and characterization. For the lack of a better alternative, 3D connected component analysis is often used for this task, though it does not always work, especially in the presence of confluent lesions. In this paper, we propose a new method for parsing 3D binary lesion masks and an approach to evaluating its performance. We show that our method outperforms 3D connected component analysis on a large collection of annotated portal-venous phase studies.
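The 3D connected component analysis used here as the baseline can be sketched as a flood fill with 6-connectivity (a minimal pure-Python illustration on nested lists; production code would use an optimized library routine):

```python
from collections import deque

def label_3d(mask):
    """6-connected component labeling of a binary volume given as nested
    lists mask[z][y][x]; returns (label volume, number of components)."""
    D, H, W = len(mask), len(mask[0]), len(mask[0][0])
    labels = [[[0] * W for _ in range(H)] for _ in range(D)]
    count = 0
    for z in range(D):
        for y in range(H):
            for x in range(W):
                if mask[z][y][x] and not labels[z][y][x]:
                    count += 1           # new component found: flood-fill it
                    labels[z][y][x] = count
                    q = deque([(z, y, x)])
                    while q:
                        cz, cy, cx = q.popleft()
                        for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                                           (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                            nz, ny, nx = cz + dz, cy + dy, cx + dx
                            if (0 <= nz < D and 0 <= ny < H and 0 <= nx < W
                                    and mask[nz][ny][nx] and not labels[nz][ny][nx]):
                                labels[nz][ny][nx] = count
                                q.append((nz, ny, nx))
    return labels, count
```

The failure mode the paper targets is visible in this sketch: two confluent lesions whose masks touch anywhere are merged into a single component, which is why a more refined parsing method is needed.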

Cited by
Journal ArticleDOI
TL;DR: The proposed FER method outperforms state-of-the-art FER methods based on hand-crafted features or single-channel deep networks, and achieves comparable performance to multi-channel deep networks with easier procedures.
Abstract: Facial expression recognition (FER) is a significant task for machines to understand the emotional changes in human beings. However, accurate hand-crafted features that are highly related to changes in expression are difficult to extract because of the influences of individual differences and variations in emotional intensity. Therefore, features that can accurately describe the changes in facial expressions are urgently required. Method: A weighted mixture deep neural network (WMDNN) is proposed to automatically extract the features that are effective for FER tasks. Several pre-processing approaches, such as face detection, rotation rectification, and data augmentation, are implemented to restrict the regions for FER. Two channels of facial images, including facial grayscale images and their corresponding local binary pattern (LBP) facial images, are processed by WMDNN. Expression-related features of facial grayscale images are extracted by fine-tuning a partial VGG16 network, the parameters of which are initialized using a VGG16 model trained on the ImageNet database. Features of LBP facial images are extracted by a shallow convolutional neural network (CNN) built based on DeepID. The outputs of both channels are fused in a weighted manner. The final recognition result is computed using softmax classification. Results: Experimental results indicate that the proposed algorithm can recognize six basic facial expressions (happiness, sadness, anger, disgust, fear, and surprise) with high accuracy. The average recognition accuracies for the benchmark data sets “CK+,” “JAFFE,” and “Oulu-CASIA” are 0.970, 0.922, and 0.923, respectively. Conclusions: The proposed FER method outperforms state-of-the-art FER methods based on hand-crafted features or single-channel deep networks. Compared with deep networks that use multiple channels, the proposed network achieves comparable performance with easier procedures.
Fine-tuning with a well pre-trained model is effective for FER tasks when sufficient samples cannot be collected.
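The LBP channel mentioned above rests on a simple local texture descriptor. A minimal sketch of the basic 8-neighbour LBP code (an illustration of the descriptor, not the paper's pipeline):

```python
def lbp_code(patch):
    """Basic 8-neighbour local binary pattern code of the centre pixel of
    a 3x3 patch (list of rows): bit k is set when the k-th neighbour,
    taken clockwise from the top-left, is >= the centre value."""
    centre = patch[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for k, (r, c) in enumerate(order):
        if patch[r][c] >= centre:
            code |= 1 << k
    return code
```

Mapping every pixel to its code produces the "LBP facial image" that the shallow CNN channel consumes; the code depends only on intensity ordering, which makes it robust to monotonic illumination changes.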

160 citations

Book
01 Jul 2019
TL;DR: In this paper, the authors frame cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to central questions such as: How many clusters are there? Which method should I use? How should I handle outliers?
Abstract: Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observations, and also has open questions about parameter tuning, robustness and uncertainty assessment. This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions. It builds the basic ideas in an accessible but rigorous way, with extensive data examples and R code; describes modern approaches to high-dimensional data and networks; and explains such recent advances as Bayesian regularization, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised and robust classification, clustering of functional data, text and images, and co-clustering. Written for advanced undergraduates in data science, as well as researchers and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.

134 citations

Journal ArticleDOI
TL;DR: This paper addresses the problem of recovering degraded images using a multivariate Gaussian mixture model (GMM) as a prior, and introduces a novel approach for computing aggregation weights for image reconstruction from recovered patches, based on the similarity of each patch to the estimated Gaussian clusters.
Abstract: In this paper, we address the problem of recovering degraded images using a multivariate Gaussian mixture model (GMM) as a prior. The GMM framework in our method for image restoration is based on the assumption that the accumulation of similar patches in a neighborhood is derived from a multivariate Gaussian probability distribution with a specific covariance and mean. Previous methods of image restoration with GMM have not considered spatial (geometric) distance between patches in clustering. Our experiments show that when Gaussian estimates are constrained to finite-sized windows, the patch clusters are more likely to be derived from the estimated multivariate Gaussian distributions, i.e., the proposed statistical patch-based model provides a better goodness-of-fit to the statistical properties of natural images. A novel approach for computing aggregation weights for image reconstruction from recovered patches is introduced, based on the similarity of each patch to the estimated Gaussian clusters. The results show that for image denoising our method is highly comparable with state-of-the-art methods, and our image interpolation method outperforms previous state-of-the-art methods.

86 citations

Journal ArticleDOI
TL;DR: A systematic and comprehensive survey of current state-of-the-art Artificial Intelligence techniques (datasets and algorithms) that address the aforementioned issues, together with a brief taxonomy of existing facial sentiment analysis strategies.
Abstract: With the advancements in machine and deep learning algorithms, various critical real-life applications in computer vision become possible. One such application is facial sentiment analysis. Deep learning has made facial expression recognition (FER) one of the most active research fields in computer vision. Recently, deep learning-based FER models have suffered from various technological issues, such as under-fitting or over-fitting, due to insufficient training and expression data. Motivated by these facts, this paper presents a systematic and comprehensive survey of current state-of-the-art Artificial Intelligence techniques (datasets and algorithms) that address the aforementioned issues. It also presents a brief taxonomy of existing facial sentiment analysis strategies. The paper then reviews novel machine and deep learning networks specifically designed for facial expression recognition from static images, presenting their merits and demerits and summarizing their approaches. Finally, it presents the open issues and research challenges for the design of a robust facial expression recognition system.

86 citations

Journal ArticleDOI
TL;DR: This model uses a hierarchical feature representation to deal with spontaneous emotions, and learns how to integrate multiple modalities for non-verbal emotion recognition, making it suitable for use in an HRI scenario.

69 citations