Author

Kim-Hui Yap

Other affiliations: University of Sydney
Bio: Kim-Hui Yap is an academic researcher from Nanyang Technological University. The author has contributed to research on topics including image restoration and image retrieval. The author has an h-index of 24 and has co-authored 138 publications receiving 1838 citations. Previous affiliations of Kim-Hui Yap include University of Sydney.


Papers
Proceedings ArticleDOI
15 Jun 2019
TL;DR: This paper proposes a new architecture that integrates person attributes and attribute attention maps into a classification framework to solve the person re-identification (re-ID) problem.
Abstract: This paper proposes Attribute Attention Network (AANet), a new architecture that integrates person attributes and attribute attention maps into a classification framework to solve the person re-identification (re-ID) problem. Many person re-ID models typically employ semantic cues such as body parts or human pose to improve the re-ID performance. Attribute information, however, is often not utilized. The proposed AANet leverages a baseline model that uses body parts and integrates the key attribute information in a unified learning framework. The AANet consists of a global person ID task, a part detection task and a crucial attribute detection task. By estimating the class responses of individual attributes and combining them to form the attribute attention map (AAM), a very strong discriminatory representation is constructed. The proposed AANet outperforms the best state-of-the-art method (arXiv:1711.09349v3) using ResNet-50 by 3.36% in mAP and 3.12% in Rank-1 accuracy on the DukeMTMC-reID dataset. On the Market1501 dataset, AANet achieves 92.38% mAP and 95.10% Rank-1 accuracy with re-ranking, outperforming arXiv:1804.00216v1, another state-of-the-art method using ResNet-152, by 1.42% in mAP and 0.47% in Rank-1 accuracy. In addition, AANet can perform person attribute prediction (e.g., gender, hair length, clothing length) and localize the attributes in the query image.

284 citations

Proceedings Article
27 Sep 2018
TL;DR: In this article, the authors examine two techniques for parameter averaging in GAN training, namely moving average (MA) and exponential moving average (EMA), and show that EMA converges to limit cycles around the equilibrium with vanishing amplitude as the discount parameter approaches one.
Abstract: We examine two different techniques for parameter averaging in GAN training. Moving Average (MA) computes the time-average of parameters, whereas Exponential Moving Average (EMA) computes an exponentially discounted sum. Whilst MA is known to lead to convergence in bilinear settings, we provide what are, to our knowledge, the first theoretical arguments in support of EMA. We show that EMA converges to limit cycles around the equilibrium with vanishing amplitude as the discount parameter approaches one for simple bilinear games and also enhances the stability of general GAN training. We establish experimentally that both techniques are strikingly effective in the non-convex-concave GAN setting as well. Both improve inception and FID scores on different architectures and for different GAN objectives. We provide comprehensive experimental results across a range of datasets (mixture of Gaussians, CIFAR-10, STL-10, CelebA and ImageNet) to demonstrate their effectiveness. We achieve state-of-the-art results on CIFAR-10 and produce clean CelebA face images. The code is available at this https URL.

130 citations
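
The difference between the two averaging schemes in the entry above can be illustrated with a small sketch. It is not the authors' code: the rotating iterates are a hypothetical stand-in for parameters that cycle around an equilibrium at the origin, and the names `update_ma`, `update_ema`, and `averaged_norms` are assumptions made for this illustration.

```python
import math


def update_ma(avg, theta, t):
    """Uniform running average after t iterates (t starts at 1)."""
    return [a + (x - a) / t for a, x in zip(avg, theta)]


def update_ema(avg, theta, beta):
    """Exponentially discounted average with discount beta in [0, 1)."""
    return [beta * a + (1.0 - beta) * x for a, x in zip(avg, theta)]


def averaged_norms(steps=10000, beta=0.999, omega=0.1):
    """Average iterates that circle the equilibrium at the origin.

    The rotating iterates are an assumed toy trajectory; they are not
    produced by actual GAN training.
    """
    ma = [0.0, 0.0]
    ema = [1.0, 0.0]  # start the EMA at the first iterate
    for t in range(1, steps + 1):
        angle = omega * (t - 1)
        theta = [math.cos(angle), math.sin(angle)]
        ma = update_ma(ma, theta, t)
        ema = update_ema(ema, theta, beta)
    # Distance of each averaged point from the equilibrium.
    return math.hypot(*ma), math.hypot(*ema)
```

In this toy setting the raw iterates stay at distance 1 from the equilibrium, while both averages end up much closer to it, and the EMA gets closer as `beta` approaches one.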

Journal ArticleDOI
TL;DR: Experimental results show that the proposed method is effective in performing image registration and SR for simulated as well as real-life images, and an iterative scheme is developed to solve the arising nonlinear least squares problem.
Abstract: This paper proposes a new algorithm to integrate image registration into image super-resolution (SR). Image SR is a process to reconstruct a high-resolution (HR) image by fusing multiple low-resolution (LR) images. A critical step in image SR is accurate registration of the LR images or, in other words, effective estimation of motion parameters. Conventional SR algorithms assume either that the motion parameters estimated by existing registration methods are error-free or that the motion parameters are known a priori. This assumption, however, is impractical in many applications, as most existing registration algorithms still experience various degrees of errors, and the motion parameters among the LR images are generally unknown a priori. In view of this, this paper presents a new framework that performs simultaneous image registration and HR image reconstruction. As opposed to other current methods that treat image registration and HR reconstruction as disjoint processes, the new framework enables image registration and HR reconstruction to be estimated simultaneously and improved progressively. Further, unlike most algorithms that focus on the translational motion model, the proposed method adopts a more generic motion model that includes both translation and rotation. An iterative scheme is developed to solve the arising nonlinear least squares problem. Experimental results show that the proposed method is effective in performing image registration and SR for simulated as well as real-life images.

125 citations
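
The motion model with translation and rotation mentioned in the entry above can be sketched as a rigid warp of coordinates. This is a simplified illustration under stated assumptions, not the paper's algorithm: the cost below is a least squares residual over point correspondences rather than pixel intensities, HR reconstruction is omitted, and the names `rigid_warp` and `registration_cost` are hypothetical.

```python
import math


def rigid_warp(point, theta, tx, ty):
    """Rotate a 2-D point by theta (radians), then translate by (tx, ty)."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + tx, s * x + c * y + ty)


def registration_cost(points, observed, theta, tx, ty):
    """Sum of squared residuals between warped and observed points.

    The cost is nonlinear in theta, which is why an iterative solver
    is typically needed once rotation is part of the motion model.
    """
    total = 0.0
    for p, q in zip(points, observed):
        wx, wy = rigid_warp(p, theta, tx, ty)
        total += (wx - q[0]) ** 2 + (wy - q[1]) ** 2
    return total
```

A purely translational model corresponds to fixing `theta = 0`; when the true motion includes rotation, that restricted model leaves a nonzero residual.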

Posted Content
TL;DR: The proposed AANet leverages a baseline model that uses body parts and integrates the key attribute information in a unified learning framework, and it outperforms the best state-of-the-art method using ResNet-50.
Abstract: This paper proposes Attribute Attention Network (AANet), a new architecture that integrates person attributes and attribute attention maps into a classification framework to solve the person re-identification (re-ID) problem. Many person re-ID models typically employ semantic cues such as body parts or human pose to improve the re-ID performance. Attribute information, however, is often not utilized. The proposed AANet leverages a baseline model that uses body parts and integrates the key attribute information in a unified learning framework. The AANet consists of a global person ID task, a part detection task and a crucial attribute detection task. By estimating the class responses of individual attributes and combining them to form the attribute attention map (AAM), a very strong discriminatory representation is constructed. The proposed AANet outperforms the best state-of-the-art method arXiv:1711.09349v3 [cs.CV] using ResNet-50 by 3.36% in mAP and 3.12% in Rank-1 accuracy on the DukeMTMC-reID dataset. On the Market1501 dataset, AANet achieves 92.38% mAP and 95.10% Rank-1 accuracy with re-ranking, outperforming arXiv:1804.00216v1 [cs.CV], another state-of-the-art method using ResNet-152, by 1.42% in mAP and 0.47% in Rank-1 accuracy. In addition, AANet can perform person attribute prediction (e.g., gender, hair length, clothing length) and localize the attributes in the query image.

102 citations

Journal ArticleDOI
TL;DR: A new soft maximum a posteriori (MAP) estimation framework to perform joint blur identification and HR image reconstruction that incorporates a soft blur prior that estimates the relevance of the best-fit parametric blur model, and induces reinforcement learning towards it.

88 citations


Cited by
Journal ArticleDOI

08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one, which seemed an odd beast at the time: an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

01 Jan 2016
TL;DR: This entry concerns Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light.
Abstract: Thank you for reading Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light. As you may know, people have searched hundreds of times for their favorite novels like this Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light, but end up with harmful downloads. Rather than enjoying a good book with a cup of coffee in the afternoon, they instead face some infectious bugs inside their computer.

2,213 citations

Proceedings Article
01 Jan 1994
TL;DR: The main focus in MUCKE is on cleaning large scale Web image corpora and on proposing image representations which are closer to the human interpretation of images.
Abstract: MUCKE aims to mine a large volume of images, to structure them conceptually and to use this conceptual structuring in order to improve large-scale image retrieval. The last decade witnessed important progress concerning low-level image representations. However, there are a number of problems which need to be solved in order to unleash the full potential of image mining in applications. The central problem with low-level representations is the mismatch between them and the human interpretation of image content. This problem can be instantiated, for instance, by the inability of existing descriptors to capture spatial relationships between the concepts represented or by their inability to convey an explanation of why two images are similar in a content-based image retrieval framework. We start by assessing existing local descriptors for image classification and by proposing to use co-occurrence matrices to better capture spatial relationships in images. The main focus in MUCKE is on cleaning large-scale Web image corpora and on proposing image representations which are closer to the human interpretation of images. Consequently, we introduce methods which tackle these two problems and compare results to state-of-the-art methods. Note: some aspects of this deliverable are withheld at this time as they are pending review. Please contact the authors for a preview.

2,134 citations

Posted Content
TL;DR: BigGAN as mentioned in this paper applies orthogonal regularization to the generator, allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input, leading to models which set the new state of the art in class-conditional image synthesis.
Abstract: Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128x128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Fréchet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.

1,443 citations
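
The "truncation trick" in the entry above reduces the variance of the generator's input. A minimal sketch of one way to do this is shown below, assuming the latent vector is drawn from a standard normal and any component whose magnitude exceeds a threshold is resampled; the generator itself is omitted, and the names `truncated_latent` and `sample_variance` are hypothetical.

```python
import random


def truncated_latent(dim, threshold, rng):
    """Draw a latent vector whose components all lie in [-threshold, threshold].

    Components are sampled from N(0, 1) and resampled until they fall
    inside the threshold (an assumed form of the truncation).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    z = []
    for _ in range(dim):
        value = rng.gauss(0.0, 1.0)
        while abs(value) > threshold:
            value = rng.gauss(0.0, 1.0)
        z.append(value)
    return z


def sample_variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

A smaller threshold gives lower input variance, which the abstract associates with trading sample variety for fidelity; a very large threshold leaves the distribution essentially untruncated.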

Proceedings ArticleDOI
14 May 2006
TL;DR: A novel method for blind image restoration which is a multidimensional extension of an approach used successfully for audio restoration, and a maximum marginalised a posteriori (MMAP) blur estimate is obtained by optimising the resulting probability density function.
Abstract: We present a novel method for blind image restoration which is a multidimensional extension of an approach used successfully for audio restoration. A nonstationary image model is used to increase the reliability of blur estimates. This source model consists of a separate autoregressive model in each region of the image. A hierarchical Bayesian model for the observations is used, and a maximum marginalised a posteriori (MMAP) blur estimate is obtained by optimising the resulting probability density function.

1,132 citations