scispace - formally typeset
Author

Du-sik Park

Other affiliations: Samsung Electro-Mechanics
Bio: Du-sik Park is an academic researcher from Samsung. The author has contributed to research in topics: Pixel & Color image. The author has an h-index of 33, and has co-authored 408 publications receiving 4,506 citations. Previous affiliations of Du-sik Park include Samsung Electro-Mechanics.


Papers
Proceedings ArticleDOI
Junho Yim, Heechul Jung, ByungIn Yoo, Changkyu Choi, Du-sik Park, Junmo Kim
07 Jun 2015
TL;DR: A new deep architecture based on a novel type of multitask learning, which can achieve superior performance in rotating to a target-pose face image from an arbitrary pose and illumination image while preserving identity is proposed.
Abstract: Face recognition under viewpoint and illumination changes is a difficult problem, so many researchers have tried to solve this problem by producing the pose- and illumination- invariant feature. Zhu et al. [26] changed all arbitrary pose and illumination images to the frontal view image to use for the invariant feature. In this scheme, preserving identity while rotating pose image is a crucial issue. This paper proposes a new deep architecture based on a novel type of multitask learning, which can achieve superior performance in rotating to a target-pose face image from an arbitrary pose and illumination image while preserving identity. The target pose can be controlled by the user's intention. This novel type of multi-task model significantly improves identity preservation over the single task model. By using all the synthesized controlled pose images, called Controlled Pose Image (CPI), for the pose-illumination-invariant feature and voting among the multiple face recognition results, we clearly outperform the state-of-the-art algorithms by more than 4∼6% on the MultiPIE dataset.
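The multi-task idea above — one shared representation feeding both an identity task and a user-controlled pose-synthesis task — can be sketched as follows. This is a toy numpy sketch with hard parameter sharing; the layer sizes, the single shared layer, and the one-hot pose code are illustrative assumptions, not the paper's actual deep CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the paper's architecture is a deep CNN).
d_in, d_shared, n_ids, d_img = 64, 32, 10, 64

# One shared trunk plus one head per task (hard parameter sharing).
W_shared = rng.standard_normal((d_in, d_shared)) * 0.1
W_identity = rng.standard_normal((d_shared, n_ids)) * 0.1
W_rotate = rng.standard_normal((d_shared + 8, d_img)) * 0.1  # +8 for a pose code

def forward(x, pose_code):
    """One multi-task forward pass: shared features feed both heads."""
    h = np.tanh(x @ W_shared)            # shared, identity-preserving features
    identity_logits = h @ W_identity     # task 1: classify identity
    rotated = np.tanh(np.concatenate([h, pose_code]) @ W_rotate)  # task 2: synthesize target pose
    return identity_logits, rotated

x = rng.standard_normal(d_in)   # flattened input face (toy)
pose_code = np.eye(8)[2]        # one-hot target pose chosen by the user
logits, synth = forward(x, pose_code)
print(logits.shape, synth.shape)  # (10,) (64,)
```

Because both heads backpropagate into `W_shared` during training, the shared features are pushed to preserve identity even while serving the rotation task — the intuition behind the identity-preservation gain reported above.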

297 citations

Patent
22 Jan 2008
TL;DR: A Red Green Blue-to-Red Green Blue White (RGB-to-RGBW) color decomposition method was proposed in this article, in which the output value of white is determined from the input RGB values and a saturation.
Abstract: A Red Green Blue-to-Red Green Blue White (RGB-to-RGBW) color decomposition method and system. The RGB-to-RGBW color decomposition method includes: determining an output value of white based on inputted RGB values and a saturation; and outputting the output value when an input color is a pure color.
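The abstract only states that white is derived from the RGB values and a saturation; the patent's exact formula is not given. A common scheme consistent with that description — white taken from min(R, G, B) and scaled down as saturation rises, so a pure (fully saturated) color yields W = 0 — can be sketched like this:

```python
def rgb_to_rgbw(r, g, b):
    """Hypothetical RGB-to-RGBW split. The patent's exact formula is not
    given in the abstract; this uses a common scheme where white comes from
    min(R, G, B) scaled by (1 - saturation), so a pure color yields W = 0."""
    mx, mn = max(r, g, b), min(r, g, b)
    saturation = 0.0 if mx == 0 else (mx - mn) / mx  # HSV-style saturation
    w = mn * (1.0 - saturation)                      # white channel output
    # Subtract the white contribution from each primary.
    return r - w, g - w, b - w, w

print(rgb_to_rgbw(255, 0, 0))      # pure red: saturation is 1, so W = 0
print(rgb_to_rgbw(200, 200, 200))  # neutral gray: energy moves to W
```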

108 citations

Patent
Dong Kyung Nam, Yun-Tae Kim, Du-sik Park, Gee Young Sung, Juyong Park
30 Apr 2009
TL;DR: In this article, a display apparatus and method that can display a high-depth 3D image are described. The display method separates an input image into a near-sighted image and a far-sighted image, then images and outputs the near-sighted image using a light-field method and the far-sighted image using a multi-view method.
Abstract: A display apparatus and method that may display a high depth three-dimensional (3D) image is provided. The display method may separate an input image into a near-sighted image and a far-sighted image, image and output the near-sighted image using a light field method, and image and output the far-sighted image using a multi-view method.
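The near/far separation step can be illustrated with a simple depth threshold. This is a simplification — the patent's actual separation criterion is not specified in the abstract — but it shows the idea of routing each layer to a different rendering path:

```python
import numpy as np

def split_by_depth(image, depth, threshold):
    """Split an input image into near- and far-sighted layers by a depth
    threshold (an assumed criterion; the abstract does not specify one).
    The near layer would feed a light-field renderer, the far layer a
    multi-view renderer."""
    near_mask = depth < threshold
    near = np.where(near_mask, image, 0)   # keep only near pixels
    far = np.where(~near_mask, image, 0)   # keep only far pixels
    return near, far

image = np.array([[10, 20], [30, 40]])
depth = np.array([[1.0, 5.0], [2.0, 8.0]])  # toy depth values
near, far = split_by_depth(image, depth, threshold=3.0)
print(near)  # pixels closer than the threshold
print(far)   # the remaining, farther pixels
```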

104 citations

Journal ArticleDOI
Jin-Ho Lee, Juyong Park, Dongkyung Nam, Seo Young Choi, Du-sik Park, Chang Yeong Kim
TL;DR: A 300-Mpixel multi-projection 3D display with a 100-inch screen and a 40° viewing angle has been developed; the design was optimized to minimize the variation in the brightness of projected images, yielding very smooth motion-parallax images without discontinuity.
Abstract: To achieve an immersive natural 3D experience on a large screen, a 300-Mpixel multi-projection 3D display that has a 100-inch screen and a 40° viewing angle has been developed. To increase the number of rays emanating from each pixel to 300 in the horizontal direction, three hundred projectors were used. Because the projector configuration is an important issue in generating a high-quality 3D image, the luminance characteristics were analyzed and the design was optimized to minimize the variation in the brightness of projected images. The rows of the projector arrays were changed repeatedly according to a predetermined row interval, and the projectors were arranged at an equi-angular pitch toward a constant central point. As a result, we acquired very smooth motion-parallax images without discontinuity. There is no limit on viewing distance, so natural 3D images can be viewed from 2 m to over 20 m.
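The equi-angular arrangement described above — 300 projectors spanning a 40° horizontal viewing angle, all aimed at a common central point — fixes the angular pitch between adjacent rays. A small sketch of that geometry (centered placement is an assumption; the paper does not give the exact offsets):

```python
def projector_angles(n_projectors=300, viewing_angle_deg=40.0):
    """Place projectors at an equi-angular pitch toward a common central
    point: 300 projectors spanning a 40-degree horizontal viewing angle.
    Centering each projector within its angular slot is an assumption."""
    pitch = viewing_angle_deg / n_projectors       # degrees between adjacent rays
    start = -viewing_angle_deg / 2                 # symmetric about the center
    return [start + (i + 0.5) * pitch for i in range(n_projectors)]

angles = projector_angles()
print(len(angles), round(angles[1] - angles[0], 6))  # 300 0.133333
```

At 300 rays over 40°, adjacent views are only about 0.13° apart, which is why the motion parallax appears smooth and continuous.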

104 citations

Journal ArticleDOI
TL;DR: This paper presents an efficient depth map coding method based on a newly defined rendering-view distortion function that estimates rendered-view quality, where an area-based scheme is provided in order to mimic the warping/view-rendering process accurately.
Abstract: This paper presents an efficient depth map coding method based on a newly defined rendering-view distortion function. Compared to conventional depth map coding, in which distortion is measured only by the coding error in the depth map, the proposed scheme focuses on virtually synthesized view quality by involving co-located color information. In detail, the proposed distortion function estimates rendered-view quality, where an area-based scheme is provided in order to mimic the warping/view-rendering process accurately. Moreover, the coding performance of the proposed distortion metric is further improved by an additional SKIP mode derived from co-located color coding information. The simulation results show the proposed scheme achieves approximately 30% bit-rate saving for depth data, and about 10% bit-rate saving for overall multi-view data.
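The principle behind a rendering-aware depth distortion metric can be illustrated simply: a depth coding error shifts pixels horizontally during view synthesis, so its visible impact scales with the local color gradient at the co-located pixel. The sketch below captures only that weighting idea — the paper's actual area-based function is more elaborate:

```python
import numpy as np

def rendered_view_distortion(depth_error, color):
    """Toy rendering-aware distortion: weight the depth coding error by the
    co-located horizontal color gradient, since a depth error in a flat
    region barely changes the synthesized view, while the same error on a
    color edge is highly visible. An illustration only, not the paper's
    actual area-based function."""
    grad = np.abs(np.diff(color, axis=1))   # horizontal color gradient
    grad = np.pad(grad, ((0, 0), (0, 1)))   # keep the original shape
    return np.sum(np.abs(depth_error) * grad)

color = np.array([[100, 100, 200, 200]], dtype=float)  # one edge in the middle
flat_err = np.array([[2.0, 0.0, 0.0, 0.0]])  # depth error in a flat area
edge_err = np.array([[0.0, 2.0, 0.0, 0.0]])  # same error on the color edge
print(rendered_view_distortion(flat_err, color))  # 0.0  (invisible after rendering)
print(rendered_view_distortion(edge_err, color))  # 200.0 (highly visible)
```

A depth encoder guided by such a metric can spend fewer bits where depth errors are invisible in the synthesized view — the source of the bit-rate savings reported above.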

99 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, a review of deep learning-based object detection frameworks is provided, focusing on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.
Abstract: Due to object detection’s close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles that combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy, and optimization function. In this paper, we provide a review of deep learning-based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely, the convolutional neural network. Then, we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection, and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network-based learning systems.

3,097 citations

Journal ArticleDOI
The Perception of the Visual World (no abstract available)

2,250 citations

Proceedings Article
05 Dec 2016
TL;DR: This work proposes coupled generative adversarial network (CoGAN), which can learn a joint distribution without any tuple of corresponding images, and applies it to several joint distribution learning tasks, and demonstrates its applications to domain adaptation and image transformation.
Abstract: We propose the coupled generative adversarial nets (CoGAN) framework for generating pairs of corresponding images in two different domains. The framework consists of a pair of generative adversarial nets, each responsible for generating images in one domain. We show that by enforcing a simple weight-sharing constraint, the CoGAN learns to generate pairs of corresponding images without existence of any pairs of corresponding images in the two domains in the training set. In other words, the CoGAN learns a joint distribution of images in the two domains from images drawn separately from the marginal distributions of the individual domains. This is in contrast to the existing multi-modal generative models, which require corresponding images for training. We apply the CoGAN to several pair image generation tasks. For each task, the CoGAN learns to generate convincing pairs of corresponding images. We further demonstrate the applications of the CoGAN framework for the domain adaptation and cross-domain image generation tasks.
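CoGAN's weight-sharing constraint can be sketched as two generators that share their first (high-level) layer while keeping separate output layers, so a single latent code maps to a corresponding image pair. The sizes and single-hidden-layer depth below are illustrative only; the actual CoGAN generators are deep networks sharing several early layers:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sketch of the weight-sharing constraint: one shared layer decodes the
# latent code into common semantics; two domain-specific output layers
# render those semantics in each domain. Sizes are illustrative.
d_z, d_h, d_img = 16, 32, 64
W_shared = rng.standard_normal((d_z, d_h)) * 0.1   # tied between both generators
W_out_a = rng.standard_normal((d_h, d_img)) * 0.1  # domain-A decoder
W_out_b = rng.standard_normal((d_h, d_img)) * 0.1  # domain-B decoder

def generate_pair(z):
    """Map one latent code to a pair of corresponding images."""
    h = np.tanh(z @ W_shared)  # shared semantics across domains
    return np.tanh(h @ W_out_a), np.tanh(h @ W_out_b)

z = rng.standard_normal(d_z)
img_a, img_b = generate_pair(z)
print(img_a.shape, img_b.shape)  # (64,) (64,)
```

Because the shared layer forces both outputs to decode the same semantics, correspondence between the two domains emerges without ever training on paired images — the joint-distribution claim in the abstract above.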

1,548 citations

Posted Content
TL;DR: Multi-Task Learning (MTL) is a learning paradigm that aims to leverage useful information contained in multiple related tasks to improve the generalization performance of all of them; this paper surveys MTL from the perspectives of algorithmic modeling, applications, and theoretical analysis.
Abstract: Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks. In this paper, we give a survey for MTL from the perspective of algorithmic modeling, applications and theoretical analyses. For algorithmic modeling, we give a definition of MTL and then classify different MTL algorithms into five categories, including feature learning approach, low-rank approach, task clustering approach, task relation learning approach and decomposition approach as well as discussing the characteristics of each approach. In order to improve the performance of learning tasks further, MTL can be combined with other learning paradigms including semi-supervised learning, active learning, unsupervised learning, reinforcement learning, multi-view learning and graphical models. When the number of tasks is large or the data dimensionality is high, we review online, parallel and distributed MTL models as well as dimensionality reduction and feature hashing to reveal their computational and storage advantages. Many real-world applications use MTL to boost their performance and we review representative works in this paper. Finally, we present theoretical analyses and discuss several future directions for MTL.
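One of the five categories listed above, the low-rank approach, can be shown in a few lines: the weight vectors of T related tasks are stacked into a matrix W and modeled as a product of a shared basis and task-specific coefficients, so all tasks live in a common low-dimensional subspace. The sizes below are toy values for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Low-rank MTL sketch: W stacks one weight vector per task and factors as
# W = L @ S, a shared basis L times per-task coefficients S, so rank(W) <= k.
T, d, k = 6, 20, 3              # tasks, feature dim, shared rank (k << min(T, d))
L = rng.standard_normal((d, k))  # basis shared across all tasks
S = rng.standard_normal((k, T))  # task-specific mixing coefficients
W = L @ S                        # stacked task weight matrix

print(W.shape, np.linalg.matrix_rank(W))  # (20, 6) 3
```

Learning L jointly across tasks is what lets information transfer: each task's weights are constrained to the span of the shared basis, rather than being fit independently.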

1,202 citations

Proceedings ArticleDOI
01 Jul 2017
TL;DR: Quantitative and qualitative evaluation on both controlled and in-the-wild databases demonstrate the superiority of DR-GAN over the state of the art.
Abstract: The large pose discrepancy between two face images is one of the key challenges in face recognition. Conventional approaches for pose-invariant face recognition either perform face frontalization on, or learn a pose-invariant representation from, a non-frontal face image. We argue that it is more desirable to perform both tasks jointly to allow them to leverage each other. To this end, this paper proposes Disentangled Representation learning-Generative Adversarial Network (DR-GAN) with three distinct novelties. First, the encoder-decoder structure of the generator allows DR-GAN to learn a generative and discriminative representation, in addition to image synthesis. Second, this representation is explicitly disentangled from other face variations such as pose, through the pose code provided to the decoder and pose estimation in the discriminator. Third, DR-GAN can take one or multiple images as the input, and generate one unified representation along with an arbitrary number of synthetic images. Quantitative and qualitative evaluation on both controlled and in-the-wild databases demonstrate the superiority of DR-GAN over the state of the art.

1,016 citations