scispace - formally typeset
Search or ask a question
Author

Haesol Park

Other affiliations: Seoul National University
Bio: Haesol Park is an academic researcher from Systems Research Institute. The author has contributed to research in topics: Deblurring & Computer science. The author has an hindex of 7, co-authored 8 publications receiving 189 citations. Previous affiliations of Haesol Park include Seoul National University.

Papers
More filters
Journal ArticleDOI
TL;DR: A novel convolutional neural network module to learn a stereo matching cost with a large-sized window that can successfully utilize the information from a large area without introducing the fattening effect is proposed.
Abstract: When a human matches two images, the viewer has a natural tendency to view the wide area around the target pixel to obtain clues of right correspondence. However, designing a matching cost function that works on a large window in the same way is difficult. The cost function is typically not intelligent enough to discard the information irrelevant to the target pixel, resulting in undesirable artifacts. In this letter, we propose a novel convolutional neural network (CNN) module to learn a stereo matching cost with a large-sized window. Unlike conventional pooling layers with strides, the proposed per-pixel pyramid-pooling layer can cover a large area without a loss of resolution and detail. Therefore, the learned matching cost function can successfully utilize the information from a large area without introducing the fattening effect. The proposed method is robust despite the presence of weak textures, depth discontinuity, illumination, and exposure difference. The proposed method achieves near-peak performance on the Middlebury benchmark.

94 citations

Proceedings ArticleDOI
01 Oct 2017
TL;DR: This paper proposes a pioneering unified framework that solves four problems simultaneously, namely, dense depth reconstruction, camera pose estimation, super-resolution, and deblurring, by reflecting a physical imaging process and solving the cost minimization problem using an alternating optimization technique.
Abstract: The conventional methods for estimating camera poses and scene structures from severely blurry or low resolution images often result in failure. The off-the-shelf deblurring or super-resolution methods may show visually pleasing results. However, applying each technique independently before matching is generally unprofitable because this naive series of procedures ignores the consistency between images. In this paper, we propose a pioneering unified framework that solves four problems simultaneously, namely, dense depth reconstruction, camera pose estimation, super-resolution, and deblurring. By reflecting a physical imaging process, we formulate a cost minimization problem and solve it using an alternating optimization technique. The experimental results on both synthetic and real videos show high-quality depth maps derived from severely degraded images that contrast the failures of naive multi-view stereo methods. Our proposed method also produces outstanding deblurred and super-resolved images unlike the independent application or combination of conventional video deblurring, super-resolution methods.

44 citations

Journal ArticleDOI
TL;DR: A new surfel (surface element) based multi-view stereo algorithm that runs entirely on GPU that utilizes more accurate photo-consistency and reconstructs the 3D shape up to sub-voxel accuracy.

27 citations

Journal ArticleDOI
TL;DR: Huang et al. as mentioned in this paper proposed a pyramid-pooling layer to learn a stereo matching cost with a large-sized window, which can successfully utilize the information from a large area without introducing the fattening effect.
Abstract: When a human matches two images, the viewer has a natural tendency to view the wide area around the target pixel to obtain clues of right correspondence. However, designing a matching cost function that works on a large window in the same way is difficult. The cost function is typically not intelligent enough to discard the information irrelevant to the target pixel, resulting in undesirable artifacts. In this paper, we propose a novel learn a stereo matching cost with a large-sized window. Unlike conventional pooling layers with strides, the proposed per-pixel pyramid-pooling layer can cover a large area without a loss of resolution and detail. Therefore, the learned matching cost function can successfully utilize the information from a large area without introducing the fattening effect. The proposed method is robust despite the presence of weak textures, depth discontinuity, illumination, and exposure difference. The proposed method achieves near-peak performance on the Middlebury benchmark.

22 citations

Book ChapterDOI
08 Sep 2018
TL;DR: A novel algorithm to estimate all blur model variables jointly, including latent sub-aperture image, camera motion, and scene depth from the blurred 4D light field, achieves high quality light field deblurring and depth estimation simultaneously under arbitrary 6-DOF camera motion and unconstrained scene depth.
Abstract: Removing camera motion blur from a single light field is a challenging task since it is highly ill-posed inverse problem. The problem becomes even worse when blur kernel varies spatially due to scene depth variation and high-order camera motion. In this paper, we propose a novel algorithm to estimate all blur model variables jointly, including latent sub-aperture image, camera motion, and scene depth from the blurred 4D light field. Exploiting multi-view nature of a light field relieves the inverse property of the optimization by utilizing strong depth cues and multi-view blur observation. The proposed joint estimation achieves high quality light field deblurring and depth estimation simultaneously under arbitrary 6-DOF camera motion and unconstrained scene depth. Intensive experiment on real and synthetic blurred light field confirms that the proposed algorithm outperforms the state-of-the-art light field deblurring and depth estimation methods.

20 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: The challenges of using deep learning for remote-sensing data analysis are analyzed, recent advances are reviewed, and resources are provided that hope will make deep learning in remote sensing seem ridiculously simple.
Abstract: Central to the looming paradigm shift toward data-intensive science, machine-learning techniques are becoming increasingly important. In particular, deep learning has proven to be both a major breakthrough and an extremely powerful tool in many fields. Shall we embrace deep learning as the key to everything? Or should we resist a black-box solution? These are controversial issues within the remote-sensing community. In this article, we analyze the challenges of using deep learning for remote-sensing data analysis, review recent advances, and provide resources we hope will make deep learning in remote sensing seem ridiculously simple. More importantly, we encourage remote-sensing scientists to bring their expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale, influential challenges, such as climate change and urbanization.

2,095 citations

Journal ArticleDOI
TL;DR: In this article, the authors analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with.
Abstract: Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, as a major breakthrough in the field, deep learning has proven as an extremely powerful tool in many fields. Shall we embrace deep learning as the key to all? Or, should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with. More importantly, we advocate remote sensing scientists to bring their expertise into deep learning, and use it as an implicit general model to tackle unprecedented large-scale influential challenges, such as climate change and urbanization.

629 citations

Journal ArticleDOI
TL;DR: To demonstrate accurate MR image reconstruction from undersampled k‐space data using cross‐domain convolutional neural networks (CNNs) using cross-domain Convolutional Neural Networks, a parallel version of TSP, is presented.
Abstract: Purpose To demonstrate accurate MR image reconstruction from undersampled k-space data using cross-domain convolutional neural networks (CNNs) METHODS: Cross-domain CNNs consist of 3 components: (1) a deep CNN operating on the k-space (KCNN), (2) a deep CNN operating on an image domain (ICNN), and (3) an interleaved data consistency operations. These components are alternately applied, and each CNN is trained to minimize the loss between the reconstructed and corresponding fully sampled k-spaces. The final reconstructed image is obtained by forward-propagating the undersampled k-space data through the entire network. Results Performances of K-net (KCNN with inverse Fourier transform), I-net (ICNN with interleaved data consistency), and various combinations of the 2 different networks were tested. The test results indicated that K-net and I-net have different advantages/disadvantages in terms of tissue-structure restoration. Consequently, the combination of K-net and I-net is superior to single-domain CNNs. Three MR data sets, the T2 fluid-attenuated inversion recovery (T2 FLAIR) set from the Alzheimer's Disease Neuroimaging Initiative and 2 data sets acquired at our local institute (T2 FLAIR and T1 weighted), were used to evaluate the performance of 7 conventional reconstruction algorithms and the proposed cross-domain CNNs, which hereafter is referred to as KIKI-net. KIKI-net outperforms conventional algorithms with mean improvements of 2.29 dB in peak SNR and 0.031 in structure similarity. Conclusion KIKI-net exhibits superior performance over state-of-the-art conventional algorithms in terms of restoring tissue structures and removing aliasing artifacts. The results demonstrate that KIKI-net is applicable up to a reduction factor of 3 to 4 based on variable-density Cartesian undersampling.

323 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, the authors propose a network architecture to incorporate all steps of stereo matching, including matching cost calculation, matching cost aggregation, disparity calculation, and disparity refinement, which achieves the state-of-the-art performance on the KITTI 2012 and KittI 2015 benchmarks while maintaining a very fast running time.
Abstract: Stereo matching algorithms usually consist of four steps, including matching cost calculation, matching cost aggregation, disparity calculation, and disparity refinement. Existing CNN-based methods only adopt CNN to solve parts of the four steps, or use different networks to deal with different steps, making them difficult to obtain the overall optimal solution. In this paper, we propose a network architecture to incorporate all steps of stereo matching. The network consists of three parts. The first part calculates the multi-scale shared features. The second part performs matching cost calculation, matching cost aggregation and disparity calculation to estimate the initial disparity using shared features. The initial disparity and the shared features are used to calculate the feature constancy that measures correctness of the correspondence between two input images. The initial disparity and the feature constancy are then fed into a sub-network to refine the initial disparity. The proposed method has been evaluated on the Scene Flow and KITTI datasets. It achieves the state-of-the-art performance on the KITTI 2012 and KITTI 2015 benchmarks while maintaining a very fast running time. Source code is available at http://github.com/leonzfa/iResNet.

252 citations

Book ChapterDOI
08 Sep 2018
TL;DR: A novel end-to-end approach for unstructured point cloud semantic segmentation, named 3P-RNN, is proposed to exploit the inherent contextual features of 3D point clouds to demonstrate robust performance superior to state-of-the-arts.
Abstract: Semantic segmentation of 3D unstructured point clouds remains an open research problem. Recent works predict semantic labels of 3D points by virtue of neural networks but take limited context knowledge into consideration. In this paper, a novel end-to-end approach for unstructured point cloud semantic segmentation, named 3P-RNN, is proposed to exploit the inherent contextual features. First the efficient pointwise pyramid pooling module is investigated to capture local structures at various densities by taking multi-scale neighborhood into account. Then the two-direction hierarchical recurrent neural networks (RNNs) are utilized to explore long-range spatial dependencies. Each recurrent layer takes as input the local features derived from unrolled cells and sweeps the 3D space along two directions successively to integrate structure knowledge. On challenging indoor and outdoor 3D datasets, the proposed framework demonstrates robust performance superior to state-of-the-arts.

251 citations