Author

Kwanghoon Sohn

Bio: Kwanghoon Sohn is an academic researcher from Yonsei University. The author has contributed to research in topics: Convolutional neural network & Motion estimation. The author has an h-index of 37 and has co-authored 395 publications receiving 5,256 citations. Previous affiliations of Kwanghoon Sohn include Georgetown University & North Carolina State University.


Papers
Journal ArticleDOI
TL;DR: This paper presents an efficient technique for spatially inhomogeneous edge-preserving image smoothing, called the fast global smoother; focusing on sparse Laplacian matrices consisting of a data term and a prior term, it approximates the solution of the memory- and computation-intensive large linear system by solving a sequence of 1D subsystems.
Abstract: This paper presents an efficient technique for performing spatially inhomogeneous edge-preserving image smoothing, called the fast global smoother. Focusing on sparse Laplacian matrices consisting of a data term and a prior term (typically defined using four or eight neighbors for a 2D image), our approach efficiently solves such global objective functions. In particular, we approximate the solution of the memory- and computation-intensive large linear system, defined over a d-dimensional spatial domain, by solving a sequence of 1D subsystems. Our separable implementation enables applying a linear-time tridiagonal matrix algorithm to solve d three-point Laplacian matrices iteratively. Our approach combines the best of two paradigms, i.e., efficient edge-preserving filters and optimization-based smoothing. Our method has a runtime comparable to the fast edge-preserving filters, but its global optimization formulation overcomes many limitations of the local filtering approaches. Our method also achieves results of a quality comparable to the state-of-the-art optimization-based techniques, but runs ∼10-30 times faster. Moreover, considering the flexibility in defining an objective function, we further propose generalized fast algorithms that perform Lγ norm smoothing (0 < γ < 2) and support an aggregated (robust) data term for handling imprecise data constraints. We demonstrate the effectiveness and efficiency of our techniques in a range of image processing and computer graphics applications.
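The separable 1D solve that the abstract refers to reduces to a three-point (tridiagonal) Laplacian system, which a linear-time tridiagonal solver handles directly. Below is a minimal sketch of that building block, assuming a 1D signal f, a guidance signal guide, and a simple exponential affinity weight; it illustrates the idea only and is not the authors' implementation.

```python
import numpy as np

def smooth_1d(f, guide, lam=20.0, sigma=0.1):
    """Edge-preserving 1D smoothing: solve (I + lam*L) u = f, where L is a
    three-point Laplacian weighted by the guidance signal (assumed weights)."""
    n = len(f)
    w = np.exp(-np.abs(np.diff(guide)) / sigma)          # affinities between neighbors
    lower = upper = -lam * w                             # sub- and super-diagonal
    diag = 1.0 + lam * (np.r_[w, 0.0] + np.r_[0.0, w])   # main diagonal
    # Thomas algorithm: forward elimination, then back substitution (O(n)).
    c, d = np.empty(n - 1), np.empty(n)
    c[0] = upper[0] / diag[0]
    d[0] = f[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / m
        d[i] = (f[i] - lower[i - 1] * d[i - 1]) / m
    u = np.empty(n)
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u
```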

313 citations

Journal ArticleDOI
TL;DR: This paper proposes a real-time and illumination-invariant lane detection method for lane departure warning systems that works well in various illumination conditions, such as bad weather and night time.
Abstract: Highlights: the invariant property of lane color under various illuminations is utilized for lane detection; computational complexity is reduced using vanishing-point detection and an adaptive ROI; the evaluation datasets cover various environments captured by several devices; simulation demos demonstrate fast performance suitable for real-time applications. Lane detection is an important element in improving driving safety. In this paper, we propose a real-time and illumination-invariant lane detection method for lane departure warning systems. The proposed method works well in various illumination conditions such as bad weather and night time. It includes three major components: First, we detect a vanishing point based on a voting map and define an adaptive region of interest (ROI) to reduce computational complexity. Second, we utilize the distinct property of lane colors to achieve illumination-invariant lane marker candidate detection. Finally, we find the main lane from the lane marker candidates using a clustering method. In the case of a lane departure, our system sends the driver an alarm signal. Experimental results show satisfactory performance with an average detection rate of 93% under various illumination conditions. Moreover, the overall process takes only 33 ms per frame.
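As a rough illustration of the first two components (adaptive ROI below a vanishing point, color-based lane marker candidates), here is a sketch in OpenCV; the HSV thresholds and the ROI rule are assumptions for illustration, not the tuned values from the paper.

```python
import cv2
import numpy as np

def lane_candidates(bgr, vanish_y):
    """Pick lane-colored pixels inside an adaptive ROI below an
    (assumed, externally estimated) vanishing-point row vanish_y."""
    h, w = bgr.shape[:2]
    roi = bgr[int(vanish_y):h, :]                              # process only the road region
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    white = cv2.inRange(hsv, (0, 0, 180), (180, 40, 255))      # bright, low-saturation markers
    yellow = cv2.inRange(hsv, (15, 60, 120), (35, 255, 255))   # yellow markers
    mask = cv2.bitwise_or(white, yellow)
    ys, xs = np.nonzero(mask)
    return np.column_stack([xs, ys + int(vanish_y)])           # candidates in full-image coordinates
```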

194 citations

Journal ArticleDOI
TL;DR: This paper proposes a gradient-enhancing conversion method that produces a new gray-level image from an RGB color image based on linear discriminant analysis for illumination-robust lane detection, together with a novel lane detection algorithm that uses the proposed conversion method, an adaptive Canny edge detector, the Hough transform, and curve model fitting.
Abstract: Lane detection is important in many advanced driver-assistance systems (ADAS). Vision-based lane detection algorithms are widely used and generally use gradient information as a lane feature. However, gradient values between lanes and roads vary with illumination change, which degrades the performance of lane detection systems. In this paper, we propose a gradient-enhancing conversion method for illumination-robust lane detection. Our proposed gradient-enhancing conversion method produces a new gray-level image from an RGB color image based on linear discriminant analysis. The converted images have large gradients at lane boundaries. To deal with illumination changes, the gray-level conversion vector is dynamically updated. In addition, we propose a novel lane detection algorithm, which uses the proposed conversion method, adaptive Canny edge detector, Hough transform, and curve model fitting method. We performed several experiments in various illumination environments and confirmed that the gradient is maximized at lane boundaries on the road. The detection rate of the proposed lane detection algorithm averages 96% and is greater than 93% in very poor environments.
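The conversion step can be read as a two-class Fisher LDA over sampled lane-marking and road pixels, whose discriminant direction then serves as the RGB-to-gray projection. The sketch below illustrates that reading; how the two pixel sets are sampled and how the vector is updated over time are not shown.

```python
import numpy as np

def gradient_enhancing_vector(lane_rgb, road_rgb):
    """Fisher LDA direction separating lane pixels (N1 x 3) from road pixels (N2 x 3)."""
    mu_l, mu_r = lane_rgb.mean(axis=0), road_rgb.mean(axis=0)
    sw = np.cov(lane_rgb, rowvar=False) + np.cov(road_rgb, rowvar=False)  # within-class scatter
    w = np.linalg.solve(sw, mu_l - mu_r)                                  # maximizes class separation
    return w / np.linalg.norm(w)

def convert_to_gray(rgb_image, w):
    """Project each RGB pixel onto w, giving a gray image with large gradients at lane boundaries."""
    gray = rgb_image.astype(np.float64) @ w
    gray -= gray.min()
    return (255.0 * gray / max(gray.max(), 1e-9)).astype(np.uint8)
```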

188 citations

Journal ArticleDOI
TL;DR: The proposed method for cost aggregation and occlusion handling for stereo matching is the most successful among cost aggregation methods on standard stereo test beds, and asymmetric information is used so that little additional computation is needed.
Abstract: This paper presents a novel method for cost aggregation and occlusion handling for stereo matching. In order to estimate the optimal cost, given a per-pixel difference image as observed data, we define an energy function and solve the minimization problem by solving the iterative equation with a numerical method. We improve performance and increase the convergence rate by using several acceleration techniques such as the Gauss-Seidel method, the multiscale approach, and adaptive interpolation. The proposed method is computationally efficient since it does not use color segmentation or any global optimization techniques. For occlusion handling, which has not been performed effectively by conventional cost aggregation approaches, we combine the occlusion problem with the proposed minimization scheme. Asymmetric information is used so that little additional computation is needed. Experimental results show that performance is comparable to that of many state-of-the-art methods. On standard stereo test beds, the proposed method is in fact the most successful among cost aggregation methods.
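The core of such a scheme can be sketched as a Gauss-Seidel relaxation of a quadratic energy with a data term and a 4-neighbor smoothness term, applied to one disparity slice of the cost volume. The code below shows only that update; the multiscale and adaptive-interpolation accelerations, the occlusion handling, and the actual energy weights are omitted assumptions.

```python
import numpy as np

def aggregate_cost_slice(cost_obs, lam=0.5, iters=20):
    """Minimize sum_p (C_p - O_p)^2 + lam * sum_(p,q) (C_p - C_q)^2 over 4-neighborhoods
    by in-place Gauss-Seidel sweeps (already-updated neighbors are reused)."""
    c = cost_obs.astype(np.float64).copy()
    h, w = c.shape
    for _ in range(iters):
        for y in range(h):
            for x in range(w):
                nb = []
                if y > 0:     nb.append(c[y - 1, x])
                if y < h - 1: nb.append(c[y + 1, x])
                if x > 0:     nb.append(c[y, x - 1])
                if x < w - 1: nb.append(c[y, x + 1])
                # Closed-form per-pixel minimizer with neighbors held fixed.
                c[y, x] = (cost_obs[y, x] + lam * sum(nb)) / (1.0 + lam * len(nb))
    return c
```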

144 citations

Patent
17 Sep 2002
TL;DR: In this patent, a disparity prediction stage and a motion prediction stage predict a disparity vector and a motion vector by extending the MPEG-2 structure along a view axis and using spatial/temporal correlation.
Abstract: A disparity prediction stage and a motion prediction stage predict a disparity vector and a motion vector by extending the MPEG-2 structure along a view axis and using spatial/temporal correlation. A disparity/motion compensation stage compensates an image reconstructed by the disparity prediction stage and the motion prediction stage using a sub-pixel compensation method. A residual image encoding stage performs encoding to provide better visual quality and a three-dimensional effect for the original and reconstructed images. A bit rate control stage assigns an effective number of bits to each frame of the reconstructed image according to a target bit rate. An entropy encoding stage generates a bit stream for the multi-view video source data according to the bit rate.
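The disparity-vector search that such a multi-view predictor relies on can be illustrated with a toy full-search block-matching routine; the block size, search range, and SAD cost below are illustrative assumptions, not the patent's specification.

```python
import numpy as np

def block_disparity(left, right, y, x, block=8, max_disp=32):
    """Full-search disparity for one block of the left view against the right view,
    using a sum-of-absolute-differences cost (toy example)."""
    ref = left[y:y + block, x:x + block].astype(np.int32)
    best_d, best_cost = 0, np.inf
    for d in range(0, min(max_disp, x) + 1):
        cand = right[y:y + block, x - d:x - d + block].astype(np.int32)
        cost = np.abs(ref - cand).sum()          # SAD matching cost
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```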

126 citations


Cited by
Proceedings Article
01 Jan 1994
TL;DR: The main focus in MUCKE is on cleaning large-scale Web image corpora and on proposing image representations that are closer to the human interpretation of images.
Abstract: MUCKE aims to mine a large volume of images, to structure them conceptually, and to use this conceptual structuring in order to improve large-scale image retrieval. The last decade witnessed important progress concerning low-level image representations. However, there are a number of problems which need to be solved in order to unleash the full potential of image mining in applications. The central problem with low-level representations is the mismatch between them and the human interpretation of image content. This problem can be instantiated, for instance, by the inability of existing descriptors to capture spatial relationships between the concepts represented, or by their inability to convey an explanation of why two images are similar in a content-based image retrieval framework. We start by assessing existing local descriptors for image classification and by proposing to use co-occurrence matrices to better capture spatial relationships in images. The main focus in MUCKE is on cleaning large-scale Web image corpora and on proposing image representations which are closer to the human interpretation of images. Consequently, we introduce methods which tackle these two problems and compare results to state-of-the-art methods. Note: some aspects of this deliverable are withheld at this time as they are pending review. Please contact the authors for a preview.
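One concrete, widely used form of image co-occurrence statistics is the gray-level co-occurrence matrix; the scikit-image sketch below is purely illustrative and is not the concept-level co-occurrence representation proposed in the deliverable.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def cooccurrence_descriptor(gray_uint8):
    """Texture descriptor from gray-level co-occurrence matrices over two
    distances and two directions (illustrative feature choice)."""
    glcm = graycomatrix(gray_uint8, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    feats = [graycoprops(glcm, prop).ravel()
             for prop in ("contrast", "homogeneity", "energy", "correlation")]
    return np.concatenate(feats)
```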

2,134 citations

Reference EntryDOI
15 Oct 2004

2,118 citations

Proceedings Article
01 Jan 1989
TL;DR: A scheme is developed for classifying the types of motion perceived by a humanlike robot, and equations, theorems, concepts, clues, etc., relating the objects, their positions, and their motion to their images on the focal plane are presented.
Abstract: A scheme is developed for classifying the types of motion perceived by a humanlike robot. It is assumed that the robot receives visual images of the scene using a perspective system model. Equations, theorems, concepts, clues, etc., relating the objects, their positions, and their motion to their images on the focal plane are presented.

2,000 citations

Journal ArticleDOI
TL;DR: A new heuristic for feature detection is presented and, using machine learning, a feature detector is derived from it that can fully process live PAL video using less than 5 percent of the available processing time.
Abstract: The repeatability and efficiency of a corner detector determines how likely it is to be useful in a real-world application. The repeatability is important because the same scene viewed from different positions should yield features which correspond to the same real-world 3D locations. The efficiency is important because this determines whether the detector combined with further processing can operate at frame rate. Three advances are described in this paper. First, we present a new heuristic for feature detection and, using machine learning, we derive a feature detector from this which can fully process live PAL video using less than 5 percent of the available processing time. By comparison, most other detectors cannot even operate at frame rate (Harris detector 115 percent, SIFT 195 percent). Second, we generalize the detector, allowing it to be optimized for repeatability, with little loss of efficiency. Third, we carry out a rigorous comparison of corner detectors based on the above repeatability criterion applied to 3D scenes. We show that, despite being principally constructed for speed, on these stringent tests, our heuristic detector significantly outperforms existing feature detectors. Finally, the comparison demonstrates that using machine learning produces significant improvements in repeatability, yielding a detector that is both very fast and of very high quality.
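For context, OpenCV ships a learned segment-test corner detector of this kind as FAST; the snippet below is a usage sketch with an illustrative threshold and is not claimed to reproduce the exact detector or timings evaluated in the paper.

```python
import cv2
import numpy as np

img = (np.random.rand(480, 640) * 255).astype(np.uint8)    # stand-in for a video frame
fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
keypoints = fast.detect(img, None)                          # corner keypoints
print(f"{len(keypoints)} corners detected")
```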

1,847 citations