
Showing papers in "The Visual Computer in 2014"


Journal ArticleDOI
TL;DR: A comprehensive survey and key insights into this fast-rising area of InfoVis are presented, identifying existing technical challenges and proposing directions for future research.
Abstract: Information visualization (InfoVis), the study of transforming data, information, and knowledge into interactive visual representations, is very important to users because it provides mental models of information. The boom in big data analytics has triggered broad use of InfoVis in a variety of domains, ranging from finance to sports to politics. In this paper, we present a comprehensive survey and key insights into this fast-rising area. The research on InfoVis is organized into a taxonomy that contains four main categories, namely empirical methodologies, user interactions, visualization frameworks, and applications, which are each described in terms of their major goals, fundamental principles, recent trends, and state-of-the-art approaches. At the conclusion of this survey, we identify existing technical challenges and propose directions for future research.

328 citations


Journal ArticleDOI
TL;DR: This work introduces group saliency to achieve superior unsupervised salient object segmentation by extracting salient objects (in collections of pre-filtered images) that maximize between-image similarities and within-image distinctness.
Abstract: Efficiently identifying salient objects in large image collections is essential for many applications including image retrieval, surveillance, image annotation, and object recognition. We propose a simple, fast, and effective algorithm for locating and segmenting salient objects by analysing image collections. As a key novelty, we introduce group saliency to achieve superior unsupervised salient object segmentation by extracting salient objects (in collections of pre-filtered images) that maximize between-image similarities and within-image distinctness. To evaluate our method, we construct a large benchmark dataset consisting of 15 K images across multiple categories, with 6000+ pixel-accurate ground truth annotations for salient object regions where applicable. In all our tests, group saliency consistently outperforms state-of-the-art single-image saliency algorithms, resulting in both higher precision and better recall. Our algorithm successfully handles image collections an order of magnitude larger than any existing benchmark dataset, consisting of diverse and heterogeneous images from various internet sources.

304 citations


Journal ArticleDOI
TL;DR: A comprehensive survey on low-resolution face recognition methods, including concept description, system architecture, and method categorization is given and promising trends and crucial issues for future research are discussed.
Abstract: Low-resolution face recognition (LR FR) aims to recognize faces from small-size or poor-quality images with varying pose, illumination, expression, etc. It has received much attention with increasing demands for long-distance surveillance applications, and extensive efforts have been made on LR FR research in recent years. However, many issues in LR FR remain unsolved, such as super-resolution (SR) for face recognition, resolution-robust features, unified feature spaces, and face detection at a distance, despite the many methods developed to address them. This paper provides a comprehensive survey of these methods and discusses many related issues. First, it gives an overview of LR FR, including concept description, system architecture, and method categorization. Second, many representative methods are broadly reviewed and discussed. They are classified into two categories: super-resolution for LR FR and resolution-robust feature representation for LR FR. Their strategies and advantages/disadvantages are elaborated. Relevant issues such as databases and evaluations for LR FR are also presented. By summarizing their performance and limitations, promising trends and crucial issues for future research are finally discussed.

122 citations


Journal ArticleDOI
TL;DR: An overview of the various technologies that have been developed in recent years to assist the visually impaired in recognizing generic objects in indoor environments, with a focus on approaches based on computer vision.
Abstract: Although several electronic assistive devices have been developed for the visually impaired in the past few decades, relatively few solutions have been devised to aid them in recognizing generic objects in their environment, particularly indoors. Nevertheless, research in this area is gaining momentum. Among the various technologies being utilized for this purpose, computer vision based solutions are emerging as one of the most promising options, mainly due to their affordability and accessibility. This paper provides an overview of the various technologies that have been developed in recent years to assist the visually impaired in recognizing generic objects in indoor environments, with a focus on approaches based on computer vision. It aims to introduce researchers to the latest trends in this area as well as to serve as a resource for developers who wish to incorporate such solutions into their own work.

108 citations


Journal ArticleDOI
TL;DR: A physiological inverse tone mapping algorithm inspired by properties of the Human Visual System (HVS): it first imitates the retina response and derives a locally adaptive form; then it estimates local adaptation luminance at each point in the image; finally, the LDR image and local luminance are applied to the inverted local retina response to reconstruct the dynamic range of the original scene.
Abstract: The mismatch between Low Dynamic Range (LDR) content and High Dynamic Range (HDR) displays has motivated research on inverse tone mapping algorithms. In this paper, we present a physiological inverse tone mapping algorithm inspired by properties of the Human Visual System (HVS). It first imitates the retina response and derives a locally adaptive form; then it estimates local adaptation luminance at each point in the image; finally, the LDR image and local luminance are applied to the inverted local retina response to reconstruct the dynamic range of the original scene. Good performance and high visual quality were validated on 40 test images. Comparison with several existing inverse tone mapping methods demonstrates the conciseness and efficiency of the proposed algorithm.

105 citations


Journal ArticleDOI
TL;DR: A comprehensive investigation of which facial features are, on average, the most relevant, how effective computer algorithms can be at detecting sibling pairs, and whether they can outperform human evaluation.
Abstract: In everyday life, face similarity is an important kinship clue. Computer algorithms able to infer kinship from pairs of face images could be applied in forensics, image retrieval and annotation, and historical studies. So far, little work in this area has been presented, and only one study, using a small set of low-quality images, tackles the problem of identifying sibling pairs. The purpose of our paper is to present a comprehensive investigation of this subject, aimed at understanding which facial features are, on average, the most relevant, how effective computer algorithms can be at detecting sibling pairs, and whether they can outperform human evaluation. To avoid problems due to low-quality pictures and uncontrolled imaging conditions, as in the heterogeneous datasets collected for previous research, we prepared a database of high-quality pictures of sibling pairs, shot in controlled conditions and including frontal, profile, expressionless, and smiling faces. We then constructed various classifiers of image pairs using different types of facial data, based on various geometric, textural, and holistic features. The classifiers were first tested separately, and then the most significant facial data, selected with a two-stage feature selection algorithm, were combined into a unique classifier. The discriminating ability of the automatic classifier combining features of different nature has been found to outperform that of a panel of human raters. We also show the good generalization capabilities of the algorithm by applying the classifier, in a cross-database experiment, to a low-quality database of images collected from the Internet.

83 citations


Journal ArticleDOI
TL;DR: A new method based on artificial neural networks (ANN) and an image processing technique was used for identification of butterfly species as an alternative to conventional diagnostic methods, suggesting that texture and color features can be useful for identification of butterfly species.
Abstract: Butterflies can be classified by their outer morphological qualities, by genital characteristics obtained using various chemical substances and manually prepared genital slides, or by molecular techniques, which are very expensive. In this study, a new method based on artificial neural networks (ANN) and an image processing technique was used for identification of butterfly species as an alternative to conventional diagnostic methods. Five texture and three color features obtained from 140 butterfly images were used for identification of species. Texture features were obtained by averaging gray-level co-occurrence matrix (GLCM) features computed with different angles and distances. The accuracy of the proposed butterfly classification method reached 92.85 %. These findings suggest that texture and color features can be useful for identification of butterfly species.

80 citations
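
The GLCM feature extraction step can be sketched in NumPy. This is an illustrative reconstruction, not the authors' code: the single offset, the gray-level quantization, and the particular statistics (contrast, energy, homogeneity, entropy) are assumptions; the paper averages GLCM features over several angles and distances before feeding them, together with color features, to the ANN.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Normalized gray-level co-occurrence matrix for one (dx, dy) offset.

    img must hold integer gray levels in [0, levels); dx, dy >= 0.
    """
    img = np.asarray(img)
    h, w = img.shape
    P = np.zeros((levels, levels), dtype=np.float64)
    src = img[:h - dy, :w - dx]   # each pixel...
    dst = img[dy:, dx:]           # ...paired with its (dx, dy) neighbor
    np.add.at(P, (src.ravel(), dst.ravel()), 1)  # count co-occurrences
    return P / P.sum()

def glcm_features(P):
    """Four classic Haralick-style statistics of a normalized GLCM."""
    i, j = np.indices(P.shape)
    contrast = np.sum(P * (i - j) ** 2)
    energy = np.sum(P ** 2)
    homogeneity = np.sum(P / (1.0 + np.abs(i - j)))
    entropy = -np.sum(P[P > 0] * np.log2(P[P > 0]))
    return contrast, energy, homogeneity, entropy
```

Averaging the feature tuples over offsets such as (1, 0), (0, 1), and (1, 1) would approximate the paper's angle/distance averaging.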


Journal ArticleDOI
TL;DR: The developed solution enables natural and intuitive hand-pose recognition of American Sign Language (ASL), extending the recognition to ambiguous letters not challenged by previous work.
Abstract: This work targets real-time recognition of both static hand-poses and dynamic hand-gestures in a unified open-source framework. The developed solution enables natural and intuitive hand-pose recognition of American Sign Language (ASL), extending the recognition to ambiguous letters not challenged by previous work. While hand-pose recognition exploits techniques working on depth information using texture-based descriptors, gesture recognition evaluates hand trajectories in the depth stream using angular features and hidden Markov models (HMM). Although classifiers come already trained on the ASL alphabet and 16 uni-stroke dynamic gestures, users are able to extend these default sets by adding their personalized poses and gestures. The accuracy and robustness of the recognition system have been evaluated using a publicly available database and across many users. The XKin open project is available online (Pedersoli, XKin libraries. https://github.com/fpeder/XKin , 2013) under FreeBSD License for researchers in human-machine interaction.

70 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed algorithm efficiently tracks salient objects and better accounts for partial occlusions and large variations in appearance.
Abstract: In this paper, we represent the object with multiple attentional blocks which reflect findings on selective visual attention in human perception. The attentional blocks are extracted using a branch-and-bound search method on the saliency map, and the weight of each block is determined at the same time. Independent particle filter tracking is applied to each attentional block, and the tracking results of all the blocks are then combined in a linear weighting scheme to get the location of the entire target object. The attentional blocks are propagated to the object location found in each new frame, and the state of the most likely particle in each block is also updated with the new propagated position. In addition, to avoid error accumulation caused by appearance variations, the object template and the positions of the attentional blocks are adaptively updated while tracking. Experimental results show that the proposed algorithm is able to efficiently track salient objects and better accounts for partial occlusions and large variations in appearance.

45 citations


Journal ArticleDOI
M.A. Berbar1
TL;DR: This research paper introduces three robust approaches for feature extraction for gender classification: one based on the Discrete Cosine Transform (DCT), one based on texture features extracted using the gray-level co-occurrence matrix (GLCM), and a third based on the 2D wavelet transform.
Abstract: This research paper introduces three robust approaches for feature extraction for gender classification. The first approach is based on the Discrete Cosine Transform (DCT) and consists of two different methods for calculating feature values. The second approach is based on the extraction of texture features using the gray-level co-occurrence matrix (GLCM). The third approach is based on the 2D wavelet transform. The extracted feature vectors are classified using an SVM. For precise evaluation, the databases used for gender evaluation are based on images from the AT&T, Faces94, UMIST, and color FERET databases. K-fold cross validation is used in training the SVM. The accuracies of gender classification when using one of the two proposed DCT methods for feature extraction are 98.6 %, 99.97 %, 99.90 %, and 93.3 % with 2-fold cross validation, and 98.93 %, 100 %, 99.9 %, and 92.18 % with 5-fold cross validation. The accuracies of the GLCM texture features approach for facial gender classification are 98.8 %, 99.6 %, 100 %, and 93.11 % for the AT&T, Faces94, UMIST, and FERET databases. The accuracies for all databases when using the 2D wavelet transform range between 96.18 % and 99.6 %, except FERET, whose accuracy is 92 %.

43 citations


Journal ArticleDOI
TL;DR: This technique is found to have an edge over the other contemporary methods in terms of Entropy and Absolute Mean Brightness Error.
Abstract: A novel technique, Optimized Bi-Histogram Equalization (OBHE), is proposed in this paper for preserving brightness and enhancing the contrast of any input image. The central idea of this technique is to first segment the histogram of the input image into two sub-histograms based on its mean; weighting constraints are then applied to each of the sub-histograms separately. The two sub-histograms are equalized independently, and their union produces a brightness-preserved and contrast-enhanced output image. While formulating the weighting constraints, Particle Swarm Optimization (PSO) is employed to find the optimal constraints that maximize the degree of brightness preservation and contrast enhancement. This technique is found to have an edge over other contemporary methods in terms of Entropy and Absolute Mean Brightness Error.
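
The histogram-splitting core can be sketched as follows. This shows only the unweighted bi-histogram baseline (essentially BBHE, the standard mean-split scheme); the PSO-optimized weighting constraints that define OBHE are omitted, so treat it as a starting point rather than the proposed method.

```python
import numpy as np

def bi_histogram_equalize(img, levels=256):
    """Mean-split bi-histogram equalization (unweighted BBHE baseline).

    The histogram is split at the image mean and the two sub-histograms
    are equalized independently over their own gray ranges, which keeps
    the output mean close to the input mean.
    """
    img = np.asarray(img)
    m = int(img.mean())
    out = np.empty_like(img)
    low = img <= m
    for mask, lo, hi in ((low, 0, m), (~low, m + 1, levels - 1)):
        vals = img[mask]
        if vals.size == 0:
            continue
        hist = np.bincount(vals - lo, minlength=hi - lo + 1).astype(float)
        cdf = hist.cumsum() / vals.size               # CDF over [lo, hi]
        out[mask] = (lo + cdf[vals - lo] * (hi - lo)).astype(img.dtype)
    return out
```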

Journal ArticleDOI
TL;DR: A hierarchical model for action recognition that can robustly recognize actions in real-time and handle confusing motions is proposed, and bag-of-words is used to represent the classification features.
Abstract: Action recognition solely based on video data has known to be very sensitive to background activity, and also lacks the ability to discriminate complex 3D motion. With the development of commercial depth cameras, skeleton-based action recognition is becoming more and more popular. However, the skeleton-based approach is still very challenging because of the large variation in human actions and temporal dynamics. In this paper, we propose a hierarchical model for action recognition. To handle confusing motions, a motion-based grouping method is proposed, which can efficiently assign each video a group label, and then for each group, a pre-trained classifier is used for frame-labeling. Unlike previous methods, we adopt a bottom-up approach that first performs action recognition for each frame. The final action label is obtained by fusing the classification to its frames, with the effect of each frame being adaptively adjusted based on its local properties. To achieve online real-time performance and suppressing noise, bag-of-words is used to represent the classification features. The proposed method is evaluated using two challenge datasets captured by a Kinect. Experiments show that our method can robustly recognize actions in real-time.

Journal ArticleDOI
TL;DR: Binary patterns generated using a threshold equal to the sum of the center pixel value and the average local difference are proposed; they achieve higher classification accuracy while being more robust to noise.
Abstract: Effectiveness of local binary pattern (LBP) features is well proven in the field of texture image classification and retrieval. This paper presents a more effective completed modeling of the LBP. The traditional LBP has a shortcoming that sometimes it may represent different structural patterns with same LBP code. In addition, LBP also lacks global information and is sensitive to noise. In this paper, the binary patterns generated using threshold as a summation of center pixel value and average local differences are proposed. The proposed local structure patterns (LSP) can more accurately classify different textural structures as they utilize both local and global information. The LSP can be combined with a simple LBP and center pixel pattern to give a completed local structure pattern (CLSP) to achieve higher classification accuracy. In order to make CLSP insensitive to noise, a robust local structure pattern (RLSP) is also proposed. The proposed scheme is tested over three representative texture databases viz. Outex, Curet, and UIUC. The experimental results indicate that the proposed method can achieve higher classification accuracy while being more robust to noise.
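
The modified thresholding rule (threshold as the sum of the center pixel value and the average local difference) can be illustrated in NumPy. The neighbor ordering and bit weights are arbitrary conventions chosen here; the paper's full CLSP/RLSP descriptors additionally combine this with a simple LBP, a center-pixel pattern, and a noise-robust variant.

```python
import numpy as np

def local_structure_pattern(img):
    """8-bit pattern per interior pixel with a modified LBP threshold.

    Classic LBP thresholds each neighbor at the center value; here the
    threshold is t = c + mean_k(g_k - c), the center value plus the
    average local difference, as described in the abstract.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[1:-1, 1:-1]
    # Stack the 8 neighbors of every interior pixel as shifted views.
    neigh = np.stack([img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
                      for dy, dx in offsets])
    thresh = center + (neigh - center).mean(axis=0)
    bits = (neigh >= thresh).astype(np.int64)
    weights = (1 << np.arange(8)).reshape(8, 1, 1)   # bit position per neighbor
    return (bits * weights).sum(axis=0)
```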

Journal ArticleDOI
TL;DR: This paper presents a 3D face recognition approach based on the meshDOG keypoints detector and local GH descriptor, and proposes original solutions to improve keypoints stability and select the most effective features from the local descriptors.
Abstract: 3D face identification based on the detection and comparison of keypoints of the face is a promising solution to extend face recognition approaches to the case of 3D scans with occlusions and missing parts. In fact, approaches that perform sparse keypoints matching can naturally allow for partial face comparison. However, such methods typically use a large number of keypoints, locally described by high-dimensional feature vectors: This, combined with the combinatorial number of keypoint comparisons required to match two face scans, results in a high computational cost that does not scale well with large datasets. Motivated by these considerations, in this paper, we present a 3D face recognition approach based on the meshDOG keypoints detector and local GH descriptor, and propose original solutions to improve keypoints stability and select the most effective features from the local descriptors. Experiments have been performed to assess the validity of the proposed optimizations for stable keypoints detection and feature selection. Recognition accuracy has been evaluated on the Bosphorus database, showing competitive results with respect to existing 3D face identification solutions based on 3D keypoints.

Journal ArticleDOI
TL;DR: This paper proposes a method to segment and label bone fragments from CT images based on 2D region growing and requires minimal user interaction and is able to separate wrongly joined fragments during the segmentation process.
Abstract: The segmentation of fractured bone from computed tomographies (CT images) is an important process in medical visualization and simulation, because it enables such applications to use data of a specific patient. On the other hand, the labeling of fractured bone usually requires the participation of an expert. Moreover, nearby fragments can appear joined after segmentation because of their proximity and the resolution of the CT image. Classical methods perform well in the segmentation of healthy bone, but they are not able to identify bone fragments separately. In this paper, we propose a method to segment and label bone fragments from CT images. Labeling involves the identification of bone fragments separately. The method is based on 2D region growing and requires minimal user interaction. In addition, the presented method is able to separate wrongly joined fragments during the segmentation process.
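
The 2D region-growing primitive the method builds on can be sketched as a tolerance-based flood fill. The 4-connectivity and the fixed intensity tolerance are assumptions for illustration; the paper's contribution lies in the fragment labeling and the splitting of wrongly joined fragments layered on top of this step.

```python
from collections import deque
import numpy as np

def region_grow(img, seed, tol=10):
    """Grow a 4-connected region from `seed`, accepting pixels whose
    intensity lies within `tol` of the seed intensity.

    Returns a boolean mask of the grown region.
    """
    img = np.asarray(img)
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = int(img[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                    and abs(int(img[ny, nx]) - seed_val) <= tol):
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask
```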

Journal ArticleDOI
TL;DR: An octree grid that takes a middle path between hierarchical grids and octrees is used, accelerating collision detection by significantly reducing the number of broad-phase tests which, due to their large quantity, are generally the main performance bottleneck.
Abstract: In spatial subdivision-based collision detection methods on GPUs, uniform subdivision works well for even triangle spatial distributions, whilst for uneven cases non-uniform subdivision works better. Non-uniform subdivision techniques mainly include hierarchical grids and octrees. Hierarchical grids have been adopted for previous GPU-based approaches, due to their suitability for GPUs. However, octrees offer a better adaptation to distributions. One contribution of this paper is the use of an octree grid that takes a middle path between these two structures, and accelerates collision detection by significantly reducing the number of broad-phase tests which, due to their large quantity, are generally the main bottleneck in performance. Another contribution is to achieve further reduction in the number of tests in the broad phase using a two-stage scheme to improve octree subdivision. The octree grid approach is also able to address the issue of uneven triangle sizes, another common difficulty for spatial subdivision techniques. Compared to the virtual subdivision method which reports the fastest results among existing methods, speedups between 1.0× and 1.5× are observed for most standard benchmarks where triangle sizes and spatial distributions are uneven.

Journal ArticleDOI
TL;DR: This work presents an efficient approach for high-quality non-blind deconvolution based on the use of sparse adaptive priors that enforces preservation of strong edges while removing noise, and shows that its results tend to have higher peak signal-to-noise ratio than the state-of-the-art techniques.
Abstract: We present an efficient approach for high-quality non-blind deconvolution based on the use of sparse adaptive priors. Its regularization term enforces preservation of strong edges while removing noise. We model the image-prior deconvolution problem as a linear system, which is solved in the frequency domain. This clean formulation lends to a simple and efficient implementation. We demonstrate its effectiveness by performing an extensive comparison with existing non-blind deconvolution methods, and by using it to deblur photographs degraded by camera shake. Our experiments show that our solution is faster and its results tend to have higher peak signal-to-noise ratio than the state-of-the-art techniques. Thus, it provides an attractive alternative to perform high-quality non-blind deconvolution of large images, as well as to be used as the final step of blind-deconvolution algorithms.
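
Formulating image-prior deconvolution as a linear system solved in the frequency domain can be illustrated with a plain Tikhonov (L2) regularizer standing in for the paper's sparse adaptive prior. The closed-form per-frequency solve is the same structural idea, but the prior term here is a deliberate simplification, not the proposed method.

```python
import numpy as np

def deconvolve(blurred, kernel, lam=1e-2):
    """Non-blind deconvolution solved as a linear system in the
    frequency domain.

    Minimizes ||k * x - b||^2 + lam * ||x||^2 under circular boundary
    conditions; the closed-form solution per frequency is
    X = conj(K) * B / (|K|^2 + lam).
    """
    H, W = blurred.shape
    kh, kw = kernel.shape
    # Zero-pad the kernel to image size and center it at the origin
    # so that its FFT corresponds to circular convolution.
    K = np.zeros((H, W))
    K[:kh, :kw] = kernel
    K = np.roll(K, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    Kf = np.fft.fft2(K)
    Bf = np.fft.fft2(blurred)
    Xf = np.conj(Kf) * Bf / (np.abs(Kf) ** 2 + lam)
    return np.real(np.fft.ifft2(Xf))
```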

Journal ArticleDOI
TL;DR: The approach formulates a nonlinear cost function from a relationship between two points of the scene and their projections in the image planes; solving this function enables estimation of the intrinsic parameters of different cameras.
Abstract: This work proposes a method for self-calibration of cameras with varying intrinsic parameters from a sequence of images of an unknown 3D object. The projection of two points of the 3D scene in the image planes is used with fundamental matrices to determine the projection matrices. The present approach is based on the formulation of a nonlinear cost function from the determination of a relationship between two points of the scene and their projections in the image planes. Solving this function enables us to estimate the intrinsic parameters of different cameras. The strong point of the present approach is clearly seen in the minimization of the three constraints of a self-calibration system (a pair of images, 3D scene, any camera): the use of a single pair of images provides fewer equations, which minimizes the execution time of the program; the use of a 3D scene reduces the planarity constraints; and the use of any camera eliminates the constraint of cameras having constant parameters. Experimental results on synthetic and real data are presented to demonstrate the performance of the present approach in terms of accuracy, simplicity, stability, and convergence.

Journal ArticleDOI
TL;DR: A new 3D shape classification and retrieval method, based on a supervised selection of the most significant features in a space of attributed extended Reeb graphs encoding different shape characteristics, is proposed.
Abstract: We propose in this article a new 3D shape classification and retrieval method, based on a supervised selection of the most significant features in a space of attributed extended Reeb graphs encoding different shape characteristics. The similarity between pairs of graphs is addressed through both their representation as set of bags of shortest paths, and the definition of kernels adapted to these descriptions. A multiple kernel learning algorithm is used on this set of kernels to find an optimal linear combination of kernels for classification and retrieval purposes. Results on classical data sets are comparable with the best results of the literature, and the modularity and flexibility of the kernel learning ensure its applicability to a large set of methods.

Journal ArticleDOI
TL;DR: A novel architecture for template generation in the context of situation awareness system in real and virtual applications is presented and a novel cancelable biometric template generation algorithm utilizing random biometric fusion, random projection and selection is developed.
Abstract: Recently, cancelable biometrics emerged as one of the highly effective methods of template protection. The concept behind cancelable biometrics, or cancelability, is a transformation of biometric data or extracted features into an alternative form that cannot easily be used by an imposter or intruder, and that can be revoked if compromised. In this paper, we present a novel architecture for template generation in the context of situation awareness systems in real and virtual applications. We develop a novel cancelable biometric template generation algorithm utilizing random biometric fusion, random projection, and selection. The proposed random cross-folding method generates cancelable biometric templates from multiple biometric traits. We further validate the performance of the proposed algorithm using a virtual multimodal face and ear database.

Journal ArticleDOI
TL;DR: A novel method to automatically reconstruct tree geometry from inhomogeneous point clouds created by a laser scanner is proposed, using principal curvatures as indicators for branches and creating branch skeletons for dense regions and sparse regions.
Abstract: Trees are an important asset for natural-looking digital environments. We propose a novel method to automatically reconstruct tree geometry from inhomogeneous point clouds created by a laser scanner. While previous approaches focus either on dense or sparse point clouds, our hybrid method allows for the reconstruction of a tree from an inhomogeneous point cloud without further preprocessing. Using principal curvatures as indicators for branches, we detect ellipses in branch cross-sections and create branch skeletons for dense regions. For sparse regions we approximate branch skeletons with a spanning tree. Branch widths are obtained from the ellipse fitting in dense regions and propagated to the sparse regions, to create geometry for the whole tree. We demonstrate the effectiveness of our approach in several real-world examples.

Journal ArticleDOI
TL;DR: A system to easily capture building interiors and automatically generate floor plans scaled to their metric dimensions is presented, exploiting the redundancy of the instruments commonly available on commodity smartphones.
Abstract: We present a system to easily capture building interiors and automatically generate floor plans scaled to their metric dimensions. The proposed approach is able to manage scenes not necessarily limited to the Manhattan World assumption, exploiting the redundancy of the instruments commonly available on commodity smartphones, such as accelerometer, magnetometer and camera. Without specialized training or equipment, our system can produce a 2D floor plan and a representative 3D model of the scene accurate enough to be used for simulations and interactive applications.

Journal ArticleDOI
TL;DR: A rotation invariant distance function to be used by a random forest algorithm to perform the human action recognition, requiring only silhouettes of actors, has an accuracy comparable to the related works and it performs well even in varying rotation.
Abstract: Human action recognition is an important problem in Computer Vision. Although most of the existing solutions provide good accuracy results, the methods are often overly complex and computationally expensive, hindering practical applications. In this regard, we introduce the combination of time-series representation of the silhouette and Symbolic Aggregate approXimation (SAX), which we refer to as SAX-Shapes, to address the problem of human action recognition. Given an action sequence, the extracted silhouettes of an actor from every frame are transformed into time series. Each of these time series is then efficiently converted into a symbolic SAX vector. The set of all these SAX vectors (SAX-Shape) represents the action. We propose a rotation-invariant distance function to be used by a random forest algorithm to perform the human action recognition. Requiring only silhouettes of actors, the proposed method is validated on two public datasets. Its accuracy is comparable to that of related works, and it performs well even under varying rotation.
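
The SAX conversion at the heart of SAX-Shapes can be sketched as z-normalization, piecewise aggregate approximation, and symbol lookup against Gaussian breakpoints. The segment count, the 4-symbol alphabet, and the even-divisibility restriction are illustrative choices here, not the paper's parameters.

```python
import numpy as np

def sax(series, n_segments=8, alphabet="abcd"):
    """Symbolic Aggregate approXimation of a 1D series.

    Steps: z-normalize, reduce to n_segments means via piecewise
    aggregate approximation (PAA), then map each mean to a symbol using
    breakpoints that make symbols equiprobable under N(0, 1). The
    breakpoints below are the quartiles for a 4-symbol alphabet, and the
    series length must be divisible by n_segments in this sketch.
    """
    s = np.asarray(series, dtype=float)
    s = (s - s.mean()) / (s.std() + 1e-12)
    paa = s.reshape(n_segments, -1).mean(axis=1)
    breakpoints = np.array([-0.6745, 0.0, 0.6745])  # N(0,1) quartiles
    return "".join(alphabet[i] for i in np.searchsorted(breakpoints, paa))
```

For example, a linearly increasing series maps to a non-decreasing symbolic word.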

Journal ArticleDOI
TL;DR: This paper proposes an approach to generate a per-pixel confidence measurement for each depth map captured by Kinect devices in indoor environments through supervised learning, and trains depth map estimators using Random Forest regressor.
Abstract: All depth data captured by Kinect devices are noisy, and sometimes even lost or shifted, especially around the edges of the depth. In this paper, we propose an approach to generate a per-pixel confidence measurement for each depth map captured by Kinect devices in indoor environments through supervised learning. Several distinguishing features from both the color images and depth maps are selected to train depth map estimators using Random Forest regressor. Using this estimator, we can predict a confidence map of any depth map captured by Kinect devices. Usage of other devices, such as an industrial laser scanner, is unnecessary, making the implementation more convenient. The experiments demonstrate precise confidence prediction of the depth.

Journal ArticleDOI
TL;DR: This paper proposes a novel patch-based method for automatic completion of stereoscopic images and the corresponding depth/disparity maps simultaneously using a patch distance metric designed to take the appearance, depth gradients and depth inconsistency into account.
Abstract: In this paper, we propose a novel patch-based method for automatic completion of stereoscopic images and the corresponding depth/disparity maps simultaneously. The missing depths are estimated in a local feature space, and a patch distance metric is designed to take appearance, depth gradients, and depth inconsistency into account. To ensure proper stereopsis, we first search for the proper stereoscopic patch in both left and right images according to the distance metric, and then iteratively refine the images. Our method is capable of dealing with general scenes including both frontal-parallel and non-frontal-parallel objects. Experimental results show that our method is superior to previous ones, with better stereoscopically consistent content and more plausible completion.

Journal ArticleDOI
TL;DR: A novel, real-time and robust hand tracking system, capable of tracking the articulated hand motion in full degrees of freedom (DOF) using a single depth camera is introduced, and it outperforms the state-of-the-art model based hand tracking systems in terms of both speed and accuracy.
Abstract: In this paper, we introduce a novel, real-time and robust hand tracking system, capable of tracking articulated hand motion in full degrees of freedom (DOF) using a single depth camera. Unlike most previous systems, our system is able to initialize and recover from tracking loss automatically. This is achieved through an efficient two-stage k-nearest neighbor database searching method proposed in the paper. It is effective for searching a pre-rendered database of small hand depth images, designed to provide good initial guesses for model-based tracking. We also propose a robust objective function, and improve the Particle Swarm Optimization algorithm with a resampling-based strategy in model-based tracking. It provides continuous solutions in the full-DOF hand motion space more efficiently than previous methods. Our system runs at 40 fps on a GeForce GTX 580 GPU, and experimental results show that it outperforms state-of-the-art model-based hand tracking systems in terms of both speed and accuracy. This work is significant for various applications in human-computer interaction and virtual reality.
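The two-stage nearest-neighbor search over a pre-rendered database can be sketched as below. This is a hedged illustration, not the paper's GPU implementation: the coarse descriptor (here simply a prefix of the flattened image, standing in for a downsampled version) and the candidate-list sizes are assumptions.

```python
import numpy as np

def two_stage_knn(query, database, coarse_dim=16, k_coarse=32, k=5):
    """Two-stage k-NN search over a database of small depth images
    (one flattened image per row). Stage 1 ranks cheap low-dimensional
    descriptors; stage 2 re-ranks only the surviving candidates at
    full resolution. Parameters are illustrative, not the paper's.
    """
    # Stage 1: coarse distances on a low-dimensional descriptor.
    coarse_d = np.linalg.norm(database[:, :coarse_dim] - query[:coarse_dim],
                              axis=1)
    candidates = np.argsort(coarse_d)[:k_coarse]
    # Stage 2: exact distances, computed only for the shortlist.
    fine_d = np.linalg.norm(database[candidates] - query, axis=1)
    return candidates[np.argsort(fine_d)[:k]]

rng = np.random.default_rng(1)
db = rng.random((10000, 256))                # 10k pre-rendered depth images
q = db[4217] + 0.01 * rng.normal(size=256)   # noisy copy of entry 4217
print(two_stage_knn(q, db)[0])               # 4217
```

The coarse pass touches every database row but over few dimensions; the exact pass touches few rows over all dimensions, which is what makes such a scheme cheap enough for per-frame re-initialization.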

Journal ArticleDOI
TL;DR: A new method for pairwise matching of broken fragments from unorganized point clouds using a new descriptor that contains not only the cluster of feature points but also curves along the principal directions of the cluster.
Abstract: In this paper, we introduce a new method for pairwise matching of broken fragments from unorganized point clouds. We use a new descriptor that contains not only a cluster of feature points but also curves along the principal directions of the cluster. In our method, feature points are extracted using per-point curvature values, and the descriptor curves are approximated by Fourier series. The main idea is to compare descriptor curves between clusters on candidate matching faces. To compare curves, the Fourier coefficients of each curve are computed with the Fast Fourier Transform, and the curves' total energies are compared.
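The FFT-based curve comparison can be illustrated with a small sketch. The number of retained coefficients and the matching tolerance are assumptions for illustration; the paper's actual thresholds and curve parameterization may differ.

```python
import numpy as np

def curve_energy(curve, n_coeffs=10):
    """Energy of a sampled descriptor curve, computed from its leading
    Fourier coefficients via the FFT. n_coeffs is illustrative."""
    coeffs = np.fft.rfft(curve)[:n_coeffs]
    return np.sum(np.abs(coeffs) ** 2)

def curves_match(curve_a, curve_b, tol=1e-2):
    """Two fracture-face curves are candidate matches if their Fourier
    energies agree to within a relative tolerance."""
    ea, eb = curve_energy(curve_a), curve_energy(curve_b)
    return abs(ea - eb) / max(ea, eb) < tol

t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
a = np.sin(3 * t) + 0.5 * np.cos(5 * t)
b = np.roll(a, 17)          # same curve, different starting sample
c = 2.0 * a                 # a genuinely different profile
print(curves_match(a, b))   # True: energy is invariant to the start point
print(curves_match(a, c))   # False: energies differ by a factor of 4
```

Comparing spectral energy rather than raw samples makes the match insensitive to where along the fracture boundary each curve starts, since a circular shift only changes the phase, not the magnitude, of the Fourier coefficients.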

Journal ArticleDOI
TL;DR: A framework for creating 3D animated characters using a simple sketching interface coupled with a large, unannotated motion database that is used to find the appropriate motion sequences corresponding to the input sketches to improve matching in the context of the existing animation sequences is presented.
Abstract: Quick creation of 3D character animations is an important task in game design, simulations, forensic animation, education, training, and more. We present a framework for creating 3D animated characters using a simple sketching interface coupled with a large, unannotated motion database that is used to find the motion sequences corresponding to the input sketches. Contrary to previous work that deals with static sketches, our input sketches can be enhanced with motion and rotation curves that improve matching in the context of the existing animation sequences. Our framework uses animated sequences as the basic building blocks of the final animated scenes, and supports various operations on them such as trimming, resampling, and connecting by means of blending and interpolation. A database of significant and unique poses, together with a two-pass search running on the GPU, allows for interactive matching even for large numbers of poses in a template database. The system provides intuitive interfaces and immediate feedback, and places minimal demands on the user. A user study showed that the system can be used by novice users with no animation experience or artistic talent, as well as by users with an animation background. Both groups were able to create animated scenes consisting of complex and varied actions in less than 20 minutes.

Journal ArticleDOI
TL;DR: A novel approach to resample the input into regularly sampled 3D light fields by aligning them in the spatio-temporal domain, and a technique for high-quality disparity estimation from light fields are included.
Abstract: We propose a method to acquire 3D light fields using a hand-held camera, and describe several computational photography applications facilitated by our approach. As our input we take an image sequence from a camera translating along an approximately linear path with limited camera rotations. Users can acquire such data easily in a few seconds by moving a hand-held camera. We include a novel approach to resample the input into regularly sampled 3D light fields by aligning them in the spatio-temporal domain, and a technique for high-quality disparity estimation from light fields. We show applications including digital refocusing and synthetic aperture blur, foreground removal, selective colorization, and others.
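The digital-refocusing application mentioned above is classically realized by shift-and-add over the views of a regularly sampled light field. The sketch below is a generic illustration of that idea under simplifying assumptions (integer pixel shifts, a horizontal-only camera path), not the paper's pipeline.

```python
import numpy as np

def refocus(light_field, disparity):
    """Digital refocusing by shift-and-add: each view is shifted in
    proportion to its offset from the central view, then all views are
    averaged. Scene points at the chosen disparity align across views
    and stay sharp; everything else blurs (synthetic aperture).
    light_field: (n_views, H, W) array from a linear camera path.
    Integer shifts only, for brevity.
    """
    n = light_field.shape[0]
    center = n // 2
    shifted = [np.roll(view, int(round((i - center) * disparity)), axis=1)
               for i, view in enumerate(light_field)]
    return np.mean(shifted, axis=0)

# A point that moves 2 px per view across 5 views refocuses at disparity -2.
lf = np.zeros((5, 8, 32))
for i in range(5):
    lf[i, 4, 10 + 2 * i] = 1.0
sharp = refocus(lf, -2.0)
print(sharp.max())              # 1.0: the point realigns across all views
print(refocus(lf, 0.0).max())   # 0.2: smeared over the 5 views
```

Sweeping the `disparity` parameter moves the synthetic focal plane through the scene, which is why accurate disparity estimation (the other contribution of this paper) directly improves the refocusing result.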

Journal ArticleDOI
TL;DR: By fusing raw depth values with image color, edges and smooth priors in a Markov random field optimization framework, both misalignment and large holes can be eliminated effectively and the method thus can produce high-quality depth maps that are consistent with the color image.
Abstract: Current low-cost depth sensing techniques, such as Microsoft Kinect, can still achieve only limited precision. The resultant depth maps are often noisy, misaligned with the color images, and may even contain many large holes. These limitations make them difficult to adopt in many graphics applications. In this paper, we propose a computational approach to address the problem. By fusing raw depth values with image color, edges, and smoothness priors in a Markov random field optimization framework, both misalignment and large holes can be eliminated effectively; our method thus produces high-quality depth maps that are consistent with the color image. To achieve this, a confidence map is estimated for adaptive weighting of the different cues, an image inpainting technique is introduced to handle large holes, and contrasts in the color image are also considered for accurate alignment. Experimental results demonstrate the effectiveness of our method.
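The spirit of confidence-weighted MRF fusion can be conveyed with a toy stand-in: Jacobi-style iterations that minimize, per pixel, a confidence-weighted data term plus a smoothness term. The paper's actual model additionally weights smoothness by color edges and uses inpainting for large holes; both are omitted here, and the iteration count and `lam` weight are illustrative.

```python
import numpy as np

def refine_depth(depth, confidence, n_iters=200, lam=1.0):
    """Toy stand-in for MRF depth refinement: iteratively apply the
    closed-form pixelwise minimizer of
        c * (d - z)^2 + lam * (d - neighbor_mean)^2,
    where z is the raw depth and c its confidence. Pixels with zero
    confidence (holes) are filled purely from their neighbors.
    """
    d = depth.copy()
    for _ in range(n_iters):
        padded = np.pad(d, 1, mode='edge')
        nbr = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
               padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        d = (confidence * depth + lam * nbr) / (confidence + lam)
    return d

# A flat 1 m plane with a large hole is recovered by the smoothness term.
z = np.full((16, 16), 1.0)
conf = np.ones_like(z)
z[6:10, 6:10] = 0.0       # raw depth lost inside the hole
conf[6:10, 6:10] = 0.0    # zero confidence inside the hole
out = refine_depth(z, conf)
print(np.allclose(out, 1.0, atol=1e-3))  # True: hole filled from surroundings
```

With zero confidence, the update reduces to pure neighborhood averaging (a discrete Laplace fill), while high-confidence pixels stay pinned near their raw measurements; the confidence map thus decides, per pixel, how much the smoothness prior is allowed to override the sensor.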