Author
Nianhua Xie
Other affiliations: Temple University
Bio: Nianhua Xie is an academic researcher from Chinese Academy of Sciences. The author has contributed to research in topics: Contextual image classification & Histogram. The author has an hindex of 8, co-authored 16 publications receiving 857 citations. Previous affiliations of Nianhua Xie include Temple University.
Papers
More filters
01 Nov 2011
TL;DR: Methods for video structure analysis, including shot boundary detection, key frame extraction and scene segmentation, extraction of features including static key frame features, object features and motion features, video data mining, video annotation, and video retrieval including query interfaces are analyzed.
Abstract: Video indexing and retrieval have a wide spectrum of promising applications, motivating the interest of researchers worldwide. This paper offers a tutorial and an overview of the landscape of general strategies in visual content-based video indexing and retrieval, focusing on methods for video structure analysis, including shot boundary detection, key frame extraction and scene segmentation, extraction of features including static key frame features, object features and motion features, video data mining, video annotation, video retrieval including query interfaces, similarity measure and relevance feedback, and video browsing. Finally, we analyze future research directions.
606 citations
TL;DR: Experimental results demonstrate that the proposed tracking system can effectively tackle the difficulties caused by LFR, and an integral image based parameter calculation is constructed, which greatly reduces the computational load.
Abstract: Tracking in low frame rate (LFR) videos is one of the most important problems in the tracking literature. Most existing approaches treat LFR video tracking as an abrupt motion tracking problem. However, in LFR video tracking applications, LFR not only causes abrupt motions, but also large appearance changes of objects because the objects' poses and the illumination may undergo large changes from one frame to the next. This adds extra difficulties to LFR video tracking. In this paper, we propose a robust and general tracking system for LFR videos. The tracking system consists of four major parts: dominant color-spatial based object representation, bin-ratio based similarity measure, annealed particle swarm optimization (PSO) based searching, and an integral image based parameter calculation. The first two parts are combined to provide a good solution to the appearance changes, and the abrupt motion is effectively captured by the annealed PSO based searching. Moreover, an integral image of model parameters is constructed, which provides a look-up table for parameters calculation. This greatly reduces the computational load. Experimental results demonstrate that the proposed tracking system can effectively tackle the difficulties caused by LFR.
73 citations
13 Jun 2010
TL;DR: It is shown that bin-ratio information, which is collected from the ratios between bin values of histograms, provides several attractive advantages for category and scene classification tasks: first, BRD is robust to cluttering, partial occlusion and histogram normalization, and second,BRD captures rich co-occurrence information while enjoying a linear computational complexity.
Abstract: In this paper we propose using bin-ratio information, which is collected from the ratios between bin values of histograms, for scene and category classification. To use such information, a new histogram dissimilarity, bin-ratio dissimilarity (BRD), is designed. We show that BRD provides several attractive advantages for category and scene classification tasks: First, BRD is robust to cluttering, partial occlusion and histogram normalization; Second, BRD captures rich co-occurrence information while enjoying a linear computational complexity; Third, BRD can be easily combined with other dissimilarity measures, such as L 1 and χ2, to gather complimentary information. We apply the proposed methods to category and scene classification tasks in the bag-of-words framework. The experiments are conducted on several widely tested datasets including PASCAL 2005, PASCAL 2008, Oxford flowers, and Scene-15 dataset. In all experiments, the proposed methods demonstrate excellent performance in comparison with previously reported solutions.
54 citations
01 Oct 2009
TL;DR: Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that the proposed unsupervised active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification.
Abstract: Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.
51 citations
TL;DR: The algorithm emphasizes the foreground features, which are important for image classification, and preserves or even enhances semantically important structures in the foreground, and inhibits and smoothes clutter in the background.
Abstract: In this paper, we propose saliency driven image multiscale nonlinear diffusion filtering. The resulting scale space in general preserves or even enhances semantically important structures such as edges, lines, or flow-like structures in the foreground, and inhibits and smoothes clutter in the background. The image is classified using multiscale information fusion based on the original image, the image at the final scale at which the diffusion process converges, and the image at a midscale. Our algorithm emphasizes the foreground features, which are important for image classification. The background image regions, whether considered as contexts of the foreground or noise to the foreground, can be globally handled by fusing information from different scales. Experimental tests of the effectiveness of the multiscale space for the image classification are conducted on the following publicly available datasets: 1) the PASCAL 2005 dataset; 2) the Oxford 102 flowers dataset; and 3) the Oxford 17 flowers dataset, with high classification rates.
45 citations
Cited by
More filters
[...]
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality.
Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …
33,785 citations
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher:
The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.
3,627 citations
[...]
TL;DR: The need to develop appropriate and efficient analytical methods to leverage massive volumes of heterogeneous data in unstructured text, audio, and video formats is highlighted and the need to devise new tools for predictive analytics for structured big data is reinforced.
Abstract: We define what is meant by big data.We review analytics techniques for text, audio, video, and social media data.We make the case for new statistical techniques for big data.We highlight the expected future developments in big data analytics. Size is the first, and at times, the only dimension that leaps out at the mention of big data. This paper attempts to offer a broader definition of big data that captures its other unique and defining characteristics. The rapid evolution and adoption of big data by industry has leapfrogged the discourse to popular outlets, forcing the academic press to catch up. Academic journals in numerous disciplines, which will benefit from a relevant discussion of big data, have yet to cover the topic. This paper presents a consolidated description of big data by integrating definitions from practitioners and academics. The paper's primary focus is on the analytic methods used for big data. A particular distinguishing feature of this paper is its focus on analytics related to unstructured data, which constitute 95% of big data. This paper highlights the need to develop appropriate and efficient analytical methods to leverage massive volumes of heterogeneous data in unstructured text, audio, and video formats. This paper also reinforces the need to devise new tools for predictive analytics for structured big data. The statistical methods in practice were devised to infer from sample data. The heterogeneity, noise, and the massive size of structured big data calls for developing computationally efficient algorithms that may avoid big data pitfalls, such as spurious correlation.
2,962 citations
[...]
TL;DR: The background and state-of-the-art of big data are reviewed, including enterprise management, Internet of Things, online social networks, medial applications, collective intelligence, and smart grid, as well as related technologies.
Abstract: In this paper, we review the background and state-of-the-art of big data. We first introduce the general background of big data and review related technologies, such as could computing, Internet of Things, data centers, and Hadoop. We then focus on the four phases of the value chain of big data, i.e., data generation, data acquisition, data storage, and data analysis. For each phase, we introduce the general background, discuss the technical challenges, and review the latest advances. We finally examine the several representative applications of big data, including enterprise management, Internet of Things, online social networks, medial applications, collective intelligence, and smart grid. These discussions aim to provide a comprehensive overview and big-picture to readers of this exciting area. This survey is concluded with a discussion of open problems and future directions.
2,303 citations