Other affiliations: Tsinghua University, University of Science and Technology of China, Southwest Jiaotong University ...read more
Bio: Zhu Li is an academic researcher from University of Missouri–Kansas City. The author has contributed to research in topics: Point cloud & Video tracking. The author has an hindex of 29, co-authored 258 publications receiving 3529 citations. Previous affiliations of Zhu Li include Tsinghua University & University of Science and Technology of China.
Papers published on a yearly basis
TL;DR: The main developments and technical aspects of this ongoing standardization effort for compactly representing 3D point clouds, which are the 3D equivalent of the very well-known 2D pixels are introduced.
Abstract: Due to the increased popularity of augmented and virtual reality experiences, the interest in capturing the real world in multiple dimensions and in presenting it to users in an immersible fashion has never been higher. Distributing such representations enables users to freely navigate in multi-sensory 3D media experiences. Unfortunately, such representations require a large amount of data, not feasible for transmission on today’s networks. Efficient compression technologies well adopted in the content chain are in high demand and are key components to democratize augmented and virtual reality applications. Moving Picture Experts Group, as one of the main standardization groups dealing with multimedia, identified the trend and started recently the process of building an open standard for compactly representing 3D point clouds, which are the 3D equivalent of the very well-known 2D pixels. This paper introduces the main developments and technical aspects of this ongoing standardization effort.
TL;DR: A machine learning method based on SVM (supporting vector machine) is proposed in this paper for accurate Internet traffic classification that classifies the Internet traffic into broad application categories according to the network flow parameters obtained from the packet headers.
Abstract: Accurate and timely traffic classification is critical in network security monitoring and traffic engineering. Traditional methods based on port numbers and protocols have proven to be ineffective in terms of dynamic port allocation and packet encapsulation. The signature matching methods, on the other hand, require a known signature set and processing of packet payload, can only handle the signatures of a limited number of IP packets in real-time. A machine learning method based on SVM (supporting vector machine) is proposed in this paper for accurate Internet traffic classification. The method classifies the Internet traffic into broad application categories according to the network flow parameters obtained from the packet headers. An optimized feature set is obtained via multiple classifier selection methods. Experimental results using traffic from campus backbone show that an accuracy of 99.42% is achieved with the regular biased training and testing samples. An accuracy of 97.17% is achieved when un-biased training and testing samples are used with the same feature set. Furthermore, as all the feature parameters are computable from the packet headers, the proposed method is also applicable to encrypted network traffic.
TL;DR: An attention-aided CNN model based on the traditional CNN model that incorporates attention modules to aid networks that focus on more discriminative channels or positions for spectral and spatial classifications of hyperspectral images is proposed.
Abstract: Convolutional neural networks (CNNs) have been widely used for hyperspectral image classification. As a common process, small cubes are first cropped from the hyperspectral image and then fed into CNNs to extract spectral and spatial features. It is well known that different spectral bands and spatial positions in the cubes have different discriminative abilities. If fully explored, this prior information will help improve the learning capacity of CNNs. Along this direction, we propose an attention-aided CNN model for spectral–spatial classification of hyperspectral images. Specifically, a spectral attention subnetwork and a spatial attention subnetwork are proposed for spectral and spatial classifications, respectively. Both of them are based on the traditional CNN model and incorporate attention modules to aid networks that focus on more discriminative channels or positions. In the final classification phase, the spectral classification result and the spatial classification result are combined together via an adaptively weighted summation method. To evaluate the effectiveness of the proposed model, we conduct experiments on three standard hyperspectral data sets. The experimental results show that the proposed model can achieve superior performance compared with several state-of-the-art CNN-related models.
••06 Nov 2011
TL;DR: This paper proposes a so-called complementary hashing approach, which is able to balance the precision and recall in a more effective way, and significantly improves the performance and outperforms the state-of-the-art on large scale ANN search.
Abstract: Recently, hashing based Approximate Nearest Neighbor (ANN) techniques have been attracting lots of attention in computer vision. The data-dependent hashing methods, e.g., Spectral Hashing, expects better performance than the data-blind counterparts, e.g., Locality Sensitive Hashing (LSH). However, most data-dependent hashing methods only employ a single hash table. When higher recall is desired, they have to retrieve exponentially growing number of hash buckets around the bucket containing the query, which may drag down the precision rapidly. In this paper, we propose a so-called complementary hashing approach, which is able to balance the precision and recall in a more effective way. The key idea is to employ multiple complementary hash tables, which are learned sequentially in a boosting manner, so that, given a query, its true nearest neighbors missed from the active bucket of one hash table are more likely to be found in the active bucket of the next hash table. Compared with LSH that also can exploit multiple hash tables, our approach is more effective to find true NNs, thanks to the complementarity property of the hash tables from our approach. Experimental results on large scale ANN search show that the proposed method significantly improves the performance and outperforms the state-of-the-art.
TL;DR: A joint adaptation, resource allocation and scheduling (JARS) algorithm, which allocates the communication resource based on the video users' quality of service, adapts video sources based on smart summarization, and schedules the transmissions to meet the frame delivery deadlines.
Abstract: Multi-user video streaming over wireless channels is a challenging problem, where the demand for better video quality and small transmission delays needs to be reconciled with the limited and often time-varying communication resources. This paper presents a framework for joint network optimization, source adaptation, and deadline-driven scheduling for multi-user video streaming over wireless networks. We develop a joint adaptation, resource allocation and scheduling (JARS) algorithm, which allocates the communication resource based on the video users' quality of service, adapts video sources based on smart summarization, and schedules the transmissions to meet the frame delivery deadlines. The proposed algorithm leads to near full utilization of the network resources and satisfies the delivery deadlines for all video frames. Substantial performance improvements are achieved compared with heuristic schemes that do not take the interactions between multiple users into consideration.
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.
TL;DR: A survey of MCC is given, which helps general readers have an overview of the MCC including the definition, architecture, and applications and the issues, existing solutions, and approaches are presented.
Abstract: Together with an explosive growth of the mobile applications and emerging of cloud computing concept, mobile cloud computing (MCC) has been introduced to be a potential technology for mobile services. MCC integrates the cloud computing into the mobile environment and overcomes obstacles related to the performance (e.g., battery life, storage, and bandwidth), environment (e.g., heterogeneity, scalability, and availability), and security (e.g., reliability and privacy) discussed in mobile computing. This paper gives a survey of MCC, which helps general readers have an overview of the MCC including the definition, architecture, and applications. The issues, existing solutions, and approaches are presented. In addition, the future research directions of MCC are discussed. Copyright © 2011 John Wiley & Sons, Ltd.
TL;DR: This article provides a detailed overview of various state-of-the-art research papers on human activity recognition, discussing both the methodologies developed for simple human actions and those for high-level activities.
Abstract: Human activity recognition is an important area of computer vision research. Its applications include surveillance systems, patient monitoring systems, and a variety of systems that involve interactions between persons and electronic devices such as human-computer interfaces. Most of these applications require an automated recognition of high-level activities, composed of multiple simple (or atomic) actions of persons. This article provides a detailed overview of various state-of-the-art research papers on human activity recognition. We discuss both the methodologies developed for simple human actions and those for high-level activities. An approach-based taxonomy is chosen that compares the advantages and limitations of each approach. Recognition methodologies for an analysis of the simple actions of a single person are first presented in the article. Space-time volume approaches and sequential approaches that represent and recognize activities directly from input images are discussed. Next, hierarchical recognition methodologies for high-level activities are presented and compared. Statistical approaches, syntactic approaches, and description-based approaches for hierarchical recognition are discussed in the article. In addition, we further discuss the papers on the recognition of human-object interactions and group activities. Public datasets designed for the evaluation of the recognition methodologies are illustrated in our article as well, comparing the methodologies' performances. This review will provide the impetus for future research in more productive areas.
TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance of the farthest data point.
Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!.