Author

Zheng Wang

Bio: Zheng Wang is an academic researcher from Tianjin University. The author has contributed to research in topics including Feature (computer vision) and Support vector machine, has an h-index of 15, and has co-authored 47 publications receiving 779 citations. Previous affiliations of Zheng Wang include the University of Strathclyde and the Electric Power Research Institute.

Papers
Journal ArticleDOI
TL;DR: Experimental results show that the proposed model outperforms five other state-of-the-art video saliency detection approaches, and the proposed framework is also found useful for other video-content-based applications such as video highlights.

130 citations

Journal ArticleDOI
TL;DR: The proposed method is shown to remove the undesirable artifacts introduced during the data acquisition process, and its classification performance is comparable with several recent spectral–spatial classification methods.
Abstract: Hyperspectral imaging (HSI) classification has become a popular research topic in recent years, and effective feature extraction is an important step before the classification task. Traditionally, spectral feature extraction techniques are applied to the HSI data cube directly. This paper presents a novel algorithm for HSI feature extraction by exploiting the curvelet-transformed domain via a relatively new spectral feature processing technique—singular spectrum analysis (SSA). Although the wavelet transform has been widely applied for HSI data analysis, the curvelet transform is employed in this paper since it is able to separate image geometric details and background noise effectively. Using the support vector machine classifier, experimental results have shown that features extracted by SSA on curvelet coefficients have better performance in terms of classification accuracy over features extracted on wavelet coefficients. Since the proposed approach mainly relies on SSA for feature extraction on the spectral dimension, it actually belongs to the spectral feature extraction category. Therefore, the proposed method has also been compared with some state-of-the-art spectral feature extraction techniques to show its efficacy. In addition, it has been proven that the proposed method is able to remove the undesirable artifacts introduced during the data acquisition process. By adding an extra spatial postprocessing step to the classified map achieved using the proposed approach, we have shown that the classification performance is comparable with several recent spectral–spatial classification methods.

117 citations

Journal ArticleDOI
TL;DR: The heteroscedastic spline regression model (HSRM) and robust spline regression model (RSRM) are proposed to obtain more accurate power curves even in the presence of inconsistent samples, and the results show that more accurate wind power forecasts can be obtained using the proposed data processing method.
Abstract: Wind power curve modeling is a challenging task due to the existence of inconsistent data, in which the recorded wind power is far from the theoretical wind power at a given wind speed. Confronted with such samples, the estimated errors of wind power become large, and they exhibit two properties: heteroscedasticity and an error distribution with a long tail. In this paper, according to these error characteristics, the heteroscedastic spline regression model (HSRM) and robust spline regression model (RSRM) are proposed to obtain more accurate power curves even in the presence of inconsistent samples. The results of power curve modeling on real-world data show the effectiveness of HSRM and RSRM in different seasons. As HSRM and RSRM are optimized by variational Bayes, in addition to the deterministic power curves, probabilistic power curves, which can be used to detect the inconsistent samples, can also be obtained. Additionally, after replacing the wind power in the detected inconsistent samples with the wind power on the estimated power curve, the forecasting results show that more accurate wind power forecasts can be obtained using this data processing method.
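The paper's HSRM/RSRM models are variational Bayesian spline regressions; as a minimal stand-in that shares their key property (robustness to inconsistent samples such as curtailed power readings), the sketch below estimates a power curve with per-bin medians in numpy. The function name and binning scheme are illustrative, not the authors' implementation.

```python
import numpy as np

def robust_power_curve(speed, power, n_bins=20):
    """Estimate a wind power curve with per-bin medians.

    The median is insensitive to 'inconsistent' samples (e.g. curtailed
    readings far below the theoretical curve), in the same spirit as the
    paper's robust spline regression model (RSRM).
    """
    edges = np.linspace(speed.min(), speed.max(), n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    idx = np.clip(np.digitize(speed, edges) - 1, 0, n_bins - 1)
    curve = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            curve[b] = np.median(power[mask])
    return centers, curve
```

Because the median ignores a minority of outlying samples outright, no explicit outlier-detection pass is needed for this toy version; the paper's probabilistic curves additionally allow the inconsistent samples to be identified and replaced.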

88 citations

Journal ArticleDOI
TL;DR: A fast implementation of SSA (F-SSA) is proposed for efficient feature extraction in HSI that only needs one SVD applied to a representative pixel of the HSI hypercube, and the overall computational complexity has been significantly reduced.
Abstract: As a very recent technique for time-series analysis, singular spectrum analysis (SSA) has been applied in many diverse areas, where an original 1-D signal can be decomposed into a sum of components, including varying trends, oscillations, and noise. Considering pixel-based spectral profiles as 1-D signals, in this letter, SSA has been applied in hyperspectral imaging for effective feature extraction. By removing noisy components in extracting the features, the discriminating ability of the features has been much improved. Experiments show that this SSA approach supersedes the empirical mode decomposition technique from which our work was originally inspired, where improved results in effective data classification using support vector machine are also reported.
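The SSA decomposition described above (embedding a 1-D signal, SVD, and reconstruction of selected components) can be sketched in a few lines of numpy. This is a minimal illustration of basic SSA denoising, not the authors' F-SSA implementation; the window length and rank are free parameters.

```python
import numpy as np

def ssa_denoise(x, window=20, rank=3):
    """Basic singular spectrum analysis (SSA) reconstruction.

    Embeds the 1-D signal in a trajectory (Hankel) matrix, keeps the
    leading SVD components (trends/oscillations), and reconstructs by
    anti-diagonal averaging, discarding the noisy trailing components.
    """
    n = len(x)
    k = n - window + 1
    # Trajectory matrix: column j holds x[j : j + window]
    X = np.column_stack([x[j:j + window] for j in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Hankel (anti-diagonal) averaging back to a 1-D series
    out = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        out[j:j + window] += X_low[:, j]
        counts[j:j + window] += 1
    return out / counts
```

Applied per pixel to the spectral profile of an HSI cube, this is the feature-extraction step the letter describes; the classifier (SVM) then operates on the reconstructed spectra.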

83 citations

Proceedings ArticleDOI
19 Oct 2017
TL;DR: A novel approach which automatically focuses on regions-of-interest and captures their temporal structures, obtaining state-of-the-art results on popular evaluation metrics like BLEU-4, CIDEr, and METEOR.
Abstract: As a crucial challenge for video understanding, exploiting the spatial-temporal structure of video has attracted much attention recently, especially on video captioning. Inspired by the insight that people always focus on certain interested regions of video content, we propose a novel approach which will automatically focus on regions-of-interest and catch their temporal structures. In our approach, we utilize a specific attention model to adaptively select regions-of-interest for each video frame. Then a Dual Memory Recurrent Model (DMRM) is introduced to incorporate temporal structure of global features and regions-of-interest features in parallel, which will obtain rough understanding of video content and particular information of regions-of-interest. Since the attention model could not always catch the right interests, we additionally adopt semantic supervision to attend to interested regions more correctly. We evaluate our method for video captioning on two public benchmarks: the Microsoft Video Description Corpus (MSVD) and the Montreal Video Annotation Dataset (M-VAD). The experiments demonstrate that catching temporal regions-of-interest information really enhances the representation of input videos and our approach obtains the state-of-the-art results on popular evaluation metrics like BLEU-4, CIDEr, and METEOR.
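The region-selection step of the approach above is a learned soft-attention mechanism. The numpy sketch below shows the generic pattern (score each region against a query state, softmax, weighted average); the bilinear scoring matrix `W` and the function name are hypothetical stand-ins, not the paper's exact parameterisation.

```python
import numpy as np

def soft_attention(regions, query, W):
    """Soft attention over region features for one video frame.

    regions: (R, d) region-of-interest feature vectors
    query:   (d,) decoder/recurrent state used to score regions
    W:       (d, d) scoring matrix (a learned parameter in practice)

    Returns the attention weights and the attended feature, i.e. a
    weighted average emphasising the regions most relevant to the query.
    """
    scores = regions @ W @ query          # (R,) relevance scores
    scores = scores - scores.max()        # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    attended = weights @ regions          # (d,) attended feature
    return weights, attended
```

In the paper this attended feature is fed, alongside global frame features, into the Dual Memory Recurrent Model that generates the caption.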

83 citations


Cited by
01 Jan 1979
TL;DR: This special issue aims at gathering recent advances in learning with shared information methods and their applications in computer vision and multimedia analysis, and especially encourages papers addressing interesting real-world computer vision and multimedia applications.
Abstract: In the real world, a realistic setting for computer vision or multimedia recognition problems is that some classes contain lots of training data while many classes contain only a small amount. Therefore, how to use frequent classes to help learn rare classes, for which it is harder to collect training data, is an open question. Learning with Shared Information is an emerging topic in machine learning, computer vision, and multimedia analysis. There are different levels of components that can be shared during concept modeling and machine learning stages, such as sharing generic object parts, sharing attributes, sharing transformations, sharing regularization parameters, and sharing training examples. Regarding specific methods, multi-task learning, transfer learning, and deep learning can be seen as different strategies for sharing information. These learning with shared information methods are very effective in solving real-world large-scale problems. This special issue aims at gathering the recent advances in learning with shared information methods and their applications in computer vision and multimedia analysis. Both state-of-the-art works, as well as literature reviews, are welcome for submission. Papers addressing interesting real-world computer vision and multimedia applications are especially encouraged.
Topics of interest include, but are not limited to:
• Multi-task learning or transfer learning for large-scale computer vision and multimedia analysis
• Deep learning for large-scale computer vision and multimedia analysis
• Multi-modal approaches for large-scale computer vision and multimedia analysis
• Different sharing strategies, e.g., sharing generic object parts, sharing attributes, sharing transformations, sharing regularization parameters, and sharing training examples
• Real-world computer vision and multimedia applications based on learning with shared information, e.g., event detection, object recognition, object detection, action recognition, human head pose estimation, object tracking, location-based services, semantic indexing
• New datasets and metrics to evaluate the benefit of the proposed sharing ability for the specific computer vision or multimedia problem
• Survey papers regarding the topic of learning with shared information
Authors who are unsure whether their planned submission is in scope may contact the guest editors prior to the submission deadline with an abstract, in order to receive feedback.

1,758 citations

Posted ContentDOI
TL;DR: This survey paper presents the first effort to offer a comprehensive framework that examines the latest metaverse development under the dimensions of state-of-the-art technologies and metaverse ecosystems, and illustrates the possibility of the digital `big bang' of the authors' cyberspace.
Abstract: Since the popularisation of the Internet in the 1990s, the cyberspace has kept evolving. We have created various computer-mediated virtual environments including social networks, video conferencing, virtual 3D worlds (e.g., VR Chat), augmented reality applications (e.g., Pokemon Go), and Non-Fungible Token Games (e.g., Upland). Such virtual environments, albeit non-perpetual and unconnected, have brought us various degrees of digital transformation. The term `metaverse' has been coined to further facilitate the digital transformation in every aspect of our physical lives. At the core of the metaverse stands the vision of an immersive Internet as a gigantic, unified, persistent, and shared realm. While the metaverse may seem futuristic, catalysed by emerging technologies such as Extended Reality, 5G, and Artificial Intelligence, the digital `big bang' of our cyberspace is not far away. This survey paper presents the first effort to offer a comprehensive framework that examines the latest metaverse development under the dimensions of state-of-the-art technologies and metaverse ecosystems, and illustrates the possibility of the digital `big bang'. First, technologies are the enablers that drive the transition from the current Internet to the metaverse. We thus examine eight enabling technologies rigorously - Extended Reality, User Interactivity (Human-Computer Interaction), Artificial Intelligence, Blockchain, Computer Vision, IoT and Robotics, Edge and Cloud computing, and Future Mobile Networks. In terms of applications, the metaverse ecosystem allows human users to live and play within a self-sustaining, persistent, and shared realm. Therefore, we discuss six user-centric factors -- Avatar, Content Creation, Virtual Economy, Social Acceptability, Security and Privacy, and Trust and Accountability. Finally, we propose a concrete research agenda for the development of the metaverse.

326 citations

Journal ArticleDOI
TL;DR: Segmented SAE (S-SAE) is proposed by partitioning the original features into smaller data segments, which are separately processed by different smaller SAEs, resulting in reduced complexity but improved efficacy of data abstraction and accuracy of data classification.

308 citations

Journal ArticleDOI
11 Aug 2020-Sensors
TL;DR: This work evaluates the speed–accuracy tradeoff of three popular deep learning-based face detectors on the WIDER Face and UFDD data sets in several CPUs and GPUs, and develops a regression model capable of estimating the performance, both in terms of processing time and accuracy.
Abstract: Face recognition is a valuable forensic tool for criminal investigators, since it helps identify individuals in scenarios of criminal activity such as tracking fugitives or investigating child sexual abuse. It is, however, a very challenging task, as it must handle low-quality images of real-world settings and fulfill real-time requirements. Deep learning approaches for face detection have proven very successful, but they require large computation power and processing time. In this work, we evaluate the speed-accuracy tradeoff of three popular deep-learning-based face detectors on the WIDER Face and UFDD data sets in several CPUs and GPUs. We also develop a regression model capable of estimating the performance, both in terms of processing time and accuracy. We expect this to become a very useful tool for the end user in forensic laboratories in order to estimate the performance for different face detection options. Experimental results showed that the best speed-accuracy tradeoff is achieved with images resized to 50% of the original size in GPUs and images resized to 25% of the original size in CPUs. Moreover, performance can be estimated using multiple linear regression models with a Mean Absolute Error (MAE) of 0.113, which is very promising for the forensic field.
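The performance-estimation step above is a multiple linear regression. The numpy sketch below shows that pattern on hypothetical predictors (relative image size and a device indicator); the feature set and function names are illustrative, not the paper's exact model.

```python
import numpy as np

def fit_runtime_model(X, y):
    """Fit a multiple linear regression y ~ X by ordinary least squares.

    X: (n, p) predictors, e.g. relative image size and a CPU/GPU flag
    y: (n,) measured processing time (or accuracy)
    Returns the coefficients, intercept first.
    """
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_runtime(coef, X):
    """Apply the fitted linear model to new predictor rows."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef
```

The paper reports such a model reaching an MAE of 0.113 across detectors, image scales, and hardware; the fit here is plain least squares with no regularisation.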

267 citations

Journal ArticleDOI
TL;DR: The proposed PCA-EPFs method for HSI classification sharply improves the accuracy of the SVM classifier with respect to the standard edge-preserving filtering-based feature extraction method, and other widely used spectral-spatial classifiers.
Abstract: Edge-preserving features (EPFs) obtained by the application of edge-preserving filters to hyperspectral images (HSIs) have been found very effective in characterizing significant spectral and spatial structures of objects in a scene. However, a direct use of the EPFs can be insufficient to provide a complete characterization of spatial information when objects of different scales are present in the considered images. Furthermore, the edge-preserving smoothing operation unavoidably decreases the spectral differences among objects of different classes, which may affect the following classification. To overcome these problems, in this paper, a novel principal component analysis (PCA)-based EPFs (PCA-EPFs) method for HSI classification is proposed, which consists of the following steps. First, the standard EPFs are constructed by applying edge-preserving filters with different parameter settings to the considered image, and the resulting EPFs are stacked together. Next, the spectral dimension of the stacked EPFs is reduced with the PCA, which not only can represent the EPFs in the mean square sense but also highlight the separability of pixels in the EPFs. Finally, the resulting PCA-EPFs are classified by a support vector machine (SVM) classifier. Experiments performed on several real hyperspectral data sets show the effectiveness of the proposed PCA-EPFs, which sharply improves the accuracy of the SVM classifier with respect to the standard edge-preserving filtering-based feature extraction method, and other widely used spectral-spatial classifiers.
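The PCA-EPFs pipeline above stacks filtered copies of the image and reduces them with PCA before SVM classification. The sketch below mirrors the stack-then-reduce steps in numpy, using a simple box filter as a stand-in for the paper's edge-preserving filters; filter choice and function names are illustrative assumptions.

```python
import numpy as np

def stack_filtered(image, widths=(1, 3, 5)):
    """Stack box-filtered copies of one (H, W) band at several scales.

    A box filter stands in for an edge-preserving filter here; varying
    the width mimics the paper's multi-parameter EPF stack.
    """
    H, W = image.shape
    stack = []
    for w in widths:
        pad = np.pad(image, w, mode='edge')
        sm = np.zeros((H, W), dtype=float)
        for dy in range(-w, w + 1):
            for dx in range(-w, w + 1):
                sm += pad[w + dy:w + dy + H, w + dx:w + dx + W]
        stack.append((sm / (2 * w + 1) ** 2).ravel())
    return np.stack(stack, axis=1)      # (H*W, len(widths)) per-pixel features

def pca_reduce(features, n_components):
    """Project stacked per-pixel features onto leading principal components."""
    centered = features - features.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:n_components].T
```

In the paper the reduced features are then fed to an SVM classifier; the PCA step both compresses the stacked EPFs and highlights pixel separability.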

265 citations