Author

Zhihuan Song

Bio: Zhihuan Song is an academic researcher at Zhejiang University. The author has contributed to research on fault detection and isolation and on soft sensors. The author has an h-index of 44 and has co-authored 241 publications, which have received 6849 citations. Previous affiliations of Zhihuan Song include Chung Yuan Christian University and Ningbo Institute of Technology, Zhejiang University.


Papers
Journal ArticleDOI
TL;DR: This review analyzes the data characteristics of different industrial processes, defines a monitoring problem for each main characteristic, and discusses how the corresponding data-based monitoring methods connect and compare.
Abstract: Data-based process monitoring has become a key technology in process industries for safety, quality, and operation efficiency enhancement. This paper provides a timely updated review of this topic. First, the natures of different industrial processes are revealed, with their data characteristics analyzed. Second, detailed terminologies of the data-based process monitoring method are illustrated. Third, for each of the main data characteristics exhibited in the process, a corresponding problem is defined and illustrated, and a review is conducted with detailed discussions on the connection and comparison of different monitoring methods. Finally, the relevant research perspectives and several promising issues are highlighted for future work.

788 citations

Journal ArticleDOI
TL;DR: The state of the art of data mining and analytics is reviewed through eight unsupervised learning and ten supervised learning algorithms, as well as the application status of semi-supervised learning algorithms.
Abstract: Data mining and analytics have played an important role in knowledge discovery and decision making/support in the process industry over the past several decades. As a computational engine for data mining and analytics, machine learning provides basic tools for information extraction, data pattern recognition, and prediction. From the perspective of machine learning, this paper provides a review of existing data mining and analytics applications in the process industry over the past several decades. The state of the art of data mining and analytics is reviewed through eight unsupervised learning and ten supervised learning algorithms, as well as the application status of semi-supervised learning algorithms. Several perspectives are highlighted and discussed for future research on data mining and analytics in the process industry.

657 citations

Journal ArticleDOI
TL;DR: A new monitoring method based on independent component analysis−principal component analysis (ICA−PCA) is proposed, where the Gaussian and non-Gaussian information can be extracted for fault detection and diagnosis and a new mixed similarity factor is proposed.
Abstract: Many of the current multivariate statistical process monitoring techniques (such as principal component analysis (PCA) or partial least squares (PLS)) do not utilize the non-Gaussian information of...

268 citations

Journal ArticleDOI
TL;DR: A distributed and parallel principal component analysis approach is proposed for large-scale processes with big data, together with a fault detection and isolation scheme that monitors the process hierarchically at the plant-wide, unit block, and variable levels. The method is evaluated on the Tennessee Eastman benchmark process.
Abstract: In order to deal with the modeling and monitoring issue of large-scale industrial processes with big data, a distributed and parallel designed principal component analysis approach is proposed. To handle the high-dimensional process variables, the large-scale process is first decomposed into distributed blocks with a priori process knowledge. Afterward, in order to solve the modeling issue with large-scale data chunks in each block, a distributed and parallel data processing strategy is proposed based on the framework of MapReduce and then principal components are further extracted for each distributed block. With all these steps, statistical modeling of large-scale processes with big data can be established. Finally, a systematic fault detection and isolation scheme is designed so that the whole large-scale process can be hierarchically monitored from the plant-wide level, unit block level, and variable level. The effectiveness of the proposed method is evaluated through the Tennessee Eastman benchmark process.

221 citations
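
The abstract above describes extracting principal components from large data chunks with a MapReduce-style strategy and then monitoring each block. A minimal single-machine sketch of that idea follows. It is an illustration under stated assumptions, not the authors' implementation: the function names (`map_chunk`, `reduce_stats`, `fit_block_pca`, `t2_statistic`, `spe_statistic`) and the synthetic data are hypothetical, the map and reduce steps are plain Python functions rather than a real MapReduce framework, variables are centered but not scaled to unit variance, and only one block is modeled. The Hotelling T² and squared prediction error (SPE) statistics are the standard PCA monitoring statistics and are assumed here as the detection indices.

```python
import numpy as np
from functools import reduce


def map_chunk(chunk):
    """Map step: per-chunk sufficient statistics (count, sum, sum of outer products)."""
    return chunk.shape[0], chunk.sum(axis=0), chunk.T @ chunk


def reduce_stats(a, b):
    """Reduce step: the sufficient statistics are additive across chunks."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]


def fit_block_pca(chunks, n_components):
    """Fit PCA for one block from chunked data without concatenating the chunks."""
    n, s, ss = reduce(reduce_stats, map(map_chunk, chunks))
    mean = s / n
    # Sample covariance from the accumulated statistics.
    cov = (ss - n * np.outer(mean, mean)) / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    return mean, eigvecs[:, order], eigvals[order]


def t2_statistic(x, mean, loadings, eigvals):
    """Hotelling T^2: variance-scaled squared scores in the retained subspace."""
    scores = (x - mean) @ loadings
    return np.sum(scores**2 / eigvals, axis=-1)


def spe_statistic(x, mean, loadings):
    """SPE (Q statistic): squared norm of the part not explained by the model."""
    centered = x - mean
    residual = centered - centered @ loadings @ loadings.T
    return np.sum(residual**2, axis=-1)


# Hypothetical normal-operation data for one block, split into six chunks.
rng = np.random.default_rng(0)
data = rng.normal(size=(600, 4)) @ rng.normal(size=(4, 4))
chunks = np.array_split(data, 6)

mean, loadings, eigvals = fit_block_pca(chunks, n_components=2)
t2 = t2_statistic(data, mean, loadings, eigvals)
spe = spe_statistic(data, mean, loadings)
```

In use, control limits for `t2` and `spe` would be estimated from normal-operation data, and a new sample exceeding either limit would be flagged for that block. Accumulating raw sums of outer products can lose precision when the variable means are large relative to their spread, so a production version would likely use a more numerically stable merge.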

Journal ArticleDOI
TL;DR: This paper develops a new sub-block principal component analysis (PCA) method for plant-wide process monitoring, named the distributed PCA model. Both the monitoring and the fault diagnosis schemes are built on the distributed PCA model.
Abstract: For plant-wide process monitoring, most traditional multiblock methods are under the assumption that some process knowledge should be incorporated for dividing the process into several sub-blocks. ...

194 citations


Cited by
Journal ArticleDOI

[...]

08 Dec 2001-BMJ
TL;DR: The author reflects on the ethereal nature of i, the square root of minus one, which seemed an odd beast when first encountered at school: an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
01 Jul 2012
TL;DR: A taxonomy of ensemble-based methods for class imbalance is proposed, in which each proposal is categorized by the ensemble methodology it is built on. A thorough empirical comparison of the most significant published approaches is carried out to show whether any of them makes a difference.
Abstract: Classifier learning with data-sets that suffer from imbalanced class distributions is a challenging problem in the data mining community. This issue occurs when the number of examples that represent one class is much lower than the ones of the other classes. Its presence in many real-world applications has brought along a growth of attention from researchers. In machine learning, ensembles of classifiers are known to increase the accuracy of single classifiers by combining several of them, but neither of these learning techniques alone solves the class imbalance problem; to deal with this issue, ensemble learning algorithms have to be designed specifically. In this paper, our aim is to review the state of the art on ensemble techniques in the framework of imbalanced data-sets, with focus on two-class problems. We propose a taxonomy for ensemble-based methods to address the class imbalance where each proposal can be categorized depending on the inner ensemble methodology on which it is based. In addition, we develop a thorough empirical comparison by the consideration of the most significant published approaches, within the families of the taxonomy proposed, to show whether any of them makes a difference. This comparison has shown the good behavior of the simplest approaches, which combine random undersampling techniques with bagging or boosting ensembles. In addition, the positive synergy between sampling techniques and bagging has stood out. Furthermore, our results show empirically that ensemble-based algorithms are worthwhile since they outperform the mere use of preprocessing techniques before learning the classifier, therefore justifying the increase of complexity by means of a significant enhancement of the results.

2,228 citations

Book ChapterDOI
11 Dec 2012

1,704 citations