Journal ArticleDOI

Wind Turbine Gearbox Failure Identification With Deep Neural Networks

Long Wang, Zijun Zhang, Huan Long, Jia Xu, Ruihua Liu
01 Jun 2017 - IEEE Transactions on Industrial Informatics (IEEE) - Vol. 13, Iss: 3, pp 1360-1368
TL;DR: The feasibility of monitoring the health of wind turbine (WT) gearboxes based on the lubricant pressure data in the supervisory control and data acquisition system is investigated and a deep neural network (DNN)-based framework is developed to monitor conditions of WT gearboxes and identify their impending failures.
Abstract: The feasibility of monitoring the health of wind turbine (WT) gearboxes based on the lubricant pressure data in the supervisory control and data acquisition system is investigated in this paper. A deep neural network (DNN)-based framework is developed to monitor conditions of WT gearboxes and identify their impending failures. Six data-mining algorithms, the k-nearest neighbors, least absolute shrinkage and selection operator, ridge regression (Ridge), support vector machines, shallow neural network, as well as DNN, are applied to model the lubricant pressure. A comparative analysis of the developed data-driven models is conducted, and the DNN model is the most accurate. To prevent overfitting of the DNN model, a dropout algorithm is applied in the DNN training process. Computational results show that the prediction error shifts before gearbox failures occur. An exponentially weighted moving average control chart is deployed to derive criteria for detecting the shifts. The effectiveness of the proposed monitoring approach is demonstrated by examining real cases from wind farms in China and benchmarked against gearbox monitoring based on the oil temperature data.
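
The shift-detection idea in the abstract can be illustrated with a minimal sketch of an exponentially weighted moving average (EWMA) chart applied to a series of prediction errors. This is a textbook EWMA chart, not the authors' implementation: the function names, the smoothing weight `lam`, the limit width `L`, and the baseline mean and standard deviation (assumed to come from a healthy reference period) are all illustrative assumptions.

```python
import math

def ewma_chart(errors, baseline_mean, baseline_std, lam=0.2, L=3.0):
    """Return EWMA statistics and control limits for a list of errors.

    lam and L are assumed chart parameters; the baseline statistics are
    assumed to be estimated from data recorded under normal operation.
    """
    z = baseline_mean  # start the EWMA at the in-control mean
    zs, ucls, lcls = [], [], []
    for t, e in enumerate(errors, start=1):
        z = lam * e + (1.0 - lam) * z
        # time-varying limit width of the standard EWMA chart
        width = L * baseline_std * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        zs.append(z)
        ucls.append(baseline_mean + width)
        lcls.append(baseline_mean - width)
    return zs, ucls, lcls

def first_alarm(errors, baseline_mean, baseline_std, lam=0.2, L=3.0):
    """Index of the first point outside the limits, or None if in control."""
    zs, ucls, lcls = ewma_chart(errors, baseline_mean, baseline_std, lam, L)
    for i, (z, ucl, lcl) in enumerate(zip(zs, ucls, lcls)):
        if z > ucl or z < lcl:
            return i
    return None

# Hypothetical error series: stable for 10 samples, then a sustained shift.
errors = [1.0] * 10 + [2.0] * 10
alarm_index = first_alarm(errors, baseline_mean=1.0, baseline_std=0.1)
```

A smaller `lam` smooths more heavily and reacts more slowly; the value used here is only a placeholder.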
Citations
Journal ArticleDOI
TL;DR: This article presents a systematic review of artificial intelligence based system health management with an emphasis on recent trends of deep learning within the field and demonstrates plausible benefits for fault diagnosis and prognostics.

740 citations

Journal ArticleDOI
TL;DR: This paper reviews the recent literature on machine learning models that have been used for condition monitoring in wind turbines and shows that most models use SCADA or simulated data, with almost two-thirds of methods using classification and the rest relying on regression.

482 citations


Cites methods from "Wind Turbine Gearbox Failure Identi..."

  • ...[112] 2017 SCADA Lubricant pressure monitoring Regression/Normal...

    [...]

  • ...[112] built regression-based deep NN models of lubricant pressure based on SCADA data....

    [...]

  • ...They used Exponentially Weighted Moving Average (EWMA) charts to identify shifts in absolute percentage errors signifying failures and to prevent overfitting they use dropout layers in the network [112]; six wind farms were considered and the trained deep NNs achieved a MAPE between 2....

    [...]

Journal ArticleDOI
TL;DR: This paper focuses on data-driven methods for PdM, presents a comprehensive survey on its applications, and attempts to provide graduate students, companies, and institutions with the preliminary understanding of the existing works recently published.
Abstract: With the tremendous revival of artificial intelligence, predictive maintenance (PdM) based on data-driven methods has become the most effective solution to address smart manufacturing and industrial big data, especially for performing health perception (e.g., fault diagnosis and remaining life assessment). Moreover, because the existing PdM research is still in a primary experimental stage, most works are conducted utilizing several open datasets, and the combination with specific applications such as rotating machinery is especially rare. Hence, in this paper, we focus on data-driven methods for PdM, present a comprehensive survey on its applications, and attempt to provide graduate students, companies, and institutions with a preliminary understanding of the existing works recently published. Specifically, we first briefly introduce the PdM approach, illustrate our PdM scheme for automatic washing equipment, and demonstrate the challenges encountered when conducting PdM research. Second, we classify the specific industrial applications based on six algorithms of machine learning and deep learning (DL), and compare five performance metrics for each classification. Furthermore, the accuracy (a metric to evaluate the algorithm performance) of these PdM applications is analyzed in detail. There are some important conclusions: 1) the data used in the summarized literature are mostly from public datasets, such as Case Western Reserve University (CWRU)/Intelligent Maintenance Systems (IMS); and 2) in recent years, researchers seem to focus more on DL algorithms for PdM research. Finally, we summarize the common features regarding our surveyed PdM applications and discuss several potential directions.

266 citations


Cites methods from "Wind Turbine Gearbox Failure Identi..."

  • ...According to [99], a DNN was employed to develop the prediction model for monitoring the wind turbine gearbox health...

    [...]

Journal ArticleDOI
TL;DR: A novel face-pose estimation framework named multitask manifold deep learning, based on feature extraction with improved convolutional neural networks (CNNs) and multimodal mapping relationship with multitask learning is proposed.
Abstract: Face-pose estimation aims at estimating the gazing direction with two-dimensional face images. It gives important communicative information and visual saliency. However, it is challenging because of lights, background, face orientations, and appearance visibility. Therefore, a descriptive representation of face images and mapping it to poses are critical. In this paper, we use multimodal data and propose a novel face-pose estimation framework named multitask manifold deep learning ($\text{M}^2\text{DL}$). It is based on feature extraction with improved convolutional neural networks (CNNs) and multimodal mapping relationship with multitask learning. In the proposed CNNs, manifold regularized convolutional layers learn the relationship between outputs of neurons in a low-rank space. Besides, in the proposed mapping relationship learning method, different modals of face representations are naturally combined by applying multitask learning with incoherent sparse and low-rank learning with a least-squares loss. Experimental results on three challenging benchmark datasets demonstrate the performance of $\text{M}^2\text{DL}$.

206 citations

Journal ArticleDOI
TL;DR: This paper reviews the research progress of deep transfer learning for machinery fault diagnosis in recent years, summarizing, classifying and explaining many publications on this topic and discussing various deep transfer architectures and related theories.

193 citations

References
Journal ArticleDOI
28 May 2015 - Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

46,982 citations

Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
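
The constraint described in the summary can be written out as a worked formula. The notation below ($y_i$ for responses, $x_{ij}$ for predictors, $\beta_0$ for an intercept, $t$ for the constant bound, $\lambda$ for the penalty weight) is an assumed, conventional choice rather than a quotation from the paper.

```latex
% Constrained form: least squares with a bound on the sum of absolute coefficients
\hat{\beta} = \arg\min_{\beta_0,\,\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  \quad \text{subject to} \quad \sum_{j=1}^{p} \lvert \beta_j \rvert \le t

% Equivalent penalized (Lagrangian) form, with \lambda \ge 0 corresponding to t
\hat{\beta} = \arg\min_{\beta_0,\,\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

Tightening the bound $t$ (or raising $\lambda$) drives some coefficients to exactly zero, which is the source of the interpretability noted in the summary.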

40,785 citations


"Wind Turbine Gearbox Failure Identi..." refers methods in this paper

  • ...The Lasso model [29] for predicting the lubricant pressure is written as (9), in this study, and the estimation of its parameters is as follows (10):...

    [...]

  • ...To validate the capability of DNN in modeling the lubricant pressure, the DNN model is compared with data-driven models developed by five famous algorithms, the k-Nearest Neighbors (kNN) [28], least absolute shrinkage and selection operator (Lasso) [29], ridge regression (Ridge) [30], support vector machines (SVM) [31], and shallow NN [32]....

    [...]

Journal ArticleDOI
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

38,681 citations

01 Jan 2005
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

36,760 citations


"Wind Turbine Gearbox Failure Identi..." refers background in this paper

  • ...The λ yields the smallest BIC considered in training the Lasso model....

    [...]

  • ...The optimal λ value is selected from a set {0.001, 0.002, ..., 0.5} and the Bayes Information criterion (BIC) [37] defined in (11) is considered as the selection criteria: $\mathrm{BIC} = -2 \ln L + v \ln(n)$ (11), where $L$ and $v$ are two BIC parameters, which can be estimated according to [38]....

    [...]

  • ...5} and the Bayes Information criterion (BIC) [37] defined in (11) is considered as the selection criteria...

    [...]
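
The λ selection quoted in the excerpts above can be sketched as a grid search that keeps the λ with the smallest BIC. The excerpts do not say how $L$ and $v$ are estimated (they defer to [38]), so this sketch makes two labeled assumptions: a Gaussian likelihood, giving $-2 \ln L = n \ln(\mathrm{RSS}/n)$ up to a constant, and $v$ taken as the number of nonzero coefficients. The coordinate-descent solver, the absence of an intercept, and all function names are likewise illustrative, not the authors' code.

```python
import math

def soft_threshold(rho, lam):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Minimize (1/(2n)) * RSS + lam * sum(|beta_j|) by coordinate descent.

    No intercept is fitted (assumes centered data); X is a list of rows.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho, zj = 0.0, 0.0
            for i in range(n):
                # prediction from all features except j
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                zj += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (zj / n) if zj > 0 else 0.0
    return beta

def bic(rss, n, v):
    """BIC = -2 ln L + v ln(n), with the assumed Gaussian -2 ln L (constants dropped)."""
    return n * math.log(max(rss / n, 1e-12)) + v * math.log(n)

def select_lambda(X, y, grid):
    """Return the grid value of lambda with the smallest BIC."""
    n = len(X)
    best_lam, best_score = None, float("inf")
    for lam in grid:
        beta = lasso_cd(X, y, lam)
        rss = sum((y[i] - sum(x * b for x, b in zip(X[i], beta))) ** 2
                  for i in range(n))
        v = sum(1 for b in beta if b != 0.0)  # assumed degrees of freedom
        score = bic(rss, n, v)
        if score < best_score:
            best_lam, best_score = lam, score
    return best_lam

# The grid from the excerpt: {0.001, 0.002, ..., 0.5}
paper_grid = [k / 1000 for k in range(1, 501)]
```

On real data the full `paper_grid` would be passed to `select_lambda`; a production setting would more likely rely on an optimized solver than on this pure-Python loop.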

Journal Article
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Abstract: Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
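
The train-time and test-time behavior described in this abstract can be sketched for a single layer of activations. This is a minimal illustration under assumptions, not the paper's code: the function names and the retention probability `p_keep` are hypothetical, and scaling activations by `p_keep` at test time is used here as a stand-in for the "smaller weights" of the unthinned network, since the two are equivalent for the outgoing connections of a unit.

```python
import random

def dropout_train(activations, p_keep, rng):
    """Training pass: keep each unit with probability p_keep, else zero it."""
    return [a if rng.random() < p_keep else 0.0 for a in activations]

def dropout_test(activations, p_keep):
    """Test pass: use every unit, scaled by p_keep to match the training-time mean."""
    return [a * p_keep for a in activations]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden = [1.0] * 10000  # hypothetical layer output
thinned = dropout_train(hidden, p_keep=0.8, rng=rng)
mean_train = sum(thinned) / len(thinned)            # close to 0.8 on average
mean_test = sum(dropout_test(hidden, 0.8)) / len(hidden)
```

Many current frameworks instead rescale the surviving units by `1 / p_keep` during training ("inverted dropout") and leave the test pass untouched; the expected activations agree in both variants.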

33,597 citations


"Wind Turbine Gearbox Failure Identi..." refers methods in this paper

  • ...To prevent the overfitting, a dropout training procedure [27] is utilized to develop the DNN model....

    [...]

  • ...The dropout method is an alternative and more efficient option for addressing DNN overfitting [27]....

    [...]