Dazhong Wu¹
Department of Industrial and Manufacturing Engineering, National Science Foundation Center for e-Design, Pennsylvania State University, University Park, PA 16802
e-mail: dxw279@psu.edu

Connor Jennings
Department of Industrial and Manufacturing Engineering, National Science Foundation Center for e-Design, Pennsylvania State University, University Park, PA 16802
e-mail: connor@psu.edu

Janis Terpenny
Department of Industrial and Manufacturing Engineering, National Science Foundation Center for e-Design, Pennsylvania State University, University Park, PA 16802
e-mail: jpt5311@psu.edu

Robert X. Gao
Department of Mechanical and Aerospace Engineering, Case Western Reserve University, Cleveland, OH 44106
e-mail: robert.gao@case.edu

Soundar Kumara
Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, PA 16802
e-mail: skumara@psu.edu
A Comparative Study on Machine Learning Algorithms for Smart Manufacturing: Tool Wear Prediction Using Random Forests
Manufacturers have faced an increasing need for the development of predictive models that predict mechanical failures and the remaining useful life (RUL) of manufacturing systems or components. Classical model-based or physics-based prognostics often require an in-depth physical understanding of the system of interest to develop closed-form mathematical models. However, prior knowledge of system behavior is not always available, especially for complex manufacturing systems and processes. To complement model-based prognostics, data-driven methods have been increasingly applied to machinery prognostics and maintenance management, transforming legacy manufacturing systems into smart manufacturing systems with artificial intelligence. While previous research has demonstrated the effectiveness of data-driven methods, most of these prognostic methods are based on classical machine learning techniques, such as artificial neural networks (ANNs) and support vector regression (SVR). With the rapid advancement in artificial intelligence, various machine learning algorithms have been developed and widely applied in many engineering fields. The objective of this research is to introduce a random forests (RFs)-based prognostic method for tool wear prediction as well as compare the performance of RFs with feed-forward back propagation (FFBP) ANNs and SVR. Specifically, the performance of FFBP ANNs, SVR, and RFs is compared using experimental data collected from 315 milling tests. Experimental results have shown that RFs can generate more accurate predictions than FFBP ANNs with a single hidden layer and SVR. [DOI: 10.1115/1.4036350]
Keywords: tool wear prediction, predictive modeling, machine learning, random forests (RFs), support vector machines (SVMs), artificial neural networks (ANNs), prognostics and health management (PHM)
1 Introduction
Smart manufacturing aims to integrate big data, advanced ana-
lytics, high-performance computing, and Industrial Internet of
Things (IIoT) into traditional manufacturing systems and proc-
esses to create highly customizable products with higher quality at
lower costs. As opposed to traditional factories, a smart factory
utilizes interoperable information and communications technolo-
gies (ICT), intelligent automation systems, and sensor networks to
monitor machinery conditions, diagnose the root cause of failures,
and predict the remaining useful life (RUL) of mechanical sys-
tems or components. For example, almost all engineering systems
(e.g., aerospace systems, nuclear power plants, and machine tools)
are subject to mechanical failures resulting from deterioration
with usage and age or abnormal operating conditions [1–3].
Some of the typical failure modes include excessive load, over-
heating, deflection, fracture, fatigue, corrosion, and wear. The
degradation and failures of engineering systems or components
will often incur higher costs and lower productivity due to unex-
pected machine downtime. In order to increase manufacturing
productivity while reducing maintenance costs, it is crucial to
develop and implement an intelligent maintenance strategy that
allows manufacturers to determine the condition of in-service sys-
tems in order to predict when maintenance should be performed.
Conventional maintenance strategies include reactive, preven-
tive, and proactive maintenance [4–6]. The most basic approach
to maintenance is reactive, also known as run-to-failure mainte-
nance planning. In the reactive maintenance strategy, assets are
deliberately allowed to operate until failures actually occur. The
assets are maintained on an as-needed basis. One of the disadvan-
tages of reactive maintenance is that it is difficult to anticipate the
maintenance resources (e.g., manpower, tools, and replacement
parts) that will be required for repairs. Preventive maintenance is
often referred to as use-based maintenance. In preventive mainte-
nance, maintenance activities are performed after a specified
period of time or amount of use based on the estimated probability
that the systems or components will fail in the specified time inter-
val. Although preventive maintenance allows for more consistent
and predictable maintenance schedules, more maintenance activ-
ities are needed as opposed to reactive maintenance. To improve
¹Corresponding author.
Manuscript received October 25, 2016; final manuscript received March 13, 2017; published online April 18, 2017. Assoc. Editor: Laine Mears.
Journal of Manufacturing Science and Engineering, JULY 2017, Vol. 139 / 071018-1
Copyright © 2017 by ASME

the efficiency and effectiveness of preventive maintenance, pre-
dictive maintenance is an alternative strategy in which mainte-
nance actions are scheduled based on equipment performance or
conditions instead of time. The objective of proactive mainte-
nance is to determine the condition of in-service equipment and
ultimately to predict the time at which a system or a component
will no longer meet desired functional requirements.
The discipline that predicts health condition and remaining use-
ful life (RUL) based on previous and current operating conditions
is often referred to as prognostics and health management (PHM).
Prognostic approaches fall into two categories: model-based and
data-driven prognostics [7–12]. Model-based prognostics refer to
approaches based on mathematical models of system behavior
derived from physical laws or probability distribution. For exam-
ple, model-based prognostics include methods based on Wiener
and Gamma processes [13], hidden Markov models (HMMs) [14], Kalman filters [15,16], and particle filters [17–20]. One of the
limitations of model-based prognostics is that an in-depth under-
standing of the underlying physical processes that lead to system
failures is required. Another limitation is that it is assumed that
underlying processes follow certain probability distributions, such
as gamma or normal distributions. While probability density func-
tions enable uncertainty quantification, distributional assumptions
may not hold true in practice.
To complement model-based prognostics, data-driven prognos-
tics refer to approaches that build predictive models using learn-
ing algorithms and large volumes of training data. For example,
classical data-driven prognostics are based on autoregressive
(AR) models, multivariate adaptive regression, fuzzy set theory,
ANNs, and SVR. The unique benefit of data-driven methods is
that an in-depth understanding of system physical behaviors is not
a prerequisite. In addition, data-driven methods do not assume
any underlying probability distributions which may not be practi-
cal for real-world applications. While ANNs and SVR have been
applied in the area of data-driven prognostics, little research has
been conducted to evaluate the performance of other machine
learning algorithms [21]. Because RFs have the potential to handle a large number of input variables without variable selection and they do not overfit [22–24], we investigate the ability of RFs
for the prediction of tool wear using an experimental dataset.
Further, the performance of RFs is compared with that of FFBP
ANNs and SVR using accuracy and training time.
The main contributions of this paper include the following:

• Tool wear in milling operations is predicted using RFs along with cutting force, vibration, and acoustic emission (AE) signals. Experimental results have shown that the predictive model trained by RFs is very accurate. The mean squared error (MSE) on the test tool wear data is up to 7.67. The coefficient of determination (R²) on the test tool wear data is up to 0.992. To the best of our knowledge, this is the first time the random forest algorithm has been applied to predict tool wear.

• The performances of ANNs, support vector machines (SVMs), and RFs are compared using an experimental dataset with respect to the accuracy of regression (e.g., MSE and R²) and training time. While the training time for RFs is longer than that of ANNs and SVMs, the predictive model built by RFs is the most accurate for the application example.
The remainder of the paper is organized as follows: Section 2 reviews the related literature on data-driven methods for tool wear prediction. Section 3 presents the methodology for tool wear prediction using ANNs, SVMs, and RFs. Section 4 presents an experimental setup and the experimental dataset acquired from different types of sensors (e.g., cutting force sensor, vibration sensor, acoustic emission sensor) on a computer numerical control (CNC) milling machine. Section 5 presents experimental results, demonstrates the effectiveness of the three machine learning algorithms, and compares the performance of each. Section 6 provides conclusions that include a discussion of research contribution and future work.
2 Data-Driven Methods for Tool Wear Prediction
Tool wear is the most commonly observed and unavoidable
phenomenon in manufacturing processes, such as drilling, milling,
and turning [25–27]. The rate of tool wear is typically affected by
process parameters (e.g., cutting speed and feed rate), cutting tool
geometry, and properties of workpiece and tool materials. Taylor's equation for tool life expectancy [28] provides an approximation of tool wear. However, with the rapid advancement of
sensing technology and increasing number of sensors equipped on
modern CNC machines, it is possible to predict tool wear more
accurately using various measurement data. This section presents
a review of data-driven methods for tool wear prediction.
Schwabacher and Goebel [29] conducted a review of data-
driven methods for prognostics. The most popular data-driven
approaches to prognostics include ANNs, decision trees, and
SVMs in the context of systems health management. ANNs are a
family of computational models based on biological neural net-
works which are used to estimate complex relationships between
inputs and outputs. Bukkapatnam et al. [30–32] developed effective tool wear monitoring techniques using ANNs based on features extracted from the principles of nonlinear dynamics.
Özel and Karpat [33] presented a predictive modeling approach for sur-
face roughness and tool wear for hard turning processes using
ANNs. The inputs of the ANN model include workpiece hardness,
cutting speed, feed rate, axial cutting length, and mean values of
three force components. Experimental results have shown that the
model trained by ANNs provides accurate predictions of surface
roughness and tool flank wear. Palanisamy et al. [34] developed a
predictive model for predicting tool flank wear in end milling
operations using feed-forward back propagation (FFBP) ANNs.
Experimental results have shown that the predictive model based
on ANNs can make accurate predictions of tool flank wear using
cutting speeds, feed rates, and depth of cut. Sanjay et al. [35]
developed a model for predicting tool flank wear in drilling using
ANNs. The feed rates, spindle speeds, torques, machining times,
and thrust forces are used to train the ANN model. The experi-
mental results have demonstrated that ANNs can predict tool wear
accurately. Chungchoo and Saini [36] developed an online fuzzy
neural network (FNN) algorithm that estimates the average width
of flank wear and maximum depth of crater wear. A modified
least-square backpropagation neural network was built to estimate
flank and crater wear based on cutting force and acoustic emission
signals. Chen and Chen [37] developed an in-process tool wear
prediction system using ANNs for milling operations. A total of
100 experimental data were used for training the ANN model.
The input variables include feed rate, depth of cut, and average
peak cutting forces. The ANN model can predict tool wear with
an error of 0.037 mm on average. Paul and Varadarajan [38] introduced a multisensor fusion model to predict tool wear in turning
processes using ANNs. A regression model and an ANN were
developed to fuse the cutting force, cutting temperature, and
vibration signals. Experimental results showed that the coefficient
of determination was 0.956 for the regression model trained by
the ANN. Karayel [39] presented a neural network approach
for the prediction of surface roughness in turning operations. A
feed-forward back-propagation multilayer neural network was
developed to train a predictive model using the data collected
from 49 cutting tests. Experimental results showed that the predic-
tive model has an average absolute error of 2.29%.
Cho et al. [40] developed an intelligent tool breakage detection
system with the SVM algorithm by monitoring cutting forces and
power consumption in end milling processes. Linear and polyno-
mial kernel functions were applied in the SVM algorithm. It has
been demonstrated that the predictive model built by SVMs can
recognize process abnormalities in milling. Benkedjouh et al. [41]
presented a method for tool wear assessment and remaining useful
life prediction using SVMs. The features were extracted from
cutting force, vibration, and acoustic emission signals. The experi-
mental results have shown that SVMs can be used to estimate the

wear progression and predict RUL of cutting tools effectively. Shi and Gindy [42] introduced a predictive modeling method by com-
bining least squares SVMs and principal component analysis
(PCA). PCA was used to extract statistical features from multiple
sensor signals acquired from broaching processes. Experimental
results showed that the predictive model trained by SVMs was
effective to predict tool wear using the features extracted by PCA.
Another data-driven method for prognostics is based on deci-
sion trees. Decision trees are a nonparametric supervised learning
method used for classification and regression. The goal of deci-
sion tree learning is to create a model that predicts the value of a
target variable by learning decision rules inferred from data fea-
tures. A decision tree is a flowchart-like structure in which each
internal node denotes a test on an attribute, each branch represents
the outcome of a test, and each leaf node holds a class label. Jiaa and Dornfeld [43] proposed a decision tree-based method for the
prediction of tool flank wear in a turning operation using acoustic
emission and cutting force signals. The features characterizing the
AE root-mean-square and cutting force signals were extracted from
both time and frequency domains. The decision tree approach was
demonstrated to be able to make reliable inferences and decisions
on tool wear classification. Elangovan et al. [44] developed a decision tree-based algorithm for tool wear prediction using vibration
signals. Ten-fold cross-validation was used to evaluate the accuracy
of the predictive model created by the decision tree algorithm. The
maximum classification accuracy was 87.5%. Arisoy and Özel [45] investigated the effects of machining parameters on surface microhardness and microstructure such as grain size and fractions using a
random forests-based predictive modeling method along with
finite element simulations. Predicted microhardness profiles and
grain sizes were used to understand the effects of cutting speed,
tool coating, and edge radius on the surface integrity.
In summary, the related work presented in this section builds
on previous research to explore how the conditions of tool wear
can be monitored as well as how tool wear can be predicted using
predictive modeling. While earlier work focused on prediction of
tool wear using ANNs, SVMs, and decision trees, this paper
explores the potential of a new method, random forests, for tool
wear prediction. Further, the performance of RFs is compared
with that of ANNs and SVMs. Because RFs are an extension of
decision trees, the performance of RFs is not compared with that
of decision trees.
3 Methodology
This section presents the methodology for data-driven prognostics for tool wear prediction using ANNs, SVR, and RFs. The input of ANNs, SVR, and RFs is the following labeled training data:

$$D = (x_i, y_i)$$

where $x_i = (F_X, F_Y, F_Z, V_X, V_Y, V_Z, AE)$ and $y_i \in \mathbb{R}$. The description of these input data can be found in Table 1.
3.1 Tool Wear Prediction Using ANNs. ANNs are a family of models inspired by biological neural networks. An ANN is defined by three types of parameters: (1) the interconnection pattern between different layers of neurons, (2) the learning process for updating the weights of the interconnections, and (3) the activation function that converts a neuron's weighted input to its output activation. Among many types of ANNs, the feed-forward neural network is the first and the most popular ANN. Backpropagation is a learning algorithm for training ANNs in conjunction with an optimization method such as gradient descent.

Figure 1 illustrates the architecture of the FFBP ANN with a single hidden layer. In this research, the ANN has three layers: input layer $i$, hidden layer $j$, and output layer $k$. Each layer consists of one or more neurons or units, represented by the circles. The flow of information is represented by the lines between the units. The first layer has input neurons which act as buffers for distributing the extracted features (i.e., $F_i$) from the input data (i.e., $x_i$). The number of neurons in the input layer is the same as the number of features extracted from the input variables. Each value from the input layer is duplicated and sent to all neurons in the hidden layer. The hidden layer processes and connects the information from the input layer to the output layer in a forward direction. Specifically, the values entering a neuron in the hidden layer are multiplied by weights $w_{ij}$. Initial weights are randomly selected between 0 and 1. A neuron in the hidden layer sums up the weighted inputs and generates a single output. This value is the input of an activation function (sigmoid function) in the hidden layer, $f_h$, that converts the weighted input to the output of the neuron. Similarly, the outputs of all the neurons in the hidden layer are multiplied by weights $w_{jk}$. A neuron in the output layer sums up the weighted inputs and generates a single value. An activation function in the output layer, $f_o$, converts the weighted input to the predicted output $y_k$ of the ANN, which is the predicted flank wear VB. The output layer has only one neuron because there is only one response variable. The performance of ANNs depends on the topology or architecture of the network (i.e., the number of layers) and the number of neurons in each layer. However, there are no standard or well-accepted rules for determining the number of hidden layers and neurons in each hidden layer. In this research, single-hidden-layer ANNs with 2, 4, 8, 16, and 32 neurons in the hidden layer are selected. The termination criterion of the training algorithm is that training stops if the fit criterion (i.e., least squares) falls below $1.0 \times 10^{-4}$.
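The single-hidden-layer configuration described above can be sketched with scikit-learn's MLPRegressor. This is an illustrative stand-in, not the authors' implementation, and the data below are synthetic placeholders that only mimic the shape of the milling dataset (315 cuts, 28 extracted features):

```python
# Single-hidden-layer FFBP ANN for flank-wear regression, sketched with
# scikit-learn's MLPRegressor. Synthetic placeholder data stand in for
# the 315 cuts x 28 extracted features; the wear response is simulated.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(315, 28))                         # hypothetical features
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=315)  # stand-in flank wear

X_std = StandardScaler().fit_transform(X)

for n_hidden in (2, 4, 8, 16, 32):                     # topologies compared
    ann = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                       activation="logistic",          # sigmoid hidden units
                       solver="sgd",                   # gradient-descent training
                       learning_rate_init=0.01,
                       tol=1e-4,                       # fit-criterion threshold
                       max_iter=5000,
                       random_state=0)
    ann.fit(X_std, y)
    print(n_hidden, round(ann.score(X_std, y), 3))     # training R^2
```

Sweeping the hidden-layer width over 2, 4, 8, 16, and 32 mirrors the five topologies compared in the paper; `tol=1e-4` plays the role of the least-squares termination criterion.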
Table 1 Signal channel and data description

Signal channel   Data description
Channel 1        F_X: force (N) in X dimension
Channel 2        F_Y: force (N) in Y dimension
Channel 3        F_Z: force (N) in Z dimension
Channel 4        V_X: vibration (g) in X dimension
Channel 5        V_Y: vibration (g) in Y dimension
Channel 6        V_Z: vibration (g) in Z dimension
Channel 7        AE: acoustic emission (V)

Fig. 1 Tool wear prediction using a feed-forward back-propagation (FFBP) ANN

3.2 Tool Wear Prediction Using SVR. The original SVM for regression was developed by Vapnik and coworkers [46,47]. An SVM constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification and regression.

The framework of SVR for linear cases is illustrated in Fig. 2. Formally, SVR can be formulated as a convex optimization problem:

$$\text{Minimize} \quad \frac{1}{2}\|\omega\|^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right)$$

$$\text{Subject to} \quad \begin{cases} y_i - \langle \omega, x_i \rangle - b \le \varepsilon + \xi_i \\ \langle \omega, x_i \rangle + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0 \end{cases} \quad (3.1)$$

where $\omega \in \chi$, $C = 1$, $\varepsilon = 0.1$, and $\xi_i, \xi_i^* = 0.001$. $b$ can be computed as follows:

$$b = y_i - \langle \omega, x_i \rangle - \varepsilon \quad \text{for} \quad \alpha_i \in (0, C)$$
$$b = y_i - \langle \omega, x_i \rangle + \varepsilon \quad \text{for} \quad \alpha_i^* \in (0, C) \quad (3.2)$$

For nonlinear SVR, the training patterns $x_i$ can be preprocessed by a nonlinear kernel function $k(x, x') := \langle \Phi(x), \Phi(x') \rangle$, where $\Phi(x)$ is a transformation that maps $x$ to a high-dimensional space. These kernel functions need to satisfy Mercer's theorem. Many kernels have been developed for various applications. The most popular kernels include polynomial, Gaussian radial basis function (RBF), and sigmoid. In many applications, a nonlinear kernel function provides better accuracy. According to the literature [32,33], the Gaussian RBF kernel is one of the most effective kernel functions used in tool wear prediction. In this research, the Gaussian RBF kernel is used to transform the input dataset $D = (x_i, y_i)$, where $x_i$ is the input vector and $y_i$ is the response variable (i.e., flank wear), into a new dataset in a high-dimensional space. The new dataset is linearly separable by a hyperplane in a higher-dimensional Euclidean space as illustrated in Fig. 2. The slack variables $\xi_i$ and $\xi_i^*$ are introduced in the instances where the constraints are infeasible. The slack variables denote the deviation from predicted values with the error of $\varepsilon = 0.1$. The RBF kernel is $k(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right)$, where $\sigma^2 = 0.5$. At the optimal solution, we obtain

$$\omega = \sum_{i=1}\left(\alpha_i - \alpha_i^*\right)\Phi(x_i) \quad \text{and} \quad f(x) = \sum_{i=1}\left(\alpha_i - \alpha_i^*\right)k(x_i, x) + b \quad (3.3)$$
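As a concrete illustration, the ε-SVR above maps onto scikit-learn's SVR almost parameter for parameter; note that scikit-learn writes the RBF kernel as exp(−γ‖x − x′‖²), so σ² = 0.5 corresponds to γ = 1/(2σ²) = 1.0. The data here are synthetic stand-ins for the seven signal channels, not the milling measurements:

```python
# epsilon-SVR with the Gaussian RBF kernel and the parameter values used
# in this section: C = 1, epsilon = 0.1, sigma^2 = 0.5. scikit-learn's
# RBF is exp(-gamma ||x - x'||^2), so sigma^2 = 0.5 gives gamma = 1.0.
# Synthetic stand-ins for (F_X, F_Y, F_Z, V_X, V_Y, V_Z, AE) are used.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))                                # 7 channels
y = np.linalg.norm(X, axis=1) + 0.05 * rng.normal(size=200)  # stand-in wear

svr = SVR(kernel="rbf", C=1.0, epsilon=0.1, gamma=1.0)
svr.fit(X, y)

# f(x) = sum_i (alpha_i - alpha_i*) k(x_i, x) + b is evaluated over the
# support vectors only; points inside the epsilon-tube are not support vectors.
print(len(svr.support_), "support vectors out of", len(X))
```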
3.3 Tool Wear Prediction Using RFs. The random forest algorithm, developed by Breiman [22,48], is an ensemble learning method that constructs a forest of decision trees from bootstrap samples of a training dataset. Each decision tree produces a response, given a set of predictor values. In a decision tree, each internal node represents a test on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label for classification or a response for regression. A decision tree in which the response is continuous is also referred to as a regression tree. In the context of tool wear prediction, each individual decision tree in a random forest is a regression tree because tool wear describes the gradual failure of cutting tools. A comprehensive tutorial on RFs can be found in Refs. [22,48,49]. Some of the important concepts related to RFs, including bootstrap aggregating or bagging, splitting, and the stopping criterion, are introduced in Secs. 3.3.1–3.3.4.
3.3.1 Bootstrap Aggregating or Bagging. Given a training dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, bootstrap aggregating or bagging generates $B$ new training datasets $D_i$ of size $N$ by sampling from the original training dataset $D$ with replacement. $D_i$ is referred to as a bootstrap sample. By sampling with replacement or bootstrapping, some observations may be repeated in each $D_i$. Bagging helps reduce variance and avoid overfitting. The number of regression trees $B$ is a parameter specified by users. Typically, a few hundred to several thousand trees are used in the random forest algorithm.
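Bootstrap sampling is straightforward to sketch: each $D_i$ draws $N$ indices with replacement, so for large $N$ roughly 63.2% of the distinct rows appear in any one sample. The numbers below are illustrative, not from the paper:

```python
# Bootstrap sampling: each new dataset D_i draws N indices from the
# original dataset with replacement, so some rows repeat and, on
# average, about 63.2% of the distinct rows appear in each D_i.
import numpy as np

rng = np.random.default_rng(0)
N, B = 630, 5            # sample size and number of bags (illustrative values)

for i in range(B):
    boot = rng.choice(N, size=N, replace=True)   # bootstrap sample D_i
    frac_unique = len(np.unique(boot)) / N
    print(i, round(frac_unique, 3))              # ~0.632 for large N
```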
3.3.2 Choosing Variables to Split On. For each of the bootstrap samples, grow an un-pruned regression tree with the following procedure: At each node, randomly sample $m$ variables and choose the best split among those variables rather than choosing the best split among all predictors. This process is sometimes called "feature bagging." A random subset of the predictors or features is selected because it reduces the correlation between the trees grown from an ordinary bootstrap sample. For regression, the default is $m = p/3$.
3.3.3 Splitting Criterion. Suppose that a partition is divided into $M$ regions $R_1, R_2, \ldots, R_M$. The response is modeled as a constant $c_m$ in each region:

$$f(x) = \sum_{m=1}^{M} c_m I(x \in R_m) \quad (3.4)$$

The splitting criterion at each node is to minimize the sum of squares. Therefore, the best $\hat{c}_m$ is the average of $y_i$ in region $R_m$:

$$\hat{c}_m = \text{ave}(y_i \mid x_i \in R_m) \quad (3.5)$$

Consider a splitting variable $j$ and split point $s$, and define the pair of half-planes

$$R_1(j, s) = \{X \mid X_j \le s\} \quad \text{and} \quad R_2(j, s) = \{X \mid X_j > s\} \quad (3.6)$$

The splitting variable $j$ and split point $s$ should satisfy

$$\min_{j,s}\left[\min_{c_1}\sum_{x_i \in R_1(j,s)}\left(y_i - c_1\right)^2 + \min_{c_2}\sum_{x_i \in R_2(j,s)}\left(y_i - c_2\right)^2\right] \quad (3.7)$$

For any $j$ and $s$, the inner minimization is solved by

$$\hat{c}_1 = \text{ave}(y_i \mid x_i \in R_1(j, s)) \quad \text{and} \quad \hat{c}_2 = \text{ave}(y_i \mid x_i \in R_2(j, s)) \quad (3.8)$$

Having found the best split, the dataset is partitioned into the two resulting regions, and the splitting process is repeated on each of the two regions until a predefined stopping criterion is satisfied.
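Equations (3.6)–(3.8) amount to an exhaustive search over candidate variables and split points. A plain-Python sketch (illustrative only, far slower than a production implementation) is:

```python
# Exhaustive best-split search per Eqs. (3.6)-(3.8): try every candidate
# variable j and split point s, score each split by the sum of squared
# deviations from the two child-node means, and keep the minimizer.
import numpy as np

def best_split(X, y, candidate_vars):
    best_j, best_s, best_cost = None, None, np.inf
    for j in candidate_vars:
        for s in np.unique(X[:, j]):
            left, right = X[:, j] <= s, X[:, j] > s
            if not left.any() or not right.any():
                continue
            # Inner minimization (Eq. 3.8): child means are the optimal constants.
            cost = ((y[left] - y[left].mean()) ** 2).sum() \
                 + ((y[right] - y[right].mean()) ** 2).sum()
            if cost < best_cost:
                best_j, best_s, best_cost = j, s, cost
    return best_j, best_s, best_cost

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))                      # toy data, 4 predictors
y = np.where(X[:, 2] > 0, 1.0, 0.0) + 0.01 * rng.normal(size=50)
j, s, cost = best_split(X, y, candidate_vars=[0, 1, 2, 3])
print(j)   # variable 2, which generated y, is selected
```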
3.3.4 Stopping Criterion. Tree size is a tuning parameter governing the complexity of a model. The stopping criterion is that the splitting process proceeds until the number of records in a node falls below a threshold; five is used as the threshold in this research.
Fig. 2 Tool wear prediction using SVR

After $B$ such trees $\{T_b\}_1^B$ are constructed, a prediction at a new point $x$ can be made by averaging the predictions from all the individual $B$ regression trees on $x$:

$$\hat{f}_{rf}^{B}(x) = \frac{1}{B}\sum_{b=1}^{B} T_b(x) \quad (3.9)$$
The random forest algorithm [48,49] for regression is as follows:

(1) Draw a bootstrap sample $Z$ of size $N$ from the training data.
(2) For each bootstrap sample, construct a regression tree by splitting a node into two children nodes until the stopping criterion is satisfied.
(3) Output the ensemble of trees $\{T_b\}_1^B$.
(4) Make a prediction at a new point $x$ by aggregating the predictions of the $B$ trees.
The framework of predicting flank wear using an RF is illustrated in Fig. 3. In this research, a random forest is constructed using $B = 500$ regression trees. Given the labeled training dataset $D = (x_i, y_i)$, a bootstrap sample of size $N = 630$ is drawn from the training dataset. For each regression tree, $m = 9$ ($m = p/3$, $p = 28$) variables are selected at random from the 28 variables/features. The best variable/split-point is selected among the nine variables. A regression tree progressively splits the training dataset into two child nodes: a left node (with samples $< z$) and a right node (with samples $\ge z$). A splitting variable and split point are selected by solving Eqs. (3.7) and (3.8). The process is applied recursively on the dataset in each child node. The splitting process stops if the number of records in a node is less than 5. An individual regression tree is built by starting at the root node of the tree, performing a sequence of tests about the predictors, and organizing the tests in a hierarchical binary tree structure as shown in Fig. 4. After 500 regression trees are constructed, a prediction at a new point can be made by averaging the predictions from all the individual binary regression trees on this point.
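The configuration in this paragraph (B = 500 trees, m = 9 of p = 28 features per split, node-size threshold of 5) can be reproduced with scikit-learn's RandomForestRegressor; this is a sketch with synthetic placeholder data, not the authors' implementation or dataset:

```python
# RF configuration from this section, sketched with scikit-learn:
# B = 500 trees, m = 9 of p = 28 features tried per split, and no split
# of nodes holding fewer than 5 records. Synthetic placeholder data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(630, 28))                         # N = 630, p = 28
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=630)  # stand-in flank wear

rf = RandomForestRegressor(n_estimators=500,      # B = 500 regression trees
                           max_features=9,        # m = p/3 = 9 per split
                           min_samples_split=5,   # node-size stopping rule
                           bootstrap=True,        # bagging with replacement
                           random_state=0)
rf.fit(X, y)

# Eq. (3.9): the forest prediction is the average of the B tree predictions.
x_new = X[:1]
manual = np.mean([tree.predict(x_new)[0] for tree in rf.estimators_])
print(np.isclose(manual, rf.predict(x_new)[0]))   # → True
```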
4 Experimental Setup
The data used in this paper were obtained from Li et al. [50].
Some details of the experiment are presented in this section. The
experimental setup is shown in Fig. 5.
The cutter material and workpiece material used in the experi-
ment are high-speed steel and stainless steel, respectively. The
detailed description of the operating conditions in the dry milling
operation can be found in Table 2. The spindle speed of the cutter
was 10,400 RPM. The feed rate was 1555 mm/min. The Y depth
of cut (radial) was 0.125 mm. The Z depth of cut (axial) was
0.2 mm.
315 cutting tests were conducted on a three-axis high-speed CNC machine (Röders Tech RFM 760). During each cutting test,
seven signal channels, including cutting force, vibration, and
acoustic emission data, were monitored in real-time. The sampling
rate was 50 kHz/channel. Each cutting test took about 15 s. A sta-
tionary dynamometer, mounted on the table of the CNC machine,
was used to measure cutting forces in three, mutually perpendicu-
lar axes (x, y, and z dimensions). Three piezo accelerometers,
mounted on the workpiece, were used to measure vibration in
three, mutually perpendicular axes (x, y, and z dimensions). An
acoustic emission (AE) sensor, mounted on the workpiece, was
Fig. 3 Tool wear prediction using an RF
Fig. 4 Binary regression tree growing process
Fig. 5 Experimental setup
