Analysis of function of rectified linear unit used in deep learning
Citations
229 citations — cites background from this paper:
...To solve this problem, the rectified linear unit (ReLU) became popular [31], since it accelerated the convergence of stochastic gradient descent compared to the sigmoid function....
85 citations — cites methods from this paper:
...However, training of neural networks with gradient-based learning is not efficient when the activation function is the sigmoid, because the sigmoid function has a widespread saturation property [53]. To overcome this problem, ReLU, defined as f_ReLU(x) = max(0, x), has been used in many studies....
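The contrast drawn in the excerpt above can be sketched numerically: the sigmoid's gradient vanishes for large |x| (its "widespread saturation"), while the ReLU gradient stays at 1 on the active half. A minimal NumPy illustration (not taken from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # f_ReLU(x) = max(0, x), as defined in the excerpt above
    return np.maximum(0.0, x)

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)           # at most 0.25, near zero for large |x|

def relu_grad(x):
    return (x > 0).astype(float)   # exactly 1 for every positive input

x = np.array([-10.0, -1.0, 0.5, 10.0])
print(sigmoid_grad(x))  # tiny at x = ±10: the saturation that slows learning
print(relu_grad(x))     # [0. 0. 1. 1.]
```

Because the ReLU gradient does not shrink as activations grow, gradient-based updates keep a useful magnitude, which is the mechanism behind the faster convergence the citing papers mention.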
84 citations — cites background from this paper:
...The rectified linear unit (ReLU) [31] layer is added before and after the hidden layer to introduce non-linear factors, which makes the model's expressive ability stronger....
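The "non-linear factors" point above has a simple justification: without an activation between them, two linear layers collapse into a single linear map. A small deterministic sketch with hypothetical toy weights:

```python
import numpy as np

# Hypothetical fixed weights, chosen so the effect is deterministic
W1 = np.array([[1.0, 0.0],
               [0.0, -1.0]])
W2 = np.array([[1.0, 1.0]])
x = np.array([1.0, 1.0])

# Without an activation, stacked linear layers reduce to one linear map
linear_out = W2 @ (W1 @ x)
assert np.allclose(linear_out, (W2 @ W1) @ x)

# A ReLU between the layers breaks this collapse,
# which is what gives the model non-linear expressive power
relu_out = W2 @ np.maximum(0.0, W1 @ x)
print(linear_out, relu_out)  # [0.] vs [1.]
```

Here W1 @ x = [1, -1], so the purely linear stack sums to 0, while the ReLU zeroes the negative component and yields 1: the two-layer network with ReLU computes a function no single linear layer can.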
References
15,055 citations — this paper cites it for background:
...In the fields of neural networks and their applications, including object recognition and speech processing, deep learning [5] is attracting much attention....
6,790 citations — this paper cites it for background:
...There is a similar function called "softplus" [11], defined as ln(1 + exp(y_k'))....
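Softplus, ln(1 + exp(x)), is a smooth approximation of the ReLU: both are near zero for large negative inputs and near x for large positive inputs. A small NumPy sketch (illustrative, not from the paper):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softplus(x):
    # ln(1 + exp(x)); log1p improves precision when exp(x) is small
    # (note: the naive exp overflows for very large x)
    return np.log1p(np.exp(x))

x = np.array([-5.0, 0.0, 5.0])
print(relu(x))      # [0. 0. 5.]
print(softplus(x))  # ≈ [0.0067, 0.6931, 5.0067]
```

Unlike ReLU, softplus is differentiable everywhere (its derivative is the sigmoid), at the cost of extra computation; ReLU can be seen as its cheap hard-threshold limit.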
4,385 citations — this paper cites it for background:
...A key technology in deep learning is automatic pre-training, which extracts features of the data while learning [5], [6]....