Q-Learning based Maximum Power Extraction for Wind Energy Conversion System With Variable Wind Speed
Citations
45 citations
Cites background from "Q-Learning based Maximum Power Extr..."
...Thus, manufacturers are persistently working to improve the reliability of WTs, which increases their longevity, reduces breakdowns during operation, and thereby increases total electrical energy production [4], [5]....
[...]
References
37,989 citations
"Q-Learning based Maximum Power Extr..." refers background in this paper
...In reinforcement learning, the focus is on the direct interaction of an individual (agent) with its environment; the agent learns from its own experience [17]....
[...]
2,861 citations
"Q-Learning based Maximum Power Extr..." refers methods in this paper
...For the optimum policy, the action is selected such that the maximum discounted reward is obtained for each state:

$$\pi^{*}(s) = \arg\max_{a} \sum_{s'} P(s, a, s')\, V^{*}(s') \tag{6}$$

For a transition from state $s$ to state $s'$, the value is updated as

$$V^{\pi}(s) = V^{\pi}(s) + \eta \left( r(s) + \gamma V^{\pi}(s') - V^{\pi}(s) \right) \tag{7}$$

The most important breakthrough in reinforcement learning was made by Watkins [18], who proposed a new algorithm called Q-learning and applied it to MDPs. Q-learning is the first reinforcement learning algorithm whose convergence to the optimal policy is proven for decision-making problems involving cumulative cost....
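The two updates quoted above, together with a Q-learning step, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the list-based tables, and the two-state toy MDP are hypothetical and do not come from the paper, and the Q-learning rule shown is the standard tabular form, since the excerpt does not reproduce the paper's own equation.

```python
from typing import List

# P[s][a][s2] is the probability of moving from state s to s2 under action a.
Transition = List[List[List[float]]]


def greedy_policy(P: Transition, V: List[float]) -> List[int]:
    """Eq. (6): pi*(s) = argmax_a sum_{s'} P(s, a, s') * V(s')."""
    policy = []
    for s in range(len(P)):
        # Expected next-state value for each action available in state s.
        scores = [sum(p * v for p, v in zip(P[s][a], V)) for a in range(len(P[s]))]
        policy.append(max(range(len(scores)), key=lambda a: scores[a]))
    return policy


def td0_update(V: List[float], s: int, r: float, s_next: int,
               eta: float, gamma: float) -> None:
    """Eq. (7): V(s) <- V(s) + eta * (r(s) + gamma * V(s') - V(s))."""
    V[s] += eta * (r + gamma * V[s_next] - V[s])


def q_learning_update(Q: List[List[float]], s: int, a: int, r: float,
                      s_next: int, eta: float, gamma: float) -> None:
    """Standard tabular Q-learning step (assumed form, not quoted from the paper):
    Q(s, a) <- Q(s, a) + eta * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    Q[s][a] += eta * (r + gamma * max(Q[s_next]) - Q[s][a])


# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
P_toy: Transition = [
    [[1.0, 0.0], [0.0, 1.0]],  # transitions from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # transitions from state 1
]
V_toy = [0.0, 1.0]
policy_toy = greedy_policy(P_toy, V_toy)  # state 1 is more valuable

td0_update(V_toy, s=0, r=0.5, s_next=1, eta=0.1, gamma=0.9)

Q_toy = [[0.0, 0.0], [0.0, 2.0]]
q_learning_update(Q_toy, s=0, a=1, r=1.0, s_next=1, eta=0.5, gamma=0.9)
```

In this toy setup the greedy policy switches out of state 0 and stays in state 1, because state 1 carries the higher value. The key difference between the two updates is that Eq. (7) evaluates the value of the policy being followed, whereas the Q-learning step bootstraps from the best action in the next state regardless of which action is actually taken next.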
[...]
...[18] C. J. C. H. Watkins, “Learning from delayed rewards,” Ph.D. Dissertation, Dept. Psychol., Cambridge Univ., Cambridge, England, 1989....
[...]