MODEL-FREE ONLINE REINFORCEMENT LEARNING OF A ROBOTIC MANIPULATOR, 136-143.

doi:10.2316/J.2019.201-2931

Jerry Sweafford Jr. and Farbod Fahimi

References

[1] R. Bellman, Dynamic Programming (Princeton, NJ, USA: Princeton University Press, 1957).
[2] W.T. Miller, P.J. Werbos, and R.S. Sutton, Neural Networks for Control (Cambridge, MA, USA: MIT Press, 1990).
[3] D.V. Prokhorov and D.C. Wunsch, Adaptive critic designs, IEEE Transactions on Neural Networks, 8(5), 1997, 997–1007.
[4] D. Liu, X. Xiong, and Y. Zhang, Action-dependent adaptive critic designs, Proc. of Int. Joint Conf. on Neural Networks (IJCNN), Washington, DC, USA, Vol. 2, 2001, 990–995.
[5] P. He and S. Jagannathan, Reinforcement learning-based output feedback control of nonlinear systems with input constraints, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 35(1), 2005, 150–154.
[6] L. Yang, J. Si, K.S. Tsakalis, and A.A. Rodriguez, Direct heuristic dynamic programming for nonlinear tracking control with filtered tracking error, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(6), 2009, 1617–1622.
[7] Y.-J. Liu, L. Tang, S. Tong, C.P. Chen, and D.-J. Li, Reinforcement learning design-based adaptive tracking control with less learning parameters for nonlinear discrete-time MIMO systems, IEEE Transactions on Neural Networks and Learning Systems, 26(1), 2015, 165–176.
[8] Q. Yang and S. Jagannathan, Reinforcement learning controller design for affine nonlinear discrete-time systems using online approximators, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 42(2), 2012, 377–390.
[9] H. Deng, H.-X. Li, and Y.-H. Wu, Feedback-linearization-based neural adaptive control for unknown nonaffine nonlinear discrete-time systems, IEEE Transactions on Neural Networks, 19(9), 2008, 1615–1625.
[10] Q. Yang, J.B. Vance, and S. Jagannathan, Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 38(4), 2008, 994–1001.
[11] Z. Chen, Y. Zhang, and Z. Wang, Output feedback stabilization of a class of uncertain non-affine nonlinear systems in discrete time, Control and Intelligent Systems, 44(2), 2016.
[12] J.-H. Park, S.-H. Huh, S.-H. Kim, S.-J. Seo, and G.-T. Park, Direct adaptive controller for nonaffine nonlinear systems using self-structuring neural networks, IEEE Transactions on Neural Networks, 16(2), 2005, 414–422.
[13] X. Yang, D. Liu, D. Wang, and Q. Wei, Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning, Neural Networks, 55, 2014, 30–41.
[14] Z. Wang, P. Goldsmith, and J. Gu, Adaptive trajectory tracking control for Euler-Lagrange systems with application to robot manipulators, Control and Intelligent Systems, 37(1), 2009, 46–56.
[15] F. Fahimi and S. Praneeth, A universal trajectory tracking controller for mobile robots via model-free online reinforcement learning, Control and Intelligent Systems, 43(1), 2015.
[16] W. Gao, Y. Wang, and A. Homaifa, Discrete-time variable structure control systems, IEEE Transactions on Industrial Electronics, 42(2), 1995, 117–122.
[17] R. Bellman, Dynamic programming and stochastic control processes, Information and Control, 1(3), 1958, 228–239.

Important Links:

DOI: 10.2316/J.2019.201-2931
From Journal (201) Mechatronic Systems and Control - 2019