MODEL-FREE ONLINE REINFORCEMENT LEARNING OF A ROBOTIC MANIPULATOR

Jerry Sweafford, Jr. and Farbod Fahimi

References

  1. [1] Richard Bellman. Dynamic Programming. Princeton University Press, 1957.
  2. [2] W. Thomas Miller, Paul J. Werbos, and Richard S. Sutton. Neural Networks for Control. MIT Press, 1990.
  3. [3] Danil V. Prokhorov and Donald C. Wunsch. Adaptive critic designs. IEEE Transactions on Neural Networks, 8(5):997–1007, 1997.
  4. [4] Derong Liu, Xiaoxu Xiong, and Yi Zhang. Action-dependent adaptive critic designs. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), volume 2, pages 990–995. IEEE, 2001.
  5. [5] Pingan He and Sarangapani Jagannathan. Reinforcement learning-based output feedback control of nonlinear systems with input constraints. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 35(1):150–154, 2005.
  6. [6] Lei Yang, Jennie Si, Konstantinos S. Tsakalis, and Armando A. Rodriguez. Direct heuristic dynamic programming for nonlinear tracking control with filtered tracking error. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(6):1617–1622, 2009.
  7. [7] Yan-Jun Liu, Li Tang, Shaocheng Tong, C. L. Philip Chen, and Dong-Juan Li. Reinforcement learning design-based adaptive tracking control with less learning parameters for nonlinear discrete-time MIMO systems. IEEE Transactions on Neural Networks and Learning Systems, 26(1):165–176, 2015.
  8. [8] Qinmin Yang and Sarangapani Jagannathan. Reinforcement learning controller design for affine nonlinear discrete-time systems using online approximators. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 42(2):377–390, 2012.
  9. [9] Hua Deng, Han-Xiong Li, and Yi-Hu Wu. Feedback-linearization-based neural adaptive control for unknown nonaffine nonlinear discrete-time systems. IEEE Transactions on Neural Networks, 19(9):1615–1625, 2008.
  10. [10] Qinmin Yang, Jonathan Blake Vance, and Sarangapani Jagannathan. Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 38(4):994–1001, 2008.
  11. [11] Zhenfeng Chen, Yun Zhang, and Zhongsheng Wang. Output feedback stabilization of a class of uncertain non-affine nonlinear systems in discrete time. Control and Intelligent Systems, 44(2), 2016.
  12. [12] Jang-Hyun Park, Sung-Hoe Huh, Seong-Hwan Kim, Sam-Jun Seo, and Gwi-Tae Park. Direct adaptive controller for nonaffine nonlinear systems using self-structuring neural networks. IEEE Transactions on Neural Networks, 16(2):414–422, 2005.
  13. [13] Xiong Yang, Derong Liu, Ding Wang, and Qinglai Wei. Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning. Neural Networks, 55:30–41, 2014.
  14. [14] Zheng Wang, P. Goldsmith, and Jason Gu. Adaptive trajectory tracking control for Euler-Lagrange systems with application to robot manipulators. Control and Intelligent Systems, 37(1):46–56, 2009.
  15. [15] Farbod Fahimi and Susheel Praneeth. A universal trajectory tracking controller for mobile robots via model-free online reinforcement learning. Control and Intelligent Systems, 43(1), 2015.
  16. [16] Weibing Gao, Yufu Wang, and Abdollah Homaifa. Discrete-time variable structure control systems. IEEE Transactions on Industrial Electronics, 42(2):117–122, 1995.
  17. [17] Richard Bellman. Dynamic programming and stochastic control processes. Information and Control, 1(3):228–239, 1958.
