REINFORCEMENT LEARNING: APPLICATION AND ADVANCES TOWARDS STABLE CONTROL STRATEGIES

Abhishek Kumar

References

[1] R.C. Dorf and R.H. Bishop, Modern control systems. (Harlow: Pearson, 2011).
[2] M. Fliess and C. Join, Model-free control, International Journal of Control, 86(12), 2013, 2228–2252.
[3] P. Vas, Artificial-intelligence-based electrical machines and drives: Application of fuzzy, neural, fuzzy-neural, and genetic-algorithm-based techniques, 45. (Oxford: Oxford University Press, 1999).
[4] L. Behera and I. Kar, Intelligent systems and control: Principles and applications. (Oxford: Oxford University Press, 2010).
[5] M. Wiering and M. van Otterlo, Reinforcement learning, adaptation, learning, and optimization, 12. (New York: Springer-Verlag, 2012).
[6] J. Kober, J.A. Bagnell, and J. Peters, Reinforcement learning in robotics: A survey, International Journal of Robotics Research, 32(11), 2012, 579–610.
[7] S.M. Shortreed, E. Laber, D.J. Lizotte, T.S. Stroup, J. Pineau, and S.A. Murphy, Informing sequential clinical decision-making through reinforcement learning: An empirical study, Machine Learning, 84, 2011, 109–136.
[8] P. Kormushev, S. Calinon, and D.G. Caldwell, Reinforcement learning in robotics: Applications and real-world challenges, Robotics, 2(3), 2013, 122–148.
[9] R. Sharma, Fuzzy Q learning based UAV autopilot, Proc. Innovative Applications of Computational Intelligence on Power, Energy, and Controls with their Impact on Humanity (CIPECH), Ghaziabad, 2014, 29–33.
[10] D. Han and S.N. Balakrishnan, State-constrained agile missile control with adaptive-critic-based neural networks, IEEE Transactions on Control Systems Technology, 10(4), 2002, 481–489.
[11] A. Kumar and R. Sharma, Fuzzy Lyapunov reinforcement learning for non linear systems, ISA Transactions, 67, 2017, 151–159.
[12] S.N. Balakrishnan, J. Ding, and F.L. Lewis, Issues on stability of ADP feedback controllers for dynamical systems, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 38(4), 2008, 913–917.
[13] S. Bhasin, N. Sharma, P. Patre, and W. Dixon, Asymptotic tracking by a reinforcement learning-based adaptive critic controller, Journal of Control Theory and Applications, 9(3), 2011, 400–409.
[14] L. Tang, Y.-J. Liu, and S. Tong, Adaptive neural control using reinforcement learning for a class of robot manipulator, Neural Computing and Applications, 25(1), 2014, 135–141.
[15] X. Yang, D. Liu, and D. Wang, Reinforcement learning for adaptive optimal control of unknown continuous-time nonlinear systems with input constraints, International Journal of Control, 87(3), 2014, 553–566.
[16] J. Clempner and A. Poznyak, Analysis of best-reply strategies in repeated finite Markov chains games, Proc. 52nd IEEE Conf. on Decision and Control, Firenze, 2013, 568–573.
[17] D. Vrabie and K.G. Vamvoudakis, Optimal adaptive control and differential games by reinforcement learning principles. (Stevenage: IET Press, 2013).
[18] Y. Ouyang, W. He, and X. Li, Reinforcement learning control of a single-link flexible robotic manipulator, IET Control Theory & Applications, 11(9), 2017, 1426–1433.
[19] A. Kumar and R. Sharma, Linguistic Lyapunov reinforcement learning control for robotic manipulators, Neurocomputing, 272, 2018, 84–95.
[20] A. Kumar and R. Sharma, Neural/fuzzy self learning Lyapunov control for non linear systems, International Journal of Information Technology, 14, 2018, 229–242.
[21] R. Sharma, Lyapunov theory based stable Markov game fuzzy control for non-linear systems, Engineering Applications of Artificial Intelligence, 55, 2016, 119–127.
[22] A. Kumar, R. Sharma, and P. Varshney, Lyapunov fuzzy Markov game controller for two link robotic manipulator, Journal of Intelligent & Fuzzy Systems, 34(3), 2018, 1479–1490.
[23] Z. Marvi and B. Kiumarsi, Safe reinforcement learning: A control barrier function optimization approach, International Journal of Robust and Nonlinear Control, 31(6), 2021, 1923–1940.
[24] R. Cheng, G. Orosz, R.M. Murray, and J.W. Burdick, End-to-end safe reinforcement learning through barrier functions for safety-critical continuous control tasks, Proc. AAAI Conf. on Artificial Intelligence, 33, 2019, 3387–3395.
[25] J. Choi, F. Castaneda, C.J. Tomlin, and K. Sreenath, Reinforcement learning for safety-critical control under model uncertainty, using control Lyapunov functions and control barrier functions, arXiv preprint arXiv:2004.07584, 2020.
[26] Z. Cai, H. Cao, W. Lu, L. Zhang, and H. Xiong, Safe multi-agent reinforcement learning through decentralized multiple control barrier functions, arXiv preprint arXiv:2103.12553, 2021.
[27] M. Han, Y. Tian, L. Zhang, J. Wang, and W. Pan, H∞ model-free reinforcement learning with robust stability guarantee, Proc. NeurIPS Workshop on Robot Learning: Control and Interaction in the Real World, Vancouver, 2019, 1–5.
[28] J. Subramanian, A. Sinha, R. Seraj, and A. Mahajan, Approximate information state for approximate planning and reinforcement learning in partially observed systems, Journal of Machine Learning Research, 23, 2022, 1–83.
[29] S.A. Khader, H. Yin, P. Falco, and D. Kragic, Stability-guaranteed reinforcement learning for contact-rich manipulation, IEEE Robotics and Automation Letters, 6(1), 2020, 1–8.
[30] M. Jin and J. Lavaei, Stability-certified reinforcement learning: A control-theoretic perspective, IEEE Access, 8, 2020, 229086–229100.
[31] L. Brunke, M. Greeff, A.W. Hall, Z. Yuan, S. Zhou, J. Panerati, and A.P. Schoellig, Safe learning in robotics: From learning-based control to safe reinforcement learning, Annual Review of Control, Robotics, and Autonomous Systems, 5, 2021, 411–444.
[32] B.R. Kiran, I. Sobh, V. Talpaert, P. Mannion, A.A. Al Sallab, S. Yogamani, and P. Pérez, Deep reinforcement learning for autonomous driving: A survey, IEEE Transactions on Intelligent Transportation Systems, 23(6), 2021, 4909–4926.
[33] A. Kukker and R. Sharma, Genetic algorithm-optimized fuzzy Lyapunov reinforcement learning for nonlinear systems, Arabian Journal for Science and Engineering, 45, 2020, 1629–1638.
[34] A. Kukker and R. Sharma, Stochastic genetic algorithm-assisted fuzzy Q-learning for robotic manipulators, Arabian Journal for Science and Engineering, 46, 2021, 9527–9539.
[35] R.S. Sutton and A.G. Barto, Reinforcement learning: An introduction. (Cambridge: MIT Press, 2018).
[36] W. Wu and Y. Wei, Guiding unmanned aerial vehicle path planning design based on improved ant colony algorithm, Mechatronic Systems and Control, 49, 2021, 48–54.
[37] S. Murali, K. Lokesha, and R. George, Path following and flight characteristics evaluation of trajectory tracking algorithms in UAV navigation, Mechatronic Systems and Control, 50, 2022, 22–27.
[38] G. Shao, Intelligent vehicle control system based on cloud computing and Internet of Things, Mechatronic Systems and Control, 49, 2021, 245–250.
[39] S. Raj, A.K.R. Nandula, A.K. Deb, and K. Cheruvu, Decentralized control for stabilization of coupled pendulums using H∞-based integral sliding mode control, Mechatronic Systems and Control, 48, 2020, 10–16.
[40] S. Raj, Decentralized adaptive control of nonlinear interconnected systems, Mechatronic Systems and Control, 49, 2021, 41–47.
