RESEARCH ON OBSTACLE AVOIDANCE OF ROBOTIC MANIPULATORS BASED ON DDPG AND TRANSFER LEARNING

Shuhuan Wen,∗,∗∗ Wen Long Zhen,∗,∗∗ Tao Wang,∗,∗∗ Jianhua Chen,∗,∗∗ Hak-Keung Lam,∗∗∗ Qian Shi,∗∗∗ and Zekai Li∗,∗∗

References

[1] F. Gao, L. Wang, K. Wang, W. Wu, B. Zhou, L. Han, and S. Shen, Optimal trajectory generation for quadrotor teach-and-repeat, IEEE Robotics and Automation Letters, 4(2), 2019, 1493–1500.
[2] H. Ghariblu, and M. Shahabi, Path planning of complex pipe joints welding with redundant robotic systems, Robotica, 37(6), 2019, 1020–1032.
[3] L. Chang, L. Shan, C. Jiang, and Y. Dai, Reinforcement based mobile robot path planning with improved dynamic window approach in unknown environment, Autonomous Robots, 45, 2021, 51–76.
[4] A. Fabris, L. Parolini, S. Schneider, and A. Cenedese, Use of probabilistic graphical methods for online map validation, Proc. IEEE Intelligent Vehicles Symposium Workshops, Nagoya, 2021, 43–48.
[5] G. Tang, C. Tang, C. Claramunt, X. Hu, and P. Zhou, Geometric A-Star algorithm: An improved A-Star algorithm for AGV path planning in a port environment, IEEE Access, 9, 2021, 59196–59210.
[6] M. Akram, A. Habib, and J.C.R. Alcantud, An optimization study based on Dijkstra algorithm for a network with trapezoidal picture fuzzy numbers, Neural Computing and Applications, 33, 2021, 1329–1342.
[7] E.R. Bachmann, E. Hodgson, C. Hoffbauer, and J. Messinger, Multi-user redirected walking and resetting using artificial potential fields, IEEE Transactions on Visualization and Computer Graphics, 25(5), 2019, 2022–2031.
[8] W. Sun, Y. Wu, and X. Lv, Adaptive neural network control for full-state constrained robotic manipulator with actuator saturation and time-varying delays, IEEE Transactions on Neural Networks and Learning Systems, 33(8), 2022, 3331–3342.
[9] S. Wen, W. Zheng, J. Zhu, X. Li, and S. Chen, Elman fuzzy adaptive control for obstacle avoidance of mobile robots using hybrid force/position incorporation, IEEE Transactions on Systems, Man, and Cybernetics, 42(4), 2011, 603–608.
[10] S. Wen, X. Chen, C. Ma, H.K. Lam, and S. Hua, The Q-learning obstacle avoidance algorithm based on EKF-SLAM for NAO autonomous walking under unknown environments, Robotics and Autonomous Systems, 72, 2015, 29–36.
[11] C.P. Andriotis, and K.G. Papakonstantinou, Managing engineering systems with large state and action spaces through deep reinforcement learning, Reliability Engineering & System Safety, 191, 2019, 106483–106500.
[12] V. Mnih, K. Kavukcuoglu, D. Silver, A.A. Rusu, J. Veness, M.G. Bellemare, A. Graves, et al., Human-level control through deep reinforcement learning, Nature, 518(7540), 2015, 529–533.
[13] K. Azizzadenesheli, E. Brunskill, and A. Anandkumar, Efficient exploration through Bayesian deep Q-networks, Proc. Information Theory and Applications Workshop, San Diego, CA, 2018, 1–9.
[14] T. Schaul, J. Quan, I. Antonoglou, and D. Silver, Prioritized experience replay, arXiv preprint arXiv:1511.05952, 2015, 1–21.
[15] J. Schulman, S. Levine, P. Abbeel, M.I. Jordan, and P. Moritz, Trust region policy optimization, Proc. 32nd Int. Conf. on Machine Learning, Lille, 2015, 1889–1897.
[16] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, Deterministic policy gradient algorithms, Proc. 31st Int. Conf. on Machine Learning, Beijing, 2014, 387–395.
[17] J. Hauswald, M.A. Laurenzano, Y. Zhang, C. Li, A. Rovinski, A. Khurana, R.G. Dreslinski, et al., Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers, Proc. 20th Int. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), New York, NY, March 2015, 223–238.
[18] M. Gutoski, M. Ribeiro, L.T. Hattori, M. Romero, A.E. Lazzaretti, and H.S. Lopes, A comparative study of transfer learning approaches for video anomaly detection, International Journal of Pattern Recognition and Artificial Intelligence, 35(5), 2021, 1–27.
[19] F. Zhuang, Z. Qi, K. Duan, D. Xi, Y. Zhu, H. Zhu, H. Xiong, and Q. He, A comprehensive survey on transfer learning, Proceedings of the IEEE, 109(1), 2021, 43–76.
[20] B. Tekin, S.N. Sinha, and P. Fua, Real-time seamless single shot 6D object pose prediction, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, 292–301.
[21] A. Amiranashvili, A. Dosovitskiy, V. Koltun, and T. Brox, TD or not TD: Analyzing the role of temporal differencing in deep reinforcement learning, arXiv preprint arXiv:1806.01175, 2018, 1–15.
[22] A.T. Miller, S. Knoop, H.I. Christensen, and P.K. Allen, Automatic grasp planning using shape primitives, Proc. IEEE Int. Conf. on Robotics and Automation, Taipei, 2003, 1824–1829.
[23] K. Lee, J. Lee, B. Woo, J.-W. Lee, Y.-J. Lee, and S. Ra, Modeling and control of articulated robot arm with embedded joint actuators, Proc. Int. Conf. on Information and Communication Technology Robotics, Busan, 2018, 1–4.
[24] A.K. Singh, and G.C. Nandi, NAO humanoid robot: Analysis of calibration techniques for robot sketch drawing, Robotics and Autonomous Systems, 79, 2016, 108–121.
[25] S. Wen, J. Chen, S. Wang, H. Zhang, and X. Hu, Path planning of humanoid arm based on deep deterministic policy gradient, Proc. IEEE Int. Conf. on Robotics and Biomimetics (ROBIO), Kuala Lumpur, December 2018, 1755–1760.
