RESEARCH ON OBSTACLE AVOIDANCE OF ROBOTIC MANIPULATORS BASED ON DDPG AND TRANSFER LEARNING, 136–147.

Shuhuan Wen, Wen Long Zhen, Tao Wang, Jianhua Chen, Hak-Keung Lam, Qian Shi, and Zekai Li

References

[1] F. Gao, L. Wang, K. Wang, W. Wu, B. Zhou, L. Han, and S. Shen, Optimal trajectory generation for quadrotor teach-and-repeat, IEEE Robotics and Automation Letters, 4(2), 2019, 1493–1500.
[2] H. Ghariblu and M. Shahabi, Path planning of complex pipe joints welding with redundant robotic systems, Robotica, 37(6), 2019, 1020–1032.
[3] L. Chang, L. Shan, C. Jiang, and Y. Dai, Reinforcement based mobile robot path planning with improved dynamic window approach in unknown environment, Autonomous Robots, 45, 2021, 51–76.
[4] A. Fabris, L. Parolini, S. Schneider, and A. Cenedese, Use of probabilistic graphical methods for online map validation, Proc. IEEE Intelligent Vehicles Symposium Workshops, Nagoya, 2021, 43–48.
[5] G. Tang, C. Tang, C. Claramunt, X. Hu, and P. Zhou, Geometric A-Star algorithm: An improved A-Star algorithm for AGV path planning in a port environment, IEEE Access, 9, 2021, 59196–59210.
[6] M. Akram, A. Habib, and J.C.R. Alcantud, An optimization study based on Dijkstra algorithm for a network with trapezoidal picture fuzzy numbers, Neural Computing and Applications, 33, 2021, 1329–1342.
[7] E.R. Bachmann, E. Hodgson, C. Hoffbauer, and J. Messinger, Multi-user redirected walking and resetting using artificial potential fields, IEEE Transactions on Visualization and Computer Graphics, 25(5), 2019, 2022–2031.
[8] W. Sun, Y. Wu, and X. Lv, Adaptive neural network control for full-state constrained robotic manipulator with actuator saturation and time-varying delays, IEEE Transactions on Neural Networks and Learning Systems, 33(8), 2022, 3331–3342.
[9] S. Wen, W. Zheng, J. Zhu, X. Li, and S. Chen, Elman fuzzy adaptive control for obstacle avoidance of mobile robots using hybrid force/position incorporation, IEEE Transactions on Systems, Man, and Cybernetics, 42(4), 2011, 603–608.
[10] S. Wen, X. Chen, C. Ma, H.K. Lam, and S. Hua, The Q-learning obstacle avoidance algorithm based on EKF-SLAM for NAO autonomous walking under unknown environments, Robotics and Autonomous Systems, 72, 2015, 29–36.
[11] C.P. Andriotis and K.G. Papakonstantinou, Managing engineering systems with large state and action spaces through deep reinforcement learning, Reliability Engineering & System Safety, 191, 2019, 106483–106500.
[12] V. Mnih, K. Kavukcuoglu, D. Silver, A.A. Rusu, J. Veness, M.G. Bellemare, A. Graves, et al., Human-level control through deep reinforcement learning, Nature, 518(7540), 2015, 529–533.
[13] K. Azizzadenesheli, E. Brunskill, and A. Anandkumar, Efficient exploration through Bayesian deep Q-networks, Proc. Information Theory and Applications Workshop, San Diego, CA, 2018, 1–9.
[14] T. Schaul, J. Quan, I. Antonoglou, and D. Silver, Prioritized experience replay, arXiv preprint arXiv:1511.05952, 2015, 1–21.
[15] J. Schulman, S. Levine, P. Abbeel, M.I. Jordan, and P. Moritz, Trust region policy optimization, Proc. 32nd Int. Conf. on Machine Learning, Lille, 2015, 1889–1897.
[16] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, Deterministic policy gradient algorithms, Proc. 31st Int. Conf. on Machine Learning, Beijing, 2014, 387–395.
[17] J. Hauswald, M.A. Laurenzano, Y. Zhang, C. Li, A. Rovinski, A. Khurana, R.G. Dreslinski, et al., Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers, Proc. 20th Int. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), New York, NY, March 2015, 223–238.
[18] M. Gutoski, M. Ribeiro, L.T. Hattori, M. Romero, A.E. Lazzaretti, and H.S. Lopes, A comparative study of transfer learning approaches for video anomaly detection, International Journal of Pattern Recognition and Artificial Intelligence, 35(5), 2021, 1–27.
[19] F. Zhuang, Z. Qi, K. Duan, D. Xi, Y. Zhu, H. Zhu, H. Xiong, and Q. He, A comprehensive survey on transfer learning, Proceedings of the IEEE, 109(1), 2021, 43–76.
[20] B. Tekin, S.N. Sinha, and P. Fua, Real-time seamless single shot 6D object pose prediction, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, 292–301.
[21] A. Amiranashvili, A. Dosovitskiy, V. Koltun, and T. Brox, TD or not TD: Analyzing the role of temporal differencing in deep reinforcement learning, arXiv preprint arXiv:1806.01175, 2018, 1–15.
[22] A.T. Miller, S. Knoop, H.I. Christensen, and P.K. Allen, Automatic grasp planning using shape primitives, Proc. IEEE Int. Conf. on Robotics and Automation, Taipei, 2003, 1824–1829.
[23] K. Lee, J. Lee, B. Woo, J.-W. Lee, Y.-J. Lee, and S. Ra, Modeling and control of articulated robot arm with embedded joint actuators, Proc. Int. Conf. on Information and Communication Technology Robotics, Busan, 2018, 1–4.
[24] A.K. Singh and G.C. Nandi, NAO humanoid robot: Analysis of calibration techniques for robot sketch drawing, Robotics and Autonomous Systems, 79, 2016, 108–121.
[25] S. Wen, J. Chen, S. Wang, H. Zhang, and X. Hu, Path planning of humanoid arm based on deep deterministic policy gradient, Proc. IEEE Int. Conf. on Robotics and Biomimetics (ROBIO), Kuala Lumpur, December 2018, 1755–1760.
