A LARGE-SCALE PATH PLANNING ALGORITHM FOR UNDERWATER ROBOTS BASED ON DEEP REINFORCEMENT LEARNING, 204-210.

doi:10.2316/J.2024.206-1035

Wenhui Wang, Leqing Li, Fumeng Ye, Yumin Peng, and Yiming Ma

From Journal (206) International Journal of Robotics and Automation - 2024