Buqing Nie,∗ Yue Gao,∗∗ Yidong Mei,∗ and Feng Gao∗∗∗


[1] T.T. Mac, C. Copot, D.T. Tran, and R. De Keyser, Heuristic approaches in robot path planning: A survey, Robotics and Autonomous Systems, 86, 2016, 13–28.
[2] E. Plaku, L.E. Kavraki, and M.Y. Vardi, Motion planning with dynamics by a synergistic combination of layers of planning, IEEE Transactions on Robotics, 26(3), 2010, 469–482.
[3] P.E. Hart, N.J. Nilsson, and B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE Transactions on Systems Science and Cybernetics, 4(2), 1968, 100–107.
[4] O. Khatib, Real-time obstacle avoidance for manipulators and mobile robots, in I.J. Cox and G.T. Wilfong (eds.), Autonomous robot vehicles (Berlin: Springer, 1986), 396–404.
[5] V. Mnih, K. Kavukcuoglu, D. Silver, et al., Human-level control through deep reinforcement learning, Nature, 518(7540), 2015, 529.
[6] M. Sadeghzadeh, D. Calvert, and H.A. Abdullah, Autonomous visual servoing of a robot manipulator using reinforcement learning, International Journal of Robotics and Automation, 31(1), 2016, 26–28.
[7] T. Yan, W. Zhang, S.X. Yang, and L. Yu, Soft actor-critic reinforcement learning for robotic manipulator with hindsight experience replay, International Journal of Robotics and Automation, 34(5), 2019, 536–543.
[8] Y. Liu, M. Cong, H. Dong, and D. Liu, Reinforcement learning and EGA-based trajectory planning for dual robots, International Journal of Robotics and Automation, 33(4), 2018, 367–378.
[9] M. Pflueger, A. Agha, and G.S. Sukhatme, Rover-IRL: Inverse reinforcement learning with soft value iteration networks for planetary rover path planning, IEEE Robotics and Automation Letters, 4(2), 2019, 1387–1394.
[10] K. Arulkumaran, M.P. Deisenroth, M. Brundage, and A.A. Bharath, A brief survey of deep reinforcement learning, arXiv:1708.05866, 2017.
[11] A. Tamar, Y. Wu, G. Thomas, S. Levine, and P. Abbeel, Value iteration networks, Advances in Neural Information Processing Systems, Barcelona, Spain, 2016, 2154–2162.
[12] R. Bellman, Dynamic programming, Science, 153(3731), 1966, 34–37.
[13] M. Riedmiller, R. Hafner, T. Lampe, et al., Learning by playing – solving sparse reward tasks from scratch, Int. Conf. on Machine Learning, Stockholm, Sweden, 2018, 4344–4353.
[14] B. Paden, M. Čáp, S.Z. Yong, D. Yershov, and E. Frazzoli, A survey of motion planning and control techniques for self-driving urban vehicles, IEEE Transactions on Intelligent Vehicles, 1(1), 2016, 33–55.
[15] G. Che, L. Liu, and Z. Yu, An improved ant colony optimization algorithm based on particle swarm optimization algorithm for path planning of autonomous underwater vehicle, Journal of Ambient Intelligence and Humanized Computing, 11(8), 2020, 3349–3354.
[16] P.K.Y. Yap, N. Burch, R.C. Holte, and J. Schaeffer, Any-angle path planning for computer games, Seventh Artificial Intelligence and Interactive Digital Entertainment Conf., Palo Alto, CA, USA, 2011.
[17] S. Koenig and M. Likhachev, D* Lite, Eighteenth National Conf. on Artificial Intelligence, CA, USA, 2002, 476–483.
[18] J. Wang, X.-J. Lin, H.-Y. Zhang, G.-D. Lu, Q.-L. Pan, and H. Li, Path planning of manipulator using potential field combined with sphere tree model, International Journal of Robotics and Automation, 35(2), 2020, 148–161.
[19] S.M. LaValle, Rapidly-exploring random trees: A new tool for path planning, Technical Report, Computer Science Department, Iowa State University, Ames, USA, 1998.
[20] Y. Liu, M. Cong, H. Dong, D. Liu, and Y. Du, Time-optimal motion planning for robot manipulators based on elitist genetic algorithm, International Journal of Robotics and Automation, 32(4), 2017, 396–405.
[21] J. Xin, J. Zhong, J. Sheng, P. Li, and Y. Cui, Improved genetic algorithms based on data-driven operators for path planning of unmanned surface vehicle, International Journal of Robotics and Automation, 34(6), 2019, 713–722.
[22] T.P. Lillicrap, J.J. Hunt, A. Pritzel, et al., Continuous control with deep reinforcement learning, arXiv:1509.02971, 2015.
[23] V. Mnih, K. Kavukcuoglu, D. Silver, et al., Playing Atari with deep reinforcement learning, arXiv:1312.5602, 2013.
[24] M. Deisenroth and C.E. Rasmussen, PILCO: A model-based and data-efficient approach to policy search, Proc. of the 28th Int. Conf. on Machine Learning (ICML-11), Bellevue, Washington, 2011, 465–472.
[25] R. McAllister and C.E. Rasmussen, Data-efficient reinforcement learning in continuous state-action Gaussian-POMDPs, Advances in Neural Information Processing Systems, Long Beach, CA, USA, 2017, 2040–2049.
[26] D. Ha and J. Schmidhuber, Recurrent world models facilitate policy evolution, Advances in Neural Information Processing Systems, Montreal, Canada, 2018, 2450–2462.
[27] S. Niu, S. Chen, H. Guo, C. Targonski, M.C. Smith, and J. Kovačević, Generalized value iteration networks: Life beyond lattices, Thirty-Second AAAI Conf. on Artificial Intelligence, New Orleans, LA, USA, 2018.
[28] L. Lee, E. Parisotto, D.S. Chaplot, E. Xing, and R. Salakhutdinov, Gated path planning networks, Int. Conf. on Machine Learning, Stockholm, Sweden, 2018, 2947–2955.
[29] T. Chen and C. Guestrin, XGBoost: A scalable tree boosting system, Proc. of the 22nd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, CA, USA, 2016, 785–794.
[30] P. Abbeel and A.Y. Ng, Apprenticeship learning via inverse reinforcement learning, Proc. of the Twenty-First Int. Conf. on Machine Learning, Alberta, Canada, 2004, 1.
[31] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, WA, USA, 2016, 770–778.
[32] E. Rohmer, S.P. Singh, and M. Freese, V-REP: A versatile and scalable robot simulation framework, 2013 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Tokyo, Japan, 2013, 1321–1326.
