Buqing Nie,∗ Yue Gao,∗∗ Yidong Mei,∗ and Feng Gao∗∗∗


  [1] T.T. Mac, C. Copot, D.T. Tran, and R. De Keyser, Heuristic approaches in robot path planning: A survey, Robotics and Autonomous Systems, 86, 2016, 13–28.
  [2] E. Plaku, L.E. Kavraki, and M.Y. Vardi, Motion planning with dynamics by a synergistic combination of layers of planning, IEEE Transactions on Robotics, 26(3), 2010, 469–482.
  [3] P.E. Hart, N.J. Nilsson, and B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE Transactions on Systems Science and Cybernetics, 4(2), 1968, 100–107.
  [4] O. Khatib, Real-time obstacle avoidance for manipulators and mobile robots, in I.J. Cox and G.T. Wilfong (eds.), Autonomous robot vehicles (Berlin: Springer, 1986), 396–404.
  [5] V. Mnih, K. Kavukcuoglu, D. Silver, et al., Human-level control through deep reinforcement learning, Nature, 518(7540), 2015, 529.
  [6] M. Sadeghzadeh, D. Calvert, and H.A. Abdullah, Autonomous visual servoing of a robot manipulator using reinforcement learning, International Journal of Robotics and Automation, 31(1), 2016, 26–28.
  [7] T. Yan, W. Zhang, S.X. Yang, and L. Yu, Soft actor-critic reinforcement learning for robotic manipulator with hindsight experience replay, International Journal of Robotics and Automation, 34(5), 2019, 536–543.
  [8] Y. Liu, M. Cong, H. Dong, and D. Liu, Reinforcement learning and EGA-based trajectory planning for dual robots, International Journal of Robotics and Automation, 33(4), 2018, 367–378.
  [9] M. Pflueger, A. Agha, and G.S. Sukhatme, Rover-IRL: Inverse reinforcement learning with soft value iteration networks for planetary rover path planning, IEEE Robotics and Automation Letters, 4(2), 2019, 1387–1394.
  [10] K. Arulkumaran, M.P. Deisenroth, M. Brundage, and A.A. Bharath, A brief survey of deep reinforcement learning, arXiv:1708.05866, 2017.
  [11] A. Tamar, Y. Wu, G. Thomas, S. Levine, and P. Abbeel, Value iteration networks, Advances in Neural Information Processing Systems, Barcelona, Spain, 2016, 2154–2162.
  [12] R. Bellman, Dynamic programming, Science, 153(3731), 1966, 34–37.
  [13] M. Riedmiller, R. Hafner, T. Lampe, et al., Learning by playing - solving sparse reward tasks from scratch, Int. Conf. on Machine Learning, Stockholm, Sweden, 2018, 4344–4353.
  [14] B. Paden, M. Čáp, S.Z. Yong, D. Yershov, and E. Frazzoli, A survey of motion planning and control techniques for self-driving urban vehicles, IEEE Transactions on Intelligent Vehicles, 1(1), 2016, 33–55.
  [15] G. Che, L. Liu, and Z. Yu, An improved ant colony optimization algorithm based on particle swarm optimization algorithm for path planning of autonomous underwater vehicle, Journal of Ambient Intelligence and Humanized Computing, 11(8), 2020, 3349–3354.
  [16] P.K.Y. Yap, N. Burch, R.C. Holte, and J. Schaeffer, Any-angle path planning for computer games, Seventh Artificial Intelligence and Interactive Digital Entertainment Conf., Palo Alto, CA, USA, 2011.
  [17] S. Koenig and M. Likhachev, D* Lite, Eighteenth National Conf. on Artificial Intelligence, CA, USA, 2002, 476–483.
  [18] J. Wang, X.-J. Lin, H.-Y. Zhang, G.-D. Lu, Q.-L. Pan, and H. Li, Path planning of manipulator using potential field combined with sphere tree model, International Journal of Robotics and Automation, 35(2), 2020, 148–161.
  [19] S.M. LaValle, Rapidly-exploring random trees: A new tool for path planning, Technical Report, Computer Science Department, Iowa State University, Ames, USA, 1998.
  [20] Y. Liu, M. Cong, H. Dong, D. Liu, and Y. Du, Time-optimal motion planning for robot manipulators based on elitist genetic algorithm, International Journal of Robotics and Automation, 32(4), 2017, 396–405.
  [21] J. Xin, J. Zhong, J. Sheng, P. Li, and Y. Cui, Improved genetic algorithms based on data-driven operators for path planning of unmanned surface vehicle, International Journal of Robotics and Automation, 34(6), 2019, 713–722.
  [22] T.P. Lillicrap, J.J. Hunt, A. Pritzel, et al., Continuous control with deep reinforcement learning, arXiv:1509.02971, 2015.
  [23] V. Mnih, K. Kavukcuoglu, D. Silver, et al., Playing Atari with deep reinforcement learning, arXiv:1312.5602, 2013.
  [24] M. Deisenroth and C.E. Rasmussen, PILCO: A model-based and data-efficient approach to policy search, Proc. of the 28th Int. Conf. on Machine Learning (ICML-11), Bellevue, Washington, 2011, 465–472.
  [25] R. McAllister and C.E. Rasmussen, Data-efficient reinforcement learning in continuous state-action Gaussian-POMDPs, Advances in Neural Information Processing Systems, Long Beach, CA, USA, 2017, 2040–2049.
  [26] D. Ha and J. Schmidhuber, Recurrent world models facilitate policy evolution, Advances in Neural Information Processing Systems, Montreal, Canada, 2018, 2450–2462.
  [27] S. Niu, S. Chen, H. Guo, C. Targonski, M.C. Smith, and J. Kovačević, Generalized value iteration networks: Life beyond lattices, Thirty-Second AAAI Conf. on Artificial Intelligence, New Orleans, LA, USA, 2018.
  [28] L. Lee, E. Parisotto, D.S. Chaplot, E. Xing, and R. Salakhutdinov, Gated path planning networks, Int. Conf. on Machine Learning, Stockholm, Sweden, 2018, 2947–2955.
  [29] T. Chen and C. Guestrin, XGBoost: A scalable tree boosting system, Proc. of the 22nd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, CA, USA, 2016, 785–794.
  [30] P. Abbeel and A.Y. Ng, Apprenticeship learning via inverse reinforcement learning, Proc. of the Twenty-First Int. Conf. on Machine Learning, Alberta, Canada, 2004, 1.
  [31] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, WA, USA, 2016, 770–778.
  [32] E. Rohmer, S.P. Singh, and M. Freese, V-REP: A versatile and scalable robot simulation framework, 2013 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Tokyo, Japan, 2013, 1321–1326.
