DATA-EFFICIENT MODEL-BASED REINFORCEMENT LEARNING FOR ROBOT CONTROL

Ming Sun, Yue Gao, Wei Liu, and Shaoyuan Li

References

  1. H. van Hasselt, A. Guez, and D. Silver, Deep reinforcement learning with double Q-learning, Thirtieth AAAI Conf. on Artificial Intelligence, Phoenix, AZ, 2016.
  2. S. Levine, C. Finn, T. Darrell, et al., End-to-end training of deep visuomotor policies, Journal of Machine Learning Research, 17(1), 2016, 1334–1373.
  3. T. Yan, W. Zhang, S.X. Yang, et al., Soft actor-critic reinforcement learning for robotic manipulator with hindsight experience replay, International Journal of Robotics and Automation, 34(5), 2019, 536–543.
  4. M. Sadeghzadeh, D. Calvert, and H.A. Abdullah, Autonomous visual servoing of a robot manipulator using reinforcement learning, International Journal of Robotics and Automation, 31(1), 2016, 26–38.
  5. X.B. Peng, G. Berseth, K.K. Yin, et al., DeepLoco: Dynamic locomotion skills using hierarchical deep reinforcement learning, ACM Transactions on Graphics, 36(4), 2017, 1–13.
  6. Y. Liu, M. Cong, H. Dong, et al., Reinforcement learning and EGA-based trajectory planning for dual robots, International Journal of Robotics and Automation, 33(4), 2018, 367–378.
  7. T.P. Lillicrap, J.J. Hunt, A. Pritzel, et al., Continuous control with deep reinforcement learning, Int. Conf. on Learning Representations, San Juan, Puerto Rico, 2016.
  8. J. Schulman, F. Wolski, P. Dhariwal, et al., Proximal policy optimization algorithms, arXiv:1707.06347, 2017.
  9. L. Pinto, J. Davidson, R. Sukthankar, et al., Robust adversarial reinforcement learning, Proc. of the 34th Int. Conf. on Machine Learning, Sydney, Australia, 2017.
  10. A. Nagabandi, G. Kahn, R.S. Fearing, et al., Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning, Int. Conf. on Robotics and Automation, Brisbane, Australia, 2018.
  11. S. Ross and J.A. Bagnell, Agnostic system identification for model-based reinforcement learning, Int. Conf. on Machine Learning, Edinburgh, Scotland, 2012.
  12. S. Levine and V. Koltun, Guided policy search, Int. Conf. on Machine Learning, Atlanta, GA, 2013.
  13. S. Levine and P. Abbeel, Learning neural network policies with guided policy search under unknown dynamics, Advances in Neural Information Processing Systems, Montreal, Quebec, 2014.
  14. M.P. Deisenroth and C.E. Rasmussen, PILCO: A model-based and data-efficient approach to policy search, Proc. of the 28th Int. Conf. on Machine Learning, Bellevue, WA, 2011.
  15. E. Kaiser, J.N. Kutz, and S.L. Brunton, Sparse identification of nonlinear dynamics for model predictive control in the low-data limit, Proceedings of the Royal Society A, 474(2219), 2018, 1–25.
  16. H. Schaeffer, Learning partial differential equations via data discovery and sparse optimization, Proceedings of the Royal Society A, 473(2197), 2017, 1–20.
  17. J.C. Loiseau, B.R. Noack, and S.L. Brunton, Sparse reduced-order modelling: sensor-based dynamics to full-state estimation, Journal of Fluid Mechanics, 844, 2018, 459–490.
  18. R.S. Sutton, Dyna, an integrated architecture for learning, planning, and reacting, ACM SIGART Bulletin, 2(4), 1991, 160–163.
  19. B. Bischoff, D. Nguyen-Tuong, H. van Hoof, et al., Policy search for learning robot control using sparse data, Int. Conf. on Robotics and Automation, Hong Kong, China, 2014.
  20. D. Bruder, C.D. Remy, and R. Vasudevan, Nonlinear system identification of soft robot dynamics using Koopman operator theory, Int. Conf. on Robotics and Automation, Montreal, QC, 2019.
  21. Z. Gai, D. Liu, F. Chang, et al., Abnormal crowd behaviour detection based on deep learning and sparse representation, International Journal of Robotics and Automation, 35(4), 2020, 322–331.
  22. R. Chinniah and S.S. Rani, A sparse based rain removal algorithm for image sequences, International Journal of Robotics and Automation, 29(4), 2014, 441–447.
  23. N. Kalouptsidis, G. Mileounis, B. Babadi, et al., Adaptive algorithms for sparse system identification, Signal Processing, 91(8), 2011, 1910–1919.
  24. Y. Wang, X. Yan, M. Jiang, et al., 3D non-rigid structure from motion based on sparse approximation in trajectory space, International Journal of Robotics and Automation, 33(2), 2018, 111–117.
  25. S.L. Brunton, J.L. Proctor, and J.N. Kutz, Discovering governing equations from data: Sparse identification of nonlinear dynamical systems, Proceedings of the National Academy of Sciences of the United States of America, 113(15), 2016, 3932–3937.
  26. S.L. Brunton, J.L. Proctor, and J.N. Kutz, Sparse identification of nonlinear dynamics with control (SINDYc), IFAC Symp. on Nonlinear Control Systems, CA, USA, 2016.
  27. C. Devin, A. Gupta, T. Darrell, et al., Learning modular neural network policies for multi-task and multi-robot transfer, IEEE Int. Conf. on Robotics and Automation, Singapore, 2017.
