SOFT ACTOR-CRITIC REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATOR WITH HINDSIGHT EXPERIENCE REPLAY

Li Yu, Tao Yan, Wen-An Zhang and Simon X. Yang

Keywords

Reinforcement learning, maximum entropy, robotic manipulation, hindsight experience replay

Abstract

The key challenges in applying reinforcement learning to complex robotic control tasks are fragile convergence, very high sample complexity, and the need to shape a reward function. In this work, we present a soft actor-critic (SAC) style algorithm, an off-policy actor-critic RL method based on the maximum entropy reinforcement learning framework, in which the actor aims to maximize the expected reward while also maximizing the entropy of the policy. This effectively improves the stability of the algorithm's performance and its robustness to modeling and estimation errors. Moreover, we combine SAC with a transition replay scheme called hindsight experience replay (HER), so that policies can be learned more efficiently from sparse rewards. Finally, the effectiveness of the proposed method is verified on a range of manipulation tasks in a simulated environment.
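For reference, the maximum entropy objective optimized by the SAC actor can be written in its standard form, where alpha is the temperature coefficient weighting the entropy bonus:

    J(\pi) = \sum_t \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]

Below is a minimal sketch of the HER relabeling step under the "final" goal-selection strategy, assuming goal-conditioned transitions and a sparse reward function compute_reward; all names are illustrative and not taken from the paper:

    import copy

    def her_relabel(episode, compute_reward):
        """Relabel an episode with the goal actually achieved at its end
        (HER "final" strategy).  `episode` is a list of dicts with keys:
        obs, action, next_obs, achieved_goal (the goal state reached after
        the transition), and goal.  `compute_reward(achieved, goal)`
        returns the sparse reward, e.g. 0.0 on success and -1.0 otherwise.
        Illustrative sketch only."""
        final_goal = episode[-1]['achieved_goal']
        relabeled = []
        for t in episode:
            t2 = copy.deepcopy(t)
            # Pretend the outcome actually achieved was the intended goal.
            t2['goal'] = final_goal
            t2['reward'] = compute_reward(t2['achieved_goal'], final_goal)
            relabeled.append(t2)
        return relabeled

Storing both the original and the relabeled transitions in the replay buffer turns failed episodes into successes with respect to the substituted goal, so the sparse reward becomes informative without any reward shaping.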
