J. Li and T. Duckett (Sweden)
Q-learning, Resource allocating network, Robot behaviour
The use of artificial neural networks for approximating value functions in reinforcement learning is common practice, but usually requires considerable effort in designing the network architecture and tuning the network parameters. In this paper we present a simple learning system that uses Q-learning with a resource allocating network (RAN) for behaviour learning in mobile robotics. The resource allocating network serves as a function approximator that dynamically represents the continuous sensory space, thereby acquiring a sensorimotor mapping that generalizes. Q-learning is used to learn the control policy in an ‘off-policy’ fashion, which enables a human operator to guide the initial learning process and thus speeds up the reinforcement learning. We illustrate our approach using a PeopleBot robot to acquire a wall-following behaviour, and discuss some observations on the convergence and online training of our learning algorithm in the experiments.
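The combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no equations, so the Gaussian units, the fixed unit width, the two novelty thresholds, the learning rate, and the names `RanQ` and `q_learning_step` are all assumptions made for illustration. The sketch follows the general idea of a resource allocating network (allocate a new unit when an input is both far from existing units and poorly predicted, otherwise adjust existing weights) together with the standard Q-learning target.

```python
import math


class RanQ:
    """Hypothetical RAN-style Q-function: Gaussian units with per-action weights."""

    def __init__(self, n_actions, dist_thresh=0.5, err_thresh=0.1, width=0.5, lr=0.1):
        self.n_actions = n_actions
        self.dist_thresh = dist_thresh  # assumed novelty threshold on input distance
        self.err_thresh = err_thresh    # assumed novelty threshold on prediction error
        self.width = width              # assumed fixed width for every unit
        self.lr = lr                    # assumed learning rate for weight updates
        self.units = []                 # each unit is (center, per-action weights)

    def _activations(self, s):
        # Gaussian response of every unit to the sensory input s.
        return [math.exp(-math.dist(s, c) ** 2 / self.width ** 2) for c, _ in self.units]

    def q_values(self, s):
        # Q(s, a) is the activation-weighted sum of each unit's weight for action a.
        q = [0.0] * self.n_actions
        for phi, (_, w) in zip(self._activations(s), self.units):
            for a in range(self.n_actions):
                q[a] += phi * w[a]
        return q

    def update(self, s, a, target):
        err = target - self.q_values(s)[a]
        nearest = min((math.dist(s, c) for c, _ in self.units), default=math.inf)
        if abs(err) > self.err_thresh and nearest > self.dist_thresh:
            # Novel input: allocate a unit centred on s that absorbs the error.
            w = [0.0] * self.n_actions
            w[a] = err
            self.units.append((list(s), w))
        else:
            # Familiar input: gradient step on the existing output weights.
            for phi, (_, w) in zip(self._activations(s), self.units):
                w[a] += self.lr * err * phi


def q_learning_step(q, s, a, r, s_next, gamma=0.9):
    """One off-policy update; the action a may come from any source."""
    target = r + gamma * max(q.q_values(s_next))
    q.update(s, a, target)
    return target


q = RanQ(n_actions=2)
# The action here could equally be one chosen by a human operator.
q_learning_step(q, [0.0, 0.0], a=1, r=1.0, s_next=[1.0, 1.0])
print(len(q.units), q.q_values([0.0, 0.0]))
```

Because the target uses the maximum over next-state values rather than the action actually taken next, the update does not depend on who selected the action, which is what allows operator-guided experience to be used for learning in a sketch like this one.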