S.P.M. Choi, D.Y. Yeung, and N.L. Zhang (PRC)
Reinforcement Learning, Non-stationary Environment
This paper proposes a novel algorithm for a class of non-stationary reinforcement learning problems in which the environmental changes are rare and finite. By discarding corrupted models and combining similar ones, the proposed algorithm maintains a collection of frequently encountered environment models and enables effective adaptation when a similar environment recurs. The algorithm has been empirically compared with the finite window approach, a widely used method for non-stationary RL problems. Results show that our algorithm consistently outperforms the finite window approach across various empirical setups.
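The abstract describes the model-pool idea only at a high level. The following is a minimal, purely illustrative sketch of such a mechanism, assuming tabular transition-count models, an external change-detection signal, and a simple distance threshold for deciding when two models are "similar"; the names (ModelPool, sim_threshold, try_recall) and all details are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only (not the paper's algorithm): a pool of tabular
# transition-count models for an environment whose dynamics occasionally
# switch among a small, recurring set of modes.
import numpy as np


class ModelPool:
    """Maintain frequently encountered environment models; archive or merge
    them on change, and reuse a stored model when a similar one recurs."""

    def __init__(self, n_states, n_actions, sim_threshold=0.1):
        self.n_states, self.n_actions = n_states, n_actions
        self.sim_threshold = sim_threshold   # assumed similarity cutoff
        self.models = []                     # stored transition-count tables
        self.active = self._new_model()

    def _new_model(self):
        # Laplace-smoothed transition counts: counts[s, a, s']
        return np.ones((self.n_states, self.n_actions, self.n_states))

    def _probs(self, counts):
        return counts / counts.sum(axis=2, keepdims=True)

    def _distance(self, a, b):
        # Mean absolute difference between two models' transition probabilities.
        return np.abs(self._probs(a) - self._probs(b)).mean()

    def update(self, s, a, s_next):
        # Record one observed transition in the currently active model.
        self.active[s, a, s_next] += 1

    def on_change_detected(self):
        # When the active model stops predicting well, archive it (merging
        # with a similar stored model if one exists) and start afresh.
        self._store(self.active)
        self.active = self._new_model()

    def try_recall(self):
        # After gathering some experience in the new regime, adopt the closest
        # stored model if it is similar enough to what has been observed.
        if not self.models:
            return False
        dists = [self._distance(m, self.active) for m in self.models]
        best = int(np.argmin(dists))
        if dists[best] < self.sim_threshold:
            self.active = self.models[best].copy()
            return True
        return False

    def _store(self, model):
        # Merge into an existing similar model, or keep it as a new entry.
        for i, m in enumerate(self.models):
            if self._distance(m, model) < self.sim_threshold:
                self.models[i] = m + model - 1.0   # combine counts (minus shared prior)
                return
        self.models.append(model.copy())
```

Merging similar models keeps the pool small, while archiving on detected change is what lets a previously learned model be reused quickly when a familiar environment returns; both choices here are assumptions made for the sake of the sketch.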