Policy Control in Multiagent System

A. Damba and S. Watanabe (Japan)


Optimal Control, Multiagent System, Reinforcement Learning, Monte Carlo Approach


This paper presents a method for simulating optimal policies that allow for action planning and optimal control in a non-communicating multiagent system. In this system, homogeneous agents have the same structure and domain but act differently and are situated differently in the world. Lacking information about each other's internal states and observation inputs, agents may be unable to predict how the world will change, which can lead to undesirable predictions in planning and control. To cope with this missing knowledge, agents simulate each other's behavior as part of the environment dynamics and build their own policies based on local data from the simulation episodes. A reinforcement learning method is applied to derive the policy for future actions, with agents repeatedly computing and updating the result towards the goal. The Monte Carlo approach is used to solve the reinforcement learning problem from simulated experience. Since each agent learns to adapt its policy to environment changes, multiagent coordination is expected to emerge at the global level. A multiple-vehicle domain, in which there is no communication among the vehicles and each vehicle's sensing is limited, is studied in simulation.
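The idea of learning a policy from simulated episodes, with the other agent folded into the environment dynamics, can be illustrated with a minimal first-visit Monte Carlo control sketch. This is not the authors' implementation: the one-dimensional track, the reward values, the random-drift model of the other vehicle, and all function names are assumptions chosen only to keep the example short.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)   # 0 = wait, 1 = move one cell forward
GOAL = 5           # hypothetical track length (assumption)
MAX_STEPS = 30     # cap on episode length (assumption)

def simulate_other(pos, rng):
    """Assumed model of the other vehicle: random drift along the track."""
    return max(0, min(GOAL, pos + rng.choice((-1, 0, 1))))

def observe(own, other):
    """Limited sensing: own position and whether the next cell is occupied."""
    return (own, other == own + 1)

def greedy_action(Q, state):
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def run_episode(Q, epsilon, rng):
    """Simulate one episode with an epsilon-greedy policy derived from Q."""
    own, other = 0, 3
    episode = []
    for _ in range(MAX_STEPS):
        state = observe(own, other)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = greedy_action(Q, state)
        # The other vehicle is treated as part of the environment dynamics.
        other = simulate_other(other, rng)
        target = own + action
        if target == other:
            reward = -5.0          # collision: penalty, learner stays put
        else:
            own = target
            reward = 10.0 if own == GOAL else -1.0
        episode.append((state, action, reward))
        if own == GOAL:
            break
    return episode

def first_visit_returns(episode, gamma):
    """Discounted return following the first visit to each (state, action)."""
    returns, G = {}, 0.0
    for state, action, reward in reversed(episode):
        G = gamma * G + reward
        returns[(state, action)] = G   # earliest visit is written last
    return returns

def learn(episodes=2000, epsilon=0.1, gamma=0.95, seed=0):
    """Average first-visit returns over simulated episodes to estimate Q."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(episodes):
        episode = run_episode(Q, epsilon, rng)
        for key, G in first_visit_returns(episode, gamma).items():
            counts[key] += 1
            Q[key] += (G - Q[key]) / counts[key]
    return Q
```

After `Q = learn()`, calling `greedy_action(Q, (0, False))` returns the learned action for the start cell when the next cell is free. The learner never reads the other vehicle's internal state; it only sees the occupancy of the cell ahead, which mirrors the limited-sensing, no-communication setting described above.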
