Model-Based Sparse Communication in Multi-Agent Reinforcement Learning

Publication date

2023

Authors

Han, Shuai (ISNI 0000000523493781)
Dastani, Mehdi (ISNI 0000000043464658)
Wang, Shihan (ISNI 0000000492960219)

Document Type

Part of book
Open Access

License

Taverne

Abstract

Learning to communicate efficiently is central to multi-agent reinforcement learning (MARL). Existing methods often require agents to exchange messages intensively, which overloads communication channels and leads to high communication overhead. Only a few methods target learning sparse communication, but they allow only limited information to be shared, which reduces the efficiency of policy learning. In this work, we propose model-based communication (MBC), a learning framework with a decentralized communication scheduling process. The MBC framework enables multiple agents to make decisions with sparse communication. In particular, it introduces a model-based message estimator that estimates the up-to-date global messages from past local data. A decentralized message scheduling mechanism is also proposed to determine, based on this estimate, whether a message should be sent. We evaluated our method in a variety of mixed cooperative-competitive environments. The experimental results show that MBC achieves better performance and lower channel overhead than state-of-the-art baselines.
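
The scheduling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean error measure, the fixed threshold, and the hold-last-value estimator below are all assumptions made for illustration (the paper's estimator is a learned model). The sketch only shows the general pattern in which an agent sends a message when the estimate that others could form from past data drifts too far from its actual message.

```python
import math


def estimate_message(last_sent):
    # Hypothetical stand-in for a learned message estimator:
    # assume the message has not changed since it was last sent.
    return list(last_sent)


def should_send(true_msg, estimated_msg, threshold):
    # Send only when the estimation error exceeds the threshold
    # (Euclidean distance is an assumed error measure).
    error = math.sqrt(sum((t - e) ** 2 for t, e in zip(true_msg, estimated_msg)))
    return error > threshold


def schedule(messages, threshold):
    # Return the time steps at which a message is actually transmitted.
    last_sent = messages[0]
    sent_steps = [0]  # the first message is always sent
    for step, msg in enumerate(messages[1:], start=1):
        estimate = estimate_message(last_sent)
        if should_send(msg, estimate, threshold):
            last_sent = msg
            sent_steps.append(step)
    return sent_steps


if __name__ == "__main__":
    msgs = [[0.0], [0.1], [0.2], [1.0], [1.05], [2.5]]
    print(schedule(msgs, threshold=0.5))
```

In this toy run only three of the six messages are transmitted; a lower threshold trades more channel usage for more accurate estimates on the receiving side.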

Keywords

Communication Learning, Message Scheduling, Multi-Agent Reinforcement Learning, Multi-Agent System, Taverne, Software, Artificial Intelligence, Control and Systems Engineering

Citation

Han, S, Dastani, M & Wang, S 2023, Model-Based Sparse Communication in Multi-Agent Reinforcement Learning. in Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems. vol. 2023-May, Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), pp. 439–447. https://doi.org/10.5555/3545946.3598669