Synthesising Reward Machines for Cooperative Multi-Agent Reinforcement Learning

Varricchione, Giovanni; Alechina, Natasha; Dastani, Mehdi; Logan, Brian

doi:https://doi.org/10.1007/978-3-031-43264-4_21

Files

978-3-031-43264-4_21.pdf (667.52 KB)

Publication date

2023-09-07

Authors

Varricchione, Giovanni

Alechina, Natasha

Dastani, Mehdi

Logan, Brian

Editors

Malvone, Vadim

Murano, Aniello

DOI

https://doi.org/10.1007/978-3-031-43264-4_21

Document Type

Part of book

Collections

Utrecht University Repository

License

taverne

Abstract

Reward machines have recently been proposed as a means of encoding team tasks in cooperative multi-agent reinforcement learning. The resulting multi-agent reward machine is then decomposed into individual reward machines, one for each member of the team, allowing agents to learn in a decentralised manner while still achieving the team task. However, current work assumes the multi-agent reward machine to be given. In this paper, we show how reward machines for team tasks can be synthesised automatically from an Alternating-Time Temporal Logic specification of the desired team behaviour and a high-level abstraction of the agents’ environment. We present results suggesting that our automated approach achieves sample efficiency comparable to, if not better than, that of reward machines generated by hand for multi-agent tasks.

Keywords

multi-agent reinforcement learning, reward machines, automatic synthesis, Taverne

Citation

Varricchione, G, Alechina, N, Dastani, M & Logan, B 2023, Synthesising Reward Machines for Cooperative Multi-Agent Reinforcement Learning. in V Malvone & A Murano (eds), Multi-Agent Systems: 20th European Conference, EUMAS 2023, Naples, Italy, September 14–15, 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14282 LNAI, pp. 328–344. https://doi.org/10.1007/978-3-031-43264-4_21

URI

https://dspace.library.uu.nl/handle/1874/431803
