On learning soccer strategies

Salustowicz, R.; Wiering, M.A.; Schmidhuber, J.

Files

Wiering_97_onlearningsoccer.pdf (206.33 KB)

Publication date

1997

Authors

Salustowicz, R.

Wiering, M.A.

Schmidhuber, J.

Document Type

Article in proceedings

Collections

Utrecht University Repository

Abstract

We use simulated soccer to study multiagent learning. Each team's players (agents) share an action set and a policy, but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively when a goal is scored. We conduct simulations with varying team sizes and compare two learning algorithms: TD-Q learning with linear neural networks (TD-Q) and Probabilistic Incremental Program Evolution (PIPE). TD-Q is based on evaluation functions (EFs) that map input/action pairs to expected reward, while PIPE searches policy space directly. PIPE uses an adaptive probability distribution to synthesize programs that calculate action probabilities from current inputs. Our results show that TD-Q has difficulty learning appropriate shared EFs. PIPE, however, does not depend on EFs and finds good policies faster and more reliably.
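
As a rough illustration of what an evaluation function mapping input/action pairs to expected reward can look like, the sketch below implements a generic one-step Q-learning update with one linear function per action. It is not the paper's TD-Q implementation: the class name LinearQ, the parameters alpha and gamma, and the one-step update rule are all assumptions made for this example.

```python
class LinearQ:
    """Hypothetical linear evaluation function: Q(x, a) = w[a] . x."""

    def __init__(self, n_inputs, n_actions, alpha=0.1, gamma=0.9):
        # One weight vector per action, all initialised to zero.
        self.w = [[0.0] * n_inputs for _ in range(n_actions)]
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)

    def q_value(self, x, action):
        # Linear estimate of expected reward for taking `action` given inputs x.
        return sum(wi * xi for wi, xi in zip(self.w[action], x))

    def update(self, x, action, reward, x_next, terminal):
        # One-step TD target: the reward alone at episode end, otherwise
        # reward plus the discounted best estimate for the next inputs.
        if terminal:
            target = reward
        else:
            best_next = max(self.q_value(x_next, a) for a in range(len(self.w)))
            target = reward + self.gamma * best_next
        td_error = target - self.q_value(x, action)
        # Gradient step on the weights of the chosen action only.
        self.w[action] = [
            wi + self.alpha * td_error * xi for wi, xi in zip(self.w[action], x)
        ]
        return td_error
```

In a shared-policy setting like the one the abstract describes, every agent on a team would call update on the same LinearQ object with its own position-dependent inputs and the team's collective reward; how the paper actually arranges this is not stated in the abstract.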

URI

https://dspace.library.uu.nl/handle/1874/25434
