Speeding Up Q(lambda)-learning

Wiering, M.A.; Schmidhuber, J.

Files

Wiering_98_speedingup.pdf (1.6 MB)

Publication date

1998

Authors

Wiering, M.A.

Schmidhuber, J.

Document Type

Article in proceedings

Collections

Utrecht University Repository

Abstract

Q(lambda)-learning uses TD(lambda) methods to accelerate Q-learning. The worst-case complexity of a single update step in previous online Q(lambda) implementations based on lookup tables is bounded by the size of the state-action space. Our faster algorithm's worst-case complexity is bounded by the number of actions. The algorithm is based on the observation that Q-value updates may be postponed until they are needed.
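
The postponement idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm; it only shows, under simplifying assumptions, why deferring trace-based updates gives the same table as updating every entry on every step. The assumptions are: TD errors are supplied as fixed inputs (in real Q(lambda) they depend on the current Q-values, which is why an entry must be brought up to date before it is read), traces accumulate and are never cut after exploratory actions, and the global decay factor is never rescaled, so the sketch suits short trajectories only. All names (`eager_updates`, `LazyTable`, `max_gap`) are hypothetical.

```python
import random


def eager_updates(steps, pairs, alpha, decay):
    """Baseline: touch every state-action pair on every step."""
    q = {p: 0.0 for p in pairs}
    trace = {p: 0.0 for p in pairs}
    for visited, td_error in steps:
        trace[visited] += 1.0              # accumulating eligibility trace
        for p in pairs:                    # cost grows with the table size
            q[p] += alpha * td_error * trace[p]
            trace[p] *= decay              # decay = gamma * lambda
    return q


class LazyTable:
    """Postpone each pair's update until that pair is needed (assumed design)."""

    def __init__(self, pairs, alpha, decay):
        self.q = {p: 0.0 for p in pairs}
        self.scaled_trace = {p: 0.0 for p in pairs}  # trace divided by phi
        self.seen_delta = {p: 0.0 for p in pairs}    # global_delta at last sync
        self.alpha = alpha
        self.decay = decay
        self.phi = 1.0            # decay ** t, shared by all pairs
        self.global_delta = 0.0   # running sum of td_error * phi

    def sync(self, pair):
        """Apply all updates this pair has missed since its last sync."""
        missed = self.global_delta - self.seen_delta[pair]
        self.q[pair] += self.alpha * missed * self.scaled_trace[pair]
        self.seen_delta[pair] = self.global_delta

    def step(self, visited, td_error):
        """Constant work per step: only the visited pair is touched."""
        self.sync(visited)
        self.scaled_trace[visited] += 1.0 / self.phi
        self.global_delta += td_error * self.phi
        self.phi *= self.decay

    def flush(self):
        """Bring every pair up to date, e.g. at the end of a run."""
        for pair in self.q:
            self.sync(pair)
        return self.q


def max_gap(num_steps=25, seed=0):
    """Largest difference between eager and lazy tables on random input."""
    rng = random.Random(seed)
    pairs = [(s, a) for s in range(4) for a in range(2)]
    steps = [(rng.choice(pairs), rng.uniform(-1.0, 1.0))
             for _ in range(num_steps)]
    alpha, decay = 0.1, 0.95 * 0.8
    eager = eager_updates(steps, pairs, alpha, decay)
    lazy = LazyTable(pairs, alpha, decay)
    for visited, td_error in steps:
        lazy.step(visited, td_error)
    flushed = lazy.flush()
    return max(abs(eager[p] - flushed[p]) for p in pairs)
```

Calling `max_gap()` compares the two tables after a final flush; the difference should be at the level of floating-point rounding. In a full agent, the entries for all actions of the current state would be synced before selecting an action, which is consistent with the per-step bound by the number of actions stated in the abstract.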

Keywords

Reinforcement learning, Q-learning, TD(lambda), online Q(lambda), lazy learning

URI

https://dspace.library.uu.nl/handle/1874/25457
