Fast Online Q(lambda)

Publication date

1998

Authors

Wiering, M.A.
Schmidhuber, J.

Document Type

Article

Abstract

Q(lambda)-learning uses TD(lambda) methods to accelerate Q-learning. The update complexity of previous online Q(lambda) implementations based on lookup tables is bounded by the size of the state-action space. The update complexity of our faster algorithm is bounded by the number of actions. The method is based on the observation that Q-value updates may be postponed until they are needed.
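
The postponement idea can be illustrated with a minimal Python sketch. This is not the authors' exact algorithm: it assumes a simplified tabular Q(lambda) with one TD error per step, replacing traces, a constant learning rate, and gamma * lambda > 0. All class, function, and variable names are hypothetical. The sketch also omits the periodic rescaling that a long run would need to keep the running decay product from underflowing. An eager reference version that touches every state-action pair on every step is included for comparison.

```python
class LazyQLambda:
    """Tabular Q(lambda) sketch with postponed (lazy) trace updates.

    Assumptions (not from the source): replacing traces, a single TD error
    per step, constant alpha, and gamma * lam > 0.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, lam=0.8):
        assert gamma * lam > 0.0, "the scaled trace divides by (gamma*lam)^t"
        self.alpha, self.gamma, self.decay = alpha, gamma, gamma * lam
        self.n_states, self.n_actions = n_states, n_actions
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        # Trace divided by the global decay product, so it never needs decaying.
        self.scaled_trace = [[0.0] * n_actions for _ in range(n_states)]
        # Value of the global error sum when each pair was last brought up to date.
        self.seen = [[0.0] * n_actions for _ in range(n_states)]
        self.phi = 1.0    # global decay product (gamma*lam)^t
        self.total = 0.0  # global sum of delta_t * (gamma*lam)^t

    def sync(self, s, a):
        """Apply all postponed updates to one pair in O(1)."""
        pending = self.total - self.seen[s][a]
        self.q[s][a] += self.alpha * self.scaled_trace[s][a] * pending
        self.seen[s][a] = self.total

    def value(self, s, a):
        self.sync(s, a)
        return self.q[s][a]

    def step(self, s, a, r, s2):
        """Process one transition; only the actions of s2 and (s, a) are touched."""
        best_next = max(self.value(s2, b) for b in range(self.n_actions))
        delta = r + self.gamma * best_next - self.value(s, a)
        self.phi *= self.decay                    # decays every trace implicitly
        self.scaled_trace[s][a] = 1.0 / self.phi  # replacing trace: e(s, a) = 1
        self.total += delta * self.phi            # record the error for later syncs

    def all_values(self):
        """Bring the whole table up to date (O(|S||A|); for inspection only)."""
        return [[self.value(s, a) for a in range(self.n_actions)]
                for s in range(self.n_states)]


def eager_q_lambda(transitions, n_states, n_actions, alpha=0.1, gamma=0.9, lam=0.8):
    """Reference version: updates every state-action pair on every step."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    e = [[0.0] * n_actions for _ in range(n_states)]
    for s, a, r, s2 in transitions:
        delta = r + gamma * max(q[s2]) - q[s][a]
        for x in range(n_states):
            for y in range(n_actions):
                e[x][y] *= gamma * lam
        e[s][a] = 1.0
        for x in range(n_states):
            for y in range(n_actions):
                q[x][y] += alpha * delta * e[x][y]
    return q
```

Under these assumptions, the two versions should agree up to floating-point rounding on the same sequence of transitions, while each call to `step` in the lazy version costs time proportional to the number of actions rather than to the size of the table.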

Keywords

Reinforcement learning, Q-learning, TD(lambda), online Q(lambda), lazy learning

Citation