Fast Online Q(lambda)
Publication date
1998
Authors
Wiering, M.A.
Schmidhuber, J.
Document Type
Article
Abstract
Q(lambda)-learning uses TD(lambda) methods to accelerate Q-learning. The update complexity of previous online Q(lambda) implementations based on lookup tables is bounded by the size of the state-action space. Our faster algorithm's update complexity is bounded by the number of actions. The method is based on the observation that Q-value updates may be postponed until they are needed.
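The postponement idea can be illustrated with a minimal Python sketch. It is not the algorithm as published: it assumes accumulating eligibility traces, a single decay factor (for example gamma * lambda), and TD errors supplied by the caller. The names LazyTraceTable, sync, get, step, and eager_run are hypothetical. The sketch also omits the rescaling a long run would need, since the global factor phi shrinks toward zero.

```python
from collections import defaultdict


class LazyTraceTable:
    """Lookup table with accumulating traces, updated lazily (illustrative only)."""

    def __init__(self, alpha, decay):
        # Assumption: 0 < decay <= 1, e.g. decay = gamma * lambda.
        self.alpha = alpha                 # learning rate
        self.decay = decay                 # per-step trace decay
        self.phi = 1.0                     # global factor, equals decay ** t
        self.total = 0.0                   # running sum of td_error_t * phi_t
        self.value = defaultdict(float)    # stored values, possibly stale
        self.trace = defaultdict(float)    # eligibility trace divided by phi
        self.seen = defaultdict(float)     # value of `total` at the last sync

    def sync(self, key):
        """Apply every postponed update to one entry."""
        pending = self.total - self.seen[key]
        self.value[key] += self.alpha * pending * self.trace[key]
        self.seen[key] = self.total

    def get(self, key):
        """Return an up-to-date value; only this entry is touched."""
        self.sync(key)
        return self.value[key]

    def step(self, key, td_error):
        """Record a visit to `key` and a TD error without scanning the table."""
        self.sync(key)                     # settle older updates first
        self.phi *= self.decay
        self.trace[key] += 1.0 / self.phi  # same as adding 1 to the true trace
        self.total += td_error * self.phi


def eager_run(events, alpha, decay):
    """Reference version that updates every traced entry at every step."""
    value, trace = defaultdict(float), defaultdict(float)
    for key, td_error in events:
        for k in trace:
            trace[k] *= decay
        trace[key] += 1.0
        for k, e in trace.items():
            value[k] += alpha * td_error * e
    return value


if __name__ == "__main__":
    events = [("a", 1.0), ("b", -0.5), ("a", 0.25), ("c", 2.0)]
    lazy = LazyTraceTable(alpha=0.1, decay=0.72)
    for key, td_error in events:
        lazy.step(key, td_error)
    reference = eager_run(events, alpha=0.1, decay=0.72)
    for key in ("a", "b", "c"):
        print(key, lazy.get(key), reference[key])
```

In this sketch each step touches one entry, whereas the eager reference touches every entry with a nonzero trace. In a Q-learning setting the values of all actions in the current state must be brought up to date before the TD error is computed, which is consistent with the per-step bound in the number of actions stated in the abstract.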
Keywords
Reinforcement learning, Q-learning, TD(lambda), online Q(lambda), lazy learning