Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning.
Shimon Whiteson and Peter Stone.
In Proceedings of the Twenty-First National Conference on Artificial Intelligence, pp. 518–523, July 2006.
AAAI 2006
Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world problems, learning this value function requires a function approximator, which maps state-action pairs to values via a concise, parameterized function. In practice, the success of function approximators depends on the ability of the human designer to select an appropriate representation for the value function. A recently developed approach called evolutionary function approximation uses evolutionary computation to automate the search for effective representations. While this approach can substantially improve the performance of TD methods, it requires many sample episodes to do so. We present an enhancement to evolutionary function approximation that makes it much more sample-efficient by exploiting the off-policy nature of certain TD methods. Empirical results in a server job scheduling domain demonstrate that the enhanced method can learn better policies than evolution or TD methods alone and can do so in many fewer episodes than standard evolutionary function approximation.
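To make the phrase "off-policy nature of certain TD methods" concrete, here is a minimal sketch of a tabular Q-learning update, a standard off-policy TD method. It is a generic illustration, not the algorithm from the paper: the function name `q_learning_update`, the dictionary-based table, and the default values of `alpha` and `gamma` are all assumptions made for this example. The point it shows is that the update target takes the maximum over next actions, so it does not depend on which policy actually selected the action.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one Q-learning (off-policy TD) update to a dict-based table.

    q maps (state, action) pairs to value estimates; missing entries
    are treated as 0.0. All names here are hypothetical.
    """
    # Off-policy target: use the best next action's value, regardless
    # of which action the behavior policy will actually take next.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    td_target = reward + gamma * best_next

    # Move the current estimate a fraction alpha toward the target.
    old_value = q.get((state, action), 0.0)
    q[(state, action)] = old_value + alpha * (td_target - old_value)
    return q[(state, action)]


if __name__ == "__main__":
    table = {}
    actions = ["a", "b"]
    print(q_learning_update(table, 0, "a", 1.0, 1, actions, alpha=0.5))
```

Because the target ignores how the action was chosen, experience gathered while following one policy can in principle be used to improve value estimates for another, which is the property the abstract says the enhancement exploits.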
@InProceedings{AAAI06-shimon,
author="Shimon Whiteson and Peter Stone",
title="Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning",
booktitle="Proceedings of the Twenty-First National Conference on Artificial Intelligence",
month="July",
year="2006",
pages="518--523",
abstract={
Reinforcement learning problems are commonly tackled
with temporal difference methods, which attempt to
estimate the agent's optimal value function. In
most real-world problems, learning this value
function requires a function approximator, which
maps state-action pairs to values via a concise,
parameterized function. In practice, the success of
function approximators depends on the ability of the
human designer to select an appropriate
representation for the value function. A recently
developed approach called evolutionary function
approximation uses evolutionary computation to
automate the search for effective representations.
While this approach can substantially improve the
performance of TD methods, it requires many sample
episodes to do so. We present an enhancement to
evolutionary function approximation that makes it
much more sample-efficient by exploiting the
off-policy nature of certain TD methods. Empirical
results in a server job scheduling domain
demonstrate that the enhanced method can learn
better policies than evolution or TD methods alone
and can do so in many fewer episodes than standard
evolutionary function approximation.
},
wwwnote={<a href="http://www.aaai.org/Conferences/AAAI/aaai06.php">AAAI 2006</a>},
}