Exploiting Best-Match Equations for Efficient Reinforcement Learning
Harm van Seijen, Shimon Whiteson, Hado van Hasselt, and Marco Wiering, 2011
Insights in Reinforcement Learning: formal analysis and empirical evaluation of temporal-difference learning algorithms
Hado Philip van Hasselt, 2011
Relative Entropy Policy Search
Jan Peters, Katharina Mülling, and Yasemin Altün, 2010
Model-based reinforcement learning with nearly tight exploration complexity bounds
István Szita and Csaba Szepesvári, 2010
Reinforcement learning of motor skills in high dimensions: A path integral approach
Evangelos Theodorou, Jonas Buchli, and Stefan Schaal, 2010
The CMA Evolution Strategy: A Tutorial
Nikolaus Hansen, 2009
Learning motor primitives for robotics
Jens Kober and Jan Peters, 2009
Efficient covariance matrix update for variable metric evolution strategies
Thorsten Suttorp, Nikolaus Hansen, and Christian Igel, 2009
A Theoretical and Empirical Analysis of Expected Sarsa
Harm van Seijen, Hado van Hasselt, Shimon Whiteson, and Marco Wiering, 2009
Incremental Natural Actor-Critic Algorithms
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, and Mark Lee, 2008
Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Faustino Gomez, Jürgen Schmidhuber, and Risto Miikkulainen, 2008
Similarities and differences between policy gradient methods and evolution strategies
Verena Heidrich-Meisner and Christian Igel, 2008
Evolution Strategies for Direct Policy Search
Verena Heidrich-Meisner and Christian Igel, 2008
Genetic Programming: An Introduction and Tutorial, with a Survey of Techniques and Applications
William B. Langdon, Riccardo Poli, Nicholas Freitag McPhee, and John R. Koza, 2008
Analysis of an Evolutionary Reinforcement Learning Method in a Multiagent Domain
Jan Hendrik Metzen, Mark Edgington, Yohannes Kassahun, and Frank Kirchner, 2008
Reinforcement learning of motor skills with policy gradients
Jan Peters and Stefan Schaal, 2008
Natural Actor-Critic
Jan Peters and Stefan Schaal, 2008
Sample-based Learning and Search with Permanent and Transient Memories
David Silver, Richard S. Sutton, and Martin Müller, 2008
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, and Michael Bowling, 2008
Sample Complexity of Policy Search with Known Dynamics
Peter L. Bartlett and Ambuj Tewari, 2007
Bayesian actor-critic algorithms
Mohammad Ghavamzadeh and Yaakov Engel, 2007
Bayesian Policy Gradient Algorithms
Mohammad Ghavamzadeh and Yaakov Engel, 2007
Batch Reinforcement Learning in a Complex Domain
Shivaram Kalyanakrishnan and Peter Stone, 2007
Large Scale Reinforcement Learning using Q-Sarsa(λ) and Cascading Neural Networks
Steffen Nissen, 2007
Representation Transfer for Reinforcement Learning
Matthew E. Taylor and Peter Stone, 2007
Adaptive Representations for Reinforcement Learning
Shimon Azariah Whiteson, 2007
Evolutionary Function Approximation for Reinforcement Learning
Shimon Whiteson and Peter Stone, 2006
On-line evolutionary computation for reinforcement learning in stochastic domains
Shimon Whiteson and Peter Stone, 2006
Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
Martin Riedmiller, 2005
A Tutorial on the Cross-Entropy Method
Pieter-Tjerk de Boer, Dirk P. Kroese, Shie Mannor, and Reuven Y. Rubinstein, 2005
Machine Learning for Fast Quadrupedal Locomotion
Nate Kohl and Peter Stone, 2004
Efficient Evolution of Neural Networks Through Complexification
Kenneth Owen Stanley, 2004
On Actor-Critic Algorithms
Vijay R. Konda and John N. Tsitsiklis, 2003
Reinforcement Learning as Classification: Leveraging Modern Classifiers
Michail G. Lagoudakis and Ronald Parr, 2003
Scaling Internal-State Policy-Gradient Methods for POMDPs
Douglas Aberdeen and Jonathan Baxter, 2002
Approximately Optimal Approximate Reinforcement Learning
Sham Kakade and John Langford, 2002
Learning from Scarce Experience
Leonid Peshkin and Christian R. Shelton, 2002
Infinite-Horizon Policy-Gradient Estimation
Jonathan Baxter and Peter L. Bartlett, 2001
A Natural Policy Gradient
Sham Kakade, 2001
Reinforcement Learning in POMDP's via Direct Gradient Ascent
Jonathan Baxter and Peter L. Bartlett, 2000
Policy Search via Density Estimation
Andrew Y. Ng, Ronald Parr, and Daphne Koller, 2000
PEGASUS: A policy search method for large MDPs and POMDPs
Andrew Y. Ng and Michael Jordan, 2000
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour, 2000
Gradient Descent for General Reinforcement Learning
Leemon Baird and Andrew Moore, 1999
Solving Non-Markovian Control Tasks with Neuro-Evolution
Faustino J. Gomez and Risto Miikkulainen, 1999
Evolutionary Algorithms for Reinforcement Learning
David E. Moriarty, Alan C. Schultz, and John J. Grefenstette, 1999
Robot Shaping: An Experiment in Behavior Engineering
Marco Dorigo and Marco Colombetti, 1998
Reinforcement Learning: An Introduction
Richard S. Sutton and Andrew G. Barto, 1998
Neuro-Dynamic Programming
Dimitri P. Bertsekas and John N. Tsitsiklis, 1996
Reinforcement learning with replacing eligibility traces
Satinder P. Singh and Richard S. Sutton, 1996
On-line Q-learning using connectionist systems
G. A. Rummery and M. Niranjan, 1994
Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time
Andrew W. Moore and Christopher G. Atkeson, 1993
Efficient learning and planning within the Dyna framework
Jing Peng and Ronald J. Williams, 1993
Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
Long-Ji Lin, 1992
Q-Learning
Christopher J. C. H. Watkins and Peter Dayan, 1992
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
Ronald J. Williams, 1992
Learning Sequential Decision Rules Using Simulation Models and Competition
John J. Grefenstette, Connie Loggia Ramsey, and Alan C. Schultz, 1990
Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming
Richard S. Sutton, 1990
Neuronlike adaptive elements that can solve difficult learning control problems
Andrew G. Barto, Richard S. Sutton, and Charles W. Anderson, 1983