 A Brief Survey of Parametric Value Function Approximation
 Matthieu Geist and  Olivier Pietquin, 2010
 Finite-Sample Analysis of LSTD
 Alessandro Lazaric,  Mohammad Ghavamzadeh, and  Rémi Munos, 2010
 Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
 Marek Petrik,  Gavin Taylor,  Ron Parr, and  Shlomo Zilberstein, 2010
 The adaptive $k$-meteorologists problem and its application to structure learning and feature selection in reinforcement learning
 Carlos Diuk,  Lihong Li, and  Bethany R. Leffler, 2009
 Feature Selection for Value Function Approximation Using Bayesian Model Selection
 Tobias Jung and  Peter Stone, 2009
 Regularization and feature selection in least-squares temporal difference learning
 J. Zico Kolter and  Andrew Y. Ng, 2009
 Feature Discovery in Approximate Dynamic Programming
 Philippe Preux,  Sertan Girgin, and  Manuel Loth, 2009
 Fast gradient-descent methods for temporal-difference learning with linear function approximation
 Richard S. Sutton,  Hamid Reza Maei,  Doina Precup,  Shalabh Bhatnagar,  David Silver,  Csaba Szepesvári, and  Eric Wiewiora, 2009
 Feature Discovery in Reinforcement Learning Using Genetic Programming
 Sertan Girgin and  Philippe Preux, 2008
 Genetic Programming: An Introduction and Tutorial, with a Survey of Techniques and Applications
 William B. Langdon,  Riccardo Poli,  Nicholas Freitag McPhee, and  John R. Koza, 2008
 A worst-case comparison between temporal difference and residual gradient with linear function approximation
 Lihong Li, 2008
 An analysis of reinforcement learning with function approximation
 Francisco S. Melo,  Sean P. Meyn, and  M. Isabel Ribeiro, 2008
 An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning
 Ronald Parr,  Lihong Li,  Gavin Taylor,  Christopher Painter-Wakefield, and  Michael L. Littman, 2008
 Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
 Richard S. Sutton,  Csaba Szepesvári,  Alborz Geramifard, and  Michael Bowling, 2008
 Learning RoboCup-Keepaway with Kernels
 Tobias Jung and  Daniel Polani, 2007
 Learning classifier systems: a survey
 Olivier Sigaud and  Stewart W. Wilson, 2007
 Adaptive Representations for Reinforcement Learning
 Shimon Azariah Whiteson, 2007
 Learning the structure of Factored Markov Decision Processes in reinforcement learning problems
 Thomas Degris,  Olivier Sigaud, and  Pierre-Henri Wuillemin, 2006
 Tree-Based Batch Mode Reinforcement Learning
 Damien Ernst,  Pierre Geurts, and  Louis Wehenkel, 2005
 Basis Function Adaptation in Temporal Difference Reinforcement Learning
 Ishai Menache,  Shie Mannor, and  Nahum Shimkin, 2005
 Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
 Martin Riedmiller, 2005
 Sparse cooperative Q-learning
 Jelle R. Kok and  Nikos Vlassis, 2004
 Convergence of synchronous reinforcement learning with linear function approximation
 Artur Merke and  Ralf Schoknecht, 2004
 Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning
 Bohdana Ratitch and  Doina Precup, 2004
 Least-Squares Policy Iteration
 Michail G. Lagoudakis and  Ronald Parr, 2003
 Reinforcement Learning as Classification: Leveraging Modern Classifiers
 Michail G. Lagoudakis and  Ronald Parr, 2003
 Least Squares Policy Evaluation Algorithms with Linear Function Approximation
 A. Nedić and  D. P. Bertsekas, 2003
 A Convergent Form of Approximate Policy Iteration
 Theodore J. Perkins and  Doina Precup, 2003
 Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
 Ralf Schoknecht, 2003
 Technical Update: Least-Squares Temporal Difference Learning
 Justin A. Boyan, 2002
 Variable Resolution Discretization in Optimal Control
 Rémi Munos and  Andrew Moore, 2002
 Kernel-Based Reinforcement Learning
 Dirk Ormoneit and  Śaunak Sen, 2002
 Batch Value Function Approximation via Support Vectors
 Thomas G. Dietterich and  Xin Wang, 2001
 Max-norm Projections for Factored MDPs
 Carlos Guestrin,  Daphne Koller, and  Ronald Parr, 2001
 Off-Policy Temporal Difference Learning with Function Approximation
 Doina Precup,  Richard S. Sutton, and  Sanjoy Dasgupta, 2001
 On the Convergence of Temporal-Difference Learning with Linear Function Approximation
 Vladislav Tadić, 2001
 Policy Iteration for Factored MDPs
 Daphne Koller and  Ronald Parr, 2000
 Policy Gradient Methods for Reinforcement Learning with Function Approximation
 Richard S. Sutton,  David A. McAllester,  Satinder P. Singh, and  Yishay Mansour, 2000
 Convergence of Reinforcement Learning With General Function Approximators
 Vassilis A. Papavassiliou and  Stuart Russell, 1999
 Reinforcement Learning: An Introduction
 Richard S. Sutton and  Andrew G. Barto, 1998
 Learning and Value Function Approximation in Complex Decision Processes
 Benjamin Van Roy, 1998
 An analysis of temporal-difference learning with function approximation
 John N. Tsitsiklis and  Benjamin Van Roy, 1997
 Linear Least-Squares Algorithms for Temporal Difference Learning
 Steven J. Bradtke and  Andrew G. Barto, 1996
 Stable Fitted Reinforcement Learning
 Geoffrey J. Gordon, 1996
 Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
 Richard S. Sutton, 1996
 Feature-based methods for large scale dynamic programming
 John N. Tsitsiklis and  Benjamin Van Roy, 1996
 Residual Algorithms: Reinforcement Learning with Function Approximation
 Leemon Baird, 1995
 A Counterexample to Temporal Differences Learning
 Dimitri P. Bertsekas, 1995
 Generalization in Reinforcement Learning: Safely Approximating the Value Function
 Justin A. Boyan and  Andrew W. Moore, 1995
 Stable Function Approximation in Dynamic Programming
 Geoffrey J. Gordon, 1995
 The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces
 Andrew W. Moore and  Christopher G. Atkeson, 1995
 Reinforcement Learning with Soft State Aggregation
 Satinder P. Singh,  Tommi Jaakkola, and  Michael I. Jordan, 1995
 TD($\lambda$) Converges with Probability 1
 Peter Dayan and  Terrence J. Sejnowski, 1994
 An Upper Bound on the Loss from Approximate Optimal-Value Functions
 Satinder P. Singh and  Richard C. Yee, 1994
 Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions
 Ronald J. Williams and  Leemon C. Baird III, 1994
 Reinforcement Learning Applied to Linear Quadratic Regulation
 Steven J. Bradtke, 1993
 Approximating Q-Values with Basis Function Representations
 Philip Sabes, 1993
 Online Learning with Random Representations
 Richard S. Sutton and  Steven D. Whitehead, 1993
 Issues in Using Function Approximation for Reinforcement Learning
 Sebastian Thrun and  Anton Schwartz, 1993
 The Convergence of TD($\lambda$) for General $\lambda$
 Peter Dayan, 1992
 Practical Issues in Temporal Difference Learning
 Gerald Tesauro, 1992
 Learning to Predict By the Methods of Temporal Differences
 Richard S. Sutton, 1988