Shivaram's Reading List

Function Approximation

A Brief Survey of Parametric Value Function Approximation
Matthieu Geist and Olivier Pietquin, 2010

Finite-Sample Analysis of LSTD
Alessandro Lazaric, Mohammad Ghavamzadeh, and Rémi Munos, 2010

Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Marek Petrik, Gavin Taylor, Ron Parr, and Shlomo Zilberstein, 2010

The adaptive $k$-meteorologists problem and its application to structure learning and feature selection in reinforcement learning
Carlos Diuk, Lihong Li, and Bethany R. Leffler, 2009

Feature Selection for Value Function Approximation Using Bayesian Model Selection
Tobias Jung and Peter Stone, 2009

Regularization and feature selection in least-squares temporal difference learning
J. Zico Kolter and Andrew Y. Ng, 2009

Feature Discovery in Approximate Dynamic Programming
Philippe Preux, Sertan Girgin, and Manuel Loth, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation
Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, and Eric Wiewiora, 2009

Feature Discovery in Reinforcement Learning Using Genetic Programming
Sertan Girgin and Philippe Preux, 2008

Genetic Programming: An Introduction and Tutorial, with a Survey of Techniques and Applications
William B. Langdon, Riccardo Poli, Nicholas Freitag McPhee, and John R. Koza, 2008

A worst-case comparison between temporal difference and residual gradient with linear function approximation
Lihong Li, 2008

An analysis of reinforcement learning with function approximation
Francisco S. Melo, Sean P. Meyn, and M. Isabel Ribeiro, 2008

An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning
Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter-Wakefield, and Michael L. Littman, 2008

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, and Michael Bowling, 2008

Learning RoboCup-Keepaway with Kernels
Tobias Jung and Daniel Polani, 2007

Learning classifier systems: a survey
Olivier Sigaud and Stewart W. Wilson, 2007

Adaptive Representations for Reinforcement Learning
Shimon Azariah Whiteson, 2007

Learning the structure of Factored Markov Decision Processes in reinforcement learning problems
Thomas Degris, Olivier Sigaud, and Pierre-Henri Wuillemin, 2006

Tree-Based Batch Mode Reinforcement Learning
Damien Ernst, Pierre Geurts, and Louis Wehenkel, 2005

Basis Function Adaptation in Temporal Difference Reinforcement Learning
Ishai Menache, Shie Mannor, and Nahum Shimkin, 2005

Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
Martin Riedmiller, 2005

Sparse cooperative Q-learning
Jelle R. Kok and Nikos Vlassis, 2004

Convergence of synchronous reinforcement learning with linear function approximation
Artur Merke and Ralf Schoknecht, 2004

Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning
Bohdana Ratitch and Doina Precup, 2004

Least-Squares Policy Iteration
Michail G. Lagoudakis and Ronald Parr, 2003

Reinforcement Learning as Classification: Leveraging Modern Classifiers
Michail G. Lagoudakis and Ronald Parr, 2003

Least Squares Policy Evaluation Algorithms with Linear Function Approximation
A. Nedić and D. P. Bertsekas, 2003

A Convergent Form of Approximate Policy Iteration
Theodore J. Perkins and Doina Precup, 2003

Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
Ralf Schoknecht, 2003

Technical Update: Least-Squares Temporal Difference Learning
Justin A. Boyan, 2002

Variable Resolution Discretization in Optimal Control
Rémi Munos and Andrew Moore, 2002

Kernel-Based Reinforcement Learning
Dirk Ormoneit and Śaunak Sen, 2002

Batch Value Function Approximation via Support Vectors
Thomas G. Dietterich and Xin Wang, 2001

Max-norm Projections for Factored MDPs
Carlos Guestrin, Daphne Koller, and Ronald Parr, 2001

Off-Policy Temporal Difference Learning with Function Approximation
Doina Precup, Richard S. Sutton, and Sanjoy Dasgupta, 2001

On the Convergence of Temporal-Difference Learning with Linear Function Approximation
Vladislav Tadić, 2001

Policy Iteration for Factored MDPs
Daphne Koller and Ronald Parr, 2000

Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour, 2000

Convergence of Reinforcement Learning With General Function Approximators
Vassilis A. Papavassiliou and Stuart Russell, 1999

Reinforcement Learning: An Introduction
Richard S. Sutton and Andrew G. Barto, 1998

Learning and Value Function Approximation in Complex Decision Processes
Benjamin Van Roy, 1998

An analysis of temporal-difference learning with function approximation
John N. Tsitsiklis and Benjamin Van Roy, 1997

Linear Least-Squares Algorithms for Temporal Difference Learning
Steven J. Bradtke and Andrew G. Barto, 1996

Stable Fitted Reinforcement Learning
Geoffrey J. Gordon, 1996

Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
Richard S. Sutton, 1996

Feature-based methods for large scale dynamic programming
John N. Tsitsiklis and Benjamin Van Roy, 1996

Residual Algorithms: Reinforcement Learning with Function Approximation
Leemon Baird, 1995

A Counterexample to Temporal Differences Learning
Dimitri P. Bertsekas, 1995

Generalization in Reinforcement Learning: Safely Approximating the Value Function
Justin A. Boyan and Andrew W. Moore, 1995

Stable Function Approximation in Dynamic Programming
Geoffrey J. Gordon, 1995

The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces
Andrew W. Moore and Christopher G. Atkeson, 1995

Reinforcement Learning with Soft State Aggregation
Satinder P. Singh, Tommi Jaakkola, and Michael I. Jordan, 1995

TD($\lambda$) Converges with Probability 1
Peter Dayan and Terrence J. Sejnowski, 1994

An Upper Bound on the Loss from Approximate Optimal-Value Functions
Satinder P. Singh and Richard C. Yee, 1994

Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions
Ronald J. Williams and Leemon C. Baird III, 1994

Reinforcement Learning Applied to Linear Quadratic Regulation
Steven J. Bradtke, 1993

Approximating Q-Values with Basis Function Representations
Philip Sabes, 1993

Online Learning with Random Representations
Richard S. Sutton and Steven D. Whitehead, 1993

Issues in Using Function Approximation for Reinforcement Learning
Sebastian Thrun and Anton Schwartz, 1993

The Convergence of TD($\lambda$) for General $\lambda$
Peter Dayan, 1992

Practical Issues in Temporal Difference Learning
Gerald Tesauro, 1992

Learning to Predict By the Methods of Temporal Differences
Richard S. Sutton, 1988