Shivaram's Reading List


Function Approximation

A Brief Survey of Parametric Value Function Approximation
Matthieu Geist and Olivier Pietquin, 2010

Finite-Sample Analysis of LSTD
Alessandro Lazaric, Mohammad Ghavamzadeh, and Rémi Munos, 2010

Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Marek Petrik, Gavin Taylor, Ron Parr, and Shlomo Zilberstein, 2010

The adaptive $k$-meteorologists problem and its application to structure learning and feature selection in reinforcement learning
Carlos Diuk, Lihong Li, and Bethany R. Leffler, 2009

Feature Selection for Value Function Approximation Using Bayesian Model Selection
Tobias Jung and Peter Stone, 2009

Regularization and feature selection in least-squares temporal difference learning
J. Zico Kolter and Andrew Y. Ng, 2009

Feature Discovery in Approximate Dynamic Programming
Philippe Preux, Sertan Girgin, and Manuel Loth, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation
Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, and Eric Wiewiora, 2009

Feature Discovery in Reinforcement Learning Using Genetic Programming
Sertan Girgin and Philippe Preux, 2008

Genetic Programming: An Introduction and Tutorial, with a Survey of Techniques and Applications
William B. Langdon, Riccardo Poli, Nicholas Freitag McPhee, and John R. Koza, 2008

A worst-case comparison between temporal difference and residual gradient with linear function approximation
Lihong Li, 2008

An analysis of reinforcement learning with function approximation
Francisco S. Melo, Sean P. Meyn, and M. Isabel Ribeiro, 2008

An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning
Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter-Wakefield, and Michael L. Littman, 2008

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, and Michael Bowling, 2008

Learning RoboCup-Keepaway with Kernels
Tobias Jung and Daniel Polani, 2007

Learning classifier systems: a survey
Olivier Sigaud and Stewart W. Wilson, 2007

Adaptive Representations for Reinforcement Learning
Shimon Azariah Whiteson, 2007

Learning the structure of Factored Markov Decision Processes in reinforcement learning problems
Thomas Degris, Olivier Sigaud, and Pierre-Henri Wuillemin, 2006

Tree-Based Batch Mode Reinforcement Learning
Damien Ernst, Pierre Geurts, and Louis Wehenkel, 2005

Basis Function Adaptation in Temporal Difference Reinforcement Learning
Ishai Menache, Shie Mannor, and Nahum Shimkin, 2005

Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
Martin Riedmiller, 2005

Sparse cooperative Q-learning
Jelle R. Kok and Nikos Vlassis, 2004

Convergence of synchronous reinforcement learning with linear function approximation
Artur Merke and Ralf Schoknecht, 2004

Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning
Bohdana Ratitch and Doina Precup, 2004

Least-Squares Policy Iteration
Michail G. Lagoudakis and Ronald Parr, 2003

Reinforcement Learning as Classification: Leveraging Modern Classifiers
Michail G. Lagoudakis and Ronald Parr, 2003

Least Squares Policy Evaluation Algorithms with Linear Function Approximation
A. Nedić and D. P. Bertsekas, 2003

A Convergent Form of Approximate Policy Iteration
Theodore J. Perkins and Doina Precup, 2003

Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
Ralf Schoknecht, 2003

Technical Update: Least-Squares Temporal Difference Learning
Justin A. Boyan, 2002

Variable Resolution Discretization in Optimal Control
Rémi Munos and Andrew Moore, 2002

Kernel-Based Reinforcement Learning
Dirk Ormoneit and Śaunak Sen, 2002

Batch Value Function Approximation via Support Vectors
Thomas G. Dietterich and Xin Wang, 2001

Max-norm Projections for Factored MDPs
Carlos Guestrin, Daphne Koller, and Ronald Parr, 2001

Off-Policy Temporal Difference Learning with Function Approximation
Doina Precup, Richard S. Sutton, and Sanjoy Dasgupta, 2001

On the Convergence of Temporal-Difference Learning with Linear Function Approximation
Vladislav Tadić, 2001

Policy Iteration for Factored MDPs
Daphne Koller and Ronald Parr, 2000

Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour, 2000

Convergence of Reinforcement Learning With General Function Approximators
Vassilis A. Papavassiliou and Stuart Russell, 1999

Reinforcement Learning: An Introduction
Richard S. Sutton and Andrew G. Barto, 1998

Learning and Value Function Approximation in Complex Decision Processes
Benjamin Van Roy, 1998

An analysis of temporal-difference learning with function approximation
John N. Tsitsiklis and Benjamin Van Roy, 1997

Linear Least-Squares Algorithms for Temporal Difference Learning
Steven J. Bradtke and Andrew G. Barto, 1996

Stable Fitted Reinforcement Learning
Geoffrey J. Gordon, 1996

Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
Richard S. Sutton, 1996

Feature-based methods for large scale dynamic programming
John N. Tsitsiklis and Benjamin Van Roy, 1996

Residual Algorithms: Reinforcement Learning with Function Approximation
Leemon Baird, 1995

A Counterexample to Temporal Differences Learning
Dimitri P. Bertsekas, 1995

Generalization in Reinforcement Learning: Safely Approximating the Value Function
Justin A. Boyan and Andrew W. Moore, 1995

Stable Function Approximation in Dynamic Programming
Geoffrey J. Gordon, 1995

The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces
Andrew W. Moore and Christopher G. Atkeson, 1995

Reinforcement Learning with Soft State Aggregation
Satinder P. Singh, Tommi Jaakkola, and Michael I. Jordan, 1995

TD($\lambda$) Converges with Probability 1
Peter Dayan and Terrence J. Sejnowski, 1994

An Upper Bound on the Loss from Approximate Optimal-Value Functions
Satinder P. Singh and Richard C. Yee, 1994

Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions
Ronald J. Williams and Leemon C. Baird III, 1994

Reinforcement Learning Applied to Linear Quadratic Regulation
Steven J. Bradtke, 1993

Approximating Q-Values with Basis Function Representations
Philip Sabes, 1993

Online Learning with Random Representations
Richard S. Sutton and Steven D. Whitehead, 1993

Issues in Using Function Approximation for Reinforcement Learning
Sebastian Thrun and Anton Schwartz, 1993

The Convergence of TD($\lambda$) for General $\lambda$
Peter Dayan, 1992

Practical Issues in Temporal Difference Learning
Gerald Tesauro, 1992

Learning to Predict By the Methods of Temporal Differences
Richard S. Sutton, 1988