CS394R: Reinforcement Learning: Theory and Practice -- Spring 2011: Assignments Page

Assignments for Reinforcement Learning: Theory and Practice


Week 1 (1/18,20): Class Overview, Introduction

Jump to the resources page.

  • Chapter 1 of the textbook (due Thursday)
  • For each reading, be sure to submit a question or comment about the reading by 9pm on the day before class as an email in plain ASCII text. I prefer that it be sent in the body of the email, rather than as an attachment. Please use the subject line "class readings for [due date]" and send to Peter and Doran (pstone@cs and doran.chakraborty@gmail). Please include your name in the response. If you refer explicitly to the reading, please include page numbers.

Week 2 (1/25,27): Evaluative Feedback

  • Chapter 2 of the textbook (due Tuesday)

Week 3 (2/1,3): The Reinforcement Learning Problem

  • Chapter 3 of the textbook (due Tuesday)

Week 4 (2/8,10): Dynamic Programming

  • Chapter 4 of the textbook (due Tuesday)

Week 5 (2/15,17): Monte Carlo Methods

  • Chapter 5 of the textbook (due Tuesday)

Week 6 (2/22,24): Temporal Difference Learning

  • Chapter 6 of the textbook (due Tuesday)

Week 7 (3/1,3): Eligibility Traces

  • Chapter 7 of the textbook (due Tuesday)

Week 8 (3/8,10): Generalization and Function Approximation

  • Chapter 8 of the textbook (due Tuesday)
  • Class project proposal due at 12:30pm on Thursday. Please send an email with the subject "Project Proposal" and a proposed topic for your class project. I anticipate projects taking one of two forms:
      • Practice (preferred): An implementation of RL in some domain of your choice, ideally one that you are using for research or in some other class. In this case, please describe the domain and your initial plans for implementing learning. What will the states and actions be? What algorithm(s) do you expect will be most effective?
      • Theory: A proposal, implementation, and testing of an algorithmic modification to an RL algorithm presented in the book. In this case, please describe the modification you propose to investigate and the type of domain (possibly a toy domain) on which it is likely to show an improvement over the methods considered in the book.
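
    As a rough illustration of the level of detail a practice-style proposal might aim for, the sketch below writes down states, actions, and a candidate algorithm for a toy domain. Everything in it is a hypothetical example rather than part of the assignment: the corridor domain, the names (N_STATES, ACTIONS, step, q_learning, greedy_policy), and the parameter values are all assumptions, and one-step tabular Q-learning is just one possible choice of algorithm.

```python
import random

# Hypothetical toy domain (an assumption for illustration only):
# a corridor of N_STATES cells. The agent starts in cell 0 and an
# episode ends when it reaches the rightmost cell.
N_STATES = 6
ACTIONS = (-1, +1)  # move one cell left, move one cell right


def step(state, action):
    """Assumed corridor dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """One-step tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties between equal values at random.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice(
                    [a for a in ACTIONS if q[(state, a)] == best]
                )
            next_state, reward, done = step(state, action)
            # No bootstrapping from the terminal state.
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * future
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

    In a real proposal the same three pieces (state set, action set, learning rule) would be described for your own domain, along with why you expect the chosen algorithm to suit it.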

Week 9 (3/22,24): Planning and Learning

  • Chapter 9 of the textbook (due Tuesday)

Week 10 (3/29,31): Game Playing

  • Due Tuesday:
      • Tesauro, G. Temporal Difference Learning and TD-Gammon. Communications of the ACM, 1995.
      • Pollack, J.B., & Blair, A.D. Co-evolution in the successful learning of backgammon strategy. Machine Learning, 1998.
      • Tesauro, G. Comments on Co-Evolution in the Successful Learning of Backgammon Strategy. Machine Learning, 1998.
  • Due Thursday:
      • Kocsis, L., & Szepesvari, C. Bandit based Monte-Carlo Planning. In: ECML-06. Number 4212 in LNCS.
      • Gelly, S., & Silver, D. Achieving Master-Level Play in 9x9 Computer Go. In Proceedings of the 23rd Conference on Artificial Intelligence, Nectar Track (AAAI-08), 2008.

Week 11 (4/5,7): Efficient Model-Based Learning

  • Due Tuesday:
      • R-Max - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning.
        Ronen Brafman and Moshe Tennenholtz
        The Journal of Machine Learning Research.
  • Due Thursday:
      • Efficient Structure Learning in Factored-state MDPs.
        Alexander L. Strehl, Carlos Diuk, and Michael L. Littman
        AAAI'2007.

Week 12 (4/12,14): Abstraction: Options and Hierarchy

  • Due Tuesday:
      • Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.
        Sutton, R.S., Precup, D., Singh, S.
        Artificial Intelligence 112:181-211, 1999.
  • Due Thursday:
      • The MAXQ Method for Hierarchical Reinforcement Learning.
        Thomas G. Dietterich
        Proceedings of the 15th International Conference on Machine Learning, 1998.

Week 13 (4/19,21): Robotics Applications

  • Due Tuesday:
      • Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion.
        Nate Kohl and Peter Stone
        In Proceedings of the IEEE International Conference on Robotics and Automation, May 2004.
      • Making a Robot Learn to Play Soccer Using Reward and Punishment.
        Heiko Müller, Martin Lauer, Roland Hafner, Sascha Lange, Artur Merke and Martin Riedmiller
        30th Annual German Conference on AI, KI 2007.
  • Due Thursday:
      • Autonomous helicopter flight via reinforcement learning.
        Andrew Ng, H. Jin Kim, Michael Jordan and Shankar Sastry
        In S. Thrun, L. Saul, and B. Schoelkopf (Eds.), Advances in Neural Information Processing Systems (NIPS) 17, 2004.

Week 14 (4/26,28): Least Squares Methods

  • Due Tuesday:
      • Technical update: Least-squares temporal difference learning.
        Justin A. Boyan
      • Model-Free Least-Squares Policy Iteration.
        Michail G. Lagoudakis and Ronald Parr
        Proceedings of NIPS*2001: Neural Information Processing Systems: Natural and Synthetic, Vancouver, BC, December 2001, pp. 1547-1554.

Week 15 (5/3,5): Multiagent RL

  • Due Tuesday:
      • Kok, J.R. and Vlassis, N. Collaborative multiagent reinforcement learning by payoff propagation. The Journal of Machine Learning Research, 7, 1828, 2006.
  • Due Thursday:
      • Michael Littman. Markov Games as a Framework for Multi-Agent Reinforcement Learning. ICML, 1994.
  • Final Project: due at 12:30pm on Thursday, 5/5


    Page maintained by Peter Stone
    Questions? Send me mail