CS395T: Reinforcement Learning: Theory and Practice -- Fall 2004: Assignments Page
Assignments for Reinforcement Learning: Theory and Practice
Week 0 (8/26): Class Overview
For each class (after the first), be sure to submit a question or comment about the reading by 10pm on the day before any class that has a new reading assignment due, as an email in plain ASCII text. I prefer that it be sent in the body of the email, rather than as an attachment. Please use the subject line "class readings for [due date]".
Week 1 (8/31,9/2): Introduction
Chapter 1 of the textbook
Week 2 (9/7,9/9): Evaluative Feedback
Discussion leader: Chendi on Thursday.
Chapter 2 of the textbook
Week 3 (9/14,16): The Reinforcement Learning Problem
Chapter 3 of the textbook
Week 4 (9/21,23): Dynamic Programming
Discussion leader: Susan on Tuesday.
Chapter 4 of the textbook
Week 5 (9/28,9/30): Monte Carlo Methods
Discussion leader: Lily on Tuesday.
Chapter 5 of the textbook
Week 6 (10/5,7): Temporal Difference Learning
Discussion leader: Matt on Tuesday, Mazda on Thursday.
Chapter 6 of the textbook
Week 7 (10/12,14): Eligibility Traces
Discussion leader: Sit on Tuesday.
Chapter 7 of the textbook
Week 8 (10/19,21): Generalization and Function Approximation
Discussion leader: Igor on Tuesday.
Chapter 8 of the textbook
Class project proposal due at 12:30pm on Thursday.
Please send an email with subject "Project Proposal" with a proposed topic for your class project. I anticipate projects taking one of two forms.
Practice (preferred): An implementation of RL in some domain of your choice - ideally one that you are using for research or in some other class. In this case, please describe the domain and your initial plans on how you intend to implement learning. What will the states and actions be? What algorithm(s) do you expect will be most effective?
Theory: A proposal, implementation, and testing of an algorithmic modification to an RL algorithm presented in the book. In this case, please describe the modification you propose to investigate and the type of domain (possibly a toy domain) on which it is likely to show an improvement over the methods considered in the book.
Week 9 (10/26,28): Planning and Learning
Discussion leader: Greg on Tuesday.
Chapter 9 of the textbook
Week 10 (11/2,4): Case Studies
Discussion leader: Kurt on Thursday.
Chapters 10 and 11 of the textbook
Week 11 (11/9,11): Abstraction: Options and Hierarchy
Discussion leader: Alex on Tuesday, Jon on Thursday.
Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.
Sutton, R.S., Precup, D., Singh, S.
Artificial Intelligence 112:181-211, 1999.
Due Tuesday.
The MAXQ Method for Hierarchical Reinforcement Learning.
Thomas G. Dietterich.
Proceedings of the 15th International Conference on Machine Learning, 1998.
Due Thursday.
Week 12 (11/16,18): Helicopter and Robot Control
Discussion leader: Michael on Tuesday.
Autonomous Helicopter Control using Reinforcement Learning Policy Search Methods.
J. Bagnell and J. Schneider.
Proceedings of the International Conference on Robotics and Automation 2001, IEEE, May, 2001.
Due Tuesday.
Inverted autonomous helicopter flight via reinforcement learning.
Andrew Y. Ng, Adam Coates, Mark Diel, Varun Ganapathi, Jamie Schulte, Ben Tse, Eric Berger and Eric Liang.
International Symposium on Experimental Robotics, 2004.
Due Tuesday.
Learning from Observation and Practice Using Primitives.
Darrin Bentivegna, Christopher Atkeson, and Gordon Cheng.
AAAI Fall Symposium on Real Life Reinforcement Learning, 2004.
Due Thursday.
Week 13 (11/23): Robot Soccer
Scaling Reinforcement Learning toward RoboCup Soccer.
Peter Stone and Richard S. Sutton.
Proceedings of the Eighteenth International Conference on Machine Learning, pp. 537-544, Morgan Kaufmann, San Francisco, CA, 2001.
Due Tuesday.
Reinforcement Learning for Sensing Strategies.
C. Kwok and D. Fox.
Proceedings of IROS, 2004.
Due Tuesday.
Week 14 (11/30,12/2): Incorporating Advice
Creating Advice-Taking Reinforcement Learners.
R. Maclin & J. Shavlik.
Machine Learning, 22, pp. 251-281, 1996.
Due Tuesday.
Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer.
Gregory Kuhlmann, Peter Stone, Raymond Mooney, and Jude Shavlik.
The AAAI-2004 Workshop on Supervisory Control of Learning and Adaptive Systems, July 2004.
Due Thursday.
Page maintained by Peter Stone