CS3934: Reinforcement Learning -- Final Project


 
Your final programming project can take one of two forms.
  • Practice (preferred): An implementation of RL in some domain of your choice - ideally one that you are using for research or in some other class. In this case, please describe the domain and your initial plans for implementing learning. What will the states and actions be? What algorithm(s) do you expect will be most effective?
  • Theory: A proposal, implementation and testing of an algorithmic modification to an RL algorithm presented in the book. In this case, please describe the modification you propose to investigate and on what type of domain (possibly a toy domain) it is likely to show an improvement over things considered in the book.

    You may try to build on some of the research papers you have read in the class; you may try to reimplement something you've found interesting that others have done; you can try to do something that has never been done before; you may write new code from scratch; you may modify existing code. It's up to you!

    You are encouraged to work in teams of 2. You may work individually if you prefer. Teams should turn in only one submission of the programming portion of the assignment. However, each person must turn in an independently-written proposal, survey, and final report.

    The schedule is as follows.


  • Project Proposal due on Thursday, October 13th at 11:59pm.
    Submit a proposal as plain text by email to Sanmit and Peter, including:
       
    The proposal should be written with the goal of convincing us that what you are proposing to do is interesting and non-trivial (though not necessarily completely original - see below). Members of 2-person teams should clearly identify what their roles will be in relation to the overall project.

    It is completely legitimate to propose to do something based on something you read about provided that you are going to do the coding yourself. Just make sure to acknowledge any ideas (and code) that you borrow and be sure to clearly identify what you are going to do.

    Be as specific as you can at this point. The more specific you are, the more detailed feedback you will get.

  • Literature Survey due on Thursday, November 10th at 11:59pm.
    Submit a literature survey of the work most closely related to your project to Sanmit and Peter. It should begin with a summary of your current plans for the final project. If nothing has changed since the proposal, you can use the same text. If you have changed your plans in some way, please specify how and why. Then the survey should include at least 10 references, some of which can be from the class readings. For each, you should discuss how it differs from or is similar to the work you plan to turn in for your final project. A good survey discusses each of the references at a technical level - not just what is done, but also how. Please put the references at the end, with full reference information (authors, title, date, publication venue, etc.). I suggest writing this as if it were a section of a research paper, so that you can then use it directly in your final report.

  • Project demos in class on Tuesday and Thursday, November 29th and December 1st at 9:30am.

    Prepare to describe and demonstrate your project to your classmates.

  • Final Project due on Thursday, December 8th at 9:30am.
       Submit your final project including:

  • Source code and executable. 
  • A README providing a brief guide to following your code (including which files are most relevant to look at, etc.). Make sure it is absolutely clear how to run your code.
  • A detailed written report describing your project, including its merits and its deficiencies. As much as possible, you should relate your approach to the readings from throughout the course. View this report as a term paper. It is in place of a final exam and will be a large factor in your final grade for the project and for the course. The report should be roughly in the style of a conference paper, including introduction, motivation, related work, etc.
  • Members of teams should clearly identify what their roles have been in relation to the overall project. All writing should be your own -- even if you are describing a component produced by your partner.
  • Include at least 10 citations with full bibliographic references to acknowledge where your ideas came from.
  • Be very clear about what code you've used from other sources, if any. Clear citations are essential. Failure to credit ideas and code from external sources is cheating.
  • Make sure you evaluate both the good and bad points of your approach.
  • Show results of at least one experiment evaluating either some aspect of your approach or the approach as a whole, preferably showing error bars or some other statistical measure of significance. Even if you didn't accomplish your goal, evaluate what you did do.
  • If any parameters are mentioned in the report, be sure to mention how you arrived at their values. Was it the first thing you tried? Trial and error? Roughly how many trials? etc.
  • Remember to proofread and spell-check!
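
As an illustration of the error bars requested above, here is a minimal sketch of one common way to produce them: repeat the experiment over several independent runs (for example, different random seeds) and report the per-episode mean plus or minus one standard error. The function name mean_and_stderr and the returns in example_runs are made up for illustration; they are not part of the assignment, and other statistical measures are equally acceptable.

```python
import math
import statistics


def mean_and_stderr(runs):
    """Per-episode mean and standard error across independent runs.

    `runs` is a list of equal-length lists; runs[i][t] is the return
    that run i obtained on episode t. (Hypothetical helper, not from
    the course materials.)
    """
    if len(runs) < 2:
        raise ValueError("need at least 2 runs to estimate a standard error")
    n = len(runs)
    means, stderrs = [], []
    # zip(*runs) groups together the returns of all runs for one episode.
    for episode_returns in zip(*runs):
        means.append(statistics.mean(episode_returns))
        # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
        stderrs.append(statistics.stdev(episode_returns) / math.sqrt(n))
    return means, stderrs


# Made-up returns from 3 runs over 4 episodes, for illustration only.
example_runs = [
    [1.0, 2.0, 4.0, 5.0],
    [2.0, 3.0, 4.0, 7.0],
    [3.0, 4.0, 4.0, 6.0],
]
means, stderrs = mean_and_stderr(example_runs)
for t, (m, se) in enumerate(zip(means, stderrs)):
    print(f"episode {t}: {m:.2f} +/- {se:.2f}")
```

The resulting means and standard errors can be passed directly to whatever plotting tool you use to draw a learning curve with error bars. With only a handful of runs the standard error is itself a rough estimate, so say in the report how many runs each curve averages over.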


    Page maintained by Peter Stone and Sanmit Narvekar