Project for Interactive Learning from Language Advice and Reinforcements
The goal of the PILLAR project is to broaden the communication
channel between machine learners and their human teachers. This
is achieved (1) by allowing human users to give natural language
advice that helps a reinforcement learning agent improve its
performance, and (2) by allowing agents to actively solicit
advice and other forms of tutorial feedback when needed.
This is joint work with Prof. Jude Shavlik's research group in
the Department of Computer Sciences at the University of
Wisconsin-Madison.
The project is supported by grant HR0011-04-1-0007 from the
DARPA Information Processing Technology Office. Any opinions,
findings, and conclusions or recommendations expressed in this
material are those of the author(s) and do not necessarily
reflect the views of DARPA or the US Government.
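As a rough illustration of goal (1), the sketch below shows one simple way natural language advice, once mapped to a state-action rule, can bias a learner. This is not the PILLAR system itself; the class, state names, and parameters are all hypothetical. The advised action receives a scoring bonus that decays with experience, so advice acts as soft guidance that learned values can eventually override.

```python
# A minimal sketch of advice-biased reinforcement learning. This is NOT the
# PILLAR system; every name and parameter here is hypothetical. Advice maps
# a state to a preferred action; the advised action receives a scoring bonus
# that decays with experience, so learned Q-values eventually dominate.

class AdvisedQLearner:
    def __init__(self, actions, alpha=0.5, gamma=0.9,
                 advice_bonus=1.0, decay=0.99):
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma      # learning rate, discount
        self.advice_bonus = advice_bonus           # current weight of advice
        self.decay = decay                         # per-step bonus decay
        self.q = {}                                # (state, action) -> value
        self.advice = {}                           # state -> advised action

    def give_advice(self, state, action):
        """The human teacher recommends `action` in `state`."""
        self.advice[state] = action

    def score(self, state, action):
        bonus = self.advice_bonus if self.advice.get(state) == action else 0.0
        return self.q.get((state, action), 0.0) + bonus

    def act(self, state):
        """Greedy choice over Q-value plus the (decaying) advice bonus."""
        self.advice_bonus *= self.decay
        return max(self.actions, key=lambda a: self.score(state, a))

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

# Before any reward is seen, the agent follows the advice; once experience
# says otherwise, the learned values win out.
agent = AdvisedQLearner(actions=["left", "right"])
agent.give_advice("start", "right")
first_choice = agent.act("start")            # advice-driven choice
agent.update("start", "left", 10.0, "goal")  # "left" turns out to pay off
later_choice = agent.act("start")            # value-driven choice
```

After the single update, the learned value of "left" (5.0) exceeds the decayed advice bonus (below 1.0), so the agent overrides the advice; a real advice-taking learner would of course also need to parse the natural language into such rules.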
Meetings
The PILLAR researchers at UT meet biweekly to discuss papers in
the areas of reinforcement learning and natural language
learning. Our next scheduled meeting will be on Wednesday, June
1, 2005, at 11:00 am in ACES 3.408, where we will discuss:
Richard Maclin, Jude Shavlik, Lisa Torrey, Trevor Walker and
Edward Wild, Giving Advice about Preferred Actions to
Reinforcement Learners via Knowledge-Based Kernel Regression.
To appear in Proceedings of the Twentieth National Conference on
Artificial Intelligence (AAAI-2005), 2005.
Previously discussed papers
- Lilyana Mihalkova and Raymond Mooney, Using Active Relocation
  to Aid Reinforcement Learning. Under review, 2005.
- Trevor Walker, Jude Shavlik and Richard Maclin, Relational
  Reinforcement Learning via Sampling the Space of First-Order
  Conjunctive Features. In Proceedings of the ICML Workshop on
  Relational Reinforcement Learning, Banff, Canada, 2004.
- Ashwin Srinivasan, A Study of Two Probabilistic Methods for
  Searching Large Spaces with ILP. Technical Report PRG-TR-16-00,
  Oxford University Computing Laboratory, Oxford, 2000.
- Kurt Driessens and Saso Dzeroski, Integrating Guidance into
  Relational Reinforcement Learning. Machine Learning,
  57(3):271-304, December 2004.
- Ana-Maria Popescu, Alex Armanasu, Oren Etzioni, David Ko and
  Alexander Yates, Modern Natural Language Interfaces to
  Databases: Composing Statistical Parsing with Semantic
  Tractability. In Proceedings of the 20th International
  Conference on Computational Linguistics (COLING), 2004.
- Ana-Maria Popescu, Oren Etzioni and Henry Kautz, Towards a
  Theory of Natural Language Interfaces to Databases. In
  Proceedings of the International Conference on Intelligent User
  Interfaces, 2003.
- Pieter Abbeel and Andrew Y. Ng, Apprenticeship Learning via
  Inverse Reinforcement Learning. In Proceedings of the
  Twenty-First International Conference on Machine Learning, 2004.
- Andrew Y. Ng and Stuart Russell, Algorithms for Inverse
  Reinforcement Learning. In Proceedings of the Seventeenth
  International Conference on Machine Learning, 2000.
- Eduardo Morales and Claude Sammut, Learning to Fly by Combining
  Reinforcement with Behavioural Cloning. In Proceedings of the
  Twenty-First International Conference on Machine Learning, 2004.
- David Andre and Stuart Russell, Programmable Reinforcement
  Learning Agents. In Advances in Neural Information Processing
  Systems 13, 2001.
- David Andre and Stuart Russell, State Abstraction for
  Programmable Reinforcement Learning Agents. In Proceedings of
  AAAI-02, 2002.
- Vinay Papudesi and Manfred Huber, Learning from Reinforcement
  and Advice Using Composite Reward Functions. In Proceedings of
  the Sixteenth International FLAIRS Conference, pp. 361-365,
  2003.
- Vinay Papudesi, Y. Wang, Manfred Huber and Diane Cook,
  Integrating User Commands and Autonomous Task Performance in a
  Reinforcement Learning Framework. In Proceedings of the AAAI
  Spring Symposium on Human Interaction with Autonomous Systems
  in Complex Environments, 2003.
- Scott Huffman and John Laird, Flexibly Instructable Agents.
  Journal of Artificial Intelligence Research, 3:271-324, 1995.
- Paul Utgoff and Jeffery Clouse, Two Kinds of Training
  Information for Evaluation Function Learning. In Proceedings of
  the Ninth National Conference on Artificial Intelligence,
  pp. 596-600, 1991.
- Jeffery Clouse and Paul Utgoff, A Teaching Method for
  Reinforcement Learning. In Proceedings of the Ninth
  International Conference on Machine Learning, pp. 92-101, 1992.
Publications
- Rohit J. Kate, Yuk Wah Wong and Raymond J. Mooney, Learning to
  Transform Natural to Formal Languages. In Proceedings of the
  Twentieth National Conference on Artificial Intelligence
  (AAAI-2005), July 2005.
- Richard Maclin, Jude Shavlik, Lisa Torrey, Trevor Walker and
  Edward Wild, Giving Advice about Preferred Actions to
  Reinforcement Learners via Knowledge-Based Kernel Regression.
  In Proceedings of the Twentieth National Conference on
  Artificial Intelligence (AAAI-2005), July 2005.
- Ruifang Ge and Raymond J. Mooney, A Statistical Semantic Parser
  that Integrates Syntax and Semantics. To appear in Proceedings
  of the Ninth Conference on Computational Natural Language
  Learning (CoNLL-2005), June 2005.
- Gregory Kuhlmann, Peter Stone, Raymond Mooney and Jude Shavlik,
  Guiding a Reinforcement Learner with Natural Language Advice:
  Initial Results in RoboCup Soccer. In the AAAI-2004 Workshop on
  Supervisory Control of Learning and Adaptive Systems, July 2004.
Researchers at UT
Researchers at U-Wisc
Yuk Wah Wong
Last modified: Wed Feb 2 12:56:26 CST 2005