Peter Stone's Selected Publications


Monte Carlo Hierarchical Model Learning

Monte Carlo Hierarchical Model Learning.
Jacob Menashe and Peter Stone.
In Proceedings of the 14th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2015.

Abstract

Reinforcement learning (RL) is a well-established paradigm for enabling autonomous agents to learn from experience. To enable RL to scale to any but the smallest domains, it is necessary to make use of abstraction and generalization of the state-action space, for example with a factored representation. However, to make effective use of such a representation, it is necessary to determine which state variables are relevant in which situations. In this work, we introduce T-UCT, a novel model-based RL approach for learning and exploiting the dynamics of structured hierarchical environments. When learning the dynamics while acting, a partial or inaccurate model may do more harm than good. T-UCT uses graph-based planning and Monte Carlo simulations to exploit models that may be incomplete or inaccurate, allowing it to both maximize cumulative rewards and ignore trajectories that are unlikely to succeed. T-UCT incorporates new experiences in the form of more accurate plans that span a greater area of the state space. T-UCT is fully implemented and compared empirically against B-VISA, the best known prior approach to the same problem. We show that T-UCT learns hierarchical models with fewer samples than B-VISA and that this effect is magnified at deeper levels of hierarchical complexity.
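
The abstract does not spell out how T-UCT selects actions during its Monte Carlo simulations. As background only, and assuming the name alludes to the general UCT family of planners, the sketch below shows the standard UCB1 selection rule that such planners commonly use to balance trying under-sampled actions against repeating well-performing ones. The function name `ucb1_select`, its parameters, and the exploration constant are illustrative assumptions, not taken from the paper, and this is not the authors' algorithm.

```python
import math


def ucb1_select(counts, totals, c=math.sqrt(2)):
    """Pick an action index with the generic UCB1 rule (illustrative only).

    counts[i] -- how many times action i has been simulated so far
    totals[i] -- summed reward observed for action i
    c         -- exploration constant (assumed value, not from the paper)
    """
    # Any action that has never been tried is selected first.
    for i, n in enumerate(counts):
        if n == 0:
            return i

    total_n = sum(counts)

    def score(i):
        mean = totals[i] / counts[i]                           # average reward so far
        bonus = c * math.sqrt(math.log(total_n) / counts[i])   # exploration bonus
        return mean + bonus

    return max(range(len(counts)), key=score)
```

With equal visit counts the rule prefers the higher average reward; when one action has been sampled far less often than the others, its exploration bonus dominates and it is selected even with a lower average.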

BibTeX Entry

@InProceedings{AAMAS15-Menashe,
  author = {Jacob Menashe and Peter Stone},
  title = {Monte Carlo Hierarchical Model Learning},
  booktitle = {Proceedings of the 14th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},
  location = {Istanbul, Turkey},
  month = {May},
  year = {2015},
  abstract = {
Reinforcement learning (RL) is a well-established paradigm for enabling
autonomous agents to learn from experience.  To enable RL to scale to
any but the smallest domains, it is necessary to make use of
abstraction and generalization of the state-action space, for example with a 
factored representation.  However, to make effective use of
such a representation, it is necessary to determine which state
variables are relevant in which situations.  In this work, we introduce
T-UCT, a novel model-based RL approach for learning and exploiting the
dynamics of structured hierarchical environments. When learning the
dynamics while acting, a partial or inaccurate model may do more harm than 
good. T-UCT uses graph-based planning and 
Monte Carlo simulations to exploit models that may be incomplete or 
inaccurate, allowing it to both maximize cumulative rewards and 
ignore trajectories that are unlikely to succeed. T-UCT
incorporates new experiences in the form of more accurate plans that span a 
greater area of the state space. T-UCT is fully implemented and compared 
empirically against B-VISA, the best known prior approach to the same problem. 
We show that T-UCT learns hierarchical models with fewer samples than B-VISA
and that this effect is magnified at deeper levels of hierarchical complexity.
  },
}
